AI Agents Reshape Knowledge Work, Increasing Autonomy and Efficiency

An arXiv paper (arXiv:2606.07489) by Jeremy Yang et al. uses production data from Perplexity's Search and Computer products to quantify how agentic systems change knowledge work. According to the paper, Computer performs 26 minutes of autonomous work per user session versus 33 seconds for Search, and per-query dissatisfaction rates are 55% lower on Computer than on Search. The authors find that matched-task completion time falls from 269 to 36 minutes, with estimated time and cost reductions of 87% and 94%, respectively. The paper also documents that Computer shifts follow-up queries toward higher-order verification and extension, crosses occupational boundaries more often, and enables composite tasks that the same users rarely attempt on Search. These results indicate substantial autonomy, quality, and scope effects when moving from conversational assistants to autonomous agents.
What happened
According to the arXiv paper (arXiv:2606.07489) by Jeremy Yang et al., the authors analyze production logs from Perplexity's Search and Computer products to compare conversational-assistant usage with agentic, autonomous workflows. The paper finds that Computer performs 26 minutes of autonomous work per user session, versus 33 seconds for Search, as measured on near-identical initial-query pairs. The paper reports that autonomy corresponds with higher execution quality, with per-query dissatisfaction rates 55% lower on Computer than on Search. The authors also measure task completion time dropping from 269 to 36 minutes, and estimate time and cost reductions of 87% and 94%, respectively, compared to human work with Search alone. Finally, the paper documents that Computer queries more often cross occupational boundaries, bundle composite subtasks, require higher-order cognition, and surface work activities that are essentially absent from the same users' Search usage.
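The headline time figure can be sanity-checked with simple arithmetic. The sketch below uses only the numbers quoted above; the function and variable names are illustrative and do not come from the paper. The 94% cost estimate cannot be reproduced this way, because it depends on cost assumptions this summary does not state.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures as quoted in this article's summary of the paper.
completion_before_min = 269.0  # matched-task completion time with Search
completion_after_min = 36.0    # matched-task completion time with Computer
time_reduction = percent_reduction(completion_before_min, completion_after_min)

# Autonomous work per session: 26 minutes (Computer) vs 33 seconds (Search).
autonomy_computer_s = 26 * 60
autonomy_search_s = 33
autonomy_ratio = autonomy_computer_s / autonomy_search_s

print(f"time reduction: {time_reduction:.1f}%")   # rounds to the reported 87%
print(f"autonomy ratio: {autonomy_ratio:.1f}x")
```

The computed time reduction rounds to the 87% the paper reports, so the completion-time figures and the percentage are at least internally consistent.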
Technical details
The paper uses matched-session natural experiments, pairing near-identical initial queries routed to each product to control for task intent. Per the paper, the measured autonomy manifests as automated task decomposition and execution that would otherwise require manual orchestration in Search-led workflows. The authors quantify changes in follow-up query distribution, execution duration, and user dissatisfaction to support claims about efficiency and quality. The analysis is empirical and based on production telemetry rather than controlled lab tasks; the submission is available as arXiv:2606.07489.
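To make the idea of pairing near-identical initial queries concrete, here is a minimal sketch. It is not the paper's method, which this summary does not describe in detail. It assumes a simple string-similarity match using Python's standard `difflib`, and the threshold, function names, and sample queries are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] between two query strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_pairs(search_queries, computer_queries, threshold=0.9):
    """Greedily pair each Search query with its most similar unused Computer query.

    A pair is kept only if its similarity is at least `threshold`
    (a hypothetical cutoff chosen for illustration).
    """
    pairs = []
    used = set()
    for s in search_queries:
        best_idx, best_score = None, threshold
        for i, c in enumerate(computer_queries):
            if i in used:
                continue
            score = similarity(s, c)
            if score >= best_score:
                best_idx, best_score = i, score
        if best_idx is not None:
            used.add(best_idx)
            pairs.append((s, computer_queries[best_idx], best_score))
    return pairs


# Hypothetical sample queries, invented for this example.
search = [
    "Compare Q3 revenue for the top five cloud providers",
    "best pasta recipe",
]
computer = [
    "compare Q3 revenue for the top five cloud providers",
    "Build a dashboard of churn by cohort",
]

pairs = match_pairs(search, computer)
for s, c, score in pairs:
    print(f"{score:.2f}  {s!r} <-> {c!r}")
```

Outcome metrics such as execution duration or dissatisfaction rate would then be compared only within matched pairs, which is what lets the comparison hold task intent roughly constant.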
Editorial analysis - technical context
Companies and researchers evaluating agentic systems should view these results as an empirical case study showing large efficiency gains when autonomy handles multi-step, interdependent subtasks. Industry-pattern observations suggest that when an agent reliably performs decomposition and execution, user activity migrates toward verification, extension, and cross-domain synthesis. That pattern raises familiar technical questions for practitioners about agent reliability, verification pipelines, evaluation metrics for composite tasks, and cost accounting for end-to-end automation versus human-in-the-loop workflows.
Context and significance
Editorial analysis: The magnitude of the reported reductions in time and cost (87% and 94% by the paper's estimates) makes this study notable for practitioners tracking productivity gains from agentic systems. If similar effects replicate across other platforms and domains, they would materially change tooling priorities, benchmark design, and the engineering effort devoted to integration and verification. At the same time, the shift toward higher-order follow-ups highlights an increased need for evaluation frameworks that measure verification workload and failure modes, not just single-turn accuracy.
What to watch
Editorial analysis: Observers should watch for independent replications on other platforms and domain verticals, published evaluations of agent verification strategies, transparent accounting of cost savings that includes compute and supervision, and studies that break down which task types benefit most from autonomy. Also worth monitoring is work on user trust, hallucination rates in composite tasks, and how occupational-boundary crossing affects regulatory or compliance requirements.
Scoring Rationale
The paper provides production-data evidence quantifying large efficiency and quality gains from agentic systems, a notable empirical contribution for practitioners. Its relevance is limited by its single-platform empirical scope and the need for replication, so it is important but not paradigm-shifting.

