Introducing Our New Paper "OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use"

OSAgentSurvey · January 2, 2025, 10:06am

Hi everyone!

We’re excited to share our latest research: "OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use." This work surveys the rapidly evolving field of OS Agents: (M)LLM-based agents that use computing devices (e.g., computers and mobile phones) by operating within the environments and interfaces (e.g., graphical user interfaces (GUIs)) provided by operating systems (OS) to automate tasks.

Link to Full Paper: OS-Agent-Survey/OS-Agent-Survey

Link to our Homepage: https://os-agent-survey.github.io/

Highlights from the Paper:

Foundational Insights: We define what constitutes OS Agents, exploring their core components (environment, observation space, and action space) and essential capabilities like understanding, planning, and grounding.
Construction Methodologies: Dive into the use of domain-specific foundation models, agent frameworks, and key techniques like supervised fine-tuning and memory mechanisms that empower these agents.
Evaluation Benchmarks: We review the protocols and metrics used to assess OS Agents and provide a comprehensive look at existing related benchmarks.
Challenges and Future Directions: From safety and privacy to personalization and self-evolution, we outline the critical challenges and opportunities ahead.
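
For readers new to the terminology, here is a rough sketch of how the three core components named above (environment, observation space, action space) might fit together in an agent loop. It is only an illustration and is not code from the paper or the repository. Every name in it (`Observation`, `Action`, `ToyEnvironment`, `plan`, `run_agent`) is hypothetical, and the `plan` function is a hard-coded placeholder for what would be an (M)LLM call in a real OS Agent.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Observation:
    """Observation space: here, a simplified text description of the screen."""
    screen_text: str


@dataclass
class Action:
    """Action space: a kind ("click" or "done") plus an optional target."""
    kind: str
    target: str = ""


class Environment(Protocol):
    """Anything the agent can observe and act on (assumed interface)."""
    def observe(self) -> Observation: ...
    def step(self, action: Action) -> None: ...


class ToyEnvironment:
    """Hypothetical stand-in for a real OS: one button that must be clicked."""

    def __init__(self) -> None:
        self.clicked = False

    def observe(self) -> Observation:
        return Observation("done" if self.clicked else "button: Submit")

    def step(self, action: Action) -> None:
        if action.kind == "click" and action.target == "Submit":
            self.clicked = True


def plan(obs: Observation) -> Action:
    """Placeholder policy; a real agent would query an (M)LLM here."""
    prefix = "button: "
    if obs.screen_text.startswith(prefix):
        # "Grounding" in miniature: map the observed element to a concrete action.
        return Action("click", obs.screen_text[len(prefix):])
    return Action("done")


def run_agent(env: Environment, max_steps: int = 5) -> List[Action]:
    """Observe, plan, act, and repeat until the policy emits "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = plan(env.observe())
        history.append(action)
        if action.kind == "done":
            break
        env.step(action)
    return history


if __name__ == "__main__":
    for taken in run_agent(ToyEnvironment()):
        print(taken)
```

In this toy setup the loop clicks the button, observes that the task is finished, and stops. Real systems replace the text observation with screenshots or accessibility trees and use a much richer action space.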

Join the Conversation:

We’ve created an open-source GitHub repository to support ongoing research and foster collaboration in this domain.
We’d love to hear your thoughts! What do you think about the future of OS Agents? Let’s discuss!

saiman123 · March 21, 2025, 4:31am

Pioneering survey! Comprehensively explores MLLM-based agents’ cross-device potential with insightful OS interaction analysis. Valuable roadmap for automating real-world computing tasks. Open-source implementation accelerates community progress. Shapes next-gen human-AI collaboration paradigms.
