
Wanism’s Newsletter
What happened in tech that actually mattered, and what did it mean?
As Large Language Models (LLMs) demonstrate increasingly powerful reasoning capabilities, letting AI decompose tasks on its own has become the next crucial phase of development. AI Agents are the frontier that every major tech company is now pursuing, as shown by Anthropic’s launch of Computer Use and Microsoft’s announcement of customizable agents in Copilot Studio.
On October 22, 2024, Anthropic announced an upgraded Claude 3.5 Sonnet and a new Claude 3.5 Haiku model, and launched a capability that lets the Claude AI assistant operate a PC, branded as “Computer Use.”
Anthropic states that “Computer Use” is still at an early stage and is not yet available to the general public. Developers, however, can access it through the API to build their own applications.
“Computer Use” enables AI to view screen content, launch programs, input content, and click mouse buttons. This means when tasks are assigned to the AI Agent, it can use the mouse to open software, correctly select application features, and utilize applications to complete assigned tasks. If results must be pasted into a notepad, the AI model can perform copy-and-paste operations, demonstrating comprehensive computer control capabilities.
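The view-screen, choose-action, act cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `FakeDesktop`, `scripted_model`, and the action names (`open`, `copy`, `paste`) are hypothetical stand-ins, and a text summary takes the place of a real screenshot.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen: one open app, a clipboard, a notepad."""
    open_app: str = ""
    clipboard: str = ""
    notepad: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would receive an image; a text summary stands in for it here.
        return f"app={self.open_app!r} notepad_lines={len(self.notepad)}"

    def apply(self, action: dict) -> None:
        kind = action["type"]
        if kind == "open":
            self.open_app = action["app"]
        elif kind == "copy":
            self.clipboard = action["text"]
        elif kind == "paste":
            self.notepad.append(self.clipboard)
        else:
            raise ValueError(f"unknown action: {kind}")


def scripted_model(observation: str, step: int):
    """Hypothetical model: replays a fixed plan and returns None when finished."""
    plan = [
        {"type": "copy", "text": "result: 42"},
        {"type": "open", "app": "notepad"},
        {"type": "paste"},
    ]
    return plan[step] if step < len(plan) else None


def run_agent(desktop, model, max_steps=10):
    """Observe the screen, ask the model for one action, apply it, repeat."""
    log = []
    for step in range(max_steps):
        observation = desktop.screenshot()
        action = model(observation, step)
        if action is None:  # the model signals that the task is complete
            break
        desktop.apply(action)
        log.append(action["type"])
    return log


desktop = FakeDesktop()
steps = run_agent(desktop, scripted_model)
print(steps, desktop.notepad)
```

The `max_steps` cap is a design choice for the sketch: an agent that can stall or take unintended actions should not be allowed to loop indefinitely.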
However, developers have reported unexpected pauses and unintended actions while using it.
For instance, suppose the task is to compile data on the top ten companies by market capitalization, such as Apple, covering each company’s revenue, profits, CEO, address, contact email, and website. Claude AI would first search for the list of the top ten companies by market value, then look up the specified data for each company, and finally organize all of the information into an Excel spreadsheet and save it, completing the entire task.
Another example involves finding sunset viewing locations, calculating distances, and determining departure times. Given this task, Claude AI would first search for the best sunset viewing spots nearby, then check the distance from the hotel, verify tomorrow’s sunset time, and search for nearby parking. If sunset is at 6:00 PM and the drive takes 30 minutes, Claude AI would plan a 5:20 PM departure from the hotel, leaving a 10-minute margin, and enter the full itinerary, navigation maps, and other information into a note-taking application.
These examples show what AI computer use looks like in practice: the model replicates the steps a human would take to complete a task. Anthropic’s Computer Use functionality is the next step for large language models on the way to becoming Agents. As a rough estimate, if current large language models possess 30% of an Agent’s capabilities, Computer Use moves that figure to about 70%. AI Agent outputs may not be perfect, but they can already complete many tasks effectively.
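The departure-time arithmetic in this example can be written out explicitly. This is a small sketch, assuming the 5:20 PM figure reflects roughly ten minutes of slack on top of the 30-minute drive; the `buffer_minutes` parameter and the calendar date are illustrative assumptions, not details from Anthropic's demo.

```python
from datetime import datetime, timedelta


def departure_time(sunset: datetime, drive_minutes: int, buffer_minutes: int = 10) -> datetime:
    """Work backwards from sunset: subtract the drive plus an assumed safety buffer."""
    return sunset - timedelta(minutes=drive_minutes + buffer_minutes)


# Arbitrary date; only the 6:00 PM sunset and 30-minute drive come from the example.
sunset = datetime(2024, 10, 23, 18, 0)
leave = departure_time(sunset, drive_minutes=30)
print(leave.strftime("%I:%M %p"))  # 05:20 PM
```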
Looking at long-term development, Anthropic aims to assist with daily routine tasks and customer service matters through artificial intelligence. While similar applications exist in the market, Anthropic appears to be pursuing more concrete operational capabilities through its AI technology, particularly for functions that conventional conversational AI cannot handle.
In this update, Anthropic simultaneously launched the Claude 3.5 Sonnet upgrade and the new Claude 3.5 Haiku AI model.
The new Claude 3.5 Sonnet is claimed to outperform OpenAI’s GPT-4 and even the previously released Claude 3 Opus, with response speeds more than twice as fast. It is now available to users, including those with free accounts.
Claude 3.5 Haiku is designed for faster response times at lower cost. Its performance even surpasses that of the previously released Claude 3 Opus model. It is expected to be available through the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI by late October. Input is priced at approximately $0.25 per million tokens, and output at about $1.25 per million tokens.
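To make the pricing concrete, the following sketch estimates the bill for a hypothetical workload at the quoted rates. Anthropic's API usage is metered in tokens; the workload sizes below are invented for illustration, and the rates are simply the figures quoted above, not a current price list.

```python
# Rates quoted above, in US dollars per one million billing units (tokens).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.25


def estimate_cost(input_units: int, output_units: int) -> float:
    """Return the estimated cost in USD for the given input and output volume."""
    total = input_units * INPUT_USD_PER_MILLION + output_units * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical workload: 2 million units in, 400 thousand units out.
cost = estimate_cost(2_000_000, 400_000)
print(f"${cost:.2f}")  # $1.00
```

Because output is priced five times higher than input, the 400 thousand output units here cost as much as the 2 million input units.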
On October 21, 2024, at the AI Tour event in London, Microsoft announced plans to launch AI Agents. Starting the following month, Microsoft’s Copilot Studio will allow customization of AI Agents, including the ability for enterprises to build autonomous AI agent services that handle various repetitive tasks.
This means that if an enterprise needs an AI agent to execute a particular task, it can build a customized one through Copilot Studio. While the original Copilot concept focused on improving efficiency as an assistant, Agents go a step further, offering more complete task execution and reducing the need for manual operation.
Microsoft anticipates that future enterprise employees will be able to establish multiple AI agents through Copilot Studio to assist with daily repetitive tasks. These agents can be designed for teams or companies, creating autonomous operational services. Additionally, through SaaS architecture, AI models, low-code design interfaces, and numerous integration interfaces, enterprises can more easily establish agent service functions that comply with internal standards.
Observing current market dynamics, 2025 will likely be a turning point in AI Agent development. All major tech giants, including Microsoft, OpenAI, Anthropic, Google, and Meta, are preparing for this transformation. More proof-of-concept demonstrations and initial product launches are expected by the end of 2024, paving the way for comprehensive development in 2025. This development process will encompass three aspects: enhancement of underlying models, operating system integration, and application innovation.
This transformation represents more than technological advancement; it signifies a reorganization of the entire industry ecosystem. Just as personal computers and smartphones changed how people work, AI Agents will redefine how humans and machines interact. This is not merely a new tool but a new computing paradigm, laying the foundation for the next wave of technological innovation. What we see today is only the beginning; as the technology matures and application scenarios expand, AI Agents will fundamentally transform work methods and efficiency standards in the coming years.