ChatGPT can now do everything for us: how Agent works and how best to use it

OpenAI has launched Agent, a new chatbot feature that can perform complex digital tasks on behalf of users from start to finish. Agent is based on a new, dedicated model that integrates the deep research and Operator modes, both already available to paid users, more closely with the bot's conversational interface.

The new agent must be given a specific prompt and then, OpenAI says, it does everything on its own: it searches the web for relevant information and uses a virtual computer to make decisions and perform advanced actions on external services, including e-commerce sites and platforms that require authorization.

How ChatGPT agent works

OpenAI has equipped the agent with various tools for accessing and interacting with the web: a visual browser that navigates through the graphical interface; a text browser, used for simpler searches; and for more advanced users, a terminal and direct access to the API (the programming interface).

Using the "Connectors" feature, the model can connect to apps like Gmail or GitHub to obtain more precise information and refine searches. By taking control of the agent's browser, users can also log in directly to external sites.

With these digital tools, the model can collect information through APIs from other sites, analyze large amounts of text with the text browser, or interact visually with websites designed for human users, much like browser extensions that automatically control the mouse.

Before performing risky actions or actions that require access to private data, OpenAI explains, the agent requires the user's approval and intervention. Otherwise, it can collect, organize, and present information completely autonomously, generating specific files such as spreadsheets, text files, and PowerPoint presentations.
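The approval step described above can be pictured as a simple gate placed in front of each action. The following Python sketch is purely illustrative and is not OpenAI's implementation: the `Action` type, the `RISKY_KINDS` set, and the `run_actions` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical categories. The article only says that risky actions and
# actions touching private data require the user's approval.
RISKY_KINDS = {"send_email", "purchase", "login"}


@dataclass
class Action:
    kind: str          # e.g. "search", "send_email"
    description: str   # human-readable summary shown to the user


def needs_approval(action: Action) -> bool:
    """Return True if the action falls in one of the (assumed) risky categories."""
    return action.kind in RISKY_KINDS


def run_actions(
    actions: Iterable[Action],
    approve: Callable[[Action], bool],
) -> Tuple[List[str], List[str]]:
    """Execute safe actions directly; ask `approve` before each risky one.

    Returns the descriptions of executed and skipped actions.
    """
    executed: List[str] = []
    skipped: List[str] = []
    for action in actions:
        if needs_approval(action) and not approve(action):
            skipped.append(action.description)
            continue
        executed.append(action.description)
    return executed, skipped


if __name__ == "__main__":
    plan = [
        Action("search", "look up match dates"),
        Action("send_email", "email the itinerary"),
    ]
    # A user who declines every request: only the safe action runs.
    print(run_actions(plan, approve=lambda a: False))
```

In this sketch the search runs without interruption, while the email is held back until the `approve` callback, standing in for the user, says yes.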

What can it do?

OpenAI has provided several practical examples of what can be done using the chatbot's agent mode. In one promotional video, a company engineer uses the agent to create a travel itinerary to Palm Springs for the Indian Wells Open Tennis Tournament.

The system searches for match dates, then connects to the user's calendar (via connector) to see what commitments are already in place, then moves to the browser to search for possible flights from San Francisco and compiles a travel proposal.

In another example, the agent creates a spreadsheet based on budget data for the city of San Francisco; in yet another, it creates a presentation on financial support for tech companies in Singapore and compiles a report on office availability. In all examples, the emphasis is on how the agent mode frees up the user's time, allowing them to go to lunch or walk the dog while the system works for them: when the search and any files are ready, a notification arrives on their smartphone via the app.

The examples are certainly interesting and useful for understanding how Agent works, but they appear particularly US-centric and intended for a relatively narrow professional audience. OpenAI assures us, however, that Agent can lend itself to a much wider range of applications than those outlined in the press and marketing materials.

The problem of hallucinations

We have no doubt that this is the case, but a problem remains: what to do with the still-inescapable hallucinations? In one example, the engineer suggests that the budget information collected in an Excel file by the agent is "98% correct."

But without further guidance, how can we know how important that 2% is? Even a small error in a client presentation can cost us a job; in other, more serious cases, it can lead to compliance issues and legal repercussions.

And while it's true that a human would have taken a few hours to create the same Excel file, and perhaps even made some mistakes, it will still take much longer than OpenAI assumes to review that file, make sure nothing is missing, and track down and double-check any figures that look wrong.
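To get a feel for what "98% correct" could mean in practice, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that the figure is a per-cell accuracy and that errors are independent of one another; neither assumption comes from OpenAI, and the spreadsheet size of 500 cells is hypothetical.

```python
def expected_errors(cells: int, accuracy: float = 0.98) -> float:
    """Expected number of wrong cells if each cell is right with probability `accuracy`."""
    return cells * (1 - accuracy)


def prob_at_least_one_error(cells: int, accuracy: float = 0.98) -> float:
    """Probability that at least one cell is wrong, assuming independent errors."""
    return 1 - accuracy ** cells


if __name__ == "__main__":
    cells = 500  # hypothetical spreadsheet size
    print(f"expected wrong cells: {expected_errors(cells):.1f}")
    print(f"chance of at least one error: {prob_at_least_one_error(cells):.4f}")
```

Under these assumptions, a 500-cell sheet would contain about ten wrong values on average, and nothing in the file says which ones they are.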

I want a digital life

In short, agentic mode is certainly an impressive (and worrying) step forward for OpenAI, but accepting it as a major innovation means accepting the assumption on which much of the company's narrative rests: that the errors and hallucinations its magnificent models continue to suffer from can simply be ignored.

The other aspect not to be underestimated is how thoroughly one's life must be digitized for the system to function properly. We don't know about you, but we don't usually use Google Calendar to robotically organize every aspect of our lives, including dinners with friends or nights out months away. In other words, an agent planning a trip for us couldn't work at all like the one in the promo described above, for a simple lack of data.

Closing that gap means giving up all spontaneity and serendipity. The payoff is that a closed-source bot from an American company can save us a few hours by organizing a trip on our behalf. In the meantime, we can get bored, scroll through Instagram, or maybe even work harder still.

The security problem

OpenAI also bluntly admits that this is the most potentially dangerous model released to date, given the ability to automate web-based actions with direct real-world consequences. For this reason, the company assures, the alignment and security limits are very stringent.

The agent cannot perform high-risk tasks, conduct financial transactions, or provide legal advice. It has also been trained to minimize the risk of prompt injection (the "hijacking" of system directives with malicious prompts) and to reject malicious or potentially dangerous and illegal requests. Finally, any critical steps, such as sending emails, never occur automatically without the user's explicit approval.

When will ChatGPT Agent arrive in Italy?

Users can activate the agent during any conversation with the chatbot by selecting the corresponding mode from the tool list. Simply send your prompt, and the agent will do the rest. Results aren't immediate: as with the deep research function, it takes some time, sometimes even hours, depending on the complexity of the request. The result can then be further refined with additional requests.

For now, Agent is only available in the US, Canada, and the UK for users on the Pro, Plus, and Team plans. The rollout began today, July 18, and will continue over the next few days. Education and Enterprise users will receive the update in the coming weeks. Because the model is particularly resource-hungry, requests will be limited: Pro users will have 400 messages per month, while others will only have 40, with the option to add requests by purchasing additional credits.

The Operator feature will remain available for some time, and then it will be retired. Agent is not yet available in Italy and the rest of Europe. The company is "finalizing the launch schedule." Given the pervasiveness of the new model and the potential access to so much sensitive information, we assume OpenAI's lawyers have a lot of work to do to ensure compliance with European privacy regulations.

How to record meeting minutes with ChatGPT agent

The "record mode" feature has arrived in Italy, allowing you to record and transcribe meetings, interviews, and brainstorming sessions. It can be activated via a new "rec" button located at the bottom right of the chatbot interface. Currently, however, record mode is only available to users on paid plans, and only in the Mac desktop app.

Clicking the button starts recording and opens a dedicated pop-up to pause or end the session, which can then be sent to OpenAI's servers for transcription and summary. The result is a schematic report highlighting key points and tasks (if any). Record mode also works very well in Italian and seems to us, in its own small way, a much more immediately practical step forward than the agentic mode.

The transcript of a test meeting with record mode. All good, except that chatGPT didn't understand "chatGPT."

Privacy concerns must also be considered here. OpenAI says the recordings are used solely for transcription and then deleted. However, if the user has opted in to letting their chats be used for model training ("Improve ChatGpt for everyone" in the preferences), then the report and any further chat interactions could be used by OpenAI as training material.