Hugging Face has released Transformers Agents 2.0, a major update to its framework for building AI agents. The headline feature: agents can now make and receive phone calls autonomously, extending the platform beyond text and code execution into real-world voice interactions.
"We're enabling AI agents to operate in the physical world through voice," said the Hugging Face team.
The update also brings enhanced tool-use capabilities, improved memory management, and integration with external APIs. Developers can now build voice-enabled assistants that handle customer service calls, schedule appointments, or conduct surveys without human intervention.
Transformers Agents 2.0 builds on the success of its predecessor by simplifying the creation of multi-step reasoning agents. The phone call feature leverages OpenAI's Whisper for speech recognition and ElevenLabs for natural-sounding speech synthesis, all orchestrated through the Transformers library.
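At a high level, the pipeline described above is a loop: transcribe incoming audio, let the agent reason over the transcript, then synthesize the reply. A minimal sketch of one call turn follows; the function names (`transcribe`, `synthesize`, `agent_reply`) are hypothetical stubs standing in for Whisper, ElevenLabs, and the agent itself, not the framework's actual API.

```python
# Illustrative sketch of a voice-agent call loop. The stubs below stand in
# for real Whisper / ElevenLabs / agent calls; they are NOT the actual
# Transformers Agents API.

def transcribe(audio: bytes) -> str:
    # Stub: a real implementation would run Whisper speech recognition,
    # e.g. via transformers' automatic-speech-recognition pipeline.
    return audio.decode("utf-8")  # pretend the audio is already text

def agent_reply(transcript: str) -> str:
    # Stub: a real agent would perform multi-step reasoning and tool use here.
    return f"You said: {transcript}"

def synthesize(text: str) -> bytes:
    # Stub: a real implementation would call a TTS service such as ElevenLabs.
    return text.encode("utf-8")

def handle_turn(incoming_audio: bytes) -> bytes:
    """One turn of a call: audio in, agent reasoning, audio out."""
    transcript = transcribe(incoming_audio)
    reply = agent_reply(transcript)
    return synthesize(reply)

print(handle_turn(b"I'd like to book a table").decode("utf-8"))
# prints: You said: I'd like to book a table
```

The point of the loop structure is that each stage can be swapped independently: a different ASR model, a different TTS voice, or a more capable agent, without changing the call-handling code around it.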
Early adopters have demonstrated use cases ranging from ordering pizza to conducting job interviews. However, the feature raises ethical considerations around consent and transparency when AI calls humans.
Hugging Face emphasizes that developers must implement safety measures, such as disclosing the AI nature at the start of calls. The framework is open-source, allowing community scrutiny and customization.
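A disclosure requirement like the one described above can be enforced at the application level. The sketch below is a hypothetical wrapper (the class, method, and disclosure text are illustrative, not part of the Transformers Agents API) that guarantees the first utterance of every call begins with an AI disclosure.

```python
# Hypothetical application-level guard that prepends an AI disclosure to the
# first utterance of every call. Not part of the Transformers Agents API.

DISCLOSURE = "This call is handled by an AI assistant."

class DisclosingCall:
    """Tracks whether the disclosure has been spoken yet on this call."""

    def __init__(self) -> None:
        self._disclosed = False

    def speak(self, text: str) -> str:
        # Prepend the disclosure to the very first utterance only.
        if not self._disclosed:
            self._disclosed = True
            return f"{DISCLOSURE} {text}"
        return text

call = DisclosingCall()
print(call.speak("How can I help you today?"))
# prints: This call is handled by an AI assistant. How can I help you today?
print(call.speak("Sure, one moment."))
# prints: Sure, one moment.
```

Keeping the guard in one place, rather than relying on the agent's prompt, means the disclosure happens even if the model ignores its instructions.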
Transformers Agents 2.0 is available now on the Hugging Face Hub, with extensive documentation and example projects for developers to get started.