Microsoft Unveils New Features for Trustworthy AI Agents
Microsoft has introduced new features to make building trustworthy AI agents easier and more secure. At the company’s annual developer conference, Microsoft Build, the tech giant unveiled platform updates that focus on enhancing the safety and reliability of AI agents.
AI agents are becoming increasingly popular as they can perform tasks on behalf of users, from simple actions like sending emails to complex tasks such as approving procurement orders. However, this increased capability also raises significant safety concerns as these agents access user data.
Enhanced Monitoring and Testing
To address these concerns, Microsoft has introduced new tools to simplify the testing and monitoring of AI agents. Agent Evaluators help measure how well agents understand and carry out user goals, while a new AI Red Teaming Agent automates the red teaming of generative AI systems by recreating realistic attack scenarios. This helps identify vulnerabilities and mitigate risks.
“By using new agentic technology and turning our safety evaluation system into an agent, that will just adversarially red team your system; it is way, way easier to use, and it also results in better testing, because the agent is able to iteratively adversarially attack your system and try to get through,” said Sarah Bird, CPO of Responsible AI at Microsoft.
Improved Observability and Security
The Agent Observability features in Azure AI Foundry provide developers with a single dashboard to view metrics such as performance, quality, cost, and safety throughout the AI development cycle. Defender alerts in Foundry offer real-time visibility into security threats, enabling developers to assess and respond to suspicious activity.
Microsoft has also integrated its Microsoft Purview service with Azure AI Foundry, allowing users to apply enterprise-grade security governance and compliance controls to AI systems.
New Identity Management and Guardrails
Microsoft is introducing the preview of Microsoft Entra Agent ID, which expands access management to AI agents built with Azure AI Foundry and Microsoft Copilot Studio. This allows admins to control access permissions for AI agents similarly to how they manage human identities.
Additional guardrails include the Spotlighting capability in Content Safety, which strengthens the Prompt Shields guardrail to better detect and stop indirect prompt injections. Other new features include a PII detection guardrail and a task adherence guardrail in Content Safety.
These updates demonstrate Microsoft’s commitment to making AI agents more trustworthy and secure, addressing the growing concerns around their adoption in various industries.
