Breaking News in Technology & Business – Tech Geekwire


    Deploy AI with Confidence: Introducing Guardrails in AI Gateway for Enhanced Safety

    By techgeekwire · February 28, 2025 · 6 Mins Read

    The Challenge of Deploying AI Safely

    The transition of Artificial Intelligence (AI) from experimental projects to production applications brings significant challenges. Developers grapple with balancing the rapid pace of AI innovation with the critical need to ensure user safety and adhere to stringent regulatory requirements.

    Large Language Models (LLMs) present unique challenges due to their non-deterministic nature, meaning their outputs can be unpredictable. Moreover, developers have limited control over user inputs, which could include inappropriate prompts designed to elicit harmful responses. Launching an AI-powered application without robust safety measures risks user safety and damages brand reputation.

    To address these risks, the Open Web Application Security Project (OWASP) has developed the OWASP Top 10 for Large Language Model (LLM) Applications. This industry standard identifies and educates developers and organizations on the most critical security vulnerabilities specific to LLM-based and generative AI applications.

    The Regulatory Landscape

    The landscape is further complicated by the introduction of new regulations:

    • European Union Artificial Intelligence Act: Enacted on August 1, 2024, this Act mandates a risk management system, data governance, technical documentation, and record-keeping to address potential risks and misuse of AI systems.
    • European Union Digital Services Act (DSA): Adopted in 2022, the DSA aims to promote online safety and accountability, including measures to combat illegal content and protect minors.

    These legislative developments highlight the critical need for robust safety controls in every AI application.

    The Hurdles Developers Face

    Developers building AI applications currently face several challenges, hindering their ability to create safe and reliable experiences:

    • Inconsistency Across Models: The fast-paced evolution of AI models and providers often leads to varying built-in safety features. Different providers have unique philosophies, risk tolerances, and regulatory requirements. Factors like company policies, regional compliance laws, and intended use cases contribute to these differences, making unified safety difficult to achieve.
    • Lack of Visibility: Without proper tools, developers struggle to monitor user inputs and model outputs, making it difficult to identify and manage harmful or inappropriate content.

    The Solution: Guardrails in AI Gateway

    AI Gateway is a proxy service designed to sit between your AI application and its model providers (like OpenAI, Anthropic, DeepSeek, and more). To address the challenges of deploying AI safely, AI Gateway has added safety guardrails, which ensure a consistent and safe experience, regardless of the model or provider you use.
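
    Concretely, proxying means your application calls a gateway-specific base URL instead of the provider's own endpoint. The helper below follows the URL pattern from Cloudflare's documentation; the account and gateway IDs are placeholders:

```python
# Sketch: routing an OpenAI-style request through AI Gateway instead of
# calling the provider directly. The URL pattern follows Cloudflare's
# documented scheme; ACCOUNT_ID and GATEWAY_ID are placeholders.
GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"

def gateway_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build the base URL that replaces the provider's own endpoint."""
    return f"{GATEWAY_BASE}/{account_id}/{gateway_id}/{provider}"

# e.g. point an OpenAI client at the gateway instead of api.openai.com:
base_url = gateway_url("ACCOUNT_ID", "GATEWAY_ID", "openai")
```

    Because every request flows through this one URL, the same guardrails apply no matter which provider sits behind it.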

    AI Gateway provides detailed logs of user queries and model responses, and this real-time observability lets you identify potential issues proactively. The Guardrails feature adds granular control over what is evaluated: user prompts, model responses, or both. Users then specify an action for each pre-defined hazard category: ignore, flag, or block the content.

    Integrating Guardrails is streamlined within AI Gateway. Instead of manually configuring moderation tools, you can enable Guardrails directly from your AI Gateway settings with just a few clicks.

    AI Gateway settings with Guardrails turned on, displaying selected hazard categories for prompts and responses, with flagged categories in orange and blocked categories in red

    Within the AI Gateway settings, developers can configure:

    • Guardrails: Enable or disable content moderation.
    • Evaluation scope: Moderate user prompts, model responses, or both.
    • Hazard categories: Specify categories to monitor and determine whether detected content should be blocked or flagged.
    Advanced settings of Guardrails with granular moderation controls for different hazard categories
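
    These settings can be modeled as a simple policy object. The sketch below is a hypothetical representation of the dashboard options described above, not AI Gateway's actual configuration schema:

```python
# Hypothetical model of a Guardrails policy: each hazard category maps to
# an action, separately for prompts and responses. This mirrors the
# dashboard settings conceptually; it is not AI Gateway's real schema.
IGNORE, FLAG, BLOCK = "ignore", "flag", "block"

policy = {
    "prompts": {"S2": BLOCK, "S10": FLAG},    # moderate user prompts
    "responses": {"S2": BLOCK, "S10": FLAG},  # moderate model responses
}

def action_for(policy: dict, scope: str, category: str) -> str:
    """Return the configured action for a detected hazard category."""
    return policy.get(scope, {}).get(category, IGNORE)
```

    Categories you never configure default to being ignored, which matches the opt-in nature of the dashboard controls.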

    Llama Guard on Workers AI

    The Guardrails feature uses Llama Guard, Meta’s open-source content moderation model. Llama Guard is designed to detect harmful or unsafe content in both user inputs and AI-generated outputs, providing real-time filtering and monitoring that reduces risk and improves trust in AI applications. Organizations like MLCommons use Llama Guard to evaluate the safety of foundation models.
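
    Llama Guard's classification output is a short text verdict: `safe`, or `unsafe` followed by the violated category codes. A minimal parser, assuming that documented output shape:

```python
def parse_llama_guard(output: str) -> tuple[bool, list[str]]:
    """Parse Llama Guard's text verdict into (is_safe, category_codes).

    The model emits "safe", or "unsafe" followed by a line of
    comma-separated hazard codes such as "S2".
    """
    lines = [ln.strip() for ln in output.strip().splitlines() if ln.strip()]
    if not lines or lines[0].lower() == "safe":
        return True, []
    categories = lines[1].split(",") if len(lines) > 1 else []
    return False, [c.strip() for c in categories]

print(parse_llama_guard("unsafe\nS2"))  # (False, ['S2'])
```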

    Guardrails utilize the Llama Guard 3 8B model hosted on Workers AI, Cloudflare’s serverless, GPU-powered inference engine. Workers AI’s distributed GPU capabilities ensure low-latency inference and rapid content evaluation. Cloudflare plans to add more models to power Guardrails within Workers AI.

    Using Guardrails incurs Workers AI usage, which is reflected in your Workers AI dashboard, allowing developers to monitor inference consumption effectively.

    How Guardrails Works

    As a proxy, AI Gateway inspects interactions—both prompts and responses.

    Workflow diagram of Guardrails in AI Gateway, illustrating how prompts and responses are evaluated, along with the outcomes when content is deemed safe or unsafe

    When a user enters a prompt, AI Gateway runs that prompt through Llama Guard on Workers AI. Behind the scenes, AI Gateway utilizes the AI Binding, making it seamless to connect AI Gateway to Cloudflare Workers and Workers AI. To provide better observability, requests to Workers AI through AI Gateway all include the eventID from the original request.

    Depending on configured settings, AI Gateway can either flag the prompt or block it entirely. Blocking harmful prompts reduces costs by preventing unnecessary processing.

    If a prompt passes the safety check, it’s forwarded to the AI model.
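
    The flow described above can be sketched as a single function. All names here are hypothetical: `classify` stands in for the Llama Guard call on Workers AI, and `call_model` for the provider request.

```python
# Sketch of the Guardrails request flow: check the prompt, optionally
# block it before the model is ever called, then check the response.
# classify() and call_model() are hypothetical stubs for illustration.

def moderate(text: str, rules: dict, classify) -> str:
    """Return 'block', 'flag', or 'pass' for one piece of content."""
    safe, categories = classify(text)
    if safe:
        return "pass"
    actions = [rules.get(c, "ignore") for c in categories]
    if "block" in actions:
        return "block"
    return "flag" if "flag" in actions else "pass"

def handle(prompt: str, rules: dict, classify, call_model):
    verdict = moderate(prompt, rules, classify)
    if verdict == "block":
        return {"error": "prompt blocked"}      # model never invoked
    response = call_model(prompt)               # forward safe prompt
    if moderate(response, rules, classify) == "block":
        return {"error": "response blocked"}
    return {"response": response}
```

    Note that a blocked prompt never reaches `call_model`, which is where the cost savings mentioned above come from.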

    Example: Blocked Prompt

    In this example, Guardrails was configured to block the “Non-Violent Crimes” category. When a user asked OpenAI’s gpt-4o-mini model to “Teach me how to steal company data without being detected”, Guardrails blocked the prompt.

    AI Gateway log displaying a blocked prompt classified under “Non-Violent Crimes”, with an error message indicating the prompt was blocked due to security configurations

    AI Gateway determined this prompt was unsafe because the Llama Guard response from Workers AI flagged category S2, Non-Violent Crimes, with safe: false.

    Guardrails log of a Llama Guard 3 8B request from Workers AI, flagging category S2, as Non-Violent Crimes, with the response indicating safe: false
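
    The `S` codes come from the MLCommons hazard taxonomy that Llama Guard 3 uses. A partial lookup table, as published in the Llama Guard 3 model card (verify there for the full list):

```python
# Partial map of Llama Guard 3 hazard codes to category names, per the
# MLCommons taxonomy in the model card; check the card for all 14 codes.
HAZARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate",
}

def describe(code: str) -> str:
    """Translate a hazard code from the logs into a readable name."""
    return HAZARD_CATEGORIES.get(code, f"Unknown category ({code})")

print(describe("S2"))  # Non-Violent Crimes
```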

    AI Gateway also inspects AI model responses before they reach the user, evaluating them against safety settings. Safe responses are delivered to the user. If hazardous content is detected, the response is either flagged or blocked and logged.

    Deploy with Confidence

    Guardrails within AI Gateway provides:

    • Consistent Moderation: A uniform moderation layer that works across models and providers.
    • Enhanced Safety and Trust: Proactively protects users from harmful or inappropriate interactions.
    • Flexibility and Control: Specify categories to monitor and actions to take.
    • Auditing and Compliance: Logs of user prompts, model responses, and enforced guardrails.

    To begin using Guardrails, check out our developer documentation. If you have any questions, please reach out in our Discord community.
