AI Gateway
How it works
Use cases
FAQs
Cloudflare AI Gateway provides centralized visibility and control for your AI applications. Connect your apps with a single line of code to monitor usage, costs, and errors. Reduce risks and expenses through caching, rate limiting, request retries, and model fallbacks. Ensure reliability, scalability, and productivity with minimal effort.
Connect your AI apps to AI Gateway for a unified dashboard and control costs with usage stats, rate limiting, and caching.
Gain visibility into prompts, AI API requests, errors, token usage, costs, and more. Logs are available for auditing and troubleshooting.
Unify the top AI providers, including Hugging Face, OpenAI, Anthropic, and Workers AI, for comprehensive visibility into your AI applications.
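Connecting an app "with a single line of code" in practice means pointing an existing SDK at a gateway base URL instead of the provider's own endpoint. A minimal sketch, assuming the documented Cloudflare endpoint pattern; the account and gateway IDs are placeholders for your own values:

```python
# Route existing SDK traffic through AI Gateway by swapping the base URL.
# ACCOUNT_ID and GATEWAY_ID are hypothetical placeholders.
ACCOUNT_ID = "your_account_id"
GATEWAY_ID = "my-gateway"

def gateway_base_url(account_id: str, gateway_id: str, provider: str = "openai") -> str:
    """Build the AI Gateway endpoint that fronts a given provider."""
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

# With the OpenAI SDK, for example, the only change is the base_url argument:
# client = OpenAI(api_key=..., base_url=gateway_base_url(ACCOUNT_ID, GATEWAY_ID))
```

Because the gateway is just a different base URL, the rest of the application code (model names, request bodies, response handling) stays unchanged.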
How it works
By shifting features such as rate limiting, caching, and error handling to the proxy layer, organizations can apply unified configurations across AI apps and inference service providers. AI Gateway sits between your application and the AI provider to give you multivendor AI observability and control.
"Without AI Gateway, it’s difficult to see which applications are driving the majority of the costs with the OpenAI API … We can choose to limit the number of requests used by certain tools to control costs."
RightBlogger
Top AI Gateway use cases
Real-time insights and reliability with logs, metrics, rate limiting, caching, and monitoring.
Effortlessly connect the most popular providers, including Workers AI, Hugging Face, OpenAI, Anthropic, and more, with just one line of code.
Optimize costs and reduce latency with custom caching. Control scaling and prevent excessive activity with rate limiting.
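The caching and rate-limiting behaviors described above can be sketched as a toy in-process proxy. This `GatewaySketch` class is purely illustrative, not Cloudflare's implementation, which runs these features at the edge between your app and the provider:

```python
import time
from collections import deque

class GatewaySketch:
    """Toy proxy layer: serves repeated prompts from a cache and
    rate-limits callers with a sliding window. Illustrative only."""

    def __init__(self, backend, max_requests: int, per_seconds: float):
        self.backend = backend            # callable: prompt -> response
        self.cache: dict[str, str] = {}
        self.window = deque()             # timestamps of recent backend calls
        self.max_requests = max_requests
        self.per_seconds = per_seconds

    def handle(self, prompt: str) -> str:
        # Serve redundant prompts from cache without touching the provider,
        # reducing both latency and per-request cost.
        if prompt in self.cache:
            return self.cache[prompt]
        # Sliding-window rate limit: throttle excessive activity.
        now = time.monotonic()
        while self.window and now - self.window[0] > self.per_seconds:
            self.window.popleft()
        if len(self.window) >= self.max_requests:
            raise RuntimeError("rate limit exceeded")
        self.window.append(now)
        response = self.backend(prompt)
        self.cache[prompt] = response
        return response
```

Note that cache hits do not count against the rate limit, which is exactly why caching at the proxy layer both cuts provider spend and absorbs bursts of identical requests.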
What is AI Gateway?
AI Gateway provides centralized visibility and control for AI applications. It acts as a proxy between your application and AI providers to help you monitor usage, control costs, and reduce risks. AI Gateway offers logs, metrics, rate limiting, caching, and monitoring for AI applications.
What are the benefits of AI Gateway?
AI Gateway helps you control costs across all your AI applications, provides easy-to-use analytics for troubleshooting, and unifies management across the most popular AI providers.
Which AI providers does AI Gateway support?
AI Gateway supports the most popular providers, including Hugging Face, OpenAI, Perplexity, Anthropic, Replicate, Groq, and Cloudflare's own Workers AI.
How does AI Gateway help control costs?
AI Gateway helps AI app developers control costs by providing usage statistics, allowing them to implement rate limiting to prevent excessive usage, and using caching to reduce latency and serve redundant requests more efficiently.
What visibility does AI Gateway provide?
You can gain visibility into prompts, AI API requests, errors, token usage, and costs. Logs are also available for auditing and troubleshooting.
How does AI Gateway improve reliability?
AI Gateway makes it easy for AI application developers to improve the resilience of their applications by defining request retries and model fallbacks in case of an error. AI Gateway can serve requests directly from Cloudflare's cache instead of the model providers, helping the application serve users more efficiently at scale. And rate limiting can throttle excessive requests to prevent denial of service to legitimate users.
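The retry-and-fallback behavior described above can be sketched as follows. `call_with_fallbacks` and its parameters are hypothetical names for illustration, not AI Gateway's actual configuration API:

```python
import time

def call_with_fallbacks(prompt, providers, retries_per_provider=2, backoff=0.1):
    """Try each provider in order; retry transient errors with exponential
    backoff before falling back to the next one. `providers` is a list of
    (name, callable) pairs -- an illustrative stand-in for a configured
    retry/fallback policy."""
    last_error = None
    for name, call in providers:
        for attempt in range(retries_per_provider):
            try:
                return name, call(prompt)
            except Exception as err:   # real code would catch specific error types
                last_error = err
                time.sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"all providers failed: {last_error}")
```

If the primary model errors out, the request transparently lands on the next configured model, so the application keeps serving users instead of surfacing the provider outage.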