Workers AI
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/workers-ai/
When making requests to Workers AI, replace https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run in the URL you’re currently using with https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/workers-ai.
Then append the model you want to run to the end of the URL. Browse the list of Workers AI models and copy the model ID.
You’ll need to generate an API token with Workers AI read access and use it in your request.
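The URL swap described above can be sketched as a small helper. This is a minimal illustration, not part of any Cloudflare SDK; the account ID and gateway slug values below are hypothetical placeholders:

```python
def gateway_url(account_id: str, gateway_slug: str, model: str) -> str:
    """Build the AI Gateway endpoint for a Workers AI model: the
    api.cloudflare.com /ai/run base is replaced by the gateway base,
    and the model ID is appended to the end."""
    return (
        f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}"
        f"/workers-ai/{model}"
    )

# Placeholder account ID and gateway slug for illustration only.
print(gateway_url("abc123", "my-gateway", "@cf/meta/llama-3-8b-instruct"))
```

The same token and headers shown in the curl examples below are then sent with a POST request to this URL.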
Request to Workers AI llama model:

```sh
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/workers-ai/@cf/meta/llama-3-8b-instruct \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "What is Cloudflare?"}'
```
Request to Workers AI text classification model:

```sh
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/workers-ai/@cf/huggingface/distilbert-sst-2-int8 \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{"text": "Cloudflare docs are amazing!"}'
```
OpenAI compatible endpoints
Workers AI supports OpenAI compatible endpoints for text generation (/v1/chat/completions) and text embedding models (/v1/embeddings). This allows you to use the same code as you would for your OpenAI commands, but swap in Workers AI easily.
Request to OpenAI compatible endpoint:

```sh
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/workers-ai/v1/chat/completions \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{ "model": "@cf/meta/llama-3-8b-instruct", "messages": [ { "role": "user", "content": "What is Cloudflare?" } ] }'
```
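The same OpenAI-compatible request can be assembled from code. The sketch below uses only the Python standard library; the account ID, gateway slug, and token are hypothetical placeholders, and the network call itself is left commented out since it requires a real account and token:

```python
import json
import urllib.request

ACCOUNT_ID = "abc123"        # placeholder: your Cloudflare account ID
GATEWAY_SLUG = "my-gateway"  # placeholder: your gateway name
CF_API_TOKEN = "xxxx"        # placeholder: token with Workers AI read access

# OpenAI-compatible chat completions endpoint behind the gateway.
url = (
    f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY_SLUG}"
    "/workers-ai/v1/chat/completions"
)

# Same JSON body shape as an OpenAI chat completion request.
body = json.dumps({
    "model": "@cf/meta/llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "What is Cloudflare?"}],
}).encode()

req = urllib.request.Request(
    url,
    data=body,
    headers={
        "Authorization": f"Bearer {CF_API_TOKEN}",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(req)  # requires a real account/token
print(req.full_url)
```

Because the endpoint is OpenAI-compatible, existing OpenAI client code should also work if you point its base URL at .../workers-ai/v1 and pass the Cloudflare API token as the API key.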