API Providers

Every API in Llama Stack is backed by one or more providers. You can swap providers without changing your application code.

Provider Types

| Type | How it works | Examples |
|------|--------------|----------|
| Remote (`remote::`) | Adapts an external service via a thin client | Ollama, OpenAI, vLLM, Bedrock, Fireworks, Together |
| Inline (`inline::`) | Runs entirely inside the Llama Stack process | FAISS, SQLite-vec, sentence-transformers, Llama Guard |

Llama Stack provides at least one inline provider for each API so you can run a fully featured stack locally without any external dependencies.
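As a sketch of what an all-inline setup looks like, the fragment below configures the FAISS inline provider for the vector_io API (the exact config keys vary by provider; treat this as illustrative and check the Providers section for the real options):

```yaml
providers:
  vector_io:
    - provider_id: faiss
      provider_type: inline::faiss
      config: {}
```

Because the provider runs in-process, no external service needs to be started or reachable.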

Multiple Providers Per API

You can configure multiple providers for the same API. The routing table dispatches requests to the right provider based on the resource (model, vector store, etc.):

```yaml
providers:
  inference:
    - provider_id: ollama
      provider_type: remote::ollama
      config:
        base_url: http://localhost:11434/v1
    - provider_id: openai
      provider_type: remote::openai
      config:
        api_key: ${env.OPENAI_API_KEY}
```

Each provider automatically discovers its available models at startup. Requests are routed based on the model identifier: models discovered from Ollama go to the Ollama provider, and models discovered from OpenAI go to the OpenAI provider. Same endpoint, same client code.
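The dispatch mechanics can be sketched as a plain dictionary lookup: each model identifier maps to the provider that registered it. This is a simplified illustration of the idea, not the real Llama Stack routing-table implementation; the class and field names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    """A provider and the model ids it discovered at startup (toy model)."""
    provider_id: str
    models: set

class RoutingTable:
    """Toy routing table: model id -> provider id, built once at startup."""
    def __init__(self, providers):
        self._routes = {}
        for p in providers:
            for model_id in p.models:
                self._routes[model_id] = p.provider_id

    def route(self, model_id):
        """Return the provider responsible for this model, or raise."""
        if model_id not in self._routes:
            raise KeyError(f"unknown model: {model_id}")
        return self._routes[model_id]

# Mirroring the config above: two inference providers behind one API.
table = RoutingTable([
    Provider("ollama", {"llama3.2:3b"}),
    Provider("openai", {"gpt-4o-mini"}),
])
```

A request naming `llama3.2:3b` resolves to the `ollama` provider, while `gpt-4o-mini` resolves to `openai`; the caller never names a provider directly.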

Available Providers

See the Providers section for the full list of supported providers per API.