Distributions
A distribution is a pre-configured run config that bundles specific providers for a target environment. Think of Kubernetes distributions (AKS, EKS, GKE): the API stays the same, but each distribution wires in different backends.
The Starter Distribution
Most users should start with the starter distribution. It includes all providers and auto-enables them based on available environment variables:
# Local with Ollama (auto-detected)
uv run llama stack run starter
# With OpenAI
OPENAI_API_KEY=sk-xxx uv run llama stack run starter
The starter distribution uses FAISS for vector storage, sentence-transformers for embeddings, and pypdf for file processing - all running locally with no external dependencies.
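The auto-enable behavior described above can be sketched as a simple env-var lookup. This is an illustrative sketch only, not Llama Stack's actual detection logic; the provider names and the OLLAMA_URL variable are assumptions chosen to mirror the examples in this section.

```python
# Illustrative sketch: enable an inference provider only when its
# environment variable is set. Provider/variable pairs are assumptions,
# not the starter distribution's real detection table.
CANDIDATES = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": "OLLAMA_URL",
}

def enabled_providers(env):
    """Return the providers whose configuration variables are present."""
    return [name for name, var in CANDIDATES.items() if env.get(var)]

print(enabled_providers({"OPENAI_API_KEY": "sk-xxx"}))  # ['openai']
```

Under this model, running with OPENAI_API_KEY set (as in the second command above) would enable the openai provider while leaving unset providers disabled.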
Types of Distributions
Self-hosted: Run Llama Stack on your own infrastructure with your choice of inference backend (Ollama, vLLM, or cloud providers such as OpenAI and Bedrock). The starter distribution covers this use case.
Hardware-specific: Optimized for specific hardware. Examples include the nvidia and dell distributions, which ship vendor-specific provider configurations.
Custom Distributions
You can create your own distribution by writing a run config YAML:
version: 2
distro_name: my-distro
providers:
  inference:
  - provider_id: vllm
    provider_type: remote::vllm
    config:
      base_url: http://my-gpu-server:8000/v1
  vector_io:
  - provider_id: pgvector
    provider_type: remote::pgvector
    config:
      host: my-postgres-server
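The run config maps each API (inference, vector_io, ...) to a list of provider entries, each carrying a provider_id, a provider_type, and a config block. A minimal sanity check over that shape can be sketched as follows; the config is expressed as a Python dict for illustration, and the validation logic is an assumption, not Llama Stack's actual loader.

```python
# The custom run config from above, expressed as a dict for illustration.
run_config = {
    "version": 2,
    "distro_name": "my-distro",
    "providers": {
        "inference": [
            {"provider_id": "vllm",
             "provider_type": "remote::vllm",
             "config": {"base_url": "http://my-gpu-server:8000/v1"}},
        ],
        "vector_io": [
            {"provider_id": "pgvector",
             "provider_type": "remote::pgvector",
             "config": {"host": "my-postgres-server"}},
        ],
    },
}

def check(cfg):
    """Sketch of a structural check: every provider entry needs these keys."""
    assert "providers" in cfg, "run config must declare providers"
    for api, entries in cfg["providers"].items():
        for entry in entries:
            for key in ("provider_id", "provider_type", "config"):
                assert key in entry, f"{api} entry missing {key}"
    return True

print(check(run_config))  # True
```

A check like this catches a missing provider_type or config block before the stack attempts to start the misconfigured provider.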
See Building Distributions and Customizing Run Config for details.