
Available Distributions

| Distribution | Use Case | Inference |
|---|---|---|
| starter | General purpose, prototyping, production | Ollama, OpenAI, vLLM, Bedrock, and more |
| nvidia | NVIDIA NeMo Microservices | NVIDIA NIM |
| Custom | Your own provider mix | Any supported provider |

Starter

The starter distribution works for most use cases. It includes all providers and auto-enables them based on available environment variables:

uv run llama stack run starter

It supports local inference (Ollama), cloud providers (OpenAI, Bedrock, Azure, etc.), and everything in between. See the Starter Guide for details.
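As a sketch of the auto-enable behavior: exporting a provider's credentials before launch is enough to turn that provider on. The variable names below are illustrative assumptions; check the Starter Guide for the exact names your providers use.

```shell
# Enable OpenAI-backed inference by exporting an API key before
# starting the server (variable name assumed; see the Starter Guide).
export OPENAI_API_KEY=sk-...

# Point at a locally running Ollama instance (variable name assumed).
export OLLAMA_URL=http://localhost:11434

# Providers with no matching environment variables stay disabled.
uv run llama stack run starter
```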

NVIDIA

Optimized for NVIDIA NeMo Microservices. See the NVIDIA Guide.

Passthrough

A minimal distribution that forwards requests to a remote Llama Stack server. See the Passthrough Guide.

Remote-Hosted

Some partners host Llama Stack endpoints that you can connect to directly. See Remote-Hosted Distributions for available options.

Custom

Build your own distribution when you need a specific provider mix. See Building Custom Distributions.
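As a rough sketch, the Llama Stack CLI includes a build command for assembling a distribution; the exact prompts and flags may differ from what is shown here, so treat this as an assumption and follow Building Custom Distributions for the real interface.

```shell
# Launch the distribution builder, which walks through choosing
# a name and a provider for each API (exact prompts may differ).
uv run llama stack build
```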