Remote-Hosted Distributions

Remote-hosted distributions are Llama Stack endpoints hosted by partners that you can connect to directly.

| Distribution | Endpoint | Inference Provider |
|---|---|---|
| Together | https://llama-stack.together.ai | `remote::together` |
| Fireworks | https://llamastack-preview.fireworks.ai | `remote::fireworks` |

Connecting

Point any OpenAI-compatible client at the endpoint:

from openai import OpenAI

client = OpenAI(
    base_url="https://llama-stack.together.ai/v1",
    api_key="your-together-api-key",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
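Under the hood there is no special client magic: the endpoint speaks the OpenAI-compatible wire protocol, i.e. a POST to `/v1/chat/completions` with a JSON body and a Bearer token. A minimal standard-library sketch of what goes over the wire (the endpoint URL is the Together one from the table above; the key is a placeholder):

```python
import json
from urllib import request

BASE_URL = "https://llama-stack.together.ai"  # Together-hosted endpoint
API_KEY = "your-together-api-key"             # placeholder; substitute a real key

# OpenAI-compatible chat completion payload
payload = {
    "model": "meta-llama/Llama-3.2-3B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = request.Request(
    f"{BASE_URL}/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Actually sending the request requires a valid key; here we only
# inspect the request that would be made.
print(req.full_url)
print(json.loads(req.data)["model"])
```

Any HTTP client that can produce this request shape will work, which is why the `openai` SDK above connects without modification.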

Or use the CLI:

pip install llama-stack-client
llama-stack-client configure --endpoint https://llama-stack.together.ai
llama-stack-client models list