API Providers

OGX composes 23 inference providers, 15 vector stores, 6 tool runtimes, and 3 file storage options into a single deployable server. No other open-source project covers this surface area in one process.

Providers come in two types:

  • Remote: adapts an external service (Ollama, OpenAI, vLLM, Bedrock, etc.)
  • Inline: runs in-process within OGX (FAISS, sentence-transformers, file-search, etc.)
Info: At least one inline provider exists for each API, so you can run a fully featured stack locally without any external dependencies.
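For example, a local stack is reachable with the standard OpenAI client. A minimal sketch, assuming a locally running OGX server; the port and model id below are placeholders, not documented values:

```python
# A minimal sketch, not documented configuration: the port (8321) and the
# model id are placeholders for whatever your local OGX server exposes.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1",  # hypothetical local OGX endpoint
    api_key="none",  # fully local stack: no external API key required
)

resp = client.chat.completions.create(
    model="ollama/llama3.2",  # placeholder: any model a configured provider serves
    messages=[{"role": "user", "content": "Hello from a fully local stack."}],
)
print(resp.choices[0].message.content)
```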

Provider categories

| Category     | Count | Examples                                                            |
|--------------|-------|---------------------------------------------------------------------|
| Inference    | 23    | Ollama, vLLM, OpenAI, Bedrock, Anthropic, Gemini, WatsonX, and more |
| Vector IO    | 15    | FAISS, ChromaDB, Qdrant, Milvus, PGVector, Weaviate, Elasticsearch  |
| Tool Runtime | 6     | File Search, Brave Search, Tavily, MCP, Wolfram Alpha               |
| Files        | 3     | Local filesystem, S3, OpenAI Files                                  |
| DatasetIO    | 2     | Local filesystem, HuggingFace                                       |
| External     | n/a   | Build your own provider                                             |

Why this matters

Gateways like LiteLLM route inference requests to multiple providers. That's useful, but inference routing is just one API.

OGX composes the full application: an agent running on the Responses API can call a model via any inference provider, search documents in any vector store, invoke tools via MCP, and stream the result back. All in one server process, all through OpenAI-compatible endpoints.

The composition is the hard part. Making inference + vector stores + files + tools + agentic orchestration work together correctly across dozens of provider combinations is what takes years, not months.
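A minimal sketch of that composition from the client side, assuming OGX's Responses endpoint mirrors OpenAI's parameters; the base URL, model id, and vector store id are placeholders, and feature coverage is subject to the Responses API limitations linked below:

```python
# A minimal sketch of the composition above, assuming OGX's Responses
# endpoint mirrors OpenAI's parameters. The base URL, model id, and vector
# store id are placeholders, not documented values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.responses.create(
    model="ollama/llama3.2",  # served by any configured inference provider
    input="Summarize our onboarding docs.",
    tools=[{
        "type": "file_search",               # backed by any configured vector store
        "vector_store_ids": ["vs_example"],  # placeholder vector store id
    }],
    stream=True,  # stream the result back through the same endpoint
)

for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
```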

OpenAI Compatibility

OGX is OpenAI-first. The primary API surface implements the OpenAI spec. An Anthropic Messages adapter (/v1/messages) translates to the inference API for teams that use the Anthropic SDK.
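A minimal sketch, assuming the adapter accepts standard Messages API parameters; the base URL and model id are placeholders:

```python
# A minimal sketch, assuming the adapter accepts standard Messages API
# parameters. The base URL and model id are placeholders.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8321",  # hypothetical OGX address; the adapter serves /v1/messages
    api_key="none",  # local stack: no Anthropic key required
)

message = client.messages.create(
    model="ollama/llama3.2",  # routed through OGX's inference API, not Anthropic's
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello via the Messages adapter."}],
)
print(message.content[0].text)
```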

See the OpenAI compatibility guide and known Responses API limitations.