# API Providers
OGX composes 23 inference providers, 15 vector stores, 6 tool runtimes, and 3 file storage options into a single deployable server. No other open-source project covers this surface area in one process.
Providers come in two types:
- Remote: adapts an external service (Ollama, OpenAI, vLLM, Bedrock, etc.)
- Inline: runs in-process within OGX (FAISS, sentence-transformers, file-search, etc.)
At least one inline provider exists for each API so you can run a fully featured stack locally without any external dependencies.
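As a sketch of what an all-inline stack could look like, here is a hypothetical run configuration. The field names and `inline::` provider ids are illustrative assumptions, not OGX's actual schema:

```yaml
# Hypothetical all-local stack: every provider is inline, so the server
# runs with no external service dependencies. Field names are illustrative.
providers:
  inference:
    - provider_type: inline::sentence-transformers   # in-process embeddings
  vector_io:
    - provider_type: inline::faiss                   # in-process vector index
  tool_runtime:
    - provider_type: inline::file-search             # local document search
  files:
    - provider_type: inline::localfs                 # local filesystem storage
```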
## Provider categories

- Inference: Ollama, vLLM, OpenAI, Bedrock, Anthropic, Gemini, WatsonX, and more
- Vector IO: FAISS, SQLite-Vec, ChromaDB, Qdrant, Milvus, PGVector, Weaviate
- Tool Runtime: File Search, Brave Search, Tavily, MCP, Wolfram Alpha
- Files: Local filesystem and S3 storage backends
- DatasetIO: Local filesystem and HuggingFace dataset loading
- External: Build your own provider and integrate it with OGX
## Reference
| Category | Count | Examples | Docs |
|---|---|---|---|
| Inference | 23 | Ollama, vLLM, OpenAI, Bedrock, Anthropic, Gemini, WatsonX, and more | Inference Providers→ |
| Vector IO | 15 | FAISS, ChromaDB, Qdrant, Milvus, PGVector, Weaviate, Elasticsearch | Vector IO Providers→ |
| Tool Runtime | 6 | File Search, Brave Search, Tavily, MCP, Wolfram Alpha | Tool Runtime Providers→ |
| Files | 3 | Local filesystem, S3, OpenAI Files | Files Providers→ |
| DatasetIO | 2 | Local filesystem, HuggingFace | DatasetIO Providers→ |
| External | — | Build your own provider | External Providers Guide→ |
## Why this matters
Gateways like LiteLLM route inference requests across multiple providers. That's useful, but inference routing covers only one API.
OGX composes the full application: an agent running on the Responses API can call a model via any inference provider, search documents in any vector store, invoke tools via MCP, and stream the result back. All in one server process, all through OpenAI-compatible endpoints.
The composition is the hard part. Making inference + vector stores + files + tools + agentic orchestration work together correctly across dozens of provider combinations is what takes years, not months.
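As a sketch of what that composition looks like from the client side, here is the wire-level shape of one Responses API call to a local OGX server. The endpoint path follows the OpenAI spec; the port, model name, and vector store id are assumptions for illustration:

```python
import json

# One agent turn against a local OGX server, shown as the raw
# OpenAI-compatible request it produces. The port, model, and vector
# store id below are assumptions, not documented values.
OGX_RESPONSES_URL = "http://localhost:8321/v1/responses"

def build_agent_request(prompt: str) -> bytes:
    """JSON body for one turn: a model served by any inference provider,
    plus a file_search tool backed by any vector store provider."""
    return json.dumps({
        "model": "ollama/llama3.2:3b",
        "input": prompt,
        "tools": [
            {"type": "file_search", "vector_store_ids": ["vs_docs"]},
        ],
    }).encode()

body = build_agent_request("Summarize the design docs.")
```

POSTing that body to the server (e.g. with `urllib.request`) streams the agent's answer back through the same OpenAI-shaped endpoint.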
## OpenAI Compatibility
OGX is OpenAI-first. The primary API surface implements the OpenAI spec. An Anthropic Messages adapter (/v1/messages) translates to the inference API for teams that use the Anthropic SDK.
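As a sketch, this is the Anthropic Messages-shaped body such a request would carry on the wire; the port and model name are assumptions for illustration:

```python
import json

# Anthropic Messages-shaped request aimed at OGX's /v1/messages adapter,
# which translates it to the inference API. Port and model name are
# assumptions, not documented values.
OGX_MESSAGES_URL = "http://localhost:8321/v1/messages"

def build_messages_request(prompt: str) -> bytes:
    """JSON body in the Anthropic Messages format: model, max_tokens,
    and a list of role/content messages."""
    return json.dumps({
        "model": "ollama/llama3.2:3b",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

body = build_messages_request("Hello!")
```

In practice, a team already on the Anthropic SDK would simply point its `base_url` at the OGX server rather than building requests by hand.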
See the OpenAI compatibility guide and known Responses API limitations.