For Agents
Run chat completions, text completions, embeddings, image generation, audio transcription, and text-to-speech through one OpenAI-style endpoint set, and list supported models.
Get started with AI/ML API in minutes using your preferred integration method.
# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
"jentic": {
"url": "https://api.jentic.com/mcp",
"auth": "oauth"
}
}
# Then ask your agent:
"generate a chat completion"
# → Jentic returns the GET /events tool with parameter schema, agent executes.What an agent can do with AI/ML API API.
Generate chat completions for any supported model via POST /v1/chat/completions
Run plain text completions with /v1/completions for prompt-based generation
Create embedding vectors for retrieval and semantic search through /v1/embeddings
Generate images from a text prompt via /v1/images/generations
GET STARTED
Use for: Generate a chat completion for a multi-turn conversation, Create an embedding vector for a chunk of text, Generate an image from a text prompt, Transcribe a voice memo into text
Not supported: Does not handle model fine-tuning, training, or hosted vector storage — use for inference across chat, embeddings, image, and audio only.
Jentic publishes the only available OpenAPI specification for AI/ML API, keeping it validated and agent-ready.
Jentic publishes the only available OpenAPI specification for AI/ML API, keeping it validated and agent-ready. AI/ML API is a unified inference platform that exposes chat completions, completions, embeddings, image generation, audio transcription, and text-to-speech behind a single Bearer-authenticated REST surface modelled on OpenAI conventions. The /v1/models endpoint lists supported models so a client can switch backends without changing integration code.
Transcribe audio to text using /v1/audio/transcriptions
Synthesize speech from text via /v1/audio/speech
Discover available models programmatically with GET /v1/models
Patterns agents use AI/ML API API for, with concrete tasks.
★ Multi-Model Chat Backend
Power a chat product where users can switch between different LLMs without changing the application code. Call /v1/chat/completions with the desired model and use /v1/models to populate a model selector. Switching providers becomes a config change rather than a refactor.
GET /v1/models to enumerate available models, then POST /v1/chat/completions with the model id chosen by the user
RAG Pipeline with Hosted Embeddings
Build a retrieval-augmented chatbot that produces embeddings on AI/ML API. Generate vectors with /v1/embeddings, store them in a vector database, and use /v1/chat/completions for the final answer. The single Bearer credential covers both calls.
POST each document chunk to /v1/embeddings, store the vectors, retrieve nearest neighbours at query time, then POST /v1/chat/completions with the retrieved context
Voice In, Voice Out Assistant
Implement a voice assistant that listens, reasons, and replies in audio. Use /v1/audio/transcriptions to turn speech into text, /v1/chat/completions to generate a reply, and /v1/audio/speech to produce the audio response. The whole loop sits behind one API key.
POST audio to /v1/audio/transcriptions, send the text to /v1/chat/completions, then POST the reply to /v1/audio/speech
Image Generation from Marketing Copy
Generate hero images directly from marketing copy by passing the brief to /v1/images/generations. The single endpoint returns image URLs that can be used in landing-page generators or design tooling without managing a separate image generation provider.
POST /v1/images/generations with the marketing brief as the prompt and link the returned URL on the page
Agent-Driven Multi-Modal Pipeline via Jentic
An AI agent decides at runtime whether to call chat, embeddings, image, or audio endpoints. Through Jentic the agent searches for the right operation, the AI/ML API Bearer token is supplied from the vault, and the agent never sees the raw token.
Use Jentic search 'generate a chat completion', execute /v1/chat/completions, and chain /v1/embeddings if retrieval is needed
7 endpoints — jentic publishes the only available openapi specification for ai/ml api, keeping it validated and agent-ready.
METHOD
PATH
DESCRIPTION
/v1/chat/completions
Generate a chat completion
/v1/completions
Generate a text completion
/v1/embeddings
Create embedding vectors
/v1/images/generations
Generate an image from a text prompt
/v1/audio/transcriptions
Transcribe audio to text
/v1/audio/speech
Synthesize speech from text
/v1/models
List available models
/v1/chat/completions
Generate a chat completion
/v1/completions
Generate a text completion
/v1/embeddings
Create embedding vectors
/v1/images/generations
Generate an image from a text prompt
/v1/audio/transcriptions
Transcribe audio to text
Three things that make agents converge on Jentic-routed access.
Credential isolation
The AI/ML API Bearer token is stored encrypted in the Jentic vault. Agents call /v1/chat/completions, /v1/embeddings, and the audio and image endpoints through Jentic without ever seeing the raw token.
Intent-based discovery
Agents search by intent like 'generate a chat completion' and Jentic returns the AI/ML API /v1/chat/completions operation with its messages and model schema, ready to call.
Time to first call
Direct integration: half a day for Bearer auth, model listing, and per-endpoint payload shaping. Through Jentic: under 30 minutes.
Alternatives and complements available in the Jentic catalogue.
OpenAI API
OpenAI's first-party LLM, embedding, image, and audio APIs
Choose OpenAI when you specifically want first-party access to OpenAI models with their full feature surface.
Anthropic Messages API
First-party Claude messages API
Pick Anthropic when you specifically need Claude models with their richer system prompt and tool use semantics.
Cohere API
Hosted LLM platform with strong embeddings and reranker models
Use Cohere when you want a single first-party stack and especially when reranking is core to your retrieval pipeline.
Specific to using AI/ML API API through Jentic.
Why is there no official OpenAPI spec for AI/ML API?
AI/ML API does not publish an OpenAPI specification. Jentic generates and maintains this spec so that AI agents and developers can call the AI/ML API via structured tooling. It is validated against the live API and kept up to date. Get started at https://app.jentic.com/sign-up.
What authentication does the AI/ML API use?
HTTP Bearer authentication. The Bearer token is sent in the Authorization header. Through Jentic the token is held in the vault and never enters the agent's context.
Can I run chat completions with multiple models?
Yes. POST /v1/chat/completions with the model id you want to use. Call GET /v1/models to discover which model identifiers are currently supported.
What are the rate limits for the AI/ML API?
The OpenAPI spec does not declare explicit rate limits. As with most LLM gateways, treat HTTP 429 responses as a signal to back off, and watch for per-model concurrency hints in the response body.
How do I generate embeddings through Jentic?
Run pip install jentic, search 'create text embeddings', and execute /v1/embeddings with your input strings. The response returns vectors ready to store in a vector database. Sign up at https://app.jentic.com/sign-up.
Does the API cover both image generation and audio?
Yes. /v1/images/generations produces images from text prompts, /v1/audio/transcriptions converts audio to text, and /v1/audio/speech synthesises speech from text — all behind the same Bearer token.
/v1/audio/speech
Synthesize speech from text
/v1/models
List available models