OpenAI API

Name: OpenAI API API
Brand: OpenAI API
Availability: InStock

✓ Official Vendor SpecAI/MLLanguage Modelsbearer126 EndpointsREST

For Agents

Generate text completions, create embeddings, transcribe audio, and produce images using OpenAI models. Supports function calling, multi-modal inputs, and structured JSON output for agent workflows.

Quickstart

Get started with OpenAI API in minutes using your preferred integration method.

# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
  "jentic": {
    "url": "https://api.jentic.com/mcp",
    "auth": "oauth"
  }
}

# Then ask your agent:
"generate a chat completion with GPT-4o"

# → Jentic returns the GET /events tool with parameter schema, agent executes.

Capabilities

What an agent can do with OpenAI API API.

Generate multi-turn chat completions with function calling and tool use

Create vector embeddings for semantic search and retrieval-augmented generation

Transcribe and translate audio files using Whisper models

Synthesize natural-sounding speech from text with multiple voice options

GET STARTED

Start building with OpenAI API API

Explore with Jentic

View OpenAPI Document

Use Cases

Patterns agents use OpenAI API API for, with concrete tasks.

★ AI Agent Tool Calling via Jentic

AI agents discover and invoke OpenAI chat completions through Jentic's intent-based search, enabling multi-step reasoning with function calling. Agents search for the operation by intent, receive the input schema, and execute completions without manual SDK setup. Supports tool_choice parameters for deterministic function invocation and parallel tool calls for complex agent workflows.

Search Jentic for 'generate a chat completion with function calling', load the /chat/completions schema, and execute a request with model gpt-4o and a weather tool definition

Retrieval-Augmented Generation Pipeline

Build RAG pipelines by generating vector embeddings with the /embeddings endpoint and combining them with chat completions for grounded answers. The text-embedding-3-small model produces 1536-dimensional vectors at low cost, while text-embedding-3-large offers higher accuracy for production retrieval systems. Agents can index thousands of documents and query relevant chunks in a single orchestration flow.

Generate embeddings for 10 document chunks using text-embedding-3-small via POST /embeddings and store the resulting vectors

Audio Transcription and Translation

Transcribe audio files in 57 languages using the Whisper model through the /audio/transcriptions endpoint, or translate non-English audio directly to English text via /audio/translations. Accepts mp3, mp4, mpeg, mpga, m4a, wav, and webm formats up to 25 MB. Returns timestamped segments for subtitle generation or plain text for downstream processing.

Transcribe a 5-minute WAV audio file using POST /audio/transcriptions with model whisper-1 and return timestamped JSON segments

Batch Inference for High-Volume Processing

Process thousands of requests asynchronously at 50% reduced cost using the Batch API. Submit JSONL files containing chat completion, embedding, or moderation requests via POST /batches, then poll for completion. Results are returned as downloadable output files. Ideal for nightly report generation, bulk content moderation, or large-scale embedding jobs that do not require real-time responses.

Upload a JSONL file with 500 chat completion requests via POST /files, create a batch job via POST /batches, and poll GET /batches/{batch_id} until completion

Custom Model Fine-Tuning

Fine-tune GPT-4o-mini or GPT-3.5-turbo on domain-specific training data to improve accuracy for specialized tasks like classification, extraction, or tone matching. Upload training JSONL via /files, launch a fine-tuning job via POST /fine_tuning/jobs, monitor with checkpoints, and deploy the resulting model for inference. Supports validation files and hyperparameter configuration.

Upload a training JSONL file via POST /files with purpose 'fine-tune', then create a fine-tuning job via POST /fine_tuning/jobs with model gpt-4o-mini-2024-07-18

Key Endpoints

126 endpoints — generate text, images, embeddings, and audio with 126 endpoints spanning chat completions, assistants, fine-tuning, batch processing, and moderation.

METHOD

PATH

DESCRIPTION

POST

/chat/completions

Generate a chat completion with optional function calling

POST

/embeddings

Create vector embeddings from input text

POST

/audio/transcriptions

Transcribe audio to text using Whisper

POST

/audio/speech

Generate speech audio from text input

POST

/images/generations

Generate images from text prompts

POST

/fine_tuning/jobs

Create a model fine-tuning job

POST

/batches

Create an asynchronous batch processing job

GET

/models

List all available models

POST

/chat/completions

Generate a chat completion with optional function calling

POST

/embeddings

Create vector embeddings from input text

POST

/audio/transcriptions

Transcribe audio to text using Whisper

POST

/audio/speech

Generate speech audio from text input

POST

/images/generations

Generate images from text prompts

Why though Jentic?

Three things that make agents converge on Jentic-routed access.

Credential isolation

OpenAI Bearer tokens are stored encrypted in the Jentic vault (MAXsystem). Agents receive scoped access tokens — raw API keys (sk-...) never enter the agent's context or logs.

Intent-based discovery

Agents search by intent (e.g., 'generate a chat completion with function calling') and Jentic returns matching OpenAI operations with their full input schemas, including tool definitions and response_format options, so the agent can invoke the right endpoint without parsing SDK docs.

Time to first call

Direct OpenAI integration: 1-3 days for auth setup, error handling, retry logic, and model routing. Through Jentic: under 1 hour — search for the operation, load the schema, execute.

Related APIs

Alternatives and complements available in the Jentic catalogue.

Alternative

Anthropic API

Claude models with 200K context window, focused on safety and instruction-following

Choose Anthropic when you need longer context windows (200K tokens), stronger instruction adherence, or prefer Claude's reasoning style for complex analytical tasks

Alternative

Cohere API

Enterprise-focused LLM API with native RAG support and reranking

Choose Cohere when you need built-in reranking for search results, enterprise data privacy requirements, or Command models fine-tuned for business tasks

Alternative

Mistral AI API

Open-weight models with competitive performance at lower cost

Choose Mistral when you need cost-efficient inference, open-weight model flexibility, or EU data residency for GDPR compliance

Complementary

Pinecone API

Vector database for storing and querying OpenAI embeddings

Use Pinecone alongside OpenAI to store generated embeddings and perform similarity search at scale in RAG pipelines

Complementary

ElevenLabs API

Advanced voice synthesis and cloning beyond OpenAI's TTS capabilities

Use ElevenLabs when you need voice cloning, multilingual voice synthesis with emotion control, or higher-quality audio output than OpenAI TTS

FAQs

Specific to using OpenAI API API through Jentic.

What authentication does the OpenAI API use?

The OpenAI API uses Bearer token authentication. You pass your API key in the Authorization header as 'Bearer sk-...'. Through Jentic, your OpenAI API key is stored encrypted in the MAXsystem vault and agents receive scoped access tokens, so the raw secret key never enters the agent context.

Can I generate structured JSON output with the OpenAI API?

Yes. The POST /chat/completions endpoint supports a response_format parameter set to 'json_object' which constrains the model to output valid JSON. For stricter schemas, use function calling with a JSON Schema definition to guarantee the output matches your expected structure.

What are the rate limits for the OpenAI API?

Rate limits vary by model tier and organization. GPT-4o allows up to 10,000 requests per minute on Tier 5 accounts, while lower tiers start at 500 RPM. Embedding models support higher throughput. The API returns x-ratelimit-remaining and x-ratelimit-reset headers with each response for programmatic tracking.

How do I generate embeddings for a RAG pipeline through Jentic?

Search Jentic for 'create text embeddings' to discover the POST /embeddings operation. Load the schema, then execute with model 'text-embedding-3-small' and your input texts. Jentic returns the operation schema with parameters so your agent can call it directly without browsing OpenAI documentation. Install the SDK with pip install jentic to get started.

What models are available through the OpenAI API?

The GET /models endpoint lists all available models. Current options include GPT-4o and GPT-4o-mini for chat, text-embedding-3-small and text-embedding-3-large for embeddings, whisper-1 for transcription, tts-1 and tts-1-hd for speech synthesis, and dall-e-3 for image generation. Fine-tuned models also appear in this list after training completes.

Can I use function calling to connect GPT to external tools?

Yes. Pass a tools array to POST /chat/completions with function definitions including name, description, and a JSON Schema for parameters. The model returns a tool_calls array when it decides to invoke a function. You execute the function and pass results back in a follow-up message with role 'tool'. Supports parallel tool calls and forced tool_choice.

Is the OpenAI API free to use?

OpenAI offers a free tier with limited credits for new accounts. After that, pricing is pay-per-token: GPT-4o costs $2.50 per million input tokens and $10 per million output tokens. Embedding models are significantly cheaper at $0.02 per million tokens for text-embedding-3-small. The Batch API provides a 50% discount for non-real-time workloads.