For Agents
Run OCR on documents and analyse images with vision features through two focused endpoints.
Get started with AIMLAPI in minutes using your preferred integration method.
# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
  "jentic": {
    "url": "https://api.jentic.com/mcp",
    "auth": "oauth"
  }
}
# Then ask your agent:
"extract text from a document image"
# → Jentic returns the POST /ocr tool with parameter schema, agent executes.
What an agent can do with the AIMLAPI API.
Extract text from a document image with POST /ocr
Analyse an image with vision features through POST /vision
Run OCR-and-vision pipelines without adopting a full LLM provider
Pair structured OCR output with vision analysis on the same input image
Authenticate every call via the apiKeyAuth scheme so the integration stays simple
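As a rough illustration of the list above, the sketch below builds (but does not send) a POST request for /ocr or /vision. The source names only the paths and the apiKeyAuth scheme, so the base URL, the header name carrying the key, and the JSON body field are all placeholders chosen for this example; check the published spec for the real values.

```python
import json
import urllib.request

# Assumptions, not taken from the source: the host, the header that carries
# the apiKeyAuth key, and the request body field are hypothetical.
BASE_URL = "https://aimlapi.example.invalid"
API_KEY_HEADER = "X-Api-Key"


def build_request(path: str, image_url: str, api_key: str) -> urllib.request.Request:
    """Build, without sending, a POST request for /ocr or /vision."""
    if path not in ("/ocr", "/vision"):
        raise ValueError(f"unsupported path: {path}")
    # Hypothetical body shape: a single field pointing at the image.
    body = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=body,
        method="POST",
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
    )


req = build_request("/ocr", "https://example.com/invoice.png", "my-key")
print(req.get_method(), req.full_url)
```

When the call is routed through Jentic, the key is injected from the vault, so agent code would not construct this header itself.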
Use for: "I need to extract text from a scanned document"; "Analyse what is in this image and describe it"; "Run OCR on an invoice and return the line items"; "Detect what an uploaded photo shows".
Not supported: chat, embeddings, image generation, or audio. Use this API for OCR and vision-feature image analysis only.
Jentic publishes the only available OpenAPI specification for AIMLAPI, keeping it validated and agent-ready.
AIMLAPI exposes two purpose-built endpoints for document and image understanding: /ocr extracts text from a document image, and /vision analyses an image with vision features. Authentication uses an apiKey scheme (apiKeyAuth). The surface is intentionally narrow, suited to integrations that need OCR and vision analysis without adopting a full multi-modal LLM stack.
Patterns agents use AIMLAPI API for, with concrete tasks.
★ Invoice OCR Pipeline
Extract text from inbound invoice PDFs and images by sending them to /ocr. The response provides the extracted text, which can then be parsed into supplier, totals, and line items. The endpoint is purpose-built for OCR, so the integration code stays small and predictable.
POST each invoice image to /ocr and parse the returned text into a structured invoice record
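The parsing step might look something like the sketch below. It assumes the text returned by /ocr is plain lines, that the first non-empty line is the supplier, and that item and total lines end in an amount with two decimals. Those layout rules and the Invoice type are invented for this example; real invoices will need more robust handling.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A label followed by whitespace and an amount such as 20.00 (assumed layout).
AMOUNT_LINE = re.compile(r"^(?P<label>.+?)\s+(?P<amount>\d+\.\d{2})$")


@dataclass
class Invoice:
    supplier: str = ""
    total: Optional[float] = None
    line_items: List[Tuple[str, float]] = field(default_factory=list)


def parse_invoice(text: str) -> Invoice:
    """Turn OCR text into a structured record using the assumed layout."""
    invoice = Invoice()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not invoice.supplier:
            invoice.supplier = line  # assumption: first non-empty line
            continue
        match = AMOUNT_LINE.match(line)
        if match is None:
            continue  # ignore lines that do not end in an amount
        label, amount = match["label"], float(match["amount"])
        if label.lower().startswith("total"):
            invoice.total = amount
        else:
            invoice.line_items.append((label, amount))
    return invoice


sample_text = "Acme Supplies Ltd\nWidget A   20.00\nWidget B   5.50\nTotal      25.50\n"
invoice = parse_invoice(sample_text)
print(invoice)
```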
Image Understanding for Content Pipelines
Analyse images uploaded to a CMS or content moderation pipeline using /vision. The endpoint returns features about the image that downstream code can use for tagging, captioning, or moderation decisions. Combined with /ocr, the same vendor handles both textual and visual inputs.
POST a CMS upload to /vision and store the returned features alongside the asset for later search and moderation
OCR Plus Vision for Mixed Inputs
Some inputs need both OCR and vision analysis — for example a product photo with a serial number embedded. Run /ocr to extract the serial and /vision to describe the product, then merge the results into a single record.
POST the same image to /ocr and /vision in parallel, then merge serial number with vision features into one record
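One way to sketch the parallel call and merge is with a small thread pool. The two callables stand in for whatever client code actually POSTs to /ocr and /vision, and the response shapes returned by the stubs are made up for the example, since the source does not describe the response schema.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

ImageCall = Callable[[bytes], Dict[str, Any]]


def analyse_both(image: bytes, call_ocr: ImageCall, call_vision: ImageCall) -> Dict[str, Any]:
    """Run both calls concurrently and merge the results into one record."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        ocr_future = pool.submit(call_ocr, image)
        vision_future = pool.submit(call_vision, image)
        # result() re-raises any exception from the worker thread.
        ocr_result = ocr_future.result()
        vision_result = vision_future.result()
    return {"ocr": ocr_result, "vision": vision_result}


# Stand-in callables with hypothetical response shapes.
def fake_ocr(image: bytes) -> Dict[str, Any]:
    return {"text": "SN-12345"}


def fake_vision(image: bytes) -> Dict[str, Any]:
    return {"features": ["camera", "black"]}


record = analyse_both(b"image-bytes", fake_ocr, fake_vision)
print(record)
```

Keeping the two results under separate keys avoids collisions if both responses happen to use the same field names.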
Agent-Driven Document and Image Triage via Jentic
An agent receives an upload, decides whether it is a document or a photo, and calls /ocr or /vision accordingly. Through Jentic the agent searches for the right operation, the AIMLAPI key is supplied from the vault, and the agent never sees the raw key.
Use Jentic search 'extract text from a document image', execute /ocr, and route non-document images to /vision instead
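The routing decision could be as simple as the sketch below. The extension list is a deliberately crude, hypothetical heuristic; an agent might instead inspect the image content or ask the user, and a scanned invoice saved as a JPEG would be misrouted by this rule.

```python
from pathlib import Path

# Assumption: these extensions usually indicate a document rather than a photo.
DOCUMENT_SUFFIXES = {".pdf", ".tif", ".tiff"}


def choose_endpoint(filename: str) -> str:
    """Return the path to call for an upload: /ocr for documents, else /vision."""
    suffix = Path(filename).suffix.lower()
    return "/ocr" if suffix in DOCUMENT_SUFFIXES else "/vision"


for name in ("invoice.PDF", "holiday.jpg"):
    print(name, "->", choose_endpoint(name))
```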
2 endpoints
METHOD  PATH     DESCRIPTION
POST    /ocr     Perform OCR on a document
POST    /vision  Analyse an image with vision features
Three things that make agents converge on Jentic-routed access.
Credential isolation
The AIMLAPI apiKeyAuth header value is stored encrypted in the Jentic vault. Agents call /ocr and /vision through Jentic without ever seeing the raw key.
Intent-based discovery
Agents search 'extract text from a document image' or 'analyse an image with vision features' and Jentic returns the matching AIMLAPI operation, ready to call with the input image.
Time to first call
Direct integration: a few hours for key handling and request shaping. Through Jentic: minutes — search, load, execute.
Alternatives and complements available in the Jentic catalogue.
AI/ML API
Broader AIMLAPI surface covering chat, embeddings, image generation, and audio
Use the aiml-api spec when you need chat, embeddings, image generation, or audio in addition to OCR and vision.
Cloudmersive OCR API
Dedicated OCR provider with broader format support
Pick Cloudmersive when OCR accuracy on specific formats matters more than having vision and OCR under the same vendor.
Clarifai API
Vision platform with classification and custom model training
Choose Clarifai when you need richer vision capabilities or custom-trained image models.
Specific to using AIMLAPI API through Jentic.
Why is there no official OpenAPI spec for AIMLAPI?
AIMLAPI does not publish an OpenAPI specification. Jentic generates and maintains this spec so that AI agents and developers can call AIMLAPI via structured tooling. It is validated against the live API and kept up to date. Get started at https://app.jentic.com/sign-up.
What authentication does the AIMLAPI use?
An apiKey scheme called apiKeyAuth, sent on each request. Through Jentic the key is held in the vault and never enters the agent's context.
Can I run OCR on a document image?
Yes. POST the document image to /ocr and the API returns the extracted text. For richer parsing, run a structured-data extractor over the returned text on your side.
What are the rate limits for the AIMLAPI?
The OpenAPI spec does not declare explicit rate limits. The endpoint surface is narrow, so the practical limit is the provider's per-key quota — respect HTTP 429 responses and back off accordingly.
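Backing off on HTTP 429 can be sketched as a small retry wrapper with exponential delays. The send callable is a stand-in for the real request, and the attempt count and delays are illustrative defaults rather than values published by the provider.

```python
import time
from typing import Callable, List, Tuple

Response = Tuple[int, bytes]  # (HTTP status, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry while the status is 429, doubling the wait after each attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate limited after the final attempt


# Demonstration with a fake sender and a recording sleep, so nothing waits.
responses = iter([(429, b""), (429, b""), (200, b"ok")])
delays: List[float] = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

If the provider sends a Retry-After header, honouring that value is generally preferable to a fixed schedule.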
How do I run image vision analysis through Jentic?
Run pip install jentic, search 'analyse an image with vision features', execute /vision with the image, and consume the returned analysis. Sign up at https://app.jentic.com/sign-up.
How does this differ from the broader AI/ML API spec?
This spec exposes the focused /ocr and /vision endpoints. The companion AI/ML API spec covers chat completions, embeddings, image generation, and audio. Pick this one for OCR and vision only.