Cloud Vision API

✓ Official Vendor Spec · AI/ML · Vision · oauth2 · 23 Endpoints · REST

For Agents

Run OCR, label detection, face detection, landmark recognition, and explicit content checks on images and PDFs so an agent can extract structured data from visual content.

Quickstart

Get started with Cloud Vision API in minutes using your preferred integration method.

# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
  "jentic": {
    "url": "https://api.jentic.com/mcp",
    "auth": "oauth"
  }
}

# Then ask your agent:
"extract text from an image with OCR"

# → Jentic returns the POST /v1/images:annotate tool with parameter schema, agent executes.

Capabilities

What an agent can do with the Cloud Vision API.

Extract typed and handwritten text from images and PDFs with full OCR layout

Detect object labels and bounding boxes for inventory and content tagging

Recognize landmarks, logos, and well-known products in user-supplied images

Score images for adult, violent, racy, medical, and spoof content via SafeSearch

Use Cases

Patterns agents use the Cloud Vision API for, with concrete tasks.

★ Document Digitization and OCR

Operations and back-office teams convert scanned invoices, contracts, and forms into searchable text using the Vision API's DOCUMENT_TEXT_DETECTION feature. The API returns the full hierarchical layout of pages, blocks, paragraphs, words, and symbols with confidence scores, which the digitization pipeline maps into structured records. Asynchronous batch endpoints handle large multi-page PDFs without blocking caller threads.

Submit a files:asyncBatchAnnotate with DOCUMENT_TEXT_DETECTION on a 200-page PDF in gs://invoices/q3.pdf
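A minimal sketch of the request body for that task, using only the Python standard library. The field names follow the public Vision REST request shape; the output prefix gs://invoices/ocr-output/ and the batch size of 20 pages per output file are assumptions chosen for illustration, not values from this page.

```python
import json


def build_async_pdf_ocr_request(source_uri, output_prefix, pages_per_file=20):
    """Build the JSON body for POST /v1/files:asyncBatchAnnotate."""
    return {
        "requests": [
            {
                # The source PDF must already be in Cloud Storage.
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": "application/pdf",
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                # Results are written as JSON files under this prefix.
                "outputConfig": {
                    "gcsDestination": {"uri": output_prefix},
                    "batchSize": pages_per_file,
                },
            }
        ]
    }


# The output prefix is a hypothetical location, not one named in the source.
body = build_async_pdf_ocr_request(
    "gs://invoices/q3.pdf", "gs://invoices/ocr-output/"
)
print(json.dumps(body, indent=2))
```

The body is built separately from the HTTP call so it can be inspected or logged before anything is submitted.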

User-Generated Image Moderation

Marketplaces and social platforms screen every user-uploaded photo with SafeSearch and label detection before publishing. The images:annotate endpoint returns likelihood ratings for adult, violent, racy, medical, and spoof content alongside detected labels, letting the moderation pipeline auto-block clear violations and route ambiguous cases to human review. Synchronous mode keeps response times suitable for upload flows.

Annotate an image with SAFE_SEARCH_DETECTION and reject if adult or violent likelihood is LIKELY or VERY_LIKELY
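The rejection rule in that task can be sketched as a small check over the response. The sample response below is hand-written for illustration and is not real API output; note that the response field is named violence rather than violent.

```python
# Likelihood values that should block publication outright.
BLOCKING_LEVELS = {"LIKELY", "VERY_LIKELY"}


def should_reject(annotate_response):
    """Return True when adult or violence likelihood is LIKELY or VERY_LIKELY."""
    first = annotate_response["responses"][0]
    annotation = first.get("safeSearchAnnotation", {})
    return any(
        annotation.get(field) in BLOCKING_LEVELS
        for field in ("adult", "violence")
    )


# Hand-written sample shaped like an images:annotate response.
sample_response = {
    "responses": [
        {
            "safeSearchAnnotation": {
                "adult": "VERY_UNLIKELY",
                "spoof": "UNLIKELY",
                "medical": "UNLIKELY",
                "violence": "LIKELY",
                "racy": "POSSIBLE",
            }
        }
    ]
}
print(should_reject(sample_response))
```

A POSSIBLE rating is deliberately left out of the blocking set so that ambiguous images can go to human review instead.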

Retail Visual Search

Retailers index their product catalog into a Vision Product Search corpus, then accept query images at runtime to return visually similar SKUs ranked by similarity. The productSearch annotate path matches against the configured product set and returns matching product IDs with bounding boxes for each detected object in the query image. This powers in-app 'find similar products' features without training a custom model.

Annotate a customer photo with PRODUCT_SEARCH and return the top three matching SKUs from the home goods product set
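A sketch of how that task might look as a request body plus a top-three selection over the response. The product set resource name and image URI are hypothetical placeholders, and the sample results are invented purely to exercise the ranking; homegoods-v2 is one of the product categories the Product Search feature accepts.

```python
def build_product_search_request(image_uri, product_set, category="homegoods-v2"):
    """Build an images:annotate body that runs PRODUCT_SEARCH against one product set."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "PRODUCT_SEARCH"}],
                "imageContext": {
                    "productSearchParams": {
                        "productSet": product_set,
                        "productCategories": [category],
                    }
                },
            }
        ]
    }


def top_matches(annotate_response, limit=3):
    """Return (product name, score) pairs for the best-scoring results."""
    first = annotate_response["responses"][0]
    results = first.get("productSearchResults", {}).get("results", [])
    ranked = sorted(results, key=lambda r: r.get("score", 0.0), reverse=True)
    return [(r["product"]["name"], r.get("score", 0.0)) for r in ranked[:limit]]


# Hypothetical resource names, for illustration only.
request_body = build_product_search_request(
    "gs://example-bucket/customer-photo.jpg",
    "projects/example-project/locations/us-west1/productSets/home-goods",
)

# Invented sample results shaped like a productSearchResults payload.
sample = {
    "responses": [
        {
            "productSearchResults": {
                "results": [
                    {"product": {"name": "sku-lamp"}, "score": 0.62},
                    {"product": {"name": "sku-vase"}, "score": 0.91},
                    {"product": {"name": "sku-rug"}, "score": 0.40},
                    {"product": {"name": "sku-chair"}, "score": 0.77},
                ]
            }
        }
    ]
}
print(top_matches(sample))
```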

AI Agent Image Understanding

An AI agent integrated through Jentic answers prompts like 'what is in this photo?' or 'is this image safe to publish?' by discovering the Vision API through intent search, calling images:annotate with the relevant feature set, and returning the structured response. Because the API uses OAuth 2.0 with the cloud-platform scope, Jentic isolates the token in the MAXsystem vault and exposes only a scoped reference.

Search Jentic for 'analyze an image', load the schema, and call images:annotate with LABEL_DETECTION and SAFE_SEARCH_DETECTION
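The final step of that task, the images:annotate call itself, might carry a body like the one sketched here. The Jentic search and schema-loading steps are not shown because this page does not document their call signatures; the image URI is a hypothetical placeholder and the sample response is hand-written.

```python
def build_annotate_request(image_uri, feature_types):
    """Build an images:annotate body requesting several features for one image."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": name} for name in feature_types],
            }
        ]
    }


def summarize(annotate_response):
    """Reduce the first response to label descriptions plus SafeSearch ratings."""
    first = annotate_response["responses"][0]
    labels = [item["description"] for item in first.get("labelAnnotations", [])]
    return {"labels": labels, "safe_search": first.get("safeSearchAnnotation", {})}


# Hypothetical image location, for illustration only.
annotate_body = build_annotate_request(
    "gs://example-bucket/photo.jpg",
    ["LABEL_DETECTION", "SAFE_SEARCH_DETECTION"],
)

# Hand-written sample shaped like an images:annotate response.
sample_answer = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Dog", "score": 0.97},
                {"description": "Grass", "score": 0.88},
            ],
            "safeSearchAnnotation": {"adult": "VERY_UNLIKELY", "violence": "UNLIKELY"},
        }
    ]
}
print(summarize(sample_answer))
```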

Key Endpoints

23 endpoints. The Cloud Vision API performs image and PDF analysis, including label detection, OCR, face detection, landmark and logo recognition, explicit content (SafeSearch) detection, object localization, and product search.

METHOD  PATH                           DESCRIPTION
POST    /v1/images:annotate            Synchronous batch annotation of one or more images
POST    /v1/images:asyncBatchAnnotate  Asynchronous batch annotation for large image sets
POST    /v1/files:annotate             Synchronous annotation of multi-page PDF and TIFF files
POST    /v1/files:asyncBatchAnnotate   Asynchronous annotation of large PDF and TIFF documents in Cloud Storage

Why through Jentic?

Three things that make agents converge on Jentic-routed access.

Credential isolation

Cloud Vision OAuth tokens are stored encrypted in the Jentic vault (MAXsystem). Agents receive scoped access tokens — raw OAuth tokens never enter the agent's context, which matters because the cloud-platform scope grants broad project-level access.

Intent-based discovery

Agents search Jentic with intents like 'extract text from an image' or 'detect labels' and Jentic returns the images:annotate operation with its full input schema, including the feature enum and image source options, so the agent can construct a valid request without reading Google's discovery doc.

Time to first call

Direct Cloud Vision integration: 2-3 days for OAuth setup, request batching, and feature-by-feature response parsing. Through Jentic: under 1 hour — search, load schema, execute.

Related APIs

Alternatives and complements available in the Jentic catalogue.

Alternative

Cloud Video Intelligence API

Per-frame and per-shot annotation for video; Vision is the still-image equivalent

Choose Video Intelligence when the input is a video file; choose Vision when the input is an image or PDF page.

Complementary

Cloud Translation API

Translate text extracted by Vision OCR into other languages

Choose Translation when an agent needs to localize OCR output that Vision returns.

Complementary

Cloud Storage API

Stage images and PDFs in a bucket so Vision can read them by URI

Choose Cloud Storage to upload and manage the source files; Vision reads them via gs:// URIs.

Complementary

Sensitive Data Protection (DLP) API

Scan OCR output for PII and PHI before storing or surfacing it

Choose DLP when an agent needs to redact sensitive content discovered in Vision OCR results.

FAQs

Specific to using the Cloud Vision API through Jentic.

What authentication does the Cloud Vision API use?

The Cloud Vision API uses OAuth 2.0 with either the https://www.googleapis.com/auth/cloud-platform scope or the narrower https://www.googleapis.com/auth/cloud-vision scope. Through Jentic, the OAuth token is stored encrypted in the MAXsystem vault and only a scoped reference is exposed to the agent at execution time.
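For comparison, a direct call that bypasses Jentic has to attach the OAuth access token as a bearer header itself. The sketch below only constructs the HTTP request object and never sends it; EXAMPLE_TOKEN is a placeholder, and obtaining a real token is outside the scope of this example.

```python
import json
import urllib.request

VISION_ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def make_annotate_http_request(access_token, body):
    """Build (but do not send) an authenticated POST to images:annotate."""
    return urllib.request.Request(
        VISION_ANNOTATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder token and an empty request list, for illustration only.
http_request = make_annotate_http_request("EXAMPLE_TOKEN", {"requests": []})
print(http_request.get_method(), http_request.full_url)
```

Passing the result to urllib.request.urlopen would perform the call; that step is omitted so the example runs without credentials.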

Can I run OCR on a multi-page PDF with the Cloud Vision API?

Yes. Use POST /v1/files:asyncBatchAnnotate with DOCUMENT_TEXT_DETECTION to OCR a PDF or TIFF stored in Cloud Storage. The async endpoint returns an operation name; the final annotation result is written to a Cloud Storage destination you specify in the request.
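The returned operation follows the standard Google long-running operation shape, so its state can be read from the done and error fields. A minimal sketch, assuming the operation has already been fetched as a dict; the operation name shown is a hypothetical example.

```python
def operation_status_url(operation_name):
    """URL to poll with GET for the current state of a long-running operation."""
    return f"https://vision.googleapis.com/v1/{operation_name}"


def operation_state(operation):
    """Classify an operation dict as 'running', 'failed', or 'succeeded'."""
    if not operation.get("done", False):
        return "running"
    if "error" in operation:
        return "failed"
    return "succeeded"


# Hypothetical operation name, for illustration only.
name = "projects/example-project/operations/abc123"
print(operation_status_url(name))
print(operation_state({"name": name}))
print(operation_state({"name": name, "done": True, "response": {}}))
```

Once the state is 'succeeded', the OCR results are read from the Cloud Storage destination given in the original request rather than from the operation itself.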

What are the rate limits for the Cloud Vision API?

Default project quotas allow 1,800 requests per minute and 16 images per request, with feature-specific image-size and PDF-page caps. Higher quotas can be requested in the Google Cloud Console; pricing is per feature per image.
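To stay within the 16-images-per-request cap, a caller can split a larger image list into batches before building requests. A minimal sketch; the gs:// URIs are hypothetical.

```python
MAX_IMAGES_PER_REQUEST = 16


def chunk_images(image_uris, max_per_request=MAX_IMAGES_PER_REQUEST):
    """Yield consecutive slices of image_uris, each small enough for one request."""
    for start in range(0, len(image_uris), max_per_request):
        yield image_uris[start:start + max_per_request]


# Forty hypothetical image locations.
uris = [f"gs://example-bucket/img-{i}.jpg" for i in range(40)]
batch_sizes = [len(batch) for batch in chunk_images(uris)]
print(batch_sizes)
```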

How do I detect explicit content in an image through Jentic with the Cloud Vision API?

Install Jentic with pip install jentic, search for 'detect explicit content in image', load the schema for POST /v1/images:annotate, then call it with features set to SAFE_SEARCH_DETECTION and the image source as either a Cloud Storage URI or base64 content. The response includes adult, violent, racy, medical, and spoof likelihood values.

Does the Cloud Vision API support handwriting recognition?

Yes. Use DOCUMENT_TEXT_DETECTION rather than the simpler TEXT_DETECTION feature; DOCUMENT_TEXT_DETECTION is tuned for dense text and handwriting and returns full document layout. Accuracy depends on legibility, contrast, and language.
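Because handwriting accuracy varies, it can help to flag low-confidence words for review. The sketch below walks the pages, blocks, paragraphs, words, and symbols of a fullTextAnnotation; the 0.8 threshold is an arbitrary assumption, and the sample annotation is hand-written rather than real API output.

```python
def low_confidence_words(full_text_annotation, threshold=0.8):
    """Return (word text, confidence) pairs for words scoring below threshold."""
    flagged = []
    for page in full_text_annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is the concatenation of its symbols.
                    text = "".join(
                        symbol.get("text", "") for symbol in word.get("symbols", [])
                    )
                    confidence = word.get("confidence", 0.0)
                    if confidence < threshold:
                        flagged.append((text, confidence))
    return flagged


# Hand-written sample with one confident word and one doubtful word.
sample_annotation = {
    "pages": [
        {
            "blocks": [
                {
                    "paragraphs": [
                        {
                            "words": [
                                {
                                    "confidence": 0.98,
                                    "symbols": [{"text": c} for c in "Total"],
                                },
                                {
                                    "confidence": 0.41,
                                    "symbols": [{"text": c} for c in "4l7"],
                                },
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
print(low_confidence_words(sample_annotation))
```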

Why does my Vision API request return INVALID_ARGUMENT for an inline image?

Inline image content must be valid base64-encoded bytes under the 10 MB request limit, and the image format must be one of JPEG, PNG, GIF, BMP, WEBP, RAW, ICO, PDF, or TIFF. For larger files, upload to Cloud Storage and pass the gs:// URI in image.source.imageUri instead.
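A sketch of choosing between inline content and a Cloud Storage URI when building the image field. Comparing the base64 string length against 10 MB is a conservative proxy assumed here, since the limit applies to the whole request; the helper name is hypothetical.

```python
import base64

MAX_REQUEST_BYTES = 10 * 1024 * 1024  # 10 MB request limit


def image_field(image_bytes=None, gcs_uri=None):
    """Return the 'image' object for an annotate request.

    Prefers a gs:// URI when given; otherwise base64-encodes the bytes and
    refuses payloads that would exceed the request limit on their own.
    """
    if gcs_uri is not None:
        return {"source": {"imageUri": gcs_uri}}
    if image_bytes is None:
        raise ValueError("provide image_bytes or gcs_uri")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) >= MAX_REQUEST_BYTES:
        raise ValueError("image too large to send inline; upload to Cloud Storage")
    return {"content": encoded}


print(image_field(image_bytes=b"abc"))
print(image_field(gcs_uri="gs://example-bucket/large-scan.tiff"))
```

Base64 encoding inflates the payload by about a third, so raw files well under 10 MB can still exceed the limit once encoded.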