Cloud TPU API


✓ Official Vendor Spec · Cloud Infrastructure · Compute · oauth2 · 17 Endpoints · REST

For Agents

Provision, list, reset, and tear down Cloud TPU nodes and queued resources, and discover available accelerator types and runtime versions per zone.

Quickstart

Get started with Cloud TPU API in minutes using your preferred integration method.

# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
  "jentic": {
    "url": "https://api.jentic.com/mcp",
    "auth": "oauth"
  }
}

# Then ask your agent:
"create a cloud tpu node"

# → Jentic returns the POST /v2/{parent}/nodes tool with its parameter schema, and the agent executes it.

Capabilities

What an agent can do with the Cloud TPU API.

Provision TPU VM nodes in a specified zone with a chosen accelerator type

List active nodes, accelerator types, and supported runtime versions per zone

Reset, stop, and start TPU nodes during long-running training jobs

Read guest attributes from a TPU node for debugging and monitoring

Use Cases

Patterns agents use the Cloud TPU API for, with concrete tasks.

★ On-Demand Training Cluster Provisioning

ML platform teams use the Cloud TPU API to spin up a TPU pod slice for a scheduled training run, attach the cluster to their JAX or PyTorch script, and tear it down when training completes. The API is called from a CI pipeline that allocates a v4-128 slice, waits until READY, runs the training script, then deletes the node — keeping spend tightly bound to actual training time rather than leaving an idle slice running.

POST a node-create request via /v2/{parent}/nodes with acceleratorType=v4-128 and runtimeVersion=tpu-vm-v4-base, poll until state=READY, run the training job, then DELETE the node.
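The create-and-wait steps above can be sketched as follows. This is a minimal illustration rather than an official client: the function names are hypothetical, the node ID is assumed to travel as a `nodeId` query parameter, and the HTTP call itself is left to whatever authenticated client you already use, so `get_node` and `sleep` are injected.

```python
from urllib.parse import urlencode

TPU_API_ROOT = "https://tpu.googleapis.com"  # Cloud TPU service endpoint


def build_create_node_request(project, zone, node_id, accelerator_type, runtime_version):
    """Return (method, url, body) for a v2 node-create call (hypothetical helper)."""
    parent = f"projects/{project}/locations/{zone}"
    # Assumption: the new node's ID is supplied as the `nodeId` query parameter.
    query = urlencode({"nodeId": node_id})
    url = f"{TPU_API_ROOT}/v2/{parent}/nodes?{query}"
    body = {"acceleratorType": accelerator_type, "runtimeVersion": runtime_version}
    return "POST", url, body


def wait_until_ready(get_node, sleep, max_polls=60, interval_s=30):
    """Poll get_node() until the returned node dict reports state READY.

    get_node and sleep are injected so the loop can run without network access.
    """
    for _ in range(max_polls):
        node = get_node()
        if node.get("state") == "READY":
            return node
        sleep(interval_s)
    raise TimeoutError("node did not reach READY within the polling budget")


if __name__ == "__main__":
    method, url, body = build_create_node_request(
        "my-project", "us-central2-b", "train-run-1", "v4-128", "tpu-vm-v4-base"
    )
    print(method, url, body)
```

Bounding the number of polls keeps a CI job from hanging indefinitely if the slice never becomes READY.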

Queued Resource Allocation for Spiky Demand

Research teams that can wait for capacity use queued resources to request TPUs in higher-demand regions without holding capacity. The API submits a queued-resource request for a target slice size; Google fulfils the request when capacity is available, transitioning to ACTIVE state. This pattern is essential for accessing scarce v5p slices.

Submit a queued-resource POST under /v2/{parent}/queuedResources targeting a v5p-512 slice in us-east5, then poll the queued resource until state=ACTIVE.
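A queued-resource request might be assembled as in the sketch below. The body layout (`tpu.nodeSpec` entries wrapping a node definition), the `queuedResourceId` query parameter, and the nested `state.state` field are assumptions to verify against the OpenAPI document. The helper names and the runtime version placeholder are hypothetical.

```python
def build_queued_resource_request(project, zone, queued_resource_id, node_id,
                                  accelerator_type, runtime_version):
    """Return (method, url, params, body) for a queued-resource create call."""
    parent = f"projects/{project}/locations/{zone}"
    url = f"https://tpu.googleapis.com/v2/{parent}/queuedResources"
    # Assumption: the queued resource ID is passed as a query parameter.
    params = {"queuedResourceId": queued_resource_id}
    # Assumption: each nodeSpec entry names the parent, a node ID, and the node itself.
    body = {
        "tpu": {
            "nodeSpec": [
                {
                    "parent": parent,
                    "nodeId": node_id,
                    "node": {
                        "acceleratorType": accelerator_type,
                        "runtimeVersion": runtime_version,
                    },
                }
            ]
        }
    }
    return "POST", url, params, body


def is_active(queued_resource):
    """True when the queued resource reports ACTIVE.

    Accepts either a flat {"state": "ACTIVE"} or a nested
    {"state": {"state": "ACTIVE"}} shape, since the exact layout is assumed here.
    """
    state = queued_resource.get("state")
    if isinstance(state, dict):
        state = state.get("state")
    return state == "ACTIVE"


if __name__ == "__main__":
    print(build_queued_resource_request(
        "my-project", "us-east5-a", "qr-1", "node-1", "v5p-512", "RUNTIME_VERSION"
    ))
```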

Capacity Discovery Across Zones

Before scheduling a training job, an MLOps service queries available TPU types and runtime versions in each candidate zone to build a routing decision. Listing accelerator types under /v2/{parent}/acceleratorTypes and runtime versions under /v2/{parent}/runtimeVersions for each zone lets the orchestrator pick the cheapest viable region without trial-and-error provisioning failures.

GET /v2/projects/{project}/locations/{zone}/acceleratorTypes for each zone in a candidate list and intersect with the runtime versions supported.
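The intersection step could look like the sketch below. It assumes the accelerator type strings and runtime version strings have already been extracted from each zone's list responses. The function name and the sample data are hypothetical.

```python
def viable_zones(accelerator_types_by_zone, runtime_versions_by_zone,
                 wanted_type, wanted_runtime):
    """Return the sorted zones that offer both the accelerator type and the runtime.

    Both inputs map a zone name to a collection of strings taken from the
    acceleratorTypes and runtimeVersions list responses for that zone.
    """
    zones = []
    for zone, types in accelerator_types_by_zone.items():
        versions = runtime_versions_by_zone.get(zone, ())
        if wanted_type in types and wanted_runtime in versions:
            zones.append(zone)
    return sorted(zones)


if __name__ == "__main__":
    # Illustrative data only; real values come from the API responses.
    accel = {"zone-a": {"v4-8", "v4-128"}, "zone-b": {"v4-8"}}
    runtimes = {"zone-a": {"tpu-vm-v4-base"}, "zone-b": {"tpu-vm-v4-base"}}
    print(viable_zones(accel, runtimes, "v4-128", "tpu-vm-v4-base"))
```

Price is not part of either list response, so ranking the surviving zones by cost needs a separate pricing source.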

Agent-Managed Training Lifecycle via Jentic

An ML training agent receives a 'fine-tune Llama on this dataset' instruction, allocates a TPU through Jentic, monitors the training job, and tears down the TPU on completion. Jentic isolates the GCP credential, exposes the start/poll/stop operations as discrete tool calls, and keeps the lifecycle state across long polls.

Through Jentic, search 'create a cloud tpu node', execute the create operation with the requested accelerator type, poll node status, and execute the delete operation when the training callback signals completion.
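One way to structure this lifecycle so that teardown always runs is sketched below. The four callables stand in for whatever create, status, training, and delete tool calls the agent has loaded. They are hypothetical placeholders, not Jentic SDK functions.

```python
def run_training_lifecycle(create_node, get_state, run_training, delete_node,
                           sleep, max_polls=120, interval_s=30):
    """Create a node, wait for READY, train, and always delete the node afterwards."""
    # If creation itself fails there is nothing to delete, so it sits outside the try.
    create_node()
    try:
        for _ in range(max_polls):
            if get_state() == "READY":
                break
            sleep(interval_s)
        else:
            # The loop finished without ever seeing READY.
            raise TimeoutError("node never became READY")
        return run_training()
    finally:
        # Runs on success, timeout, or a training failure, so no slice is left idle.
        delete_node()


if __name__ == "__main__":
    states = iter(["CREATING", "READY"])
    outcome = run_training_lifecycle(
        create_node=lambda: print("create"),
        get_state=lambda: next(states),
        run_training=lambda: "trained",
        delete_node=lambda: print("delete"),
        sleep=lambda seconds: None,
    )
    print(outcome)
```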

Key Endpoints

17 endpoints. The Cloud TPU API provisions and manages Tensor Processing Unit nodes used for training and serving large machine-learning models.

METHOD  PATH                            DESCRIPTION
GET     /v2/{+name}/locations           List zones where Cloud TPU is available for the project
GET     /v2/{+name}/operations          List long-running TPU operations in a zone
POST    /v2/{+name}:cancel              Cancel a long-running operation
POST    /v2/{+name}:reset               Reset a TPU node
GET     /v2/{+name}:getGuestAttributes  Read guest attributes from a TPU node

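The `{+name}` placeholder in these paths is a reserved expansion in the RFC 6570 sense: the full resource name is substituted with its slashes left unescaped. A minimal sketch of that substitution follows. The helper name is hypothetical, and the resource names used are assumptions inferred from the endpoint descriptions.

```python
from urllib.parse import quote


def expand_name_path(template, name):
    """Substitute a full resource name into a '{+name}' path template.

    Slashes in the name are kept as path separators; other characters that are
    unsafe in a URL path are percent-encoded.
    """
    return template.replace("{+name}", quote(name, safe="/"))


if __name__ == "__main__":
    node = "projects/my-project/locations/us-central2-b/nodes/train-run-1"
    print(expand_name_path("/v2/{+name}:reset", node))
    print(expand_name_path("/v2/{+name}/operations",
                           "projects/my-project/locations/us-central2-b"))
```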

Why through Jentic?

Three things that make agents converge on Jentic-routed access.

Credential isolation

Service-account JSON is stored encrypted in the Jentic vault. Agents call TPU provisioning through Jentic and never hold the raw credential during long-running training lifecycles.

Intent-based discovery

Agents search 'create a cloud tpu node' or 'list tpu accelerator types' and Jentic returns the matching v2 operation with full path-template input schema.

Time to first call

Direct Cloud TPU integration: 2-5 days for provisioning, polling, and lifecycle handling. Through Jentic: under 1 hour to wire create-and-poll into an agent.

Related APIs

Alternatives and complements available in the Jentic catalogue.

Complementary

Google Compute Engine API

Provisions the surrounding VMs, networks, and disks that TPU nodes attach to

Use Compute Engine for the orchestrator VM and storage; use Cloud TPU for the accelerator slice itself.

Alternative

Google Kubernetes Engine API

GKE node pools can attach TPUs as an alternative to direct TPU API provisioning

Choose GKE when running long-lived training services that need autoscaling and rolling updates; choose direct Cloud TPU for one-off training runs.

Complementary

Google Cloud Storage API

Stores training datasets and model checkpoints read by TPU workloads

Always paired — TPU nodes mount or stream data from Cloud Storage during training.

FAQs

Specific to using the Cloud TPU API through Jentic.

What authentication does the Cloud TPU API use?

OAuth 2.0 with the cloud-platform scope is required. Production usage is via service-account credentials with the tpu.admin role on the project. Through Jentic, the service-account JSON is stored encrypted in the vault and Jentic mints scoped tokens per call so the agent never holds the raw credential.

Can I provision a TPU pod slice with the Cloud TPU API?

Yes. Submit a node-create request under /v2/projects/{project}/locations/{zone}/nodes specifying the acceleratorType (e.g. v4-128, a slice with 128 TensorCores on 64 chips) and runtimeVersion. The request returns a long-running operation; poll it until done, then GET the node to confirm state=READY before connecting your training framework.
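Polling the long-running operation might be sketched as below, assuming it follows the usual Google long-running operation shape: a `done` flag, plus either an `error` or a `response` once finished. `get_operation` is a hypothetical callable that fetches the operation by name with your authenticated client.

```python
def wait_for_operation(get_operation, sleep, max_polls=120, interval_s=10):
    """Poll an operation until done, then return its response or raise on error."""
    for _ in range(max_polls):
        operation = get_operation()
        if operation.get("done"):
            error = operation.get("error")
            if error:
                # Assumption: the error is a status-style dict with a `message` field.
                raise RuntimeError(f"operation failed: {error.get('message', error)}")
            return operation.get("response", {})
        sleep(interval_s)
    raise TimeoutError("operation did not finish within the polling budget")


if __name__ == "__main__":
    snapshots = iter([{"done": False}, {"done": True, "response": {"state": "READY"}}])
    print(wait_for_operation(lambda: next(snapshots), lambda seconds: None))
```

A finished operation only says the create call completed, which is why the answer above still recommends a follow-up GET on the node.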

What are the rate limits for the Cloud TPU API?

Control-plane operations allow several requests per second per project. The binding constraint is the per-project TPU quota (chip count by accelerator family), which must be raised through a Cloud Console quota request for production workloads. Use queued resources rather than hot-looping create requests when capacity is scarce.
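Where a retry is still needed, for example after a rate-limited control-plane call, spacing attempts with capped exponential backoff avoids the hot loop described above. The sketch below is generic and not specific to the Cloud TPU API. All names and default values are illustrative assumptions.

```python
def backoff_delays(base_s=1.0, factor=2.0, cap_s=60.0, attempts=6):
    """Yield `attempts` delays that grow by `factor` and never exceed `cap_s`."""
    delay = base_s
    for _ in range(attempts):
        yield min(delay, cap_s)
        delay *= factor


def call_with_backoff(call, is_retryable, sleep, delays):
    """Invoke call(), retrying retryable failures once per entry in `delays`.

    The final attempt re-raises instead of sleeping, so the caller sees the
    last error rather than a silent None.
    """
    delays = list(delays)
    if not delays:
        raise ValueError("delays must contain at least one entry")
    for attempt, delay in enumerate(delays):
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc) or attempt == len(delays) - 1:
                raise
            sleep(delay)


if __name__ == "__main__":
    print(list(backoff_delays(attempts=4)))
```

In production code a small random jitter is usually added to each delay so that several workers do not retry in lockstep.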

How do I allocate a TPU through Jentic?

Search Jentic for 'create a cloud tpu node', load the create operation under /v2/{parent}/nodes, and execute it with parent=projects/PROJECT/locations/ZONE plus acceleratorType and runtimeVersion. Poll GET on the resulting operation name until done. Get started at https://app.jentic.com/sign-up.

Does the Cloud TPU API support v5e and v5p chips?

Yes. The acceleratorTypes endpoint lists v5e and v5p variants per zone where they are available — typically a small set of regions for v5p and broader availability for v5e. Use the runtimeVersions endpoint to confirm which TPU VM images support the chosen accelerator type.

Is the Cloud TPU API free?

The API itself is free; you pay for the TPU node-hours consumed at the published per-chip-hour rate that varies by accelerator type. Stopping or deleting a node ends billing immediately, so always tear down nodes when training completes.