For Agents
Provision, list, reset, and tear down Cloud TPU nodes and queued resources, and discover available accelerator types and runtime versions per zone.
Get started with Cloud TPU API in minutes using your preferred integration method.
# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
"jentic": {
"url": "https://api.jentic.com/mcp",
"auth": "oauth"
}
}
# Then ask your agent:
"create a cloud tpu node"
# → Jentic returns the node-create tool (POST /v2/{parent}/nodes) with parameter schema, agent executes.
What an agent can do with Cloud TPU API.
Provision TPU VM nodes in a specified zone with a chosen accelerator type
List active nodes, accelerator types, and supported runtime versions per zone
Reset, stop, and start TPU nodes during long-running training jobs
Read guest attributes from a TPU node for debugging and monitoring
GET STARTED
Use for: provisioning a new Cloud TPU node for training, listing all TPU nodes in a project and zone, resetting a TPU node that is unresponsive, finding which TPU accelerator types are available in us-central1
Not supported: Does not handle GPU provisioning, model training framework configuration, or dataset storage — use for managing Cloud TPU nodes, queued resources, and operations only.
Cloud TPU API provisions and manages Tensor Processing Unit nodes used for training and serving large machine-learning models. Through it, teams allocate single nodes or queued resources, list available accelerator types and runtime versions per zone, manage device guest attributes, and stop or reset nodes when jobs finish. The API is the control plane behind every TPU VM that PyTorch, JAX, and TensorFlow workloads run on in Google Cloud.
Cancel long-running TPU operations
Manage queued-resource allocations for TPU pod slices
Patterns agents use the Cloud TPU API for, with concrete tasks.
★ On-Demand Training Cluster Provisioning
ML platform teams use the Cloud TPU API to spin up a TPU pod slice for a scheduled training run, attach the cluster to their JAX or PyTorch script, and tear it down when training completes. The API is called from a CI pipeline that allocates a v4-128 slice, waits until READY, runs the training script, then deletes the node — keeping spend tightly bound to actual training time rather than leaving an idle slice running.
POST a node-create request via /v2/{parent}/nodes with acceleratorType=v4-128 and runtimeVersion=tpu-vm-v4-base, poll until state=READY, run the training job, then DELETE the node.
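The create, wait, and delete steps above can be sketched as follows. This is a minimal sketch rather than a definitive client: the service root `https://tpu.googleapis.com`, the `nodeId` query parameter, and the helper names are assumptions for illustration, and the functions only describe the requests instead of sending them, so any HTTP client with valid credentials can be plugged in.

```python
import time
from typing import Callable

BASE_URL = "https://tpu.googleapis.com"  # assumed service root, not stated above


def create_node_request(project: str, zone: str, node_id: str,
                        accelerator_type: str, runtime_version: str) -> dict:
    """Describe the POST /v2/{parent}/nodes call without sending it."""
    parent = f"projects/{project}/locations/{zone}"
    return {
        "method": "POST",
        "url": f"{BASE_URL}/v2/{parent}/nodes",
        "params": {"nodeId": node_id},  # assumed query parameter name
        "json": {
            "acceleratorType": accelerator_type,
            "runtimeVersion": runtime_version,
        },
    }


def delete_node_request(project: str, zone: str, node_id: str) -> dict:
    """Describe the DELETE call that tears the node down after training."""
    name = f"projects/{project}/locations/{zone}/nodes/{node_id}"
    return {"method": "DELETE", "url": f"{BASE_URL}/v2/{name}"}


def wait_until_ready(get_node: Callable[[], dict], interval_s: float = 30.0,
                     max_polls: int = 120,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call get_node() until it reports state READY or the poll budget runs out."""
    for _ in range(max_polls):
        node = get_node()
        if node.get("state") == "READY":
            return node
        sleep(interval_s)
    raise TimeoutError("node did not reach READY within the poll budget")
```

In a CI pipeline, `get_node` would be a small closure that issues the authenticated GET on the node and returns the parsed JSON; wrapping the training step in `try`/`finally` around the delete call keeps a failed run from leaving the slice allocated.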
Queued Resource Allocation for Spiky Demand
Research teams that can wait for capacity use queued resources to request TPUs in higher-demand regions without holding capacity. The API submits a queued-resource request for a target slice size; Google fulfils the request when capacity is available, at which point the queued resource transitions to the ACTIVE state. This pattern is essential for accessing scarce v5p slices.
Submit a queued-resource POST under /v2/{parent}/queuedResources targeting a v5p-512 slice in us-east5, then poll the queued resource until state=ACTIVE.
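A queued-resource submission might be assembled as below. Treat it as a sketch under assumptions: the request-body layout (`tpu.nodeSpec[].node`), the `queuedResourceId` query parameter, the nested `state.state` field, and the `FAILED` and `SUSPENDED` state names are not given in the text above and should be checked against the API reference before use. The helper names are hypothetical.

```python
def queued_resource_request(project: str, zone: str, queued_resource_id: str,
                            node_id: str, accelerator_type: str,
                            runtime_version: str) -> dict:
    """Describe the POST /v2/{parent}/queuedResources call without sending it."""
    parent = f"projects/{project}/locations/{zone}"
    return {
        "method": "POST",
        "url": f"https://tpu.googleapis.com/v2/{parent}/queuedResources",
        "params": {"queuedResourceId": queued_resource_id},  # assumed name
        "json": {
            # Assumed body layout: one node spec per requested slice.
            "tpu": {
                "nodeSpec": [{
                    "parent": parent,
                    "nodeId": node_id,
                    "node": {
                        "acceleratorType": accelerator_type,
                        "runtimeVersion": runtime_version,
                    },
                }]
            }
        },
    }


def classify_queued_resource(resource: dict) -> str:
    """Map a queued-resource payload to 'active', 'stopped', or 'waiting'."""
    state = resource.get("state", "")
    if isinstance(state, dict):  # assumed nested form: {"state": {"state": ...}}
        state = state.get("state", "")
    if state == "ACTIVE":
        return "active"
    if state in {"FAILED", "SUSPENDED"}:  # assumed terminal state names
        return "stopped"
    return "waiting"
```

A poller would call `classify_queued_resource` on each GET response, keep sleeping while the result is `"waiting"`, and give up early on `"stopped"` rather than waiting out the full timeout.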
Capacity Discovery Across Zones
Before scheduling a training job, an MLOps service queries available TPU types and runtime versions in each candidate zone to build a routing decision. Listing accelerator types under /v2/{parent}/acceleratorTypes and runtime versions under /v2/{parent}/runtimeVersions for each zone lets the orchestrator pick the cheapest viable region without trial-and-error provisioning failures.
GET /v2/projects/{project}/locations/{zone}/acceleratorTypes for each zone in a candidate list and intersect with the runtime versions supported.
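Once the two listings have been fetched per zone, the intersection step is a plain set operation. The sketch below assumes the listing responses have already been reduced to sets of type and version names; the function names and the sample zone data are hypothetical and only illustrate the routing logic.

```python
def listing_urls(project: str, zone: str) -> dict:
    """Build the two listing URLs for one zone (service root is an assumption)."""
    parent = f"projects/{project}/locations/{zone}"
    root = "https://tpu.googleapis.com/v2"
    return {
        "acceleratorTypes": f"{root}/{parent}/acceleratorTypes",
        "runtimeVersions": f"{root}/{parent}/runtimeVersions",
    }


def viable_zones(accel_by_zone: dict, runtimes_by_zone: dict,
                 wanted_accel: str, wanted_runtime: str) -> list:
    """Return zones offering both the accelerator type and the runtime version."""
    return sorted(
        zone for zone, accels in accel_by_zone.items()
        if wanted_accel in accels
        and wanted_runtime in runtimes_by_zone.get(zone, set())
    )


# Hypothetical listing results, not real availability data.
accels = {
    "zone-a": {"v4-8", "v4-128"},
    "zone-b": {"v4-8"},
    "zone-c": {"v4-128"},
}
runtimes = {
    "zone-a": {"tpu-vm-v4-base"},
    "zone-b": {"tpu-vm-v4-base"},
    "zone-c": set(),
}
candidates = viable_zones(accels, runtimes, "v4-128", "tpu-vm-v4-base")
```

With this sample data only `zone-a` satisfies both conditions, so the orchestrator would rank that single candidate by price before provisioning.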
Agent-Managed Training Lifecycle via Jentic
An ML training agent receives a 'fine-tune Llama on this dataset' instruction, allocates a TPU through Jentic, monitors the training job, and tears down the TPU on completion. Jentic isolates the GCP credential, exposes the start/poll/stop operations as discrete tool calls, and keeps the lifecycle state across long polls.
Through Jentic, search 'create a cloud tpu node', execute the create operation with the requested accelerator type, poll node status, and execute the delete operation when the training callback signals completion.
17 endpoints. Cloud TPU API provisions and manages Tensor Processing Unit nodes used for training and serving large machine-learning models.
METHOD
PATH
DESCRIPTION
/v2/{+name}/locations
List zones where Cloud TPU is available for the project
/v2/{+name}/operations
List long-running TPU operations in a zone
/v2/{+name}:cancel
Cancel a long-running operation
/v2/{+name}:reset
Reset a TPU node
/v2/{+name}:getGuestAttributes
Read guest attributes from a TPU node
Three things that make agents converge on Jentic-routed access.
Credential isolation
Service-account JSON is stored encrypted in the Jentic vault. Agents call TPU provisioning through Jentic and never hold the raw credential during long-running training lifecycles.
Intent-based discovery
Agents search 'create a cloud tpu node' or 'list tpu accelerator types' and Jentic returns the matching v2 operation with full path-template input schema.
Time to first call
Direct Cloud TPU integration: 2-5 days for provisioning, polling, and lifecycle handling. Through Jentic: under 1 hour to wire create-and-poll into an agent.
Alternatives and complements available in the Jentic catalogue.
Google Compute Engine API
Provisions the surrounding VMs, networks, and disks that TPU nodes attach to
Use Compute Engine for the orchestrator VM and storage; use Cloud TPU for the accelerator slice itself.
Google Kubernetes Engine API
GKE node pools can attach TPUs as an alternative to direct TPU API provisioning
Choose GKE when running long-lived training services that need autoscaling and rolling updates; choose direct Cloud TPU for one-off training runs.
Google Cloud Storage API
Stores training datasets and model checkpoints read by TPU workloads
Always paired — TPU nodes mount or stream data from Cloud Storage during training.
Specific to using the Cloud TPU API through Jentic.
What authentication does the Cloud TPU API use?
OAuth 2.0 with the cloud-platform scope is required. Production usage is via service-account credentials with the tpu.admin role on the project. Through Jentic, the service-account JSON is stored encrypted in the vault and Jentic mints scoped tokens per call so the agent never holds the raw credential.
Can I provision a TPU pod slice with the Cloud TPU API?
Yes. Submit a node-create request under /v2/projects/{project}/locations/{zone}/nodes specifying the acceleratorType (e.g. v4-128, where the suffix counts TensorCores, so the slice has 64 chips) and runtimeVersion. The request returns a long-running operation; poll it until done, then GET the node to confirm state=READY before connecting your training framework.
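Polling the long-running operation can be sketched as below. It assumes the operation payload follows the usual Google long-running-operation shape, with a boolean `done` flag and either an `error` or a `response` field once finished; the helper name and the injected `get_operation` callable are hypothetical.

```python
import time
from typing import Callable


def wait_for_operation(get_operation: Callable[[], dict],
                       interval_s: float = 10.0, max_polls: int = 180,
                       sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll an operation until done; return its response or raise on error."""
    for _ in range(max_polls):
        op = get_operation()
        if op.get("done"):
            if "error" in op:
                # Surface the server-reported failure instead of hiding it.
                raise RuntimeError(f"operation failed: {op['error']}")
            return op.get("response", {})
        sleep(interval_s)
    raise TimeoutError("operation did not finish within the poll budget")
```

Here `get_operation` would issue an authenticated GET on the operation name returned by the create call. A finished operation only means the request was processed, so the follow-up GET on the node to confirm state=READY is still needed.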
What are the rate limits for the Cloud TPU API?
Control-plane operations allow several requests per second per project; the binding constraint is the per-project TPU quota (chip count by accelerator family), which must be raised through a Cloud Console quota request for production workloads. Use queued resources rather than hot-looping create requests when capacity is scarce.
How do I allocate a TPU through Jentic?
Search Jentic for 'create a cloud tpu node', load the create operation under /v2/{parent}/nodes, and execute it with parent=projects/PROJECT/locations/ZONE plus acceleratorType and runtimeVersion. Poll GET on the resulting operation name until done. Get started at https://app.jentic.com/sign-up.
Does the Cloud TPU API support v5e and v5p chips?
Yes. The acceleratorTypes endpoint lists v5e and v5p variants per zone where they are available — typically a small set of regions for v5p and broader availability for v5e. Use the runtimeVersions endpoint to confirm which TPU VM images support the chosen accelerator type.
Is the Cloud TPU API free?
The API itself is free; you pay for the TPU node-hours consumed at the published per-chip-hour rate that varies by accelerator type. Stopping or deleting a node ends billing immediately, so always tear down nodes when training completes.