About Foundry Map

Foundry Map answers one question Microsoft makes painful to answer: which Azure AI models are available in which regions, for which deployment type, at what price?

How it works

A Python pipeline calls Microsoft.CognitiveServices/locations/{location}/models for every Azure region that hosts AI Foundry, merges with the Azure Retail Prices API, and commits the normalised JSON to this repo. The static site you're reading is rebuilt on every commit.

Data freshness

Currently the pipeline is run manually. Daily automation via GitHub Actions + Azure OIDC is on the roadmap. The timestamp on each page shows when the data was last generated.

Model lifecycle

Each model in this catalogue carries a lifecycle badge sourced from Azure's ARM Microsoft.CognitiveServices/locations/{location}/models response. Microsoft's API enumerates four values; we add a fifth (Unknown) for partner models where the publisher hasn't populated the field. Definitions below come from the Azure AI Foundry model-retirements and models-sold-directly-by-azure docs.

GA

Generally Available

Production-ready. Covered by Azure SLAs and supported by Microsoft. Available for at least 12 months from launch; Microsoft commits to at least 60 days' notice before retirement. Safe default for new workloads. ARM enum: GenerallyAvailable.

Preview

Preview

Released for evaluation — Microsoft explicitly does not recommend Preview for production. No GA SLA. Typical lifespan 90–120 days, with at least 30 days' notice before Microsoft auto-upgrades existing deployments to a newer Preview or GA version. ARM enum: Preview.

Retiring

Retiring

A retirement date has been announced and the deprecation window is now open: existing deployments continue to function, but new deployments cannot be created by customers who haven't deployed the model before. Plan a migration before the retirement date — after that point requests return errors. ARM enum: Deprecating.

Deprecated

Deprecated

Past the deprecation date and approaching or at retirement. No new deployments can be created. Existing deployments may continue until the published retirement date, after which Azure OpenAI returns error responses. Microsoft reserves the right to issue emergency retirements for security or compliance, bypassing the standard notice window. ARM enum: Deprecated.

Unknown

Unknown

Not a Microsoft enum value. We use this when ARM returns no lifecycle_status for a model — typically Foundry Models from partners and the community where the publisher (not Microsoft) controls lifecycle, support and billing. Don't assume GA semantics or Azure SLA coverage; check the publisher's terms.

Caveats Microsoft mentions: not every model passes through Retiring before Deprecated — some are retired directly. The ARM REST schema enumerates the four values above but ships them with no descriptions; semantic definitions live in the linked retirements docs. The naming inconsistency between the API enum (Deprecating) and the docs (Retiring / Deprecation) refers to the same phase.

Deployment types

Each model in this catalogue lists the deployment types (SKUs) it supports. Microsoft's 11 SKU codes collapse into a 3 × 3 matrix — three billing/latency modes (Standard, Provisioned Managed, Batch) crossed with three data-routing scopes (single region, Data Zone, Global) — plus two special cases. Definitions below come from the Foundry deployment types and PTU onboarding docs.

Mode ↓   Scope → Single region Data Zone (US / EU) Global
Standard (pay-per-token, sync) Standard DataZoneStandard GlobalStandard
Provisioned Managed (reserved PTU, low-variance latency) ProvisionedManaged DataZoneProvisionedManaged GlobalProvisionedManaged
Batch (async, 24h SLA, 50% discount) Batch DataZoneBatch GlobalBatch

Pick the routing scope by residency

Pick the mode by traffic shape

Standard — pay-per-token, real-time

The default sync inference mode. You pay per input/output token; throughput is governed by quota (TPM/RPM). Best for low-to-medium volume and bursty workloads. At sustained high RPS you'll see latency variability — that's the cue to look at Provisioned Managed. Available as Standard, DataZoneStandard, GlobalStandard.

Provisioned Managed (PTU) — reserved capacity

You buy a fixed number of Provisioned Throughput Units, billed per PTU/hour or via Azure Reservations (1-month / 1-year discounts). In return you get guaranteed throughput and much lower latency variance. Minimums vary by model and scope (e.g. gpt-5: 50 PTU regional, 15 PTU global/data zone). Reservations for Global, Data Zone and Regional are not interchangeable. Best for production workloads with predictable load. Available as ProvisionedManaged, DataZoneProvisionedManaged, GlobalProvisionedManaged.

Batch — async, half price

Submit a JSONL file of requests, get results back within 24 hours. Pricing is roughly 50% of Standard. Uses a separate enqueued-token quota so it doesn't compete with your online traffic. No real-time SLA — Microsoft says jobs "might take longer" than 24 h. Perfect for embeddings backfills, bulk classification, document summarisation. Never use for anything interactive. Available as Batch, DataZoneBatch, GlobalBatch.

Special tiers

Developer — fine-tune evaluation only

Pay-per-token tier designed exclusively for testing fine-tuned models cheaply. Significant constraints: no data residency guarantees, no SLA, and a fixed 24-hour lifetime after which the deployment is auto-deleted. Routing is global. Use it for hourly-cost-sensitive smoke tests of a custom fine-tune before promoting to a real deployment type. Never use for production, regulated data, or anything requiring uptime.

Provisioned (no suffix) — legacy

ARM still returns a bare Provisioned SKU on some older models. This is the pre-August-2024 Commitment-payment-model Provisioned offering, semantically equivalent to today's ProvisionedManaged (single-region, reserved capacity) but using the deprecated commitment purchase model. Not available to new customers or for models introduced after August 2024 — treat as legacy and migrate to ProvisionedManaged.

Caveats Microsoft mentions: deploying a model isn't the same as having capacity — for Provisioned tiers, quota and capacity are separate concepts; you may need to try multiple regions. Data Zone definitions: the US zone covers all Azure US regions; the EU zone covers EU member nations (Microsoft's broader Azure EU Data Boundary also includes the EFTA states — Iceland, Liechtenstein, Norway, Switzerland — but the Foundry deployment-types page itself only explicitly mentions EU member nations).

Known limitations

Source

Code + data live at github.com/waynegoosen/foundry-map. Issues and PRs welcome.

Not affiliated with Microsoft. "Azure" is a trademark of Microsoft Corporation; used here descriptively.