
Model Definition Specification

File naming: models/<model_id>.model.md

Audience: Platform engineers, ML engineers, product owners


Overview

A model definition file declares one named model that agents can reference. It maps a stable, human-readable model identifier (e.g., claude-4-sonnet) to a specific provider, provider-level model name, and configuration. The compiler resolves model references from agent definitions and injects the correct provider configuration into the compiled payload. The runtime never reads raw model identifiers — it only executes compiled payloads.

This indirection serves two purposes. First, it decouples agent files from provider-specific naming conventions: an agent author writes model: "claude-4-sonnet" rather than model_id: "us.anthropic.claude-sonnet-4-20250514-v1:0". Second, it allows the platform team to update provider configurations, swap regions, or rotate API key references in a single file without touching any agent definition.

Credentials are never stored in model definition files. Instead, each provider config uses a credentials block with a source field that tells the runtime where to fetch the value at execution time — environment variable, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, or AWS IAM role.


File structure

---
[YAML front matter — all structured fields]
---

# Description
[optional prose — what this model is, when to use it, usage guidance]

The prose body is optional and intended for the model registry documentation UI. It does not affect runtime behavior.


Description (prose body)

The # Description heading opens the optional Markdown body that follows the closing --- of the YAML front matter. It has no effect on compilation or runtime behavior — it exists exclusively for human readers and the model registry documentation UI.

What to include:

  • What the model is — provider, model family, and any notable architecture notes (e.g., multimodal, reasoning model, instruction-tuned variant).
  • When to use it — recommended scenarios, relative strengths compared to other registered models, and workload types it is optimized for.
  • When not to use it — known limitations, cost or latency trade-offs, or cases where a different registered model is a better fit.
  • Operational notes — how credentials are resolved at runtime, any region or quota constraints, and local-development alternatives (e.g., swapping source: "iam_role" for source: "env").

Keep the description focused and actionable. Aim for two to four short paragraphs. Avoid duplicating the one-to-two sentence summary already in meta.description — the prose body is the right place to expand on context, trade-offs, and usage guidance that would not fit there.


YAML front matter — complete field reference

Top-level required fields

spec_version: "1.2"
The AML format version. Must equal a platform-approved version string.

model_id: "claude-4-sonnet"
Stable, immutable identifier for this model entry. Used as the value of runtime.model in agent definition files. Lowercase kebab-case. Must match ^[a-z0-9_-]{3,64}$. Once published, the model_id cannot change — create a new entry for a different model.
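
For illustration, a minimal check of candidate identifiers against this pattern; a sketch only, since the compiler's actual validation code is not part of this spec:

```python
import re

# Pattern from the spec: lowercase kebab-case (underscore also allowed), 3-64 chars.
MODEL_ID_PATTERN = re.compile(r"^[a-z0-9_-]{3,64}$")

for candidate in ("claude-4-sonnet", "GPT-4o", "ab"):
    verdict = "valid" if MODEL_ID_PATTERN.fullmatch(candidate) else "invalid"
    print(f"{candidate}: {verdict}")
# claude-4-sonnet: valid; GPT-4o: invalid (uppercase); ab: invalid (too short)
```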

version: "1.0.0"
Semantic version of this model definition (MAJOR.MINOR.PATCH). Increment whenever provider config, capabilities, or defaults change.

status: "active"
Lifecycle state. Enum: active | deprecated | disabled. A deprecated model can still be compiled but triggers a lint warning on any agent that references it. A disabled model causes a hard compile error.


meta — descriptive metadata (required)

meta:
  name: "Claude 4 Sonnet (Bedrock, us-east-1)"  # Required — display name in the registry UI
  description: "Balanced model for intelligence and speed via Amazon Bedrock."  # Optional
  owner: "platform-ml-team"                      # Optional
  tags: ["bedrock", "anthropic", "vision"]        # Optional — searchable labels
  last_updated: "2026-01-15"                      # Optional — ISO 8601 date

meta.name is the only required sub-field. All others are optional but recommended for registry discoverability.


provider — provider declaration (required)

A single provider object with two required sub-fields: type, which selects the provider, and config, which holds the provider-specific settings whose shape is validated against the declared type. Any secret values inside config (API keys, tokens, endpoint URLs) must use a credentials block — literal values are a hard validation error.

provider:
  type: "bedrock"       # Required — selects the provider; determines the required shape of config
  config: { ... }       # Required — provider-specific settings; validated against type

type enum:

| Value | Provider |
| --- | --- |
| bedrock | Amazon Bedrock |
| anthropic | Anthropic direct API |
| openai | OpenAI or OpenAI-compatible endpoint |
| litellm | LiteLLM unified interface |
| ollama | Ollama (local models) |
| llamaapi | Llama API (Meta) |
| mistral | Mistral AI direct API |
| writer | Writer (Palmyra models) |
| custom | Custom provider — requires a custom config block |

bedrock — Amazon Bedrock

provider:
  type: "bedrock"
  config:
    model_id: "us.anthropic.claude-sonnet-4-20250514-v1:0"   # Required — Bedrock model ID
    region: "us-east-1"                                       # Required — AWS region
    credentials:
      source: "iam_role"    # iam_role | service-account (default: iam_role)

Or with a service account stored in a secret manager:

provider:
  type: "bedrock"
  config:
    model_id: "us.anthropic.claude-sonnet-4-20250514-v1:0"
    region: "us-east-1"
    credentials:
      source: "service-account"
      secret_id: "prod/aws/bedrock-credentials"  # Secret containing the AWS credentials JSON

credentials.source: "iam_role" is the recommended production setting — the runtime assumes the IAM role declared in the agent definition and requires no extra fields. service-account reads AWS credentials from a JSON object stored in a secret manager. The resolved secret must contain access_key_id and secret_access_key fields. See the Transport & Credentials reference for the full structure and supported backends.

Model access must be enabled in Amazon Bedrock for the specified model_id and region. See the AWS documentation.

anthropic — Anthropic direct API

provider:
  type: "anthropic"
  config:
    model: "claude-sonnet-4-5"   # Required — Anthropic model name
    credentials:                 # Required — credentials
      source: "aws_secrets_manager"
      secret_id: "prod/anthropic/api-key"
      region: "us-east-1"
      key: "api_key"           # Optional — if the secret is a JSON object

Alternatively, using an environment variable:

provider:
  type: "anthropic"
  config:
    model: "claude-sonnet-4-5"
    credentials:
      source: "env"
      name: "ANTHROPIC_API_KEY"

The literal key value must never appear in this file. See credentials — authenticating to providers below for all supported backends.

openai — OpenAI or OpenAI-compatible endpoint

provider:
  type: "openai"
  config:
    model: "gpt-4o"                       # Required — model name
    credentials:                          # Required — credentials
      source: "aws_secrets_manager"
      secret_id: "prod/openai/api-key"
      region: "us-east-1"
    base_url: "https://api.openai.com/v1" # Optional — override for compatible endpoints
    organization: "org-abc123"            # Optional — OpenAI organization ID

base_url can point to any OpenAI-compatible API (Azure OpenAI, vLLM, LM Studio, etc.). If omitted, the official OpenAI endpoint is used.

litellm — LiteLLM unified interface

provider:
  type: "litellm"
  config:
    model: "openai/gpt-4o"   # Required — LiteLLM model string (provider/model format)
    credentials:             # Required — credentials
      source: "aws_secrets_manager"
      secret_id: "prod/openai/api-key"
      region: "us-east-1"
    api_base: "https://..."  # Optional — override base URL
    extra_params:            # Optional — passed through to LiteLLM
      drop_params: true

LiteLLM model strings follow the provider/model convention (e.g., anthropic/claude-sonnet-4-5, mistral/mistral-large-latest). See the LiteLLM provider docs for the full list.

ollama — Ollama (local models)

provider:
  type: "ollama"
  config:
    model: "llama3.2"                     # Required — Ollama model name
    host: "http://localhost:11434"         # Optional — Ollama host (default: localhost:11434)

Ollama runs models locally. No API key is required. host must be accessible from the runtime environment. Recommended for development and offline/privacy scenarios only.

llamaapi — Llama API (Meta)

provider:
  type: "llamaapi"
  config:
    model: "Llama-4-Scout-17B-16E-Instruct-FP8"  # Required — model name
    credentials:                                  # Required — credentials
      source: "aws_secrets_manager"
      secret_id: "prod/llamaapi/api-key"
      region: "us-east-1"

mistral — Mistral AI

provider:
  type: "mistral"
  config:
    model: "mistral-large-latest"  # Required — Mistral model name
    credentials:                   # Required — credentials
      source: "gcp_secret_manager"
      project: "my-gcp-project"
      secret: "mistral-api-key"
      version: "latest"

writer — Writer (Palmyra)

provider:
  type: "writer"
  config:
    model: "palmyra-x5"  # Required — model name
    credentials:         # Required — credentials
      source: "azure_key_vault"
      vault_url: "https://myvault.vault.azure.net"
      secret_name: "writer-api-key"

custom — custom provider

provider:
  type: "custom"
  config:
    class_path: "myorg.models.CustomProvider"  # Required — fully qualified Python class
    config:                                     # Passed as kwargs to the provider constructor
      model: "my-model"
      endpoint: "https://models.myorg.internal/v1"
      credentials:                                # Credentials supported in custom config too
        source: "aws_secrets_manager"
        secret_id: "prod/custom-model/api-key"
        region: "us-east-1"

Custom providers must implement the Strands Model interface. They are registered by platform engineers and are not available to self-service authors.
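
For orientation, a skeletal sketch of the shape such a class might take. The method names below (get_config, stream) are illustrative assumptions only; the authoritative contract is the Strands Model interface, which is not reproduced here:

```python
from typing import Any, AsyncIterator

# Illustrative sketch only; the real abstract methods and signatures come from
# the Strands Model interface, and the names below are assumptions.


class CustomProvider:  # would subclass the Strands Model base class
    def __init__(self, model: str, endpoint: str, credentials: dict[str, Any]):
        # The nested config block is passed as constructor kwargs (see above).
        self.model = model
        self.endpoint = endpoint
        self.credentials = credentials  # resolved by the runtime before construction

    def get_config(self) -> dict[str, Any]:
        return {"model": self.model, "endpoint": self.endpoint}

    async def stream(self, messages: list[dict[str, Any]]) -> AsyncIterator[dict[str, Any]]:
        # Call the internal endpoint and yield streamed response events.
        raise NotImplementedError
```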

credentials — authenticating to providers

Every credentials field across all providers uses the same structure: a source field selects the backend; all other fields depend on it. This applies uniformly whether the provider uses an API key, a cloud secret manager, or AWS identity-based auth.

# AWS IAM role — no secret needed (Bedrock only)
credentials:
  source: "iam_role"

# Service account JSON secret (Bedrock with explicit keys)
credentials:
  source: "service-account"
  secret_id: "prod/aws/bedrock-credentials"   # Secret containing the AWS credentials JSON
# The resolved secret must be a JSON object: { "access_key_id": "...", "secret_access_key": "..." }
# See the Transport & Credentials reference for the full structure and supported backends.

# Single environment variable (API key providers)
credentials:
  source: "env"
  name: "OPENAI_API_KEY"         # Name of the environment variable

# AWS Secrets Manager
credentials:
  source: "aws_secrets_manager"
  secret_id: "prod/openai/api-key"  # Secret name or full ARN
  region: "us-east-1"               # Optional — defaults to the provider's region
  key: "api_key"                    # Optional — JSON key if the secret value is a JSON object

# GCP Secret Manager
credentials:
  source: "gcp_secret_manager"
  project: "my-gcp-project"
  secret: "openai-api-key"
  version: "latest"                  # Optional — defaults to 'latest'

# Azure Key Vault
credentials:
  source: "azure_key_vault"
  vault_url: "https://myvault.vault.azure.net"
  secret_name: "openai-api-key"
  version: ""                        # Optional — omit or leave empty for latest

source enum: iam_role | service-account | env | aws_secrets_manager | gcp_secret_manager | azure_key_vault.

The runtime resolves credentials once at agent startup and caches the value for the duration of the run. The resolved value is never written to logs or persisted. A credentials block that fails to resolve at startup is a hard runtime error — the agent will not start.
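
As a condensed sketch of how a resolver might dispatch on source, covering just the env and aws_secrets_manager backends (boto3 is assumed for AWS; the runtime's actual resolver is not part of this spec):

```python
import json
import os

import boto3  # assumed AWS SDK for the aws_secrets_manager backend


def resolve_credentials(block: dict) -> str:
    """Resolve a credentials block to its secret value at startup (sketch)."""
    source = block["source"]
    if source == "env":
        value = os.environ.get(block["name"])
        if value is None:
            raise RuntimeError(f"env var {block['name']} not set")  # hard startup error
        return value
    if source == "aws_secrets_manager":
        client = boto3.client("secretsmanager", region_name=block.get("region"))
        secret = client.get_secret_value(SecretId=block["secret_id"])["SecretString"]
        # If 'key' is given, the secret value is a JSON object; extract one field.
        return json.loads(secret)[block["key"]] if "key" in block else secret
    raise RuntimeError(f"unsupported credentials source: {source}")
```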


capabilities — model capabilities (optional, recommended)

capabilities:
  context_window: 200000        # Max input context in tokens
  max_output_tokens: 16000      # Max tokens the model can generate per call
  supports_tools: true          # Whether the model's API natively supports the tool-calling protocol
  supports_system_prompt: true  # Whether the model accepts a system prompt as a separate input
  supports_streaming: true      # Whether the model supports token-by-token streaming output
  modalities: ["text", "image"] # Enum array — allowed values below

| Field | Type | Description |
| --- | --- | --- |
| context_window | integer | Maximum context window size in tokens. |
| max_output_tokens | integer | Maximum tokens the model can generate in a single call. |
| supports_tools | boolean | Whether the model's API natively supports the tool/function-calling protocol — i.e., the runtime can pass structured tool definitions in the request and receive a structured tool_use / function_call response block. This is a model-level API capability, not a statement about agent behavior. Models with supports_tools: false cannot be used with agents that have a non-empty tools section. |
| supports_system_prompt | boolean | Whether the model accepts a system prompt as a distinct input, separate from the user turn. |
| supports_streaming | boolean | Whether the model supports token-by-token streaming output. |
| modalities | string[] | Input modalities supported by this model. Enum: text, image, audio, document. Use this to declare vision (image), audio, or document understanding support rather than separate boolean flags. |

The compiler uses capabilities to validate agent definitions. For example, if an agent declares knowledge bases with document retrieval but the model does not list "document" in modalities, the compiler emits a lint warning. If supports_tools: false, any agent referencing this model with a non-empty tools section is a hard validation error.
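
To make those checks concrete, a simplified sketch of the two rules just described. The agent-side tools field comes from the spec above; knowledge_bases is a hypothetical field name used here for illustration:

```python
def validate_agent_against_model(agent: dict, capabilities: dict) -> list[str]:
    """Return diagnostics for one agent/model pair (sketch of the rules above)."""
    diagnostics = []
    if agent.get("tools") and not capabilities.get("supports_tools", False):
        diagnostics.append("ERROR: agent declares tools but model has supports_tools: false")
    needs_documents = bool(agent.get("knowledge_bases"))  # hypothetical agent field
    if needs_documents and "document" not in capabilities.get("modalities", []):
        diagnostics.append("WARN: document retrieval declared but model lacks 'document' modality")
    return diagnostics
```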


defaults — default inference parameters (optional)

defaults:
  temperature: 0.7          # Default temperature; overridden by agent runtime.temperature
  max_tokens: 4096          # Default max output tokens; overridden by agent runtime.max_output_tokens
  top_p: 1.0                # Nucleus sampling parameter
  stop_sequences: []        # Token sequences that stop generation

Agent-level runtime fields override these defaults. Provider-level limits take precedence over both — a model that supports a maximum of 16,000 output tokens cannot be forced beyond that limit regardless of what the agent or defaults declare.
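
As a worked example of this precedence chain, a small sketch resolving the effective max output tokens (field names follow the runtime.max_output_tokens and defaults.max_tokens naming used above):

```python
def resolve_max_tokens(agent_runtime: dict, defaults: dict, capabilities: dict) -> int:
    # Agent override beats model defaults; the provider/model limit beats both.
    requested = agent_runtime.get("max_output_tokens", defaults.get("max_tokens", 4096))
    limit = capabilities.get("max_output_tokens", requested)
    return min(requested, limit)


# An agent asking for 32,000 tokens against this file's 16,000-token cap gets 16,000.
assert resolve_max_tokens({"max_output_tokens": 32000},
                          {"max_tokens": 4096},
                          {"max_output_tokens": 16000}) == 16000
```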


How agents reference model definitions

In an agent definition file, the runtime.model and runtime.fallback_model fields hold a model_id string that the compiler resolves to a .model.md file:

runtime:
  model: "claude-4-sonnet"          # Resolved to models/claude-4-sonnet.model.md
  fallback_model: "claude-4-haiku"  # Resolved to models/claude-4-haiku.model.md

An unresolvable model_id (no matching .model.md file) is a hard compile error. A reference to a model with status: deprecated is a lint warning. A reference to a model with status: disabled is a hard compile error.
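
A minimal sketch of these resolution rules, assuming the front matter of every models/*.model.md file has already been parsed into a registry keyed by model_id:

```python
import warnings


class CompileError(Exception):
    """Hard compile error (hypothetical exception type for this sketch)."""


def resolve_model_reference(model_id: str, registry: dict[str, dict]) -> dict:
    """Map a runtime.model value to its parsed model definition."""
    definition = registry.get(model_id)
    if definition is None:
        raise CompileError(f"no model definition found for '{model_id}'")
    if definition["status"] == "disabled":
        raise CompileError(f"model '{model_id}' is disabled")
    if definition["status"] == "deprecated":
        warnings.warn(f"model '{model_id}' is deprecated")  # lint warning
    return definition
```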


Validation rules

| Rule | Severity |
| --- | --- |
| model_id does not match ^[a-z0-9_-]{3,64}$ | Hard error |
| provider.type value is not in the supported enum | Hard error |
| provider.config shape does not match the declared provider.type | Hard error |
| A credentials field contains a literal value instead of a structured block with source | Hard error |
| credentials.source is not in the supported enum | Hard error |
| credentials for aws_secrets_manager is missing secret_id | Hard error |
| credentials for gcp_secret_manager is missing project or secret | Hard error |
| credentials for azure_key_vault is missing vault_url or secret_name | Hard error |
| credentials for env is missing name | Hard error |
| credentials for service-account is missing secret_id | Hard error |
| status: disabled and the model is referenced by a compiled agent | Hard error |
| capabilities.supports_tools: false and the referencing agent has tools | Hard error |
| status: deprecated and the model is referenced by an active agent | Lint warning |
| provider.type: ollama referenced by an agent with status: active in production | Lint warning |
| credentials.source is not iam_role in a production-targeted Bedrock definition | Lint warning |
| credentials.source: env used in a production-targeted definition (prefer a secret manager) | Lint warning |
| capabilities block is absent | Lint warning |
| last_updated is more than 180 days ago | Lint warning |

Complete example

spec_version: "1.2"
model_id: "claude-4-sonnet"
version: "1.0.0"
status: "active"

meta:
  name: "Claude 4 Sonnet (Bedrock, us-east-1)"
  owner: "platform-ml-team"
  last_updated: "2026-01-15"

provider:
  type: "bedrock"
  config:
    model_id: "us.anthropic.claude-sonnet-4-20250514-v1:0"
    region: "us-east-1"
    credentials:
      source: "iam_role"

capabilities:
  context_window: 200000
  max_output_tokens: 16000
  supports_tools: true
  supports_system_prompt: true
  supports_streaming: true
  modalities: ["text", "image"]

defaults:
  temperature: 0.7
  max_tokens: 4096

---

# Description

Claude 4 Sonnet is Anthropic's balanced model for intelligence and speed, accessed via Amazon Bedrock in us-east-1. It supports text and image inputs, native tool use, and a 200,000-token context window.

Use this model for general-purpose agent workloads that require strong reasoning, tool orchestration, or document understanding without the latency or cost of a larger frontier model. Prefer claude-4-opus when maximum reasoning depth matters more than throughput.

Credentials use source: "iam_role" — the runtime assumes the IAM role declared in the agent definition and requires no extra secret configuration. For local development, set source: "service-account" and store a JSON object with access_key_id and secret_access_key in a secret manager or local secret source — see the Transport & Credentials reference.