7 changes: 4 additions & 3 deletions docs/docs/providers/agents/index.mdx
@@ -1,7 +1,8 @@
---
description: "Agents
description: |
Agents

APIs for creating and interacting with agentic systems."
APIs for creating and interacting with agentic systems.
sidebar_label: Agents
title: Agents
---
@@ -12,6 +13,6 @@ title: Agents

Agents

APIs for creating and interacting with agentic systems.
APIs for creating and interacting with agentic systems.

This section contains documentation for all available providers for the **agents** API.
2 changes: 1 addition & 1 deletion docs/docs/providers/agents/inline_meta-reference.mdx
@@ -14,7 +14,7 @@ Meta's reference implementation of an agent system that can use tools, access ve

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `persistence` | `<class 'inline.agents.meta_reference.config.AgentPersistenceConfig'>` | No | | |
| `persistence` | `AgentPersistenceConfig` | No | | |

## Sample Configuration

27 changes: 14 additions & 13 deletions docs/docs/providers/batches/index.mdx
@@ -1,14 +1,15 @@
---
description: "The Batches API enables efficient processing of multiple requests in a single operation,
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.
description: |
The Batches API enables efficient processing of multiple requests in a single operation,
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.

The API is designed to allow use of openai client libraries for seamless integration.
The API is designed to allow use of openai client libraries for seamless integration.

This API provides the following extensions:
- idempotent batch creation
This API provides the following extensions:
- idempotent batch creation

Note: This API is currently under active development and may undergo changes."
Note: This API is currently under active development and may undergo changes.
sidebar_label: Batches
title: Batches
---
@@ -18,14 +19,14 @@ title: Batches
## Overview

The Batches API enables efficient processing of multiple requests in a single operation,
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.

The API is designed to allow use of openai client libraries for seamless integration.
The API is designed to allow use of openai client libraries for seamless integration.

This API provides the following extensions:
- idempotent batch creation
This API provides the following extensions:
- idempotent batch creation

Note: This API is currently under active development and may undergo changes.
Note: This API is currently under active development and may undergo changes.

This section contains documentation for all available providers for the **batches** API.
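The "idempotent batch creation" extension mentioned above means a retried create with the same idempotency key returns the originally created batch rather than a duplicate. A minimal sketch of that behavior, assuming a hypothetical in-memory store (the class, method, and `idempotency_key` parameter are illustrative, not the actual Llama Stack API surface):

```python
import uuid

# Hypothetical sketch of idempotent batch creation; names and shapes
# are assumptions, not the actual Llama Stack batches API.
class BatchStore:
    def __init__(self):
        self._by_key = {}

    def create_batch(self, requests, idempotency_key=None):
        # Without a key, every call creates a fresh batch.
        if idempotency_key is None:
            return {"id": f"batch_{uuid.uuid4().hex}", "requests": requests}
        # With a key, retries return the batch made by the first call.
        if idempotency_key not in self._by_key:
            self._by_key[idempotency_key] = {
                "id": f"batch_{uuid.uuid4().hex}",
                "requests": requests,
            }
        return self._by_key[idempotency_key]

store = BatchStore()
first = store.create_batch(["req-1", "req-2"], idempotency_key="nightly-eval")
retry = store.create_batch(["req-1", "req-2"], idempotency_key="nightly-eval")
print(first["id"] == retry["id"])  # True: the retry is a no-op
```

This is why idempotency matters for large batch jobs: a client that times out and retries does not pay for (or wait on) a second copy of the same work.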
6 changes: 3 additions & 3 deletions docs/docs/providers/batches/inline_reference.mdx
@@ -14,9 +14,9 @@ Reference implementation of batches API with KVStore persistence.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Configuration for the key-value store backend. |
| `max_concurrent_batches` | `<class 'int'>` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
| `max_concurrent_requests_per_batch` | `<class 'int'>` | No | 10 | Maximum number of concurrent requests to process per batch. |
| `kvstore` | `KVStoreReference` | No | | Configuration for the key-value store backend. |
| `max_concurrent_batches` | `int` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
| `max_concurrent_requests_per_batch` | `int` | No | 10 | Maximum number of concurrent requests to process per batch. |

## Sample Configuration

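Based on the field table above, a minimal configuration for the reference batches provider might look like the following sketch. The kvstore backend name and namespace are illustrative assumptions, not the provider's rendered sample configuration:

```yaml
# Hypothetical sketch only -- backend and namespace values are assumptions.
kvstore:
  backend: kv_default
  namespace: batches
max_concurrent_batches: 1
max_concurrent_requests_per_batch: 10
```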
2 changes: 1 addition & 1 deletion docs/docs/providers/datasetio/inline_localfs.mdx
@@ -14,7 +14,7 @@ Local filesystem-based dataset I/O provider for reading and writing datasets to

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `kvstore` | `KVStoreReference` | No | | |

## Sample Configuration

2 changes: 1 addition & 1 deletion docs/docs/providers/datasetio/remote_huggingface.mdx
@@ -14,7 +14,7 @@ HuggingFace datasets provider for accessing and managing datasets from the Huggi

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `kvstore` | `KVStoreReference` | No | | |

## Sample Configuration

2 changes: 1 addition & 1 deletion docs/docs/providers/datasetio/remote_nvidia.mdx
@@ -17,7 +17,7 @@ NVIDIA's dataset I/O provider for accessing datasets from NVIDIA's data platform
| `api_key` | `str \| None` | No | | The NVIDIA API key. |
| `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
| `project_id` | `str \| None` | No | test-project | The NVIDIA project ID. |
| `datasets_url` | `<class 'str'>` | No | http://nemo.test | Base URL for the NeMo Dataset API |
| `datasets_url` | `str` | No | http://nemo.test | Base URL for the NeMo Dataset API |

## Sample Configuration

7 changes: 4 additions & 3 deletions docs/docs/providers/eval/index.mdx
@@ -1,7 +1,8 @@
---
description: "Evaluations
description: |
Evaluations

Llama Stack Evaluation API for running evaluations on model and agent candidates."
Llama Stack Evaluation API for running evaluations on model and agent candidates.
sidebar_label: Eval
title: Eval
---
@@ -12,6 +13,6 @@ title: Eval

Evaluations

Llama Stack Evaluation API for running evaluations on model and agent candidates.
Llama Stack Evaluation API for running evaluations on model and agent candidates.

This section contains documentation for all available providers for the **eval** API.
2 changes: 1 addition & 1 deletion docs/docs/providers/eval/inline_meta-reference.mdx
@@ -14,7 +14,7 @@ Meta's reference implementation of evaluation tasks with support for multiple la

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `kvstore` | `KVStoreReference` | No | | |

## Sample Configuration

2 changes: 1 addition & 1 deletion docs/docs/providers/eval/remote_nvidia.mdx
@@ -14,7 +14,7 @@ NVIDIA's evaluation provider for running evaluation tasks on NVIDIA's platform.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `evaluator_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |
| `evaluator_url` | `str` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |

## Sample Configuration

7 changes: 4 additions & 3 deletions docs/docs/providers/files/index.mdx
@@ -1,7 +1,8 @@
---
description: "Files
description: |
Files

This API is used to upload documents that can be used with other Llama Stack APIs."
This API is used to upload documents that can be used with other Llama Stack APIs.
sidebar_label: Files
title: Files
---
@@ -12,6 +13,6 @@ title: Files

Files

This API is used to upload documents that can be used with other Llama Stack APIs.
This API is used to upload documents that can be used with other Llama Stack APIs.

This section contains documentation for all available providers for the **files** API.
6 changes: 3 additions & 3 deletions docs/docs/providers/files/inline_localfs.mdx
@@ -14,9 +14,9 @@ Local filesystem-based file storage provider for managing files and documents lo

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `storage_dir` | `<class 'str'>` | No | | Directory to store uploaded files |
| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
| `ttl_secs` | `<class 'int'>` | No | 31536000 | |
| `storage_dir` | `str` | No | | Directory to store uploaded files |
| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |
| `ttl_secs` | `int` | No | 31536000 | |

## Sample Configuration

4 changes: 2 additions & 2 deletions docs/docs/providers/files/remote_openai.mdx
@@ -14,8 +14,8 @@ OpenAI Files API provider for managing files through OpenAI's native file storag

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `<class 'str'>` | No | | OpenAI API key for authentication |
| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
| `api_key` | `str` | No | | OpenAI API key for authentication |
| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |

## Sample Configuration

8 changes: 4 additions & 4 deletions docs/docs/providers/files/remote_s3.mdx
@@ -14,13 +14,13 @@ AWS S3-based file storage provider for scalable cloud file management with metad

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `bucket_name` | `<class 'str'>` | No | | S3 bucket name to store files |
| `region` | `<class 'str'>` | No | us-east-1 | AWS region where the bucket is located |
| `bucket_name` | `str` | No | | S3 bucket name to store files |
| `region` | `str` | No | us-east-1 | AWS region where the bucket is located |
| `aws_access_key_id` | `str \| None` | No | | AWS access key ID (optional if using IAM roles) |
| `aws_secret_access_key` | `str \| None` | No | | AWS secret access key (optional if using IAM roles) |
| `endpoint_url` | `str \| None` | No | | Custom S3 endpoint URL (for MinIO, LocalStack, etc.) |
| `auto_create_bucket` | `<class 'bool'>` | No | False | Automatically create the S3 bucket if it doesn't exist |
| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
| `auto_create_bucket` | `bool` | No | False | Automatically create the S3 bucket if it doesn't exist |
| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |

## Sample Configuration

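Based on the field table above, an S3 files provider configuration might be sketched as follows. The bucket name, endpoint, and metadata store values are illustrative assumptions (e.g., pointing at a local MinIO endpoint), not the provider's rendered sample configuration:

```yaml
# Hypothetical sketch only -- all values below are assumptions.
bucket_name: my-llama-stack-files
region: us-east-1
auto_create_bucket: false
endpoint_url: http://localhost:9000   # e.g. a local MinIO endpoint
metadata_store:
  backend: sql_default
  table_name: s3_files_metadata
```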
23 changes: 12 additions & 11 deletions docs/docs/providers/inference/index.mdx
@@ -1,12 +1,13 @@
---
description: "Inference
description: |
Inference

Llama Stack Inference API for generating completions, chat completions, and embeddings.
Llama Stack Inference API for generating completions, chat completions, and embeddings.

This API provides the raw interface to the underlying models. Three kinds of models are supported:
- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
- Rerank models: these models reorder the documents based on their relevance to a query."
This API provides the raw interface to the underlying models. Three kinds of models are supported:
- LLM models: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
- Rerank models: these models reorder the documents based on their relevance to a query.
sidebar_label: Inference
title: Inference
---
@@ -17,11 +18,11 @@ title: Inference

Inference

Llama Stack Inference API for generating completions, chat completions, and embeddings.
Llama Stack Inference API for generating completions, chat completions, and embeddings.

This API provides the raw interface to the underlying models. Three kinds of models are supported:
- LLM models: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
- Rerank models: these models reorder the documents based on their relevance to a query.
This API provides the raw interface to the underlying models. Three kinds of models are supported:
- LLM models: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
- Rerank models: these models reorder the documents based on their relevance to a query.

This section contains documentation for all available providers for the **inference** API.
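The embedding and rerank model kinds described above both reduce to scoring documents against a query. A toy sketch of that flow, assuming hand-picked vectors in place of real embedding-model output:

```python
import math

# Illustrative only: the vectors are toy values, not real model output.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = [0.9, 0.1, 0.0]
documents = {
    "doc-a": [0.8, 0.2, 0.1],  # close to the query
    "doc-b": [0.0, 0.1, 0.9],  # far from the query
}

# Reranking: reorder documents by their relevance to the query.
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query, documents[d]),
                reverse=True)
print(ranked)  # ['doc-a', 'doc-b']
```

In practice the embedding model produces the vectors and a rerank model replaces the similarity heuristic with a learned relevance score, but the reorder-by-score step is the same.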
8 changes: 4 additions & 4 deletions docs/docs/providers/inference/inline_meta-reference.mdx
@@ -16,12 +16,12 @@ Meta's reference implementation of inference with support for various model form
|-------|------|----------|---------|-------------|
| `model` | `str \| None` | No | | |
| `torch_seed` | `int \| None` | No | | |
| `max_seq_len` | `<class 'int'>` | No | 4096 | |
| `max_batch_size` | `<class 'int'>` | No | 1 | |
| `max_seq_len` | `int` | No | 4096 | |
| `max_batch_size` | `int` | No | 1 | |
| `model_parallel_size` | `int \| None` | No | | |
| `create_distributed_process_group` | `<class 'bool'>` | No | True | |
| `create_distributed_process_group` | `bool` | No | True | |
| `checkpoint_dir` | `str \| None` | No | | |
| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig, annotation=NoneType, required=True, discriminator='type'` | No | | |
| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig \| None` | No | | |

## Sample Configuration

6 changes: 3 additions & 3 deletions docs/docs/providers/inference/remote_anthropic.mdx
@@ -14,9 +14,9 @@ Anthropic inference provider for accessing Claude models and Anthropic's AI serv

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |

## Sample Configuration

8 changes: 4 additions & 4 deletions docs/docs/providers/inference/remote_azure.mdx
@@ -21,10 +21,10 @@ https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `api_base` | `<class 'pydantic.networks.HttpUrl'>` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `api_base` | `HttpUrl` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
| `api_version` | `str \| None` | No | | Azure API version for Azure (e.g., 2024-12-01-preview) |
| `api_type` | `str \| None` | No | azure | Azure API type for Azure (e.g., azure) |

8 changes: 4 additions & 4 deletions docs/docs/providers/inference/remote_bedrock.mdx
@@ -14,10 +14,10 @@ AWS Bedrock inference provider using OpenAI compatible endpoint.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `region_name` | `<class 'str'>` | No | us-east-2 | AWS Region for the Bedrock Runtime endpoint |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `region_name` | `str` | No | us-east-2 | AWS Region for the Bedrock Runtime endpoint |

## Sample Configuration

8 changes: 4 additions & 4 deletions docs/docs/providers/inference/remote_cerebras.mdx
@@ -14,10 +14,10 @@ Cerebras inference provider for running models on Cerebras Cloud platform.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `base_url` | `<class 'str'>` | No | https://api.cerebras.ai | Base URL for the Cerebras API |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `base_url` | `str` | No | https://api.cerebras.ai | Base URL for the Cerebras API |

## Sample Configuration

6 changes: 3 additions & 3 deletions docs/docs/providers/inference/remote_databricks.mdx
@@ -14,9 +14,9 @@ Databricks inference provider for running models on Databricks' unified analytic

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | The Databricks API token |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `SecretStr \| None` | No | | The Databricks API token |
| `url` | `str \| None` | No | | The URL for the Databricks model serving endpoint |

## Sample Configuration
8 changes: 4 additions & 4 deletions docs/docs/providers/inference/remote_fireworks.mdx
@@ -14,10 +14,10 @@ Fireworks AI inference provider for Llama models and other AI models on the Fire

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `str` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |

## Sample Configuration
