2 changes: 1 addition & 1 deletion Containerfile
@@ -62,4 +62,4 @@ ENTRYPOINT ["python3.12", "src/lightspeed_stack.py"]
LABEL vendor="Red Hat, Inc."

# no-root user is checked in Konflux
USER 1001
USER 1001
Reviewer comment: Unrelated change

74 changes: 74 additions & 0 deletions README.md
@@ -506,6 +506,80 @@ Container images are built for the following platforms:
1. `linux/amd64` - main platform for deployment
1. `linux/arm64` - Mac users with M1/M2/M3 CPUs

## Building Container Images

The repository includes production-ready container configurations that support two deployment modes:

1. **Server Mode**: lightspeed-core connects to llama-stack as a separate service
2. **Library Mode**: llama-stack runs as a library within lightspeed-core

### Llama-Stack as Separate Service (Server Mode)

When using llama-stack as a separate service, the existing `docker-compose.yaml` provides the complete setup. This builds two containers: one for lightspeed-core and one for llama-stack.
Reviewer comment (nit): llama-stack (with hyphen). To keep things consistent, I'm seeing Llama-stack, llama-stack and now llama stack. Let's pick one.


**Configuration** (`lightspeed-stack.yaml`):
```yaml
llama_stack:
  use_as_library_client: false
  url: http://llama-stack:8321 # container name from docker-compose.yaml
  api_key: xyzzy
```
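
For orientation, a compose file for this mode typically wires the two services together roughly as sketched below. This is an illustrative sketch only; the repository's actual `docker-compose.yaml` is authoritative, and the service names, build files, and environment wiring shown here are assumptions.

```yaml
# Sketch of a two-service compose setup (assumed build files and wiring)
services:
  llama-stack:
    build:
      context: .
      dockerfile: test.containerfile   # assumption: whichever file builds the llama-stack image
    ports:
      - "8321:8321"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  lightspeed-core:
    build:
      context: .
      dockerfile: Containerfile
    ports:
      - "8080:8080"
    depends_on:
      - llama-stack
```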

From the root of this project, run:

```bash
# Set your OpenAI API key
export OPENAI_API_KEY="your-api-key-here"

# Start both services
podman compose up --build

# Access lightspeed-core at http://localhost:8080
# Access llama-stack at http://localhost:8321
```
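
Once the stack is up, a quick way to confirm that both containers are running is to list them; the names you see will match whatever the compose file defines:

```bash
# Show container names, status, and published ports
podman ps --format "{{.Names}}\t{{.Status}}\t{{.Ports}}"
```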

### Llama-Stack as Library (Library Mode)

When embedding llama-stack directly in the container, use the existing `Containerfile` (this does not build the llama-stack service in a separate container). First, modify the `lightspeed-stack.yaml` config to use llama-stack in library mode.
Reviewer comment: ditto, llama-stack (with hyphen)


**Configuration** (`lightspeed-stack.yaml`):
```yaml
llama_stack:
  use_as_library_client: true
  library_client_config_path: /app-root/run.yaml
```

**Build and run**:
```bash
# Build lightspeed-core with embedded llama-stack
podman build -f Containerfile -t my-lightspeed-core:latest .

# Run with embedded llama-stack
podman run \
-p 8080:8080 \
-v ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z \
-v ./run.yaml:/app-root/run.yaml:Z \
-e OPENAI_API_KEY=your-api-key \
my-lightspeed-core:latest
```
Reviewer comment: This command is more for macOS. Please also include this Linux command:

    podman run -p 8080:8080 -v ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z -v ./run.yaml:/app-root/run.yaml:Z -e OPENAI_API_KEY=$OPENAI_API_KEY my-lightspeed-core:latest


For macOS users:
```bash
podman run \
-p 8080:8080 \
-v ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:ro \
-v ./run.yaml:/app-root/run.yaml:ro \
-e OPENAI_API_KEY=your-api-key \
my-lightspeed-core:latest
```

Reviewer comment: Just a nitpick: this command is for Linux. Please swap those commands.

Command for macOS:

    podman run -p 8080:8080 -v ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:ro \
      -v ./run.yaml:/app-root/run.yaml:ro \
      -e OPENAI_API_KEY=your-api-key \
      my-lightspeed-core:latest

Command for Linux:

    podman run -p 8080:8080 -v ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z -v ./run.yaml:/app-root/run.yaml:Z -e OPENAI_API_KEY=$OPENAI_API_KEY my-lightspeed-core:latest

### Verify it's running properly

A simple sanity check:

```bash
curl -H "Accept: application/json" http://localhost:8080/v1/models
```
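
If you only need a pass/fail signal (for example, in a startup script), checking the HTTP status code is enough. This assumes the models endpoint returns 200 once the service is healthy:

```bash
# Prints only the HTTP status code; expect 200 when the service is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/v1/models
```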


# Endpoints
4 changes: 2 additions & 2 deletions lightspeed-stack.yaml
@@ -1,6 +1,6 @@
name: Lightspeed Core Service (LCS)
service:
  host: localhost
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
@@ -13,7 +13,7 @@ llama_stack:
  # Alternative for "as library use"
  # use_as_library_client: true
  # library_client_config_path: <path-to-llama-stack-run.yaml-file>
  url: http://localhost:8321
  url: http://llama-stack:8321
  api_key: xyzzy
user_data_collection:
  feedback_enabled: true
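
Two things change in this file: the service now binds to `0.0.0.0` so it is reachable from outside its container, and the default llama-stack `url` points at the `llama-stack` container name used by the compose setup. For running lightspeed-core directly on the host against a local llama-stack, the pre-change value still applies; a minimal sketch:

```yaml
# Host-local setup (outside compose): server mode, but targeting localhost
llama_stack:
  use_as_library_client: false
  url: http://localhost:8321
  api_key: xyzzy
```
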
125 changes: 125 additions & 0 deletions run.yaml
@@ -0,0 +1,125 @@
version: '2'
image_name: minimal-viable-llama-stack-configuration

apis:
- agents
- datasetio
- eval
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
inference_store:
  db_path: .llama/distributions/ollama/inference_store.db
  type: sqlite
logging: null
metadata_store:
  db_path: .llama/distributions/ollama/registry.db
  namespace: null
  type: sqlite
providers:
  agents:
  - config:
      persistence_store:
        db_path: .llama/distributions/ollama/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: .llama/distributions/ollama/responses_store.db
        type: sqlite
    provider_id: meta-reference
    provider_type: inline::meta-reference
  datasetio:
  - config:
      kvstore:
        db_path: .llama/distributions/ollama/huggingface_datasetio.db
        namespace: null
        type: sqlite
    provider_id: huggingface
    provider_type: remote::huggingface
  - config:
      kvstore:
        db_path: .llama/distributions/ollama/localfs_datasetio.db
        namespace: null
        type: sqlite
    provider_id: localfs
    provider_type: inline::localfs
  eval:
  - config:
      kvstore:
        db_path: .llama/distributions/ollama/meta_reference_eval.db
        namespace: null
        type: sqlite
    provider_id: meta-reference
    provider_type: inline::meta-reference
  inference:
  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY}
  post_training:
  - config:
      checkpoint_format: huggingface
      device: cpu
      distributed_backend: null
    provider_id: huggingface
    provider_type: inline::huggingface
  safety:
  - config:
      excluded_categories: []
    provider_id: llama-guard
    provider_type: inline::llama-guard
  scoring:
  - config: {}
    provider_id: basic
    provider_type: inline::basic
  - config: {}
    provider_id: llm-as-judge
    provider_type: inline::llm-as-judge
  - config:
      openai_api_key: '********'
    provider_id: braintrust
    provider_type: inline::braintrust
  telemetry:
  - config:
      service_name: 'lightspeed-stack-telemetry'
      sinks: sqlite
      sqlite_db_path: .llama/distributions/ollama/trace_store.db
    provider_id: meta-reference
    provider_type: inline::meta-reference
  tool_runtime:
  - provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
    config: {}
  vector_io:
  - config:
      kvstore:
        db_path: .llama/distributions/ollama/faiss_store.db
        namespace: null
        type: sqlite
    provider_id: faiss
    provider_type: inline::faiss
scoring_fns: []
server:
  auth: null
  host: null
  port: 8321
  quota: null
  tls_cafile: null
  tls_certfile: null
  tls_keyfile: null
shields: []
vector_dbs: []

models:
- model_id: gpt-4-turbo
  provider_id: openai
  model_type: llm
  provider_model_id: gpt-4-turbo
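
The `models` list is what registers concrete models with llama-stack; lightspeed-core then uses them through the configured provider. Additional models from the same provider can be registered by appending entries with the same shape. The second entry below is purely illustrative and not part of this PR:

```yaml
models:
- model_id: gpt-4-turbo
  provider_id: openai
  model_type: llm
  provider_model_id: gpt-4-turbo
# Hypothetical second entry (not in the PR), same structure:
- model_id: gpt-4o-mini
  provider_id: openai
  model_type: llm
  provider_model_id: gpt-4o-mini
```
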
4 changes: 2 additions & 2 deletions test.containerfile
@@ -14,7 +14,7 @@ COPY README.md ./
COPY src/ ./src/
Reviewer comment: Same here: please drop the changes in this file; I will update my PR (#292) to include this kind of change.


RUN microdnf install -y --nodocs --setopt=keepcache=0 --setopt=tsflags=nodocs \
python3.12 python3.12-devel python3.12-pip git tar
python3.12 python3.12-devel python3.12-pip git tar gcc gcc-c++ make

RUN curl -LsSf https://astral.sh/uv/install.sh | sh

Reviewer comment on lines 19 to 20 (refactor suggestion / potential issue):

Harden the curl | sh step and pin uv version

Piping curl to sh without pipefail can mask download failures; also pin uv for reproducible builds.

-RUN curl -LsSf https://astral.sh/uv/install.sh | sh
+RUN set -euxo pipefail; curl -fsSL https://astral.sh/uv/install.sh | sh -s -- --version ${UV_VERSION:-latest}

Add this ARG near the top of the file (outside the selected range):

ARG UV_VERSION=0.4.29
🤖 Prompt for AI Agents
In test.containerfile around lines 20-21, the RUN that pipes curl to sh is
unsafe and unpinned; add an ARG near the top of the file: UV_VERSION=0.4.29, and
update the RUN to enable strict failure handling and pass the pinned version to
the installer (use set -o pipefail -e and curl with --fail --show-error
--location, piping to sh -s -- "$UV_VERSION") so the build fails on download
errors and the uv version is reproducible.

@@ -25,4 +25,4 @@ RUN uv -h
# Include dev deps for testing (pytest, behave, etc.)
RUN uv sync --locked --no-install-project --group dev --group llslibdev

CMD ["uv", "run", "llama", "stack", "run", "run.yaml"]
CMD ["uv", "run", "llama", "stack", "run", "run.yaml"]
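
For reference, this test image can be built and started with podman much like the other images. The tag below is illustrative, and depending on how the image expects `run.yaml` to be provided you may also need to mount it; `run.yaml` reads `OPENAI_API_KEY` from the environment:

```bash
# Build the test image (illustrative tag)
podman build -f test.containerfile -t llama-stack-test:latest .

# Start it; llama-stack listens on port 8321 per run.yaml
# (mount run.yaml into the working directory if it is not baked into the image)
podman run --rm -p 8321:8321 -e OPENAI_API_KEY=$OPENAI_API_KEY llama-stack-test:latest
```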