Repository for the O'Reilly live training course "Getting Started with Llama3": https://learning.oreilly.com/live-events/getting-started-with-llama-2/0636920098588/
## Conda
- Install anaconda
- This repo was tested on a Mac with python=3.10.
- Create an environment: `conda create -n oreilly-llama3 python=3.10`
- Activate your environment with: `conda activate oreilly-llama3`
- Install requirements with: `pip install -r requirements/requirements.txt`
- Set up your OpenAI API key
## Pip
- Create a Virtual Environment: Navigate to your project directory. Make sure you have Python 3.10 installed! If using Python 3's built-in `venv`:

  `python -m venv oreilly-llama3`

  If you're using `virtualenv`:

  `virtualenv oreilly-llama3`

- Activate the Virtual Environment:
  - On Windows: `.\oreilly-llama3\Scripts\activate`
  - On macOS and Linux: `source oreilly-llama3/bin/activate`
- Install Dependencies from `requirements.txt`:

  `pip install python-dotenv`

  `pip install -r requirements/requirements.txt`

- Set up your OpenAI API key: change the `.env.example` file to `.env` and add your OpenAI API key.
- Remember to deactivate the virtual environment once you're done by simply typing `deactivate`.
- To use the notebooks, install Jupyter and register the environment as a kernel:

  `pip install jupyter`

  `python3 -m ipykernel install --user --name=oreilly-llama3`
These notebooks follow a structured learning path from basics to advanced topics:
- Quickstart with Ollama - Get started running local LLMs using Ollama
- Introduction to RAG - Learn the fundamentals of RAG with interactive visualizations of embeddings and chunking
- Local RAG with Llama 3 - Build a complete local RAG system using Llama 3 and PDF documents
- Tool Calling with Ollama - Learn how to implement tool calling with local LLMs (Gmail integration example)
- Llama 3.1 Structured Outputs - Generate structured outputs using Pydantic models with Llama 3.1
- Local Agent from Scratch - Build a simple agent from scratch using tool calling
- Simple Agentic RAG - Build a ReAct-based agentic RAG system from scratch
- Fine-Tuning Llama 3: What You Need to Know - Comprehensive guide to fine-tuning concepts (LoRA, QLoRA, PEFT)
- Fine-Tuning Walkthrough with Hugging Face - Practical fine-tuning implementation
- Quantization Precision Format Code Explanation - Deep dive into model quantization
- GUI for Llama 3 Options - Explore different GUI options for working with Llama models
- Best Local LLMs in Practice (2025 Edition) - Compare and explore the best local models available
- vLLM Setup Guide - Complete guide to setting up and using vLLM for high-performance inference
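As a taste of the quickstart notebook above, here is a sketch of calling a local Llama 3 model through Ollama's REST API. It assumes `ollama serve` is running on the default port 11434 and that `ollama pull llama3` has been done; the helper function name is ours:

```python
# Hedged sketch: query a local Llama 3 via Ollama's /api/generate endpoint.
# Assumes a local Ollama server (default http://localhost:11434) with the
# "llama3" model pulled. `ask_llama3` is an illustrative helper, not course code.
import json
import urllib.request


def ask_llama3(prompt: str, host: str = "http://localhost:11434") -> str:
    payload = json.dumps({
        "model": "llama3",
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
    }).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    try:
        print(ask_llama3("Say hello in one short sentence."))
    except OSError:
        print("Ollama server not reachable - is `ollama serve` running?")
```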
Older versions and experimental notebooks are available in the `notebooks/legacy-notebooks/` directory.
- LLM Model Sizes Guide - Comprehensive guide to different model sizes and their use cases
- Best Local Models 2025 - Updated guide to the top-performing open-source models that can run locally with <64GB RAM, including Qwen, DeepSeek, Mixtral, and others
  - Performance Benchmarks: Latest benchmark scores for reasoning, coding, and multilingual tasks
  - Hardware Requirements: Detailed RAM and GPU requirements for each model
  - Deployment Instructions: Step-by-step setup for Ollama, LM Studio, and other tools
  - Use Case Recommendations: Which models work best for specific applications
  - Model Comparisons: Side-by-side analysis of capabilities and trade-offs
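The rule of thumb behind such hardware-requirement tables can be sketched in a couple of lines: the weights alone need roughly parameter count × bits-per-weight / 8 bytes, with KV cache and runtime overhead on top. The function below is our illustration, not taken from the guide:

```python
# Back-of-the-envelope estimate of the memory needed just to hold a model's
# weights at a given quantization level (runtime overhead comes on top).
def weight_memory_gib(n_params_billion: float, bits_per_weight: int) -> float:
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3  # bytes -> GiB


# An 8B model: ~14.9 GiB at fp16, ~7.5 GiB at 8-bit, ~3.7 GiB at 4-bit
for bits in (16, 8, 4):
    print(f"8B model @ {bits}-bit: {weight_memory_gib(8, bits):.1f} GiB")
```

By this estimate a 70B model at 4-bit needs about 33 GiB for weights alone, which is consistent with the <64GB RAM cutoff used in the guide above.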
- Qwen2.5 Series (Alibaba) - Exceptional multilingual and reasoning capabilities
- DeepSeek-V3 & DeepSeek-Coder - Specialized programming and development
- Mixtral 8x22B - Efficient Mixture of Experts architecture
- Gemma 2 (Google) - Efficient and safety-focused models
- Command-R+ (Cohere) - Optimized for RAG and tool use
- Yi-Large (01.AI) - Strong bilingual performance