Open-Source LLM Zoomcamp: Building and Deploying LLMs on AMD Hardware

Welcome to the Open-Source LLM Zoomcamp, where we'll explore together how to build, fine-tune, and deploy large language models. We'll work hands-on with open-source LLMs on AMD's MI300X GPUs, hosted on Saturn Cloud.

  • Join Slack
  • #course-open-source-llm Channel on Slack
  • Telegram Announcements Channel
  • FAQ
  • Tweet about the Course

Who Is This For?

This course might be a good fit if you:

  • Are an ML practitioner wanting to dive deeper into open-source LLM stacks
  • Have a software engineering background and want to get hands-on with LLMs
  • Are a researcher or open-source enthusiast interested in reproducible ML
  • Work in MLOps and want to explore AMD's ROCm ecosystem

What We'll Cover

  • Course overview
  • Overview of open-source AI ecosystem
  • Intro to Large Language Models (LLMs)
  • Hugging Face and different LLMs
  • Environment setup
  • Introduction to ROCm and AMD GPUs
  • ROCm vs CUDA
  • Setting up Saturn Cloud for ROCm + MI300X
  • Running DeepSeek R1 (tutorial)
  • Build a simple Streamlit chat app
  • Serving LLMs with vLLM
  • Homework: Run and serve an LLM on Saturn Cloud
  • Fine-tuning concepts
  • Llama Factory workflow
  • Using Llama Factory for fine-tuning
  • Preparing a dataset
  • Fine-tuning DeepSeek R1 (tutorial)
  • Improving the chatbot from module 1
  • Bonus: text-to-image models
  • Homework: Fine-tune a model

You'll pick a dataset that interests you, fine-tune an open-source LLM for a specific domain (e.g. legal documents, medical data, or technical documentation), and deploy the result so others can use it.

How to Join?

We're starting in 2025! Sign up here to join us.

Your Instructor

Coming Soon

  • Course Channel on DTC Slack
  • Telegram Channel with Announcements
  • Pre-launch Q&A Stream
  • Launch Stream with Course Overview
  • Course Google Calendar
  • FAQ
  • Course Playlist

About DataTalks.Club

DataTalks.Club is a community of data enthusiasts learning and growing together. We're all about sharing knowledge, helping each other out, and making data science more accessible.

Join us:

  • Website
  • Slack Community
  • Newsletter
  • Events
  • Calendar
  • YouTube
  • GitHub
  • LinkedIn
  • Twitter

About

A free mini-course about Open-Source LLMs
