Program: Undergraduate
Course Areas
Human-Centered Data Science
Catalog Description
Hands-on experience in data preparation, model fine-tuning, and performance evaluation for popular open-source frameworks.
Instructor Description
This course introduces fine-tuning of open-source large language models (LLMs) through project-based applications and real-world examples. It begins with a foundation in natural language processing (NLP), focusing on text preprocessing techniques such as tokenization and vectorization. A basic overview of large language models follows, covering the fundamental structure and architecture of commonly used open-source frameworks. The course then focuses on three key methods for fine-tuning LLMs: self-supervised learning, supervised learning, and reinforcement learning. Each method is explored through theoretical explanation and practical group projects that apply the concepts to real-world examples. Through this hands-on work, students strengthen their understanding of how to customize and optimize LLMs for specific tasks or domains.
Prerequisites
Upper-division standing; Informatics 310D and Informatics 304 (or one of the following approved substitutions: C S 303E, C S 312, C S 312H, C S 313E).
Scheduled and Upcoming Classes for this Course
Class Name | Semester | Day(s) | Start Time(s) | End Time(s) | Building | Room |
---|---|---|---|---|---|---|
I 320D: Topics in Human-Centered Data Science: Fine Tuning Open-Source Large Language Models (Louis Gutierrez) | Spring Term 2025 | | | | | |
Past Classes for this Course
No past classes to list for this course.