I 320D: Topics in Human-Centered Data Science: Fine Tuning Open-Source Large Language Models

Day       Start     End       Building  Room
Tuesday   06:30 PM  08:00 PM  CBA       4.344
Thursday  06:30 PM  08:00 PM  CBA       4.344

Catalog Description

Hands-on experience in data preparation, model fine-tuning, and performance evaluation using popular open-source frameworks.

Instructor Description

This course offers an introduction to fine-tuning open-source Large Language Models (LLMs) through project-based applications and real-world examples. The course will begin with a foundational understanding of Natural Language Processing (NLP), focusing on text preprocessing techniques such as tokenization and vectorization. A basic overview of large language models will be provided, covering the fundamental structure and architecture of commonly used open-source frameworks. The course will then focus on three key methods for fine-tuning LLMs: self-supervised, supervised, and reinforcement learning. Each method will be explored through both theoretical explanations and practical group-based projects that apply these concepts to real-world examples. Students will engage in hands-on projects to strengthen their understanding of how to customize and optimize LLMs for specific tasks or domains.
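
For orientation, the short sketch below illustrates the kind of text preprocessing the course introduces: splitting a sentence into subword tokens and mapping them to integer IDs. The course does not prescribe a specific library; Hugging Face's transformers package and the open gpt2 tokenizer are used here only as illustrative assumptions.

```python
# A minimal tokenization sketch, assuming the Hugging Face "transformers"
# library and the open "gpt2" tokenizer (illustrative choices, not course
# requirements).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Fine-tuning adapts a pretrained model to a new task."
encoding = tokenizer(text)

# Subword tokens and the integer IDs a model actually consumes.
print(tokenizer.tokenize(text))
print(encoding["input_ids"])
```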

Spring Term 2025
Unique ID: 28189
Instructor:
Mode: In Person
Restrictions

Restricted to undergraduate Informatics majors through registration period 1. Informatics minors may add classes and join waitlists beginning in period 2. Outside students will be permitted to join our waitlists beginning in period 3.