Topics in Human-Centered Data Science: Fine Tuning Open-Source Large Language Models

Instructor Description

This course introduces fine-tuning of open-source large language models (LLMs) through project-based applications and real-world examples. It begins with the foundations of natural language processing (NLP), focusing on text preprocessing techniques such as tokenization and vectorization. An overview of large language models follows, covering the basic structure and architecture of commonly used open-source frameworks. The course then turns to three key methods for fine-tuning LLMs: self-supervised learning, supervised learning, and reinforcement learning. Each method is explored through theoretical explanation and practical group projects that apply the concepts to real-world examples. Through these hands-on projects, students strengthen their understanding of how to customize and optimize LLMs for specific tasks or domains.

Course Areas

Human-Centered Data Science

Course classes

Instructor	Title	Year	Semester	Syllabus
Louis Gutierrez	I 320D: Topics in Human-Centered Data Science: Fine Tuning Open-Source Large Language Models	2025	Spring 2025 Term
