Class Description

Fall 2022

I 320D Topics in Human-Centered Data Science: Data Engineering

Unique ID: 28275

James Howison
Charles (Chip) Young

Tues, Thurs 11:00 AM - 12:30 PM, PAR 208

Syllabus

Review Previous Course Iterations & Syllabi

In Person

DESCRIPTION

Principles and practices in Data Engineering. Emphasis on the data engineering lifecycle and how to build data pipelines to collect, transform, analyze, and visualize data from operational systems. This is a hands-on and highly interactive course. Students will learn analytical data modeling techniques for organizing and querying data. They will learn how to transform data into dimensional models, how to build data products, and how to visualize the data. We will also examine the various roles data engineers can have in an organization and career paths for data professionals.

COURSE NOTES

The class will balance general principles with hands-on experience with some of the tools, languages, and techniques of the modern data stack. Emphasis will be placed on SQL as the primary language of data engineering, along with low- or no-code tools that leverage SQL, plus a little Python. We’ll walk through building data pipelines end-to-end, from ingesting source data to creating analytical data products that deliver value to organizations. We’ll use business intelligence tools to build visualizations using those data products. We’ll talk about data lakes, data warehouses, ETL/ELT, and batch and streaming systems to understand the pros and cons of each. We will look at issues around data quality, understand the uses of data catalogs, examine data lineage and data profiling tools, and discuss data governance in organizations. Time permitting, we’ll also discuss trends and future directions in data engineering.

Draft course schedule: https://www.ischool.utexas.edu/sites/default/files/webform/syllabi/Data%20Engineering%20Draft%20Course%20Schedule.pdf

PREREQUISITES

Upper-division standing; Informatics 310D and Informatics 304 (or one of the following approved substitutions: C S 303E, C S 312, C S 312H, C S 313E).

RESTRICTIONS

Informatics majors will have top registration priority through the early periods of registration. Informatics minors are encouraged to join the waitlist, which will begin promoting students on July 18 if seats remain available.

All other students must complete the Registration Support Questionnaire to request a seat in any of our classes.