Course Offerings
I310D: Introduction to Human-Centered Data Science is a survey course that introduces students to the theory and practice of data science through a human-centered lens, with emphasis on how design choices influence algorithmic results. Students will gain comfort and facility with fundamental principles of data science, including (a) Programming for Data Science with Python, (b) Data Engineering, (c) Database Systems, (d) Machine Learning, and (e) Human-centered aspects such as privacy, bias, fairness, transparency, accountability, reproducibility, interpretability, and societal implications. Each week’s class is divided into two segments: (a) Theory and Methods, a concise description of a theoretical concept in data science, and (b) Tutorial, a hands-on session on applying the theory just discussed to a real-world task on publicly available data. We will use Python for programming and cover Python basics at the beginning of the course. For modules related to databases, we will use PostgreSQL.
No description provided.
The class explores the principles of relational database design, and SQL as a query language in depth.
Principles and practices in Data Engineering. Emphasis on the data engineering lifecycle and how to build data pipelines to collect, transform, analyze, and visualize data from operational systems. This is a hands-on and highly interactive course. Students will learn analytical data modeling techniques for organizing and querying data. They will learn how to transform data into dimensional models, how to build data products, and how to visualize the data. We will also examine the various roles data engineers can have in an organization and career paths for data professionals.
This course will cover relevant fundamental concepts in machine learning (ML) and how they are used to solve real-world problems. Students will learn the theory behind a variety of machine learning tools and practice applying the tools to real-world data such as numerical data, textual data (natural language processing), and visual data (computer vision). Each class is divided into two segments: (a) Theory and Methods, a concise description of an ML concept, and (b) Lab Tutorial, a hands-on session on applying the theory just discussed to a real-world task on publicly available data. We will use Python for programming. By the end of the course, the goals for the students are to: 1. Develop a sense of where to apply machine learning and where not to, and which ML algorithm to use. 2. Understand the process of garnering and preprocessing a variety of “big” real-world data, to be used to train ML systems. 3. Characterize the process to train machine learning algorithms and evaluate their performance. 4. Develop programming skills to code in Python and use modern ML and scientific computing libraries like SciPy and scikit-learn. 5. Propose a novel product/research-focused idea (this will be an iterative process), design and execute experiments, and present the findings and demos to a suitable audience (in this case, the class).
Practical skills and understandings required to effectively work with open source software and understand the projects that build it. Includes git-based collaboration as well as conceptual understanding of licenses, security, and technical and social processes in open source development. Class projects involve working with digital trace data from open source repositories.
This course offers students in Information Science a comprehensive exploration into the theories, techniques, and tools of data visualization. It is designed to equip students with the skills to effectively communicate complex information visually, enabling data analysis and decision-making. Through a combination of lectures, hands-on projects, and case studies, students will learn how to design and implement effective and aesthetically appealing data visualizations for a variety of data types and audiences. Upon successful completion of this course, students will be able to: • Understand the principles and psychology of visual perception and how they influence data visualization. • Critically evaluate the effectiveness of different data visualization techniques for varying data types and user needs. • Master the use of leading data visualization tools and libraries such as D3.js or Tableau. • Develop interactive dashboards and reports that effectively communicate findings to both technical and non-technical audiences. • Apply design principles to create visually appealing, accurate, and accessible data visualizations.
Introduction to the emerging field of Explainable Artificial Intelligence (XAI) from the perspectives of a developer and end-user. Students will gain hands-on experience with some of the most commonly used explainability techniques and algorithms.
Leveraging Text Mining, Natural Language Processing, and Computational Linguistics to address real-world textual data challenges, including document processing, keyword extraction, question answering, translation, summarization, sentiment analysis, search, recommendation, and information extraction. Each week, classes include (a) Theory and Methods for NLP concepts and (b) Lab Tutorials for practical application with Python on multilingual text datasets.
This course lays the foundation for data science education targeting health informatics students interested in learning more broadly about biomedical informatics. No previous coding experience is required. The students will be introduced to basic concepts and tools for data analysis. The focus is on hands-on practice and enjoyable learning. The course will use Python as the programming language, and Jupyter Notebooks as the development environment (our “home base”) for the examples, tutorials, and assignments. We use JupyterLab notebooks because they are both the industry standard and a nice way to load, visualize, and analyze data and describe our findings in one environment. We will also learn GitHub to document changes and back up our work and, eventually, for use as a collaboration tool. Hands-on data analysis, final projects, and associated presentations will be mandatory for the completion of the course. The outcome for the class is that each student will have a GitHub repository with all of their work (Jupyter notebooks, data, etc.), including a final project that will be presented to the class. Specific topics to be covered include GitHub, the Linux/Unix file system, Jupyter Notebooks, Python programming, and data visualization.
This course offers an introduction to Fine-Tuning Open-Source Large Language Models (LLMs) through project-based applications and real-world examples. The course will begin with a foundational understanding of Natural Language Processing (NLP), focusing on Text Preprocessing techniques such as Tokenization and Vectorization. A basic overview of Large Language Models will be provided, covering the fundamental structure and architecture of commonly used Open-Source Frameworks. The course will then focus on three key methods for fine-tuning LLMs: Self-Supervised, Supervised and Reinforcement Learning. Each method will be explored through both theoretical explanations and practical group-based projects, applying these concepts to real-world examples. Students will engage in hands-on projects to strengthen their understanding of how to customize and optimize LLMs for specific tasks or domains.
INF 382C: Understanding and Serving Users
What does it really mean to be user-centered? How do we practice user-centered design in a professional and methodical manner? What research findings can we rely on to help us improve user experiences? This is a readings/discussion course that examines in depth what we know about people (that is, what scientific research actually tells us) and how we can apply this knowledge in the real world of experience design. We examine human psychology, from physical ergonomics to cultural dispositions, stopping off on cognition and social analyses en route, so as to have a holistic, robust perspective on what it means to understand users. The readings are complemented with an examination of methods, e.g., what is a cognitive walkthrough and how do you do it reliably? What are the limitations of heuristic evaluations? The goal is to give you a solid grounding in the practices of user-centered thinking, regardless of your area of application, and prepare you for professional-level contributions in the user-experience world. There is no teamwork; all students deliver individual term papers and design critique diaries. There are also no prerequisites, technical or theoretical; the class is open to all.
Explore common data collection, management, and sharing practices in information technology and emerging technologies, such as search engines and AI systems. Students will read papers and engage in discussions about the pros and cons of established data practices and learn about the three main components of responsible data management: 1) consent and ownership, 2) privacy and anonymity, and 3) broader impact. Students will also practice how to collect data, make data-driven decisions, and design data-driven products through group projects as UX designers, researchers, and data scientists. The course will bring in interdisciplinary perspectives with guest speakers from archive science, engineering, and responsible AI, to provide a holistic view of broader data ecosystems and infrastructures.
In this course, we will work to understand and address the challenges of misinformation, disinformation, and strategic manipulation in online environments. First, we will work to develop a deep understanding of the problem space. We will read and discuss existing research (both historical and contemporary) on how and why misinformation and disinformation spread. Next, we will explore the process, both personal and interpersonal, by which these issues can be approached and addressed in our own lives. This will involve reflecting on our own presuppositions, beliefs, and biases about information; and doing a project in which we apply the principles of Human-Centered Design to investigate different design directions for addressing misleading information. Students will gain important contextual knowledge and hands-on design experience that they can take into future professional domains (from education to policy to technology), where they can contribute to building more trustworthy information systems.
Accessible UX provides students working (or planning to work) in any area of UX, Digital Product Management, or Development with key skills and insights into the current accessibility landscape, in addition to specific guidelines and WCAG conformance specifications. The course is divided into foundational and tactical modules. The first half of the course provides a comprehensive overview of Accessibility and its importance. The second half of the course involves evaluating real-world applications and websites per the WCAG guidelines, producing Accessibility reports, planning studies (with persons with disabilities), and designing for accessibility. Course Goals: 1. Become proficient in recognizing accessibility issues in key domains. 2. Understand successful team and organizational behaviors in Accessibility. 3. Learn how Accessible UX and Development are accomplished. 4. Evaluate Web and App experiences using the WCAG framework from W3.org/WAI.
The purpose of this course is to provide theoretical and practical foundations for information professionals who wish to design and evaluate search systems and services, taking user-centered approaches. This course explores search user interfaces, search behavior, search interaction, search user experience, search as learning, search for creativity, and research methods for understanding information behavior and evaluating search systems. Students will learn about search behavior across various contexts, including academic and professional settings, everyday life, and digital learning environments. Students will gain insights into how people interact with, use, and evaluate information in a variety of application areas, such as web search engines, domain-specific search systems, digital libraries, social search platforms, and generative AI-based systems.
Learning key data wrangling maneuvers in the abstract and their implementations in SQL, Excel, R Tidyverse, and Python Pandas. Maneuvers in data transformations include Nest, Pivot, Mutate (inc. separate/unite), Group/Summarize, and Rectangling. Projects include working with "wild caught" datasets (usually CSV or JSON) and computational notebook environments (e.g., IPython, Jupyter, R Markdown, Quarto). Fall 2024 has changes from the previous syllabus now that we have Database Design and Introduction to Programming. Nonetheless, the previous syllabus is still useful as it links to course materials that show the teaching approach and type of assignments. http://howisonlab.github.io/datawrangling/#Schedule_of_classes
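As a rough illustration of two of the maneuvers named above (Group/Summarize and Pivot), the sketch below uses only the Python standard library rather than Pandas. It is not taken from the course materials: the sample records, field names, and the helper functions group_summarize and pivot_wider are all hypothetical.

```python
from collections import defaultdict

# Hypothetical "long" records: one row per (city, year) observation.
rows = [
    {"city": "Austin", "year": 2023, "rides": 120},
    {"city": "Austin", "year": 2024, "rides": 150},
    {"city": "Dallas", "year": 2023, "rides": 90},
    {"city": "Dallas", "year": 2024, "rides": 110},
]


def group_summarize(records, key, value):
    """Group records by `key` and sum `value` within each group."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)


def pivot_wider(records, index, column, value):
    """Pivot long records to wide form: one entry per `index` value,
    holding one field per distinct `column` value."""
    wide = defaultdict(dict)
    for record in records:
        wide[record[index]][record[column]] = record[value]
    return {k: dict(v) for k, v in wide.items()}


totals_by_city = group_summarize(rows, "city", "rides")
rides_wide = pivot_wider(rows, "city", "year", "rides")
print(totals_by_city)
print(rides_wide)
```

In a Pandas, Tidyverse, or SQL setting the same two ideas would typically be expressed with built-in grouping and pivoting operations; the plain-Python version is only meant to show the shape of the transformation.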
Introduction to combining human and machine intelligence to benefit people and society. Explore cutting-edge research on a number of subjects related to human-AI interaction, including the psychological and societal impacts of AI as well as design guidelines and methods for human-centered AI.