Course Descriptions

DATA 501: Data Science Visualization Lab

Credits: 1

Taken in conjunction with HCDE 511: Information Visualization, this class provides students with additional opportunities to practice and discuss data visualization concepts, with an emphasis on user-centered design (UCD) approaches and software development. Students work in small groups on structured data visualization exercises, practice UCD methods and implement simple visualizations.

DATA 512: Human-Centered Data Science

Credits: 5

This course focuses on fundamental principles of data science and its human implications. We’ll cover data ethics; data privacy; differential privacy; algorithmic bias; legal frameworks and intellectual property; provenance and reproducibility; data curation and preservation; user experience design and usability testing for big data; ethics of crowdwork; data communication; and societal impacts of data science.

DATA 514: Data Management for Data Science

Credits: 5

This course introduces students to database management systems and techniques that use these systems. Topics covered include data models; query languages; database tuning and optimization; data warehousing; and parallel processing.

DATA 515: Software Design for Data Science

Credits: 5

This course introduces students to software design and engineering practices and concepts, including version control, testing and automatic build management.

DATA 516: Scalable Data Systems & Algorithms

Credits: 5

This course focuses on principles and algorithms for data management and analysis at scale. We’ll cover the design and use of traditional and modern big data systems, as well as the basics of cloud computing.

DATA 556: Introduction to Statistics & Probability

Credits: 5

In this course, you’ll get an overview of probability; conditional probability and independence; Bayes’ theorem; discrete and continuous random variables, including jointly distributed random variables; key distributions, including the normal distribution and its spin-offs; properties of expectation and variance; conditional expectation; covariance and correlation; the central limit theorem; the law of large numbers; and parameter estimation.

DATA 557: Applied Statistics & Experimental Design

Credits: 5

This course focuses on inferential statistical methods for discrete and continuous random variables, including tests for difference in means and proportions; linear and logistic regression; causation versus correlation; confounding; resampling methods; and study design.

DATA 558: Statistical Machine Learning for Data Scientists

Credits: 5

This course covers bias-variance trade-off; training versus test error; overfitting; cross-validation; subset selection methods; regularized approaches for linear/logistic regression: ridge and lasso; non-parametric regression: trees, bagging, random forests; local regression and splines; generalized additive models; support vector machines; k-means and hierarchical clustering; and principal components analysis.

DATA 590: Data Science Capstone I - Project Preparation

Credits: 2

This course is part one of a two-course capstone sequence where students organize project teams, select project topics, write a project proposal and begin preparing project data sets.

DATA 591: Data Science Capstone II - Project Implementation

Credits: 3

This course is part two of a two-course capstone sequence designed to build upon the student-driven project from DATA 590. Students synthesize and apply knowledge and techniques acquired throughout the Master of Science in Data Science program for working with large data sets, deriving insights from data and sharing insights with other people.

HCDE 511: Information Visualization

Credits: 4

This course covers the design and presentation of digital information, teaching students how to use graphics, animation, sound and other modalities to present information to users. You’ll also learn about vision and perception; methods for presenting complex information to enhance comprehension and analysis; and how to incorporate visualization techniques into human-computer interfaces.