1 credit
Summer 2025, Distance Learning, Upper Division
This course provides an overview of data science methods used for data-driven discovery, knowledge extraction, and informed decision making. The course covers fundamental computational methods and statistical techniques used to reason correctly about uncertainty, conduct hypothesis tests, infer causal relationships, and apply and evaluate predictive models. The course highlights how sampling biases can impact fairness in decision making. Throughout, students get hands-on experience in making correct and explainable inferences from data. Experience in Python programming, probability, statistics, and linear algebra is required.
Learning Outcomes
1. Support and justify claims with evidence from data and discuss how human biases can impact interpretations.
2. Recognize the basic approach to formulating statistical hypothesis tests and apply parametric hypothesis tests for both discrete and continuous data.
3. Interpret p-values and apply them effectively to support claims about data.
4. Formulate and test conjectures with controlled experiments, including A/B testing.
5. Recognize the types of errors encountered when conducting multiple hypothesis tests.
6. Use computational approaches based on randomization for hypothesis testing and uncertainty quantification.
7. Discuss ethical issues such as privacy and fairness that can be associated with automating decisions based on machine learning methods.
8. Apply the basic components of classification methods using Python libraries.
9. Evaluate the performance of learned models with learning curves and k-fold cross validation and assess significance with hypothesis tests.