1 credit
Summer 2025 Distance Learning Upper Division
The course introduces students to the fundamentals of Data Engineering with a focus on tools and computational techniques to gather, construct, manipulate, summarize, and visualize data sets as a means to extract knowledge from the underlying data. Python and Python libraries are used. Completion of the course will allow learners to perform basic data analysis on data sets. Experience in Python Programming and Linear Algebra is required. The course also prepares learners for additional instruction in the courses Data Engineering II and Foundations of Decision Making.
Learning Outcomes
1. Identify key file types (TXT, CSV, HTML) and their characteristics, and read data in these formats using Python.
2. Create and execute Python scripts to parse, select, transform, summarize, and visualize data.
3. Explain how to identify and fill in missing values in data.
4. Create informative visualizations from given data and recognize the key qualities of good visualizations.
5. Apply Pandas functions to slice, dice, and summarize datasets.
6. Apply data sampling techniques, including probabilistic sampling.
7. Demonstrate how to transform and construct features (e.g., standardization, distances).
8. Compute summary statistics from data (e.g., covariance, correlation).
9. Explain how to solve simple data analysis problems.