CS 50023: Data Engineering I

1 credit

Summer 2025 Distance Learning Upper Division
Data from Summer 2025, last updated 8/18/2025
Summer 2025 Instructors:

The course introduces students to the fundamentals of Data Engineering with a focus on tools and computational techniques to gather, construct, manipulate, summarize, and visualize data sets as a means to extract knowledge from the underlying data. Python and Python libraries are used. Completion of the course will allow learners to perform basic data analysis on data sets. Experience in Python Programming and Linear Algebra is required. The course also prepares learners for additional instruction in the courses Data Engineering II and Foundations of Decision Making.

Learning Outcomes

1. Identify key file types (TXT, CSV, HTML) and their characteristics. Using Python, read data in these formats.

2. Create and execute Python scripts to parse, select, transform, summarize, and visualize data.

3. Explain how to identify and fill in missing values in data.

4. Create informative visualizations from given data and recognize the key qualities of good visualizations.

5. Apply Pandas functions to slice, dice, and summarize datasets.

6. Apply the process of sampling data and sample probabilistically.

7. Demonstrate how to transform and construct features (e.g., standardization, distances).

8. Compute summary statistics from data (e.g., covariance, correlation).

9. Explain how to solve simple data analysis problems.
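
As a rough illustration of what a few of these outcomes involve (reading CSV data, filling in missing values, standardization, and correlation), here is a minimal sketch. It is not taken from the course materials: the data, column names, and function names are all hypothetical, and it uses only the Python standard library so that it runs on its own, whereas the course itself works with Pandas.

```python
import csv
import io
import statistics

# Hypothetical CSV content; the empty field in the third data row
# stands in for a missing value.
RAW_CSV = """height_cm,weight_kg
150,50
160,60
170,
180,80
"""


def read_columns(text):
    """Parse CSV text into a dict of column name -> list of floats (None if missing)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value) if value else None)
    return columns


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def correlation(xs, ys):
    """Pearson correlation, computed as the mean product of standardized values."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


columns = read_columns(RAW_CSV)
height = columns["height_cm"]
weight = fill_missing_with_mean(columns["weight_kg"])
r = correlation(height, weight)
print(f"filled weights: {weight}")
print(f"correlation between height and weight: {r:.3f}")
```

Mean imputation is only one of several ways to fill a gap, chosen here because it is the simplest to show; whether it is appropriate depends on the data.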

Course CS 50023 from Purdue University - West Lafayette.

Restrictions

Graduates

GPA by professor

Other terms
3.8 Ruby... (Spring 2021)
3.7 Jenn... (Spring 2021)

No grades available

Empty course schedule


BoilerCourses is an unofficial catalog for Purdue courses made by Purdue students.