2 credits
Spring 2026 Lecture Upper Division
This course prepares master's students in advanced and frontier techniques for analyzing unstructured data, building on prerequisite knowledge of basic analysis of structured data (e.g., data mining using Python). Upon completion, students are expected to be competent and skilled in collecting, processing, and analyzing online unstructured data using state-of-the-art artificial intelligence models (e.g., neural networks and deep learning methods) and mainstream open-source toolkits (e.g., NLTK, scikit-learn, Keras, and TensorFlow). At a minimum, students should be able to design and lead industry-level practical projects at a conceptual level when working with data scientists. We use text and images as illustrative examples and cover the following applications: text mining and feature extraction (e.g., sentiment, topic, readability); image classification (e.g., recognition); and advanced applications (e.g., translation, chatbots, image expression). The course combines seminars with hands-on coding practice in class; students are expected to code (and debug) during class. The main method topics are: crawling unstructured data using Scrapy; representation and embedding of text and images; unsupervised learning for unstructured data (e.g., LDA topic modeling, clustering); and supervised learning for unstructured data (e.g., classic ML models, A/D/C/RNN, LSTM). Permission of instructor required.
Learning Outcomes
1. Be competent and skilled in collecting, processing, and analyzing online unstructured data using state-of-the-art artificial intelligence models (e.g., neural networks and deep learning methods) and mainstream open-source toolkits (e.g., NLTK, scikit-learn, Keras, and TensorFlow).
2. Design and lead industry-level practical projects jointly with data scientists.