Spring 2021
INFO 290T 002 - LEC 002
Special Topics in Technology
Data Engineering
Joseph M Hellerstein, Aditya Parameswaran
Jan 19, 2021 - May 07, 2021
Tu, Th
05:00 pm - 06:29 pm
Internet/Online
Class #:19579
Units: 2 to 4
Instruction Mode:
Pending Review
Time Conflict Enrollment Allowed
Offered through
School of Information
Current Enrollment
Total Open Seats:
24
Enrolled: 26
Waitlisted: 0
Capacity: 50
Waitlist Max: 25
Open Reserved Seats:
30 reserved for Information Management and Systems: Masters & PhD Students
Hours & Workload
2 to 8 hours of outside work per week, and 1 to 4 hours of instructor presentation of course materials per week.
Final Exam
FRI, MAY 14TH
11:30 am - 02:30 pm
Course Catalog Description
Specific topics, hours, and credit may vary from section to section and year to year.
Class Description
This new class on Data Engineering will cover the principles and practices of managing data at scale, with a focus on use cases in data analysis and machine learning. We will cover the entire life cycle of data management and science, ranging from data preparation to exploration, visualization and analysis, to machine learning and collaboration.
The class will balance foundational concerns with exposure to practical languages, tools, and real-world concerns. We will study the foundations of prevalent data models in use today, including relations, tensors, and dataframes, and mappings between them. We will study SQL as a means to query and manipulate data at scale, including performance concerns like views and indexes, query processing and optimization, and transactions, all from a user perspective. We will study the foundations and realities of data preparation, including hands-on work with real-world data using standard Python and SQL frameworks. We will explore data exploration modalities for non-programmers, including the fundamentals behind spreadsheet systems and interactive visual analytics packages. We will look at approaches for managing the machine learning lifecycle of data preparation, model selection and training, and model serving and monitoring. Time permitting, we will look at technologies for moving, sharing, and caching data, including event streaming systems, key-value/document stores, log analytics, and search engines.
Class Notes
Prerequisites:
* COMPSCI C100/DATA C100/STAT C100 or
* COMPSCI 189 or
* INFO 251 or
* DATA 144/INFO 254 or
* equivalent upper-division course in data science.
AND
* COMPSCI 61A or
* COMPSCI 88 or
* INFO 206B or
* equivalent courses in programming. This class will not assume deep experience with databases or big data solutions.
Prerequisites will be MANUALLY ENFORCED during the first week of class. Faculty will be provided a list of all enrolled students and their pre-req status for review/drop by the end of week 2.
INFO 290T-002 is reserved for MIMS students. Undergraduate students interested in this course should enroll/waitlist for COMPSCI 194.035 and should contact the CS department for enrollment questions.
ATTENDANCE POLICY: Due to COVID-19, this course will be taught remotely. Real-time attendance is NOT required for this course. There will be recordings of live sessions that students can watch on their own time. However, students must be present for the mid-term and final exams; there will be NO make-up exams offered.
Rules & Requirements
Repeat Rules
Textbooks & Materials
See class syllabus or https://calstudentstore.berkeley.edu/textbooks for the most current information.
Guide to Open, Free, & Affordable Course Materials
Associated Sections
None