2021 Spring INFO 290T 002 LEC 002

Special Topics in Technology

Data Engineering

Joseph M Hellerstein, Aditya Parameswaran

Jan 19, 2021 - May 07, 2021
Tu, Th
05:00 pm - 06:29 pm
Internet/Online
Class #:19579
Units: 2 to 4

Instruction Mode: Pending Review
Time Conflict Enrollment Allowed

Offered through School of Information

Current Enrollment

Total Open Seats: 24
Enrolled: 26
Waitlisted: 0
Capacity: 50
Waitlist Max: 25
Open Reserved Seats:
30 reserved for Information Management and Systems: Masters & PhD Students

Hours & Workload

2 to 8 hours of outside work per week, and 1 to 4 hours of instructor presentation of course materials per week.

Final Exam

FRI, MAY 14TH
11:30 am - 02:30 pm

Course Catalog Description

Specific topics, hours, and credit may vary from section to section and year to year.

Class Description

This new class on Data Engineering will cover the principles and practices of managing data at scale, with a focus on use cases in data analysis and machine learning. We will cover the entire life cycle of data management and science, ranging from data preparation to exploration, visualization, and analysis, to machine learning and collaboration. The class will balance foundational concerns with exposure to practical languages, tools, and real-world issues. We will study the foundations of prevalent data models in use today, including relations, tensors, and dataframes, and the mappings between them. We will study SQL as a means to query and manipulate data at scale, including performance concerns like views and indexes, query processing and optimization, and transactions, all from a user perspective. We will study the foundations and realities of data preparation, including hands-on work with real-world data using standard Python and SQL frameworks. We will explore data exploration modalities for non-programmers, including the fundamentals behind spreadsheet systems and interactive visual analytics packages. We will look at approaches for managing the machine learning lifecycle: data preparation, model selection and training, and model serving and monitoring. Time permitting, we will look at technologies for moving, sharing, and caching data, including event streaming systems, key-value/document stores, log analytics, and search engines.

Class Notes

Prerequisites:
* COMPSCI C100/DATA C100/STAT C100 or
* COMPSCI 189 or
* INFO 251 or
* DATA 144/INFO 254 or
* equivalent upper-division course in data science.
AND
* COMPSCI 61A or
* COMPSCI 88 or
* INFO 206B or
* equivalent courses in programming.

This class will not assume deep experience with databases or big data solutions.

Prerequisites will be MANUALLY ENFORCED during the first week of class. Faculty will be provided a list of all enrolled students and their pre-req status for review/drop by the end of week 2.

INFO 290T-002 is reserved for MIMS students. Undergraduate students interested in this course should enroll/waitlist for COMPSCI 194.035 and should contact the CS department for enrollment questions.

ATTENDANCE POLICY: Due to COVID-19, this course will be taught remotely. Real-time attendance is NOT required for this course. Live sessions will be recorded so that students can watch them on their own time. However, students must be present for the midterm and final exams; NO make-up exams will be offered.

Textbooks & Materials

See class syllabus or https://calstudentstore.berkeley.edu/textbooks for the most current information.

Associated Sections

None