USC Viterbi School of Engineering
INF 551: Foundations of Data Management
Units: 4
Term, Day, Time:
Fall 2016 – MW 10-11:50am (section 32418D), Location: GFS 101
Fall 2016 – MW 4-5:50pm (section 32431D), Location: THH 208

Instructor: Wensheng Wu
Office: GER 204
Office Hours: MW 3-4pm or by appointment
Contact Info: [email protected]

Grader: Jing Lin
Office: SAL computing lab
Office Hours: TBA
Contact Info: l [email protected]

Grader: Xingyuan Wang
Office: SAL computing lab
Office Hours: TBA
Contact Info: [email protected]

A. Catalogue Course Description
Function and design of modern storage systems, including cloud; data management techniques; data modeling; network attached storage, clusters and data centers; relational databases; the map-reduce paradigm.

B. Expanded Course Description
This course is one of the foundation courses in the Informatics program. It gives students fundamental knowledge of data management, which is critical for succeeding in the program's more advanced data management courses. It also exposes students to cutting-edge concepts, systems, and techniques for managing large-scale data, so that they have adequate background to explore big data analytics in follow-up courses.

The course may be divided into three parts:
(1) Fundamentals of data management: data storage, file systems, file formats, relational data vs. semi-structured data such as XML and JSON, conceptual modeling, relational modeling, relational algebra, SQL, views, constraints, query processing and optimization.
(2) Advanced topics in data management: data warehousing, data cleaning, ETL, data integration, and metadata management.
(3) Big data analytics: NoSQL, key-value and document stores, cloud data storage, distributed file systems, and MapReduce.
The course will also give students hands-on experience with RDBMSs (e.g., MySQL), cloud data storage (e.g., Amazon S3/Dynamo, CouchDB, Cassandra), and big data solution stacks (e.g., Apache Hadoop, Pig, and Spark).

C. Recommended Preparation
INF 550 taken previously or concurrently. Basic understanding of operating systems, networks, and databases. A basic understanding of engineering principles is required, including basic programming skills; familiarity with the Python or Java programming language is desirable.

D. Course Notes
The course will be run as a lecture class, with student participation strongly encouraged. There are weekly readings, and students are encouraged to complete them before the class discussion. All course materials, including the readings, lecture slides, and homework, will be posted online.

E. Technological Proficiency and Hardware/Software Required
Students are expected to know how to program in a language such as Python or Java.