
CSC2525

Research Topics in Database Management

Instructor: Niv Dayan

Lectures: Wednesday 13:00-15:00 (UC 85)

Office Hours: after each class

Bigger, Faster, and Stronger Systems

This is a research seminar course on database systems and their inner data structures. We will read exciting papers to acquaint ourselves with both classic and state-of-the-art constructions. You will also pursue a fun research project of your choice, in which you will implement and evaluate some of the methods we learn about.

Reading & Presenting Research Papers

A core part of this course is reading and digesting research papers from the database research community. Throughout the course, we will cover approximately 20 research papers. Each student will present two papers to the class along with a classmate. Each presentation will be followed by a Q&A session with the class.

Project

Throughout the course, you will propose and pursue a research project that involves implementing and benchmarking a technique or method covered in class. More information can be found here.

Office Hours

There will be office hours immediately after each class. In addition, students are encouraged to book office hours with the instructor, especially before class presentations to ensure high quality.

Student Participation

As this is a course with a small class, students are required to attend classes in person. Students are strongly encouraged to participate in the Q&A session after each paper presentation.

Class Structure

Each class will begin with a 1.5-hour lecture by the instructor covering core techniques. The rest of each class will consist of student presentations.

Oral Exam

At the end of the course, the instructor will hold an oral exam to test students on the course content. Here are sample exam questions.

Grade Components

The final grade for each student will be derived from their presentations (20%), project (40%), and oral exam (40%).

Prerequisites

Students should have taken the courses listed here, or have equivalent knowledge of algorithms, data structures, SQL, and operating systems. Hands-on experience with a low-level, high-performance programming language like C/C++, Rust, or Java is also required. Ideally, students should also have taken a course similar to CSC443/CSC2234: Database System Technology, be familiar with indexing and filter data structures, buffer pools, page layouts, and table layouts, and have detailed knowledge of the memory hierarchy. However, CSC443 is not required, and there will be opportunities to catch up on this material.

Academic Integrity

The project hand-ins must be the group’s own work. It is an academic violation to copy code or experimental results from other groups, whether you do the copying yourself or let someone else copy from you. That said, we encourage you to discuss course material widely with your fellow students within and across groups.

Piazza

We will be using Piazza as our main discussion board. You are responsible for reading all postings made by me or the TAs. Please use Piazza to ask questions about assignments and course lecture materials so that everyone can benefit.

Contact

Course announcements will arrive through Piazza. Aside from that, this course website is required reading. It contains essential material and will be updated throughout the semester. Please use Piazza whenever possible to ask questions about course material. For personal questions, email me at nivdayan@gmail.com. Please include "CSC2525" in the subject line along with your full name. If you do not hear back quickly, we are always available during office hours to help.

Accessibility

The University of Toronto is committed to accessibility. If you require accommodations or have any accessibility concerns, please visit Accessibility Services as soon as possible.

Course Material Synopsis

We will study various topics, including expandable arrays, probabilistic filter data structures (XOR, blocked Bloom, range, and quotient filters), database page layouts, compression techniques (e.g., Elias-Fano, LZ77, delta encoding, entropy encoding), advanced index trees and tries, multi-dimensional indexes (R-trees, space-filling curves), and key-value stores.

Course schedule

We hope this course will get you excited about research. If you excel in this course and are seeking research opportunities, check out my home page and get in touch.
