Home
This wiki is for the Fall 2024 W4111 extra credit assignment.
The purpose of this open-ended extra credit assignment is for you to learn a database-adjacent technology, and create a tutorial that highlights its main concepts. In this way, you can teach the rest of your classmates something interesting.
Each technology will have a separate wiki page, which will be added to the following list. Up to 5 students can work together to write the wiki page for a given technology, and each person's contributions must be made clear at the top of the page (see template).
Although you will learn the technology by reading its documentation and tutorials, and using the technology, everything that you write and create should be completely original content for this extra credit assignment. If you wrote code in another project or class, then it is not original content.
It is probably a good idea to discuss your idea with Professor Wu during his office hours.
To submit, simply create a new Wiki page and edit it. No edits will be accepted after 12/8 11:59PM EST.
Extra credit can range from 0% to >3% of the total class grade. It is in your interest to make it easy for the staff to determine what you contributed. Grading is subjectively determined by the staff based on:
- accuracy and conciseness of the text,
- how clearly the text and example build on and relate to concepts we learned in W4111, and
- how easy the tutorial is to follow.
- (see template for more notes on the structure of the tutorial)
Each contributor will clearly state which part of the tutorial they contributed to, and will be graded based on that part. If it is not easy for the staff to identify or find your contribution, then it may not be attributed to you.
Remember, what you write should be completely original content for this extra credit assignment. If you wrote text/code in another project or class, or used any text generated by a large language model like ChatGPT, then it is not original content.
Add links to tutorial pages here. When you add a new page, you should follow the template.
- Template
- Modin
- Polars
- Apache Kafka
- DocETL
- Ibis
- Steampipe
- Google's SQL Pipe Syntax
- Feldera
- In-DB Machine Learning (Apache MADlib)
- MonetDB
- ClickHouse
- PostGIS
- PRQL
- Hex
- Your Technology Here
The following is a partial list of technologies you can study. Technologies can be advanced SQL/database functionalities, databases, data products, or database research papers. You can, and are encouraged to, create tutorials for technologies not listed below.
Data Tools
- DocETL: LLM-powered document processing pipeline
- The Datasette data multi-tool
- Ibis
- Prisma
- Great Expectations
- DBT (beyond project 2)
- Hex
- sqldiff
- Superset
- Apache Calcite Open Source Optimizer
- Kafka
- Apache Arrow
- Steampipe
SQL Dialects/Features/Extensions
- PRQL
- Google's SQL Pipe Syntax
- SQL user defined functions
- SQL Window Functions
- SQL foreign data wrappers
- PostGIS
- In-DB machine learning
DBMSes/Query Engines
- Convex
- Kuzu graph database
- MotherDuck
- Feldera
- Materialize
- RelationalAI
- MonetDB
- ClickHouse
- Quickstep
- DataFusion
- Cozo
- Lots of interesting cloud, embedded, specialized database systems!
Scalable Pandas
Your idea (good idea to check with Professor Wu first)!