Databases
A sub-journey dedicated to learning databases, starting from zero.
The goal is to master database theory.
We embarked on a sub-journey to master database fundamentals, aiming to build a strong theoretical and practical understanding essential for leveraging technological and AI advancements.
Romain initiated a structured learning plan for databases, breaking the content into one-hour daily modules that combine theory and practice. The plan is segmented into "Fundamental Concepts," "Practical Skills," and "Advanced Topics." The reasoning was that small, manageable chunks keep us engaged and help us retain the material.
Day 1 focused on "Data Types and Structures." Romain led us through the fundamental data types (Integer, Floating-Point, Text, Boolean, Date/Time) and their role in data integrity, storage optimization, and operational validity, alongside the concept of `NULL` ("the *absence* of a value"). We explored data structures, highlighting tables as the core of relational databases and how the data type assigned to each column enforces integrity. Practically, we set up SQLite via CLI, GUI, or online interfaces, learned essential dot-commands (`.tables`, `.schema`, `.help`, `.exit`), and practiced creating and populating a `users` table with `CREATE TABLE` and `INSERT INTO` statements. We concluded Day 1 with simple data retrieval using `SELECT`, followed by an assessment. Romain also made sure the module included practice exercises for applying the concepts, visual aids for visual learners, real-world context for practical relevance, and a common-pitfalls section to help new learners avoid typical mistakes. This was the beginning of our database journey.
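The Day 1 workflow can be sketched from Python with the standard-library `sqlite3` module instead of the SQLite CLI. This is a minimal sketch, not the lesson's actual material: the column names and sample rows are hypothetical. One detail worth knowing is that SQLite has no dedicated Boolean or Date/Time storage class, so the sketch stores booleans as `0`/`1` integers and dates as ISO-8601 text.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: column names are illustrative, not taken from the lesson.
conn.execute(
    """
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        age        INTEGER,
        balance    REAL,
        is_active  INTEGER NOT NULL DEFAULT 1,  -- boolean stored as 0/1
        created_at TEXT                         -- ISO-8601 date string
    )
    """
)

# Hypothetical sample rows; Python's None is stored as SQL NULL.
rows = [
    ("Alice", 30, 120.50, 1, "2024-01-15"),
    ("Bob", None, 0.0, 0, "2024-02-01"),
]
conn.executemany(
    "INSERT INTO users (name, age, balance, is_active, created_at) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)
conn.commit()

# Simple retrieval: pick specific columns from the table.
all_users = conn.execute("SELECT id, name, age FROM users ORDER BY id").fetchall()
print(all_users)  # [(1, 'Alice', 30), (2, 'Bob', None)]
```

Bob's missing age comes back as `None`, which is how `sqlite3` represents `NULL`: the absence of a value rather than zero or an empty string.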
Romain then provided a detailed breakdown of the `SELECT` command, "the cornerstone of querying databases." We covered its basic syntax, including selecting specific or all columns and specifying the `FROM` table. We learned to filter data using the `WHERE` clause with comparison operators and the logical operators `AND`, `OR`, and `NOT`. We also explored sorting results with `ORDER BY` (`ASC`/`DESC`) and limiting rows with `LIMIT`. The explanation included aggregate functions (`COUNT`, `SUM`, `AVG`, `MIN`, `MAX`), grouping results with `GROUP BY`, and filtering those groups with `HAVING`. We addressed handling `NULL` values with `IS NULL` and `COALESCE`, and touched upon modern SQL features like Window Functions and Common Table Expressions (CTEs). Romain included practical exercises and highlighted common pitfalls, such as testing for `NULL` with `=` instead of `IS NULL` or grouping improperly. He also illustrated the logical order in which a `SELECT` query is evaluated (FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT).
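The clauses above can be seen working together in one query. The following is a minimal sketch using Python's `sqlite3` module. The `orders` table and its rows are hypothetical, chosen only so that each clause visibly changes the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and data, for illustration only.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount, status) VALUES (?, ?, ?)",
    [
        ("alice", 50.0, "paid"),
        ("alice", 70.0, "paid"),
        ("bob", 20.0, "paid"),
        ("bob", None, "paid"),         # amount is NULL
        ("carol", 200.0, "cancelled"),
        ("dave", 90.0, "paid"),
    ],
)

# Logical evaluation: FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY -> LIMIT
query = """
    SELECT customer,
           COUNT(*)                 AS n_orders,
           SUM(COALESCE(amount, 0)) AS total   -- treat NULL amounts as 0
    FROM orders
    WHERE status = 'paid'                      -- row filter, before grouping
    GROUP BY customer
    HAVING SUM(COALESCE(amount, 0)) > 30       -- group filter, after aggregation
    ORDER BY total DESC
    LIMIT 2
"""
top = conn.execute(query).fetchall()
print(top)  # [('alice', 2, 120.0), ('dave', 1, 90.0)]

# The NULL pitfall: "= NULL" never matches, "IS NULL" does.
eq_null = conn.execute("SELECT COUNT(*) FROM orders WHERE amount = NULL").fetchone()[0]
is_null = conn.execute("SELECT COUNT(*) FROM orders WHERE amount IS NULL").fetchone()[0]
print(eq_null, is_null)  # 0 1
```

Carol is dropped by `WHERE` before any grouping happens, while Bob survives `WHERE` but is dropped by `HAVING` because his group total is only 20. That difference is the practical meaning of the evaluation order.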
Finally, Romain presented SQL's syntax essentials, stressing its logical, language-like structure and its "declarative nature" ("We declare WHAT we want, not HOW to get it"), which is vital for AI applications in data preprocessing, training dataset queries, and efficient data management. We covered core CRUD operations (`CREATE`, `SELECT`, `INSERT`, `UPDATE`, `DELETE`) and the typical `SELECT` query clause structure. We reviewed logical operators and common aggregate functions. Romain highlighted relevant SQL patterns for AI/ML, such as data sampling, feature extraction, and data cleaning queries. He also shared "gotchas" and best practices, including caution with SQL injection, data type validation, and using aliases for readability. Looking ahead, Romain noted the benefits of Object-Relational Mapping (ORM) tools, like Django's ORM (a favorite of Mike's), as a bridge between SQL databases and object-oriented code, underscoring that mastering SQL fundamentals will greatly enhance our ability to leverage ORMs and Python's data libraries effectively, as "SQL's logical thinking patterns translate well." Mike is also building a Python library for rapid code mutation that Romain will be able to use, further solidifying the synergy between our database learning and future AI applications.

By Romain Peter