Skip to content

Data Modeling & SQL

You should be familiar with the core concepts of relational databases like data modeling, normalization, and SQL.

During the MSIS Program

As part of the MSIS program you will design data-driven applications, and pull data from databases for statistical modeling. You will also learn about application architectures, and the trade-offs between structured and unstructured data, and between relational and non-relational data stores. Lastly, you will be introduced to databases from several vendors.

Major Concepts

Before entering the MSIS program, you should be able to do the following:

  1. Build a correct relational data model and express it as an ER or logical diagram.
  2. Explain the difference between data, information, and knowledge.
  3. Explain the difference between structured and unstructured data.
  4. Normalize a data model using at least First Normal Form thru Third Normal Form, and BCNF.
  5. Model your data using ERD and logical diagrams

Normalization

  1. The following two tutorials will introduce you to data Normalization:

SQL

  1. You should be familiar with how to use SQL to insert, update, and retrieve data from a database. You should also be able to create and drop tables.
  1. It would be to your advantage to have experience installing and using a RDMS such as MySQL, MS SQL Server, or SQLite (pick one). This will also allow you to do the practice problems below.
  2. Probably the easiest (free) way to learn SQL is via the “Intro to SQL” unit in Khan Academy’s “Computer Programming” course.

  3. Two good—but very different—books on SQL include:

    • Sams Teach Yourself SQL in 10 Minutes a Day (5th Edition), Ben Forta, Sams Publishing (2019), ISBN-13: 978-0135182796, [Amazon] Highly recommended. This book covers SQL as understood by MySQL, SQLite, MS SQL Server, and several other RDMS.
    • Head First SQL: Your Brain on SQL, Lynn Beighley, O’Reilly Media (2007), ISBN-13: 978-0596526849 [Amazon]. The “Head First” series of books are quirky, fun, and easy to read. Readers either love or hate the style (most love it). This book focuses primarily on SQL as used by the MySQL database engine.
  4. Installing software:

    • It’s best if you get the practice installing, setting up, and using your own database, but free resources exist as well. DB Fiddle and SQL Playground both provide ways to practice SQL via your web browser, without installing any software.

    • If you wish to install a database on your local machine (yes!), we recommend MySQL or SQLite (use the pre-compiled bundles with command line tools).

Practice

  1. Rick Watson has published an “Classic Models” database and SQL file (assumes MySQL), along with a set of practice problems, to accompany his book1 on data modeling and SQL. You should be able to successfully write queries for his sample problems in these categories:

    • “Single Entity” (all)
    • “One to many relationship” (all)
    • “Many to many relationship” (all)
    • “General queries”, simple aggregations, such as problems #7, and #20–#24.
  2. Those with an advanced knowledge of SQL (not required, but encouraged) will also be able to complete all of the problems in the “General Queries” and “Correlated subqueries” categories.


  1. Rick Watson is author of Data Management: Databases and Organizaitons (sixth edition) (2020) [Amazon]