Database Querying and SQL EDA (MSDS 420 - Module 4)
Description
This project builds on foundational database skills by executing advanced SQL queries and performing exploratory data analysis using real-world Yelp Chicago business and review datasets. It covers join logic, subqueries, filtering, aggregation, and basic statistical analysis directly within SQL.
Developed as part of the MSDS 420 course at Northwestern University, the assignment demonstrates how SQL can drive business insights from complex data models, a key skill for any data analyst or data scientist working with structured data sources.
⸻
Features
- Schema Mapping: Explores entity relationships in the Yelp dataset using foreign keys and joins.
- SQL Querying: Demonstrates filtering, aggregation, grouping, and ordering to surface business intelligence.
- Subqueries & CTEs: Uses nested queries and common table expressions for modular SQL development.
- Basic EDA in SQL: Highlights how to compute summary statistics and discover trends using SQL only.
- Chicago Focus: All queries and results relate to businesses and reviews in the Chicago metro area.
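
As a rough illustration of the join, aggregation, and CTE features listed above, the sketch below runs one query against a tiny in-memory SQLite database. The `business` and `review` tables, their columns, and the sample rows are hypothetical stand-ins, not the actual Yelp schema or data used in the assignment, and the course database engine may differ from SQLite.

```python
import sqlite3

# Hypothetical, simplified schema and made-up rows; the real Yelp tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business (
    business_id TEXT PRIMARY KEY,
    name        TEXT,
    city        TEXT
);
CREATE TABLE review (
    review_id   TEXT PRIMARY KEY,
    business_id TEXT REFERENCES business(business_id),
    stars       INTEGER
);
INSERT INTO business VALUES
    ('b1', 'Deep Dish Co',   'Chicago'),
    ('b2', 'Lakeview Cafe',  'Chicago'),
    ('b3', 'Evanston Diner', 'Evanston');
INSERT INTO review VALUES
    ('r1', 'b1', 5), ('r2', 'b1', 4), ('r3', 'b1', 3),
    ('r4', 'b2', 2), ('r5', 'b2', 4),
    ('r6', 'b3', 5);
""")

query = """
WITH business_stats AS (
    -- Join reviews to businesses via the foreign key, then aggregate per business.
    SELECT b.business_id,
           b.name,
           COUNT(r.review_id) AS review_count,
           AVG(r.stars)       AS avg_stars
    FROM business AS b
    JOIN review AS r ON r.business_id = b.business_id
    WHERE b.city = 'Chicago'
    GROUP BY b.business_id, b.name
)
-- Filter and order the aggregated rows outside the CTE.
SELECT name, review_count, avg_stars
FROM business_stats
WHERE review_count >= 2
ORDER BY avg_stars DESC, name;
"""

rows = conn.execute(query).fetchall()
for name, review_count, avg_stars in rows:
    print(name, review_count, round(avg_stars, 2))
```

Keeping the aggregation inside the CTE and the filtering in the outer query is one way to keep each step readable; the same result could also be written with a `HAVING` clause.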
⸻
Key Insight
SQL isn't just for CRUD; it's an exploratory and analytical language when used fluently. This module shows how to generate insights directly within the database layer, minimizing the need for data export or transformation.
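
A minimal sketch of that idea, assuming a hypothetical `review` table with a `stars` column and made-up ratings (not the project's data): the summary statistics and the rating distribution are computed entirely in SQL, so only a handful of aggregate rows leave the database. The variance is derived from averages because SQLite has no built-in variance function; other engines may offer one directly.

```python
import sqlite3

# Hypothetical table and sample ratings, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE review (review_id TEXT PRIMARY KEY, stars INTEGER);
INSERT INTO review VALUES
    ('r1', 5), ('r2', 4), ('r3', 3), ('r4', 4), ('r5', 4);
""")

# Count, range, mean, and population variance, all computed by the database.
summary = conn.execute("""
SELECT COUNT(*)   AS n,
       MIN(stars) AS min_stars,
       MAX(stars) AS max_stars,
       AVG(stars) AS mean_stars,
       AVG(stars * stars) - AVG(stars) * AVG(stars) AS var_stars
FROM review;
""").fetchone()

# Frequency of each star rating, a simple distribution check.
distribution = conn.execute("""
SELECT stars, COUNT(*) AS n_reviews
FROM review
GROUP BY stars
ORDER BY stars;
""").fetchall()

print(summary)
print(distribution)
```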
View the source code on GitHub