Chicago Yelp Graph Database Analysis
Description
This project explores the use of EdgeDB, a graph-relational database, to model and analyze a real-world dataset: Yelp businesses and reviews from the Chicago metro area. Built for the MSDS 420-3 Databases course at Northwestern University, the notebook walks through schema design, data loading, and analytical queries over a semi-structured dataset.
By adopting EdgeDB, the project illustrates how flexible, queryable schemas can unlock new dimensions of insight, blending the rigor of relational databases with the adaptability of graph systems. Business metadata, category tags, and review sentiment are cross-linked to enable multi-hop queries that follow a chain of related questions, much as a curious reader would.
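To make "multi-hop query" concrete, here is a minimal sketch of how such a query might be issued from Python. The type and property names (`Review`, `stars`, `author`, `business`, `location`, `neighborhood`) are assumptions chosen for illustration and are not taken from the project's schema. `run_top_reviews` is a hypothetical helper that accepts any client object exposing a `query(text, **params)` method, such as the client returned by `edgedb.create_client()` in the official Python driver.

```python
# Hypothetical EdgeQL text: the schema names below are illustrative
# assumptions, not the project's actual types or properties.
TOP_REVIEWS_QUERY = """
select Review {
    stars,
    author: { name },
    business: {
        name,
        location: { neighborhood }
    }
}
filter .business.location.neighborhood = <str>$neighborhood
order by .stars desc
limit <int64>$limit
"""


def run_top_reviews(client, neighborhood, limit=10):
    """Run the query through any client exposing query(text, **params)."""
    return client.query(TOP_REVIEWS_QUERY, neighborhood=neighborhood, limit=limit)
```

The filter walks two links (`business`, then `location`) in a single path expression, which is the kind of hop that would otherwise require explicit joins in SQL.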
⸻
Features
- Graph Schema Design: Models Yelp businesses, users, reviews, and locations as typed relations and links in EdgeDB.
- Data Ingestion: Loads cleaned JSON data into EdgeDB and validates schema integrity.
- Query Language Mastery: Uses EdgeQL for pattern-matching and filtering, including aggregates, conditional expressions, and nested structures.
- Review Sentiment & Trends: Analyzes how review scores and comment patterns shift across neighborhoods and categories.
- Graph Navigation: Demonstrates multi-hop joins between users → reviews → businesses → locations.
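The navigation and trend features above can be illustrated with a small in-memory sketch. It uses plain Python dataclasses rather than EdgeDB, and every class, field, and function name here is an assumption made for illustration; it does not reproduce the notebook's schema or queries. The point is only to show the shape of the hops: each review links to an author and a business, and each business links to a location.

```python
from collections import defaultdict
from dataclasses import dataclass


# Illustrative stand-ins for the linked object types; names are assumptions.
@dataclass(frozen=True)
class Location:
    neighborhood: str


@dataclass(frozen=True)
class Business:
    name: str
    location: Location


@dataclass(frozen=True)
class User:
    name: str


@dataclass(frozen=True)
class Review:
    author: User
    business: Business
    stars: int


def average_stars_by_neighborhood(reviews):
    """Hop review -> business -> location, then average stars per neighborhood."""
    totals = defaultdict(lambda: [0, 0])  # neighborhood -> [sum of stars, count]
    for review in reviews:
        bucket = totals[review.business.location.neighborhood]
        bucket[0] += review.stars
        bucket[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


def neighborhoods_reviewed_by(user, reviews):
    """Hop user -> reviews -> businesses -> locations for a single user."""
    return {r.business.location.neighborhood for r in reviews if r.author == user}
```

In a graph-relational database the same traversals would be expressed as link paths inside a query rather than as Python attribute access, but the chain of hops is the same.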
⸻
Key Insight
SQL-based tools dominate enterprise analytics, but graph-relational platforms like EdgeDB offer a more natural fit for richly connected data. This project shows how the right data model can turn flat tables into exploratory engines for user and business insight.
View the source code on GitHub