YoungThund3rCat/modern-data-stack-bootcamp

Modern Data Stack Bootcamp - Phase 0 Template

Welcome to the Modern Data Stack Bootcamp! This template repository provides the foundation for your journey into modern data analytics using DuckDB, dbt, Great Expectations, and Metabase.

🎯 Overview

Phase 0 of the bootcamp introduces you to the complete modern analytics workflow:

  • Data Ingestion with DuckDB
  • Data Transformation with dbt
  • Data Validation with Great Expectations
  • Data Visualization with Metabase

📋 Prerequisites

  • Python 3.9 or higher
  • Git
  • Docker (for Metabase)
  • Text editor or IDE (VS Code, Deepnote, etc.)
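Before moving on, you may want to confirm the prerequisites above are actually available. The following is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of this repository.

```python
import shutil
import sys

# Minimum Python version listed in the prerequisites.
MIN_PYTHON = (3, 9)
# Command-line tools from the prerequisites; Docker is only needed for Metabase.
REQUIRED_TOOLS = ["git", "docker"]


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    `version_info` and `which` are parameters only so the check can be
    exercised without depending on the machine it runs on.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("All prerequisites found.")
```

This only checks that the tools are on your PATH; it does not verify that the Docker daemon is running.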

🚀 Quick Start

1. Clone this Repository

git clone https://github.com/your-org/modern-data-stack-bootcamp.git
cd modern-data-stack-bootcamp

2. Install Dependencies

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install required packages:

pip install -r requirements.txt

3. Configure dbt

Copy the example profiles file:

mkdir -p ~/.dbt
cp profiles.yml.example ~/.dbt/profiles.yml

Edit ~/.dbt/profiles.yml to update the path to your DuckDB database file.
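For orientation, a DuckDB profile for dbt generally has the shape produced by the sketch below. This is an illustration under assumptions, not the contents of `profiles.yml.example`: the profile name `bootcamp` and the database path are hypothetical, and the profile name in your file must match the `profile:` entry in `dbt_project.yml`. The `type: duckdb` and `path:` keys are the ones the dbt-duckdb adapter reads.

```python
from pathlib import Path


def render_profile(profile_name: str, db_path: str) -> str:
    """Render a minimal dbt profile that points at a DuckDB database file."""
    # An absolute path avoids surprises when dbt is run from another directory.
    absolute = Path(db_path).expanduser().resolve()
    return (
        f"{profile_name}:\n"
        f"  target: dev\n"
        f"  outputs:\n"
        f"    dev:\n"
        f"      type: duckdb\n"
        # Single quotes keep backslashes in Windows paths literal in YAML.
        f"      path: '{absolute}'\n"
    )


if __name__ == "__main__":
    # Hypothetical profile name and database location; adjust to your setup.
    print(render_profile("bootcamp", "data/bootcamp.duckdb"))
```

Compare the printed output with your copied `~/.dbt/profiles.yml` and change only the `path` value if the rest already matches.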

4. Verify Setup

Run dbt debug to verify your configuration:

dbt debug

You should see all checks passing. If not, see the Setup Guide for troubleshooting.

5. Get Started with Phase 0

Open the Phase 0 notebook and follow along:

jupyter notebook phase_0_modern_data_stack.ipynb

If you use Deepnote or VS Code, open the notebook directly instead.

📚 Project Structure

modern-data-stack-bootcamp/
├── models/
│   ├── staging/        # Your staging models (stg_*.sql)
│   └── marts/          # Your mart models (dim_*.sql, fct_*.sql)
├── tests/              # Custom dbt tests
├── macros/             # Reusable SQL macros
├── seeds/              # CSV files to load as tables
└── docs/               # Documentation
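If dbt cannot find your models or seeds, a quick check that the layout above is intact can help. The sketch below lists any expected directory that is missing; `missing_dirs` is a hypothetical helper, and the directory list is taken from the tree above.

```python
from pathlib import Path

# Directories shown in the project structure tree.
EXPECTED_DIRS = [
    "models/staging",
    "models/marts",
    "tests",
    "macros",
    "seeds",
    "docs",
]


def missing_dirs(project_root):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Project layout looks complete.")
```

Git does not track empty directories, so a fresh clone may legitimately lack some of these until you add your first file to them.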

📖 Documentation

🎓 Learning Objectives

By the end of Phase 0, you will:

  1. ✅ Set up a complete modern data stack environment
  2. ✅ Query and explore data using DuckDB and SQL
  3. ✅ Build production-ready data transformations with dbt
  4. ✅ Validate data quality with Great Expectations
  5. ✅ Create compelling visualizations with Metabase
  6. ✅ Practice professional collaboration with Git and PR reviews

🤝 Contributing

See CONTRIBUTING.md for guidelines on peer review and collaboration.

📝 License

This project is licensed under the MIT License - see LICENSE for details.

🙋 Getting Help

🎉 Ready to Start?

Work through the setup steps above, then open the Phase 0 notebook.

Happy learning! 🚀
