Data Analysis + LLM Curriculum (Python, R, PydanticAI with Ollama)

A 12-week, project-based roadmap for learning data analysis, Unix/Linux basics, and LLM apps using Python, R, PydanticAI, and Ollama (local models).
All resources are 100% free and include lectures, interactive projects, GitHub practice, and an optional "cool kids" track with Neovim + CLI tools.

Week -1: Mac + Homebrew Setup Checklist

Step 1: Install Python + Tools

Install Python 3.11 → brew install [email protected]
Install uv package manager → brew install uv
Install Git → brew install git
Install iTerm2 → brew install --cask iterm2
Install VSCode → brew install --cask visual-studio-code
Verify Python version → python3 --version

Step 2: Install R + RStudio

Install R → brew install --cask r
Install RStudio → brew install --cask rstudio

Step 3: Install JupyterLab

Install JupyterLab → uv pip install jupyterlab
Run Jupyter → jupyter lab

Step 4: Install Data Science Libraries

Install NumPy → uv pip install numpy
Install pandas → uv pip install pandas
Install matplotlib → uv pip install matplotlib
Install seaborn → uv pip install seaborn
Install SciPy → uv pip install scipy
Install statsmodels → uv pip install statsmodels

Step 5: Install AI Stack

Install Pydantic → uv pip install pydantic
Install PydanticAI → uv pip install pydantic-ai
Install ChromaDB → uv pip install chromadb

Step 6: Install Ollama

Install Ollama → brew install ollama
Pull Llama3.1 model → ollama pull llama3.1
Pull nomic-embed-text model → ollama pull nomic-embed-text
Test Ollama → ollama run llama3.1 "Hello world"

Step 7: Create Project Folder

Create project folder → mkdir data-analyst-curriculum && cd data-analyst-curriculum
Create virtual environment → uv venv .venv
Activate environment → source .venv/bin/activate

Suggested Repo Layout

data-analyst-curriculum
├── 01_unix_linux
├── 02_python_basics
├── 03_pandas_eda
├── 04_stats_intro
├── 05_r_tidyverse
├── 06_pydanticai_rag
├── 07_pydanticai_assistant
├── 08_sql_tooling
├── 09_eval_observability
└── capstones
    ├── analyst_copilot_pydanticai
    └── r_eda_report

Week 0: CS & Unix/Linux

Lectures

Interactive

OverTheWire – Bandit Wargame (hands-on Linux terminal practice)

Weeks 1–3: Python Foundations + Unix/Linux Basics

Lectures

Interactive Projects

freeCodeCamp – Scientific Computing with Python Certification
Projects: Arithmetic Formatter, Time Calculator, Budget App, Polygon Area Calculator, Probability Calculator

Weeks 4–5: Statistics & R

Lectures

Interactive Projects

freeCodeCamp – Data Analysis with Python Certification
Projects: Mean-Variance Calculator, Demographic Data Analyzer, Medical Data Visualizer, Time Series Visualizer, Sea Level Predictor

Weeks 6–7: Python for Data Analysis

Lectures

Interactive

Finish freeCodeCamp Data Analysis with Python Certification

Weeks 6–12: LLMs, PydanticAI & AI Agents

Lectures

freeCodeCamp – LangChain for Beginners (concepts apply to PydanticAI)
DeepLearning.AI – Free Short Courses on LLMs

Week 10: SQL & Databases

Lectures

Interactive Projects

freeCodeCamp – Relational Database Certification
Projects: Celestial Bodies Database, World Cup Database, Salon Scheduler, Number Guessing Game

Weeks 11–12: Capstones & Projects

Resources

Interactive Projects

freeCodeCamp – Machine Learning with Python Certification
Projects: Rock, Paper, Scissors AI, Cat/Dog Classifier, Book Recommender, Stock Predictor, Neural Network SMS Classifier

Git & GitHub (Do This Throughout)

Weekly GitHub Habits

Create a repo for each week/project
Add README.md summarizing what was learned
Commit often (git add . && git commit -m "message" && git push)
Upload Jupyter Notebooks (render well on GitHub)
Pin best repos to showcase skills

⚡ Optional: "Cool Kids" Neovim + CLI Tools Track

Neovim Setup

Install Neovim → brew install neovim
Learn basics: i (insert), :wq (save & quit), dd (delete line), / (search)
freeCodeCamp – Vim Tutorial for Beginners (2 hrs)
Explore config: ~/.config/nvim/init.lua

CLI Tools (Make Linux fun & powerful)

Install fzf (fuzzy finder) → brew install fzf
Install ripgrep (fast search) → brew install ripgrep
Install htop (system monitor) → brew install htop
Install bat (better cat) → brew install bat
Install exa (better ls) → brew install exa
Bonus: Try tmux for terminal multiplexing → brew install tmux

Practice

Replace basic commands: cat → bat, ls → exa
Use fzf to search command history
Use ripgrep to search code fast
Manage processes with htop
Keep multiple projects open in tmux

✅ Finish Week -1 through Week 12, sprinkle in the optional “cool kids” track, and she’ll have a full data + AI + Linux toolkit with a strong GitHub portfolio.

📊 Data Analysis + LLM Projects (Week-by-Week)

This project roadmap matches the 12-week curriculum. All projects are aligned to tools introduced that week, and many include LLM integration using PydanticAI and Ollama.

✅ Week 0: CS & Unix/Linux

Project: Command Line File Audit Tool

Write a Python CLI tool to summarize number, type, and size of files in a directory.
Optional: Output as a CSV report.

✅ Week 1: Python Fundamentals

Project: Python Expense Tracker (CLI)

Build a simple CLI that logs expenses and categorizes them.
Output to CSV with pandas.
Stretch: Add terminal charts with rich or plotext.

✅ Week 2: Python Functions + CLI Skills

Project: CSV Column Analyzer

Let users input a CSV and return column types, null counts, and stats.
Use pydantic-ai to summarize or rephrase the report via LLM.

✅ Week 3: Basic EDA in Python

Project: Netflix Dataset EDA

Use pandas, matplotlib, and seaborn to explore the Netflix dataset.
Ask Ollama to suggest plots or describe patterns with pydantic-ai schema outputs.

✅ Week 4: Statistics with Python

Project: Survey Summary Generator

Take a small dataset of survey results.
Use Python to calculate means, modes, distributions.
Ask LLM to generate a Markdown report with PydanticAI.

✅ Week 5: Intro to R & Tidyverse

Project: R Data Explorer

Use R (dplyr, ggplot2) to generate a basic report of any dataset.
Export tables/plots.
Bonus: Knit an .Rmd to HTML.

✅ Week 6: Advanced Pandas

Project: Airbnb Dataset EDA Bot

Use Kaggle's NYC Airbnb dataset.
Build a CLI or notebook that allows user to:
- Load data
- Ask questions
- Get answers powered by Ollama + PydanticAI

✅ Week 7: Data Cleaning with LLMs

Project: Dirty CSV Fixer

Input: Messy CSV (with typos, missing units, bad formats).
Define a schema with pydantic-ai.
LLM suggests cleaned rows and explanations.

✅ Week 8: CSV Q&A Assistant

Project: CSV Chat Agent

Upload a CSV and ask natural questions ("What’s the avg revenue in Q1?").
Use pydantic-ai + chromadb + ollama for retrieval.

✅ Week 9: PDF Report Analyzer

Project: PDF → Markdown Extractor

Load a multi-page PDF report (e.g. economic data).
Use PyMuPDF to extract text.
Ask LLM to generate a summary and key metrics as Markdown.

✅ Week 10: SQL + LLM

Project: Natural Language to SQL

Load a SQLite database (e.g. ecommerce).
User types: "Top 5 customers by revenue."
LLM generates SQL → fetch results → show as table or chart.

✅ Week 11: Capstone Prep

Project: Analyst Copilot (Part 1)

Combine multiple tools:
- Ask a question
- Pull from CSV + SQL
- Output: Graphs + Summary + raw data
Use pydantic-ai to structure answers.

✅ Week 12: Capstone Final

Project: Analyst Copilot (Final)

Wrap up with a CLI or Web UI (Streamlit).
Multi-modal assistant: PDF + CSV + SQL.
Structured answers + human-friendly output.
Host on GitHub with full documentation.

These projects ensure hands-on practice across data analysis, LLM-powered workflows, and reproducible portfolio-building with Python, R, SQL, and PydanticAI.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md

Uh oh!

Uh oh!

Burnsedia/data-science

Folders and files

Latest commit

History

Repository files navigation

Data Analysis + LLM Curriculum (Python, R, PydanticAI with Ollama)

Week -1: Mac + Homebrew Setup Checklist

Step 1: Install Python + Tools

Step 2: Install R + RStudio

Step 3: Install JupyterLab

Step 4: Install Data Science Libraries

Step 5: Install AI Stack

Step 6: Install Ollama

Step 7: Create Project Folder

Suggested Repo Layout

Week 0: CS & Unix/Linux

Lectures

Interactive

Weeks 1–3: Python Foundations + Unix/Linux Basics

Lectures

Interactive Projects

Weeks 4–5: Statistics & R

Lectures

Interactive Projects

Weeks 6–7: Python for Data Analysis

Lectures

Interactive

Weeks 6–12: LLMs, PydanticAI & AI Agents

Lectures

Week 10: SQL & Databases

Lectures

Interactive Projects

Weeks 11–12: Capstones & Projects

Resources

Interactive Projects

Git & GitHub (Do This Throughout)

Weekly GitHub Habits

⚡ Optional: "Cool Kids" Neovim + CLI Tools Track

Neovim Setup

CLI Tools (Make Linux fun & powerful)

Practice

📊 Data Analysis + LLM Projects (Week-by-Week)

✅ Week 0: CS & Unix/Linux

✅ Week 1: Python Fundamentals

✅ Week 2: Python Functions + CLI Skills

✅ Week 3: Basic EDA in Python

✅ Week 4: Statistics with Python

✅ Week 5: Intro to R & Tidyverse

✅ Week 6: Advanced Pandas

✅ Week 7: Data Cleaning with LLMs

✅ Week 8: CSV Q&A Assistant

✅ Week 9: PDF Report Analyzer

✅ Week 10: SQL + LLM

✅ Week 11: Capstone Prep

✅ Week 12: Capstone Final

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages