ryankamiri/track-app

TrackApp

A production-ready tool that monitors your Gmail inbox, detects real job application updates, and keeps a structured CSV/JSON trail of each application.

Features

  • Intelligent filtering with DSPy + OpenAI (classification + structured extraction)
  • Token-optimized email preprocessing (HTML stripping, ATS awareness)
  • Context-aware status tracking (Applied → Interview → Offer/Rejected)
  • Persistent caching (email_cache.json, email_fetch_cache.json) for resumable runs
  • CSV output deduplicated by company with merged role history

1. Prerequisites

  • Python 3.9+
  • A Google Cloud project with Gmail API enabled
  • OpenAI API key
  • Desktop OAuth credentials (credentials.json)

Install dependencies:

pip install -r requirements.txt

Optionally, create and activate a virtualenv before installing (python -m venv .venv && source .venv/bin/activate).


2. Gmail API Setup

  1. Open Google Cloud Console
  2. Enable Gmail API for your project
  3. Configure OAuth consent screen (External → add yourself as a test user)
  4. Create OAuth client ID (Application type: Desktop app)
  5. Download the client JSON as credentials.json (place in project root)

The first time you run the script, a browser window prompts you to authorize Gmail access. The resulting token (including the refresh token) is saved to token.json and reused on subsequent runs. If the token becomes invalid, delete token.json and re-run to authorize again.


3. Configuration (Layered)

TrackApp uses the first configuration file it finds, checked in this order:

  1. config.local.json (git-ignored; personal settings)
  2. config.json (optional project defaults)
  3. config.example.json (fallback)

Environment variables override file values when present:

  • OPENAI_API_KEY
  • TRACKAPP_START_DATE
  • TRACKAPP_OUTPUT_CSV
  • TRACKAPP_CACHE_FILE

Optional: add a .env file (loaded when python-dotenv is installed):

OPENAI_API_KEY=sk-...
TRACKAPP_START_DATE=2025-07-01
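
The layering described above can be sketched roughly as follows. This is an illustrative sketch, not TrackApp's actual loader: the function name load_config and the mapping from environment variables to config keys (other than start_date, which appears in the workflow description) are assumptions.

```python
import json
import os
from pathlib import Path

# Candidate files in priority order; the first one that exists is used.
CONFIG_FILES = ["config.local.json", "config.json", "config.example.json"]

# Hypothetical mapping from environment variable to config key.
ENV_OVERRIDES = {
    "OPENAI_API_KEY": "openai_api_key",
    "TRACKAPP_START_DATE": "start_date",
    "TRACKAPP_OUTPUT_CSV": "output_csv",
    "TRACKAPP_CACHE_FILE": "cache_file",
}


def load_config(base_dir=".", env=None):
    """Load the first config file found, then overlay environment variables."""
    env = os.environ if env is None else env
    config = {}
    for name in CONFIG_FILES:
        path = Path(base_dir) / name
        if path.is_file():
            config = json.loads(path.read_text(encoding="utf-8"))
            break  # first found wins; lower-priority files are ignored
    for var, key in ENV_OVERRIDES.items():
        value = env.get(var)
        if value:  # only override when the variable is set and non-empty
            config[key] = value
    return config
```

With this shape, a value such as TRACKAPP_START_DATE set in the shell or in .env takes precedence over whatever config.local.json contains.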

Quick Start

cp config.example.json config.local.json
# edit config.local.json to your preferences
# or put OPENAI_API_KEY in .env
python main.py

4. Running TrackApp

python main.py

Workflow:

  1. Loads configuration and caches (using the layered approach)
  2. Authenticates with Gmail (browser prompt on first run)
  3. Fetches emails from start_date onward, saving batches into email_fetch_cache.json
  4. Processes emails from oldest to newest:
    • Keyword filter → DSPy classifier → DSPy extractor (with cached results reused)
    • Merges updates into job_applications.csv
    • Saves intermediate results every 10 emails
  5. Deletes email_fetch_cache.json only after a successful run (preserves it on errors/interrupts)

You can stop the script (Ctrl+C) at any time. Caches are persisted on exit, so the next run resumes without re-fetching or re-classifying emails it has already handled.
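
The resumable behaviour can be pictured with a loop like the one below. It is a minimal sketch under stated assumptions: process_emails, load_cache, save_cache, and the analyze callable (standing in for the DSPy classifier and extractor) are hypothetical names, and the email dictionaries are assumed to carry id and date fields.

```python
import json
from pathlib import Path

SAVE_EVERY = 10  # checkpoint interval, matching "every 10 emails" above


def load_cache(path):
    """Return the cached results, or an empty dict if no cache exists yet."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def save_cache(path, cache):
    Path(path).write_text(json.dumps(cache, indent=2), encoding="utf-8")


def process_emails(emails, analyze, cache_path="email_cache.json"):
    """Process emails oldest first, skipping any already in the cache."""
    cache = load_cache(cache_path)
    new_since_save = 0
    try:
        for email in sorted(emails, key=lambda e: e["date"]):
            if email["id"] in cache:
                continue  # cached result reused, no LLM call
            cache[email["id"]] = analyze(email)
            new_since_save += 1
            if new_since_save >= SAVE_EVERY:
                save_cache(cache_path, cache)
                new_since_save = 0
    finally:
        # Runs on normal completion, errors, and Ctrl+C (KeyboardInterrupt).
        save_cache(cache_path, cache)
    return cache
```

Because the final save sits in a finally block, an interrupted run still writes whatever it has classified so far, and the next run skips those email IDs.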


5. Outputs & Caches

  • job_applications.csv: one row per company with consolidated role list and history
  • email_cache.json: per-email DSPy classification/extraction cache (prevents repeated LLM calls)
  • email_fetch_cache.json: raw Gmail fetch cache (kept until a run completes successfully)
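
To illustrate what "one row per company" could look like, here is a small sketch of the merge step. The column names, the case-insensitive company key, and the functions merge_by_company and to_csv are assumptions for illustration; the real exporter may normalize names and format history differently.

```python
import csv
import io


def merge_by_company(updates):
    """Collapse per-email updates (oldest first) into one row per company."""
    merged = {}
    for u in updates:
        key = u["company"].strip().lower()  # assumed normalization
        row = merged.setdefault(
            key,
            {"company": u["company"].strip(), "roles": [], "status": "", "history": []},
        )
        if u["role"] not in row["roles"]:
            row["roles"].append(u["role"])
        row["status"] = u["status"]  # latest update wins
        row["history"].append(f'{u["date"]}: {u["status"]}')
    return list(merged.values())


def to_csv(rows):
    """Render merged rows as CSV text with list fields flattened."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["company", "roles", "status", "history"])
    writer.writeheader()
    for r in rows:
        writer.writerow(
            {**r, "roles": "; ".join(r["roles"]), "history": " | ".join(r["history"])}
        )
    return buf.getvalue()
```

Feeding it an "Applied" update followed by an "Interview" update for the same company yields a single row whose status is "Interview" and whose history lists both events.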

To reset everything:

rm email_cache.json email_fetch_cache.json job_applications.csv token.json

6. Troubleshooting

  • invalid_grant during auth: delete token.json, ensure credentials.json matches the Google project, retry
  • Missing companies or duplicates: update preprocessing.ats_domains or exporter.invalid_company_names, rerun
  • High OpenAI cost: lower max_email_words, tighten keywords, reduce context_months_limit
  • Reprocessing from scratch: delete caches above and rerun
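
As a rough picture of why lowering max_email_words reduces cost, the sketch below strips HTML and caps the word count before any text would be sent to the model. It uses only the standard library; the preprocess function, the default of 300 words, and the handling of script/style tags are assumptions rather than TrackApp's actual preprocessing.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def preprocess(html, max_email_words=300):
    """Strip markup, collapse whitespace, and keep at most max_email_words words."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    words = " ".join(parser.parts).split()
    return " ".join(words[:max_email_words])
```

Every word dropped here is a word that never reaches the prompt, so the cap acts as a direct upper bound on per-email input size.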

MIT Licensed. Contributions welcome.

About

Track your job apps with AI instead of trying to log everything by yourself :)
