ignuslabs/README.md

👋 Hey, I’m Joe

I’m a data & systems guy obsessed with clean pipelines, forensic accuracy, and making “impossible” workflows actually…usable.

  • 🎓 Senior @ University of Pittsburgh (CBA)
    • Business Analytics & Accounting (Honors)
    • Minor in Italian
  • 🧮 Teaching Assistant — Business Analytics II (Python, pandas, ML)
  • 🧪 Forensic & Litigation Consulting / Data & Analytics experience
  • 🛠️ Builder of audit, ETL, scraping, and document-intelligence tools that feel like real products, not class projects.

I like solving problems where:

  • The data is messy.
  • The stakes are high (audit, compliance, legal, risk).
  • The UX actually matters for non-technical users.

🚀 Core Projects

🔍 Audit AI / Intelligent Audit Stack

AI-assisted audit & forensic analytics framework focused on:

  • Journal entry anomaly detection, Benford’s Law, vendor & payment risk
  • Evidence-linked workpapers (JSON schemas, hashed attachments, lineage)
  • Alignment with PCAOB, ISA, AICPA, COSO concepts
  • Data quality + governance via dbt, Great Expectations, and clear documentation
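As a rough illustration of the Benford's Law check mentioned above, here is a minimal standard-library sketch. It is not the project's actual code: the function names (`leading_digit`, `benford_deviation`) and the choice of a chi-square statistic are assumptions made for this example.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit frequencies under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount):
    """Return the first significant digit of a nonzero amount, or None for zero."""
    value = abs(amount)
    if value == 0:
        return None
    # Scientific notation always starts with the first significant digit.
    return int(f"{value:.15e}"[0])


def benford_deviation(amounts):
    """Compare observed first-digit frequencies with Benford's expectation.

    Returns (observed_frequencies, chi_square_statistic). Zero amounts are skipped.
    """
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no nonzero amounts to test")
    counts = Counter(digits)
    expected = benford_expected()
    observed = {d: counts.get(d, 0) / n for d in range(1, 10)}
    chi_square = sum(
        (counts.get(d, 0) - n * expected[d]) ** 2 / (n * expected[d])
        for d in range(1, 10)
    )
    return observed, chi_square


if __name__ == "__main__":
    # Hypothetical journal entry amounts, for illustration only.
    sample = [1200.50, 130.00, 18.75, 2450.00, 310.10, 0, -1999.99]
    frequencies, statistic = benford_deviation(sample)
    print(frequencies[1], statistic)
```

A large chi-square value only flags a population for follow-up; it is a screening signal, not evidence of misstatement, and small samples will deviate from Benford's distribution by chance.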

Tech: Python, SQL, dbt, Great Expectations, cloud warehouses, LLM integration

Goal: Give audit teams an actually usable AI-first toolkit that is explainable, repeatable, and regulator-friendly.
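To make the "evidence-linked workpapers" idea concrete, the sketch below shows one way a JSON workpaper record could carry SHA-256 hashes of its attachments plus a lineage note. The schema fields (`workpaper_id`, `finding`, `lineage`, `record_sha256`) and function names are hypothetical, chosen for this example rather than taken from the project.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def build_evidence_record(workpaper_id, finding, attachments, source_query):
    """Build a workpaper record whose attachments are referenced by hash.

    `attachments` maps a file name to its raw bytes (hypothetical input shape).
    """
    record = {
        "workpaper_id": workpaper_id,
        "finding": finding,
        "lineage": {"source_query": source_query},
        "attachments": [
            {"name": name, "sha256": sha256_hex(content)}
            for name, content in sorted(attachments.items())
        ],
    }
    # Hash a canonical serialization so the same content always gives the same digest.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["record_sha256"] = sha256_hex(canonical.encode("utf-8"))
    return record


def verify_attachment(record, name, content: bytes) -> bool:
    """Check that `content` still matches the hash stored for attachment `name`."""
    for entry in record["attachments"]:
        if entry["name"] == name:
            return entry["sha256"] == sha256_hex(content)
    return False


if __name__ == "__main__":
    rec = build_evidence_record(
        "WP-001",
        "Possible duplicate vendor payment",
        {"invoice.pdf": b"example bytes"},
        "SELECT * FROM payments WHERE vendor_id = 42",
    )
    print(json.dumps(rec, indent=2))
```

Storing hashes instead of trusting file names means a reviewer can re-verify any attachment later and detect silent changes to the evidence.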


📄 Aptura — Document Intelligence & Parsing

A blueprint-style, production-minded document parsing toolchain.

  • Built around robust PDF parsing + OCR (e.g. Docling/Tesseract-style pipelines)
  • Structured extraction for contracts, invoices, workpapers, and audit evidence
  • Designed for packaging as a real desktop/CLI app (PyInstaller / Electron-style flow)
  • Emphasis on: transparency, logs, reproducibility, clean APIs

Focus: Turn ugly PDFs into trustworthy, analysis-ready datasets.


🕸️ Interactive Web Scraper

An advanced, modular scraping framework with:

  • Smart CSS/HTML pattern detection + container recognition
  • Playwright-based automation & interactive GUI concepts
  • Reusable configs for legal, compliant data collection
  • Designed for investigators, analysts, and power users
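One simple way to approximate the "pattern detection + container recognition" idea is to count how often each tag-plus-class signature repeats on a page; heavily repeated signatures are likely item containers. The sketch below uses only Python's standard `html.parser` on static HTML rather than Playwright, and the heuristic, class name, and `min_repeats` threshold are assumptions for illustration, not the framework's actual logic.

```python
from collections import Counter
from html.parser import HTMLParser


class SignatureCounter(HTMLParser):
    """Count occurrences of each tag.class signature in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if classes:
            # Sort class names so "card item" and "item card" match.
            signature = tag + "." + ".".join(sorted(classes.split()))
            self.counts[signature] += 1


def repeated_containers(html, min_repeats=3):
    """Return (css_selector, count) pairs for signatures seen at least min_repeats times."""
    parser = SignatureCounter()
    parser.feed(html)
    return [(sig, n) for sig, n in parser.counts.most_common() if n >= min_repeats]


if __name__ == "__main__":
    # Hypothetical page fragment with three repeated result cards.
    page = (
        "<ul>"
        + '<li class="card item"><span class="title">x</span></li>' * 3
        + '</ul><div class="footer"></div>'
    )
    print(repeated_containers(page))
```

The returned signatures double as CSS selectors, so a candidate such as `li.card.item` could be handed to a browser automation tool to extract each repeated block.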

Keywords: Playwright, Python, modular architecture, “Codex-friendly” prompts


🎵 Playlist-Aware Recommender

A system that treats playlists like context, not just lists.

  • Embeddings + similarity (e.g. Faiss-style)
  • Metadata-aware suggestions based on mood / theme
  • API-driven design aimed at scalable deployment
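The "playlist as context" idea can be sketched by averaging the embeddings of the tracks already in a playlist and ranking candidates by cosine similarity to that centroid. This is a brute-force, pure-Python stand-in for an index such as Faiss, not a Faiss example; the vectors, track names, and function names are made up for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (the playlist centroid)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(playlist_vectors, candidates, k=3):
    """Rank candidate tracks by similarity to the playlist centroid.

    `candidates` maps a track name to its embedding vector.
    """
    centroid = mean_vector(playlist_vectors)
    scored = [(name, cosine(centroid, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, for illustration only.
    playlist = [[1.0, 0.0], [0.8, 0.2]]
    pool = {"track_a": [1.0, 0.1], "track_b": [0.0, 1.0], "track_c": [-1.0, 0.0]}
    for name, score in recommend(playlist, pool, k=2):
        print(name, round(score, 3))
```

A plain centroid ignores track order and weights every song equally; mood or theme metadata could be folded in by weighting vectors before averaging.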

🧠 What I Work With

Languages & Data

  • Python (pandas, NumPy, PyTorch basics, FastAPI)
  • SQL (analytical queries, modeling, optimization)
  • R / basic stats & forecasting concepts
  • Markdown, LaTeX when needed

Data & Infra

  • Databricks, Snowflake, SQL Server
  • dbt, Great Expectations
  • ETL design, audit-ready data modeling

Tools & Dev

  • VS Code / JetBrains
  • Git & GitHub (obviously)
  • Docker basics, packaging, automation scripting

Domains

  • Audit & forensic analytics
  • Financial/reporting integrity
  • Operations & supply chain analytics
  • Document intelligence & LLM-powered workflows

📚 Teaching, Leadership & Context

  • TA for Business Analytics II — helping students get comfortable with Python, pandas, and applied ML.
  • Leadership roles in Delta Sigma Pi & Sigma Chi (operations, treasury, tech, and comms).
  • Regularly build:
    • Internal dashboards
    • Notion & documentation systems
    • Study guides, formula sheets, and technical how-tos

I care a lot about:

  • Making complex systems understandable to non-technical users.
  • Writing docs that someone else’s future intern can follow.
  • Building open, well-documented tooling others can extend.

📫 Connect

If you’re working on:

  • audit / risk / forensic analytics,
  • AI for accounting / compliance,
  • or high-signal data tooling for real workflows,

feel free to reach out or open an issue — I like collaborating on serious problems.


“Strong systems, clear evidence, honest data.”

Popular repositories

  1. Calendar (Public)

    Python 1

  2. ollama-deepseek (Public)

    screwing around with ai in colab

    Jupyter Notebook

  3. ML-Spotify (Public)

    Python

  4. InteractiveScraper (Public)

    Python

  5. EventScraper (Public)

    playing around with chrome extensions

    JavaScript

  6. PDF-Parsing (Public)

    Python