Local LLM Document Parser (Logistics Edition)

A privacy-focused, local document parser designed for the logistics/trucking industry. It uses Ollama (Llama 3.2) to intelligently categorize PDF documents into specific regulatory and operational folders without sending data to the cloud.

Features

Local Intelligence: Uses llama3.2 running locally via Ollama. No API costs, and documents never leave your machine.
Smart Classification: Sorts documents into 12 specific categories:
- IFTA
- Corporation (Auto-detects California SOI)
- IRP, Permits, Title Transfer, Driver Files, DOT
- Clean Truck Check, Invoices, Drug Test
- INFO (General)
Manual Verification: Automatically routes ambiguous or low-confidence documents to a Manual_Verification folder for human review.
Extensible: Built with a modular architecture to support future expansions like Excel parsing and Vector DB integration.
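
The classification and manual-verification routing described above might look roughly like the following. This is a minimal sketch, not the project's actual code: the category names are copied from this README, while `route`, `CONFIDENCE_THRESHOLD`, the 0.7 cutoff, and the idea that the classifier returns a label plus a confidence score are assumptions made for illustration.

```python
# Hypothetical routing sketch; function name and threshold are assumptions.
CATEGORIES = [
    "IFTA", "Corporation", "IRP", "Permits", "Title Transfer",
    "Driver Files", "DOT", "Clean Truck Check", "Invoices",
    "Drug Test", "INFO",
]
MANUAL_FOLDER = "Manual_Verification"
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; tune for your documents


def route(label: str, confidence: float) -> str:
    """Return the destination folder for a classified document."""
    # Normalize case so a reply like "ifta" still matches "IFTA".
    known = {c.lower(): c for c in CATEGORIES}
    match = known.get(label.strip().lower())
    # Unknown labels and low-confidence answers go to human review.
    if match is None or confidence < CONFIDENCE_THRESHOLD:
        return MANUAL_FOLDER
    return match
```

Treating an unrecognized label the same way as a low score keeps a hallucinated category name from creating a stray folder.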

Prerequisites

Python 3.10+
Ollama: Must be installed and running.
- Download Ollama from https://ollama.com, or install it with Homebrew: brew install ollama
- Pull the model: ollama pull llama3.2
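
To confirm both prerequisites before running the parser, a small check along these lines may help. It is a sketch that assumes Ollama is listening on its default address (`http://localhost:11434`) and uses the `/api/tags` endpoint, which lists locally pulled models; the helper names `has_model` and `ollama_ready` are hypothetical and not part of this repository.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def has_model(tags_payload: dict, model: str = "llama3.2") -> bool:
    """Check an /api/tags response for a model, ignoring the ':tag' suffix."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    return any(n.split(":")[0] == model for n in names)


def ollama_ready(model: str = "llama3.2", timeout: float = 3.0) -> bool:
    """Return True if the server answers and the model has been pulled."""
    try:
        url = f"{OLLAMA_URL}/api/tags"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return has_model(json.load(resp), model)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, unreachable, or returned something unexpected.
        return False
```

Calling `ollama_ready()` returns `False` rather than raising when the service is down, so it can be used as a simple guard at startup.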

Installation

Clone the repository:

git clone https://github.com/abhinyaay/document-parser.git
cd document-parser

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Start Ollama Service:

brew services start ollama
# OR simply run 'ollama serve' in a separate terminal

Add Documents: Place your PDF files in the input/ directory.
Run Parser:
```
python3 parser_v2.py
```
Check Output: Sorted files will appear in the output/ directory, organized by category.

Project Structure

parser_v2.py: Main entry point.
classifier/local_llm.py: Classification logic that calls the local Ollama API.
extractors/: Modules for reading different file formats (currently PDF).
input/: Drop your raw files here.
output/: Processed files land here.
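
For readers curious how a module like `classifier/local_llm.py` could talk to Ollama, here is a hedged sketch using only the standard library. The `/api/generate` endpoint and its `model`, `prompt`, `stream`, and `format` fields are part of Ollama's HTTP API; the prompt wording, the 4000-character truncation, the JSON reply shape, and the function names `build_payload`, `parse_reply`, and `classify_text` are assumptions and may differ from what the repository actually does.

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:11434/api/generate"  # default local address


def build_payload(text: str, categories: list[str],
                  model: str = "llama3.2") -> dict:
    """Build a non-streaming /api/generate request asking for JSON output."""
    prompt = (
        "Classify this logistics document into one of: "
        + ", ".join(categories)
        + '. Reply as JSON: {"category": "...", "confidence": 0.0}\n\n'
        + text[:4000]  # assumed truncation to keep the prompt small
    )
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def parse_reply(body: dict) -> tuple[str, float]:
    """Extract (category, confidence) from an /api/generate response body."""
    try:
        data = json.loads(body.get("response", ""))
        return str(data["category"]), float(data["confidence"])
    except (ValueError, KeyError, TypeError):
        return "", 0.0  # unparseable reply: treat as zero confidence


def classify_text(text: str, categories: list[str]) -> tuple[str, float]:
    """Send the text to a local Ollama server and return its answer."""
    req = urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(build_payload(text, categories)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return parse_reply(json.load(resp))
```

Returning an empty label with zero confidence on a malformed reply lets the caller send such documents to manual review instead of crashing.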

