A full-stack application that uses OpenAI's GPT models through LangChain to automatically categorize and analyze Mexican bank statement transactions. Upload PDF statements and get intelligent insights about your spending patterns with full Spanish language support and a complete dark mode experience.
- Hybrid Approach: Combines rules-based and ML-based categorization
- Mexican Bank Support: Optimized for BBVA, Santander, and other Mexican banks
- CONDUSEF Format: Supports universal bank statement format (October 2024+)
- Multi-language Support: Full Spanish interface with Mexican financial terminology
- Intelligent Recognition: Understands Mexican merchant names and transaction patterns
- Interactive Charts: Built with Chart.js
- Custom Reports: Generate and export detailed reports
- Spending Trends: Track expenses over time
- End-to-End Encryption: All data encrypted in transit and at rest
- Data Minimization: Only processes necessary transaction data
- GDPR Compliant: Built with privacy in mind
```mermaid
graph TD
    subgraph Frontend[React Frontend]
        A[File Upload] -->|PDF Statements| B[Processing Status]
        B --> C[Interactive Dashboard]
        C --> D[Charts & Visualizations]
        C --> E[Transaction Management]
    end

    subgraph Backend[FastAPI Backend]
        F[API Gateway] --> G[Authentication]
        G --> H[PDF Parser]
        H --> I[Transaction Extractor]
        I --> J[AI Categorizer]
        J --> K[Analysis Engine]
    end

    subgraph Database[PostgreSQL]
        L[Statements]
        M[Transactions]
        N[Categories]
        O[Users]
    end

    subgraph AI[AI Services]
        P[OpenAI API]
        Q[LangChain]
        R[Embeddings]
    end

    A -->|HTTP POST /upload| F
    C -->|HTTP GET /api/statements| F
    F -->|Query| L
    F -->|Query| M
    J -->|API Call| P
    J -->|Use| Q
```
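The backend flow in the diagram (PDF Parser → Transaction Extractor → AI Categorizer → Analysis Engine) can be pictured as a plain chain of stages. This is an illustrative sketch only — the function and field names are hypothetical stand-ins, not the project's actual modules:

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    description: str
    amount: float  # negative = expense, positive = income
    category: str = "sin categoria"

def extract_transactions(raw_rows):
    """Stand-in for the PDF parser + extractor: raw (description, amount) rows -> Transactions."""
    return [Transaction(description=d, amount=a) for d, a in raw_rows]

def categorize_all(transactions, categorize):
    """Stand-in for the AI categorizer stage: assign a category to each transaction."""
    for tx in transactions:
        tx.category = categorize(tx.description)
    return transactions

def analyze(transactions):
    """Stand-in for the analysis engine: income/expense totals and net balance."""
    income = sum(t.amount for t in transactions if t.amount > 0)
    expenses = sum(-t.amount for t in transactions if t.amount < 0)
    return {"total_ingresos": income, "total_gastos": expenses,
            "balance_neto": income - expenses}
```

Each stage corresponds to one of the backend services; the real implementations live in `app/services/`.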
- OpenAI API Key: Get your API key from OpenAI Platform
- Docker & Docker Compose: For containerized deployment
```shell
git clone <repository-url>
cd statement-sense

# Copy environment file and add your OpenAI API key
cp example.env .env
# Edit .env and add your OPENAI_API_KEY

# Start development environment
make docker-dev
# OR
docker-compose up --build
```

For local development without Docker:

```shell
# Copy environment file
cp example.env .env
# Edit .env with your OpenAI API key and local settings

# Backend setup
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install uv
uv pip install -e ".[dev]"

# Start local database
docker-compose up postgres

# Run backend
make run
# OR
uvicorn app.main:app --reload

# Frontend setup (in another terminal)
cd frontend
npm install
npm run dev
```

Or use the one-command setup script:

```shell
git clone <repository-url>
cd statement-sense

# Edit example.env with your OpenAI API key, then:
cp example.env .env
./setup.sh
```

Environment configuration (`.env`):

```env
# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key-here
OPENAI_MODEL=gpt-3.5-turbo  # or gpt-4 for better accuracy

# Database Configuration
DB_HOST=postgres  # Use 'localhost' for local dev
DB_PORT=5432
DB_USER=statement_user
DB_PASS=statement_password
DB_NAME=statement_sense

# Security
SECRET_KEY=your-secret-key-here
```

Optional fine-tuning settings:

```env
# OpenAI Fine-tuning
OPENAI_MAX_TOKENS=150
OPENAI_TEMPERATURE=0.1

# Application Settings
PROJECT_NAME=SentidoFinanciero
LOG_LEVEL=INFO
DEBUG=true
UPLOAD_DIR=./uploads
MAX_FILE_SIZE=50MB

# CORS
BACKEND_CORS_ORIGINS=http://localhost:3000
```

- GPT-3.5-turbo: Fast and cost-effective, good for most use cases
- GPT-4: Higher accuracy for complex transactions, more expensive
Update your .env file:
```env
OPENAI_MODEL=gpt-4          # For maximum accuracy
# OR
OPENAI_MODEL=gpt-3.5-turbo  # For cost efficiency
```

- Navigate to the Subir Estado (Upload Statement) page
- Drag & drop PDF file or click Seleccionar Archivos to browse
- Supports CONDUSEF universal format (since October 2024)
- Maximum file size: 50MB per file
- Wait for upload confirmation
- Click Procesar to analyze with AI
- Access the main Dashboard for overview metrics
- View total statements, transactions, and amounts processed
- Search and filter statements by name or bank
- Click on any statement to view detailed analysis
- Select a processed statement from the dashboard
- Explore two main tabs:
- Análisis: Interactive charts showing spending distribution by category
- Transacciones: Complete transaction list with AI-generated categories
- View detailed breakdowns including:
- Balance Neto (Net Balance)
- Total Ingresos (Total Income)
- Total Gastos (Total Expenses)
- Spending by category with visual charts
- Review AI-suggested categories in the transaction list
- Categories include: Alimentación, Transporte, Salud, Ropa, etc.
- All categorization is automatic, using the hybrid AI approach
- Export functionality available for external analysis
- Dark Mode Toggle: Click the sun/moon icon in the navbar for instant theme switching
- System Preference: Select "System" mode to automatically follow your OS theme setting
- Persistent Settings: Your theme choice is saved and restored across sessions
- Complete Coverage: All components, charts, and interactions properly themed
- Professional Design: High contrast and accessibility-compliant color schemes
```text
# Tier 1: Exact Keyword Matching (Fastest)
"OXXO ROMA" → "alimentacion" (Confidence: 1.0)

# Tier 2: Pattern Recognition (Fast + Smart)
"REST BRAVA" → regex: r'\brest\b' → "alimentacion" (Confidence: 0.8)

# Tier 3: OpenAI GPT Analysis (Smart + Context-Aware)
"POINTMP*VONDYMEXICO" → GPT → "servicios" (Confidence: 0.9)
```

- 85% of transactions classified by Tiers 1-2 (< 1 ms, $0 cost)
- 15% require GPT analysis (~500-1500 ms, ~$0.001-0.003 per transaction)
- Intelligent Batching: Groups similar transactions to reduce API calls
- Context Awareness: GPT understands Mexican merchant names and contexts
- Alimentación - Restaurants, groceries, convenience stores
- Gasolineras - Gas stations, fuel
- Servicios - Utilities, subscriptions, streaming
- Salud - Healthcare, pharmacies, medical
- Transporte - Uber, taxi, parking, public transport
- Entretenimiento - Movies, bars, entertainment
- Ropa - Clothing, fashion, department stores
- Educación - Schools, books, courses
- Transferencias - Bank transfers, payments
- Seguros - Insurance, policies
- Intereses/Comisiones - Bank fees, interest
- Otros - Miscellaneous
- `POST /api/v1/statements/upload` - Upload PDF file
- `GET /api/v1/statements` - List all statements
- `GET /api/v1/statements/{id}` - Get statement details
- `POST /api/v1/statements/{id}/process` - Process statement with AI
- `DELETE /api/v1/statements/{id}` - Delete statement
- `GET /api/v1/statements/{id}/transactions` - Get transactions
- `PUT /api/v1/transactions/{id}` - Update transaction
- `DELETE /api/v1/transactions/{id}` - Delete transaction
- `GET /api/v1/statements/{id}/analysis` - Get AI-powered spending analysis
```shell
# Upload a statement
curl -X POST "http://localhost:8000/api/v1/statements/upload" \
  -F "file=@statement.pdf"

# Get AI analysis
curl "http://localhost:8000/api/v1/statements/{id}/analysis"
```

Full API documentation is available at http://localhost:8000/docs
```text
statement-sense/
├── app/                         # FastAPI Backend
│   ├── api/                     # API routes
│   ├── models/                  # Database models
│   ├── schemas/                 # Pydantic schemas
│   ├── services/                # Business logic
│   │   ├── pdf_parser.py        # Enhanced PDF processing
│   │   ├── mexican_parser.py    # Mexican bank statement parser
│   │   ├── ocr_table_parser.py  # OCR table extraction
│   │   └── smart_categorizer.py # OpenAI + LangChain categorization
│   └── main.py                  # FastAPI app
├── frontend/                    # React Frontend (SentidoFinanciero)
│   ├── src/
│   │   ├── components/          # UI components
│   │   ├── pages/               # Page components
│   │   ├── hooks/               # Custom hooks
│   │   ├── services/            # API services
│   │   └── utils/               # Utilities
│   └── package.json
├── docs/                        # Documentation and screenshots
├── migrations/                  # Database migrations
├── docker-compose.yml           # Docker services
└── README.md
```
Backend:

- Add a model in `app/models/`
- Create a schema in `app/schemas/`
- Add an API endpoint in `app/api/`
- Generate a migration: `alembic revision --autogenerate`
Frontend:

- Create a component in `src/components/`
- Add a route in `src/App.jsx`
- Create an API service in `src/services/`
- Add a hook in `src/hooks/`
```shell
# Backend tests
pytest

# Frontend tests
cd frontend
npm test

# E2E tests
npm run test:e2e
```

| Service | Port | Description |
|---|---|---|
| Frontend | 3000 | React development server |
| Backend | 8000 | FastAPI application |
| Database | 5432 | PostgreSQL database |
```shell
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f [service-name]

# Restart service
docker-compose restart [service-name]

# Stop all services
docker-compose down

# Rebuild and start
docker-compose up --build
```

GPT-3.5-turbo:
- Input: $0.0005 / 1K tokens
- Output: $0.0015 / 1K tokens
- ~$0.001-0.003 per complex transaction
GPT-4:
- Input: $0.01 / 1K tokens
- Output: $0.03 / 1K tokens
- ~$0.01-0.03 per complex transaction
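Given those rates, the cost of a single call follows directly from the token counts. A quick sketch (the token counts below are hypothetical, chosen only for illustration):

```python
def call_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost of one completion; rates are in dollars per 1K tokens."""
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

# Hypothetical categorization call: ~300 prompt tokens, ~20 completion tokens
gpt35 = call_cost(300, 20, 0.0005, 0.0015)  # about $0.00018
gpt4 = call_cost(300, 20, 0.01, 0.03)       # about $0.0036
```

A short categorization prompt costs a fraction of a cent either way; the 20x multiplier between models is what matters at volume.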
- Use GPT-3.5-turbo for most use cases (good accuracy, lower cost)
- Hybrid approach reduces API calls by 85%
- Batch processing for multiple statements
- Set monthly limits in OpenAI dashboard
Example Monthly Cost:
- 500 transactions/month
- 15% require AI (75 transactions)
- GPT-3.5-turbo: ~$0.08-0.23/month
- GPT-4: ~$0.75-2.25/month
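The monthly figures above follow directly from the per-transaction ranges; a small estimator makes the arithmetic explicit:

```python
def monthly_ai_cost(transactions_per_month, ai_fraction, cost_low, cost_high):
    """Estimated monthly spend for the share of transactions that reach the GPT tier."""
    n_ai = transactions_per_month * ai_fraction
    return n_ai * cost_low, n_ai * cost_high

# 500 transactions/month, 15% reach GPT (75 transactions)
gpt35_low, gpt35_high = monthly_ai_cost(500, 0.15, 0.001, 0.003)  # ≈ $0.08-0.23
gpt4_low, gpt4_high = monthly_ai_cost(500, 0.15, 0.01, 0.03)      # ≈ $0.75-2.25
```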
- Invalid API Key: Check your `.env` file and the OpenAI dashboard
- Rate Limits: Upgrade your OpenAI plan or implement retry logic
- Insufficient Credits: Add billing information to your OpenAI account
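For the rate-limit case, a retry with exponential backoff is usually enough. A generic sketch — pass whatever exception your OpenAI client version raises as `retryable`:

```python
import random
import time

def with_backoff(call, retries=5, base_delay=1.0, retryable=(Exception,)):
    """Retry `call` on retryable errors, doubling the wait each attempt (with jitter)."""
    for attempt in range(retries):
        try:
            return call()
        except retryable:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            # exponential backoff with up to 100% jitter, scaled by base_delay
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage would look like `with_backoff(lambda: client.chat.completions.create(...), retryable=(RateLimitError,))`, where `RateLimitError` is whatever your installed client exposes.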
- Check file is PDF format
- Ensure file size < 50MB
- Verify backend is running
- Check OpenAI API status
- Verify API key permissions
- Check backend logs for errors
- Check PostgreSQL is running
- Run migrations: `alembic upgrade head`
- Reset the database: `python init_db.py`
```shell
# View all logs
docker-compose logs -f

# Backend logs only
docker-compose logs -f backend

# Check OpenAI API usage
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/usage
```

- Use a production OpenAI API key with proper limits
- Set secure environment variables
- Enable HTTPS
- Configure proper CORS settings
- Set up monitoring and logging
- Implement rate limiting
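For the rate-limiting item, a minimal in-process token bucket shows the idea (a sketch only; production deployments would more likely rely on reverse-proxy limits or a FastAPI-side library such as slowapi):

```python
import time

class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A request handler checks `bucket.allow()` and returns HTTP 429 when it is false.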
```shell
# Production compose file
docker-compose -f docker-compose.prod.yml up -d
```

We welcome contributions!
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- OpenAI - GPT models for intelligent categorization
- LangChain - Simplified LLM integration framework
- FastAPI - Modern Python web framework
- React - Frontend framework
- Tailwind CSS - Utility-first CSS
- Chart.js - Data visualization
- PostgreSQL - Database system
- Issues: Report bugs via GitHub Issues






