Welcome to the SQL Query and Analysis Crew project! This application uses AI agents to turn a user's question into a SQL query, execute that query, and analyze the resulting data. It integrates with a PostgreSQL database and uses OpenAI's GPT models to generate the queries and interpret the results.
- Introduction
- Project Overview
- Setup Instructions
- Running the Application
- How It Works
- Project Structure
- Contributing
- License
This project demonstrates how AI agents can collaborate to interpret a user's question, generate an appropriate SQL query, execute it against a PostgreSQL database, and analyze the results to provide meaningful insights. It's an example of using hierarchical agent management and tool integration to automate data analysis tasks.
The application employs four main agents, each with a specific role:
- Manager Agent: Oversees the entire process, coordinating between agents to ensure the user's question is answered comprehensively.
- SQL Developer Agent: Generates a valid and optimized SQL query based on the user's question and the database schema.
- SQL Execution Agent: Executes the SQL query against the PostgreSQL database and retrieves the results.
- Data Analyst Agent: Analyzes the data returned by the SQL Execution Agent to extract 2 to 3 meaningful insights relevant to the user's question.
- SQLExecutionTool: A custom tool that connects to a PostgreSQL database, executes SQL queries, and returns the results. It is used by the SQL Execution Agent to interact with the database.
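To make the tool's role concrete, here is a minimal sketch of a query runner, not the project's actual `SQLExecutionTool`. The function name `run_query` and the `max_rows` parameter are hypothetical. It relies only on the standard Python DB-API cursor interface, and it uses an in-memory SQLite database as a stand-in so that it runs without a PostgreSQL server; the real tool connects to PostgreSQL instead.

```python
import sqlite3
from typing import Any


def run_query(connection: Any, sql: str, max_rows: int = 100) -> list[dict[str, Any]]:
    """Execute one SELECT statement and return up to max_rows rows as dictionaries."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        # cursor.description has one entry per result column; its first field is the name.
        columns = [column[0] for column in cursor.description or []]
        return [dict(zip(columns, row)) for row in cursor.fetchmany(max_rows)]
    finally:
        cursor.close()


# Stand-in database (assumption): SQLite replaces PostgreSQL for this sketch only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE film (film_id INTEGER, title TEXT)")
demo.execute("INSERT INTO film VALUES (1, 'Academy Dinosaur')")
demo.commit()

rows = run_query(demo, "SELECT film_id, title FROM film")
print(rows)
```

Returning rows as dictionaries keeps the column names attached to the values, which is convenient when the results are handed to another agent as text. Capping the row count is one way to keep large result sets from overwhelming a model's context.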
Each agent performs specific tasks:
- SQL Developer Task: Generate a SQL query to answer the user's question.
- SQL Execution Task: Execute the SQL query and obtain results.
- Data Analyst Task: Analyze the data and provide insights.
The Manager Agent coordinates these tasks, ensuring smooth execution and integration of outputs.
- Python 3.12 (earlier versions have not been tested)
- Docker and Docker Compose (for setting up the PostgreSQL database)
- An OpenAI API key (to access GPT models)
The project includes a Docker Compose file to set up a PostgreSQL database. The database schema and data are based on the dvdrental sample database.
To set up the database, follow the instructions in the setup_db.md file.
Create a .env file in the root directory of the project to store environment variables. This file should include your OpenAI API key.
OPENAI_API_KEY=your-openai-api-key-here
Note: Replace your-openai-api-key-here with your actual OpenAI API key. Do not share this key publicly.
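A quick check at startup can catch a missing or unedited key before any agent runs. The sketch below is an illustration only: `require_api_key` and `PLACEHOLDER` are hypothetical names, and it assumes the contents of .env have already been loaded into the process environment by whatever mechanism the project uses.

```python
import os
from collections.abc import Mapping

# The placeholder value from the .env template above.
PLACEHOLDER = "your-openai-api-key-here"


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return OPENAI_API_KEY from env, or raise if it is missing or still the placeholder."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set OPENAI_API_KEY in .env before running the application.")
    return key


# Usage with an explicit mapping, so the example does not depend on the real environment.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

Passing the environment as a parameter makes the check easy to exercise without touching real credentials.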
- Clone the repository:
    git clone https://github.com/samgriek/sql-agent.git
    cd sql-agent
- Create a virtual environment:
    python3 -m venv venv
    source venv/bin/activate
- Install the required packages:
    pip install -r requirements.txt
After setting up the database and environment variables:
- Ensure the PostgreSQL database is running:
    docker-compose up -d
- Run the application:
    python main.py
- Enter your question when prompted:
    ## Welcome to the SQL Query and Analysis Crew
    ----------------------------------------------
    Please enter your question:
  For example:
    How many customers rented more than 5 movies in July?
- Wait for the agents to process your question.
- View the insights provided:
    ################################################
    ## Here are the insights
    ################################################
    [Agent outputs with insights]
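For the example question about July rentals, the generated SQL might resemble the query in the sketch below. This is a hypothetical illustration, not captured agent output. It assumes the dvdrental `rental` table with its `customer_id` and `rental_date` columns, and it assumes the year 2005 because the question names only the month. A small in-memory SQLite table stands in for PostgreSQL so the sketch is runnable; the date-range filter is written so that it is valid in both databases.

```python
import sqlite3

# Hypothetical query for the example question. The year 2005 is an assumption:
# the question says only "July", and the SQL Developer Agent may choose differently.
EXAMPLE_SQL = """
SELECT COUNT(*) AS customer_count
FROM (
    SELECT customer_id
    FROM rental
    WHERE rental_date >= '2005-07-01' AND rental_date < '2005-08-01'
    GROUP BY customer_id
    HAVING COUNT(*) > 5
) AS frequent_renters
"""

# Tiny stand-in table (assumption): only the columns the query needs.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE rental (rental_id INTEGER, customer_id INTEGER, rental_date TEXT)")
# Customer 1 rents six times in July; customer 2 rents three times in July and once in August.
rentals = [(i, 1, f"2005-07-{i:02d} 10:00:00") for i in range(1, 7)]
rentals += [(10 + i, 2, f"2005-07-{i:02d} 10:00:00") for i in range(1, 4)]
rentals.append((20, 2, "2005-08-01 09:00:00"))
demo.executemany("INSERT INTO rental VALUES (?, ?, ?)", rentals)

customer_count = demo.execute(EXAMPLE_SQL).fetchone()[0]
print(customer_count)  # 1: only customer 1 has more than five July rentals
```

The inner query groups rentals by customer and keeps those with more than five in the month; the outer query counts the customers that remain.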
- User Input: The user enters a question in natural language.
- Manager Agent Coordination:
  - The Manager Agent receives the question and orchestrates the workflow.
  - Assigns the SQL Developer Agent to generate the SQL query.
- SQL Developer Agent:
  - Analyzes the user's question and database schema.
  - Generates an optimized SQL query.
- SQL Execution Agent:
  - Executes the SQL query using the SQLExecutionTool.
  - Retrieves the results from the database.
- Data Analyst Agent:
  - Analyzes the data returned.
  - Extracts 2 to 3 meaningful insights related to the user's question.
- Manager Agent:
  - Integrates the outputs from all agents.
  - Presents the final insights to the user.
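The flow above can be sketched in plain Python as a chain of steps, each consuming the previous step's output. This is a simplified illustration and does not use the project's agent framework: `Step`, `manage`, and the stub lambdas are hypothetical stand-ins for the LLM-backed agents and the Manager Agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[str], str]  # takes the previous step's output, returns its own


def manage(question: str, steps: list[Step]) -> dict[str, str]:
    """Run the steps in order, feeding each one the output of the step before it."""
    outputs: dict[str, str] = {}
    current = question
    for step in steps:
        current = step.run(current)
        outputs[step.name] = current
    return outputs


# Stub steps (assumption): simple string functions in place of real agents.
steps = [
    Step("sql_developer", lambda question: f"SQL for: {question}"),
    Step("sql_execution", lambda sql: f"rows from [{sql}]"),
    Step("data_analyst", lambda rows: f"insights from {rows}"),
]

result = manage("How many customers rented more than 5 movies in July?", steps)
print(result["data_analyst"])
```

Keeping every intermediate output, rather than only the final one, mirrors the Manager Agent's job of integrating the results from all agents.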
├── agents
│   └── pg_agents.py        # Definitions of the agents
├── tasks
│   └── pg_tasks.py         # Definitions of the tasks
├── tools
│   └── pg_query.py         # SQLExecutionTool implementation
├── main.py                 # Entry point of the application
├── requirements.txt        # Python dependencies
├── setup_db.md             # Database setup instructions
├── database.md             # Database schema description
├── docker-compose.yml      # Docker configuration for PostgreSQL
├── .env                    # Environment variables (not tracked by git)
└── README.md               # This file
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.
Disclaimer: This project is for educational purposes and demonstrates the use of AI agents in automating data analysis tasks. Ensure you comply with OpenAI's usage policies when using their API.