---
title: Final Assignment Submission GAIA Agent
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: true
hf_oauth: true
hf_oauth_expiration_minutes: 480
---
This repository contains an AI agent developed for the Hugging Face Agents Course. The primary goal of this project is to create a robust agent capable of tackling tasks from the GAIA Benchmark.
The agent is built using Python and leverages the power of LangGraph for creating a stateful, multi-actor agent. It interacts with various tools to gather information, perform actions, and ultimately solve complex problems.
This project uses `uv` for Python package management. `uv` is a fast Python package installer and resolver, written in Rust.

To set up the environment:

- Install `uv`: follow the instructions at https://github.com/astral-sh/uv.
- Create a virtual environment and install dependencies:

```
uv venv
uv pip install -r requirements.txt
```

Or, if you prefer to use the `pyproject.toml`:

```
uv sync
```
To use this agent, you will need API keys for the following services:
- Groq: For fast LLM inference. You can get a key from GroqCloud.
- Tavily AI: For the comprehensive web search tool. You can get a key from Tavily AI.
Once you have your keys, create a `.env` file in the root of the project and add your keys like this:

```
GROQ_API_KEY="gsk_YOUR_GROQ_API_KEY"
TAVILY_API_KEY="tvly-YOUR_TAVILY_API_KEY"
```

Replace `gsk_YOUR_GROQ_API_KEY` and `tvly-YOUR_TAVILY_API_KEY` with your actual API keys. The agent will load these keys automatically.
The core of the agent is built with LangGraph. It follows a ReAct (Reason + Act) prompting strategy.
- LLM: The agent uses a Large Language Model (LLM) hosted on Groq (e.g., `qwen/qwen3-32b` or `llama3-8b-8192`) for its reasoning capabilities.
- Prompting: A base prompt (`base_prompt.txt`) guides the LLM's behavior, instructing it on how to use the available tools and respond to user queries.
- Tools: The agent has access to a suite of tools to interact with the external world. These tools allow it to:
  - Perform mathematical calculations (e.g., `calculator`, `multiply`, ...).
  - Search the web and specific platforms (e.g., `web_search` via Tavily, `wiki_search`, `arxiv_search`).
  - Read and write files (e.g., `read_file`, `write_file`, `list_files`).
  - Download files from URLs (`download_file`).
  - Fetch and parse web page content (`get_url`, `get_url_text`).
  - Process images (captioning with `image_captioner`, OCR with `ocr`).
- Graph: The LangGraph framework orchestrates the flow of information between the LLM and the tools. The `agent.py` file defines the graph structure, including:
  - An `AgentState` to hold the current state of the conversation and any input files.
  - An `assistant` node that calls the LLM.
  - A `ToolNode` that executes the chosen tool.
  - Conditional edges that determine the next step based on the LLM's output (e.g., call a tool or respond to the user).
The `create_react_agent` function from `langgraph.prebuilt` is used to quickly set up a ReAct agent with the specified LLM and tools.
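The assistant/tool loop that LangGraph manages can be sketched as a plain Python state machine. This is a toy illustration with a hypothetical `multiply` tool and a scripted stand-in for the LLM, not the repository's actual graph:

```python
# Toy sketch of the ReAct loop LangGraph orchestrates: the assistant either
# requests a tool call or returns a final answer; tool results are fed back in.
def multiply(a: float, b: float) -> float:
    return a * b

TOOLS = {"multiply": multiply}

def scripted_assistant(messages):
    """Stand-in for the LLM node: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "tool_call": ("multiply", (6, 7))}
    result = next(m["content"] for m in messages if m["role"] == "tool")
    return {"role": "assistant", "content": f"The answer is {result}."}

def run_agent(question):
    messages = [{"role": "user", "content": question}]
    while True:
        reply = scripted_assistant(messages)          # "assistant" node
        messages.append(reply)
        if "tool_call" not in reply:                  # conditional edge:
            return reply["content"]                   # respond to the user
        name, args = reply["tool_call"]               # "ToolNode": run the tool
        messages.append({"role": "tool", "content": TOOLS[name](*args)})

print(run_agent("What is 6 times 7?"))  # The answer is 42.
```

In the real agent, `create_react_agent` wires this loop up for you, with the Groq-hosted LLM as the assistant node and the tools from `tools.py` behind the `ToolNode`.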
- `.github/workflows/sync-to-hf.yml`: GitHub Actions workflow to automatically sync the repository to a Hugging Face Space.
- `agent.py`: Defines the LangGraph agent, its state, and the interaction logic.
- `app.py`: Gradio application to interact with the agent.
- `base_prompt.txt`: The system prompt for the LLM.
- `pyproject.toml`: Project metadata and dependencies for `uv`.
- `requirements.txt`: List of Python dependencies (can be generated from `pyproject.toml`).
- `tools.py`: Defines all the tools available to the agent.
