GAIA Benchmark AI Agent
Advanced AI Agent for GAIA Benchmark
This agent uses:
- Qwen 2.5-7B-Instruct for reasoning and planning
- Tavily Search for real-time information retrieval
- Python Interpreter for computational tasks
- File Reading capabilities for document analysis
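As an illustration of how the three tools listed above might be exposed to the model, here is a minimal sketch of a tool registry. It is an assumption, not this agent's actual code: the names `run_python`, `read_file`, `web_search`, `TOOLS`, and `call_tool` are hypothetical, and the search function is left as a placeholder rather than calling a real search client.

```python
import contextlib
import io
from pathlib import Path


def run_python(code: str) -> str:
    """Execute a code string and return whatever it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # no sandboxing here; unsafe for untrusted input
    return buffer.getvalue()


def read_file(path: str) -> str:
    """Return the text content of a UTF-8 file."""
    return Path(path).read_text(encoding="utf-8")


def web_search(query: str) -> str:
    """Hypothetical placeholder; a real agent would call its search client here."""
    raise NotImplementedError("plug in a search client")


# Hypothetical registry mapping tool names to callables.
TOOLS = {
    "python": run_python,
    "read_file": read_file,
    "web_search": web_search,
}


def call_tool(name: str, argument: str) -> str:
    """Dispatch a tool request chosen by the model to the matching function."""
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](argument)
```

In a layout like this, the model only has to emit a tool name and a single string argument, which keeps the dispatch logic separate from the reasoning and planning step.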
Instructions:
- Clone this space and set up your environment variables:
  - TAVILY_API_KEY: Your Tavily API key for web search
  - HF_TOKEN: Your Hugging Face token (if needed)
- Log in to your Hugging Face account using the button below
- Click 'Run Evaluation & Submit All Answers' to start the evaluation
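Before starting the evaluation, it can help to confirm that the environment variables from the first step are actually set. The following is a minimal sketch, assuming `TAVILY_API_KEY` is required and `HF_TOKEN` is optional, as the instructions suggest; the helper name `check_environment` is hypothetical and not part of this Space.

```python
import os

# Assumption: the search key is mandatory, the Hugging Face token is optional.
REQUIRED_VARS = ("TAVILY_API_KEY",)
OPTIONAL_VARS = ("HF_TOKEN",)


def check_environment(env=None):
    """Return (missing_required, missing_optional) lists of variable names."""
    env = os.environ if env is None else env
    # A variable counts as missing when it is unset or set to an empty string.
    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    missing_optional = [name for name in OPTIONAL_VARS if not env.get(name)]
    return missing_required, missing_optional
```

Calling `check_environment()` with no arguments inspects the real process environment; passing a plain dictionary instead makes the check easy to try without touching any actual secrets.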
Expected Performance: This agent is designed to score >30% on the GAIA benchmark.
Questions and Agent Answers