GAIA Benchmark AI Agent

Advanced AI Agent for GAIA Benchmark

This agent uses:

  • Qwen 2.5-7B-Instruct for reasoning and planning
  • Tavily Search for real-time information retrieval
  • Python Interpreter for computational tasks
  • File Reading capabilities for document analysis

Instructions:

  1. Clone this Space and set the following environment variables:
    • TAVILY_API_KEY: Your Tavily API key for web search
    • HF_TOKEN: Your Hugging Face token (if needed)
  2. Log in to your Hugging Face account using the button below
  3. Click 'Run Evaluation & Submit All Answers' to start the evaluation
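
Before starting a run, it can help to confirm that the variables from step 1 are actually set. The following is a minimal sketch, not part of the agent itself: the function name `check_environment` is hypothetical, and treating `TAVILY_API_KEY` as required and `HF_TOKEN` as optional is an assumption based on the "(if needed)" note above.

```python
import os

# Variable names come from the setup instructions above.
# Assumption: TAVILY_API_KEY is mandatory, HF_TOKEN is optional.
REQUIRED_VARS = ("TAVILY_API_KEY",)
OPTIONAL_VARS = ("HF_TOKEN",)


def check_environment(env=None):
    """Return (missing_required, missing_optional) lists of variable names.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    A variable counts as missing if it is unset or an empty string.
    """
    env = os.environ if env is None else env
    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    missing_optional = [name for name in OPTIONAL_VARS if not env.get(name)]
    return missing_required, missing_optional


if __name__ == "__main__":
    required, optional = check_environment()
    if required:
        print("Missing required variables:", ", ".join(required))
    if optional:
        print("Optional variables not set:", ", ".join(optional))
    if not required and not optional:
        print("All environment variables are set.")
```

Running the script in the cloned Space's environment reports which variables still need to be configured; it does not validate that the keys themselves are accepted by Tavily or Hugging Face.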

Expected Performance: The agent targets a score above 30% on the GAIA benchmark.
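
To make the 30% figure concrete, the sketch below shows one way a percentage score could be computed from a list of agent answers and reference answers. It is an illustration only: the function names are hypothetical, and the comparison used here (exact match after trimming whitespace and lowercasing) is a simplifying assumption, not the official GAIA scoring procedure.

```python
def normalize(answer):
    """Simplified normalization: trim whitespace and lowercase.

    Assumption: the real benchmark scorer applies its own rules;
    this is only a stand-in for illustration.
    """
    return str(answer).strip().lower()


def accuracy_percent(predictions, references):
    """Return the percentage of predictions that match their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


def meets_target(score_percent, target_percent=30.0):
    """Check whether a score is strictly above the stated target."""
    return score_percent > target_percent
```

For example, two matching answers out of four gives 50.0, which `meets_target` reports as above the 30% threshold.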

Questions and Agent Answers
