Top AI LLM Leaderboard
An interactive ranking of leading Large Language Models in the race to AGI.
Intelligence Report: The AGI Race
Inspired by independent analysis from ArtificialAnalysis.ai, this section summarizes the benchmarks used to compare models at the frontier of AI.
Key Intelligence Benchmarks
MMLU-Pro
Tests broad, expert-level knowledge with multiple-choice questions across many academic and professional subjects.
GPQA Diamond
Evaluates reasoning on graduate-level science questions in biology, physics, and chemistry.
LiveCodeBench
Measures coding ability on recently published programming-contest problems, which limits overlap with training data.
AIME & MATH
Assesses advanced mathematical reasoning on competition-level problems.
Frontier Model Intelligence Over Time
Late 2022
The Spark
Early models set the stage for the AGI race.
2023
The Leap
Models like GPT-4 demonstrate huge leaps in reasoning.
2024
The Acceleration
Rapid releases from all major labs push benchmark performance higher.
2025 & Beyond
The Frontier
The race intensifies as models approach human-expert performance.