ARC-AGI Test — Pattern Reasoning Puzzles
Visual Logic Puzzles · Free · No Signup

ARC Pattern Reasoning Test

Study tiny colored grids, infer the hidden transformation rule, and complete the missing output. No trivia, no math formulas - just visual reasoning.

⚠️ Note: these puzzles are based on early ARC-AGI-1 (2019). Modern AI models like GPT-5.4 and Claude Opus 4.6 (March 2026) can now solve these easily. This is a challenge for you — not a current AI test.
5 Puzzles · 60% Human avg. · ~0% Old AI score · Free, Always

Also try: You vs ChatGPT

10 questions AI famously fails — logic traps, counting tricks + 1 ARC puzzle. GPT-4o scores 1/10.

Attribution: Puzzle tasks on this page are designed in the style of the ARC-AGI benchmark format. ARC-AGI was created by François Chollet (© 2019). ARC-AGI-2 © ARC Prize Foundation. Both released under the Apache License, Version 2.0. Source datasets: github.com/fchollet/ARC-AGI · github.com/arcprize/ARC-AGI-2 · arcprize.org. Tasks on this page are original works designed in the ARC-AGI format; they are not direct reproductions from the dataset.

What Is ARC-AGI?

ARC-AGI (Abstraction and Reasoning Corpus for Artificial General Intelligence) was designed by François Chollet, the creator of Keras and at the time an AI researcher at Google, as a rigorous test of general intelligence. Unlike benchmarks that reward memorization of training data, ARC-AGI tests a skill that humans use effortlessly: learning a new rule from just a few examples and applying it to a new case.

Each puzzle consists of colored grid transformations. You see 2–3 input/output pairs, figure out the rule, and complete a new test case. The grids are small, the colors are simple, and no domain knowledge is required. Yet for years tasks of this kind were extraordinarily difficult for AI.
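To make the puzzle format concrete, here is a minimal Python sketch. It assumes the train/test JSON layout used by the public ARC-AGI datasets, where each grid is a list of rows of integer color codes. The toy task, the three candidate rules, and the infer_rule helper are hypothetical illustrations made up for this page; they are not taken from the dataset and are not how any real solver works.

```python
from typing import Callable, Dict, List, Optional

Grid = List[List[int]]  # each cell holds an integer color code

# Hypothetical toy task (not from the dataset): two solved examples and one test input.
task = {
    "train": [
        {"input": [[1, 0, 0], [0, 2, 0]], "output": [[0, 0, 1], [0, 2, 0]]},
        {"input": [[3, 3, 0], [0, 0, 4]], "output": [[0, 3, 3], [4, 0, 0]]},
    ],
    "test": [{"input": [[5, 0, 0], [0, 0, 6]]}],
}


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror every row left to right."""
    return [row[::-1] for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    """Reverse the order of the rows."""
    return [row[:] for row in grid[::-1]]


def transpose(grid: Grid) -> Grid:
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]


# A tiny, hand-picked rule set. Real ARC tasks need a far richer space of rules.
CANDIDATE_RULES: Dict[str, Callable[[Grid], Grid]] = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def infer_rule(train_pairs: List[dict]) -> Optional[str]:
    """Return the first candidate rule that reproduces every training output."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(pair["input"]) == pair["output"] for pair in train_pairs):
            return name
    return None


rule_name = infer_rule(task["train"])
if rule_name is not None:
    prediction = CANDIDATE_RULES[rule_name](task["test"][0]["input"])
    print(rule_name, prediction)
```

On this toy task only the horizontal flip matches both examples, so it is the rule applied to the test grid. The hard part of real ARC tasks is that the rule is rarely in any fixed list like this one, which is why brute-force lookup does not scale.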

Why Does AI Fail ARC-AGI?

When ARC-AGI-2 was released in March 2025, the results were stark: GPT-4o scored 0%, Claude 3.7 Sonnet scored 0%, and Gemini 2.0 Flash scored 1.3%, compared to a human average of 60%. Every task had been solved by at least two humans within two attempts. The gap reveals a fundamental difference between AI pattern-matching and human reasoning.

The ARC Prize

The ARC Prize Foundation runs an annual competition offering millions of dollars for AI systems that can match human performance on ARC-AGI tasks without astronomical compute costs. The competition has driven significant research into genuine machine reasoning, with results improving each year. The ARC-AGI-1 benchmark is now considered largely solved by top systems; ARC-AGI-2 is the current frontier.

All ARC-AGI benchmark data is open source under the Apache License 2.0, making it freely available for research, education, and tools like this one.


Love Learning How Minds Work?

NerdSip has courses on AI, cognitive science, logic, and hundreds of other topics — in 5-minute gamified micro-lessons.

Download NerdSip Free