readme updates

Update faa9f03d with edits
faa9f03d first text index removed
2025-05-15 16:15:22 -07:00 · 2025-05-15 16:12:29 -07:00 · 2025-05-15 16:07:33 -07:00 · 2025-05-15 16:06:45 -07:00 · 2025-05-15 16:04:52 -07:00 · 2025-05-15 16:04:03 -07:00
8 changed files with 8084 additions and 505 deletions
--- a/changelog.md
+++ b/changelog.md
@@ -4,9 +4,15 @@ This document tracks changes and updates to the ARC-AGI-2 dataset tasks.

 ## Updates

+### 2025-04-17
+
+* Public eval task `d8e07eb2` - [Single train pair update](https://github.com/arcprize/ARC-AGI-2/commit/14fba87526c727b80b3a9b85d5933fd7825b991f)
+
 ### 2025-04-14

 * Public Eval Tasks were updated with minor adjustments (off-by-one-pixel errors and slight ambiguities) to train and test pairs. No major task refactors. Updated tasks:
+    * `38007db0` - [Single test pair update](https://github.com/arcprize/ARC-AGI-2/commit/385b761253cf7157ad503909f4d8224b8d85ca97#diff-41216bd1be9cb219575a44e2a21a7dcf18667c75dfa292d52ea7878a3148bcd1)
+    * `36a08778` - [Single test pair update](https://github.com/arcprize/ARC-AGI-2/commit/385b761253cf7157ad503909f4d8224b8d85ca97)
    * `247ef758` - [Single test pair update](https://github.com/arcprize/ARC-AGI-2/commit/8b454b595552981fc9aa8e9540f3e68c92b0f03a)
    * `f560132c` - [Single test pair update](https://github.com/arcprize/ARC-AGI-2/commit/30c145f7c524c932d95d4a512abdd5318ef21bf9)
    * `f931b4a8` - [Train pair update](https://github.com/arcprize/ARC-AGI-2/commit/86a8149f53ce915c069cf586f061eb0af0204713)
--- a/data/evaluation/4a21e3da.json
+++ b/data/evaluation/4a21e3da.json
--- a/data/evaluation/abc82100.json
+++ b/data/evaluation/abc82100.json
--- a/data/evaluation/b6f77b65.json
+++ b/data/evaluation/b6f77b65.json
@@ -152,36 +152,6 @@
      }
    ],
    "test": [
-      {
-        "input": [
-          [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 4, 3, 3, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 4, 0, 3, 0, 8, 7, 7, 7, 0, 0],
-          [0, 0, 4, 0, 3, 0, 8, 0, 0, 7, 0, 0],
-          [0, 0, 4, 0, 3, 0, 8, 0, 0, 7, 0, 0],
-          [0, 0, 6, 5, 5, 5, 5, 5, 0, 7, 0, 0],
-          [0, 0, 6, 0, 0, 0, 0, 5, 0, 7, 0, 0],
-          [0, 0, 6, 0, 0, 0, 0, 5, 0, 7, 0, 0],
-          [0, 3, 1, 1, 1, 0, 2, 2, 2, 2, 9, 0],
-          [0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 9, 0],
-          [0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 9, 0]
-        ],
-        "output": [
-          [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
-          [0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0],
-          [0, 0, 4, 0, 0, 0, 0, 0, 0, 7, 0, 0],
-          [0, 0, 4, 0, 0, 0, 8, 7, 7, 7, 0, 0],
-          [0, 0, 4, 0, 0, 0, 8, 5, 0, 7, 0, 0],
-          [0, 0, 4, 0, 0, 0, 8, 5, 0, 7, 0, 0],
-          [0, 0, 6, 5, 5, 5, 5, 5, 0, 7, 0, 0],
-          [0, 0, 6, 0, 1, 0, 2, 2, 2, 2, 9, 0],
-          [0, 0, 6, 0, 1, 0, 2, 0, 0, 0, 9, 0],
-          [0, 0, 1, 1, 1, 0, 2, 0, 0, 0, 9, 0]
-        ]
-      },
      {
        "input": [
          [7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
--- a/data/evaluation/d8e07eb2.json
+++ b/data/evaluation/d8e07eb2.json
@@ -35,9 +35,9 @@
      ],
      "output": [
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
-          [3, 3, 0, 0, 0, 3, 3, 2, 2, 3, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
-          [3, 3, 0, 0, 0, 3, 3, 2, 2, 2, 3, 3, 7, 7, 7, 3, 3, 3, 3, 6, 3, 3],
-          [3, 3, 0, 0, 0, 3, 3, 3, 2, 3, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
+        [3, 3, 0, 0, 0, 3, 3, 1, 1, 1, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
+        [3, 3, 0, 0, 0, 3, 3, 3, 1, 3, 3, 3, 7, 7, 7, 3, 3, 3, 3, 6, 3, 3],
+        [3, 3, 0, 0, 0, 3, 3, 1, 1, 1, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
        [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
        [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
--- a/data/evaluation/f560132c.json
+++ b/data/evaluation/f560132c.json
--- a/data/evaluation/faa9f03d.json
+++ b/data/evaluation/faa9f03d.json
--- a/readme.md
+++ b/readme.md
@@ -4,14 +4,14 @@ This repository contains the ARC-AGI-2 task data (ARC-AGI-1 can be found [here](

 *"ARC can be seen as a general artificial intelligence benchmark, as a program synthesis benchmark, or as a psychometric intelligence test. It is targeted at both humans and artificially intelligent systems that aim at emulating a human-like form of general fluid intelligence."*

-A foundational description of the dataset, its goals, and its underlying logic, can be found in: [On the Measure of Intelligence](https://arxiv.org/abs/1911.01547) and the [ARC-AGI-2 Presentation](https://docs.google.com/presentation/d/1hQrGh5YI6MK3PalQYSQs4CQERrYBQZue8PBLjjHIMgI/edit?usp=sharing)
+A foundational description of the dataset, its goals, and its underlying logic, can be found in: [On the Measure of Intelligence](https://arxiv.org/abs/1911.01547), the [ARC-AGI-2 Presentation](https://docs.google.com/presentation/d/1hQrGh5YI6MK3PalQYSQs4CQERrYBQZue8PBLjjHIMgI/edit?usp=sharing) and [ARC-AGI-2 Technical Report](http://arcprize.org/blog/arc-agi-2-technical-report)

 ## Dataset composition

-ARC-AGI-2 contains 1,000 training tasks and 120 public evaluation tasks.
+ARC-AGI-2 contains 1,000 public training tasks and 120 public evaluation tasks.

 The training tasks are intended to demonstrate the task format and the Core Knowledge priors used by ARC-AGI. They can be used for training AI models.
-The public evaluation tasks are intended for testing AI models that have never seen these tasks before. Average human performance on these tasks in our test sample was 60%.
+The public evaluation tasks are intended for testing AI models that have never seen these tasks before. Average human performance on these tasks in our test sample was 66%.

 ARC-AGI-2 also features two private test sets not included in the repo:

@@ -48,7 +48,7 @@ When looking at a task, a test-taker has access to inputs & outputs of the demon

 ## Usage of the testing interface

-You can view tasks on [ARCPrize.org/play](https://arcprize.org/play) or clone the [ARC-AGI testing interface](https://github.com/fchollet/ARC-AGI/tree/master/apps) located at `apps/testing_interface.html`. Open it in a web browser (Chrome recommended). It will prompt you to select a task JSON file.
+You can view tasks on [ARCPrize.org/play](https://arcprize.org/play) or clone the [ARC-AGI-1 testing interface](https://github.com/fchollet/ARC-AGI/tree/master/apps). Open it in a web browser (Chrome recommended). It will prompt you to select a task JSON file.

 After loading a task, you will enter the test space, which looks like this:
Author	SHA1	Message	Date
Greg Kamradt	f3283f7274	readme updates	2025-05-15 16:15:22 -07:00
Greg Kamradt	2c42f4d6f2	Update faa9f03d with edits	2025-05-15 16:12:29 -07:00
Greg Kamradt	f4852d1766	faa9f03d first text index removed	2025-05-15 16:07:33 -07:00
Greg Kamradt	fa11dfc31c	removing first test index f560132c	2025-05-15 16:06:45 -07:00
Greg Kamradt	f85d970504	removing b6f77b65 first test index	2025-05-15 16:04:52 -07:00
Greg Kamradt	fb0a4bfce8	removing abc82100 first test index	2025-05-15 16:04:03 -07:00
Greg Kamradt	124910ab8e	removing 4a21e3da first test index	2025-05-15 16:02:37 -07:00
Greg Kamradt	1ef37bc909	changelog update	2025-04-17 10:37:44 -07:00
Greg Kamradt	14fba87526	task update	2025-04-17 10:36:10 -07:00
Greg Kamradt	fd80c5ad77	adding 2 tasks	2025-04-15 14:33:56 -07:00
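
The two kinds of edit in this diff, dropping the first test pair from a task file (as in `b6f77b65.json`) and changing a few cells of a grid (as in `d8e07eb2.json`), can be reproduced or checked with a short script. The following is a minimal sketch, not the tooling the author used. It assumes only what the diff shows: a task is a JSON object with a `"test"` list of `{"input", "output"}` pairs, and grids are lists of integer rows. The names `drop_first_test_pair` and `changed_cells` are hypothetical helpers introduced for illustration.

```python
import copy


def drop_first_test_pair(task):
    """Return a copy of a task dict with its first test pair removed."""
    if not task.get("test"):
        raise ValueError("task has no test pairs to remove")
    updated = copy.deepcopy(task)  # leave the caller's dict untouched
    updated["test"] = updated["test"][1:]
    return updated


def changed_cells(old_grid, new_grid):
    """List (row, col, old, new) for each cell that differs between two grids."""
    same_shape = len(old_grid) == len(new_grid) and all(
        len(a) == len(b) for a, b in zip(old_grid, new_grid)
    )
    if not same_shape:
        raise ValueError("grids must have the same shape")
    return [
        (r, c, old, new)
        for r, (old_row, new_row) in enumerate(zip(old_grid, new_grid))
        for c, (old, new) in enumerate(zip(old_row, new_row))
        if old != new
    ]


# The three rows replaced in the d8e07eb2.json hunk, copied from the diff.
OLD_ROWS = [
    [3, 3, 0, 0, 0, 3, 3, 2, 2, 3, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
    [3, 3, 0, 0, 0, 3, 3, 2, 2, 2, 3, 3, 7, 7, 7, 3, 3, 3, 3, 6, 3, 3],
    [3, 3, 0, 0, 0, 3, 3, 3, 2, 3, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
]
NEW_ROWS = [
    [3, 3, 0, 0, 0, 3, 3, 1, 1, 1, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
    [3, 3, 0, 0, 0, 3, 3, 3, 1, 3, 3, 3, 7, 7, 7, 3, 3, 3, 3, 6, 3, 3],
    [3, 3, 0, 0, 0, 3, 3, 1, 1, 1, 3, 3, 7, 3, 3, 3, 3, 3, 6, 6, 3, 3],
]

# Row and column indices are relative to the three-row excerpt, not the full grid.
for row, col, old, new in changed_cells(OLD_ROWS, NEW_ROWS):
    print(f"row {row}, col {col}: {old} -> {new}")
```

To apply the first helper to a file, load it with `json.load`, pass the result to `drop_first_test_pair`, and write it back with `json.dump`. The output formatting of the repository's JSON files is not specified here, so a round trip may reflow whitespace.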