The MC-LARC dataset proposed in this study consists of one sentence describing the example input image from ARC and five sentences explaining the problem-solving rules. The image description sentence was written directly by humans, while the problem-solving rule sentences were generated using GPT-4-32k. MC-LARC was then used in a reasoning test to determine whether ChatGPT-4 and humans could select the appropriate descriptions for the given images.
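To make the dataset layout concrete, the following is a minimal Python sketch of how one item and a selection-accuracy score might be represented. The class name `MCLARCItem`, its field names, and the `accuracy` helper are hypothetical and are not taken from the paper. The sketch also assumes that exactly one of the five rule sentences counts as the correct choice, which the text above does not state.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MCLARCItem:
    """One hypothetical MC-LARC record; the real schema may differ."""
    task_id: str                # illustrative identifier for the ARC task
    image_description: str      # human-written sentence describing the example input image
    rule_options: List[str]     # the five sentences explaining problem-solving rules
    answer_index: int           # ASSUMPTION: exactly one option is treated as correct

    def __post_init__(self) -> None:
        # Enforce the "five sentences" structure described in the text.
        if len(self.rule_options) != 5:
            raise ValueError("expected exactly five rule sentences")
        if not 0 <= self.answer_index < len(self.rule_options):
            raise ValueError("answer_index is out of range")


def accuracy(items: Sequence[MCLARCItem], choices: Sequence[int]) -> float:
    """Fraction of items for which the selected option matches answer_index."""
    if len(items) != len(choices):
        raise ValueError("items and choices must have the same length")
    if not items:
        return 0.0
    correct = sum(1 for item, choice in zip(items, choices)
                  if choice == item.answer_index)
    return correct / len(items)
```

Under these assumptions, the same `accuracy` function could score either a model's selections or a human participant's, so the two groups would be compared on identical items.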
