
AI Can’t Tell Time? Study Shows LLM Struggles


The Time-Telling Troubles of Artificial Intelligence: A Reality Check

Artificial intelligence has made astonishing strides in recent years, demonstrating capabilities that were once confined to the realm of science fiction. We now have AI that can generate photorealistic images indistinguishable from reality, craft intricate novels with compelling characters and narratives, assist students with their academic assignments, and even predict the complex structures of proteins, a feat that could revolutionize medicine and biotechnology. These advancements have fueled excitement and speculation about the potential of AI to transform various aspects of our lives.

However, a new study conducted by researchers at Edinburgh University has unveiled a surprising weakness in these seemingly omnipotent systems. Despite their remarkable achievements in complex tasks, artificial intelligence often struggles with a seemingly elementary skill: telling time. This revelation casts a spotlight on the limitations of current AI models and highlights the need for further research to bridge the gap between human and machine intelligence.

The researchers focused their investigation on the ability of seven prominent multimodal large language models (MLLMs) to interpret time-related information from visual inputs. MLLMs are a type of AI designed to process and generate various forms of media, including images, text, and audio. The models tested included OpenAI’s GPT-4o and GPT-o1, Google DeepMind’s Gemini 2.0, Anthropic’s Claude 3.5 Sonnet, Meta’s Llama 3.2-11B-Vision-Instruct, Alibaba’s Qwen2-VL7B-Instruct, and ModelBest’s MiniCPM-V-2.6.

The study, which is scheduled for publication in April and is currently available on the preprint server arXiv, involved presenting the MLLMs with a series of images depicting analog clocks and calendars. The clock images featured a variety of designs, including timekeepers with Roman numerals, different dial colors, and some even lacking a seconds hand. The calendar images spanned a period of ten years.

The researchers then posed a series of time-related questions to the models, based on the images they were shown. For the clock images, the models were asked a straightforward question: "What time is shown on the clock in the given image?" For the calendar images, the questions ranged from simple queries such as "What day of the week is New Year’s Day?" to more challenging questions like "What is the 153rd day of the year?"

The results of the study were surprisingly underwhelming. Overall, the AI systems demonstrated a significant lack of proficiency in accurately interpreting time from visual cues. They correctly read the time on analog clocks less than 25% of the time. The models struggled with clocks featuring Roman numerals and stylized hands just as much as they struggled with clocks lacking a seconds hand altogether. This suggests that the underlying problem lies not in unusual clock designs but in the models' difficulty detecting the positions of the hands and interpreting the angles they form on the clock face.
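For a human (or a conventional program), converting hand angles into a time is pure arithmetic. The sketch below, which is illustrative and not taken from the study, shows the mapping the models apparently fail to apply once they have misjudged the hand positions:

```python
from datetime import time

def angles_to_time(hour_angle_deg: float, minute_angle_deg: float) -> time:
    """Convert clock-hand angles (measured clockwise from 12) to a time.

    The minute hand sweeps 360 degrees per hour (6 degrees per minute);
    the hour hand sweeps 360 degrees per 12 hours (30 degrees per hour,
    0.5 degrees per minute).
    """
    minutes = round(minute_angle_deg / 6) % 60
    # The hour hand's angle encodes both the hour and the minute offset;
    # subtract the minute contribution before dividing by 30.
    hours = round((hour_angle_deg - minutes * 0.5) / 30) % 12
    return time(hour=hours, minute=minutes)

# A minute hand at 90 degrees and an hour hand at 157.5 degrees read 5:15.
print(angles_to_time(157.5, 90))  # 05:15:00
```

The point of the comparison is that the rule itself is trivial; the study suggests the models stumble on the visual step of locating the hands, which no amount of downstream arithmetic can repair.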

While Google’s Gemini-2.0 outperformed its competitors on the clock task, GPT-o1 led on the calendar task with an accuracy of 80%. Even so, the most successful MLLM on the calendar task still made mistakes roughly 20% of the time. This indicates that even in tasks where AI demonstrates some competence, a significant margin for error remains.

The researchers emphasized the cognitive complexity involved in tasks that humans typically master at a young age. "Analogue clock reading and calendar comprehension involve intricate cognitive steps: they demand fine-grained visual recognition (e.g., clock-hand position, day-cell layout) and non-trivial numerical reasoning (e.g., calculating day offsets)," the researchers explained.
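The day-offset reasoning the researchers describe is, again, straightforward to express deterministically. As a hedged illustration (not code from the study), the two example calendar questions reduce to a few lines of standard-library Python:

```python
from datetime import date, timedelta

def weekday_of_new_years_day(year: int) -> str:
    """Day of the week for January 1 of the given year."""
    return date(year, 1, 1).strftime("%A")

def nth_day_of_year(year: int, n: int) -> date:
    """The n-th day of the year (1-indexed), with leap years handled
    automatically by the date arithmetic."""
    return date(year, 1, 1) + timedelta(days=n - 1)

print(weekday_of_new_years_day(2025))  # Wednesday
print(nth_day_of_year(2025, 153))      # 2025-06-02
```

The contrast is the study's point: the offset calculation is mechanical once the calendar is read correctly, yet the models must combine that arithmetic with fine-grained visual parsing of the day-cell layout, and it is this combination that breaks down.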

These findings highlight a fundamental gap in the ability of AI to perform tasks that are considered basic skills for humans. Rohit Saxena, a co-author of the study and a PhD student at the University of Edinburgh’s School of Informatics, emphasized the significance of this deficiency. "Most people can tell the time and use calendars from an early age. Our findings highlight a significant gap in the ability of AI to carry out what are quite basic skills for people," Saxena stated in a university press release.

The implications of this study extend beyond the mere observation that AI struggles with time-telling. The ability to accurately interpret and reason about time from visual inputs is crucial for a wide range of real-world applications, from scheduling events and managing logistics to enabling autonomous systems and assistive technologies. If AI systems are to be successfully integrated into these time-sensitive domains, it is essential to address these shortcomings and improve their ability to understand and reason about time.

The study’s findings serve as a reminder that while AI has made remarkable progress in many areas, it is still far from achieving human-level intelligence. Despite their ability to generate sophisticated content and perform complex calculations, AI systems often struggle with tasks that require common sense, contextual understanding, and the ability to integrate information from different modalities.

The researchers hope that their work will encourage further research into the temporal reasoning capabilities of AI. By identifying the specific areas where AI struggles, researchers can focus their efforts on developing new algorithms and techniques that will enable AI systems to better understand and reason about time.

In conclusion, while AI might be capable of completing complex tasks like writing novels and predicting protein structures, it still faces significant challenges in mastering basic skills like telling time. This limitation underscores the need for continued research and development to bridge the gap between human and machine intelligence and ensure that AI systems can be reliably integrated into time-sensitive, real-world applications. So, while AI might be able to complete your homework, don’t count on it sticking to any deadlines just yet.
