AI’s Time-Telling Troubles

Researchers have discovered a surprising limitation in modern artificial intelligence: many AI systems struggle with the seemingly simple task of telling time.

These days, AI can generate images, write stories, and even solve complex problems. However, a new study reveals that many AI models struggle with basic temporal reasoning. Researchers from the University of Edinburgh tested the ability of several large language models (LLMs) to interpret images of analog clocks and calendars.
Their findings, which will be published in April and are currently available on the arXiv preprint server, indicate significant difficulties for these advanced AI systems when confronting these fundamental tasks.
“The ability to interpret and reason about time from visual inputs is critical for many real-world applications—ranging from event scheduling to autonomous systems,” the researchers noted in their study. “Despite advances in multimodal large language models (MLLMs), most work has focused on object detection, image captioning, or scene understanding, leaving temporal inference underexplored.”
The research team assessed the performance of several leading AI models. These included:
- OpenAI’s GPT-4o and GPT-o1
- Google DeepMind’s Gemini 2.0
- Anthropic’s Claude 3.5 Sonnet
- Meta’s Llama 3.2-11B-Vision-Instruct
- Alibaba’s Qwen2-VL-7B-Instruct
- ModelBest’s MiniCPM-V-2.6
The researchers presented the models with various images, including analog clocks with Roman numerals, different dial colors, and even clocks missing a seconds hand. They also used 10 years of calendar images to assess the AI’s ability to answer time-related questions.
For the clock images, the AI models were asked to state the time shown. Calendar queries ranged from straightforward questions about the day of the week for New Year’s Day to more complex questions, such as identifying the 153rd day of the year. According to the researchers, “Analogue clock reading and calendar comprehension involve intricate cognitive steps: they demand fine-grained visual recognition (e.g., clock-hand position, day-cell layout) and non-trivial numerical reasoning (e.g., calculating day offsets).”
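To give a sense of the day-offset reasoning these calendar questions demand, here is a minimal sketch of the arithmetic a model must effectively perform; the function name is illustrative, not from the study, and it simply uses Python’s standard date handling.

```python
from datetime import date, timedelta

def nth_day_of_year(year: int, n: int) -> date:
    """Return the calendar date of the n-th day (1-indexed) of the given year."""
    return date(year, 1, 1) + timedelta(days=n - 1)

print(nth_day_of_year(2025, 153))  # 2025-06-02 (non-leap year)
print(nth_day_of_year(2024, 153))  # 2024-06-01 (the leap day shifts the answer)
```

Note that the correct answer even depends on whether February has 28 or 29 days, one reason “the 153rd day of the year” is harder than it first appears.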
Poor Performance Across the Board
The results were not encouraging. The AI systems correctly read the time on analog clocks less than 25% of the time. Stylized hands and Roman numerals gave the models as much trouble as dials missing a seconds hand, which the researchers believe points to a core difficulty in detecting the clock hands and interpreting their angles.
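The geometry involved is simple once the hand angles are known, which is what makes the failure striking. The sketch below, which is illustrative and not from the study, assumes idealized angles measured clockwise from the 12 position and recovers the displayed time from them.

```python
def time_from_angles(hour_angle: float, minute_angle: float) -> tuple[int, int]:
    """Recover (hour, minute) from hand angles in degrees, clockwise from 12.

    The minute hand sweeps 6 degrees per minute; the hour hand sweeps
    30 degrees per hour (plus 0.5 degrees per minute, which is why it
    sits between hour marks mid-hour).
    """
    minute = round(minute_angle / 6) % 60
    hour = int(hour_angle // 30) % 12
    return (hour or 12, minute)

print(time_from_angles(105.0, 180.0))  # (3, 30): hour hand halfway past the 3
print(time_from_angles(0.0, 0.0))      # (12, 0)
```

The hard part for the models, per the researchers, is not this arithmetic but reliably extracting the angles from the image in the first place.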
Of all the models tested, Google’s Gemini-2.0 performed best on the clock task, while GPT-o1 led on the calendar questions with 80% accuracy. Even that best result means the model was wrong roughly one time in five.
Rohit Saxena, a co-author of the study and a PhD student at the University of Edinburgh’s School of Informatics, said, “Most people can tell the time and use calendars from an early age. Our findings highlight a significant gap in the ability of AI to carry out what are quite basic skills for people.”
Saxena continued, “These shortfalls must be addressed if AI systems are to be successfully integrated into time-sensitive, real-world applications, such as scheduling, automation and assistive technologies.”
So, while AI is rapidly advancing and can perform difficult tasks, it seems that these models aren’t ready to manage your schedule just yet.