Breaking News in Technology & Business – Tech Geekwire


    New AI Benchmark Tests Vision-Language Models’ Ability to Play Classic Video Games

    By techgeekwire, April 27, 2025

    A new research project has introduced VideoGameBench, an AI benchmark designed to test whether state-of-the-art vision-language models can play and beat a suite of 20 popular video games using only what they see on the screen.

    The researchers found that even the most advanced vision-language models, including GPT-4o, Claude Sonnet 3.7, and Gemini 2.5 Pro, struggle to play classic first-person shooters such as Doom because of high inference latency. By the time the model responds with an action, the game state has already changed significantly, so the action is no longer relevant.
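
    The latency problem comes down to simple arithmetic: the longer a model takes to answer, the more game frames pass before its action lands. The sketch below is illustrative only and is not part of VideoGameBench. The function names, the 3-second latency, the 35 updates-per-second rate, and the 10-frame tolerance are all assumptions chosen for the example, not figures reported by the researchers.

```python
import math


def frames_elapsed(latency_s: float, frames_per_second: float) -> int:
    """Whole game frames that pass while the model is still computing its action."""
    if latency_s < 0 or frames_per_second <= 0:
        raise ValueError("latency must be >= 0 and frame rate must be > 0")
    return math.floor(latency_s * frames_per_second)


def action_is_stale(latency_s: float, frames_per_second: float,
                    tolerance_frames: int) -> bool:
    """True if more frames elapsed than the chosen action can tolerate."""
    return frames_elapsed(latency_s, frames_per_second) > tolerance_frames


# Hypothetical values for illustration; none come from the benchmark itself.
ASSUMED_UPDATES_PER_SECOND = 35.0  # assumed game update rate
ASSUMED_LATENCY_S = 3.0            # assumed time for one model response
ASSUMED_TOLERANCE_FRAMES = 10      # assumed window in which an action is still useful

if __name__ == "__main__":
    missed = frames_elapsed(ASSUMED_LATENCY_S, ASSUMED_UPDATES_PER_SECOND)
    stale = action_is_stale(ASSUMED_LATENCY_S, ASSUMED_UPDATES_PER_SECOND,
                            ASSUMED_TOLERANCE_FRAMES)
    print(f"frames elapsed during inference: {missed}")
    print(f"action stale on arrival: {stale}")
```

    Under these assumed numbers, the screen the model reasoned about is more than a hundred frames old when its action arrives, which is why a fast-paced shooter punishes slow responses far more than a turn-based game would.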

    Challenges in Gaming Environments

    The researchers chose classic Game Boy and MS-DOS games because their simpler visuals and varied input styles are a better test of a vision-language model’s spatial reasoning. The suite includes Warcraft II, Age of Empires, and Prince of Persia. In the tests, Claude Sonnet 3.7 progressed the furthest in Doom, reaching the blue room.

    Key Findings

    1. Latency Issues: Delayed responses are particularly problematic in fast-paced games like first-person shooters.
    2. Action Understanding: Models often failed to perform basic in-game actions or understand how their actions would translate on-screen.
    3. Mouse Control: The most consistent failure across all tested models was an inability to reliably control the mouse in games requiring precise movements.

    The researchers emphasized that evaluating AI systems in dynamic environments such as video games is crucial for understanding their limitations. Compared with tasks such as unsolved math proofs, playing video games is considered a more accessible challenge for AI, yet current models still struggle with it.

    VideoGameBench and its associated agent are open source, so developers can test vision-language models themselves. The release comes as AI continues to struggle in gaming environments, a domain where even non-AI entities such as lawnmowers and human gut bacteria have been tested against classic game challenges.
