Breaking News in Technology & Business – Tech Geekwire

    Enhancing Amazon’s Just Walk Out Technology with Multi-Modal AI

    By techgeekwire, February 26, 2025

    Since its introduction in 2018, Amazon’s Just Walk Out technology has redefined the retail experience. Customers can now enter a store, select their items, and leave without the need to wait in line to pay. This checkout-free technology is available in over 180 locations worldwide, including various retail environments such as sports stadiums, entertainment venues, and convenience stores.

    The Just Walk Out system automatically identifies the products each customer selects, generating digital receipts and eliminating checkout lines. This post highlights the latest advancements in Just Walk Out technology, powered by a sophisticated multi-modal foundation model (FM).

    We’ve engineered this multi-modal FM for physical stores, using a transformer-based architecture that is similar to the technology behind many generative AI applications. This model leverages data from multiple sources, including a network of overhead video cameras, advanced weight sensors on shelves, digital floor plans, and product catalog images, to generate highly accurate shopping receipts.

    In essence, a multi-modal model processes data from various input channels. Our investment in state-of-the-art multi-modal FMs enables the Just Walk Out system to be deployed in diverse shopping environments with improved accuracy and efficiency. Just as large language models (LLMs) generate text, the new system is engineered to generate an accurate sales receipt for every shopper who visits the store.

    The Challenge: Addressing Complex Shopping Scenarios

    Just Walk Out stores present a unique technical challenge because of their checkout-free environment. Retailers and shoppers alike demand near-perfect checkout accuracy, including for unusual shopping behaviors that produce complex sequences of activity and take significant effort to analyze.

    Previous iterations of the Just Walk Out system used a modular architecture, breaking down the shopper’s visit into distinct tasks such as shopper interaction detection, item tracking, product identification, and quantity counting. These components were integrated into sequential pipelines. While this approach delivered highly accurate receipts, addressing challenges in new or complex shopping scenarios demanded substantial engineering effort, which limited scalability.
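
    To make the contrast concrete, the earlier modular design can be pictured as a chain of hand-written stages. The sketch below is purely illustrative: the stage names loosely follow the tasks listed above, but the functions, the event format, and the catalog values are hypothetical and are not Amazon's implementation.

```python
from typing import Callable, Dict, List

# Hypothetical event format: each dict describes one observed shelf interaction.
Event = Dict[str, object]
Stage = Callable[[List[Event]], List[Event]]


def detect_interactions(events: List[Event]) -> List[Event]:
    """Keep only events in which a shopper touched an item."""
    return [e for e in events if e.get("touched")]


def identify_products(events: List[Event]) -> List[Event]:
    """Attach a product id; an unknown shelf slot maps to 'unknown'."""
    catalog = {"slot-1": "sku-water", "slot-2": "sku-chips"}  # made-up catalog
    return [{**e, "sku": catalog.get(str(e["slot"]), "unknown")} for e in events]


def count_quantities(events: List[Event]) -> List[Event]:
    """Collapse events into one record per product with a total quantity."""
    totals: Dict[str, int] = {}
    for e in events:
        totals[str(e["sku"])] = totals.get(str(e["sku"]), 0) + int(e["qty"])
    return [{"sku": sku, "qty": qty} for sku, qty in totals.items()]


def run_pipeline(events: List[Event], stages: List[Stage]) -> List[Event]:
    """Run stages strictly in order; each stage sees only the previous output."""
    for stage in stages:
        events = stage(events)
    return events


raw = [
    {"slot": "slot-1", "touched": True, "qty": 1},
    {"slot": "slot-2", "touched": False, "qty": 1},
    {"slot": "slot-1", "touched": True, "qty": 2},
]
receipt = run_pipeline(raw, [detect_interactions, identify_products, count_quantities])
print(receipt)
```

    Because each stage depends on the exact output format of the stage before it, supporting a new shopping scenario can mean changing several stages at once, which is the kind of engineering cost described above.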

    The Solution: Just Walk Out Multi-Modal AI

    To overcome these challenges, we introduced a new multi-modal FM designed for retail store environments, empowering Just Walk Out technology to handle complex real-world shopping situations. The new FM improves the system’s ability to generalize to new store formats, new products, and new customer behaviors, which is key to scaling Just Walk Out technology.

    Continuous learning is another advantage: the model automatically adapts and learns from new and challenging scenarios, which keeps performance high even as retail environments evolve. By combining end-to-end learning with improved generalization, the system can handle a broader range of dynamic and complex retail settings and deliver a smoother checkout-free shopping experience.

    The core elements of the Just Walk Out model include:

    • Flexible data inputs: The system tracks how shoppers interact with products and fixtures. It primarily uses multi-view video feeds and utilizes weight sensors to track small items. The model maintains a 3D digital representation of the store and cross-references catalog images to identify products, including situations where items are returned to shelves incorrectly.
    • Multi-modal AI tokens to represent shoppers’ journeys: Encoders process multi-modal data inputs, compressing them into transformer tokens, which serve as the foundational unit for the receipt model. This enables the model to interpret hand movements, distinguish between items, and accurately count items picked up or returned to the shelf with speed and precision.
    • Continuously updating receipts: The system uses tokens to generate digital receipts, distinguishing between various shopper sessions, and dynamically updating each receipt as items are selected or returned.
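
    The last point, a receipt per shopper session that changes as items are picked up or put back, can be sketched in a few lines. This is a minimal illustration under stated assumptions: in the real system the pick and return events would be predicted by the model from its tokens, whereas here they are supplied by hand, and the `Receipt` class, the event tuple layout, and the SKU names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Receipt:
    """Running receipt for one shopper session (hypothetical structure)."""
    items: Dict[str, int] = field(default_factory=dict)

    def apply(self, action: str, sku: str, qty: int = 1) -> None:
        """Add items on 'pick', remove them on 'return'; drop empty lines."""
        if action not in ("pick", "return"):
            raise ValueError(f"unknown action: {action}")
        delta = qty if action == "pick" else -qty
        new_qty = max(0, self.items.get(sku, 0) + delta)
        if new_qty:
            self.items[sku] = new_qty
        else:
            self.items.pop(sku, None)


def build_receipts(events: List[Tuple[str, str, str, int]]) -> Dict[str, Receipt]:
    """Route each (session_id, action, sku, qty) event to that session's receipt."""
    receipts: Dict[str, Receipt] = defaultdict(Receipt)
    for session_id, action, sku, qty in events:
        receipts[session_id].apply(action, sku, qty)
    return dict(receipts)


events = [
    ("shopper-a", "pick", "sku-water", 2),
    ("shopper-b", "pick", "sku-chips", 1),
    ("shopper-a", "return", "sku-water", 1),
    ("shopper-b", "return", "sku-chips", 1),
]
receipts = build_receipts(events)
print({sid: r.items for sid, r in receipts.items()})
```

    Keeping one receipt object per session is what lets interleaved events from different shoppers update the right receipt without affecting the others.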

    Training the Just Walk Out FM

    Fed large amounts of multi-modal data, the Just Walk Out FM consistently generated (or, technically, “predicted”) accurate receipts. To improve accuracy, we designed more than 10 auxiliary tasks, such as detection, tracking, image segmentation, grounding (connecting abstract concepts to real-world objects), and activity recognition, all within a single model. This improved the model’s ability to work in new store setups, with new products, and with new customer behaviors, which is integral to expanding Just Walk Out technology to new locations.
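
    One common way to train several tasks in a single model is to minimize a weighted sum of per-task losses. The article does not say how the Just Walk Out FM combines its objectives, so the following is only an assumed, generic illustration: the task names come from the text, while the function, the weights, and the loss values are invented.

```python
from typing import Dict


def combined_loss(
    receipt_loss: float,
    auxiliary_losses: Dict[str, float],
    auxiliary_weights: Dict[str, float],
) -> float:
    """Return receipt_loss plus the weighted sum of auxiliary task losses.

    Tasks missing from auxiliary_weights default to a weight of 1.0.
    """
    total = receipt_loss
    for task, loss in auxiliary_losses.items():
        total += auxiliary_weights.get(task, 1.0) * loss
    return total


# Invented numbers, for illustration only.
losses = {"detection": 0.5, "tracking": 0.25, "segmentation": 1.0}
weights = {"detection": 0.5, "tracking": 2.0}  # segmentation falls back to 1.0
total = combined_loss(2.0, losses, weights)
print(total)
```

    In a setup like this, the auxiliary terms push the shared model to learn detection or tracking signals that the receipt task can reuse, which is one plausible reading of why auxiliary tasks help generalization.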

    To achieve the most accurate results, AI model training relies on curated data fed to selected algorithms. We accelerated the model training by employing a data flywheel, a continuous, self-reinforcing cycle of data mining and labeling. This system is designed to integrate these iterative improvements with minimal human intervention.
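
    A single turn of such a flywheel might look like the sketch below: find the samples the current model is least sure about, label them automatically, and add them to the training set. This is a simplified assumption about how the cycle could work, not the actual system. The confidence threshold, the `confidence` and `auto_label` callables, and the clip names are hypothetical, and the retraining step that would follow each round is left out.

```python
from typing import Callable, List, Tuple

Sample = str                      # stand-in for a recorded shopping clip
Labeled = Tuple[Sample, str]      # (clip, label)


def flywheel_round(
    unlabeled: List[Sample],
    training_set: List[Labeled],
    confidence: Callable[[Sample], float],
    auto_label: Callable[[Sample], str],
    threshold: float = 0.8,
) -> List[Sample]:
    """Mine low-confidence samples, auto-label them, and grow the training set.

    Returns the samples that were not mined and stay in the unlabeled pool.
    """
    remaining: List[Sample] = []
    for sample in unlabeled:
        if confidence(sample) < threshold:
            training_set.append((sample, auto_label(sample)))  # hard case: keep it
        else:
            remaining.append(sample)  # the model already handles this one
    return remaining


# Toy stand-ins: a real system would call the current model and labeling models.
scores = {"clip-1": 0.95, "clip-2": 0.40, "clip-3": 0.70}
training_set: List[Labeled] = []
pool = flywheel_round(
    list(scores),
    training_set,
    confidence=lambda s: scores[s],
    auto_label=lambda s: f"auto-label-for-{s}",
)
print(pool)
print(training_set)
```

    Repeating the round after each retraining is what makes the cycle self-reinforcing: as the model improves, fewer samples fall below the threshold and the mined set concentrates on the remaining hard cases.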

    To address the massive data volume required for training high-capacity neural networks, we built the infrastructure for the Just Walk Out model on Amazon Web Services (AWS). We used Amazon Simple Storage Service (Amazon S3) for data storage and Amazon SageMaker for training.

    Key steps in the FM training process:

    • Selecting challenging data sources: We focused on training data from tricky shopping scenarios that tested the system’s limits. Though rare, these cases proved valuable for the model to learn from its mistakes.
    • Leveraging auto-labeling: Algorithms and models were created to attach meaningful labels to data automatically, which improves operational efficiency. In addition to receipt prediction, these algorithms also produce labels for the auxiliary tasks.
    • Pre-training the model: The FM was pre-trained on a diverse collection of multi-modal data to enhance its ability to generalize to new store settings.
    • Fine-tuning the model: We further refined the model and implemented quantization techniques to develop a smaller, more efficient model for edge computing.
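
    The article does not specify which quantization technique was used. As a rough illustration of the general idea, the sketch below applies symmetric 8-bit quantization, a common scheme assumed here for the example, to a short list of made-up weights: each float is stored as a small integer plus one shared scale factor.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.27]  # made-up values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)
print(restored)
```

    Storing 8-bit integers instead of 32-bit floats cuts weight memory roughly fourfold at the cost of a small rounding error per weight, which is the usual trade-off when shrinking a model for edge hardware.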

    As the data flywheel continues, more difficult cases will inform the training set, increasing the model’s accuracy and applicability in new retail environments.

    Tian Lan is a Principal Scientist at AWS.

    Chris Broaddus is a Senior Manager at AWS.

    Conclusion

    This multi-modal AI system represents a substantial step forward for Just Walk Out technology. This innovative approach shifts from modular AI, which depends on human-defined subcomponents and interfaces, to end-to-end trained AI systems that are simpler and more scalable. Multi-modal AI is improving the shopping experience in more Just Walk Out stores worldwide.
