Breaking News in Technology & Business – Tech Geekwire

    The Rise of Small Language Models: Enhancing AI Efficiency and ROI

    By techgeekwire | July 5, 2025

    The advent of large language models (LLMs) has sparked a surge in enterprise AI pilot programs moving into deployment. However, early LLMs proved unwieldy and expensive, prompting a shift towards smaller, more efficient models. Google, Microsoft, and Mistral have released compact models such as Gemma, Phi, and Mistral Small 3.1, respectively, which offer fast and accurate results on specific tasks.

    Enterprises can now opt for smaller models tailored to particular use cases, reducing operational costs and potentially achieving a better return on investment (ROI). According to Karthik Ramgopal, distinguished engineer at LinkedIn, smaller models require less computational power and memory and deliver faster inference times, which translates directly into lower infrastructure costs. “Task-specific models have a narrower scope, making their behavior more aligned and maintainable over time without complex prompt engineering,” Ramgopal explained.

    Model developers have priced their smaller models competitively. For instance, OpenAI’s o4-mini costs $1.10 per million input tokens and $4.40 per million output tokens, significantly less than the full o3 version at $10 per million input tokens and $40 per million output tokens. The availability of small, task-specific, and distilled models has expanded, with most flagship model families now offered in a range of sizes. Anthropic’s Claude family, for example, includes Claude Opus, Claude Sonnet, and Claude Haiku, with Haiku described as compact enough to operate on portable devices.
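
    The price gap above can be made concrete with a little arithmetic. The sketch below uses only the per-million-token prices quoted in this article; the monthly token volumes are hypothetical figures chosen for illustration, not data from any of the companies mentioned.

```python
# Prices quoted in the article, in US dollars per million tokens.
PRICES_PER_MILLION = {
    "o4-mini": {"input": 1.10, "output": 4.40},
    "o3": {"input": 10.00, "output": 40.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the quoted per-token prices."""
    price = PRICES_PER_MILLION[model]
    return (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )


# Hypothetical monthly workload (an assumption, not a figure from the article).
MONTHLY_INPUT_TOKENS = 500_000_000
MONTHLY_OUTPUT_TOKENS = 100_000_000

small = workload_cost("o4-mini", MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
large = workload_cost("o3", MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)

print(f"o4-mini: ${small:,.2f}")
print(f"o3:      ${large:,.2f}")
print(f"ratio:   {large / small:.1f}x")
```

    At these assumed volumes the smaller model comes to roughly $990 a month against about $9,000 for the larger one, a gap of around nine times. The ratio depends only on the quoted prices; the absolute figures scale with whatever volume an organization actually runs.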

    The question of ROI remains complex. Experts note that some companies consider ROI achieved once the time to complete a task has been cut, while others wait for actual cost savings or increased business. Ravi Naarla, Cognizant’s chief technologist, recommends identifying expected benefits, estimating them based on historical data, and being realistic about overall AI costs, including hiring, implementation, and maintenance.
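
    Naarla’s advice amounts to comparing an estimated benefit with the full cost of a project. A minimal sketch of that comparison follows, assuming the common definition of ROI as net benefit divided by total cost. The class name, its fields, and every dollar amount are hypothetical placeholders, not figures from Cognizant or anyone else quoted here.

```python
from dataclasses import dataclass


@dataclass
class AIProjectEstimate:
    """Hypothetical first-year estimate for an AI project, in US dollars."""

    expected_annual_benefit: float  # estimated from historical data
    hiring_cost: float
    implementation_cost: float
    annual_maintenance_cost: float

    def total_cost(self) -> float:
        # Sum every cost category the article lists, not just model usage.
        return (
            self.hiring_cost
            + self.implementation_cost
            + self.annual_maintenance_cost
        )

    def roi(self) -> float:
        # ROI as a fraction: (benefit - cost) / cost.
        cost = self.total_cost()
        return (self.expected_annual_benefit - cost) / cost


# Illustrative numbers only (assumptions, not reported results).
estimate = AIProjectEstimate(
    expected_annual_benefit=600_000,
    hiring_cost=150_000,
    implementation_cost=200_000,
    annual_maintenance_cost=50_000,
)

print(f"Total first-year cost: ${estimate.total_cost():,.0f}")
print(f"First-year ROI: {estimate.roi():.0%}")
```

    Listing hiring, implementation, and maintenance as separate fields makes it harder to overlook a cost category, which is the kind of realism about overall AI costs that Naarla calls for.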

    Small models reduce implementation and maintenance costs, particularly when they are fine-tuned to give them more context. Arijit Sengupta, CEO of Aible, noted that fine-tuning can significantly reduce token costs. Aible’s experiments showed a 100x cost reduction from post-training alone, dropping model use costs from “single-digit millions to something like $30,000.” However, small models need post-training to match large models’ performance, which incurs additional costs.

    Experts emphasize the importance of right-sizing models for performance and cost. Daniel Hoske, CTO at Cresta, suggests starting with LLMs to assess feasibility before transitioning to smaller models. LinkedIn’s Ramgopal follows a similar approach, using general-purpose LLMs for prototyping before customizing solutions.

    While smaller models offer cost savings, they may have smaller context windows than larger models, which can increase human workload and costs. Rahul Pathak of AWS cautions against overusing small models, as they may not handle complex instructions effectively. Sengupta also warns that some distilled models can be brittle, potentially negating long-term savings.

    Industry players stress the need for flexibility in model choice. Tessa Burg, CTO at Mod Op, advises organizations to be prepared for model changes and updates. Starting with the understanding that current models will be superseded by better versions allows for more adaptable AI strategies. Smaller models have already helped Burg’s company save time and budget in researching and developing concepts.

    Ultimately, the key to maximizing ROI lies in matching model size to specific tasks and being prepared to adjust as needed. Vendors are now making it easier to switch between models automatically, but users must also consider platforms that facilitate fine-tuning to avoid additional costs.

    © 2025 TechGeekWire. Designed by TechGeekWire.