A little while ago I was listening to a podcast interview with the CEO of one of the biggest tech companies in the world. He made a good point – companies and industries that use AI to boost productivity and grow the economy will be the real winners of AI.
It calls to mind the practical advances we’ve seen with generative AI, especially small language models (SLMs) and AI agents. Although currently less “famous” than the large language models (LLMs) found on desktop PCs, phones and in news headlines, SLMs offer some outstanding benefits and real-life applications for frontline workers in industries like retail.
A selection of dedicated SLMs, deployed as part of a suite of AI agents, can be optimised for the intelligent automation of specific tasks. These AI capabilities can enable frontline workers to capture workflow context easily, accurately and quickly and integrate it directly into a handheld mobile computer running AI agents, helping to improve productivity, gain better asset visibility and elevate the customer experience.
Making it Real
SLMs are also ideal for on-device AI, bringing AI capabilities to mobile devices, wearables and other resource-constrained devices, and enabling features like offline voice assistants and real-time translation. SLM-based AI agents support edge computing applications, where data is processed closer to its source, reducing latency and bandwidth consumption.
This offers significant benefits for frontline workers in retail, warehouses and logistics operations, where faster real-time decision-making translates directly into greater efficiency.
AI in the Moment – No Cloud Needed
On-device SLMs offer some compelling advantages that IT, innovation and CTO teams will appreciate. Privacy is a major one: user data can stay on the device, reducing the risk of data breaches and giving greater control over personal information.
There’s also lower latency, as processing happens locally, without the need to communicate with a remote server. This results in near-instantaneous responses, which is critical for real-time applications like voice assistants and live translation.
Bandwidth and cloud LLM compute costs can also fall: because data isn’t being sent to the cloud, on-device processing can significantly reduce mobile data usage. On-device SLMs also enable offline functionality, as the AI can still function without an internet connection, making it useful in areas with limited or unreliable connectivity.
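As a rough illustration of that offline-first behaviour, here is a minimal Python sketch. Every name in it (`answer`, `tiny_local_model`, `unreachable_cloud_model`, the `is_online` and `needs_cloud` checks) is a hypothetical stand-in rather than part of any real product or library. It only shows the routing idea: answer on the device by default, and treat the cloud as optional.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    text: str
    source: str  # "device" or "cloud"


def answer(
    prompt: str,
    local_model: Callable[[str], str],
    cloud_model: Optional[Callable[[str], str]] = None,
    is_online: Callable[[], bool] = lambda: False,
    needs_cloud: Callable[[str], bool] = lambda prompt: False,
) -> Reply:
    """Answer on the device by default; use the cloud only when policy
    asks for it and the network is actually reachable."""
    if cloud_model is not None and needs_cloud(prompt) and is_online():
        try:
            return Reply(cloud_model(prompt), "cloud")
        except ConnectionError:
            pass  # connection dropped mid-request: fall back to the device
    return Reply(local_model(prompt), "device")


# Stand-ins for real models (assumptions, for illustration only).
def tiny_local_model(prompt: str) -> str:
    return f"[on-device] {prompt}"


def unreachable_cloud_model(prompt: str) -> str:
    raise ConnectionError("no network")


# No cloud configured at all: the request never leaves the device.
offline_reply = answer("Where is item 1234?", tiny_local_model)

# Cloud requested but unreachable: the device still answers.
fallback_reply = answer(
    "Summarise today's deliveries",
    tiny_local_model,
    cloud_model=unreachable_cloud_model,
    is_online=lambda: True,
    needs_cloud=lambda prompt: True,
)
print(offline_reply.source, fallback_reply.source)
```

In this sketch the worker always gets a response, and the cloud call is an optional extra rather than a dependency, which is the property that matters in a stockroom or loading bay with patchy coverage.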
There is a lot of potential in several areas, including enhanced voice assistants that are more responsive, preserve privacy, and can understand and respond to complex requests even offline. There’s also real-time translation of spoken language and text without relying on a cloud connection.
The Future is Multimodal AI Agents
The future of AI is inherently multimodal. Humans don’t experience the world through text only; we use sight, sound, touch, and more. AI needs to do the same, drawing on multiple senses to truly understand and interact with the world effectively.
The good news is that SLMs and AI agents can be multimodal, as in the merchandising example cited earlier. They are not limited to processing and generating text and, in this vision, they must be multimodal to reach their full potential, especially when deployed on edge devices. There are two primary approaches to achieving this, an integrated model and a modular system, each with its own advantages and considerations.
The trend is toward more integrated multimodal SLMs as the technology matures and we find more efficient ways to train these complex models. However, a modular approach is often more practical and cost-effective in the short term, particularly when dealing with the resource limitations of on-device deployment.
The future will likely involve a combination of both approaches, depending on the specific use case and the available resources. Current research and development aims to create more efficient and powerful integrated multimodal SLMs and AI agents, while also developing robust modular systems that can be easily customised and deployed on a wide range of devices.
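To make the modular idea more concrete, here is a small Python sketch under stated assumptions. `ModularAgent`, `Observation` and the three `fake_*` functions are hypothetical placeholders, not real models or APIs. The sketch shows only the shape of the approach: a separate, swappable component per modality turns its input into text, and a single small language model reasons over the combined result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Observation:
    modality: str   # e.g. "audio" or "image"
    payload: bytes


# Each encoder turns one modality into text the language model can read.
Encoder = Callable[[bytes], str]


class ModularAgent:
    def __init__(self, encoders: Dict[str, Encoder],
                 language_model: Callable[[str], str]) -> None:
        self.encoders = encoders
        self.language_model = language_model

    def handle(self, observations: List[Observation]) -> str:
        parts = []
        for obs in observations:
            encoder = self.encoders.get(obs.modality)
            if encoder is None:
                raise ValueError(f"no encoder installed for {obs.modality!r}")
            parts.append(f"{obs.modality}: {encoder(obs.payload)}")
        # One text prompt, built from every modality, goes to the SLM.
        return self.language_model("\n".join(parts))


# Placeholder components (assumptions): a real deployment would plug in
# an actual speech recogniser, vision model and SLM here.
def fake_speech_to_text(payload: bytes) -> str:
    return payload.decode("utf-8")


def fake_image_captioner(payload: bytes) -> str:
    return f"{len(payload)}-byte image"


def fake_slm(prompt: str) -> str:
    return f"summary of {len(prompt.splitlines())} inputs"


agent = ModularAgent(
    {"audio": fake_speech_to_text, "image": fake_image_captioner},
    fake_slm,
)
result = agent.handle([
    Observation("audio", b"check shelf three"),
    Observation("image", b"\x00\x01\x02"),
])
print(result)
```

Because each encoder is a separate component, a device with no camera could simply leave the image encoder out, which is one reason a modular design can suit resource-limited hardware. An integrated multimodal model would replace the encoders and the language model with a single network.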
Ultimately, the goal is AI systems that can seamlessly process and understand the world through multiple senses, enabling more natural, intuitive, and effective interactions with humans and the environment. AI that makes work better every day will be the real winner.