I’m convinced Edge AI is going to explode in the near term as a killer domain area for unlocking AI value.
Perhaps my experiences biased me…
While at Poly, I spent years looking at use cases and innovations that could bring intelligence to the edge and differentiate the user experience. Not to toot my own horn, but I’ve worked with some brilliant minds, like Scott Walsh and Arun Rajasekaran, on the IP needed to make that real.
During my tenure at Verizon, the company was pursuing a variety of technologies, from MEC (multi-access edge computing) to Private 5G networks, that will enable future use cases by shifting more of the workload towards the edge. And while many of those differentiated use cases are still in the early days, it’s undeniable that leveraging edge technologies offers a multitude of benefits.
I believe the building blocks for Edge AI are rapidly being forged.
And after a plethora of recent discussions about the future of AI and the increasing need for computing at the edge, I thought it would be interesting to step back from the enabling technology, and consider some of the key business considerations that will drive the rise of Edge AI.
Why Edge AI?
To be sure, there’s no shortage of AI in the cloud - and many (if not the majority of) state-of-the-art models are intended for cloud deployment. Today, Edge AI deployment often requires an additional lift to get going - and while future tools will bring edge deployment to parity with cloud, I’d argue that even today there are some core benefits that make Edge AI worth a look. Namely:
- Cost
- Speed
- Privacy
Cost: Keeping your cloud bill under control
I often focus on cost first - as it’s core to any business case and to driving switching behavior. Whether we are considering the model training side of AI, or the inference that produces results in production, AI is computationally expensive. Ask ChatGPT to vibe code for you, ask Meta AI to generate an image, run Gemini Deep Research, and there’s a giant cloud bill! Even if you’re not footing the bill for it today, I promise you it’s coming your way sooner or later!
Perhaps you’ve been reading articles about the massive data center build-out, or the huge amounts of energy companies expect to need to power their AI compute (here are a few in case you haven’t):
- North America data center construction market growing to $110B by 2030
- Microsoft and Dubai-telecom DU data center build
- Google’s pursuit of nuclear energy sources to power AI
- Fears on how AI will tax the US energy grid
The use of cloud for AI (and the associated bill you can expect) is so massive that Google’s current cloud “offer” for AI startups is up to $350,000 in cloud credits.
So what’s your cloud bill going to be?
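One rough way to put a number on that question is a back-of-the-envelope estimate. The sketch below assumes simple per-token metering; the function name `monthly_inference_cost` and every input value are hypothetical placeholders for illustration, not real traffic or real pricing from any provider.

```python
def monthly_inference_cost(requests_per_day: float,
                           tokens_per_request: float,
                           price_per_1k_tokens: float,
                           days: int = 30) -> float:
    """Estimate monthly spend on metered cloud inference (assumes per-token billing)."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens


# Placeholder inputs only: substitute your own traffic and your provider's rates.
estimate = monthly_inference_cost(requests_per_day=10_000,
                                  tokens_per_request=1_500,
                                  price_per_1k_tokens=0.01)
print(f"Estimated monthly cloud inference bill: ${estimate:,.2f}")
```

Even with made-up inputs, the shape of the formula is the point: the bill scales linearly with usage, so any share of requests that can be served on hardware you already own comes straight off the total.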
Now look around your office, and picture all the compute you have at your fingertips, sitting around, paid for and unused.
Your headset has compute. Your video camera. Your router. Heck, even that Apple Lightning-to-HDMI adapter you carry around in your bag has enough compute to run Doom!
To be sure, there are more solutions needed in the stack to leverage all the latent compute just sitting around your home or office - but the simple fact is that you are surrounded by compute that WILL BE usable, at little to no extra cost - so why not avoid some of that AI-driven cloud bill?
Speed: When pseudo real-time isn’t good enough
“Hey [insert AI assistant], can you suggest 3 restaurants near my office where I could organize a business dinner for 5 people, where the ambiance enables a good conversation and the food isn’t too spicy?” you say, prompting your LLM.
It processes the text, shoots a query off to the cloud, and in a few seconds (perhaps even 10 seconds!) comes back with a pointed (albeit verbose) set of options for you to choose from.
That’s not real-time! We may often think of it as a real-time response, but only because it is at least on par with the time it would take to find the answer yourself via traditional approaches.
Instead consider the following:
- A robotic forklift that needs to make an instant decision combining visual / radar / lidar data as to whether or not to halt to avoid hitting a human walking in front of it: REAL-TIME
- A noise suppression solution to remove background baby crying as you try to make your sales pitch working from your home office: REAL-TIME
- An automated visual QA solution on a high speed assembly line, inspecting manufactured parts for defects (or to perform binning): REAL-TIME
There’s an inescapable fact around processing in the cloud - transit time. A packet of data must be transmitted over the air or over cables, get to the cloud and come back - and the minimum transit time (latency) this introduces is dictated by the laws of physics.
When a true real-time response is warranted - processing in the cloud is not good enough!
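To make the physics concrete, here is a minimal sketch of the best-case round trip to a data center. It assumes signals in optical fiber travel at roughly two-thirds of the speed of light in a vacuum, and it ignores routing, queuing, and the compute time itself, so real-world numbers will only be worse. The function name and the distances are hypothetical, chosen for illustration.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 0.67           # assumption: light in fiber moves at about 2/3 of c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time (ms) to a data center distance_km away."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    return 2 * distance_km / speed_in_fiber * 1000


# Illustrative distances, not measurements of any real network.
for km in (1, 100, 1000, 4000):
    print(f"{km:>5} km -> at least {min_round_trip_ms(km):.2f} ms round trip")
```

Under these assumptions, a data center 1,000 km away costs about 10 ms before a single computation happens. That is invisible in a chat window, but it is a meaningful slice of the budget for a forklift deciding whether to brake.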
Sometimes even microseconds count: Tesla has created its own Tesla Transport Protocol over Ethernet, because “TCP/IP is too slow” for its supercomputing needs.
So when speed matters, and latency can’t be tolerated, it’s natural to ask the question - “what can we be processing at the edge?” It may still be early days for truly real-time solutions, but we are on the cusp of it!
Privacy: How can you get more secure than never leaving the house?
I’m sure you, like me, have the fondest memories of securing your life and health in 2020 by never leaving the house. And despite the mental torment, my 2020 was marked by absolutely no sickness - no sneezing, sniffles, fever - nothing.
Now I’m not in any way suggesting that cloud is insecure. Or that E2E encrypted transit over the network is not a best practice in maintaining security. But in all fairness, once the data leaves your premises (in any manner) there is risk - however minimal that may be.
I believe that (at least today) most use cases can take on that risk - especially if they adopt and properly implement best practices. But when data security is a critical factor - why not simply keep it on the edge device?
A pertinent example is biometric data. If you’re an iPhone user, rest assured: “Face ID data doesn’t leave your device and is never backed up to iCloud or anywhere else.” Pixel users can expect the same security: “the face model is stored securely on your phone and never leaves the phone.”
If your position is that segmenting data at the edge is unnecessary given modern encryption standards, then I’d counter with the more doomsday-like argument about future quantum computing: “The potential for quantum computers to break current encryption standards means that any data intercepted and stored today could be at risk, leading to significant breaches of privacy, loss of competitive advantage, and compromised national security.”
But rather than debate, can’t we simply agree with logic - if the data never leaves the edge device, it’s as secure as you were when sitting at home quarantined during COVID?
Cheaper, Faster, More Secure - But is it ready?
So the potential virtues of Edge AI are clear. With the right use case, processing at the edge could be a savvy business decision - even a differentiator today!
Yes, yes… let’s not get ahead of ourselves - it’s important we find the right solution, the right deployment approach for what we need. But there’s a more fundamental question at play today - is it even ready for my dev team to explore?
- Looking to explore Edge AI capability with a state-of-the-art model? Well, Google just released Gemini Nano: “Gemini Nano allows you to deliver rich generative AI experiences without needing a network connection or sending data to the cloud. On-device AI is a great solution for use-cases where low latency, low cost, and privacy safeguards are your primary concerns.”
- Perhaps your big idea is to leverage vision recognition on a mobile device, like the iPhone? Consider Apple’s FastVLM for leveraging Apple Silicon.
- Pursuing an innovative hardware business approach? Qualcomm has a solution to help you out.
Even if none of these are the right toolkits for you - that’s ok! They are truly only the tip of the iceberg.
The inescapable truth is that Edge AI is real, and when applied appropriately it can deliver real business benefits. The question isn’t how you get started, but how you can use it to create something amazing.


