I've been thinking a lot lately about where we are with AI. Not the hype, not the headlines, but the reality of what's actually happening on the ground with real customers trying to solve real problems.
And honestly? There's a massive disconnect between what everyone's talking about and what's actually working in the real world.
Let me explain what I mean.
We're heading toward AI everywhere whether we're ready or not
Look, the trajectory is pretty clear if you step back and look at the big picture. We went from roughly 10 million mainframes in the 1960s-80s to 2 billion PCs by 2005. Now we've got over 50 billion connected devices, and we're racing toward 100+ billion devices that will demand some form of built-in intelligence.
This isn't just tech evolution; it's economics. Three forces nobody can ignore are driving this shift:
- AI compute costs are dropping fast – what used to cost thousands now costs hundreds
- Cloud AI hits a wall when you need real-time responses for billions of devices
- The value you get from AI goes through the roof when it's right where you need it
Think about it this way: if you're running a retail store, you can't have your self-checkout system waiting for a round trip to some data center in Virginia every time someone scans a banana. An industrial robot can't pause for 200 milliseconds to "think" about whether to grab that part or not, or worse, about whether to stop when a worker crosses its path. A smart traffic system can't afford to have every camera upload video to the cloud just to figure out if the light should change. A car can't wait on the network to decide whether to turn left or right, or to recognize a danger in time.
The math does not work.
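To make that concrete, here's a quick back-of-the-envelope sketch. The round-trip latency, local inference time, and speeds below are illustrative assumptions, not measurements, but they show the shape of the problem:

```python
# Back-of-the-envelope: how far a machine travels while waiting for an answer.
# All numbers are illustrative assumptions, not measurements.

CLOUD_ROUND_TRIP_S = 0.200   # assumed 200 ms network round trip to a data center
LOCAL_INFERENCE_S = 0.005    # assumed 5 ms for on-device inference

scenarios = {
    "industrial robot arm (2 m/s)": 2.0,       # speed in metres per second
    "city car at 50 km/h (~14 m/s)": 13.9,
    "highway car at 120 km/h (~33 m/s)": 33.3,
}

for name, speed in scenarios.items():
    cloud_drift = speed * CLOUD_ROUND_TRIP_S   # distance covered before the cloud answers
    local_drift = speed * LOCAL_INFERENCE_S    # distance covered deciding on-device
    print(f"{name}: {cloud_drift:.2f} m waiting on the cloud "
          f"vs {local_drift:.2f} m deciding locally")
```

A car at highway speed covers several metres before the cloud even replies. Whatever the exact numbers, the decision has to happen on the device.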
The problem: current solutions are... not great
Here's what really gets me. Despite this obvious need, the solutions out there are just not cutting it. I see this every single day when I talk to customers across different industries and regions.
Retail experts tell me their current edge AI setups can't handle the latest computer vision models they need. The hardware isn't fast enough, it overheats in their environment, or it is too costly to scale across thousands of stores or point-of-sale systems.
Industrial customers say everything available is either too power-hungry (try explaining a $500/month electricity bill increase to procurement) or gets thermally constrained the moment you put it in a real factory setting.
Smart city deployments? Most cities take one look at the price tag and just walk away. The ROI isn't there with current solutions.
Medical and agritech applications need something that can run 24/7 without breaking the bank on power costs, and frankly, most of what's available today just can't deliver.
The fundamental issue is that everyone's trying to shove cloud chips or mobile processors into edge applications. It's like trying to use a freight train for Formula 1 racing – the underlying architecture just wasn't built for this job.
Why the cloud-first approach is hitting its limits
Truly disruptive technologies often begin in a centralized form, requiring significant investment and intensive experimentation. As the technology matures, it gradually decentralizes.
Take electricity: we started with massive, centralized power plants, but today we’re moving toward distributed systems—solar panels on rooftops and, potentially in the future, compact nuclear reactors powering individual homes or neighborhoods.
The same trend applies to computing. Centralized mainframes evolved to personal computers, and now to smartphones in every pocket.
We're seeing the same pattern unfold in quantum computing, and it's already happening with AI. The evolution of computing, software, and neural network architectures is making edge AI (physical AI) a reality.
Meanwhile, latency, privacy concerns, and regulatory requirements to keep data local are all accelerating the expansion of AI from the cloud to edge devices.
The math that matters: it's all about matrix operations
Running real artificial intelligence in a constrained environment, like inside an edge device, requires a completely new hardware and software architecture.
Let me get a bit technical for a minute, because this is where it gets interesting.
Neural networks are basically doing matrix-vector multiplications (MVMs) about 70-90% of the time, whether you're doing speech recognition, natural language processing, or computer vision. That's just the reality of how these models work.
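To see what that looks like in practice, here's a minimal sketch of a single fully connected layer (the layer sizes are arbitrary, chosen only for illustration). Strip away the activation function and what's left is one matrix-vector multiplication:

```python
import numpy as np

# One fully connected layer, reduced to its core: y = W @ x + b.
# Sizes are arbitrary, for illustration only.
in_features, out_features = 1024, 4096

W = np.random.randn(out_features, in_features).astype(np.float32)  # weight matrix
b = np.zeros(out_features, dtype=np.float32)                       # bias vector
x = np.random.randn(in_features).astype(np.float32)                # input vector

y = np.maximum(W @ x + b, 0.0)   # the MVM, followed by a cheap ReLU

# Rough operation count: the MVM dominates everything else in the layer.
macs_in_mvm = in_features * out_features   # multiply-accumulates in W @ x
ops_elsewhere = 2 * out_features           # one bias add and one ReLU compare per output
print(f"MACs in the matrix-vector product: {macs_in_mvm:,}")
print(f"Other ops in the layer:            ~{ops_elsewhere:,}")
```

Stack dozens of layers like this and the arithmetic budget is almost entirely matrix-vector products; everything else is rounding error.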
Traditional computer architectures are constantly moving data back and forth between memory and processing units. For edge applications where every milliwatt matters, this approach is just wasteful. You're spending most of your energy budget on data movement rather than actual computation.
The solution? You need to rethink the silicon architecture completely. Put memory and application-specific compute elements right next to each other, reduce data movement, shrink the physical footprint, and dramatically increase throughput for these MVM operations.
This is where digital in-memory computing architectures really shine. They're built specifically for the mathematical operations that define modern AI workloads. It's not about being faster at everything; it's about being optimal for the things that actually matter.
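As a very rough model of why this matters, compare an architecture that streams weights in from external DRAM for every inference with one that keeps them resident next to the compute. All of the per-operation energy figures below are ballpark assumptions for illustration, not measured or vendor numbers:

```python
# Crude energy model for one inference pass of a 10M-parameter model at INT8.
# Every energy figure is an illustrative assumption, not a measurement.

PARAMS = 10_000_000            # weights, one byte each at INT8
E_MAC_PJ = 0.3                 # assumed energy per INT8 multiply-accumulate (picojoules)
E_DRAM_BYTE_PJ = 100.0         # assumed energy to fetch one byte from external DRAM
E_ONDIE_BYTE_PJ = 1.0          # assumed energy to read one byte from on-die memory

macs = PARAMS                  # roughly one MAC per weight for a single pass
compute_energy_pj = macs * E_MAC_PJ

# Architecture A: weights streamed in from DRAM on every inference.
dram_energy_pj = PARAMS * E_DRAM_BYTE_PJ
# Architecture B: weights stay resident in on-die memory next to the MAC units.
ondie_energy_pj = PARAMS * E_ONDIE_BYTE_PJ

for name, mem_energy_pj in [("weights fetched from DRAM", dram_energy_pj),
                            ("weights kept on-die", ondie_energy_pj)]:
    total_uj = (compute_energy_pj + mem_energy_pj) / 1e6   # picojoules -> microjoules
    data_share = mem_energy_pj / (compute_energy_pj + mem_energy_pj)
    print(f"{name}: {total_uj:.1f} uJ per inference, "
          f"{data_share:.0%} of it spent moving data")
```

The exact figures vary by process node and memory technology, but the shape of the result is the point: in a conventional design the energy bill is dominated by shuttling weights around, and collapsing that distance is exactly what these memory-adjacent architectures are built to do.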
The real challenge is making high performance accessible
But solving the hardware problem is only half the battle. The real breakthrough happens when you make this performance accessible to developers and innovators who aren't chip designers.
This means comprehensive software development kits that hide the complexity while delivering the full performance benefits. It means modular solutions: M.2 cards, complete edge servers, whatever fits into existing infrastructure without requiring a complete overhaul.
And most importantly, it means pricing that makes sense for real-world deployments, not just proof-of-concept demos.
The sectors that are ready to explode
The opportunity is huge because so many industries are basically waiting for solutions that actually work:
Retail and hospitality need computer vision that's reliable and fast enough for real-time applications, but current solutions are either too expensive or too unreliable for widespread rollout.
Energy and utilities want distributed intelligence for grid management and predictive maintenance, but existing edge AI hardware can't handle the environmental requirements and uptime expectations.
Manufacturing and robotics need real-time decision-making that current solutions simply can't deliver at the right price points and power budgets.
Smart cities might be the biggest opportunity of all: traffic optimization, public safety, infrastructure monitoring. They all require local processing that doesn't exist at scale today.
These aren't niche applications. These are massive markets waiting for technology that actually works.
What comes next
The companies that will win the next wave of AI deployment in the physical world won't be the ones with the biggest cloud infrastructure or the most general-purpose chips. They'll be the ones who figured out early that edge AI needs purpose-built solutions: hardware and software designed from scratch for distributed intelligence.
This isn't some future scenario. It's happening right now. The question isn't whether we'll see ubiquitous AI deployment. We will. The question is which architectures and which companies will make it possible most effectively.
As we build toward this future, I think there are three things that really matter: delivering genuine performance improvements over what exists today, making that performance accessible through intuitive tools, and pricing it for mass adoption rather than just high-end applications.
The next frontier of AI isn't in the cloud; it's everywhere else, in the physical world around us. And honestly, I think the companies that get this first are going to define the next era of computing.
It's still day one, and the best is yet to come.