It was only a little over a year ago that I started hearing about Stable Diffusion and Midjourney and the ability to create images from nothing. Just string a few words together, and a generative AI model sitting on a server transforms those written words into a graphic image. Magic.
Everything has progressed so fast and so frenetically since then. And today, I was standing in the middle of MediaTek’s booth at MWC, with an Android phone running the Dimensity 9300 chipset and generating AI images on the fly.
The model generated and improved the image with every letter I typed, in real time.
Every letter and word I typed triggered the Stable Diffusion model and changed the image to fit my description more accurately. In real time. Zero lag, zero wait, zero servers. Everything is local and offline. I was dumbstruck.
Just last year, Qualcomm was happy to show off (at MWC too) a Stable Diffusion model that could generate an AI image locally in under 15 seconds. We found that impressive then, especially compared to Midjourney’s more time-consuming and server-demanding generation.
But now that I’ve seen real-time generation in action, those 15 seconds seem like a lagfest. Oh, what a difference 12 months make!
The Dimensity 9300 was built from the ground up to handle more on-device AI capabilities, so that wasn’t the only demo MediaTek was touting. However, the others weren’t as impressive or as eye-catching: local AI summaries, photo expansion, and Magic Eraser-like photo manipulation. Most of these features have become commonplace now, with Google and Samsung boasting them in their Pixel software and Galaxy AI suite, respectively.
Then there was a local video generation model, which creates an image and animates it as a series of GIFs to make a video out of it. I tried it a couple of times. It took over 50 seconds and wasn’t always accurate, so you can imagine that it didn’t catch my eye as much as the real-time image model.
MediaTek also showed off a real-time AI avatar maker that uses the camera to capture live footage of a person and animates it in several styles. The animation was a second or two behind the person’s actual movements, so it was not too laggy, but the generated image reminded me of the early days of Dall-E. Again, this was running locally and offline, which explains these issues. It’s still impressive tech, of course, but it didn’t feel “there” in the same way as the real-time image generation model.
As you can tell by now, I really liked that first demo. It just felt like the tech had finally arrived. And the fact that you can do it locally, without the extra costs of servers and the privacy concerns of sending requests online, is what makes this more practical to me.