As computers' ability to interpret and generate text, images, and video keeps accelerating, it's increasingly clear that multi-modal personal assistants are in our future. But the only way to find out what form they'll take is continued exploration. With that in mind, here are some thoughts on using the AI Pin from Humane.
Humane was founded in 2018, well before there was widespread awareness of Large Language Model capabilities. Kudos to them for having the conviction that capable enough systems would exist to enable a "smartphone alternative". That said, many of the frustrations I've encountered with the AI Pin (short battery life, overheating, poor usability) likely stem from the fact that it's trying to replace, or at the very least not require, smartphone use.
"It feels like Humane decided early on that the AI Pin couldn’t have a screen no matter what and did a bunch of product and interface gymnastics..." -The Verge
I'm not here to harp on the device's hardware and software issues; plenty of reviews already call them out. But there are enough problems to keep most people from ever reaching what could be the hero interactions of the AI Pin.
The AI Pin has the components you'd need for multi-modal input and output: a camera, a microphone, a display, a speaker, a network connection, and a processor. The camera and mic are even positioned where we take in the world around us: attached to the clothes on our chest rather than tucked into a pocket or strapped to a wrist. Only smartglasses or earbuds seem more optimally located.
So what's wrong? To protect the privacy of the people around you, the AI Pin's camera takes seconds to turn on, and it often fails to activate at all because the touch gestures that trigger it go unrecognized. So getting real-time input from the world around you is clunky, and the laser display projected onto your hand for output is even clunkier.
The audio side of the AI Pin works better. Though slow, its ability to answer nearly any question and use context like your current location when responding is exactly the kind of ambient computing interaction we've dreamed of since Apple's Knowledge Navigator video. Minus the screen, of course.
When I demoed the ability to effectively talk with a search-enabled Large Language Model to my 15-year-old son, he asked: "Can I get one?" When I pressed him on why, he said: "It's like having a little personal assistant with me all the time." Which is the future I outlined up front. So despite all its flaws, the AI Pin has at least strongly hinted at where human-computer interaction is headed.
Whether that personal assistant shows up in your palm, on your wrist, in your glasses, or on your lapel, we'll find out. I don't doubt more hardware experiments are coming that will let us glimpse that future and, ultimately, live in it. (insert sound of future here).