Design like Iron Man

What is the next step in natural user interfaces? And what do comic book heroes have to say about it?

Iron Man 2008

Iron Man is one cool hero, and Robert Downey Jr.'s 2008 portrayal made him even cooler than the paper concept.

But what did a geek notice while watching the movie, sighing and wanting the same thing? No, it's not the supersonic costume with limitless energy, but the 3D direct-manipulation interface that RDJ used to build said costume.

Though present only for a very limited time on screen (after all, there are baddies to kill, damsels to save), it was an "oh wow" moment for me. Two-dimensional direct manipulation is already here, thanks to the likes of Surface, iPhone and iPad and multitouch technology. But what about accurate, responsive 3d manipulation?



A discussion I had some time ago made me think again about the interface envisioned in Iron Man two years ago. It is, I believe, the holy grail of computer interfaces: direct, accurate manipulation of digital matter in three dimensions. The software used in the movie resembled a CAD package such as AutoCAD, the kind used for modeling houses, cars, furniture and electronics. Let's analyze the specifics in the video and see how far along we are in building this.
No interface elements. There aren't any buttons, sliders or window elements. The interface simply disappears and you are working directly on the "document" or "material". When designing any software product, the biggest credit a UI designer can get is that the user doesn't notice the interface and just gets things done.
Kinetic/physical properties. Movement of digital items has speed and acceleration (rotation of model in first video), mass, impact and reactivity (throwing items in recycle bin unbalances it for a moment). All these are simulated for one purpose: making the user believe he is manipulating the digital entities in the exact same way one would act with physical objects.

Advanced (3D) gestures for non-physical manipulation. There are, of course, a large number of actions a computer can perform which don't map to a physical event, such as the extrusion of an engine.

Accurate 3D holographic projections. The digital elements take a physical shape and are projected in 3d space.

Accurate response to direct manipulation. There are no styluses, controllers, or other input devices used.
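
To make the "kinetic properties" item above a bit less abstract, here is a minimal sketch of how a flicked 3D model could keep spinning and gradually slow down. This is not how the movie's effects (or any real product) were built; the function name, the friction factor and the frame rate are all assumptions chosen for illustration.

```python
def spin_after_flick(angular_velocity, friction=0.9, dt=1.0 / 60, min_speed=0.01):
    """Return the model's angle (in degrees) for each frame after the hand lets go.

    angular_velocity: speed of the flick in degrees per second (hypothetical input).
    friction: fraction of the speed kept after each frame (assumed value).
    dt: frame duration in seconds (assumed 60 frames per second).
    min_speed: speed below which the model is considered at rest.
    """
    angle = 0.0
    angles = []
    while abs(angular_velocity) > min_speed:
        angle += angular_velocity * dt   # keep moving with the current speed
        angular_velocity *= friction     # lose a little speed every frame
        angles.append(angle)
    return angles


# A quick flick at 180 degrees per second: the model coasts, then settles.
frames = spin_after_flick(180.0)
print(f"settled after {len(frames)} frames at {frames[-1]:.2f} degrees")
```

The point is that the object does not stop the instant your hand does. That small bit of inertia is what makes digital items feel like they have mass.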

So how far along are we? If you think this is all silly cinema stuff, prepare to be pleasantly surprised.

Learn to g-speak

G-speak is a "spatial operating environment" developed by Oblong Industries. If the above demo didn't impress you, check out the introduction of g-speak at TED 2010 and some in-the-field videos such as this digital pottery session at the Rhode Island School of Design. The g-speak environment speaks to me as Jeff Han's multitouch video did back in 2006. It took 4 years for that technology to appear in a general-purpose computing device such as the iPad. The g-speak folks hope to bring their technology to mass commercialization within 5 years. That sounds pretty ambitious, even coming from the consultants of the 2002 Minority Report movie. Right?

Bad name, awesome tech: enter Kinect

The newly released Kinect is an add-on for Microsoft's Xbox gaming console. Whereas g-speak is still some years away from commercialization, you can get a taste of it right now with the Kinect.

Combining a company purchase (3DV), tech licensing (PrimeSense) and some magic Microsoft software dust, Kinect was born. Here are a few promotional videos, if you can stomach them. And here's Ars Technica's balanced review. According to PrimeSense,

The PrimeSensor™ Reference Design is an end-to-end solution that enables a computer to perceive the world in three-dimensions and to translate these perceptions into a synchronized depth image, in the same way that humans do.


The human brain has some marvelous capabilities for viewing objects in 3D. It is helped by an enormous capacity for parallelism. It only needs the two inputs from our eyes to tell distance. Lacking the brain's firepower, Kinect uses a neat trick: an IR light source projects infrared light into the room, and the reflections are picked up by the second camera (a CMOS sensor). Here's how the Kinect "sees" our 3D environment.
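
For the curious, one common way to turn a projected IR pattern into distance is triangulation: the further a projected dot appears shifted between where the projector sent it and where the camera sees it, the closer the surface. The sketch below illustrates that general idea only. It is not PrimeSense's actual algorithm, and the focal length and baseline values are assumptions picked to make the arithmetic readable.

```python
def depth_from_disparity(disparity_px, focal_length_px=580.0, baseline_m=0.075):
    """Estimate distance (in meters) to a surface by triangulation.

    disparity_px: how far (in pixels) a projected dot appears shifted
                  in the camera image.
    focal_length_px: camera focal length in pixels (assumed value).
    baseline_m: distance between IR projector and IR camera (assumed value).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    # Classic triangulation relation: depth = focal length * baseline / disparity
    return focal_length_px * baseline_m / disparity_px


# A dot shifted by 14.5 pixels sits roughly 3 meters away under these assumptions;
# a dot shifted twice as much sits at half that distance.
print(depth_from_disparity(14.5))
print(depth_from_disparity(29.0))
```

Note how the shift shrinks as distance grows: far-away objects produce tiny shifts, which is one reason accuracy drops with distance.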

kinect-shell.jpg Wall-E, is that you? © iFixit

Besides the IR projector and IR receiver, Kinect also comes equipped with a VGA camera and no fewer than four microphones. Microsoft took the PrimeSense design and added the VGA camera for face recognition and to help with the 3D tracking algorithm. The microphones are used for speech recognition; you can now yell at your gaming console and it might actually do something. All this in a $150 package that you can buy today.

From some reports I read on the net, it appears Microsoft spent a lot of money on R&D for Kinect. The advertising campaign alone is estimated at something like $200 million. It is only natural to assume that they have bigger plans for Kinect than just having it remain a gaming accessory. I believe Microsoft is betting on Kinect to represent the next leap in natural user interaction. Steve Ballmer was recently asked what the riskiest bet Microsoft was taking was, and replied with "Windows 8". The optimist in me says Windows 8 will be able to use Kinect and have a revised interface to suit 3D manipulation. The cynic in me tells me that he was talking about a new color scheme.

So what do we get? From the original list: almost no interface elements, kinetic/physical properties and advanced 3D gestures. They also added some really cool stuff via software. For example, in a multiplayer game, if a person with an Xbox Live account walks into the room, they are signed into the game automatically, simply via face recognition. Natural language commands bring another source of input for a tiny machine that knows much more about its surroundings than previous efforts.

What do we miss? Holographic projections, accurate response and, something also missing from Iron Man's laboratory, tactile feedback. The early reviews for Kinect all mention this in one way or another... Kinect's technology, when it works, is an amazing way to interface with a computer. When it breaks down, it reminds us that there is still a lot of ground to cover.

Microsoft's push for profitability (understandable, remember this is a mass-consumer product) removed an image processor from the device. This means that it needs external processing power. The computing power reserved for Kinect is at the moment up to 15% of the Xbox's capability. The small size of the cameras and their proximity to each other require a distance of 2-3 meters from the device in order to operate it successfully. Because of the small amount of processing power reserved for it, Kinect's developers have supplied the software with a library of 200 poses which can be mapped to your body faster and more easily than a full body scan. You cannot operate it sitting down; it's my opinion that this is a side effect of the 200 predefined poses.

You can also notice in the g-speak video above that their system reacts to the tiniest change, even when the operators move just their fingers. How do they do that? By using 6 or more HD cameras (and tons of processing per second). The 340p IR receiver and 640p video camera just don't cut it for such fine detection. This is, again, an understandable means of reducing the cost.
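
To illustrate why a pose library is cheaper than a full body scan, here is a minimal sketch of the idea: instead of reconstructing the body from scratch, compare a handful of observed joint positions against each stored pose and pick the closest one. This is my own guess at the general approach, not Kinect's actual tracking code; the pose names, the three-joint skeleton and the distance measure are all made up for the example.

```python
import math

# Hypothetical library: each pose is a list of (x, y) joint positions
# for head, left hand and right hand. A real library would hold far more.
POSE_LIBRARY = {
    "arms_down": [(0.0, 1.0), (-0.3, 0.2), (0.3, 0.2)],
    "arms_up": [(0.0, 1.0), (-0.3, 1.6), (0.3, 1.6)],
    "t_pose": [(0.0, 1.0), (-0.8, 1.0), (0.8, 1.0)],
}


def pose_distance(observed, reference):
    """Sum of joint-to-joint distances between two poses."""
    return sum(math.dist(p, q) for p, q in zip(observed, reference))


def closest_pose(observed, library=POSE_LIBRARY):
    """Return the name of the stored pose that best matches the observation."""
    return min(library, key=lambda name: pose_distance(observed, library[name]))


# A slightly sloppy T-pose still snaps to the stored "t_pose".
print(closest_pose([(0.0, 1.0), (-0.7, 1.1), (0.75, 0.95)]))
```

The flip side of such a shortcut is visible here too: any posture that isn't close to something in the library gets matched to the wrong pose, which fits my hunch about why sitting down doesn't work.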

On the other hand, Microsoft made a great move by placing Kinect next to a gaming platform. Games are by their nature experimental, innovative processes. This gives everyone huge amounts of freedom to experiment. Made a gesture-based interface and no one likes it? You can scrap it and the next game will try something different. This will give Microsoft valuable data for improving Kinect and filtering out bad interaction paradigms.

Kinect has a chance to evolve and become the next natural way to interface with computers. With increases in processing power, accuracy will increase. If you want to play like Iron Man, you can do so now with Kinect.

In the next installment, I'll talk about the accuracy, feedback, Playstation Move and the (sorry) state of holographic projections.


The year of mainstream Linux, 2010 edition

Synopsis: Here's my tale with Linux over the years and why I believe Android fits the bill for this article's title. But first, a bit of history and how the desktop had to change for Linux to be on it.

Ah, Linux... champion of open-source, love of computer geeks everywhere and owner of the cutest Operating System mascot around.

Most of my colleagues label me a mac-freak. With two Apple laptops, two iPods, an iPhone and a Time Capsule in the house, I can't really blame them for it. However, not all of them know that before hooking up with Apple I had a three-year stint with Linux.

This was almost 8 years ago, in a land where there was no Ubuntu, Android wasn't even an idea and editing /etc/X11/XF86Config-4 was the only way to change the screen resolution. It's been a while, so allow me to reminisce a bit.

"Damn kids, get off my lawn!"

I started off with Red Hat 7.1, in a brave attempt to set up a triple-booting system together with Windows 98 and the then newly released Windows XP. I sat down on a Friday afternoon and emerged from my room on Sunday around lunch. I slept about 5 hours through the whole process and, after 20-something installation attempts of the three operating systems, I gave up in defeat.

I was intrigued by my failure in the realm of technology while thinking of the new worlds I had learned of. It's funny how Microsoft's dominance of the operating system market made it strange to even question the status quo or think about alternatives. Learning about open source, volunteers, Unix, the command line, kernels and distributions was as strange to me as coming out of the Matrix was to Neo.

A few months and several stubborn sessions later ("you will learn vi's commands or starve at this keyboard"), I had learned a great deal about operating systems, partitioning, package management, scripting, window managers, boot loaders and other assorted varieties of unix-y knowledge. I firmly believe that any developer should know the innards of his or her preferred operating system, as well as what choices may exist. This knowledge will help write better code in some instances but, most importantly, will help debug software when something goes wrong in the lower parts of the technology stack.

Red Hat, Mandrake, Gentoo, Slackware, Debian all served time on my desktop. Lycoris Desktop, Linspire, Yoper, Mepis, FreeBSD and other curiosities such as Linux From Scratch had a brief run during a period of experimentation. Knoppix was making the rounds as the first usable Live-CD Linux distro, a feature now common to all distributions. At the time, running an OS from a CD was nothing short of amazing, even if agonizingly slow. Debian Unstable eventually became my base OS and xfce my window manager of choice following testing of Blackbox, Fluxbox, IceWM, WindowMaker, Enlightenment, and of course, KDE and Gnome.

I remember clearly spending one week trying out kernel builds on the 2.4 branch, ranging from keyboard-biting frustration to enlightening exhilaration. I made some really good friends that taught me as I went along. I understood communities and open source. I joined a LUG and went to a conference. I also didn't spend more than 6 months running the same OS on a daily basis.

Three years is a long time to run such an experiment, but I don't regret doing it (I also don't recommend Slackware or LFS to anyone). I probably learned more about computers during this time than in any other period. I wanted to start a Linux consultancy business.

So what happened?

University years passed by and pretty soon I needed my computer for "real work". Eventually the thrill of discovery and learning wore off and I became weary of spending hours configuring things just to make them work. My respect for the Unix way of doing things remained, so I couldn't go back to Windows. Ubuntu was just a blip on the radar in 2003-2004. In spring 2005 I ran across John Siracusa's excellent review of Mac OS X Tiger and the course was set. John's reviews have been epic enterprises over the years, sometimes more eagerly awaited by the community than the actual releases of OS X. His attention to detail, precise critique and detailed Unix knowledge won my admiration and stirred a desire to learn more about this OS X. One typical feature of his reviews is the attention to the aesthetic. All of these, I would later discover, are things highly treasured by the Mac community; I'm sad to say the last one is still absent from their Linux-minded counterparts.

Three months later I did what any self-appointed geek does at some point: buy the most capable computer he doesn't really need. I bought a dual-CPU, 2.7 GHz Power Mac G5 and put my Linux days behind me.

A modern, Unix-based operating system built on top of FreeBSD meant I would have the Unix strength beneath the hood while at the same time benefiting from an interface built with usability and speed in mind. Sure, I might give up some "freedoms" found in the Linux world, but really, how many times do you need to change window managers?
Which brings me to the topic at hand.

Mainstream, schmainstream

Mainstream software as a concept lives and dies by the number of people using it. Software ecosystems thrive when users drive demand that developers strive to meet. I'm not going to mince words here. Where operating systems are concerned, everything outside of that is a highly specialized tool, an academic experiment or a hobby.

It so happened that during my years running Linux and thereafter I ran across several articles, forum posts and discussions as to which year would finally be the year of "mainstream" Linux. What drove linuxists to this goal besides recognition and free software ideals?

Linux developers were united by another thing: an idealistic underground current against the Microsoft "oppression". Even today, Ubuntu's Bug No. 1 stands as an example of this counter-movement.
Microsoft has a majority market share in the new desktop PC marketplace. This is a bug, which Ubuntu is designed to fix.

Microsoft's monopolistic strategies of the past, shady business decisions and outright hostile campaigns against Linux painted a big target on its back. Flame wars ensued, parodies popped up, salvos were fired from every camp. "Microsoft is evil/no it isn't/yes it is" flame wars will eventually pop up in any tech community.

It's well known that humans unite easily against a common enemy and rally behind heroes in any battle. And although Microsoft has always been the "enemy" for the Linux camp, a true "hero" never quite emerged. I've often thought of Ubuntu as a pacifying unifier of the various Linux tribes, while at the same time spreading a message of love and understanding for users.

The other OS company running in the mainstream race, Apple, faced the same uphill battle against the Microsoft monopoly. They had a more focused approach and a lot of money, and yet, after many years, they still sit somewhere between 5 and 10% market share worldwide.

The desktop wars were won by Microsoft a long time ago, and the Windows+Office+Exchange+Sharepoint combination will be hard to "beat" in the near future. Apple had a clean break with the iPod, the iPhone and pretty soon with the iPad. Google won the internet race and Linux is hard at work on servers, embedded devices and phones.

Rise of the replicants

Since November 2007, a new hero has emerged in the Linux community. Android took a long path from a Palo Alto startup snatched up by Google in 2005 to an alliance-backed open source contender for the mobile operating system crown.
Microsoft ignored the web and Google snatched it away. Microsoft also ignored the mobile space and Apple stole the spotlight. Nokia struggled to unify its many platforms and UI toolkits. RIM focused on email and business users, while HTC took to grafting a modern, pleasant interface on top of the aging Windows Mobile platform. Apple had shown with the iPhone that consumers appreciate usability with a top-notch media and web interface. Mobile device manufacturers needed a modern operating system with a big software developer behind it.

This is the landscape in which Android was introduced by the two Google founders who rollerbladed their way through business suits when introducing the HTC G1 phone.
340x_lollerskates.jpg Rollerblades ©Gizmodo
g1_launch_suits.jpg Suits ©Engadget
The G1 launch didn't set the world on fire; slowly but surely, however, Android gathered a lot of momentum. An army of droids is being assembled as I write this (tip o' the hat to my friend, Mihai).

Being used to lengthy flame wars from the past, I slowly began to recognize a trend among the comments on Android-related articles on sites I visit frequently. However, it really dawned on me that Android had become "the hero" for the Linux community after reading David Pogue's amusing followup to his Nexus One review:
Where I had written, "The Nexus One is an excellent app phone, fast and powerful but marred by some glitches," some readers seemed to read, "You are a pathetic loser, your religion is bogus and your mother wears Army boots."[...] It's been awhile since I've seen that. Where have I seen… oh, yeah, that's right! It's like the Apple/Microsoft wars!
Yes friends, wars, passion, heroes! Being an iPhone-toting Java developer among open-source enthusiasts in our company, I soon started to get looks and remarks such as "yeah, that iPhone guy who bows to Steve Jobs". Because you see, Android managed to unite two battlefronts: Linux developers as well as Java developers (but that's a topic for a future article).

As I mentioned earlier on, I've always been a supporter of Linux, even if it's not apparent at first glance. That's why I always get a laugh when overhearing the above line. At the same time, I'm glad to see passion among developers for a Linux-based platform. I truly believe passion is needed to bring people to create software, develop an ecosystem, rally behind an idea and yes, bring it into the mainstream. This guy had the right idea, if lacking a bit in style.

Despite my continual purchases from Apple, I also believe competition is good. And unfortunately, besides Android, there haven't been many contenders (or even a few) to light a fire under Apple's iPhone platform and force Apple to react to its shortcomings.

But why a phone? And surely, if we aren't "winning" on the desktop, it's not truly winning, is it? Apple and Google may have targeted phones at first because of different reasons and backgrounds, but found themselves on common territory. Here's my take on it.

The Desktop has been gradually shifting away to the Mobile. Laptops, smartphones, tablets, e-readers. The computing landscape has changed in the last few years, a fact obvious to many. We are witnessing a mind shift, a transition from general-purpose computing to device- and activity-specific computing. Falling costs, shrinking sizes and increasing computing power made the original iPhone twice as fast as my first computer and the Nexus One five to six times as fast. What about constraints? Memory, storage and screen real estate are all at a premium on mobile devices. You can't just plug in another hard drive. You can, however, turn these constraints into a valuable asset in creation: focus.

I believe the focus on this class of devices and the consumer orientation made them a success. Why is that? Targeting a reduced platform, a niche if you will, ensures you don't get distracted or waste resources. You can fail without taking down the company. It's a relatively low-risk avenue. It's an excellent test bed for new interaction and UI paradigms. And if you play your cards right and use the correct development method, you can then expand your operating system onto other generic devices that eat away at the desktop's hegemony.

Interestingly, Apple and Google arrived here by different roads. Apple leveraged its iPod legacy of industrial design and its flexible OS X platform, with a focus on media and entertainment (much of that being games) but also a premier web experience. Google wanted to leverage its excellent infrastructure for the "data in the cloud" paradigm while promoting Linux to mainstream use.

Google made the laudable decision to keep Android open-source. As a result, with people starting to use it for ebook readers, upcoming tablets and netbooks, enterprising developers are rapidly expanding Android's reach. The emphasis on portability and battery performance, together with Google's focus in this area, will ensure that for some time Android will remain a mobile-devices OS.
androidfriends.jpg Here's to more Android friends Image ©Richard Dellinger

And that's a really good thing, because mobile is where the desktop is now.