The year was 2015. Denis Sverdlov, CEO of the car manufacturer Kinetik, announced that his company, together with the Formula E racing circuit, would launch a new kind of race. He explained that this one would not create individual celebrities à la Lewis Hamilton or even accentuate the engineering prowess of a particular engine. It would feature 20 identical cars and zero drivers.
That’s right: like Kennedy, who in 1961 announced that America was embarking on a space program that would usher in a new era, Sverdlov was heralding something similar in the racing world, one based on driverless cars. The implications for industry (and indeed human consciousness) were comparably stratospheric.
Sverdlov ambitiously predicted his so-called “roborace” would take place a year later. Though that date has come and gone, the new kick-off day — scheduled for 2018 — promises to be a game changer.
This scene illustrates the mix of emotions surrounding artificial intelligence at the moment. On one hand, there are entrepreneurs like Denis Sverdlov who make bold proclamations about the indefatigable advance of AI technology. On the other, there are setbacks, such as the occasional crash in the case of Tesla, that make it clear that things are not going to be so easy.
That is part of AI’s story too. And contrary to popular belief (or fear), the progress of AI is not experienced as one giant leap for mankind, but as a slow, incremental march, and sometimes a stumble. Few aspects of this painstaking development process make that clearer than the field of image annotation.
The army behind image annotation
A cursory Google search for the words “artificial intelligence” will yield thousands of stories about how the technology will decimate the labor market. Though some of those concerns are grounded in reality, that narrative often leaves out an important point: developing AI technology, for instance for driverless cars, will require human labor, and a lot of it.
Just think for a moment about what a driverless car needs to be able to do. It has to interpret a driving scene in a number of very challenging situations: at night, in thick fog, in rain, in snow. Then there are the countless objects that could be obstacles, all needing identification. Even objects such as human beings, which one would think would be particularly unambiguous, present problems. For example, if the system develops a rule like “humans have two legs,” things get complicated when a woman appears wearing a skirt, which can make her legs seem to merge into one limb.
This is why human beings are indispensable in the AI development process — and will remain so for a good long while. They are needed to sift through particular scenarios and label objects in order for the software to be able to learn.
In the driverless car industry, progress is measured in the number of miles a company has converted into data, and each mile requires manual work hours from flesh-and-blood people. Driving just a few miles can create tens of gigabytes of data. Data sets are often so large that the car cannot wirelessly upload them all, so, in an ironic twist of fate, many car manufacturers find themselves archaically carrying hard drives to outsourcing centers for processing. Considering that Tesla has covered over 100 million miles, the mind boggles at how much data processing needs to happen for AI to work.
In the short-to-medium term this means quite a few jobs will be created. David Liu, chief executive of Plus.ai, a Silicon Valley startup, recently told the Financial Times: “We need hundreds of thousands, maybe millions of hours of data” for self-driving vehicles to go everywhere, requiring “hundreds of thousands of people to get this thing done”.
True, these will not be the kinds of jobs that will make anyone rich, but then again, it’s not as if truck drivers were ordering caviar at their rest stops either.
How does the annotation work?
There are various approaches to image annotation, each used in different scenarios. The simplest is called “Road Lines”, aptly named after the markings on the asphalt. This approach logs the static features of the road that sensors encounter along the journey.
When discrete objects are involved rather than flat road markings, a higher order of thinking is required. Then “Bounding Boxes” is the method of choice. This is the process whereby a person draws a rectangle around the object that the software is meant to remember.
The “Cubes” method is the third commonly used approach. It also centers on identifying objects manually, with the notable difference that it takes place in three-dimensional space.
And finally, the most labor-intensive option is known as “Full Segmentation”, which entails marking every single pixel in the frame.
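To make the differences between these four methods concrete, here is a minimal sketch in Python of what annotation records for each might look like. The field names and class labels are illustrative assumptions for the sake of the example, not any vendor’s actual annotation format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RoadLine:
    """A static lane marking, stored as a polyline of (x, y) points."""
    points: list[tuple[float, float]]
    kind: str  # e.g. "solid" or "dashed" (assumed labels)

@dataclass
class BoundingBox:
    """A 2D rectangle drawn around an object in an image frame."""
    x: float       # top-left corner, in pixels
    y: float
    width: float
    height: float
    label: str     # e.g. "pedestrian", "cyclist"

@dataclass
class Cuboid:
    """A 3D box around an object, as used in three-dimensional space."""
    center: tuple[float, float, float]  # metres in the sensor frame
    size: tuple[float, float, float]    # length, width, height
    yaw: float                          # heading angle in radians
    label: str

# "Full Segmentation": one class id for every single pixel of the frame.
# Illustrative ids: 0 = background, 1 = road, 2 = pedestrian.
segmentation_mask = np.zeros((720, 1280), dtype=np.uint8)
segmentation_mask[400:720, :] = 1          # lower region marked as road
segmentation_mask[300:420, 600:660] = 2    # a pedestrian-sized patch

box = BoundingBox(x=600, y=300, width=60, height=120, label="pedestrian")
print(segmentation_mask.size)  # 921600 pixel labels, vs. 4 numbers per box
```

The contrast in that final print line is the whole story: a bounding box is four numbers, while full segmentation demands a label for each of the frame’s roughly 900,000 pixels.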
The result can be seen in this demonstration video of Tesla’s Autopilot 2.0.
And here is a simulated view of the surroundings by Waymo, widely recognized as a market leader in autonomous driving.
Choosing the right annotation method for the task at hand
When selecting the appropriate annotation method, the key concern is filtering only the information relevant to the AI software. This often becomes a balancing act: each additional object makes the calculations more precise, but each addition also increases the processing load. The matter is exacerbated by the fact that costs rise in proportion to the precision of the annotation method.
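As a back-of-the-envelope illustration of that balancing act, compare how many values a single frame demands under each method. The frame size and per-method value counts below are assumptions chosen only to show the scaling, not measured figures:

```python
# Rough, illustrative comparison of annotation workload per 720p frame.
frame_pixels = 1280 * 720        # assumed frame resolution

bounding_box_values = 4          # x, y, width, height per object
cuboid_values = 7                # 3D center, 3D size, yaw per object
full_segmentation_values = frame_pixels  # one class label per pixel

# Full segmentation of one frame produces vastly more values than
# drawing a single box, which is why it is the costliest option.
print(full_segmentation_values // bounding_box_values)  # 230400
```

The exact ratio depends on how many objects a scene contains, but the direction is clear: per-pixel precision is orders of magnitude more work, which is why it is reserved for cases that justify the cost.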
Cubes requires more processing power than road lines, for instance, but the former may be the only option for drone flights. Incidentally, the method could also come in handy for fast-moving cars, as in the case of roboraces.
Whatever method is needed, the bottom line is that, at least for the time being, vast armies of people (the crowd) will be needed to get AI to a level of self-sufficiency.