NTSB Hearing Blames Humans, Software And Policy For Fatal Uber Robocar Crash - But Mostly Humans
The National Transportation Safety Board presented its findings today on the fatal crash involving an Uber test robocar and Elaine Herzberg. Not unexpectedly, they placed blame on the safety culture at Uber, the safety driver’s watching of a streaming video show, and to some degree, on the impaired state of the pedestrian.
Most notably, a correction is required on earlier reports that the Uber system could not identify pedestrians outside of crosswalks. That was not the case, and while there were flaws related to that issue, they played no role in this accident.
I wrote extensive coverage earlier on the report. The hearing revealed a few extra snippets of detail not highlighted in the written report.
Probable Cause Report
The NTSB’s final determination of probable cause put primary blame on the safety driver’s inattention. Contributory causes were Uber’s lack of safety culture, poor monitoring of safety drivers, and lack of countermeasures for automation complacency. They put tertiary blame on the pedestrian’s impaired crossing of the road and the lack of good regulations at the Arizona and federal levels.
Most notably, they do not attribute the technology failures as causes of the crash. This is a correct cause ruling – all tested vehicles, while generally better than Uber’s, have flaws which would lead to a crash with a negligent safety driver, and to blame those flaws would be to blame the idea of testing this way at all.
When it comes to human fault, the report noted that Herzberg had a “high concentration of methamphetamine” (more than 10 times the medicinal dose) in her blood which would alter her perception. She also had some marijuana residue. She did not look to her right at the oncoming vehicle until 1 second before the crash.
There was also confirmation that the safety driver had indeed pulled out a cell phone and was streaming a TV show on it, looking down at it 34% of the time during her driving session, with a full 5 second “glance” from 6 to 1 seconds prior to the impact.
While Uber recorded videos of safety drivers, they never reviewed those of this driver to learn that she was violating the policy against cell phone use. She had received no reprimands, and had driven this stretch of road 73 times before.
While there is also a small tablet computer in each car for the operator, it is very simple, just showing the navigation route like other navigation computers, and has very few inputs. NTSB attributed none of the distraction to this tablet. Uber claims it designed the tablet UX in accordance with NHTSA guidelines for in-car driver UX. Some teams use a voice system rather than a touch system to avoid distraction when reporting anomalies.
More detail was revealed about how Uber’s system dealt with pedestrians. Earlier reports (including my own) stated that the system was unable to classify an obstacle as a pedestrian outside of crosswalks. This is not true, though the truth is somewhat more confusing. It simply never classified her as a pedestrian, probably because she was walking a bicycle. What confused writers about the accident was a mention of potential failure (which did not actually occur) in the system’s goal-assignment modules.
When a robocar system identifies an obstacle, it tries to predict where it may move, and to do so estimates a “goal” for it. For example, a pedestrian in a crosswalk will probably have the goal of crossing the road. A car or bicycle in a lane will have the likely goal of continuing to drive down that lane (with some chance of changing lanes.)
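The goal-assignment idea described above can be sketched in a few lines. This is a purely illustrative toy, not Uber’s actual code; the class names and goal labels are my own inventions.

```python
# Hypothetical sketch of goal assignment in a prediction module.
# Class and goal names are illustrative, not from any real system.

def likely_goals(obstacle_class: str, in_crosswalk: bool, in_lane: bool) -> list[str]:
    """Guess likely goals for a tracked obstacle based on its class and location."""
    if obstacle_class == "pedestrian":
        # A pedestrian in a crosswalk probably has the goal of crossing the road.
        return ["cross_road"] if in_crosswalk else ["walk_along_side"]
    if obstacle_class in ("vehicle", "bicycle") and in_lane:
        # A vehicle or bicycle in a lane likely continues down it,
        # with some chance of changing lanes.
        return ["follow_lane", "change_lane"]
    return ["unknown"]

print(likely_goals("pedestrian", in_crosswalk=True, in_lane=False))
print(likely_goals("bicycle", in_crosswalk=False, in_lane=True))
```

The Uber flaw discussed below amounts to the first branch never returning "cross_road" for a pedestrian outside a crosswalk.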
A flaw – which played no role in this crash – was that Uber’s system would not assign “crossing the road” as a goal to a pedestrian who was not in a crosswalk. Even if it had identified her as a pedestrian, it would not have viewed it as likely that she was crossing the road.
Instead, she was classified as things like a vehicle, cyclist or unknown obstacle. The goal guesses for those objects are different, and in many cases are based on the past history of movement of the object. That’s why the flaw of forgetting the past history after reclassification was the real technical error here. Each time it reclassified her, it had to predict her motion with no past history, and thus it was not able to understand that, whatever she was, she was crossing the road and going to enter their lane.
This lack of “object persistence” remains a greater error than the classification errors. Classification errors happen all the time in robocar systems, but that is supposed to be mitigated by the fact that the systems track an object as it moves without knowing exactly what it is. Uber failed to do that.
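To see why persistence matters, here is a minimal sketch of a tracker whose motion history belongs to the track rather than to the classification label. All names are hypothetical; the point is only that the velocity estimate survives a label flip.

```python
class Track:
    """Minimal track that keeps motion history across reclassification."""

    def __init__(self, track_id):
        self.track_id = track_id
        self.label = "unknown"
        self.positions = []  # (t, x, y) observations

    def observe(self, t, x, y, label):
        # The label may flip between frames (vehicle, bicycle, unknown...),
        # but the motion history is kept with the track, not the label.
        self.label = label
        self.positions.append((t, x, y))

    def velocity(self):
        # Estimate velocity from the oldest and newest observations.
        if len(self.positions) < 2:
            return None  # a freshly (re)started track cannot predict motion
        (t0, x0, y0), (t1, x1, y1) = self.positions[0], self.positions[-1]
        dt = t1 - t0
        return ((x1 - x0) / dt, (y1 - y0) / dt)

track = Track(7)
track.observe(0.0, 0.0, 10.0, "vehicle")
track.observe(0.5, 0.0, 9.0, "bicycle")   # reclassified; history retained
track.observe(1.0, 0.0, 8.0, "unknown")
print(track.velocity())  # (0.0, -2.0): moving toward the lane regardless of label
```

Uber’s system behaved as if each reclassification created a brand new track with an empty position list, so `velocity()` effectively kept returning nothing useful.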
More detail was revealed about the disabling of the Volvo standard automatic emergency braking system that comes with the SUV. The Volvo system had its own radar on the same frequency as Uber’s radar and thus could not be used at the same time. Later, they were able to re-tune one of the radars so that both systems can be active.
Some focus was placed on the issue of how well humans do at monitoring automated systems, and the tendency for people to become complacent rather than diligent. This is a common factor in transportation modes that have some amount of automation. Uber never reviewed videos of this safety driver (and rarely reviewed those of others), but now has a third party do spot-checks on safety drivers to look for problems like this. Uber and other companies now use automated systems that track driver gaze to make sure eyes are kept on the road, and even alert if they are off the road too much. Uber has also returned to having two safety drivers.
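The core logic of such a gaze monitor is simple: time each continuous off-road glance and alert when it exceeds a threshold. This sketch is my own illustration, not any vendor’s actual algorithm, and the sample format is assumed.

```python
def eyes_off_road_alerts(gaze_samples, threshold_s=2.0):
    """Return the times at which a continuous off-road glance hits the threshold.

    gaze_samples: list of (timestamp_s, on_road: bool) from a gaze tracker.
    """
    alerts = []
    glance_start = None  # start time of the current off-road glance, if any
    for t, on_road in gaze_samples:
        if on_road:
            glance_start = None  # eyes back on the road; reset the timer
        elif glance_start is None:
            glance_start = t     # a new off-road glance begins
        elif t - glance_start >= threshold_s:
            alerts.append(t)     # glance too long: sound the alert
            glance_start = None  # re-arm after alerting
    return alerts

# Samples every 0.5 s; eyes leave the road from t=1.0 through t=4.0.
samples = [(t / 2, not (1.0 <= t / 2 <= 4.0)) for t in range(11)]
print(eyes_off_road_alerts(samples))  # [3.0]: alert 2 s into the glance
```

A real system would also grade glance targets (mirrors and gauges are fine, a lap-held phone is not), but even this crude timer would have fired repeatedly during a five-second look at a streaming show.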
An interesting statistic was revealed, that Uber had 40 test vehicles in Tempe and 254 operators on staff. This makes it baffling that they wanted to switch to having one safety driver instead of two. They seem to have had more than enough staff to keep two in each vehicle.
While generally two safety drivers are better than one, and would prevent an accident of this sort, I now believe that computerized monitoring of one safety driver is actually probably enough. The record of Tesla autopilot drivers, who are not trained and usually drive alone, suggests that having a single person doing oversight of a decent automated system results in an adequate safety record. A trained safety driver with automated monitoring should do even better.
Uber ATG’s poor safety culture – or in some cases, lack of any safety culture – saw much criticism, as has been reported before. According to the report by Robert Fox, they did not have a proper framework for risk mitigation, and had bad policies and procedures and poor oversight of vehicle operators. Uber, it is reported, has fixed many of these things, and at present is doing only very limited testing – just a one-mile loop around their HQ, limited to 25 mph.
Ensar Becic of the NTSB reported on the fairly basic rules currently applied to companies testing robocars, particularly in Arizona, which puts no requirements on teams testing with safety drivers. NHTSA requests a voluntary safety self-assessment, but only 16 companies have filed one (while over 64 companies are registered to test in California).
Chairman Sumwalt praised Uber for its cooperation with their investigation, and made some strong digs against Tesla for the way it had not helped and had to be forced out of one of their investigations. Sumwalt said he liked that Uber’s CEO “did not hang up on me,” strongly implying that Elon Musk might have hung up on him during conversations.
More notes on the technical faults
In my earlier report on the Uber fatality, I included a lot of discussion on two new technical details, namely (incorrectly) the Uber system’s configuration to not identify pedestrians who were in the middle of the road outside of crosswalks, and more importantly, the flaw in their object tracking system which had it forget any past learnings every time it reclassified an object. Many other press also made reports on these issues, most of them putting a strong focus on the problem with the identification of pedestrians.
Even though I made it very clear that these technical problems, while a sign of poor system design and coding, were not the cause of the accident, it has been natural for readers to take it that way. As such, it is important to reiterate that all robocar teams, to my knowledge, have had issues where their system does not properly classify things in its view, making it necessary, from time to time, for a safety driver to take over to prevent an accident. With brand new projects, this can be a very frequent event, and there is nothing wrong with that, as long as a trained and attentive safety driver is there to do just that. While we can and should be critical of the poor quality of Uber’s software, I doubt any project out there hasn’t gone through a very early period where its system was immature and prone to problems like this, which were handled correctly by a safety driver. If any of these teams had deployed one of their early cars with a safety driver who sat watching a video, as is alleged here, those cars probably would have gotten into an accident, and possibly a fatal one if circumstances were wrong.
It does not excuse Uber, but they also had some bad luck here. Normally, a pedestrian crossing a high speed street outside a crosswalk would exercise some minimal caution, starting with “look both ways before crossing the street” as we are all taught as children. By all appearances, the crash took place late on a Sunday night on a largely empty road, exactly the sort of situation where a person would normally hear any approaching car well in advance, and check regularly to the right for oncoming traffic, which would be very obvious because of its headlights – obvious even in peripheral vision. Herzberg crossed obliviously, looking over just one second before impact. NTSB investigators attributed this to the meth in her system. They did not know if the concentration in her blood was going up (due to recently taken doses) and altering perception, or coming down (causing unusual moods.)
It was also reported that pedestrian crossings at this location are rare (though were more common that night due to a concert) and there is no recollection of another pedestrian incident in this location.
This does not mean that robocars and safety drivers need not be ready for oblivious pedestrians outside of crosswalks, of course. These exist, even if most pedestrians are more prudent. In fact, 71% of pedestrian hits in Arizona took place outside of intersections.
What about the cameras?
Questions remain about the role of computer vision and cameras in the Uber fatality. The NTSB report says almost nothing about the camera systems in the car, other than to note that the close-range cameras, which would not have come into play in this incident, were not in use.
Most self-driving car designs make use of 3 different systems to detect something like a pedestrian on the road ahead. The LIDAR is the most reliable, and is used by all major players except Tesla. The radar is also very good but does not work as well on a pedestrian who is not moving towards you or away from you. In this case, Herzberg was crossing the street and thus moving perpendicular to the car. Radar did still see her but provides much less information. Because Uber’s software kept reclassifying her and forgetting what it learned before, and radar is not good at providing a detailed horizontal position, it was not so useful for noticing that she was walking across the road.
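The geometry behind radar’s weakness here is worth making concrete: a Doppler radar measures the component of a target’s velocity along the line of sight, which is near zero for a crossing pedestrian. A small sketch (my own illustration, with an assumed coordinate frame where the radar sits at the origin):

```python
import math

def radial_speed(vx, vy, px, py):
    """Component of a target's velocity along the line of sight from a radar
    at the origin to a target at (px, py) - the part Doppler measures best."""
    r = math.hypot(px, py)
    return (vx * px + vy * py) / r  # projection of velocity onto the range vector

# Target 50 m straight ahead, closing at 10 m/s: full radial return.
print(radial_speed(0.0, -10.0, 0.0, 50.0))   # -10.0
# Same target crossing left-to-right at walking pace: zero radial return.
print(radial_speed(1.4, 0.0, 0.0, 50.0))     # 0.0
```

The crossing pedestrian still reflects energy and shows up in range, but with essentially no Doppler signature and coarse lateral position, the radar alone says little about her motion across the lane.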
Cameras should have played a useful role here. While computer vision doesn’t have the 100% certain knowledge of distance you get from LIDAR and radar, it can be quite good at figuring out what something is, and all teams train their systems to identify pedestrians, and one presumes, people walking bicycles. Uber’s system never did that. But while it’s hard for a LIDAR to identify something like a pedestrian walking a bicycle as specifically that at a long distance – it’s just a diffuse blob of dots until you get close – this is something cameras should have had a better shot at. Yet, the NTSB report discusses nothing about what the vision perception system saw.
Was it off? Is it simply not very functional? Like all the other technical issues, they take second place to the issue of bad safety driving, but they are still interesting for all of us who want to understand what sort of mistakes programmers can make and how to avoid them.
One issue with vision systems is how they operate at night. An ordinary camera will not simultaneously get good images of both bright things (such as car headlamps and things lit by streetlamps and headlamps) and things in the shadow. Herzberg went in and out of pools of light. The frequently shown dashcam video of the crash reveals the poor performance of an ordinary camera.
That’s why robocar teams have been actively developing “high dynamic range” camera approaches to improve vision results. One way to do that is to have 2 or more cameras, each one set at a different exposure level to get a good image of the shadows and the well lit areas. A cheaper way to do that is just have one camera that alternates between taking bright and dark exposures every other frame. These techniques should have allowed the many cameras on the Uber car to get a good image of the road and Herzberg even though it was night.
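The simplest form of the multi-exposure idea can be shown per pixel: keep the long exposure where it is usable, and fall back to the short exposure where the long one is blown out. Real exposure fusion blends weighted images rather than picking pixels; this toy (my own, with an assumed 0-255 pixel scale) only illustrates the principle.

```python
def fuse_exposures(bright_frame, dark_frame, saturation=250):
    """Naive per-pixel exposure fusion: use the long ("bright") exposure
    unless a pixel is saturated, then fall back to the short ("dark") one.
    Frames are equal-length lists of 0-255 pixel values."""
    return [b if b < saturation else d
            for b, d in zip(bright_frame, dark_frame)]

# A shadowed pedestrian (value 40 in the long exposure) is kept, while a
# headlight-saturated pixel (255) is replaced by the short exposure's 180.
print(fuse_exposures([40, 255], [5, 180]))  # [40, 180]
```

Either the two-camera or alternating-frame approach feeds a pair like this to the fusion step; the output preserves detail in both the pools of light and the shadows between them.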
Board member Jennifer Homendy was highly critical of the (lack of) regulation around testing, and feels there should be more federal and state regulations on the testing of these vehicles. She expressed concern over the NHTSA regulations which are currently fairly minimal. They are minimal because NHTSA and the states have realized they do not have the skills yet to regulate a technology which is still in development and is constantly changing. Nonetheless, NTSB believes NHTSA should make submission of the safety plan reports be mandatory.
Here are the final findings (from the live transcript):
- Finding One, none of the following were factors in this crash: One, driver licensing, experience or knowledge of the automated driving system operation. Two, vehicle operator substance impairment or fatigue or three, mechanical condition of the vehicle.
- Finding Two, the emergency response to this crash was timely and adequate.
- Finding Three, the pedestrian’s unsafe behavior in crossing the street in front of the approaching vehicle at night and at a location without a crosswalk violated Arizona statutes and was possibly due to diminished perception and judgment resulting from drug use.
- Finding Four, the Uber advanced technologies group did not adequately manage the anticipated safety risk of its automated driving system’s functional limitations, including the system’s inability in this crash to correctly classify and predict the path of the pedestrian crossing the road mid-block.
- Finding Five, the aspect of the automated driving system’s design that precluded braking in emergency situations only when a crash was unavoidable increased the safety risks associated with testing automated driving systems on public roads.
- Finding Six, because the Uber advanced technology group’s automated driving system was developmental with associated limitations and expectations of failure, the extent to which those limitations pose a safety risk depends on safety redundancies and mitigation strategies designed to reduce the safety risk associated with testing automated driving systems on public roads.
- Finding Seven, the Uber advanced technology group’s deactivation of the Volvo forward collision warning and automatic emergency braking systems without replacing their full capabilities removed a layer of safety redundancy and increased the risks associated with testing automated driving systems on public roads.
- Finding Eight, postcrash changes by Uber advanced technologies group such as making Volvo’s forward collision warning and automatic emergency braking available during operation of the automated driving system or ADS added a layer of safety redundancy that reduces the risks associated with testing ADSes on public roads.
- Finding Nine, had the vehicle operator been attentive, she would have had sufficient time to detect and react to the crossing pedestrian to avoid the crash or mitigate the impact.
- Finding Ten, the vehicle operator’s prolonged visual distraction, a typical effect of automation complacency, led to her failure to detect the pedestrian in time to avoid the collision.
- Finding 11, the Uber advanced technologies group did not adequately recognize the risk of automation complacency and develop effective countermeasures to control the risk of operator disengagement, which contributed to the crash.
- Finding 12, although the installation of a human machine interface in the Uber advanced technologies group test vehicle reduced the complexity of the task the decision to remove the second vehicle operator increased the task’s demands on the sole operator and also reduced the safety redundancies that would have minimized the risks associated with testing automated driving systems on public roads.
- Finding 13, although the Uber advanced technologies group had the means to retroactively monitor the behavior of vehicle operators and their adherence to operational procedures, it rarely did so. And the detrimental effect of the company’s ineffective oversight was exacerbated by its decision to remove the second vehicle operator during the testing of the automated driving system.
- Finding 14, the Uber advanced technology group’s postcrash inclusion of a second vehicle operator during testing of the automated driving system along with real-time monitoring of operator attentiveness begins to address the oversight deficiencies that contributed to the crash.
- Finding 15, the Uber advanced technology group’s inadequate safety culture created conditions including inadequate oversight of vehicle operators, that contributed to the circumstances of the crash and specifically to the vehicle operator’s extended distraction during the crash trip.
- Finding 16, the Uber advanced technology group’s plan for implementing a Safety Management System as well as postcrash changes in the company’s oversight of vehicle operators begins to address the deficiencies in safety risk management that contributed to the crash.
- Finding 17, mandatory submission of safety self-assessment reports, which are currently voluntary, and their evaluation by the National Highway Traffic Safety Administration would provide a uniform minimal level of assessment that could aid states with legislation pertaining to the testing of automated vehicles.
- Finding 18, Arizona’s lack of a safety-focused application approval process for automated driving system testing at the time of the crash, and its inaction in developing such a process since the crash, demonstrate the state’s shortcomings in improving the safety of ADS testing and safeguarding the public.
- Finding 19, considering the lack of federal safety standards and assessment protocols for automated driving systems, as well as the National Highway Traffic Safety Administration’s inadequate safety self-assessment process, states that have no or only minimal requirements related to automated vehicle testing can improve the safety of such testing by implementing a thorough application and review process before granting testing permits.