Sunday, May 3, 2015

Counter Rollover Bites Boeing 787

Counter rollover is a classic mistake in computer software.  And, it just bit the Boeing 787.

The Problem:

The Boeing 787 aircraft's electrical power control units shut down if powered without interruption for 248 days (a bit over 8 months). In the likely case that all the control units were turned on at about the same time, that means they all shut down at the same time -- potentially in the middle of a flight. Fortunately, the power is usually not left on for 8 continuous months, so apparently this has not actually happened in flight.  But the problem was seen in a long-duration simulation and could happen in a real aircraft. (There are backup power supplies, but do you really want to be relying on them over the middle of an ocean?  I thought not.) The fix is turning off the power and turning it back on every 120 days.

That's right -- the FAA is telling the airlines they have to do a maintenance reboot of their planes every 120 days.

(Sources: NY Times ; FAA)


Analysis:

Just for fun, let's do the math and figure out what's going on.
248 days * 24 hours/day * 60 minute/hour * 60 seconds/minute = 21,427,200
Hmmm ... what if those systems keep time as an 32-bit signed integer in hundredths of a second? The maximum positive value for such a counter would give:
0x7FFFFFFF = 2147483647 / (24*60*60) = 24855 / 100 = 248.55 days.
Bingo!

If they had used a 32-bit unsigned it would still overflow after twice as long = 497.1 days.


Other Examples:

This is not the first time a counter rollover has caused a problem.  Some examples are:

  • IBM: Interface adapters hang after 497 days of uptime [IBM]
  • Windows 95: hang after 49.7 days without reboot, counting in milliseconds [Microsoft]  
  • Hong Kong rail service outage [Blog]
There are also plenty of date roll-over bugs:
  • Y2K: on 1 January 2000 (overflow of 2-digit year from 99 to 00)   [Wikipedia]
  • GPS: 1024 week rollover on 22 August 1999 [USCG]
  • Year 2038: Unix time will roll over on 19 January 2038 [Wikipedia]

There are also somewhat related capacity overflow issues such as 512K day for IPv4 routers.

If you want to dig further, there is a "zoo" of related problems on Wikipedia:  "Time formatting and storage bugs"


Friday, May 1, 2015

How To Report An Unintended Acceleration Problem

Every once in a while I get e-mail from someone concerned about unintended acceleration that has happened to them or someone they know.  Commonly they go to the car dealer and get told (directly or indirectly) that it must have been the driver's fault. I'm sure that must be a frustrating experience.

Fortunately, you can do more than just get blown off by the dealer (if you feel like that is what happened to you).  Visit the US Dept. of Transportation's complaint system and file a complaint with their Office of Defects Investigation (ODI):

What this does is put information into the database that DoT uses to look for unsafe trends in vehicles. ODI conducts defect investigations and administers safety recalls, and this database is a primary source of information for them.  Putting in an entry does not mean that anyone will necessarily get back to you about your particular complaint, but eventually if enough drivers have similar problems with a particular vehicle type, ODI is supposed to investigate. You should be sure to use several different words and phrases to thoroughly describe your situation since often this database is searched via key words. (That means that if they are looking for a particular trend they might only look at records that contain a specific word or phrase, not all records for that vehicle.)  You should include specifics, and in particular things that you can recall that would suggest it is not simply driver error. But, realize that the description you type in will be publicly available, so think about what you write. 

To be sure, this should not be the only thing you do.  If you believe you have a problem with your vehicle should talk to the dealer and perhaps escalate things from there. (If it happened to me I would at a minimum demand a written problem report to the manufacturer central defects office be created and demand a written response from the manufacturer customer relations office to leave a record.)  But, if you skip the DoT database then one of the important feedback mechanisms independent of the car companies that triggers recalls won't have the data it needs to work. If you had an incident that did not result in a police report or insurance claim reporting is especially important, since there is no other way for DoT or the manufacturer to even know it happened.

Even if you haven't suffered unintended acceleration, you might be interested to look at complaints others have filed for your vehicle type, which are publicly available. And of course you can report any defect you like, not just acceleration issues. 

I recently came across the web site:  http://www.safercar.gov/
This has general car safety information and also has a way to file a vehicle safety complaint, including a specific page for filing a safety complaint (https://www-odi.nhtsa.dot.gov/ivoq/).  One would hope the data ends up in the same place, but I don't have information either way on that.

 (For those who are interested in how the keyword search might be done, you can see a NHTSA Document for an example from the Toyota UA investigations.)

Job and Career Advice

I sometimes get requests from LinkedIn contacts about help deciding between job offers. I can't provide personalize advice, but here are...