Archive for July, 2010

Safe Computing: No Rebooting, Please

July 25, 2010

I recently had an interesting discussion with Michal Zalewski over at lcamtuf’s blog concerning disclosure policies.  This wound up reminding me of something that’s always been obvious but that I haven’t thought about much recently: the effect that having to reboot in order to install updates has on update take-up.  Given the speed with which vulnerabilities are now weaponized and the increased number of zero-day attacks, this is more relevant today than ever before.

One of the tasks I take care of at work is monitoring the deployment of Microsoft security updates on machines within the faculty’s Windows domain.  We deploy Microsoft updates using WSUS, which also keeps track of the status of the machines for us.  Unfortunately I haven’t kept any statistical data over the past however-many years, but my subjective impression is that our users are rebooting far more promptly than they used to; this may partly be due to increased awareness of security risks, but I suspect that the regular monthly update cycle plays a big part.  If you know you’re going to need to reboot once a month, and only once a month, there’s no point in putting it off.  If you might be asked to reboot again the following day, you’re less motivated to get it over with.

Even so, the problem wouldn’t arise if updates could be installed without having to reboot.  So, is this possible?  Well, yes – but a qualified yes.  This isn’t easy to do, even if you’re given a clean slate to work with.

Firstly, we need to think a bit more about what we mean when we say “reboot” in this context.  Technically, you’re only rebooting if the OS shuts down and hands control of the system back to the BIOS.  To all intents and purposes, the effect is the same as switching the computer off and back on again.  If you took this definition literally, it wouldn’t be too hard to avoid rebooting after updates are installed; for example, you could run the OS inside a virtual machine, and let the hypervisor retain control of the hardware while the OS is reinitialized.  This would probably save some time, since a virtual reboot is typically much quicker than a physical reboot, but it wouldn’t actually help; the user would still have to close and reopen all the applications and documents they were working with, and that’s what they really want to avoid.

Similarly, it isn’t enough to just think about OS updates and whether you can keep an application running (in some sense) while the OS reinitializes itself underneath it.  You also have to make sure that you can update the application itself without interrupting the user’s work – oh, it probably won’t hurt if the application is unresponsive for a few seconds, but there’d better not be any other visible disruption.  All of the user’s documents need to remain open and in the same state they were in before the update; if it’s a web browser, all the web pages that were open before need to remain open, session cookies need to remain in place, and the internal state of any client-side script needs to be retained.  That’s not going to be easy for the application developer to support, and I suspect that there’s a limited amount the OS can do to help out.

For updating the OS itself, modular design would seem to be key; something like Minix could conceivably be adapted to allow individual modules to be updated one at a time, with both old and new versions of an updated module running simultaneously during the handover to allow the clients to be (transparently, if at all possible) migrated from one to the other.  A similar approach might be possible if the microkernel itself needs to be updated, although if hardware-based security measures have been taken to protect the microkernel from modification it might be necessary to put the OS into hibernation and perform a physical reboot.
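To make the handover idea a little more concrete, here is a minimal sketch in Python.  It is a toy model only, not anything Minix actually provides: the names Router, ServiceV1, ServiceV2, release and adopt are all made up for illustration.  A router sits between the clients and a module, the new version is started alongside the old one, and each client’s state is moved across in turn.

```python
class ServiceV1:
    """Hypothetical module: counts the requests made by each client."""
    version = 1

    def __init__(self):
        self.counts = {}

    def handle(self, client, request):
        self.counts[client] = self.counts.get(client, 0) + 1
        return (self.version, self.counts[client])

    def release(self, client):
        # Give up this client's state so another instance can take over.
        return {"count": self.counts.pop(client, 0)}

    def adopt(self, client, state):
        # Take over a client using state released by another instance.
        self.counts[client] = state["count"]


class ServiceV2(ServiceV1):
    """Hypothetical patched version of the same module."""
    version = 2


class Router:
    """Hypothetical message router between clients and the module."""

    def __init__(self, module):
        self.current = module   # version that new clients are bound to
        self.bindings = {}      # client -> module instance serving it

    def call(self, client, request):
        module = self.bindings.setdefault(client, self.current)
        return module.handle(client, request)

    def upgrade(self, new_module):
        # Old and new versions coexist while clients are migrated one by one.
        old = self.current
        self.current = new_module
        for client, module in list(self.bindings.items()):
            if module is old:
                new_module.adopt(client, old.release(client))
                self.bindings[client] = new_module
```

The client never sees the switch; its next call simply lands on the new version with its count intact.  A real system would also have to cope with requests that are in flight during the migration, which this sketch ignores.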

I strongly suspect it would be impossible to reliably achieve any equivalent functionality in a more conventional OS design such as Windows or Linux.  [*]

Applications could perhaps be updated in a similar way; allow the old version of the application to continue running in parallel with the new version, then migrate each open document, window, or related set of windows from one to the other in turn.  The internal state of each document/window/set would have to be transformed into a version-independent representation for the handover.  This isn’t going to be easy for the programmer, but there would be some fringe benefits: in addition to allowing transparent updating, a user’s idle applications could also be individually hibernated for more efficient multitasking, or a user’s entire logon session could be transparently migrated from one physical machine to another.
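As a rough illustration of what a version-independent representation might look like, here is a small Python sketch.  Everything in it is hypothetical: EditorV1 and EditorV2 stand in for two versions of an imaginary text editor that store a document differently internally, and the JSON layout with its schema number is just one possible choice of neutral format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EditorV1:
    """Hypothetical old version: the document is one long string."""
    text: str = ""
    cursor: int = 0  # offset from the start of the text

    def export_state(self) -> str:
        # Neutral form: plain JSON tagged with a schema number.
        return json.dumps(
            {"schema": 1, "text": self.text, "cursor": self.cursor}
        )


@dataclass
class EditorV2:
    """Hypothetical new version: the document is a list of lines."""
    lines: list = field(default_factory=list)
    cursor: tuple = (0, 0)  # (line, column)

    @classmethod
    def import_state(cls, blob: str) -> "EditorV2":
        state = json.loads(blob)
        if state["schema"] != 1:
            raise ValueError("unknown state schema")
        lines = state["text"].split("\n")
        # Convert the flat cursor offset into a (line, column) pair.
        remaining = state["cursor"]
        for row, line in enumerate(lines):
            if remaining <= len(line):
                return cls(lines=lines, cursor=(row, remaining))
            remaining -= len(line) + 1  # +1 for the newline
        # Offset past the end of the text: clamp to the last position.
        return cls(lines=lines, cursor=(len(lines) - 1, len(lines[-1])))


def hand_over(old: EditorV1) -> EditorV2:
    """Move one open document from the old version to the new one."""
    return EditorV2.import_state(old.export_state())
```

The same exported state could just as well be written to disk to hibernate an idle application, or sent over the network to move a session to another machine, which is where the fringe benefits come from.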

I’ll give this more thought as time permits, but apart from some more or less trivial commentary on how programming languages could be modified to make this sort of thing easier, I’m not sure there’s much more to be said.  Hopefully one day someone will be able to actually try to make something like this happen – come to think of it, similar issues must come up in the context of cloud computing, so perhaps there’s already a body of work I’m not aware of – and practical experimentation is much more likely to result in useful ideas than mere armchair theorizing!

Harry

[*] Windows has theoretically had support for hotpatching for a number of years, but as far as I know none of the security updates released so far have actually implemented it.  I believe you can also buy support for hotpatching Linux systems, but if my understanding of how this is implemented is correct this only allows certain kinds of updates; enough (in most cases) to keep a system secure without having to reboot it for each individual vulnerability, but not a general-purpose solution.

Incorrect Username or Password – or perhaps the clock is wrong

July 2, 2010

Greetings!

OK, this one had me puzzled for an hour or two.  My main test box having failed, I had taken one of the older machines hanging around my office and given it a once-over; some extra RAM, a bigger hard disk drive, that sort of thing.  I noticed the motherboard battery had died, but figured I could live without it.  What the hey; the machine stays connected to the mains most of the time anyway.

So I go to install Windows 7 on the box, using our automated install system on my USB stick.  This is a Windows PE image which runs Windows Setup in unattended mode over the network.  It also includes a bit of in-house code so that you can provide an administrator password to join the machine to the domain in advance; once this password has been accepted, everything else happens automatically.  A few hours later the machine will be ready to use, software and all.  This sort of streamlining may seem unnecessary, but it does make life easier.

Except that this time the domain controller wouldn’t accept my password.  Typed it in again; no luck.  After trying once more, well, okay, several times more, and logging in on another machine to make sure I hadn’t gotten confused about which password went with which account, I realised there really was something wrong.

So I try it on another machine, and it works perfectly.  I try booting the first machine from a copy on CD instead of the one on my USB stick; no difference.  Bringing up a command line window allows me to explicitly attempt to connect to the domain controllers as well as other network servers, with both my own account and a test account with a simpler password – just in case!

The results were puzzling to say the least.  Error code 86 – the network password is not correct – for any user account but only from that one machine and only when connecting to a domain controller.  The file server, by comparison, was perfectly happy to accept the supposedly “incorrect” passwords.  In case there was something wrong with the Windows PE image, I repeated the experiment using the standard Windows 7 install DVD, but this exhibited the same problem.  Swapping the keyboards and network cables between the machines was a long shot at best, and it didn’t pan out either.  The event log on the domain controller showed the logon failures but provided no additional information.  For some reason, the problem didn’t occur when booted to a copy of Windows PE 2 (which is based on Vista rather than Windows 7) but this didn’t help me much.

Eventually it dawned on me that the flat motherboard battery meant that the on-board clock would have been reset; the computer thought the year was 2004.  Mismatched clocks have been known to cause authentication problems, and sure enough, once I had corrected the date and time via the BIOS the domain controller accepted my password and the automated installer set to work.
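For what it’s worth, the check that tripped me up can be sketched in a few lines of Python.  This assumes the failure was the usual Kerberos clock-skew test, where the default tolerance in an Active Directory domain is five minutes; I haven’t confirmed that this is exactly what happened here, and the function name and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance: the Kerberos default of five minutes.
MAX_SKEW = timedelta(minutes=5)


def clock_skew_ok(local, reference, max_skew=MAX_SKEW):
    """Return True if the two clocks agree to within max_skew."""
    return abs(local - reference) <= max_skew


# Illustrative values: a client whose clock has reset to 2004,
# compared against a server with the right date.
client_clock = datetime(2004, 1, 1, 0, 0, tzinfo=timezone.utc)
server_clock = datetime(2010, 7, 2, 9, 0, tzinfo=timezone.utc)

if not clock_skew_ok(client_clock, server_clock):
    print("Clock skew too great; fix the date before blaming the password")
```

A six-year gap fails a five-minute test rather comfortably, so a quick look at the clock would have saved me an hour or two.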

I’m not sure what the moral of this story is exactly.  I suppose I should have remembered the failed battery as soon as the machine started behaving oddly, and stopped to think about whether there could be a connection.  Anyway, I’ve tagged this as a bug, not because the authentication should necessarily have worked, but because the error code (or at least the event log!) should have indicated the actual problem instead of insisting that the password was wrong.

Harry.

