Archive for the ‘And While I’m Dreaming I’d Like a Pony’ Category

Safe Computing: Another Case of Too Many Moving Parts

May 15, 2012

So, one of this month’s Windows XP updates, KB2686509, has a problem.  Long story short, in order to fix a security vulnerability, Microsoft had to tighten the rules for software that installs custom keyboard layouts.  If I’ve understood correctly, most software already does it the recommended way, but if you have software installed that is incompatible with the new rules, the update will refuse to install.  You may be able to fix the problem yourself, or you may have to contact your software vendor.

That isn’t actually the interesting part.  The update will also refuse to install if you’ve remapped a key, for reasons that aren’t entirely clear.

Thought for today: If God wanted us to remap keys, he’d have given us a control panel.

Nah, too easy an answer.  People have reasons for wanting to remap keys, and the process is, or at least was, documented.  If Microsoft are going to make it that easy, of course people are going to do it, and it shouldn’t cause strange problems years later.

No, this is another case of too many moving parts.  We need to be able to configure this sort of thing, sure, but it needs to be done in a controlled way, so that issues can be avoided, or at least properly identified.  Of course, none of the suggestions in my previous post would have addressed this issue, so the question becomes: what should the ideal operating system do to deal with this sort of situation?

I think the solution [1] is to associate changes to the registry (or rather, its equivalent) with the application (or component) that made those changes.  This would probably be implemented by keeping a collection of registry settings for each application; a query for a particular setting would return the default value only if no application had made any changes.  As a bonus, you’d know when two applications or components were in conflict, although to be honest I’m not sure what you’d do about it.

(This doesn’t mean that you couldn’t make ad-hoc changes; in this context, an “application” could be a single text file, not much different from a Windows .reg file except that it would have to include a name to go in the list of installed software.  The registry editor could build these for you.)

Anyway, if Windows worked like that, KB2686509 wouldn’t have had to, metaphorically speaking, shrug its shoulders at you.  It would know which application had configured the setting it was upset about and could say something like, “I can’t install because I am incompatible with the installed program, Freddy’s Keyboard Mapper.  Please contact the vendor for an upgrade, or uninstall that program and try again.”

I think that would have upset fewer people.


[1] By solution, I mean a way for a hypothetical new operating system to avoid running into this sort of trouble.  It would not be feasible to modify Windows to work in this way.


Safe Computing: Integer Arithmetic

August 19, 2011

A few quick notes to set the scene.  Firstly, the need for safe integer arithmetic was pointed out in the CWE list and there’s an excellent example here.  I’m going to use C-like syntax for my examples because it’s the language I’m most familiar with, although the same principles apply to other languages.  Finally, I’ll be using 16-bit arithmetic most of the time because it makes it easier to demonstrate overflow without using inconveniently large numbers.  Where appropriate, imagine that the code is running on a 16-bit system, so that the 16-bit values are not automatically widened to 32 bits or more.

I’m going to use new keywords for integer variable types: int<n> to indicate a signed integer n bits wide, e.g., int32 for a 32-bit signed integer; and nat<n> to indicate an unsigned integer, e.g., nat32 for a 32-bit unsigned integer.  In mathematics, natural numbers are the positive integers, sometimes (as in this case) including zero, and I find “unsigned int” unnecessarily long-winded and abbreviations like “uint” ugly.

Making integer arithmetic safe is simple in principle and can be summed up briefly: all integer arithmetic must either produce the right answer or cause execution to stop.  I’ll leave vague exactly what “stop” means because it depends on context, but the same sort of mechanisms already used to deal with a divide-by-zero are probably suitable.

In the following example, execution will stop on the second line:

nat16 a = 65535;
nat16 b = a + 1;

Execution will also stop on the second line in this example:

nat16 a = 32767;
int16 b = a + 1;

More subtly, this also needs to cause execution to stop:

nat16 a = 32768;
int16 b = a;

Technically an assignment isn’t arithmetic, but it still needs to be safe. 🙂
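In present-day C you’d have to write that range check by hand.  A minimal sketch, with abort() standing in for “stop”:

```c
#include <stdint.h>
#include <stdlib.h>

/* Checked narrowing from nat16 (uint16_t) to int16 (int16_t):
   stop -- here, abort() -- if the value doesn't fit, instead of
   letting it silently wrap to a negative number. */
static int16_t int16_from_nat16(uint16_t a)
{
    if (a > INT16_MAX)
        abort();        /* 32768 and above don't fit in int16 */
    return (int16_t)a;
}
```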

We need to detect underflow as well as overflow:

nat16 a = 5;
nat16 b = a - 10;
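Both the overflow and underflow cases can be approximated in present-day C with the GCC/Clang checked-arithmetic intrinsics.  A sketch, again with abort() standing in for “stop”:

```c
#include <stdint.h>
#include <stdlib.h>

/* Checked nat16 addition: stop (abort) rather than wrap on overflow.
   Uses the GCC/Clang __builtin_add_overflow intrinsic. */
static uint16_t nat16_add(uint16_t a, uint16_t b)
{
    uint16_t result;
    if (__builtin_add_overflow(a, b, &result))
        abort();
    return result;
}

/* Checked nat16 subtraction: stop rather than wrap on underflow. */
static uint16_t nat16_sub(uint16_t a, uint16_t b)
{
    uint16_t result;
    if (__builtin_sub_overflow(a, b, &result))
        abort();
    return result;
}
```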

This is a bit trickier:

nat16 a = 32768;
nat16 b = 32768;
nat16 c = (a + b) / 2;

The correct answer is 32768, which will fit into the nat16 type, but the intermediate value of 65536 won’t.  Whether you will get the right answer depends on what type the compiler uses when working out the expression.  In some cases (such as C in a 32-bit environment) the language specification will guarantee that the code succeeds; in other cases, I would suggest that the compiler be permitted to either succeed or to stop at its own convenience.  The only thing it shouldn’t be allowed to do is decide that the answer is zero.  (Unfortunately, that’s exactly what will happen with most real-world compilers!)
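In C, the usual way to guarantee the right answer here is to widen the intermediate value explicitly:

```c
#include <stdint.h>

/* Average of two nat16 values, computed in a 32-bit intermediate so
   the sum (at most 131070) can't overflow; the final result always
   fits back into 16 bits. */
static uint16_t nat16_avg(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    return (uint16_t)(sum / 2);
}
```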

On occasion the programmer needs to allow an overflow to occur, so the language should define a way to do that.  There are a number of options.  The simplest and most elegant would be to introduce a keyword:

nat16 a = 65535;
nat16 b = lenient a + 1; // b will be set to zero

One problem: this code then wouldn’t compile if the compiler didn’t know about the keyword.  Perhaps a pragma, then:

nat16 a = 65535;
#pragma lenient
nat16 b = a + 1; // b will be set to zero
#pragma strict

Yet another alternative would be to require the compiler to ensure that an overflow in an intermediate calculation won’t stop execution if the final result is still correct:

nat16 a = 65535;
nat16 b = (a + 1) & 65535;

However, I’m not sure this approach is feasible in general, i.e., for more complicated expressions.
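For comparison, here’s what the explicit opt-in looks like in present-day C, where unsigned arithmetic is already defined to wrap modulo 2^n (so the mask is technically redundant, but written out it documents the intent):

```c
#include <stdint.h>

/* Deliberately "lenient" addition.  In C, unsigned arithmetic is
   defined to wrap modulo 2^n, so the mask is redundant here, but it
   plays the role of the explicit opt-in discussed above. */
static uint16_t nat16_add_wrap(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + b) & 0xFFFFu);
}
```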

On a side note, it might also be desirable to simplify the rules determining the types used in expressions, including a straightforward mechanism for the programmer to override the defaults.  I don’t think I should discuss that now, but I’ll try to write about it later in a separate post.

Now it is of course possible for a compiler to do all the above using the standard x86 instruction set – it just needs to check the carry or overflow flag as appropriate after each step.  Why doesn’t this happen?  Nowadays, backwards compatibility would be a problem, but presumably the original issue was performance, and I suspect this is still a concern, particularly where branch instructions would be necessary.  (The INTO instruction is presumably quite efficient, but there is no corresponding INTC instruction for unsigned arithmetic.)

It shouldn’t be too hard for the CPU to come to our aid here.  For unsigned arithmetic, I think the best approach would be to introduce a separate overflow bit for each register, rather than (or as well as) having carry and overflow flags.  Writing the register to memory would trigger a software interrupt (or equivalent) if the overflow bit is set, as would using the register in any other inappropriate way.

Also, instructions for writing only part of a register to memory (the low byte, for example) should generate a software interrupt if the remainder of the register is not zero.

For example, on a 16-bit processor and using a hopefully self-explanatory syntax:

move 65535 -> a0     # register a0 is set to 65535 and the a0 overflow bit is cleared.
move a0 -> nat8 x    # write a0 to 8-bit memory location x;
                     # this causes a software exception because a0 doesn't fit in 8 bits.
add 1 -> a0          # a0 is now zero and the a0 overflow bit is set.
move a0 -> nat16 y   # this causes a software exception, because the overflow bit is set.
add a0 -> a1         # add the value of a0 to a1;
                     # this causes a software exception because the a0 overflow bit is set.

move 65535 -> a0    # a0 is set to 65535 and the a0 overflow bit is cleared.
move 65535 -> a1    # a1 is set to 65535 and the a1 overflow bit is cleared.
add 1 -> a0         # a0 is set to zero and the a0 overflow bit is set.
addc a0, 0 -> a1    # the a0 overflow bit is added to a1;
                    # the a0 overflow bit is now clear;
                    # a1 is now zero and the a1 overflow bit is set.
move a0 -> nat16 x  # zero is written to the sixteen-bit memory location.
move a1 -> nat16 y  # this causes a software exception.
clc a1              # but, if we were to clear the a1 overflow bit first ...
move a1 -> nat16 y  # then this would succeed.

Hopefully, this is clear.  There are details that need filling in, but I think nothing difficult.

Signed arithmetic is much the same, except for one edge case.  There are at least three different ways of distinguishing between signed and unsigned arithmetic; the simplest is distinct arithmetic instructions:

move 255 -> a0       # register a0 is set to 255 and the a0 overflow bit is cleared.
move a0 -> nat8 x    # this will write 255 to the 8-bit memory location x.
move a0 -> int8 x    # this causes a software exception because 255 doesn't fit in
                     # an 8-bit signed variable.  Note that this is a different
                     # instruction code to the above instruction.
adds 32767 -> a0     # a0 is now 33022 (meaning -32514) and the a0 overflow bit is set.
move a0 -> int16 y   # this causes a software exception because the a0 overflow bit is set.
move a0 -> nat16 y   # so does this.

move 65535 -> a0   # a0 is set to 65535 and the a0 overflow bit is cleared.
move 32767 -> a1   # a1 is set to 32767 and the a1 overflow bit is cleared.
add 1 -> a0        # a0 is set to zero and the a0 overflow bit is set.
addcs a0, 0 -> a1  # the a0 overflow bit is added to a1 using signed arithmetic;
                   # the a0 overflow bit is now clear;
                   # a1 is now 32768 (meaning -32768) and the a1 overflow bit is set.
move a0 -> nat16 x  # zero is written to the sixteen-bit memory location.
move a1 -> int16 y  # this causes a software exception.
clc a1              # but, if we were to clear the a1 overflow bit first ...
move a1 -> int16 y  # then this would succeed.

I mentioned an edge case.  This is adding a signed number to an unsigned number.  Consider this code:

nat32 total;
void add_to_total(int32 x)
  total = total + x;

Simple enough, but … do you do the addition with signed or unsigned arithmetic?  You can’t use unsigned arithmetic because x may be negative.  But you can’t always convert total to int32, either; it may be too large.

One solution would be to convert to int64, but this is inefficient if the processor is 32 bits.  Another is to rewrite the code:

void add_to_total(int32 x)
  if (x > 0) total = total + (nat32)x;
  if (x < 0) total = total - (nat32)(-x);

But now we’re introducing branches, which is also inefficient.
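Here’s that branchy rewrite in present-day C with the checks written out; abort() again stands in for “stop”.  One wrinkle: -x would itself overflow when x is INT32_MIN, so the negation has to be done in unsigned arithmetic:

```c
#include <stdint.h>
#include <stdlib.h>

/* Add a signed 32-bit delta to an unsigned 32-bit total, stopping
   (abort) on overflow or underflow. */
static uint32_t add_to_total(uint32_t total, int32_t x)
{
    if (x >= 0) {
        uint32_t ux = (uint32_t)x;
        if (total > UINT32_MAX - ux)
            abort();                 /* result wouldn't fit in nat32 */
        return total + ux;
    } else {
        uint32_t ux = 0u - (uint32_t)x;  /* magnitude of x, safely */
        if (ux > total)
            abort();                 /* result would be negative */
        return total - ux;
    }
}
```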

My suggestion is to introduce a new type, int33 (as well as int9, int17, and int65) which is that extra bit longer.  These types would only be valid as intermediate values in expressions, not as variables.  The code then becomes:

void add_to_total(int32 x)
  total = (int33)total + x;

The compiler is then free to deal with this in the best way available to it in the given instruction set.  Note that by requiring the programmer to explicitly request this special treatment, we avoid inefficiency in cases where the programmer expects total to always be small enough to convert to int32.  In these cases it might be wise for the compiler to generate a warning unless the programmer explicitly says this is OK:

void add_to_total(int32 x)
  total = (int32)total + x;
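On hardware with 64-bit arithmetic available, a compiler could implement the int33 version along these lines (sketched in C, with int64_t playing the part of int33):

```c
#include <stdint.h>
#include <stdlib.h>

/* int64_t standing in for the int33 intermediate: a 64-bit sum can
   hold any nat32 value plus any int32 value, so overflow can only
   appear when narrowing the final result back to nat32. */
static uint32_t add_to_total33(uint32_t total, int32_t x)
{
    int64_t sum = (int64_t)total + x;       /* the "int33" value */
    if (sum < 0 || sum > (int64_t)UINT32_MAX)
        abort();                            /* doesn't fit in nat32 */
    return (uint32_t)sum;
}
```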

Going back to the CPU, there are several ways a new design could help the compiler out in implementing int33.  The simplest would be to give the arithmetic registers a dedicated sign bit, so that register values are always int33 (or int65, or int129).  This would require separate instructions for loading signed and unsigned values from memory (so the CPU knows whether to sign-extend) but since this is already required when loading values smaller than the register size this probably isn’t a big issue.

I hope this is all clear enough.  Please email me or leave a comment if you have any questions.


2011 CWE/SANS Top 25 Most Dangerous Software Errors

July 1, 2011

The list can be found here.  The following is my take on what we (meaning society as a whole) should be doing to reduce the risks; remember, I (almost) never care about backwards compatibility so not all of my suggestions are very practical. 🙂

1. CWE-89: Improper Neutralization of Special Elements used in an SQL Command (‘SQL Injection’)

Developers need to stick to parametrized queries.  Programming languages and database interfaces should be designed so as to prohibit the construction and execution of ad-hoc query strings.  Programming languages should be developed that mix SQL (or SQL-like) functionality with traditional code, to make parametrized queries (or the equivalent) more natural.

Also, I suspect that putting the back-end code in the database itself (rather than, for example, on a web server) would prove a safer programming model; this can already be done to some extent with SQL stored procedures, but the language is clunky and you’re still using SQL procedure calls as the interface.

2. CWE-78: Improper Neutralization of Special Elements used in an OS Command (‘OS Command Injection’)

This is all about software that uses subprocesses to run command-line OS tools, although the same issue applies to the use of third-party software via a subprocess.  This one is easy: developers shouldn’t do that, ever.  Operating systems should ensure that there is always an alternative; future operating systems (if unencumbered by backwards compatibility issues) should eliminate exec and similar functions.  (Instead, a well-defined plugin mechanism should exist to allow software to cooperatively interact.)

3. CWE-120: Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)

Programming languages should be updated and/or modified to provide for safe pointer arithmetic.  It would be preferable if this was supported by the CPU.  Safe integer arithmetic is also essential.  In the meantime, developers need to do their best to be very very careful when dealing with buffers; the CWE discussion covers this in some detail.

4. CWE-79: Improper Neutralization of Input During Web Page Generation (‘Cross-site Scripting’)

HTTP and HTML need to go.  They aren’t even close to adequate for the way we use the web nowadays.  I’d like to see a new protocol, www://, with all the obvious changes (and perhaps a few not so obvious ones).  During the (no doubt extended) migration period, sites could be available in both flavors via automated conversion in one form or another.

5. CWE-306: Missing Authentication for Critical Function

Not much to say about this one; it mostly boils down to developer common sense.  It could be helpful if future internet protocols (i.e., either a hypothetical replacement for TCP or a replacement for or variant of SSL) came with built-in multiplexing functionality.

6. CWE-862: Missing Authorization

Much the same as #5, except that I suspect it is most prevalent in web applications, in which case my comments for #4 apply as well.

7. CWE-798: Use of Hard-coded Credentials

I think a mandatory death penalty for programmers who do this is probably the most reasonable approach.  Better programming models for database-backed applications (as per #1) may be of some help.

8. CWE-311: Missing Encryption of Sensitive Data

Built-in encryption support in operating systems and perhaps in network protocols.  It needs to be easier for programmers to use encryption, and in particular for them to use it correctly.  Windows has done quite well in this regard, I think.  In particularly sensitive situations, though, nothing beats expert assistance.

9. CWE-434: Unrestricted Upload of File with Dangerous Type

The underlying issue is lousy web server design, in particular the use of a general-purpose file system (rather than a dedicated one) for storing web content, including executable content.  This could be improved in combination with #4 or independently.

10. CWE-807: Reliance on Untrusted Inputs in a Security Decision

Largely developer carelessness, I suppose.  Better languages could help.  So could banning PHP.  Relates somewhat to #4 in many cases.  (The hypothetical www: protocol would certainly include login sessions as a native concept.)

11. CWE-250: Execution with Unnecessary Privileges

Operating systems (and perhaps languages) need to provide better support for reduced-privilege execution.  Future (backwards-compatibility-unencumbered) operating systems should run most applications with very limited privileges as a matter of course.

12. CWE-352: Cross-Site Request Forgery (CSRF)

See #4.

13. CWE-22: Improper Limitation of a Pathname to a Restricted Directory (‘Path Traversal’)

Again, the underlying problem here is the use of a general-purpose file system for storing potentially malicious data (or data that may be maliciously retrieved).  Either operating system support or good open-source code libraries would be helpful.  In the case where the content is part of the application, development tool support would be helpful.  (That is, development tools should allow programmers to easily incorporate read-only embedded file systems into an application.)

14. CWE-494: Download of Code Without Integrity Check

Software shouldn’t usually be downloading code of its own accord. On the rare occasions when it is necessary, the developer needs to take great care.  Better operating system support for the appropriate cryptographic tools might help.

Operating systems should also include built-in mechanisms for third-party software to stay up to date.  This would help significantly with what is probably the most common scenario.

15. CWE-863: Incorrect Authorization

I’m guessing this is only really common in web-based applications, so once again, see #4.

16. CWE-829: Inclusion of Functionality from Untrusted Control Sphere

Partly developer laziness and/or carelessness, but better development tools could help.  In the case of a web application, see #4.

To expand on this a little: I’m not sure whether CWE/SANS consider DLL hijacking (aka Binary Planting) in this category but it’s certainly analogous; the proper solution to DLL hijacking is to use static libraries (preferably source code!) instead.  I imagine there is a similar option in most other scenarios in which this issue comes up, but better development tools may be necessary.

17. CWE-732: Incorrect Permission Assignment for Critical Resource

Operating systems should provide better support for developers when it comes to storing critical resources.  Developers shouldn’t always need to assign permissions explicitly.  Better use of and support for embedded file systems could also help.  Ideally, all applications should contain only a single file (with an embedded file system or the equivalent if necessary) and operating systems and developer tools should assist programmers in this goal.  Data stores such as the Windows registry and application data stores should be automatically divided into segments that are only accessible by the application that created them.

18. CWE-676: Use of Potentially Dangerous Function

Better languages and run-time libraries.

19. CWE-327: Use of a Broken or Risky Cryptographic Algorithm

As per #8, operating system support can help.  Developer education is essential.

20. CWE-131: Incorrect Calculation of Buffer Size

See #3, as this is just the same problem from the other end.

21. CWE-307: Improper Restriction of Excessive Authentication Attempts

I don’t have anything more to suggest on this one.  (Shocking, isn’t it?)  Developer education (such as this list!) is the key.

22. CWE-601: URL Redirection to Untrusted Site (‘Open Redirect’)

See #4.

23. CWE-134: Uncontrolled Format String

Special case of #18, same solution.

24. CWE-190: Integer Overflow or Wraparound

Already mentioned in #3, modifying languages (or providing new languages) to support safe integer arithmetic is essential to safe computing, and CPU support would be very helpful.  I’m planning to write more about this later.

25. CWE-759: Use of a One-Way Hash without a Salt

Developer education.  Operating system support and good open-source libraries would help.  Since this is probably most prevalent on the web, the hypothetical www: protocol could probably help out as well by mandating authentication methods that don’t actually give the password to the web site in the first place.


Safe Computing: Heterogeneous Multi-Core

June 10, 2011

It was brought to my attention recently that it’s been a long time since I’ve posted anything in my Safe Computing series.  Today I want to talk about an idea for post-modern CPUs that I’m going to call Heterogeneous Multi-Core (HMC).  As always, this is strictly a theoretical exercise; I’d be delighted if CPU designers picked up my ideas, but I don’t actually expect it to happen, so don’t bother pointing out that it won’t.  On the other hand, please do point out any logical errors or oversights on my part.

Heterogeneous Multi-Core means exactly what it says: the CPU has more than one type of core.  Such CPUs may well exist already, but the best known lines of CPU are homogeneous, that is, each core is (more or less) the same as the others.  (I don’t count integrating a GPU into a CPU because this is such a trivial case, and because if I understand correctly the CPU and GPU are still logically distinct even if they are on a single chip.)  Note that in this context I’m talking about differences as seen by a software developer, rather than hardware differences, although of course the one largely determines the other.

There are all sorts of possibilities within this general concept.  For example, you could have cores of varying speeds (in fact I gather this is already done).  I’m mainly interested in possibilities that would help with Safe Computing, though, so I’m going to concentrate on that.

Master Core

The master core (or perhaps cores) runs the top-level kernel.  The key idea here is that each master core determines all execution context for the cores under its command.  The various servant cores don’t have a kernel mode; everything that would normally be done in kernel mode is instead handled by instructions from the master core over a dedicated channel.

In principle, the master core could run the OS kernel, but I think it would be preferable for it to run a hypervisor instead.  Also, to minimize compatibility issues, the hypervisor code should be part of the CPU firmware rather than being a third-party product.  You’d need some kind of message queue from the servant cores so that the OS kernel(s) can give the hypervisor instructions.

The master core should use an instruction set suitable for the purpose and the hypervisor should be single-threaded.  The source code should also be available to the public so it can be analyzed for possible flaws.

Hypervisor device drivers for on-board devices could be included in the motherboard firmware.  (These wouldn’t be run on the master core itself, of course.)  Similarly, the motherboard can provide a base OS in firmware to serve the user interface functions currently provided by the BIOS as well as those normally provided by a host OS or hypervisor console.

I’m guessing that each master core could handle at least 64 servant cores, and quite possibly many more than that.  So it seems likely that for the time being only one master core would be needed.  This would be preferable, because it simplifies the hypervisor design significantly.

Central Processing Cores

I’m going to call servant cores intended for general processing, i.e., running the OS and applications, Central Processing Cores, or CPCs for short.  As previously discussed, these cores wouldn’t have any form of kernel mode.  Instead, they would receive instructions from the Master Core (MC) to set security context, load or save register values, start, stop, and so on.

One useful optimization would be for the CPCs to have two sets of registers – somewhat similar to hyperthreading – so that all the information necessary for the next thread to execute can be loaded into the core ahead of time.  When the MC issues the instruction to switch context, the core could begin executing the new thread almost immediately.  The register contents for the old thread could then be lazily saved back to main memory before setting up for the next context switch.

As a special exception to the “no kernel mode instructions” rule, a halt command could be available to code running on the CPC.  If the MC has already configured and approved the next context switch, that could happen immediately; if not, the CPC would halt execution until instructed by the MC to continue.

x86/AMD64 Cores

Although I don’t normally talk about backwards compatibility in my “Safe Computing” posts, this particular possibility is too significant to overlook.  An x86 core would implement the user mode parts of the x86 instruction set.  It should be simpler than a “real” x86 core because it doesn’t need kernel mode.  An AMD64 core would implement the AMD 64-bit instruction set.  It might be preferable to provide cores that can do either, but I’d leave this to the experts to decide. 🙂

The theory here is that operating systems could be ported to the HMC CPU while retaining the ability to run existing PC applications.  That is, the operating system would need to be ported, but the applications wouldn’t.  Windows already supports multiple CPU architectures, so it should be (relatively!) simple for Microsoft to port it to HMC.  I’m not sure about Linux, but I know there is already support for paravirtualization under Xen, so one option would be to port Xen, or, rather, re-implement the Xen guest machine interface.

It would also be desirable to be able to run unmodified PC operating systems in virtual machines.  This would require a bit more work.  Conceptually it would be cleaner to handle the kernel mode parts entirely (or almost entirely) in software.  There are some instructions which exist in both kernel and user mode but behave differently, so these would need to be implemented carefully to avoid causing problems.  We would also need x86-like address space mapping, which is a shame, but would probably be worth it.

In summary, some additional hardware assistance might be appropriate, but only if it doesn’t complicate the cores too much.

Non-x86 Processing Cores

Just because we want some x86/AMD64 cores for running legacy applications doesn’t mean we shouldn’t have better instruction sets available as well.  I keep meaning to write about some of the ways a CPU instruction set could help make software more reliable, but so far I’ve only posted about protecting the flow of execution.  I’ll try to do better in the coming months.

Since I mentioned it already, I will point out that while we of course need to be able to restrict which parts of memory the CPC can read and/or write to, we don’t necessarily need a fully mappable per-process address space of the sort x86 CPUs provide.  Once again, I’ll try to discuss this in more detail later this year.

Kernel Cores

There could be some advantage to having a separate core type designed specifically for OS kernels, but my ideas on this front are still pretty unformed.  I’d probably insist on a single-threaded design, so that wouldn’t work for ported operating systems. 🙂

IO Cores

When it comes to the hypervisor’s device drivers, we’re free to insist on a new instruction set, since architectural differences would almost certainly make porting existing drivers infeasible anyway.  (Also, we’re talking about a brand new hardware platform, so it seems likely that most of the core devices will be new designs.  Conventional motherboard designs are fairly unsatisfactory from a security standpoint – with luck, the subject of a future post – so that further reduces any risk from changing the instruction set.)

I don’t really have many solid ideas about IO core design either, but I’m of the opinion that it would be beneficial to use a customized instruction set.  Again, the drivers should be single-threaded.  We’d need an efficient way of passing packets of data between related drivers, including both hypervisor and OS-level drivers.  Some IO cores might be dedicated to running a single driver, i.e., without multitasking, which I think would make programming some sorts of drivers a whole bunch easier.

I’d also want to investigate whether we could eliminate the need for DMA by using dedicated IO cores with a suitable instruction set and direct access to the IO bus(es).  As well as potential security benefits, if we assume that there will be only a single multi-core CPU per motherboard, this would mean that only the CPU would need to talk to the RAM, which has to simplify all sorts of issues around caching and locking memory access.

Multimedia Cores

It might be useful to have a cheap and simple core designed specifically for applications that are generating sounds; it shouldn’t need to be all that fast, since audio frequencies are 20kHz or lower, although you’d need enough power to process, say, MP3 data in real-time.  (I presume dedicated hardware support would speed this up.)

The main idea here is to ensure that audio applications aren’t being interrupted arbitrarily by other tasks, so as to reduce pops and crackles and to make it possible to minimize latency.  A direct channel from the audio multimedia core to the audio IO core might be sensible – personally I’d be inclined to bypass the need for audio device drivers altogether and just have a couple of analog output pins on the CPU, but I imagine that would horrify all the audiophiles. 🙂

Video multimedia is a different story.  I’m not sure that a completely distinct instruction set would be sensible, but perhaps an extended instruction set to provide hardware assistance to codecs could be provided on only some of the CPCs?  On the other hand, the OS would then have to keep track of which threads needed which cores, which might be more trouble than it was worth.

GPU Cores

Mentioned only because it is so obvious.  I don’t have the background knowledge to weigh in on the argument about whether separate or integrated GPUs are better, although I do want to see standardization on the instruction set.

Well, I guess that’s all.  Thanks for reading.


Safe Computing: No Rebooting, Please

July 25, 2010

I recently had an interesting discussion with Michal Zalewski over at lcamtuf’s blog concerning disclosure policies.  This wound up reminding me of something that’s always been obvious but that I haven’t thought about much recently: the effect that having to reboot in order to install updates has on update take-up.  Given the speed with which vulnerabilities are now weaponized and the increased number of zero-day attacks, this is more relevant today than ever before.

One of the tasks I take care of at work is monitoring the deployment of Microsoft security updates on machines within the faculty’s Windows domain.  We deploy Microsoft updates using WSUS, which also keeps track of the status of the machines for us.  Unfortunately I haven’t kept any statistical data over the past however-many years, but my subjective impression is that our users are rebooting far more promptly than they used to; this may partly be due to increased awareness of security risks, but I suspect that the regular monthly update cycle plays a big part.  If you know you’re going to need to reboot once a month, and only once a month, there’s no point in putting it off.  If you might be asked to reboot again the following day, you’re less motivated to get it over with.

Even so, the problem wouldn’t arise if updates could be installed without having to reboot.  So, is this possible?  Well, yes – but a qualified yes.  This isn’t easy to do, even if you’re given a clean slate to work with.

Firstly, we need to think a bit more about what we mean when we say “reboot” in this context.  Technically, you’re only rebooting if the OS shuts down and hands control of the system back to the BIOS – to all intents and purposes, the same as switching the computer off and back on again.  If you took this definition literally, it wouldn’t be too hard to avoid rebooting after updates are installed; for example, you could run the OS inside a virtual machine, and let the hypervisor retain control of the hardware while the OS is reinitialized.  This would probably save some time, since a virtual reboot is typically much quicker than a physical reboot, but it wouldn’t actually help; the user would still have to close and reopen all the applications and documents they were working with, and that’s what they really want to avoid.

Similarly, it isn’t enough to just think about OS updates and whether you can keep an application running (in some sense) while the OS reinitializes itself underneath it.  You also have to make sure that you can update the application itself without interrupting the user’s work – oh, it probably won’t hurt if the application is unresponsive for a few seconds, but there’d better not be any other visible disruption.  All of the user’s documents need to remain open and in the same state they were in before the update; if it’s a web browser, all the web pages that were open before need to remain open, session cookies need to remain in place, and the internal state of any client-side script needs to be retained.  That’s not going to be easy for the application developer to support, and I suspect that there’s a limited amount the OS can do to help out.

For updating the OS itself, modular design would seem to be key; something like Minix could conceivably be adapted to allow individual modules to be updated one at a time, with both old and new versions of an updated module running simultaneously during the handover to allow the clients to be (transparently, if at all possible) migrated from one to the other.  A similar approach might be possible if the microkernel itself needs to be updated, although if hardware-based security measures have been taken to protect the microkernel from modification it might be necessary to put the OS into hibernation and perform a physical reboot.
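To make the handover idea a bit more concrete, here’s a toy Python sketch of the kind of mechanism I have in mind.  To be clear, this isn’t how Minix actually does anything – the class names and structure are entirely invented for illustration – but it shows the basic shape: the new module version starts accepting connections immediately, while existing clients are migrated over one at a time before the old version is retired.

```python
# Conceptual sketch only: a service registry that swaps in a new module
# version while migrating clients away from the old one.  All names here
# (ServiceRegistry, Module, hot_swap) are invented for illustration.

class Module:
    def __init__(self, version):
        self.version = version
        self.clients = set()

    def handle(self, request):
        # Stand-in for whatever service this module provides.
        return f"v{self.version}:{request}"


class ServiceRegistry:
    def __init__(self, module):
        self.current = module

    def connect(self, client_id):
        self.current.clients.add(client_id)
        return self.current

    def hot_swap(self, new_module):
        old = self.current
        self.current = new_module           # new connections go to the new version
        for client in list(old.clients):    # migrate existing clients one by one;
            old.clients.discard(client)     # in a real system the old version would
            new_module.clients.add(client)  # keep serving anyone not yet moved
        assert not old.clients              # old version can now be retired
        return old
```

In a real system, of course, the hard part is the migration step itself – each client’s session state has to survive the move – which is exactly the problem the next paragraph runs into for applications.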

I strongly suspect it would be impossible to reliably achieve any equivalent functionality in a more conventional OS design such as Windows or Linux.  [*]

Applications could perhaps be updated in a similar way; allow the old version of the application to continue running in parallel with the new version, then migrate each open document, window, or related set of windows from one to the other in turn.  The internal state of each document/window/set would have to be transformed into a version-independent representation for the handover.  This isn’t going to be easy for the programmer, but there would be some fringe benefits: in addition to allowing transparent updating, a user’s idle applications could also be individually hibernated for more efficient multitasking, or a user’s entire logon session could be transparently migrated from one physical machine to another.
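Here’s a minimal sketch of what that version-independent handover might look like, assuming a trivial text editor whose internal representation changes between versions.  Everything here is invented for illustration: the old version exports its state as plain data, and the new version – which stores its document differently – rebuilds an equivalent state from it.

```python
# Hedged sketch of the handover idea: each version can export its open-document
# state to a plain, version-independent dict and rebuild from it.  Class names
# and field layouts are invented for illustration.

class EditorV1:
    """Old version: stores the document as one string, cursor as a flat offset."""
    def __init__(self):
        self.text = ""
        self.cursor = 0

    def export_state(self):
        # Version-independent representation: plain data only, no internal objects.
        return {"format": 1, "text": self.text, "cursor": self.cursor}


class EditorV2:
    """New version: stores the document as a list of lines, cursor as (line, column)."""
    def __init__(self):
        self.buffer = []
        self.cursor = (0, 0)

    @classmethod
    def import_state(cls, state):
        new = cls()
        new.buffer = state["text"].split("\n")
        # Translate the old flat cursor offset into (line, column).
        offset = state["cursor"]
        for line, content in enumerate(new.buffer):
            if offset <= len(content):
                new.cursor = (line, offset)
                break
            offset -= len(content) + 1   # +1 for the newline character
        return new
```

The awkward part, as the example hints, is that every internal concept (here, the cursor) needs a representation that doesn’t depend on either version’s data structures – and a real application has a great many more such concepts than this one does.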

I’ll give this more thought as time permits, but apart from some more or less trivial commentary on how programming languages could be modified to make this sort of thing easier, I’m not sure there’s much more to be said.  Hopefully one day someone will be able to actually try to make something like this happen – come to think of it, similar issues must come up in the context of cloud computing, so perhaps there’s already a body of work I’m not aware of – and practical experimentation is much more likely to result in useful ideas than mere armchair theorizing!


[*] Windows has theoretically had support for hotpatching for a number of years, but as far as I know none of the security updates released so far have actually implemented it.  I believe you can also buy support for hotpatching Linux systems, but if my understanding of how this is implemented is correct this only allows certain kinds of updates; enough (in most cases) to keep a system secure without having to reboot it for each individual vulnerability, but not a general-purpose solution.

Safe Computing: Moving Parts Are Bad

June 28, 2010


This is really more about robust computing than safe computing per se, but of course the one is pretty much a prerequisite for the other.

The excellent Susan Bradley recently pointed out this interesting blog post, describing one of the many ways in which a careless or misinformed administrator can break Windows.  In short, one of the built-in operating system services had been reconfigured to use a different security context than the one it was supposed to.  The only way the problem was identified was by comparing the entire registry to that of a clean system, and, presumably, sifting by hand through the huge number of irrelevant differences to find the relevant one.

Now, David is certainly correct in saying that changing the security context in which a built-in service runs is a bad idea, but he doesn’t ask the obvious question: why the heck can you make such a change in the first place?

Answer: because the service configuration, including the security context, is stored in the Windows registry.  And anything that’s stored in the registry can be changed, whether it should be configurable or not.

All together now: This Is A Bad Idea.

The registry isn’t the only part of Windows that has too darn many “moving parts”, however.  I’ve just completed a basic Windows 7 installation on a virtual machine, and it has over 32,000 files.  That’s before installing any applications, mind you.

Now, Windows isn’t as vulnerable to loss of system files as it used to be, but that’s mostly because of mechanisms that make it harder for a file to be improperly removed or modified and help repair damage if it does occur.  This is better than nothing, but it isn’t a good solution.  A good solution would be to have fewer separate files in the first place.

For the record, Unix operating systems (including Mac OS X) don’t have a registry, but they do have too darn many files, just like Windows.

So how many files is too many?  The obvious answer here is “two” but I think that would be taking a good idea more than a bit too far!

Ideally, the OS itself would indeed be contained in but a single file with a digital signature.  As well as the code, this would include the default settings for everything that is configurable.  In addition to reducing the number of moving parts, this would make it a whole lot easier for external security tools to check for any tampering with the OS.

Local configuration settings – any setting which has actually been configured – would be in another file.  It would probably also be sensible to keep backup copies of this file to allow a previous configuration to be easily restored.

Any device drivers not included in the base OS would take up one file apiece, as would installed applications.  Similar to the OS, each of these would have a corresponding configuration file containing any non-default settings.

Finally, depending on the hardware one or more additional files might be necessary for the boot process, virtual memory and hibernation support.  Oh, and I guess the user might want to save some files too. 🙂

OK, so much for the file system.  Going back to the registry, I should admit that making almost everything configurable isn’t a completely silly idea: it allows for very flexible troubleshooting, which is valuable.  To accommodate that, we should ensure that configuration settings (for the OS, the device drivers, and applications) are stored in a manner that allows there to be lots of different settings with a structured namespace, and that is easy for the programmer to access.

In other words, something very much like the Windows registry.  However, there are a few critical points that need to be kept in mind: first and foremost, the system must always know what the default value is for any setting; secondly, it must be easy to enumerate those settings which have been configured [*]; and finally, there must be a separation of domains, so that (for example) a third-party application can’t change the OS settings but only its own.

Additionally, there should be a clear distinction between settings which are normally changed and those which are not, so that it is easy to enumerate abnormal settings.  If the Windows registry provided this functionality, the Case of the Mysterious Access Denied could have been solved almost straight away, because this would be one of the very first things any trouble-shooter would check.  This should perhaps be a sliding scale, starting with settings that have to be defined (the timezone, perhaps) and moving on to settings that are usually defined, those that are uncommon but supported, and through to settings intended only for advanced troubleshooting performed under expert advice.

It would also be preferable if it were not possible to (unintentionally) configure a setting which does not exist or configure a setting to a syntactically meaningless value.
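To pull those requirements together, here’s a small Python sketch of a settings store along the lines I’m describing.  All the names are invented, and a real implementation would obviously be far more involved, but it illustrates the key points: every setting is declared up front with a known default, configured values live separately from defaults and can be enumerated directly, each owner can only touch its own domain, and unknown settings or syntactically meaningless values are rejected at the point of configuration.

```python
# Illustrative sketch of the settings store described above; all names are
# invented.  Defaults are declared up front, overrides live in a separate
# layer, and each owner (OS, driver, application) has its own domain.

class SettingsStore:
    def __init__(self):
        self.defaults = {}   # (domain, key) -> (default value, validator)
        self.overrides = {}  # (domain, key) -> configured value

    def declare(self, domain, key, default, validator):
        # Every setting must be declared before it can be configured,
        # so the system always knows the default value.
        self.defaults[(domain, key)] = (default, validator)

    def set(self, caller, domain, key, value):
        if caller != domain:
            # Separation of domains: a third-party app can't touch OS settings.
            raise PermissionError(f"{caller} may not modify {domain} settings")
        if (domain, key) not in self.defaults:
            # Can't configure a setting that doesn't exist.
            raise KeyError(f"no such setting: {domain}/{key}")
        default, validator = self.defaults[(domain, key)]
        if not validator(value):
            # Can't configure a setting to a syntactically meaningless value.
            raise ValueError(f"invalid value for {domain}/{key}: {value!r}")
        self.overrides[(domain, key)] = value

    def get(self, domain, key):
        return self.overrides.get((domain, key),
                                  self.defaults[(domain, key)][0])

    def configured(self):
        # Enumerate only the settings that have actually been changed --
        # exactly what a troubleshooter wants to see first.
        return dict(self.overrides)
```

Note that `configured()` naturally preserves the distinction flagged in the footnote: a setting explicitly configured to its default value still shows up, because it lives in the overrides layer.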


[*] As a minor point of fine-tuning, note that there is a distinction between a setting which is not configured, and one which is configured but happens to be equal to the default value.

Safe Computing: Protecting the Flow of Execution

June 18, 2010

Most (though by no means all) exploitable security vulnerabilities in traditional client applications involve an attack disrupting the application’s flow of execution.  That is, instead of running the code that was intended, the CPU is directed somewhere else: perhaps just to a random memory address, in which case the application will usually crash, or perhaps to code provided by the attacker.

There are various techniques used to make it harder for an attacker to get his or her own code executed: for example, modern CPUs allow data pages to be marked not executable.  Unfortunately if an attacker can cause the CPU to jump to the wrong place, it is often possible to use legitimate code in malicious ways – this is known as a return-to-libc attack.  To counter this, a technique called address space layout randomization may be used, but even this is not always effective.

The most common cause of vulnerabilities involving the flow of execution is a buffer overflow on the stack, overwriting the return address of the affected subroutine with data of the attacker’s choosing.  Now, there are ways to reduce the risk of buffer overflows, and stronger techniques are possible if we adopt a blank-slate approach, but it is unlikely that the risk can be eliminated entirely.  Therefore, I suggest that the best approach would be to introduce separate code and data stacks, which I’m going to dub the dual-stack model.

In the simplest form of the dual-stack model, the code stack would contain fixed-size frames and the data stack variable-sized frames.  Local variables and call parameters would be on the data stack; the code stack would hold the return address and a pointer to the data stack frame associated with the code at the return address.  Ideally, the code stack would be located in a separate address space, or at least at an unpredictable address.

With a dual-stack model, buffer overflows on the data stack might corrupt the content of other local variables, including in parent procedures, but could not directly cause a flow-of-execution failure.
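A toy simulation may make the dual-stack model clearer.  The following Python sketch (entirely invented; a real implementation would of course live in the CPU and the ABI, not in a scripting language) keeps fixed-size frames of return addresses on one stack and variable-sized frames of locals on the other, so that an overflowing write into the data stack can clobber other locals but can never redirect a return.

```python
# Toy simulation of the dual-stack model (invented for illustration).
# The "code stack" holds fixed-size frames: (return address, base of the
# caller's data frame).  Variable-sized locals live on a separate data stack.

code_stack = []   # fixed-size frames: (return_address, data_frame_base)
data_stack = []   # variable-sized frames: locals and parameters

def call(return_address, locals_):
    # Push a code-stack frame and the callee's locals onto the data stack.
    code_stack.append((return_address, len(data_stack)))
    data_stack.extend(locals_)

def overflow_write(offset, values):
    # Simulates a buffer overflow: writes past the end of a local buffer.
    # This can corrupt other locals on the data stack -- including in parent
    # frames -- but return addresses on the code stack are out of reach.
    for i, v in enumerate(values):
        data_stack[offset + i] = v

def ret():
    # Pop the code-stack frame and discard the callee's data frame.
    return_address, base = code_stack.pop()
    del data_stack[base:]
    return return_address
```

The point of the simulation is the failure mode: after `overflow_write` scribbles across frame boundaries, `ret()` still returns exactly the address that was pushed, because nothing on the data stack can reach it.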

The other main situation in which the flow of execution is threatened is where a function pointer, interface, or polymorphic class is used.  From the CPU’s point of view these all boil down to jumping to a procedure whose address is contained in a variable rather than being fixed.  If the contents of the variable in question have been manipulated by an attacker, the correct flow of execution will be disrupted.

I don’t see any comprehensive way of preventing this, but the impact can be considerably mitigated if the code (or the CPU) can verify that the target address is, if not actually the correct procedure, at least a legitimate one.  This would require the variable to point not directly to the code, but into a trusted table containing a list of candidate procedures.  (Obviously this table would have to be located in read-only memory or nothing would be gained!)

As a further refinement, separate tables could be provided for different groups of function pointer calls.  In C, each table would correspond to a function signature; in object-oriented code, to a class or interface.  Only those procedures which are actually called in this way would need to be included in these tables.
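The mechanism is simple enough to sketch in a few lines of Python (names invented; in a real system the table would be built by the linker and placed in read-only memory, and the bounds check would ideally be done by the CPU).  A “function pointer” becomes an index into a per-signature table, so a corrupted value either lands on another legitimate procedure with the same signature or is rejected outright.

```python
# Sketch of call-target verification through a trusted table; all names are
# invented for illustration.  The indirect "pointer" is an index into the
# table, never a raw code address.

def open_handler(path):
    return f"open {path}"

def save_handler(path):
    return f"save {path}"

# One table per function signature, built at link time; in a real system
# this would live in read-only memory so an attacker cannot extend it.
FILE_HANDLERS = (open_handler, save_handler)

def indirect_call(table, index, *args):
    # A corrupted index is caught here instead of sending the CPU to an
    # arbitrary address of the attacker's choosing.
    if not 0 <= index < len(table):
        raise RuntimeError("corrupted call target rejected")
    return table[index](*args)
```

This doesn’t stop an attacker who can swap one valid index for another – calling `save_handler` when `open_handler` was intended may still be harmful – but it does rule out jumps to attacker-supplied code or to arbitrary mid-function addresses.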

It should be noted that these tables would have to be built at link time, which would be problematic for dynamic libraries.  I do not see this as a major issue, because one of the first things I’d do in designing a Safe Computing environment would be to eliminate dynamic libraries anyway – they cause reliability problems, and most of the reasons for using them are no longer compelling.  I’ll talk about this another time.

Want Safe Computing? Start Again.

June 7, 2010

Michal Zalewski recently talked about the reasons why security engineering is hard, and why it hasn’t had a great deal of success in making computer users safer.  (To be fair, processes have certainly improved; if it seems that things are getting worse, this is probably simply because the overall level of exposure to any vulnerabilities that may be present has increased so rapidly as to overwhelm any reduction in the number of vulnerabilities that may have been achieved.)

Mr. Zalewski’s article has attracted a considerable amount of interest and a fair bit of comment, including this response by Charles Smutz.  I was struck by this comment in particular: “While our users need to improve their resilience to social engineering, in many cases, targeted attacks are done so well, that I couldn’t fault a user for being duped.”

Now, what I’m about to say is so obvious that it probably didn’t occur to either Mr. Zalewski or Mr. Smutz, experts in their field, to mention it: if we can’t stop users from being duped, it might be a good idea to limit the damage that a duped, or otherwise dangerous, user can do.

So if this is so obvious, why am I mentioning it?  Well, mostly because it leads directly to the more interesting question: can we limit the damage that a duped user can do?

The interesting thing about this question is that while it seems obvious that it should be possible, to at least some extent, it is often assumed, without much further clarification or justification, that it isn’t.  As an example, let’s consider the simplest possible case – a user opens a malicious executable file sent by email.  This is Law Number One of Microsoft’s 10 Immutable Laws of Security; see also this more recent discussion, along with part 2 and part 3.

The key part of Microsoft’s First Law is this sentence:  “Once a program is running, it can do anything, up to the limits of what you yourself can do on the computer.”

The only problem with this is that it isn’t actually true – or, more precisely, it doesn’t need to be true.  In fact, to a limited extent, Microsoft changed this with Windows Vista’s much maligned User Account Control system.  If you are an administrator on the system, and you run a program without elevation, it can’t do everything you can do.  So while the First Law is (more or less) true for existing operating systems, it is hardly immutable – it’s just a design feature of the OS.

It’s entirely possible to design an operating system in which each open document is associated with a process that can do nothing but modify that particular document.  Oh, there’s a lot more that you would need to do before you could really say that the OS was resistant to user error, and the OS would have to provide facilities for the other things said process would need to do (such as reading the user’s preferences, for one example) but the basic idea is simple enough.  I don’t doubt that there are alternative approaches of the same general class.
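A loose sketch of the idea, in Python (everything here is invented for illustration – a real OS would enforce this at the process and kernel level, not with a class): instead of an ambient-authority file API, each per-document process receives a capability object tied to a single document, and simply has no way to name any other file.

```python
# Hedged sketch of the per-document process idea; names are invented.
# The broker (trusted) chooses the document; the worker (untrusted) can
# do anything to that one document and nothing else.

class DocumentCapability:
    def __init__(self, path):
        # The broker chose this path; the worker never sees a filename API,
        # only this object.  (A real OS would enforce that, of course.)
        self._path = path

    def read(self):
        with open(self._path) as f:
            return f.read()

    def write(self, data):
        with open(self._path, "w") as f:
            f.write(data)


def run_worker(capability, untrusted_transform):
    # Even if untrusted_transform is malicious, the worst it can do is
    # mangle this one document -- the capability is its entire world.
    capability.write(untrusted_transform(capability.read()))
```

On a conventional OS you can approximate this with sandboxing and passing an already-open file descriptor to a child process; the difference I’m proposing is that it would be the only way applications work, rather than an opt-in hardening measure.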

It should be noted that, in addition to resisting user error, such an OS would be largely resistant to security problems caused by application bugs.  Only certain classes of application (mainly network-oriented apps) would be exposed to attack, and in most cases the infection wouldn’t survive a system reboot.

So why isn’t this class of approach taken seriously?  Mainly, I think, because it breaks the programming model of pretty much every piece of application software ever written.  Porting existing code to such an OS would be a significantly greater challenge than a normal port, although no doubt emulation techniques could be developed that would mitigate this somewhat.  Still, developing such an operating system along with software to run on it would be a gargantuan effort which nobody seems likely to want to make.

On the other hand … if, as a society, we decided that we really did want safer computing systems and were willing to spend the money, I’m personally convinced that this would be the best way to go about it.  Of course, it isn’t just operating systems that would need to be redesigned.  Over the next few weeks I plan to take a break from reality and describe some of the ways in which computers could (perhaps) be made safer, if only financial constraints and backwards compatibility weren’t an issue.