The Halting Problem - Why you should care

Courtesy of the alt.comp.virus newsgroup participants.
(These "anti-malware" pages are the result of a continuing cooperative effort.)

Anti-Virus Main Menu
Main Menu

The Halting Problem - Why you should care
By Kurt Wismer

The Halting Problem is very technical and I'm certainly not going to do the technical aspects of it any justice here (and the technically minded really aren't the audience I'm writing this for anyways). The short version is that The Halting Problem tells us what is not possible in the computing world.

The basic idea goes like this: there is no set of steps a person or computer can follow that will always determine in a finite amount of time if an arbitrary program will halt (terminate/exit/stop running). Since following steps (instructions) is all a computer can do, this is significant for computers and computer software.

The reason this is interesting and useful to us is that we can apply it to other things; by that I mean that if we can show that doing X is reducible to The Halting Problem then we've effectively proven that it is impossible to do X.

Now, lets follow a simple progression: creating a set of steps that will always determine in a finite amount of time if an arbitrary program performs function Y is reducible to The Halting Problem. All you have to do is say that function Y is a halt function and you'll see it's trivially true. There's nothing special about code that causes a program to exit that would make it difficult to find, the problem is determining if it will ever get executed. In reality, The Halting Problem is a special case of this more general principle.

Creating a set of steps that will always determine in a finite amount of time if an arbitrary program is a virus is reducible to The Halting Problem. To be a virus the program has to perform the function of self-replication and since you can't always determine (by following a set of steps) if an arbitrary program performs that function, therefore you can't always determine if that program is a virus.

Creating a set of steps that will always determine in finite time if an arbitrary program performs any other virus-like functions is reducible to The Halting Problem... I'm sure by now you can guess why.

This probably looks pretty bad. It seems like we can't tell very much at all about arbitrary programs, and not only is that absolutely true but that's also the point. The Halting Problem is important because it shows us what can't be done so that we can separate the impossible claims from the possible ones, so that we can understand what is preventing Anti-Virus developers from finding all possible viruses, and so that when someone comes along and says "Well why can't it just look at the code to find out what it does" we can know why that won't work. Knowing what isn't possible is probably the best tool there is when it comes to weeding out the hype and identifying snake-oil in the Anti-Virus field.

Additionally, knowing that we have these limitations regarding the determination of what arbitrary programs do tells us that the virus (trojan/spyware/adware/etc) problem cannot be solved by purely technological means with our current computing paradigm. While changes to the current computing paradigm have been suggested for the purposes of better managing the problem, those changes would result in computers that are much more restrictive to users, developers, and probably innovation, and whether such changes will actually have their intended effect on the virus problem remains a matter of speculation.

It's not all doom and gloom on the Anti-Virus front, however. While The Halting Problem does effectively say that perfect virus detection is impossible, it doesn't say anything about how close to perfect we can get. In practice, known virus scanning can theoretically detect all known viruses (since they're known it's generally a much simpler task to tell a program specifically what to look for). With the addition of heuristic scanning many or even most derivative viral works can be detected whether they're already known or not. Layering sandbox technology on top of all that brings us even closer (but not all the way, since sandboxing is not a solution to The Halting Problem) to perfect detection. It may not be exactly perfect detection but in practice it has been good enough an overwhelming majority of the time.


(Kurt Wismer - January, 2006)

© Claymania Creations 2001 - 2009. All rights reserved.

Updated: January 4, 2006