A Little Bit Of (Computability) Theory

When most people think of Computer Science, the first thing that pops into their head is an image of some greasy kid in a black t-shirt and sandals tippity-tappity typing away. While programming is a fundamental tool in the computer scientist's kit, it's no more the core of Computer Science than glassware is the core of Chemistry. In truth, the fundamental quality of all rigorous sciences, from the abstract (Mathematics) to the "hard" (Physics, Chemistry) to the "fuzzy" (looking at you, Psychology!) is that they all postulate fundamental truths about their area of study.

In Computer Science, these fundamental truths are based on assertions about the ability of a computational system to be able to solve problems. There are some problems which are inherently unsolvable by computational systems, either due to physical limitations (memory, processing speed, etc) or theoretical boundaries. For example, let's take three problems which we might want to give our computer to solve:

  1. We'd like to know how much time a ball dropped from rest takes to fall 3 meters on Earth.

  2. We'd like to know what the shortest path between the 100 largest cities in Germany is that passes through each city exactly once and begins and ends in the same city.

  3. We'd like to know, once we have written a program, whether or not it will run for a finite period of time or enter into an infinite loop.

These three problems illustrate three very different classes of computability. The first is in the class commonly referred to as P: Deterministic Polynomial Time problems. These problems execute in an amount of time which has an upper bound given by some polynomial in the number of inputs. In simple terms, this means that the amount of computation required to solve a problem doesn't grow too quickly for more inputs. These include constant time problems, which can be solved in the same amount of time for any number of inputs (think "find the nth number in an array"); logarithmic time problems (Quicksort); and polynomial time problems (Bubble Sort). This is generally the category we want to be in.

Our first problem is in P because it can be solved in a constant number of steps. Unfortunately, our second problem is significantly more complex: it actually is bounded in its execution time by a function of the form O(CN!) where C is a constant and N is the number of cities. We can solve this problem, but it will take an extremely long time for even moderately sized inputs. For example, the problem as we stated above would take as many as 9 x 10^157 steps to solve for the unique ideal solution. This class of problems is called NP, for Nondeterministic Polynomial Time. These problems are extraordinarily difficult, and for larger input lengths are effectively impossible to solve in a reasonable amount of time. NP is commonly considered a superset of P, meaning that every problem contained in P is also contained in NP, although this is as of yet unproven. In fact, one of the major outstanding problems in Computer Science is to prove this conclusively one way or the other. The problems in NP are important but difficult to compute; if one could find polynomial time algorithms for them it would be a huge advance in our computational abilities!

Finally, the last example demonstrates one of the fundamental weaknesses of current computers: this problem is not solvable for all possible inputs. It joins a class of problems which, despite our best efforts and algorithms, are fundamentally impossible to compute answers for (say, counting the number of atoms in the universe). While our techniques are powerful, there are inherent limits on what we can do with computers, and Computational Complexity Theory aims to find out what they are.

If you like what I write, you can subscribe to my mailing list to get new posts in your inbox: