About My License Plate

Most people who know me know that my license plate reads ‘P = NP’. A lot of the people who know me study computer science, and so they understand the plate. I wrote this entry to explain P and NP to people who don’t understand the plate. As always, the wikipedia article is a great source of information. Unfortunately, I think it’s a little heavy on the math. In this article I’ve tried to abstract away as much math as possible, to leave the important intuitive concepts.

Background: Whirlyball, Algorithms, and Running Times

Let’s start with an example. Suppose you’re playing a game of whirlyball with your friends. You want to pick teams that are even. To simplify things, you give each of your friends an integer ’skill number’ to express how good they are. The skill of a team is then the sum of the skills of each of its individual members.  Example: Suppose we have the following players:

Player Skill Number
Mark 10
Bob 9
Shaan 7
Jeremy 5

The Team “Mark and Jeremy” has total skill 15, while the Team “Bob and Shaan” has total skill 16. Your goal is to partition the players into two teams of equal skill. In general, given a long list of players and their skill numbers, how do you choose teams that are as equal as possible? This problem is called the PARTITION problem (problem names are always capitalized for some reason).

A step-by-step general solution to a problem is called an algorithm. Computer Scientists spend a lot of time analyzing algorithms. Of particular interest is how many steps an algorithm takes, as a function of the size of its input. For our PARTITION problem, the size of the input is the number of players playing the game. If an algorithm takes n steps to choose a team, we say it has running time n. This is considered a good running time. For example, one algorithm for the PARTITION is to randomly decide which team each player is on. If we just flip a coin for each player to randomly assign them to a team, picking teams for n players will take us n steps. Therefore, the ‘coin-flipping’ algorithm for the PARTITION problem has running time n. Note that the coin-flipping algorithm is not guaranteed to give a correct solution – it might put everybody but Jeremy on the same team.

Let’s suppose we want an algorithm that always gives the fairest possible teams. What would such an algorithm look like? The most obvious approach is to go through the list of all possible teams, and check each possible team, seeing if that team’s total skill is equal to half of the total skill of the players. Call this the “obvious” algorithm. If we find a team whose total skill is half the skill of all the players, then the other team must also have the same skill. How long does the “obvious” algorithm take? With n players, there are 2n possible teams that could be made. Becuase the “obvious” algorithm might check each possible team before concluding that there are no fair teams, it has a running time of 2n.

Suppose Shaan proposes two teams and says that they are fair. How can we tell Shaan is correct? Luckily, this is an easy question to answer – just add up the skill numbers of the players on each team and check to see if they are equal. For n players, this process takes n steps to complete. The process of confirming a proposed solution’s correctness is called ‘verifying’ a solution.

We have shown that PARTITION is a problem with the property that proposed solutions can be verifed quickly. A question naturally arises – if it’s so easy to check and see if a team is fair, shouldn’t it be easy to solve the PARTITION problem correctly? That question, it turns out, is the question of whether P = NP. People have been asking that question for around 40 years, and nobody has been able to successfully answer it.

The Meat of the Question: Polynomials, P, and NP

With the background behind us, we can finally start to answer explain what “P = NP” means:

P and NP are two sets of problems. ‘P’ stands for ‘polynomial’, and it is the set of all problems for which there exists an algorithm that always runs in a polynomial number of steps. A polynomial is a function that looks like f(x) = x3 + 2x2 + x + 4. Polynomial functions never have x inside an exponent, like 2x

‘NP’ stands for ‘nondeterministic polynomial.’ It is the set of all problems for which a proposed solution to the problem can be verified in polynomial time on the size of the problem. The reason P and NP are studied is because any algorithm that runs in a polynomial number of steps is considered ‘efficient’, whereas any algorithm that runs in a number of steps that grows faster than any polynomial is not considered efficient.

The random algorithm for the PARTITION problem has running time n, which is a polynomial.  The random algorithm doesn’t always give us the correct answer, though. The obvious algorithm has running time 2n, which is grows much faster than a polynomial does. Is there a solution for the PARTITION which has a polynomial (or better) running time? It turns out that nobody knows the answer to this question. As a a result, we do not know whether the PARTITION problem is in P.

What about NP? As I showed above, any proposed solution to the PARTITION problem can be verified in n steps. Therefore, the PARTITION problem is definitely in NP. It turns out that any problem in P is also in NP. This means P is a part (subset) of NP. The reason for this is not very complicated, but explaining it would bring me beyond the scope of the article and I don’t want to confuse anyone.

To sum up what we have shown so far: P and NP and sets of problems. P is definitely a subset of NP. The PARTITION problem is in NP, but it is not known whether it belongs to P or not. Computer Scientists know of many, many problems that are called NP-Complete. A problem is NP-Complete if it is a member of NP, and complicated enough that solving it would allow you to solve any other problem in NP in polynomial time. PARTITION turns out to be an NP-Complete problem. In other words, if you had a little box that could solve the PARTITION problem in polynomial time, you could use that little box to solve any problem in NP in polynomial time. There are lots of problems that are in NP but have no known efficient solutions. Such a box would be very, very valuable.

The hunt for that box leads us back to an unsolved question in Computer Science: Are P and NP the same, or is P only one part of NP? In other words, are there any problems for which solutions can be verified efficiently but that cannot be solved efficiently? If P = NP, then the magic box exists, and it could solve all of these frustrating problems efficiently. Becuase so many smart people have worked so hard for so long without success to build the magic box, many of them believe that it can’t exist. They believe that P does not equal NP, because there are some problems in NP that are not in P. Thus, most Computer Scientists and Mathematicians believe that P does not equal NP.

OK, but Who Cares?

Why does any of this matter? First, I find the fact that it’s a (relatively) easy to understand, yet unsolved open problem fascinating. PARTITION is an easy problem to grasp, yet nobody has been able to solve it efficiently. It’s almost as if the “P vs NP” problem is self-referential. The question of whether P = NP or not appears to be a very difficult problem to solve; the only real theoretical results that have been produced since the question was first asked in the 1971 are results saying that certain types of proofs could not prove that P does not equal NP. So, from a theoretical point of view, it’s a very interesting problem.

Not only is the problem theoretically interesting, it is also very applicable to the real world. There are many problems commonly faced by people in many fields that are proven to be NP-Complete. Determining the quickest route to take when visiting a set of cities is NP-Complete. Assigning students to dorm rooms such that roomate preferences are honored is NP-Complete. Loading boxes onto a truck in the most efficient way is NP-Complete. If someone found a polynomial-time algorithm for one of those problems, all of them could be solved in polynomial time. The benefits would appear all over the place, in many different industries. By saying that P = NP, in some ways, I’m saying “The world is a good place,” because these hard problems that we would like to solve efficiently actually ARE solvable efficiently.

At the same time, it’s well known that I’m a bit of a contrarian. I like questioning commonly accepted ideas. The majority of Computer Scientists suspect that P does not equal NP. By saying ‘P = NP’, I am saying, in a mathematical sense, “Question Authority!” I’m challenging the ‘orthodox’ way of thinking, as well as the line of thinking that says “many smart people have tried and failed, so it must be impossible.”

For me, “P = NP” is a mantra, a way of thinking, that says “challenge your assumptions, be optimistic, be humble, be daring, and above all else, be curious.” It’s quite a lot for a 4 – character license plate.

blog comments powered by Disqus