The art of building software: 2012

Wednesday, January 25, 2012

The importance of factoring large numbers

Here's a simple math question for you: What two integers do you have to multiply to get 15, other than 1 and 15?
Prime Numbers, from Nerdy Baby
Most people get that the answer is 3 and 5, because 3 x 5 = 15.  That one's pretty easy to figure out.  Of particular note: The numbers 3 and 5 are prime numbers, meaning that 3 and 5 can not be divided evenly by any other integers other than 1 and themselves.  Numbers like 15 are called semiprime numbers, because they can be factored into exactly two prime numbers.  This cool chart teaches babies about the first few prime numbers.

Here's a harder one for you.  What two prime numbers do you have to multiply to get the following semiprime number?
25195908475657893494027183240048398571429282126204032027777137836043662020707595556264018525880784406918290641249515082189298559149176184502808489120072844992687392807287776735971418347270261896375014971824691165077613379859095700097330459748808428401797429100642458691817195118746121515172654632282216869987549182422433637259085141865462043576798423387184774447920739934236584823824281198163815010674810451660377306056201619676256133844143603833904414952634432190114657544454178424020924616515723350778707749817125772467962926386356373289912154831438167899885040445364023527381951378636564391212010397122822120720357
Yeah, that's a pretty big number - it has 617 digits in it.  I don't know the answer, and I believe no one alive knows the answer.  Until just a few years ago, the RSA corporation who originally asked this question offered a $100,000 prize to the first person to come up with the answer to that question.  They called this the RSA Challenge.  They multiplied two large prime numbers together to get the monster above, and then threw out the piece of paper on which they had recorded which two prime numbers they multiplied together - so no one alive has the answer.

How long would it take a typical computer to figure out the answer to RSA's challenge?  According to Digicert, one of RSA's competitors, it would take 6,400 trillion years using a standard desktop computer, and using the fastest method known (called the general number field sieve) for coming up with the answer.  Using the fastest supercomputers today might drop that number down to only a few trillion years.  You could further improve that by using several other techniques, but still, we're talking about a much longer time than the human lifespan.  Here's a cool video from Digicert that is fun to watch and puts this into perspective:


Though it's impossible to figure out the answer using today's technology, it's easy to check whether someone has the right answer by taking two numbers and multiplying them together to see if that equals the number above.

Since I mentioned two Internet security companies - RSA and Digicert - I would be remiss of I did not mention the name of the company that originally invented this stuff.  That company was Netscape, which was once a high flying Internet company but crashed and burned and ended up getting bought by AOL for 4.2 billion dollars.


The reason this is so important is that our Internet security relies on the fact that it's currently impossible for anyone to figure out the two prime factors that are multiplied together to get numbers as big as the example shown above.  RSA is one of the leading vendors of Internet security products based on this simple mathematical exercise, and they wanted to prove to themselves and the rest of the world that our Internet is secure.  Every time you access your online banking or use your credit card or login to some web site, you use a security technology which at its core relies on the impossibility of factoring large numbers.

I think RSA has made their point, at least for now: hackers generally don't try to solve this problem but instead rely on numerous other types of attacks.  Here's why.  Let's say you're a bank robber trying to break into the bank's vault.  You're not going to try to break into the front door of the vault by brute force, say with a bomb.  Instead, you might get your own safety deposit box in the vault and just walk right in as if you're going to your own safety deposit box, and then pull out your gun and demand that a bank executive empty out other people's safety deposit boxes for you.  Similarly, hackers don't try to break the secure semiprime key by finding out what two prime numbers need to be multiplied together.  Instead, they attack weaker parts of the security protocols having nothing to do with factoring large, semiprime numbers.

It may surprise most people to learn that the latest Internet security protocols can easily be cracked by hackers in seconds, using techniques such as ARP spoofing or ARP poisoning, and then doing what's called a "Man in the middle" attack.  You are most at risk if you are using a public WIFI connection, even if you are using secure Internet protocols such as SSL.  These attacks have nothing to do with factoring these large semiprime numbers, and will go right through your anti-virus software and firewall software.  Other techniques let you discover any passwords saved in a user's browser, or crack your Facebook password.  You are also vulnerable in your own home.  If you don't set up your own strong, WIFI router password (something you actually have to explicitly do), then you are vulnerable.  I'm not going to link out to this information, but if you're really interested, you should be able to easily find detailed How-To videos if you look around.  Once you know this, you may never want to use a public WIFI connection again!

Most people don't know how vulnerable they are when they are connected to a public WIFI connection.  Let's say you're carrying your iPhone or Android in your pocket, and you walk into your local coffee shop and connect to the free WIFI there.  Once you've done that, a hacker could easily steal your passwords, even if you never take your phone out of your pocket.  Scary, huh?  These hackers are not attacking the semiprime numbers in the security protocol.  Instead, they are acting like the bank robbers, and attacking other parts of the protocol that really are weak.

It's possible to defend yourself against these types of attacks, but as of today, you have to know what you're doing if you want to protect your computers and cell phones, especially if you're connected to a public WIFI.  I believe that our computers and devices will get smarter over the coming years so that they will not be vulnerable to these attacks. However, at the heart of it all, are still these large semiprime numbers.  If someone figures out a way to crack them, there's no easy way to fix this.

If someone could answer the RSA Challenge, they could crack the large semiprime numbers at the heart of the security protocol - it would be like having a way to walk right through the bank vault door even if it were closed and locked.  While no one possess anywhere near sufficient computing power to answer RSA's challenge, if someone magically came up with the answer, anyone could quickly verify that they had the right answer because it would be super fast to multiply the two numbers together to see if that equaled the number above.  Fundamentally, the fact that it's too time consuming to calculate the answer but trivial to check the correctness of the answer, is what secures the network connection between your computer and all servers on the Internet.  If you could answer RSA's challenge, our Internet security would vanish in an instant.

In the 1990's our government used to consider this type of computer security to be equivalent to possessing a nuclear weapon - it was classified as an armament.  They have since changed this stance, but their former stance shows you how important this kind of security is to our nation, and to nearly everyone who uses the Internet.  I think we're fortunate that the government changed their view on this before the 9/11 attacks.

Hypothetical intel Quantum
computer chip
But the day is coming when quantum computers will be able to break into any secured Internet connection, because unlike the computers of today, quantum computers will be able to solve the RSA challenge in a reasonably short amount of time.  And when this happens, our government could well classify these new computers as equivalent to nuclear weapons again because they would destroy all privacy and security on the Internet.  How soon this will happen is a matter of debate, but I believe some time this century we will have the technology, perhaps as soon as a few decades from now.  And then it's only a matter of time before the cost of these devices comes down and then that's it - we need to change the fundamental security architecture of the Internet that we've relied on since the dawn of the Internet.

Knowing that this day is coming, some people formed the PQCrypto conference to pool together great minds to prevent this from happening - but they haven't yet figured out a solution.  Assuming they or someone else does eventually come up with a counter measure, fixing this problem will be a project equivalent or worse in risk and size to the Y2K problem that we were all worried would bring down the Internet, and that cost over $300 billion dollars to fix according to the BBC.