How SSL Certificates Work & Why The Internet Was Broken on May 30
In case you didn’t notice, the Internet was broken on the 30th of May. The culprit:
SSL certificates.
First of all, why do we need SSL certificates?
Technically speaking, they should be called TLS certificates, because they carry a public key that is used in the Transport Layer Security (TLS) protocol to authenticate the server.
But this is not a technical article, so we will stick to the concepts everyone can understand.
When you go to your favourite online shopping site, you want to be assured that the site is genuine (no one is trying to lure you to a fake copy of Amazon) and that everything you send or receive (like your personal data and shopping choices) is secure.
So, how do the browsers know that the site is trustworthy and display that padlock near the address bar? The SSL certificate tells them so.
So, basically, an SSL certificate is a guy who tells you: "This site is ok".
But why would you trust this guy you do not know? Let's call this guy Sam. You do not trust Sam, but Mike trusts Sam, and Dennis trusts Mike. You trust Dennis; he is a friend of yours! This way you have a chain of trust, just like SSL certificates do.
Normally, people have a close circle of friends they trust. So do the browsers. They have a list of root certificates they trust without any reservations, and they would trust any certificate issued (signed) by a chain that leads to one of the trusted root certificates.
Some browsers (like Firefox) have their own trust store; other browsers rely on the trust store of the operating system they run on.
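For readers who prefer code to friends of friends, here is a minimal sketch of the same idea. It is a toy model, not how any real browser is implemented: the `Cert` class, the `is_trusted` function and the names in it are made up for illustration, and real validation also checks signatures, expiry dates and more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # who this certificate is about
    issuer: str   # who vouches for (signed) it


def is_trusted(leaf, intermediates, trust_store):
    """Follow issuer links upwards until we reach a name in the trust store."""
    current = leaf
    visited = set()
    while current.subject not in visited:  # guard against circular chains
        visited.add(current.subject)
        if current.issuer in trust_store:
            return True  # the chain ends at someone we trust
        current = intermediates.get(current.issuer)
        if current is None:
            return False  # nobody we know vouches for this issuer
    return False


# You trust Dennis; Dennis trusts Mike; Mike trusts Sam.
trust_store = {"Dennis"}
intermediates = {"Mike": Cert("Mike", "Dennis")}
sam = Cert("Sam", "Mike")
stranger = Cert("Eve", "Mallory")

print(is_trusted(sam, intermediates, trust_store))
print(is_trusted(stranger, intermediates, trust_store))
```

Sam is accepted because the chain Sam, Mike, Dennis ends in the trust store; the stranger is rejected because nobody in the chain is known.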
Now, imagine the following situation:
There is a guy called Peter who is trusted by Tom, who is trusted by Vito. Everyone trusts Vito because Vito is the head of a well-known and respected Family.
Then Vito decides to retire and announces that his son Michael will replace him as the head of the family, and whoever respects Vito must now respect Michael.
Tom pays his respect to Michael, and everything is good. But one day Vito dies…
Suddenly, it appears that a group of people respected Michael only because they respected Vito when he was alive and do not recognise Michael as the head of the respected family.
Things won’t end well for these people… The same goes for browsers and operating systems that do not regularly update their trust stores.
Back to Michael's family drama…
Having Michael and Vito run the Family business together for a while sounds like a very good idea. For Michael, as the new official Head of the Family, it is a good opportunity to learn from his father, be introduced to the right people and so on. For Vito, as the retiring boss, it means that the transition of power will be smooth and the Family business will be in good hands.
The same happens in the world of SSL certificates. Sometimes the new root certificates are signed by the older root certificates of the same Certificate Authorities. This is called cross-signing.
The older root certificates are more widely distributed across platforms and more likely to be trusted. A newer root certificate can take advantage of this: even a system that does not yet recognise it as a root in its own right will trust it, simply because it is cross-signed by the older (and trusted) root certificate.
However, the older root certificates have one critical defect: they expire sooner. By the time that happens, the new root certificate is expected to have been distributed widely enough to be trusted by the majority of platforms.
Unfortunately, this is not as simple as it seems to be…
Imagine the following situation:
Knock, knock.
Who's there?
It's Peter
What Peter?
Tom sent me
Who is Tom?
He works for Michael, son of Vito
Come in...
Technically, this dialogue is quite inefficient. There is certainly room for improvement. Consider this:
Knock, knock.
Who's there?
It's Peter. Tom sent me. He works for Michael, son of Vito
Come in...
Now THAT is a much more efficient communication.
In the world of TLS/SSL security, this means a quicker initial SSL handshake. For this reason, the server sends not just one but several SSL certificates, which allow the client to validate the whole chain of trust up to a trusted root certificate without having to download any intermediate certificates from the Certificate Authorities.
Unfortunately, the following situation may occur:
Knock, knock.
Who's there?
It's Peter. Tom sent me. He works for Michael, son of Vito.
But Vito is dead.
Michael is now the boss.
I know, but… Vito is dead.
A mention of Vito obviously caused some sort of confusion here. It should not matter anymore that Vito was the Head of the Family.
Now Michael is the Head and everyone should recognize this fact for their own good. However, a mention of Vito creates two chains of trust, one leading to Michael and another one leading to Vito.
Now it is possible to explain what happened on 30 May, when many system administrators around the world woke up early in the morning to an avalanche of alerts from their monitoring systems and from angry customers:
"Your SSL certificate has expired! We can no longer access your API!!!11111"
To their relief and astonishment, they realized that their site certificates were absolutely fine and their websites were working in all major browsers without any issues. What could have caused such a problem?
On May 30, one of COMODO's (now Sectigo's) root certificates expired after 20 years of a happy life.
This should not have caused any issues, because the replacement root certificate was issued in 2010, and by the end of 2015 it had been distributed across all major operating systems, browsers, programming frameworks and runtime environments.
At the same time, the new root certificate was cross-signed by the old one.
So why, after all these precautions and good planning, did the expiration of a root certificate cause such a big problem?
Here are a couple of reasons why:
Reason #1: The trust store (Certificate Authorities Bundle) is not up to date
In 2020, if you use Internet Explorer 7 on Windows XP or run an application written in Java on an outdated version of Java Virtual Machine (JVM), it’s highly likely you will not have the most up-to-date list of the trusted root certificates.
In the case of Java, for instance, the new Sectigo root certificate was included in the Java 8 Update 51 release on July 14, 2015. Older versions of Java do not have it.
Reason #2: The software does not support cross-signed SSL certificates
Application software developers do not have to worry about all the SSL magic. All they have to do in their apps is just call a URL which starts with 'https://'.
It is then the responsibility of the operating system components, application programming framework or external libraries to handle all TLS/SSL cryptography. Usually, the programmers have very little control.
One of the most notable TLS implementation libraries is OpenSSL. It is widely used by Internet servers, especially Linux-based ones. Some estimate that about 2/3 of the web relies on OpenSSL.
Support for cross-signed certificates was introduced in OpenSSL 1.0.2, released in January 2015. However, this support was optional: software developers had to enable it explicitly in their applications. Most developers did not even know about the cross-signing issue, so hardly anyone did.
In September 2018, OpenSSL 1.1.1 was released with support for cross-signed certificates enabled out of the box. This means that systems with an OpenSSL version earlier than 1.1.1 must be upgraded.
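To see why the order in which a client looks for issuers matters, here is a toy model of the May 30 situation. It is only a sketch under simplifying assumptions and not OpenSSL's actual algorithm: the `Cert` class, the `verifies` function, the `trusted_first` switch and all names and dates other than the 30 May expiry are invented for illustration, and signatures are not checked at all.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_after: date  # expiry date


def verifies(leaf, bundle, trust_store, today, trusted_first):
    """Return True if a non-expired chain from leaf to a trusted root is found.

    bundle:      certificates sent by the server, keyed by subject
    trust_store: root certificates installed on the client, keyed by subject
    """
    current = leaf
    for _ in range(10):  # depth limit, guards against loops in this toy model
        if current.not_after < today:
            return False  # an expired certificate breaks the chain
        root = trust_store.get(current.issuer)
        sent = bundle.get(current.issuer)
        if root is not None and (trusted_first or sent is None):
            # The chain ends at a trusted root; it is valid if the root is.
            return root.not_after >= today
        if sent is None:
            return False  # issuer unknown: no chain can be built
        current = sent
    return False


# Hypothetical names and dates, loosely modelled on the May 30 situation.
EXPIRY = date(2020, 5, 30)
leaf = Cert("api.example.com", "Intermediate", date(2021, 1, 1))
bundle = {
    "Intermediate": Cert("Intermediate", "New Root", date(2030, 12, 31)),
    "New Root": Cert("New Root", "Old Root", EXPIRY),  # cross-signed copy
}
trust_store = {
    "Old Root": Cert("Old Root", "Old Root", EXPIRY),
    "New Root": Cert("New Root", "New Root", date(2038, 1, 18)),
}

day_after = date(2020, 5, 31)
print(verifies(leaf, bundle, trust_store, day_after, trusted_first=False))  # False
print(verifies(leaf, bundle, trust_store, day_after, trusted_first=True))   # True
```

With `trusted_first=False` the client follows the server's bundle all the way to the expired old root and gives up ("Michael is a son of Vito, and Vito is dead"). With `trusted_first=True` it stops as soon as it meets a root it already trusts and never looks at the expired certificate. If the new root is missing from the trust store (Reason #1), both variants fail after the expiry date.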
About 67% of all web servers in the world run on Linux operating systems, which have one little problem: package managers.
Package managers are the cancer of the software industry, or a drug that initially seems like fun, but in the end… it kills.
This is a topic for a separate article, but the bottom line is:
You cannot simply update a single package (the OpenSSL library) to a new major version, because the dependencies of the newer version may not be compatible with the dependencies currently installed, and other installed packages that depend on OpenSSL (e.g. web servers) may not be compatible with the newer version of it.
In the end, instead of upgrading just a single package, you may end up upgrading the whole operating system.
Imagine the situation:
You have a flat tire on your new car. You come to a service station and ask them to repair the puncture, but they say they cannot do that because your tire is too old to be repaired and has reached the end of life.
You say it is okay and ask them to sell you a new tire, but they say that no new tires are compatible with your car and the only option you have is to buy a new car, even though your car is just 2 years old.
This is essentially what happens in the software industry.
For instance, CentOS 7 is a very popular operating system for web servers. As of June 2020, it is in the active support stage, with the end of life set for 2024.
However, the package manager of CentOS 7 will only let you work with the 1.0 branch of OpenSSL, which reached the end of life at the end of 2019. This makes a relatively modern operating system obsolete if you want to work with modern TLS cryptography.
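If you are not sure which side of that line a given system is on, one quick check is to ask a runtime which OpenSSL it is linked against. The sketch below does this for Python using the standard `ssl` module. The helper name `has_default_cross_sign_support` is made up for this example, the 1.1.1 threshold is simply the version mentioned above, and the result only describes the library Python itself uses, not every program on the machine.

```python
import ssl


def has_default_cross_sign_support(version_info=ssl.OPENSSL_VERSION_INFO):
    """True if the given OpenSSL version tuple is 1.1.1 or newer."""
    # OPENSSL_VERSION_INFO is a 5-tuple; the first three items are
    # the major, minor and fix numbers.
    return tuple(version_info[:3]) >= (1, 1, 1)


print(ssl.OPENSSL_VERSION)                # human-readable version string
print(has_default_cross_sign_support())  # check for this Python's OpenSSL
```

Other runtimes (Java, Node.js, PHP and so on) bundle or link their own TLS libraries, so each of them needs to be checked separately.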
Soon after Sectigo's root certificate expired, a holy war began between software developers and system administrators.
The developers demanded that the expired certificate be removed from the SSL bundle the web servers were sending. The system administrators insisted that the developers should update their own technology stack, because only completely outdated systems could not properly deal with cross-signed root certificates where one of them had expired (the “I trust Michael, but Michael is a son of Vito and Vito is dead” situation).
The industry (and common sense) were on the system administrators' side. All SSL certificate providers who use Sectigo’s certificates, and Sectigo themselves, issued statements weeks before the expiration date that no action was required, that cross-signing should do its job, that 10 years is more than enough to update the root CA bundles, that amongst several available chains of trust the software should pick the one that is valid, etc.
Nevertheless, some part of the software industry was not prepared for this trivial event.