5 Lessons Learned from Deploying 100,000,000 Patches
With over 100 million patches deployed, see our top five lessons learned over the past decade of patch management services.
A History of Patching
It was a cold December afternoon in 2008. A support customer, one of the largest retail outlets in the United Kingdom, was breached.
The Downadup virus, better known as Conficker, infected one remote server and spread across almost every server and a huge number of workstations. Over the next 48 hours, hundreds of staff hours were spent running a custom “Conficker Killer” on every device and rebooting each one. The virus was almost eradicated, but the customer failed to install three critical patches (KB958644, KB957097 and KB958687).
As quickly as that, the virus was back.
Compared to recent infections, the Conficker virus didn’t steal anything or hold you to ransom, but it did slowly drain system resources, which could corrupt the Operating System and force a Blue Screen system crash. This end user disruption had to be resolved quickly.
The IT Director’s initial reaction was to set up a patching task to bring the estate up to date: their on-premises toolset was scheduled to install everything that night. In hindsight, that strategy didn’t work, and that was when we were asked to resolve the situation.
After more than 10 years, 100 million patches deployed, and hundreds of customers onboarded, no customer has ever been breached.
Here are our 5 biggest lessons we have learned over the past decade of patch management services.
1. Patching is Essential Even If You Have an Up-to-Date AV
Industry experts estimate that data breaches increased almost 60% in 2019, and ransomware-specific infections increased 90%, potentially costing businesses $11.5 billion this year alone.
Many senior IT Directors, CIOs and CISOs believe their perimeter protection, including firewall and antivirus / anti-spyware products, will keep their environment safe. However, perimeter defenses alone no longer stop data theft and network / system intrusions. Once a break-in occurs, sophisticated exploitation is easy but extremely difficult to track or remediate.
We followed a simple experiment conducted by a group of students in the United Kingdom. They built several Windows, Linux and Mac OS systems in a lab, fully protected with firewall and antivirus / anti-spyware but without any OS updates. All had internet access with a routed IP address. Each system was left “as was”, without any updates, patches or hotfixes, for 720 hours (30 days) to see which, if any, would fall victim to an external attack. The results are astounding and worrying at the same time:
| Operating System | Exposed / Infected | Notes |
| --- | --- | --- |
| Windows 7 | Yes | Infected by Windows 2012 |
| Windows 10 | Yes | Infected by Windows 2012 |
| Windows Server 2012 | Yes | Exploited using RDP |
| Windows Server 2016 | No | |
| Mac OS “Mojave” | No | |
| Linux Ubuntu 14 | No | |
Before the lab was destroyed, forensic evidence was collected demonstrating that RDP had been used to plant ransomware on the Windows Server 2012 server and the two Windows virtual desktops. None of the Linux or Mac OS devices were impacted. Had the experiment run longer, the others might have been exploited as well. What we can take from this experiment is that anyone can be a victim, even with firewall and antivirus / anti-spyware protection in place.
2. Performing and Recording Test Evidence
Only an IT Manager can appreciate the fear of hearing these seven little words: “Is anyone doing any patching right now?”
We have learned that all patching should be built on a platform of transparency. Allocating resources to test patches, and to document that testing, leads to less end user disruption and a rise in confidence.
The following template is an example of what we perform routinely for our customers:
- For each Operating System (including each unique Service Pack / Feature Update level), a copy of the image which best represents the live environment is prepared in a virtual lab. All variations must be tested.
- Each Operating System is rebooted multiple times to ensure all post reboot activities are performed. A patch we tested in 2015 changed the keyboard layout to Chinese. It would have only been found after multiple reboots, and if not found, would have caused a catastrophic nightmare for the global helpdesk. This particular patch was removed from the deployment.
- All issues are investigated thoroughly, even if they are seen only a single time.
- All patches which have an uninstall are tested to ensure the uninstall works! Never simply believe the vendor; this is one of our golden rules.
- Any patch which does not have an uninstall is tested at least twice. This is another one of our golden rules.
All tests are documented and should be concluded before any further activity. Any issues found during the deployment, testing, or post reboot are detailed. Most customers will want to see this evidence before starting the Change Control process.
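As an illustration, the test evidence described above can be captured in a simple structured record. This is a minimal sketch in Python; the class and field names are our own invention, not part of any particular toolset:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class PatchTestRecord:
    """One row of test evidence: a single patch tested on a single OS image."""
    patch_id: str             # e.g. "KB958644"
    os_image: str             # lab image the patch was tested against
    reboots_performed: int    # multiple post-install reboots catch late failures
    uninstall_available: bool
    uninstall_verified: bool  # golden rule: never just believe the vendor
    test_runs: int            # patches with no uninstall are tested at least twice
    issues: list = field(default_factory=list)

    def ready_for_change_control(self) -> bool:
        """Ready only when no issues remain and the golden rules are satisfied."""
        if self.issues:
            return False
        if self.uninstall_available:
            return self.uninstall_verified
        return self.test_runs >= 2

record = PatchTestRecord(
    patch_id="KB958644", os_image="Win7-SP1-Gold",
    reboots_performed=3, uninstall_available=True,
    uninstall_verified=True, test_runs=1)

# Evidence for the Change Control pack, in a form auditors can read.
print(json.dumps(asdict(record), indent=2))
print(record.ready_for_change_control())
```

Serializing each record keeps the evidence reviewable before the Change Control process begins.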
3. Deployment Planning

The deployment is just a couple of clicks away from being complete, right? In reality, this is where your knowledge of the environment you are working in is invaluable.
You have completed testing and know the patches do not contribute to end user disruption. However, what you do not know is what impact those patches, deployed en masse, will have on your network, how long they will take to install, and what environmental requirements they place on your workstations and servers.
Here are some of the high-level tasks you need to complete:
- Rank the missing patches in order of severity and determine which are the most important. If you can, also rank by CVSS score, since it is the most accurate independent severity measure available today. Secure your environment by covering the worst offenders first!
- Identify which patches are superseded. You do not need to deploy all Critical updates if they have already been replaced; make your deployment efficient!
- Calculate the size of the patches. Is there enough free disk space on your workstations and servers, and can the network handle such a deployment? Planning improves confidence!
- Time the installation as part of your testing. Can your users wait this long? Happy end users are the hallmark of a good service!
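The ranking, supersedence and sizing steps above can be sketched in a few lines of Python. The patch metadata here is entirely hypothetical; in practice these fields would come from your patch management tool or vendor feed:

```python
# Hypothetical patch metadata; field names are illustrative, not from any vendor feed.
patches = [
    {"id": "KB1001", "cvss": 9.8, "size_mb": 120, "superseded_by": None},
    {"id": "KB1002", "cvss": 7.5, "size_mb": 300, "superseded_by": "KB1004"},
    {"id": "KB1003", "cvss": 5.0, "size_mb": 45,  "superseded_by": None},
    {"id": "KB1004", "cvss": 7.5, "size_mb": 310, "superseded_by": None},
]

# 1. Drop superseded patches: no point deploying what is already replaced.
active = [p for p in patches if p["superseded_by"] is None]

# 2. Rank by CVSS score, worst offenders first.
ranked = sorted(active, key=lambda p: p["cvss"], reverse=True)

# 3. Sanity-check the total payload against free disk space on the target.
total_mb = sum(p["size_mb"] for p in ranked)
free_mb = 2048  # assumed free space; query the actual endpoint in practice
assert total_mb < free_mb, "not enough free disk space for this deployment"

print([p["id"] for p in ranked])  # deployment order
print(f"total payload: {total_mb} MB")
```

Even a rough calculation like this, run before Change Control, tells you whether the network and endpoints can actually absorb the deployment.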
4. Change Management
Change control used to be nonexistent in most of our customer environments. Only banking, retail and local government customers insisted on a formal patching approval process. Ranking and selecting patches, combining them into a schedule, and receiving approval frames our service for success. This is why all of our customers allow us to provide change control, even when it is not fully implemented elsewhere in their company.
In some studies, formal change management is implemented at more than 80% of companies, and 100% in the FTSE 100. Our job is to provide the evidence of the patches we want to deploy, the testing we have conducted, the results of that testing, and to seek approval to begin a Pilot or Live rollout. It does place the onus on others, but doing so ensures any scheduling which will take place is conducted at the right time. Can you imagine deploying to users in the UK and Japan simultaneously?
5. Reporting and Perception
The final but critical step in your patching strategy should be to report on your success. It is important that upper management see the patch coverage for the entire environment over the lifetime of the service. If you are under contract to deploy updates every month, have reports which can prove this. Nothing helps alleviate governance concerns like proving your monthly efforts with “service over time.”
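A “service over time” report can be as simple as monthly coverage percentages. A minimal sketch, with invented figures (in practice these would come from your patch management tool’s inventory):

```python
# Hypothetical monthly figures: patches required vs. actually installed.
monthly = {
    "2019-07": {"required": 1200, "installed": 1176},
    "2019-08": {"required": 1350, "installed": 1336},
    "2019-09": {"required": 1100, "installed": 1095},
}

# Print a coverage trend that upper management can read at a glance.
for month, m in sorted(monthly.items()):
    coverage = 100 * m["installed"] / m["required"]
    bar = "#" * round(coverage / 5)
    print(f"{month}  {coverage:5.1f}%  {bar}")
```

A month-on-month view like this is what turns raw deployment counts into evidence that the contracted service was delivered.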