The Ultimate Remediation Lifecycle for Vulnerability Management

Roee Shohat, Head of Go to Market
December 20, 2023

There are many ways to approach vulnerability remediation, but the following are the generally agreed-upon, fundamental stages:

  1. Identification - Before we can remediate, we must identify vulnerabilities using various detection and identification tools.
  2. Prioritization - Determine which assets are the most important and which vulnerabilities are most critical to fix.
  3. Remediation - Fix what matters first, based on prioritization and the responsibilities of remediation stakeholders in your organization.
  4. Reporting - Measure and report on remediation progress and success to stakeholders and leadership.

Let’s discuss the four stages in detail and see how they can be optimized to better drive remediation from a risk perspective, and conserve valuable time and effort.

Identification

Most organizations have more than one solution for identifying vulnerabilities: network-based or agent-based scanners for deployed environments, separate tools for on-premises and cloud, or different tools used by disparate departments because of the organization’s size and scope.

Some solutions detect vulnerabilities early in development, some identify them in the actual code, some in software dependencies, and others in testing or at runtime. These tools generate a growing number of findings from multiple sources, sent to various teams or stakeholders for resolution. As the number of detection tools grows, so does the risk of duplicated findings, because tools will often detect the same issue in several ways.

On to the next stage in the lifecycle.

Prioritization of Vulnerabilities

Vulnerability prioritization is an essential stage in the process, helping organizations make sense of the enormous number of vulnerabilities. These security issues differ in criticality and impact, and there is hardly enough time or resources available to resolve them all - nor is it necessary to do so. Some of these vulnerabilities aren’t impactful, or reside somewhere of negligible importance to the organization because it has limited access to the company’s systems.

Prioritization includes de-duplication: identifying vulnerabilities that are either the same but come from different sources, or are actually caused by a single issue, whether a piece of code, a base image of an OS, etc.
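As a rough illustration, de-duplication can be sketched as grouping findings by a shared key. The field names (asset, cve, source), the scanner names and the CVE identifiers below are placeholders invented for this example, not a standard schema, and real tools may need fuzzier matching.

```python
from collections import defaultdict

# Hypothetical findings as reported by two different scanners.
findings = [
    {"source": "scanner_a", "asset": "web-01", "cve": "CVE-EXAMPLE-0001"},
    {"source": "scanner_b", "asset": "web-01", "cve": "CVE-EXAMPLE-0001"},
    {"source": "scanner_a", "asset": "db-01", "cve": "CVE-EXAMPLE-0002"},
]


def deduplicate(findings):
    """Merge findings that share the same (asset, cve) pair."""
    merged = defaultdict(set)
    for f in findings:
        merged[(f["asset"], f["cve"])].add(f["source"])
    # Keep one record per pair and remember every tool that reported it.
    return [
        {"asset": asset, "cve": cve, "sources": sorted(sources)}
        for (asset, cve), sources in merged.items()
    ]


unique = deduplicate(findings)
```

Here the three raw findings collapse into two records, and the first one keeps both scanners as its sources so nothing is lost when tickets are created.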

Prioritization should include the following:

  1. Insights into the business context of the vulnerable system, its application or service, and its business impact (usually analyzed in advance in a BIA process).
  2. Analysis of the type of environment in which it resides - is it in production, or is it a sandbox or testing environment?
  3. Exposure of the system or service in terms of internet access and its mitigating controls (for example: XSS in a web application that is protected by a WAF).
  4. The vulnerability’s criticality, its specific impact on a system and its relevance to that affected system.
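One possible way to combine these factors is a simple weighted score. The sketch below is only an assumption about how such a model could look: the weights, the multipliers and the Finding fields are made up for illustration and should be replaced by your own risk model.

```python
from dataclasses import dataclass

# Illustrative weights only; tune them to your own risk model.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}
BUSINESS_IMPACT = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Finding:
    name: str
    severity: str          # vulnerability criticality
    business_impact: str   # e.g. taken from a BIA
    production: bool       # production vs. sandbox/testing
    internet_facing: bool  # exposure
    mitigated: bool        # e.g. a WAF in front of an XSS issue


def priority_score(f: Finding) -> float:
    """Higher score means fix sooner."""
    score = float(SEVERITY[f.severity] * BUSINESS_IMPACT[f.business_impact])
    if f.production:
        score *= 2.0
    if f.internet_facing:
        score *= 1.5
    if f.mitigated:
        score *= 0.5
    return score


findings = [
    Finding("xss-behind-waf", "high", "high", True, True, True),
    Finding("rce-in-sandbox", "critical", "low", False, False, False),
    Finding("sqli-in-prod", "high", "high", True, True, False),
]
ranked = sorted(findings, key=priority_score, reverse=True)
```

With these made-up weights, the critical issue in a low-impact sandbox ranks below both production issues, and the WAF-protected XSS ranks below the unmitigated one, which matches the reasoning in the list above.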

Additional approaches to prioritization could include an assessment of the effort required to actually fix a vulnerability. This may mean checking simple things, such as whether the fix requires a major or a minor version upgrade, or whether it requires replacing a chain of packages throughout the application, which might break things and will therefore require deeper testing.
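The major vs. minor check can be approximated with a few lines of code. This is a minimal sketch that assumes plain MAJOR.MINOR.PATCH version strings; the function name is hypothetical, and real package ecosystems have richer version formats that need a proper parser.

```python
def upgrade_kind(current: str, fixed: str) -> str:
    """Classify an upgrade as 'major', 'minor', 'patch' or 'none'.

    Assumes both versions are plain MAJOR.MINOR.PATCH strings.
    """
    cur = [int(part) for part in current.split(".")]
    new = [int(part) for part in fixed.split(".")]
    # The first component that differs decides the kind of upgrade.
    for label, c, n in zip(("major", "minor", "patch"), cur, new):
        if c != n:
            return label
    return "none"


kind = upgrade_kind("1.4.2", "2.0.0")
```

A major upgrade is usually a hint that more testing effort should be planned, while a patch-level change is often a quick win.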

As we can see, prioritization shouldn’t be taken lightly as it can save the organization a great amount of time and resources, and also help focus remediation where it matters most. 

(Figure: The Remediation Lifecycle in Security)

Remediation

Remediation itself is the core of many frustrations and challenges. Previous stages included data collection, aggregation and analysis, but this is where the work really begins. 

The first step, identifying the relevant stakeholders, seems like a no-brainer, but in many organizations this becomes a complex challenge.

Some components should be dealt with by the developers of “Application A” or maybe “Application B”; for others the infrastructure team is responsible, or maybe the cloud team, and sometimes it’s the DevOps team. Understanding who needs to take care of what sometimes requires an understanding of the entire development pipeline.

For example, a vulnerability in a running container may originate from a Dockerfile written at some point in another part of the organization. Investigating it could take us away from the “Application A” team, where the vulnerability was found, to another team or to a single individual who built the image some time ago.

Identifying the root cause of a vulnerability is crucial to make sure you solve it once and not 100 times. It may be a single Dockerfile used across your organization that requires one package upgrade by a single person; other times it will need an upgrade of the base OS image used in your organization, and sometimes something else (which should be prioritized differently).
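To illustrate the "solve it once" idea, findings can be grouped by their likely common origin. The sketch below assumes each finding already carries a base_image field recording which image the container was built from; that field, the image names and the container names are all hypothetical.

```python
from collections import Counter

# Hypothetical findings; "base_image" is an assumed field recording
# which image each affected container was built from.
findings = [
    {"container": "app-a", "package": "openssl", "base_image": "org/base:1.0"},
    {"container": "app-b", "package": "openssl", "base_image": "org/base:1.0"},
    {"container": "app-c", "package": "openssl", "base_image": "org/base:1.0"},
    {"container": "app-d", "package": "libxml2", "base_image": "org/other:2.1"},
]


def root_causes(findings):
    """Count findings per (base_image, package), most common first."""
    counts = Counter((f["base_image"], f["package"]) for f in findings)
    return counts.most_common()


causes = root_causes(findings)
```

In this made-up data set, three of the four findings trace back to one package in one base image, so a single upgrade by the image owner would close all three.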

The fixing stage could result in package upgrades, OS patches or individual file fixes for code or IaC files.

It will inevitably involve some sort of testing, which could be impactful in terms of effort and must be considered and conducted by the various engineering teams in the organization.

Last but not least is the verification phase. An issue may have been solved, or a patch could have been deployed, but until it is verified again in the tool in which the finding originated, a vulnerability can’t be considered fully remediated. One approach to driving remediation is creating SLA levels backed by organization-wide policies. Enforcing them is where friction begins.
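A simple SLA check might look like the following sketch. The number of days per severity is an invented policy for illustration only, and the function names are hypothetical; the one idea taken from the text is that a finding only counts as done once it has been verified.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical SLA policy: days allowed to remediate, per severity.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def sla_due(opened: date, severity: str) -> date:
    """Date by which the finding must be verified as fixed."""
    return opened + timedelta(days=SLA_DAYS[severity])


def within_sla(opened: date, severity: str,
               verified: Optional[date], today: date) -> bool:
    """Use the verification date if there is one, otherwise today."""
    reference = verified if verified is not None else today
    return reference <= sla_due(opened, severity)


due = sla_due(date(2023, 12, 1), "critical")
```

An unverified critical finding opened on December 1 would fall out of SLA after December 8 under this example policy, even if a patch was deployed earlier.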

What I’ve found useful when communicating such policies to the technology organization is framing security issues as no different from other application bugs. Some are critical and others less so, and if they aren’t fixed, the result is a lower-quality product, and nobody likes that. I know it’s easier said than done, but it’s a start.

Another critical element is an open communication channel for questions, prioritization improvements, and of course exception management to link up to your risk management process and perhaps your GRC team, if it works separately.

Reporting 

Reporting is sometimes overlooked, not because it isn’t important, but because it is sometimes very difficult to do.

At a minimum, make sure you’re measuring the basics, such as MTTR (mean time to remediate), SLA compliance, and the number of vulnerabilities at the severities that matter to you, and that you can report to different parts of the organization in a way that matters to them.

For example:

  • Top management - Total vulnerabilities rated medium to critical that were opened in the past month, and how many are open in and out of SLA, by department. Some organizations will also need crown-jewel-specific reports, or even a quarterly risk report (with annual loss expectancy).
  • Department heads - Similar to top management, but divided according to team/applications.
  • Team/applications - Split by individual contributor: the number of findings assigned to them, SLA-related metrics, ticket statuses, and how many were closed in the last week (not month).
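As a small sketch of what such metrics could look like in code, the example below computes MTTR and a per-department count of open tickets in and out of SLA. The ticket fields (dept, opened, due, closed) and the department names are assumptions for illustration, not the export format of any particular tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ticket export; the field names are assumptions.
tickets = [
    {"dept": "payments", "opened": date(2023, 11, 1),
     "due": date(2023, 11, 15), "closed": date(2023, 11, 11)},
    {"dept": "payments", "opened": date(2023, 11, 5),
     "due": date(2023, 11, 19), "closed": date(2023, 11, 25)},
    {"dept": "platform", "opened": date(2023, 11, 20),
     "due": date(2023, 12, 4), "closed": None},  # still open
]


def mttr_days(tickets):
    """Mean days from open to close, over closed tickets only."""
    durations = [(t["closed"] - t["opened"]).days
                 for t in tickets if t["closed"] is not None]
    return sum(durations) / len(durations) if durations else None


def open_by_sla(tickets, today):
    """Per department, count open tickets in and out of SLA."""
    report = defaultdict(lambda: {"in_sla": 0, "out_of_sla": 0})
    for t in tickets:
        if t["closed"] is None:
            key = "in_sla" if today <= t["due"] else "out_of_sla"
            report[t["dept"]][key] += 1
    return dict(report)


mttr = mttr_days(tickets)
report = open_by_sla(tickets, today=date(2023, 12, 10))
```

The same grouping could be keyed by team or by individual contributor to produce the more granular views described above.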

Collecting this data from all systems and combining it into a single view is a challenging endeavor for many companies, which attempt various manual methods that are time-consuming and resource-intensive. Automation would be a more efficient alternative.

If you’re interested in knowing more about how to streamline, improve and optimize your vulnerability remediation process, check Opus out.