Fundamentally this is always going to be a messy problem. We’re trying to reconcile a set of (often ambiguously defined) ideal concepts with an inconsistent technical reality that does not have a direct representation of those concepts.
We’ll go through a few examples of how I like to approach this problem, and a few lessons learned along the way. What I hope you take away from this is just a way of thinking about the problem space, which hopefully will allow you to recognise challenges and opportunities when you see them.
How do we go from tooling output to risk reporting?
There are plenty of ways to go about this. Here we’ll go through part of my preferred, general approach, which I have used in various forms. It is a data-driven approach that lets you roll up raw, technical tooling output and reference data into defensible reporting on business security performance.
The approach is staged, and we handle and think about data at each stage differently. You can probably already spot from this diagram that this is largely going to be a data engineering exercise with a focus on data quality and transformation. The connection between raw tooling output and the benchmark reporting has to have a clear and traceable path.
We will start with cybersecurity frameworks - ultimately that is how enterprise security is benchmarked and measured. You might have opinions about the completeness or relevance of these frameworks, but if you want adoption, at some point a risk team will have to use the data you provide them, and raw technical output will not do the job. Enterprise risk functions use frameworks, and you have to work with that.
Data all the way down
Technical security reporting often starts with raw data. This data can be anything from logs and alerts to manual assessment results. The challenge is to transform this raw data into something meaningful that can be easily understood by stakeholders at all levels of an organization.
How do we present the overall security posture?
In my experience, this is the highest level at which this data-driven view remains useful. Attempting to aggregate data further will not provide any actionable data points. Higher level views will focus on aggregating a selected subset of indicators to provide a view of strategic priorities (e.g. “uplift access controls”) supporting an organisation-wide program.
One step down - posture framed by function
This is an example point-in-time scorecard. As it is based on periodic measurement, a dashboard can display trends for each aspect.
In this compliance view, each cell is a KCI showing compliance status for the control against an application.
Key Indicators can be positioned against most cybersecurity frameworks (CIS, etc.) or against newer versions of the same framework.
How do we construct a Key Indicator (KI)?
Working from the assumption that we’re measuring ACME’s company policy framed using CIS controls, we examine the policy and policy intent, isolate the individual requirements and then map each of these requirements to a technical measurement.
- A KI consists of one or more technical measurements based on requirements
- Each technical measurement is a numeric metric that can be calculated from raw data
- Each metric has a performance threshold (target) for pass/fail
- Each metric is expressed numerically as a fraction - a numerator and a denominator that generally represent resources over a scope
- This allows for aggregation of metrics across different scopes, e.g. across applications, business units, or the entire organisation
- Each metric has a timestamp for the measurement time
- Each metric has a timestamp for the oldest data point used in the calculation
- This allows us to track data freshness and decay
- Key Indicators are constructed the same way for risk (KRI) and control (KCI). You can add KPIs for performance in this space, but as KIs are fairly time-consuming to construct and maintain, be selective about where they add value. A minimal sketch of the metric structure behind a KI is shown after this list.
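As an illustration, here is a minimal sketch of how such metric records could be stored and aggregated. The table and column names (key_indicator_metrics, numerator, denominator, and so on) are assumptions for illustration, not a prescribed schema; the point is that each row carries the fraction, the target, and the two timestamps described above.

-- Hypothetical storage for the metric measurements backing a Key Indicator.
-- Table and column names are illustrative only.
CREATE TABLE key_indicator_metrics (
    ki_id             VARCHAR(50)  NOT NULL,  -- Key Indicator this metric belongs to (KRI or KCI)
    metric_id         VARCHAR(50)  NOT NULL,  -- e.g. 'coverage'
    scope_id          VARCHAR(50)  NOT NULL,  -- application, business unit, ...
    numerator         INT          NOT NULL,  -- resources meeting the requirement
    denominator       INT          NOT NULL,  -- resources in scope
    target            DECIMAL(5,4) NOT NULL,  -- pass/fail threshold, e.g. 0.9500
    measured_at       DATETIME     NOT NULL,  -- when the measurement was taken
    oldest_data_point DATETIME     NOT NULL   -- oldest raw data point used, for freshness/decay
);

-- Because metrics are stored as fractions, they can be aggregated across scopes
-- by summing numerators and denominators rather than averaging percentages.
SELECT
    metric_id,
    SUM(numerator) * 1.0 / NULLIF(SUM(denominator), 0) AS org_wide_ratio
FROM key_indicator_metrics
GROUP BY metric_id;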
How do we measure a Key Indicator?
Each Key Indicator (KI) is measured by evaluating the metrics that make up the KI. Each metric has a target, which is the percentage of the total scope that must be met for that metric to pass; the KI’s effectiveness is then derived from how many of its metrics pass.
The resulting measurement (see example) is a categorical metric value.
Example control effectiveness SQL measurements
WITH metric_evaluation AS (
    SELECT
        metric_id,
        CASE
            -- Evaluate the pass/fail of each metric
            -- (multiply by 1.0 to avoid integer division; NULLIF guards against an empty scope)
            WHEN (value_count * 1.0 / NULLIF(scope_count, 0)) > target THEN 1
            ELSE 0
        END AS is_above_target
    FROM vulnerability_metrics
    WHERE
        -- All the metrics available for this KCI
        metric_id IN (
            'coverage',
            'seconds_since_oldest_checkin',
            'seconds_since_oldest_policy',
            'seconds_since_last_scan'
        )
        -- Reporting period is the last week
        -- Start of previous week (Monday)
        AND report_date >= DATEADD(wk, DATEDIFF(wk, 7, GETDATE()), 0)
        -- Start of current week (Monday)
        AND report_date < DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), 0)
)
SELECT
    -- This implements the threshold percentages
    -- There are 4 metrics in this example, and we are setting the thresholds to:
    -- 4 of 4 = FE, 3 of 4 = SE, 2 of 4 = PE, otherwise NE
    CASE
        WHEN SUM(is_above_target) = 4 THEN 'FE'
        WHEN SUM(is_above_target) >= 3 THEN 'SE'
        WHEN SUM(is_above_target) >= 2 THEN 'PE'
        ELSE 'NE'
    END AS result
FROM metric_evaluation;
How do we roll up Key Indicators to categorical values?
We use the battery/distribution calculation for rollup, followed later by a reporting view that allows us to filter and display the data in a way that is meaningful to stakeholders.
Views can be filtered for combinations of control effectiveness values,
- FE - Fully Effective
- SE - Substantially Effective
- PE - Partially Effective
- NE - Not Effective
This example is a rollup of Fully Effective and Substantially Effective controls. A sketch of such a rollup query is shown below.
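As a sketch of the battery/distribution rollup, the query below counts, per application, how many KCIs land in each effectiveness category and what proportion are rated FE or SE. The kci_results table and its columns are assumptions for illustration, not part of any particular product.

-- Hypothetical rollup of per-control effectiveness results (table name illustrative).
-- kci_results is assumed to hold one row per application per KCI with its categorical result.
SELECT
    application_id,
    COUNT(*)                                                AS total_kcis,
    SUM(CASE WHEN result = 'FE' THEN 1 ELSE 0 END)          AS fully_effective,
    SUM(CASE WHEN result = 'SE' THEN 1 ELSE 0 END)          AS substantially_effective,
    SUM(CASE WHEN result = 'PE' THEN 1 ELSE 0 END)          AS partially_effective,
    SUM(CASE WHEN result = 'NE' THEN 1 ELSE 0 END)          AS not_effective,
    -- Example filtered view: proportion of controls rated FE or SE
    SUM(CASE WHEN result IN ('FE', 'SE') THEN 1 ELSE 0 END) * 1.0
        / NULLIF(COUNT(*), 0)                               AS fe_or_se_ratio
FROM kci_results
GROUP BY application_id;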
How do we measure control effectiveness?
- Control effectiveness is a categorical rating, not a numeric value
- Control effectiveness rating is a measure of the number of control requirements met
- Control requirements are not weighted
- My experience is that weightings tend to decrease stakeholders’ confidence in the data, so adding them tends to decrease the value of the reporting
- The threshold is the bucketing percentage of metrics that meet their targets, that is, the unweighted percentage of requirements met
Once we set the thresholds, we can determine effectiveness based on the number of metrics that are available. The sketch below generalises the earlier hard-coded example to any number of available metrics.
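This is a minimal sketch only, reusing the metric_evaluation CTE from the earlier example (or an equivalent table exposing one 0/1 is_above_target row per metric). The 100/75/50 percent thresholds are illustrative, not prescribed.

-- Generalised bucketing: rate a KCI on the unweighted percentage of its
-- available metrics that meet their targets, rather than a hard-coded count.
SELECT
    CASE
        WHEN SUM(is_above_target) * 1.0 / NULLIF(COUNT(*), 0) >= 1.00 THEN 'FE'  -- all requirements met
        WHEN SUM(is_above_target) * 1.0 / NULLIF(COUNT(*), 0) >= 0.75 THEN 'SE'
        WHEN SUM(is_above_target) * 1.0 / NULLIF(COUNT(*), 0) >= 0.50 THEN 'PE'
        ELSE 'NE'                                                                -- fewer than half met
    END AS result
FROM metric_evaluation;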
How do we determine confidence?
For this we use the Lowest Common Denominator (LCD) rule for determining the confidence rating based on metric (and metric input) maturity. This is a manually set rating and is a property of the metric. We use the LCD rule to discourage the inclusion of low maturity data in the Key Indicator (KI) reporting.
The LCD rule is applied to,
- The confidence rating of the metric, rolling up to KI level
- The data freshness rating of the metric, rolling up to KI level
Having selected the LCD rule, we can then apply it to the metrics that make up the Key Indicator to determine the confidence rating.
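As a sketch, assuming confidence ratings are stored as an ordinal value per metric (for example 1 = low up to 3 = high in a hypothetical metric_confidence table), the LCD rule reduces to taking the minimum rating across a KI’s metrics.

-- Hypothetical LCD rollup: a KI's confidence is the lowest confidence of any
-- of its metrics. Table/column names and the 1-3 ordinal scale are illustrative.
SELECT
    ki_id,
    CASE MIN(confidence_ordinal)   -- LCD rule: the lowest metric rating wins
        WHEN 3 THEN 'HIGH'
        WHEN 2 THEN 'MEDIUM'
        ELSE 'LOW'
    END AS ki_confidence
FROM metric_confidence
GROUP BY ki_id;

The same MIN-based rollup can be applied to the data freshness rating.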
Requirements
Here are some sample requirements that will likely hold true regardless of the specific solution you choose or encounter.
ID | Requirement | Description |
---|---|---|
REP-001 | Reporting traceability | All reporting must be traceable back to the raw data and transformations applied |
REP-002 | Reporting consistency and queryability | Reporting must be consistent and queryable across different views and levels of aggregation |
SEC-001 | Access controls | Access to reporting data must be segmented with sufficient granularity, managed through a role-based access control (RBAC) system that integrates with the enterprise access management system |
DAT-001 | Data quality | Data must be of sufficient, measurable and assured quality to support the reporting process |
DAT-002 | Data freshness | Data must be sufficiently recent to accurately report on the current state |
DAT-003 | Data changes | When action is taken based on reported data, it must be possible to trigger an update (e.g. rescan) of the tooling that generated the original measurement point(s) and the reporting must be updated to reflect the new data |