How do you measure the success of a phishing simulation? If you’re like many people in the cybersecurity community, you may find yourself focusing on statistics like click rates to determine whether you’ve run an effective anti-phishing campaign. After all, if subsequent simulations generate fewer clicks, surely that means your employees have understood the message you’re trying to get across about the dangers of phishing—right?
Unfortunately, this simply isn’t the case. The reason: click rates and other popular metrics used to evaluate security training don’t tell the whole story. Complicating matters, it’s not widely understood why these statistics can’t be relied upon. In this post, we’ll suggest a viable alternative method for measuring the results of a phishing simulation—one robust enough to provide a solid basis for evaluating your security awareness program.
The Challenge With Click Rate
In a previous post, we discussed the fallacy of click rates. And more recently, this article from the National Cyber Security Centre also suggests that click rate is a flawed statistic. Yet security training customers and vendors alike still assign a high level of importance to this metric. Here are the two main problems with this assumption:
- Click rates are easy to control. By definition, harder simulations generate higher click rates while simpler ones generate fewer clicks. If someone only wants to improve click rates in phishing simulations, all they need to do is start with a difficult simulation and lower the bar over time. Any seasoned cybersecurity professional can obtain such results, regardless of a simulation’s actual effectiveness.
- Click rates only measure a point in time. Each time we measure a click rate, we’re looking at different people encountering different simulations. Comparing click rates from month to month or from simulation to simulation may look tidy on a bar chart, but it doesn’t allow us to understand employees’ true learning curve. If the click rate is 30% in January and 20% in February, we might assume we improved. However, we cannot know whether the same employees who fell prey to simulations in January also did so in February—or whether these are different employees altogether. When combined with changes in simulations and variation between campaigns, these data points become even more meaningless.
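To make the point-in-time problem concrete, here is a minimal sketch with made-up data. The employee names and results are illustrative assumptions, not real figures: the aggregate click rate improves from 30% to 20%, yet the same risky employees keep clicking.

```python
# Hypothetical monthly results: employee -> clicked the simulation? (illustrative data)
january = {"ana": True, "ben": True, "cam": True, "dee": False, "eli": False,
           "fay": False, "gus": False, "hal": False, "ivy": False, "jon": False}
february = {"ana": True, "ben": True, "cam": False, "dee": False, "eli": False,
            "fay": False, "gus": False, "hal": False, "ivy": False, "jon": False}

def click_rate(results):
    """Fraction of employees who clicked in a given simulation."""
    return sum(results.values()) / len(results)

print(click_rate(january))   # 0.3
print(click_rate(february))  # 0.2

# The headline rate improved, but the persistent-risk group did not shrink:
# the same two employees clicked in both months.
repeat_clickers = {e for e in january if january[e] and february[e]}
print(sorted(repeat_clickers))  # ['ana', 'ben']
```

The aggregate numbers alone would hide exactly the insight that matters: who keeps falling for the lure.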
The True Measure of Success
We don’t believe this is an inherent issue with phishing simulations—it’s solvable. We use a variety of proprietary statistics to measure simulation results, and we’ll discuss two of them here, in the hope that these examples will help other organizations to improve their metrics:
Serial Clickers Rate: Serial clickers will usually click on almost any simulation regardless of its complexity. They are a fairly small group, but one that poses the highest risk to the organization. Measuring their behavior provides a solid overall measure of how the most susceptible employees behave over time.
To create this measure, we characterize each employee by their tendency to click on simulations. We analyze statistics that combine clicking patterns into a single score, then measure the percentage of the organization who qualify as serial clickers. This is quite complex. If you’re only starting your simulation program, we recommend beginning with each employee’s overall percentage of clicks, then setting the threshold beyond which that employee should be classified as a serial clicker. If you’re in at least your second year of running simulations, you may want to look solely at the percentage of clicks in the last year.
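The starter approach described above can be sketched in a few lines. This is a simplified illustration, not the proprietary scoring: the 0.75 threshold and the sample data are assumptions you would tune for your own organization.

```python
# Sketch: classify serial clickers by overall click percentage, then report
# what share of the organization they represent. Threshold is an assumption.

def serial_clicker_rate(history, threshold=0.75):
    """history maps employee -> chronological list of bools (True = clicked).
    Returns (serials, org_rate): the set of serial clickers and the share
    of the organization they represent."""
    serials = set()
    for employee, clicks in history.items():
        if clicks and sum(clicks) / len(clicks) >= threshold:
            serials.add(employee)
    return serials, len(serials) / len(history)

# Illustrative click histories across four simulations.
history = {
    "ana": [True, True, True, True],     # clicks nearly everything
    "ben": [True, False, False, True],   # occasional clicker
    "cam": [False, False, False, False], # never clicks
    "dee": [False, True, False, False],
}
serials, rate = serial_clicker_rate(history)
print(serials, rate)  # {'ana'} 0.25
```

Tracking how that final percentage moves across campaigns gives a single trend line for the highest-risk group, which is the point of the metric.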
This is an effective measure of each employee, combined into an aggregate score that allows our customers to see how they’ve progressed over time with a single metric that contains multiple data points.
Employee Resilience Score: A major question involves how to measure gradual changes in employees’ learning curves: do they improve or worsen, and how should we view the differences over time? Employee turnover, downsizing, and growth are changes that any company faces. After a couple of simulations, it becomes evident that the aggregate click rate does not distinguish between employees. So we can’t really say what a specific click rate means without drilling into the data, and doing so is time-consuming and requires some background in statistics.
It is indeed a complex challenge, yet we have established a measure that solves it. With it, we calculate the number of simulations an employee successfully passes between two failures. The higher the number (i.e., the more successes between two failures), the better the employee is at identifying phishing attacks. This measure reflects the behavior of employees who click in two or more simulations, whom we call frequent clickers. They represent the majority of employees, and with this measure, we’re able to show their progress over time.
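The streak calculation behind this idea can be sketched as follows. This is our own simplified illustration of “successes between two failures,” assuming a chronological pass/fail history per employee; how the streaks are aggregated into a final score is left open here.

```python
# Sketch: for a chronological list of simulation results (True = passed),
# collect the lengths of the success streaks between consecutive failures.

def successes_between_failures(results):
    """Returns the list of success-streak lengths bounded by two failures.
    Successes before the first failure are excluded, since they are not
    bracketed by a failure on both sides."""
    streaks, current = [], None
    for passed in results:
        if passed:
            if current is not None:
                current += 1
        else:
            # A failure closes the streak that began at the previous failure.
            if current is not None:
                streaks.append(current)
            current = 0
    return streaks

# Pass, fail, pass, pass, pass, fail -> 3 successes between the two failures.
print(successes_between_failures([True, False, True, True, True, False]))  # [3]
```

Longer streaks over time indicate an employee who is getting harder to fool; shrinking streaks flag someone who may need additional training.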
The Employee Resilience Score reflects employees’ ability to withstand ongoing hacker attempts over time. Its value lies in tracking resilience to phishing scams as a trend, and it also gives a sense of how difficult it may be for an attacker to get past an employee’s psychological defenses.
A Better Way Forward
There’s been much debate about the insufficiency of click rate measurements, yet little has been done to provide customers with alternative metrics. In this post, we’ve suggested the use of two of our metrics to improve simulation performance management. We hope that these will assist the security community in better addressing phishing risks.
One final note on phishing simulations and the metrics used to measure them—there’s a famous saying (commonly attributed to Peter Drucker) that “Culture eats strategy for breakfast.” In other words, if we want to change employee behavior, we must also change corporate culture.
How can we achieve this? By seeking out metrics that enable us to look beyond the here and now and into the patterns of employees’ behavior over time.