Using machine learning trained on data from more than two dozen sources, a team of university researchers has created a model for predicting which vulnerabilities will likely result in a functional exploit, a potentially valuable tool that could help companies better decide which software flaws to prioritize.
The model, called Expected Exploitability, can catch 60% of the vulnerabilities that will have functional exploits, with a prediction accuracy (or “precision,” to use classification terminology) of 86%. A key to the research is allowing certain metrics to change over time, because not all relevant information is available when a vulnerability is disclosed. Using later events allowed the researchers to hone the prediction’s accuracy.
Better exploitability prediction lets companies reduce the number of vulnerabilities deemed critical for patching, but the metric has other uses as well, says Tudor Dumitraș, an associate professor of electrical and computer engineering at the University of Maryland at College Park and one of the authors of the research paper published last week at the USENIX Security Conference.
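To make the two figures concrete, the short Python sketch below computes recall (the share of eventually exploitable vulnerabilities the model catches) and precision (the share of flagged vulnerabilities that really do get a functional exploit). The counts are hypothetical, chosen only so the ratios land near the reported 60% and 86%; they are not taken from the paper.

```python
# Hypothetical confusion counts, picked to illustrate the reported figures.
true_positives = 600   # flagged, and a functional exploit later appeared
false_negatives = 400  # not flagged, but a functional exploit later appeared
false_positives = 98   # flagged, but no functional exploit appeared


def recall(tp: int, fn: int) -> float:
    """Share of exploitable vulnerabilities that the model caught."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of flagged vulnerabilities that were actually exploitable."""
    return tp / (tp + fp)


model_recall = recall(true_positives, false_negatives)
model_precision = precision(true_positives, false_positives)

print(f"recall:    {model_recall:.0%}")
print(f"precision: {model_precision:.0%}")
```

In this illustration, a team that patches only the flagged vulnerabilities would spend roughly one patch in seven on a flaw that never gets a functional exploit, and would still miss 40% of the flaws that do.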
“Exploitability prediction is not just relevant to companies that want to prioritize patching, but also to insurance companies that are trying to calculate risk levels and to developers, because this is maybe a step toward understanding what makes a vulnerability exploitable,” he says.
The University of Maryland at College Park and Arizona State University research is the latest attempt to give companies additional information on which vulnerabilities could be, or are likely to be, exploited. In 2018, researchers from Arizona State University and USC Information Sciences Institute focused on parsing Dark Web discussions to find phrases and features that could be used to predict the likelihood that a vulnerability would be, or had been, exploited.
And in 2019, researchers from data-research firm Cyentia Institute, the RAND Corp., and Virginia Tech presented a model that improved predictions of which vulnerabilities would be exploited by attackers.
Many of the systems rely on manual processes by analysts and researchers, but the Expected Exploitability metric can be completely automated, says Jay Jacobs, chief data scientist and co-founder at Cyentia Institute.
“This research is different because it focuses on picking up all of the subtle clues automatically, consistently and without relying on the time and opinions of an analyst,” he says. “[T]his is all done in real time and at scale. It can easily keep up and evolve with the flood of vulnerabilities being disclosed and published daily.”
Not all the features were available at the time of disclosure, so the model also had to take time into account and overcome the challenge of so-called “label noise.” When machine-learning algorithms use a static point in time to classify patterns (into, say, exploitable and nonexploitable), the classification can undermine the effectiveness of the algorithm if the label is later found to be incorrect.
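A small Python sketch can illustrate how a static snapshot produces label noise. The records, identifiers, and dates below are invented for illustration, and the helper names are hypothetical; this is not the researchers' pipeline, only a minimal picture of labels that flip as later evidence arrives.

```python
from datetime import date

# Hypothetical records: (id, disclosure date, date a functional exploit
# became known, or None if none is known).
records = [
    ("VULN-A", date(2020, 1, 10), date(2020, 1, 15)),
    ("VULN-B", date(2020, 1, 12), date(2020, 6, 1)),
    ("VULN-C", date(2020, 2, 1), None),
    ("VULN-D", date(2020, 2, 3), date(2021, 3, 20)),
]


def snapshot_labels(rows, as_of):
    """Label each disclosed vulnerability using only evidence up to as_of."""
    return {
        vuln_id: exploit_date is not None and exploit_date <= as_of
        for vuln_id, disclosed, exploit_date in rows
        if disclosed <= as_of
    }


def flipped_labels(rows, snapshot, later):
    """Return IDs whose snapshot label disagrees with the later label."""
    early = snapshot_labels(rows, snapshot)
    late = snapshot_labels(rows, later)
    return [vuln_id for vuln_id, label in early.items() if label != late[vuln_id]]


early_view = snapshot_labels(records, date(2020, 3, 1))
flipped = flipped_labels(records, date(2020, 3, 1), date(2021, 12, 31))
noise_rate = len(flipped) / len(early_view)

print(flipped)
print(f"{noise_rate:.0%} of snapshot labels later proved wrong")
```

In this toy data, half of the labels assigned at the early snapshot are wrong in hindsight, which is the kind of error a model trained on a single point in time would absorb.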
PoCs: Parsing Security Bugs for Exploitability
The researchers used information on nearly 103,000 vulnerabilities and compared it with 48,709 proof-of-concept (PoC) exploits collected from three public repositories (ExploitDB, BugTraq, and Vulners), which represented exploits for 21,849 distinct vulnerabilities. The researchers also mined social-media discussions for keywords and tokens (phrases of one or more words) and created a data set of known exploits.
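For a sense of scale, the figures above can be turned into two rough ratios. The sketch below treats "nearly 103,000" as exactly 103,000, which is an assumption, so the results are approximations rather than numbers from the paper.

```python
# Figures quoted in the article; the total is rounded ("nearly 103,000").
total_vulnerabilities = 103_000
poc_exploits = 48_709
vulnerabilities_with_poc = 21_849

# Share of all vulnerabilities that have at least one public PoC.
poc_coverage = vulnerabilities_with_poc / total_vulnerabilities

# Average number of PoCs per vulnerability that has any PoC.
pocs_per_covered_vuln = poc_exploits / vulnerabilities_with_poc

print(f"PoC coverage: {poc_coverage:.1%}")
print(f"PoCs per covered vulnerability: {pocs_per_covered_vuln:.2f}")
```

Under that assumption, roughly one vulnerability in five has a public PoC, and each of those has a little over two PoCs on average.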
However, PoCs are not always a good indicator of whether a vulnerability is exploitable, the researchers said in the paper.
“PoCs are designed to trigger the vulnerability by crashing or hanging the target application and often are not directly weaponizable,” the researchers stated. “[W]e observe that this leads to many false positives for predicting functional exploits. In contrast, we discover that certain PoC characteristics, such as the code complexity, are good predictors, because triggering a vulnerability is a necessary step for every exploit, making these features causally connected to the difficulty of creating functional exploits.”
Dumitraș notes that predicting whether a vulnerability will actually be exploited is harder still, because the researchers would have to model attackers’ motives.
“If a vulnerability is exploited in the wild, then we know there is a functional exploit there, but we know other cases where there is a functional exploit, but there is no known instance of exploitation in the wild,” he says. “Vulnerabilities that have a functional exploit are dangerous and so they should be prioritized for patching.”
Research published by Kenna Security — now owned by Cisco — and the Cyentia Institute found that the existence of public exploit code led to a sevenfold increase in the likelihood that an exploit would be used in the wild.
Yet prioritizing patching is not the only way the exploit prediction can benefit businesses. Cyber-insurance carriers could use exploit prediction as a way to determine the potential risk for policy holders. In addition, the model could be used to analyze software in development to find patterns that might indicate whether the software is easier, or harder, to exploit, Dumitraș says.