The recent announcement of the OWASP Top 10 2017 RC1 has many people asking questions about what exactly this is the Top 10 of, how did it get here, and does the data really support the proposed Top 10. I was curious to see what I could figure out, so I thought I would take a look at the data that was publicly released.
Disclaimer: I do not have insight into how each data set was collected, so I will just have to take it all at face value.
Data Call Results
Looking across the data, a decent collection of industries is represented (though the distribution is unknown).
The industries are spread across North/South America, Europe, AsiaPac, and the Middle East (again, the distribution is unknown).
The data sources vary a bit with a number of consulting organizations and product vendors as the primary input.
It looks like a decent distribution of tooling, with commercial dynamic assessment tools (DAST) being the most common.
The traditional "enterprise" languages like Java and .NET are most representative of the languages assessed. The tooling vendors dominate the numbers by the nature of tools typically scaling to test many more applications than the humans would.
(The numbers here represent the number of webapps taken from the percentages and total number of webapps provided in by the contributors.)
Looking at the vulnerabilities, there are an impressive amount at first blush as the total number of vulnerabilities reported was 2.3 million! That’s a lot of vulnerabilities within a two year span (2014-2015). However, if you look closely, Veracode contributed 2.2m, or approximately 96% of the total number of vulnerabilities reported.
Veracode contributed 99% of the XSS, 95-96% of the main three injection vulnerabilities, and 99% of the Unchecked (Unvalidated, Open, etc.) Redirects and XML External Entity (XXE) findings. These numbers equate to on average 4 SQLi and 43 XSS per application that Veracode assessed.
This brings up an interesting point related to metrics from a tool versus manual/hybrid approach. Tools will almost always report every instance of a vulnerability as a unique finding. When humans review, we typically collapse all instances of the same type of finding into one. This seems to be based on a difference of expectations (and marketing): Tools are expected to find all instances of a vulnerability where as humans typically will find something, look for a few other instances, and count it as one. Also, there are some categories of vulnerabilities that tools are pretty good at finding (evidenced above), many times more so than humans, but there are other categories that tools are unable to find because they require context. Humans can understand context within the application, tools struggle with this issue.
Another interesting tidbit is that Sonatype submitted vulnerability data, but was on the small dataset and not included due to only having data for a single category. For their 1,500 Java applications, they found 30k vulnerable libraries, or 20 vulnerable libraries per application. These kind of numbers lend ammunition to the argument that we have a problem with application inventory/dependency management (not that anyone who works with large numbers of applications would dispute this).
There is data (to some extent) for each of the Top 10 categories, with the exception of the new ones (A7 & A10). There was a similar case with the Top 10 2013 list when "A9 - Using Known Vulnerable Components" was added. Without data, we have to look a little more at where the other entries may have come from. The only references I could find to A7 and A9 were recommendations from Contrast to add them to the Top 10. While I don't disagree that these are issues, I'm trying to determine the justification behind making them the only two new entries in this Top 10. Anti-automation was brought up several times in the data call and mailing list, but it still seems to be missing from A7 as the focus is not on that part of "Insufficient Attack Protection".
(List of recommendations for additions to the T10 from both large and small dataset contributors.)
I skimmed through the Top 10 mailing list archives to see what discussions were had about what to add or removed from the Top 10 for this iteration. I expected to spend a couple hours doing so and found that it took 10 minutes. There are two or three emails from June 2016 - April 2017 with relatively basic questions about including SSRF in the Injection category or Mass Assignment in Authorization, but nothing else. If there was a discussion about the additions/modifications to the list, it wasn't part of the mailing list. I thought perhaps maybe there had been discussions on the "project-top-10" Slack channel, but there wasn't. The channel was created on July 10, 2016, and the last post was on July 10, 2016. Maybe I'm missing something, but I can't find any substantial community/project level discussion about what to do with this iteration of the OWASP Top 10 prior to the announcement of RC1.
What is the OWASP Top 10?
According to excerpts from the RC1 document:
"The goal of the Top 10 project is to raise awareness about application security by identifying some of the most critical risks facing organizations." - Page 3
"The 2010 version was revamped to prioritize by risk, not just prevalence, and this pattern was continued in 2013 and this latest 2017 release." - Page 3
"The primary aim of the OWASP Top 10 is to educate developers, designers, architects, managers, and organizations about the consequences of the most important web application security weaknesses." - Page 4
"Welcome to the OWASP Top 10 2017! This major update adds two new vulnerability categories for the first time:" - Page 4
"NOTE: The T10 is organized around major risk areas, and they are not intended to be airtight, non-overlapping, or a strict taxonomy. Some of them are organized around the attacker, some the vulnerability, some the defense, and some the asset. Organizations should consider establishing initiatives to stamp out these issues." - Page 5
So basically the Top 10 is a prioritized collection of "issues", "risks", "weaknesses" or "vulnerabilities" of varied types. Not being clear on what exactly this is a list of contributes to some of the confusion.
There are lots of discussions and opinions on what the Top 10 should list. It's pretty confusing right now to explain to developers that, "yes, there is both a XSS and Injection category in the Top 10" and "yes, we know that XSS is a type of injection" or "yes, A10 is pretty much all the others in an API form." That, and the difference in scope or scale of the varied categories makes it difficult to use. One of the things that is dearly needed is a standard taxonomy for vulnerability classifications. A few attempts have been made in the past at OWASP, but right now the only one I really know about is the CWE and it has it’s own issues as well.
Where Do We Go From Here?
Should we have a list of implementation risks and design/architectural risks? Overall program level risks? Who is the target audience for this?
Unfortunately, we have more questions than answers. However, these questions can be answered with focused effort and time investment. I'm personally in favor of Top 10 lists for program, architecture, implementation, and operations. I think each of these areas have unique audiences and challenges. The metrics collected for the Top 10 2017 represent what was found by either tools or time-boxed humans. It’s a subset of vulnerabilities that are typically found, but are probably not representative of what is actually out there or the bigger risks that are faced. I would like to see an attempt to build a questionnaire for different roles in software development, asking for experienced opinions on the common and important issues that are typically encountered.
For something as important and established in the industry as the OWASP Top 10, I expected a long and vigorous discussion leading up to the release candidate. For whatever reason, I can't find one. Historically, the Top 10 has undergone small tweaks after the release candidate, but I haven’t found an instance where one of the categories changed. For those of us that disagree with where it is at, maybe we need to invest the time to participate. The OWASP Top 10 is one of the most well known and influential projects that we can openly contribute to.