As a researcher in software security, whether you want to validate a novel method you are working on or just to get a feel for what software vulnerabilities look like in the wild, sometimes you want to take a closer look at vulnerabilities in real-world, production-level code. The question is then how to find the actual source code corresponding to a reported software vulnerability. In this post, I’ll show you how to do this for (some) vulnerabilities in the Linux kernel, one of the largest open source code bases out there.

To get information about publicly known security vulnerabilities in the Linux kernel, we’ll use the CVE Details website, a web interface to the data provided in the XML feeds of the National Vulnerability Database (NVD). CVE Details provides access to CVEs (unique, common identifiers for publicly known information-security vulnerabilities in publicly released software packages), including the ability to browse them by vendor, product, date, and type. As an example, you can first look up all CVEs reported for the Linux kernel and then limit the search results to the CVEs reported in 2015. In the remainder of this post, we’ll look at CVE-2015-4001 and try to locate the exact source code corresponding to this vulnerability.

Every CVE entry contains several standardised fields. In the case of CVE-2015-4001, the textual description of the vulnerability says: “Integer signedness error in the oz_hcd_get_desc_cnf function in drivers/staging/ozwpan/ozhcd.c in the OZWPAN driver in the Linux kernel through 4.0.5 allows remote attackers to cause a denial of service (system crash) or possibly execute arbitrary code via a crafted packet.”

In the references field, which typically holds a list of URLs and other information such as vendor advisory numbers for that CVE, CVE-2015-4001 contains a link to the Git commit b1bb5b4 on git.kernel.org. In addition to a diffstat, this commit contains a description of the vulnerability and the corresponding PoC:

ozwpan: Use unsigned ints to prevent heap overflow

Using signed integers, the subtraction between required_size and offset
could wind up being negative, resulting in a memcpy into a heap buffer
with a negative length, resulting in huge amounts of network-supplied
data being copied into the heap, which could potentially lead to remote
code execution.. This is remotely triggerable with a magic packet.
A PoC which obtains DoS follows below. It requires the ozprotocol.h file
from this module.
...

The diffstat for commit b1bb5b4 lists 2 changed files, 6 insertions, and 6 deletions; the diff itself is shown below. You can see that the parameters status, length, offset, and total_size of the function oz_hcd_get_desc_cnf, as well as its local variables copy_len and required_size, have been changed from int to unsigned types: status and length to u8, offset and total_size to u16, and copy_len and required_size to unsigned int. These changes in drivers/staging/ozwpan/ozhcd.c are consistent with CVE-2015-4001’s textual description, which states that there is an “integer signedness error in the oz_hcd_get_desc_cnf function in drivers/staging/ozwpan/ozhcd.c”.

diff --git a/drivers/staging/ozwpan/ozhcd.c b/drivers/staging/ozwpan/ozhcd.c
index 5ff4716..784b5ec 100644
--- a/drivers/staging/ozwpan/ozhcd.c
+++ b/drivers/staging/ozwpan/ozhcd.c
@@ -746,8 +746,8 @@ void oz_hcd_pd_reset(void *hpd, void *hport)
 /*
  * Context: softirq
  */
-void oz_hcd_get_desc_cnf(void *hport, u8 req_id, int status, const u8 *desc,
-			int length, int offset, int total_size)
+void oz_hcd_get_desc_cnf(void *hport, u8 req_id, u8 status, const u8 *desc,
+			u8 length, u16 offset, u16 total_size)
 {
 	struct oz_port *port = hport;
 	struct urb *urb;
@@ -759,8 +759,8 @@ void oz_hcd_get_desc_cnf(void *hport, u8 req_id, int status, const u8 *desc,
 	if (!urb)
 		return;
 	if (status == 0) {
-		int copy_len;
-		int required_size = urb->transfer_buffer_length;
+		unsigned int copy_len;
+		unsigned int required_size = urb->transfer_buffer_length;
 
 		if (required_size > total_size)
 			required_size = total_size;
diff --git a/drivers/staging/ozwpan/ozusbif.h b/drivers/staging/ozwpan/ozusbif.h
index 4249fa3..d2a6085 100644
--- a/drivers/staging/ozwpan/ozusbif.h
+++ b/drivers/staging/ozwpan/ozusbif.h
@@ -29,8 +29,8 @@ void oz_usb_request_heartbeat(void *hpd);
 
 /* Confirmation functions.
  */
-void oz_hcd_get_desc_cnf(void *hport, u8 req_id, int status,
-	const u8 *desc, int length, int offset, int total_size);
+void oz_hcd_get_desc_cnf(void *hport, u8 req_id, u8 status,
+	const u8 *desc, u8 length, u16 offset, u16 total_size);
 void oz_hcd_control_cnf(void *hport, u8 req_id, u8 rcode,
 	const u8 *data, int data_len);

Moreover, in the original version of the drivers/staging/ozwpan/ozhcd.c file (i.e., in b1bb5b4’s parent commit d114b9f), the oz_hcd_get_desc_cnf function is defined as shown below. That source code is consistent with the description of the vulnerability in b1bb5b4’s commit message, which says that “using signed integers, the subtraction between required_size and offset could wind up being negative, resulting in a memcpy into a heap buffer with a negative length, resulting in huge amounts of network-supplied data being copied into the heap, which could potentially lead to remote code execution”. In the code below, the variable urb is looked up from the parameters hport and req_id of the oz_hcd_get_desc_cnf function, so the local variable required_size, which is initialised from urb->transfer_buffer_length and then capped at total_size, depends on the input to oz_hcd_get_desc_cnf. Because the local variable copy_len and the parameter offset are also of type int, the assignment copy_len = required_size-offset yields a negative value of copy_len whenever required_size is smaller than offset. The subsequent check length <= copy_len does not catch this case, because a non-negative length is never less than or equal to a negative copy_len. The result is a call memcpy(urb->transfer_buffer+offset, desc, copy_len) with a negative length, which memcpy interprets as a very large unsigned size, just as the commit message describes.

void oz_hcd_get_desc_cnf(void *hport, u8 req_id, int status, const u8 *desc,
			int length, int offset, int total_size)
{
	struct oz_port *port = hport;
	struct urb *urb;
	...
	urb = oz_find_urb_by_id(port, 0, req_id);
	if (!urb)
		return;
	if (status == 0) {
		int copy_len;
		int required_size = urb->transfer_buffer_length;

		if (required_size > total_size)
			required_size = total_size;
		copy_len = required_size-offset;
		if (length <= copy_len)
			copy_len = length;
		memcpy(urb->transfer_buffer+offset, desc, copy_len);
		...
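
To make the effect of the type change concrete, here is a minimal user-space sketch of the length arithmetic alone, before and after the patch. The two helper functions and their names are hypothetical and are not part of the kernel; they take the buffer length as a plain argument instead of reading it from a urb, and they only compute the byte count that would be handed to memcpy, without copying anything.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers that mirror only the length arithmetic of
 * oz_hcd_get_desc_cnf. They are not kernel code and copy nothing.
 */

/* Pre-patch arithmetic: every quantity is a signed int. */
size_t copy_len_before_patch(int buffer_len, int length,
			     int offset, int total_size)
{
	int copy_len;
	int required_size = buffer_len;

	if (required_size > total_size)
		required_size = total_size;
	copy_len = required_size - offset;	/* negative if offset is larger */
	if (length <= copy_len)			/* false for a negative copy_len */
		copy_len = length;
	/* memcpy takes a size_t, so a negative int turns into a huge count. */
	return (size_t)copy_len;
}

/* Post-patch arithmetic: the types introduced by commit b1bb5b4. */
size_t copy_len_after_patch(unsigned int buffer_len, uint8_t length,
			    uint16_t offset, uint16_t total_size)
{
	unsigned int copy_len;
	unsigned int required_size = buffer_len;

	if (required_size > total_size)
		required_size = total_size;
	copy_len = required_size - offset;	/* wraps around, never negative */
	if (length <= copy_len)			/* true after a wrap-around */
		copy_len = length;
	return copy_len;
}
```

With a 64-byte buffer, length 8, total_size 16, and an offset of 100, copy_len_before_patch returns (size_t)-84, i.e. a count close to the maximum value of size_t, whereas copy_len_after_patch returns 8, because the wrapped-around unsigned value is clamped to length. For a benign offset such as 4, both return 8. Note that this sketch models only the length calculation; it says nothing about whether the destination urb->transfer_buffer+offset itself is in range.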

Given the above observations, the link to the Git commit b1bb5b4 in CVE-2015-4001’s references field does point us to the actual source code corresponding to that vulnerability. If you browse through the CVEs for the Linux kernel, especially the older ones from 2015 and earlier, you’ll find that many of them, though by no means all, contain a link to a (patch) commit on git.kernel.org or github.com that points you exactly to the source code corresponding to that CVE.