When the January 2019 updates for Windows were released, the public was alarmed by news of the critical vulnerability CVE-2019-0547 in DHCP clients. Two things added to the concern: a high CVSS score, and the fact that Microsoft did not publish an Exploitability Index assessment right away, which made it harder for users to decide whether they needed to update their systems immediately. Some publications even speculated that the absence of the Exploitability Index pointed to a usable exploit appearing in the near future.
Solutions such as MaxPatrol can identify which computers on a network are vulnerable to certain attacks. Other solutions detect such attacks. For these solutions to work, someone has to write both the rules for identifying vulnerabilities in products and the rules for detecting attacks on those products. That, in turn, is possible only if we work out the vector, method, and conditions of exploitation for each vulnerability: in other words, all the details and nuances of exploitation. This requires a much deeper and fuller understanding than can usually be found in descriptions on vendors’ sites or in CVE, for example:
The reason for the vulnerability is that the operating system incorrectly handles objects in memory.
So, to update our products with rules for detecting attacks targeting the newly discovered vulnerability in DHCP and rules for identifying affected devices, we needed to dive into all the details. With binary vulnerabilities, one can often get to the underlying fault by patch diffing: comparing binaries before and after an update to identify the changes that a specific patch made to the code of an app, a library, or an operating system kernel. But Step 1 is always reconnaissance.
Note: To go directly to the vulnerability description without reading the DHCP concepts it is based on, skip ahead to the section titled “DecodeDomainSearchListData function”.
Go to a search engine and go through everything currently known about the vulnerability. This time there’s not much detail, and most of it is information recycled from the original publication on the MSRC site. This situation is typical for errors found by Microsoft during an internal audit.
From the publication, we find that we are dealing with a memory corruption vulnerability contained in both client and server systems running on Windows 10 version 1803 and that it manifests when an attacker sends specially crafted responses to the DHCP client. A couple of days later, the page also listed Exploitability Index ratings:
As we can see, MSRC gave a rating of 2 — Exploitation Less Likely. This means the error is very likely either non-exploitable, or exploiting it is so difficult that it would require too much effort. Admittedly, Microsoft does not have a habit of lowballing such scores. This is partly due to reputational risks, as well as the relative independence of the response center within the company. So let’s assume that if exploitation is indicated as unlikely, that is probably true. We could finish the analysis then and there. But it’s always a good idea to double-check and at least see what exactly the vulnerability was. While vulnerabilities may be diverse, they also tend to reoccur and pop up in other places.
On the same site we download the patch (security update) provided as an .msu archive, unpack it, and look for the files most likely to be related to client-side processing of DHCP responses. Lately this has become more difficult. Updates are now provided not as separate packages fixing specific errors, but as a single package containing all monthly fixes. This increases the number of unrelated changes that we must wade through to find what truly interests us.
In the plethora of files, our search turns up several libraries matching the filter, and we compare these with their versions on an unpatched system. The dhcpcore.dll library looks the most promising of all. Meanwhile BinDiff shows minimal changes:
In fact, more or less significant changes were made to only one function: DecodeDomainSearchListData. If you are familiar with the DHCP protocol and its rarely used features, you already have an idea of what list that function handles. If not, we move to Step 2: reviewing the protocol.
DHCP and its options
DHCP (RFC 2131) is an extensible protocol whose extensibility is implemented by means of the options field. Each option is described by a unique tag (number, identifier), the size of the data contained in the option, and the data itself. This practice is typical for network protocols, and one of the options “implanted” in the protocol is the Domain Search Option, described in RFC 3397. It allows a DHCP server to set standard domain name endings on clients. These are used as DNS suffixes for connections configured this way.
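To make the tag/size/data layout concrete, here is a minimal sketch of walking an options field in Python. The function name `parse_dhcp_options` and the sample bytes are made up for illustration; the handling of the Pad (0) and End (255) options follows RFC 2132, not anything taken from the Windows client.

```python
def parse_dhcp_options(raw):
    """Split a raw DHCP options field into (tag, data) pairs."""
    options = []
    i = 0
    while i < len(raw):
        tag = raw[i]
        if tag == 0:          # Pad option: a single byte with no size field
            i += 1
            continue
        if tag == 255:        # End option: nothing meaningful follows
            break
        if i + 1 >= len(raw):
            raise ValueError("truncated option header")
        size = raw[i + 1]
        data = raw[i + 2:i + 2 + size]
        if len(data) != size:
            raise ValueError("truncated option data")
        options.append((tag, data))
        i += 2 + size
    return options


# Illustrative field: option 53 (message type), option 6 (DNS server), End.
sample = bytes([53, 1, 5, 6, 4, 192, 168, 0, 1, 255])
parsed = parse_dhcp_options(sample)
print(parsed)
```

A parser like this only separates options from each other; interpreting the data of each option is left to a handler for that specific tag.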
For example, let’s say that on our client we have set the following name endings:
Then, in any attempt to determine address by domain name, these endings will be plugged in to DNS requests one by one, until a match is found. For instance, if the user types ru in the browser address bar, DNS requests will be formed first for ru.microsoft.com and then for ru.wikipedia.org:
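As a rough illustration of how a suffix list is applied, the sketch below builds the candidate names in order. It is a simplification: a real resolver has additional rules (for example, for names that already contain dots), and the helper name `candidate_names` is hypothetical.

```python
def candidate_names(name, suffixes):
    """Return the fully qualified names to try, in search-list order."""
    return [f"{name}.{suffix}" for suffix in suffixes]


candidates = candidate_names("ru", ["microsoft.com", "wikipedia.org"])
print(candidates)
```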
In fact, modern browsers are too smart, so they react to names similar to FQDN by redirecting to a search engine. So we will later provide the output of less “thoughtful” utilities:
The reader might think this is the essence of the vulnerability. After all, the ability to alter DNS suffixes through a DHCP server, when any device on the network can pose as one, is in itself a threat to clients that request network parameters over DHCP. But that is not the vulnerability. As evident from the RFC, this is considered legitimate, documented behavior. A DHCP server is, in effect, a trusted component able to affect the devices that connect to it.
Domain Search option
The Domain Search Option number is 0x77 (119). As with all other options, it is encoded with a single-byte tag holding the option number. And as with most options, the tag is followed by a single byte giving the size of the data that follows. A DHCP message can contain more than one copy of the option. In this case, the data from all copies is concatenated in the same order as in the message.
In the example taken from RFC 3397 the data is divided into three sections of 9 bytes each. As seen from the picture, subdomain names in the full domain name are coded with a single-byte name length, followed by the name itself. The full domain name code ends in a null byte (null size of the subdomain name).
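The concatenation rule can be sketched in a few lines of Python. The three fragments below reproduce the 9-byte sections of the RFC 3397 example; the function name `join_domain_search` and the (tag, data) pair representation are assumptions made for this illustration.

```python
DOMAIN_SEARCH = 119  # 0x77


def join_domain_search(options):
    """Concatenate the data of every option 119 copy, in message order."""
    return b"".join(data for tag, data in options if tag == DOMAIN_SEARCH)


# The three 9-byte sections from the RFC 3397 example.
options = [
    (119, b"\x03eng\x05appl"),
    (119, b"e\x03com\x00\x09ma"),
    (119, b"rketing\xc0\x04"),
]
joined = join_domain_search(options)
print(len(joined), joined)
```

Note that the split points fall in the middle of names, so the data can only be decoded after all sections have been joined.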
Also, the option uses the simplest data compression method: reparse points (compression pointers, in the terms of RFC 1035). Instead of the domain name size, the field might contain 0xc0. The next byte then gives an offset, relative to the start of the option data, at which the rest of the domain name is to be read.
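The following sketch decodes such data the way RFC 1035 describes pointers: a length byte with its two top bits set starts a two-byte pointer whose remaining 14 bits are the offset. This is a reference-style decoder written for illustration, with bounds and loop checks added; it is not a reconstruction of the Windows code, and the names `decode_name` and `decode_search_list` are hypothetical.

```python
def decode_name(data, pos):
    """Decode one domain name starting at pos; return (name, next_pos)."""
    labels = []
    resume = None            # where parsing continues after the first pointer
    hops = 0
    while True:
        if pos >= len(data):
            raise ValueError("name runs past the end of the data")
        size = data[pos]
        if size == 0:                        # null byte: end of the name
            pos += 1
            break
        if size & 0xC0 == 0xC0:              # pointer: jump to an earlier offset
            if pos + 1 >= len(data):
                raise ValueError("truncated pointer")
            if resume is None:
                resume = pos + 2
            hops += 1
            if hops > len(data):
                raise ValueError("pointer loop")
            pos = ((size & 0x3F) << 8) | data[pos + 1]
            continue
        labels.append(data[pos + 1:pos + 1 + size].decode("ascii"))
        pos += 1 + size
    return ".".join(labels), (pos if resume is None else resume)


def decode_search_list(data):
    """Decode every name in the concatenated option data."""
    names, pos = [], 0
    while pos < len(data):
        name, pos = decode_name(data, pos)
        names.append(name)
    return names


rfc_example = b"\x03eng\x05apple\x03com\x00\x09marketing\xc0\x04"
names = decode_search_list(rfc_example)
print(names)
```

In the RFC 3397 data, the pointer 0xc0 0x04 sends the decoder back to offset 4, where "apple.com" begins, so the second name reuses the tail of the first.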
So, in our example, we have a coded list of two domain suffixes:
The DHCP option with number 0x77 (119) allows the server to set DNS suffixes on clients. But not on computers with Windows operating systems. Microsoft systems have traditionally ignored this option, so historically endings of DNS names were applied using group policies, when necessary. But things changed recently, when the new release of Windows 10 version 1803 introduced handling for Domain Search Option. As follows from the function name in dhcpcore.dll that was changed, it is the added handler itself that contains the error.
DecodeDomainSearchListData function
Now let’s get to work. After cleaning up the disassembled code a little, here is what we find. The DecodeDomainSearchListData procedure, as one might guess, decodes data from the Domain Search Option of the message received from the server. As input, it takes a data array packed as described earlier, and it outputs a null-terminated string containing a list of domain name endings separated by commas. For instance, the function will transform the data from the above example into the following string:
DecodeDomainSearchListData is called from the UpdateDomainSearchOption procedure, which writes the returned list to the “DhcpDomainSearchList” parameter of the registry key:
which stores the main parameters for the specific network interface.
The DecodeDomainSearchListData function makes two passes. On the first pass, it performs all actions except writing to the output buffer, so the first pass only calculates the size of memory needed to hold the returned data. On the second pass, memory is allocated for that data and the allocated memory is filled. The function is not too big (about 250 instructions), and its main job is to handle each of the three possible variants of the character in the incoming stream: 1) 0x00, 2) 0xc0, or 3) all other values. The fix for the error related to DHCP boils down to adding a check of the size of the resulting buffer at the start of the second pass. If the size is zero, memory is not allocated for the buffer, and the function completes execution and returns an error:
So the vulnerability shows itself only when the size of the target buffer is zero. At the very beginning, the function also checks its input and rejects data shorter than two bytes. Therefore, exploitation requires finding a non-empty domain suffix option formed in such a way that the size of the output buffer equals zero.
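The two-pass structure and the added size check can be modeled with a short Python sketch. This is a simplified model built only from the description above, not the decompiled code: it ignores the 0xc0 case entirely, all names (`put`, `walk`, `decode_patched`) are hypothetical, and a bounds-checked `put` stands in for a raw memory store so that an out-of-range write shows up as an exception.

```python
def put(buf, index, value):
    """Bounds-checked store; a stand-in for an unchecked write to memory."""
    if not 0 <= index < len(buf):
        raise IndexError(f"write outside the buffer at index {index}")
    buf[index] = value


def walk(data, buf=None):
    """One pass over the data. With buf=None, only the size is counted."""
    out = pos = 0
    while pos < len(data):
        size = data[pos]
        pos += 1
        if size == 0:
            # End of a name: the trailing '.' is turned into ','.
            if buf is not None:
                put(buf, out - 1, ord(","))
        else:
            label = data[pos:pos + size] + b"."
            pos += size
            if buf is not None:
                for offset, byte in enumerate(label):
                    put(buf, out + offset, byte)
            out += len(label)
    return out


def decode_patched(data):
    if len(data) < 2:
        raise ValueError("option data too short")
    size = walk(data)                 # pass 1: measure
    if size == 0:                     # the check added by the January fix
        raise ValueError("decoded list is empty")
    buf = bytearray(size)
    walk(data, buf)                   # pass 2: fill
    put(buf, size - 1, 0)             # last ',' becomes the terminator
    return buf[:size - 1].decode("ascii")


# The RFC 3397 names, written out without compression for this model.
plain = b"\x03eng\x05apple\x03com\x00\x09marketing\x05apple\x03com\x00"
result = decode_patched(plain)
print(result)
```

In this model, input made only of null bytes, such as `b"\x00\x00"`, measures as zero and is rejected before any buffer is allocated, which is the behavior the fix introduces.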
The first thing that comes to mind is using the reparse points to make sure that non-empty input data generates an empty string of output:
A server set up to respond with an option with such content will indeed cause an access violation on non-updated clients. Here is why. At every step, when the function parses part of the full domain name, it copies that part into the target buffer and appends a period. In this example from the RFC, the following data will be copied to the buffer in the following order:
Then, when the zero domain size is encountered in the input data, the function changes the previous character in the target buffer from a period to a comma:
and keeps parsing:
When input data ends, all that’s left is replacing the last comma with a null character, and here’s a string ready to be written to the registry:
What happens when the attacker sends a buffer formed as described? From the example we can see that the list it contains consists of a single element: an empty string. On the first pass, the function calculates the output data size. Since the data does not contain any non-empty domain name, the size is zero.
On the second pass, a heap memory block is allocated for the data and the data is copied. But the parsing function immediately encounters the null character indicating the end of the domain name, so, as explained before, it changes the previous character from a period to a comma. And then we have a problem. The target buffer iterator is set to zero. There’s no previous character. The previous character belongs to the header of the heap memory block. And this character will be changed to 0x2c, which is a comma.
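A toy model makes the off-by-one visible. The layout below is an assumption made purely for illustration: an 8-byte placeholder stands in for the allocator's block header, and the zero-length buffer begins right after it. Real heap metadata looks different.

```python
HEADER = bytes(8)            # placeholder for allocator metadata (assumed size)
heap = bytearray(HEADER)     # a zero-length buffer starts right after the header
buf = len(HEADER)            # offset of the buffer inside the toy heap
out = 0                      # output iterator: nothing has been written yet

# Zero-length name: the code rewrites the "previous" character as a comma.
heap[buf + out - 1] = 0x2C   # lands on the last byte of the header

print(heap.hex())
```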
However, this happens only on 32-bit systems. On x64 systems the handling differs, because the current position of the target buffer iterator is stored as an unsigned int. Let’s look more closely at the fragment of code responsible for writing the comma to the buffer:
One is subtracted from the current position using the 32-bit register eax, but when addressing the buffer, the code uses the full 64-bit register rax. On the AMD64 architecture, any operation that writes to a 32-bit register zeroes the upper 32 bits of the corresponding 64-bit register. This means that the rax register, which used to contain zero, will after the subtraction hold 0xffffffff rather than -1. Therefore on 64-bit systems the value 0x2c will be written at the address buf[0xffffffff], way outside the memory allocated for the buffer.
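The difference between the two architectures comes down to modular arithmetic, which can be checked with a short sketch. The buffer addresses are invented for the example, and `write_address` is a hypothetical helper that models the address calculation rather than the actual instructions.

```python
MASK32 = 0xFFFFFFFF


def write_address(buf_addr, pos, pointer_bits):
    """Address of the 'previous character' store for a given pointer width."""
    index = (pos - 1) & MASK32                 # 32-bit subtraction, zero-extended
    return (buf_addr + index) & ((1 << pointer_bits) - 1)


buf32 = 0x00500000                             # made-up 32-bit buffer address
buf64 = 0x000001F000500000                     # made-up 64-bit buffer address

addr32 = write_address(buf32, 0, 32)
addr64 = write_address(buf64, 0, 64)
print(hex(addr32), hex(addr64))
```

With 32-bit pointers the address wraps around and ends up one byte before the buffer; with 64-bit pointers there is no wraparound, so the store lands almost 4 GB past the start of the buffer.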
These findings strongly correlate with the exploitability scoring by Microsoft, because to exploit this vulnerability, an attacker has to learn how to perform remote heap spraying on the DHCP client, as well as have sufficient control of heap memory distribution to make sure that preset values (namely, comma and period) are written to the prepared address and cause controllable adverse effects.
Otherwise, writing the data to an unchecked address will result in failure of the svchost.exe process with all the services it may host at the moment, and subsequent restart of those services by the operating system. That’s a fact attackers may also use to their advantage if circumstances permit.
This is seemingly all we can say about the studied error. But we still feel it’s not the end. As if we have not yet considered every option. There must be more than meets the eye…
Most likely, that’s the case. If we look closely at the type of data causing the error and compare that data with how exactly the error occurs, we can see that the list of domain names may be changed in such a way that the resulting buffer size will not be zero, yet there will still be an attempt to write it outside of the buffer. For that to happen, the first element of the list must be an empty string, and all others may contain nominal domain names. For example:
The option includes two elements. The first domain suffix is empty: it ends immediately in a null byte. The second suffix is .ru. The calculated size of the output string will be three bytes, which lets it pass the check for an empty target buffer introduced in the January update. At the same time, the zero at the very beginning of the data forces the function to write a comma over the previous character of the resulting string. Since the current position of the iterator in the string is zero, as in the previous example, the comma is written outside the allocated buffer.
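The same reasoning can be traced in a small model that records every store the second pass would make. The payload bytes are reconstructed from the description above (an empty name followed by "ru"), the index arithmetic assumes the 32-bit unsigned iterator discussed earlier, and `trace` is a hypothetical helper, not the real function.

```python
MASK32 = 0xFFFFFFFF


def trace(data):
    """Return (size, writes): the measured size and the (index, byte) stores."""
    writes = []
    out = pos = 0
    while pos < len(data):
        size = data[pos]
        pos += 1
        if size == 0:
            # End of a name: overwrite the previous character with ','.
            writes.append(((out - 1) & MASK32, ord(",")))
        else:
            for byte in data[pos:pos + size] + b".":
                writes.append((out, byte))
                out += 1
            pos += size
    return out, writes


payload = b"\x00\x02ru\x00"          # empty first name, then "ru"
size, writes = trace(payload)
stray = [(hex(index), chr(byte)) for index, byte in writes if index >= size]
print(size, stray)
```

The measured size is non-zero, so a check for an empty buffer does not trigger, yet the very first store still targets index 0xffffffff.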
Now we need to confirm our theoretical results by a practical test. Let’s simulate a case where the DHCP server responds to a client request with a message with the presented option, and we immediately find an exception when trying to write a comma at position 0xffffffff of the buffer allocated for the resulting string:
Here register r8 contains a pointer to incoming options, rdi contains the address of the allocated target buffer, and rax contains the position in that buffer where the character must be written. These are the results we got on a system with all updates installed (as of January 2019).
We wrote to Microsoft informing them of the problem, and guess what? They lost our message. Yes, this sometimes happens even to the best and most reputable vendors. No system is perfect, and in this case you need to find alternative ways of communication. A week later, having not received even an automated response, we made contact directly on Twitter. After several days of analysis we found that the details we sent had nothing to do with CVE-2019-0547 and actually formed a separate vulnerability, which will get a new CVE identifier. A month later, in March, a new patch was released, and the issue got a number: CVE-2019-0726.
This is how, while trying to figure out a 1-day vulnerability, you can sometimes stumble upon a 0-day vulnerability, just by trusting your instincts.