Exploiting the Ubiquisys/SFR femtocell webserver (wsal/shttpd/mongoose/yassl embedded webserver)
Posted by Nico Golde
As part of our research on the SFR femtocell, I had the pleasure of looking for a vulnerability that might assist us in compromising remote devices. One of the obvious software targets on the box was the webserver (wsal), which serves the web pages used for configuring the device. Like all other services on the box, it runs with root privileges. The device itself runs Linux 2.6.18-ubi-sys-V2.0.17 on an ARM926EJ (ARMv5).

The bug (CVE-2011-2900):
I started reversing the binary when at some point Kevin pointed out a string in the binary that hinted at the open source project shttpd (which was renamed to mongoose at some point and which is also the basis for the yassl embedded webserver). So this made things a lot easier. As the web service is fairly powerful (including CGI and SSI support), I first looked for non-software related bugs. From shttpd.c/defs.h:
Hmm, that's already more methods than expected. So it made sense to look at those methods. As the webserver can execute CGI I assumed PUT might be interesting in order to push stuff onto the device and execute it. However, it turned out that the web directory is mounted read-only (and the code gracefully handles path traversal attempts). DELETE died for the same reason and it seemed unlikely that this would result in code execution anyway. Back to software vulnerabilities and the PUT functionality. Let's have a look at the function handling PUT requests (io_dir.c/put_dir()):
The function is pretty simple. It loops over the URL path and tries to create each directory of the complete path (similar to mkdir -p). To do that, the path chunk is copied into the stack buffer buf before it is passed to stat and mkdir. The len argument of the memcpy operation is determined by the distance between two consecutive / characters. Assuming that path can be longer than FILENAME_MAX (+/- a few bytes overhead for the rest of the URL), this is a classic stack-based buffer overflow and seemed like a nice candidate for code execution.

In this code snippet the len argument is guarded against overflow by an assert statement. However, assert is only in place if the binary was not compiled with -DNDEBUG, right? I hadn't seen any calls to an assert wrapper function while looking at the disassembly of wsal. But let's check this... The following output is generated using radare. If you're on Linux, need a multi-arch reversing tool chain (with the Unix philosophy in mind) and can't or don't want to use IDA, I can highly recommend looking at this tool (even though it's still work in progress).
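To make the role of the assert more concrete, here is a minimal sketch of the same kind of loop with a length check that cannot be compiled out. It is not the shttpd code: the function name walk_prefixes_checked is hypothetical, and it assumes each copied chunk is the prefix of the path up to the current / character. The stat and mkdir calls are left out so the sketch has no side effects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from shttpd. It walks a path the way the
 * put_dir() loop is described above, assuming each copied chunk is the
 * prefix of the path up to the current '/'.
 *
 * Returns the number of prefixes copied, or -1 if one would not fit into
 * buf (bufsize bytes, including the terminating NUL).
 */
int walk_prefixes_checked(const char *path, char *buf, size_t bufsize)
{
    const char *s, *p;
    int count = 0;

    /* Programmer invariants: fine as assert(), they vanish with -DNDEBUG. */
    assert(path != NULL && buf != NULL && bufsize > 0);

    for (s = path; (p = strchr(s, '/')) != NULL; s = p + 1) {
        size_t len = (size_t)(p - path);

        /* Attacker-controlled length: this check is always compiled in. */
        if (len >= bufsize)
            return -1;

        memcpy(buf, path, len);
        buf[len] = '\0';
        /* A real server would call stat() and mkdir() on buf here. */
        count++;
    }
    return count;
}
```

With an 8-byte buffer, the path "a/b/c" yields the prefixes "a" and "a/b", while a first component longer than the buffer is rejected instead of being copied. The question for wsal is whether any check of this kind survived compilation.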
As we can see in the disassembly, we see nothing: in particular, no comparison and no call to __assert_fail. So we're lucky, it looks like we found our candidate for code execution. A pretty simple standard buffer overflow. Interestingly, the shttpd Makefile even mentions -DNDEBUG in order to save ~5kB of binary size (remember, this is an embedded device). Let's look at how put_dir returns so we can get control over the program flow. At the function entry, registers r4-r7 and the link register are pushed onto the stack. Leaving looks similar, with the difference that the link register isn't used; instead the return address is popped directly into pc.
The pc register is equivalent to EIP on x86, with the difference that you can directly read and write it. As it is popped from our overflowed stack buffer, this would give us direct control over the program flow. Now the interesting question was: does wsal also support this request type, or is it not calling this function?
This made it clear that the wsal binary also supports PUT. Looking at shttpd.c, it seems that PUT as well as DELETE should only be enabled for authorized users (which probably wouldn't be a big problem), but funnily the Makefile also states:
# -DNO_AUTH - disable authorization support (-4kb)
which was of course also set by wsal.

Exploitation:
Exploitation of this seemed rather straightforward given the nature of this bug. The stack was marked non-executable in the ELF binary, but fortunately the ARMv5 doesn't support the XN bit yet. However, experimenting with this bug, I noticed fairly quickly that ASLR is enabled on the device and our stack address is randomized. As a result, I couldn't just place my shellcode into buf and jump right to it. ROP would've been an option, but as my ARM knowledge was limited before playing with this bug, I didn't like this option (even though, as we will see, I need it anyway, but not for the actual payload). Return-to-libc, e.g. by returning to system(), was not an interesting option either, as there is no network binary such as netcat installed on the box. So I had to find something else. And as it turned out, support for heap randomization as well as library randomization started pretty late on ARM. As Kees points out, this started in 2.6.37. That takes care of one possible problem.

As path was not the original request buffer, but only a copy of it, I started looking for copies of my input or the possibility to put the payload somewhere else (e.g. a POST body, HTTP headers...). First, I checked where path is coming from (shttpd.c/decide_what_to_do()):
There we go, path originates from c->uri, which is a url-decoded form of itself. One important thing we have to take into account at this point is that the URL can't be of arbitrary length, but is checked against URI_MAX. We have to overflow a buffer in put_dir() with a length of FILENAME_MAX... However, we are lucky: URI_MAX is defined as 16384 (config.h), while FILENAME_MAX from put_dir is an alias for MAX_PATH, which is defined as 4096, so a request URI can be up to four times the size of the buffer. So where is c->uri coming from? Again we look at shttpd.c, this time the parse_http_request() function:
As we can see, c->uri is allocated on the heap and, as I mentioned, heap randomization was introduced pretty late on ARM/Linux, so I assumed I could just jump right into the heap copy of my input. There is a nice side effect of using the heap copy of the buffer to place our shellcode. Because url_decode() is called on the complete URI length, we have no restrictions whatsoever regarding the bytes we can include in our final shellcode; it can include zeros and the like in url-encoded form. Anyway, a few minutes later it became clear that I couldn't just jump right to it.
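The point about url-decoding can be illustrated with a small decoder that works on an explicit length rather than on a NUL-terminated string. This is only a sketch under that assumption, not the url_decode() from shttpd; the names hex_value and url_decode_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Convert one hex digit to its value, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical decoder, not the shttpd url_decode(). It decodes exactly
 * src_len input bytes, so a decoded zero byte does not end processing.
 * Returns the number of bytes written to dst (at most dst_len).
 */
size_t url_decode_sketch(const char *src, size_t src_len,
                         unsigned char *dst, size_t dst_len)
{
    size_t i = 0, j = 0;

    assert(src != NULL && dst != NULL);

    while (i < src_len && j < dst_len) {
        int hi = -1, lo = -1;

        /* A valid escape needs '%' plus two hex digits inside the input. */
        if (src[i] == '%' && i + 2 < src_len) {
            hi = hex_value((unsigned char)src[i + 1]);
            lo = hex_value((unsigned char)src[i + 2]);
        }
        if (hi >= 0 && lo >= 0) {
            dst[j++] = (unsigned char)(hi * 16 + lo);
            i += 3;
        } else {
            dst[j++] = (unsigned char)src[i++];  /* copy literal byte */
        }
    }
    return j;
}
```

Decoding the seven input bytes of "%41%00B" with this sketch produces three output bytes: 'A', a zero byte and 'B'. A length-based decoder therefore passes a zero byte through like any other value; what matters is how later string handling treats it.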
While the leading zero byte of the heap address was not a problem for the input itself (because I can just url-encode it), put_dir has a problem with that. If we recall, the loop in put_dir() scans the path as a C string for the next / character. So if we include a zero before the terminating / in the URL to jump to our heap buffer, our buffer overflow will actually never happen. However, the path copy that is passed to put_dir() is created using snprintf(), and the platform is little-endian. Therefore, we can include one zero in the url-decoded, stack-based path buffer (in decide_what_to_do()) and pop the address including the zero from there. It just has to be past the / character that we need to get a large len value.

How do we pop it from there after our buffer was overwritten and the stack frame of put_dir() was torn down? Here is where some ROP is needed (or call it jump-oriented). When the put_dir() function is left, the stack pointer is below the path stack buffer that was passed as an address to the put_dir() function (from where it was copied into the stack buffer of put_dir) and that is already url-decoded as well. So if we can lift our stack pointer back up, it should be possible to pop an address with a leading zero from this buffer. Looking at the mentioned program map output, it is visible that libc and libgcc are mapped at addresses without a leading zero. Their base is also not randomized. I didn't have any particular tool to find ROP snippets, but as all instructions on ARM are word aligned, it was easy to find proper instructions with objdump and grep. In particular:
objdump -d /lib/libc-2.3.6.so | grep -A 2 -E 'add sp, sp,.*' | grep -B 2 -E 'pop.*(pc|lr)'
(this can also be done with radare if you're more advanced in using it than I am). This way I searched for stack lifting instructions followed by an instruction that pops stack buffer content to pc or the link register in order to regain control. I found a good candidate:
This was perfect. Now I could just make my first jump to this snippet, lift the stack pointer back into my buffer, place the address of sigprocmask+108 url-encoded in my buffer (together with fake r4-r7 values), lift the stack until I'm past the / character and pop my zero-address from there. The goal was still to jump to the shellcode in the heap copy of the buffer.

The ARM-stacle:
This would work well if the target architecture weren't ARM. There is an important constraint on ARM when writing exploits. Unlike x86, ARM is based on the Harvard architecture. This means that the code and data caches are separate. I didn't know this at first. A result of this was that when hitting my heap shellcode, the program crashed with a SIGILL. However, analyzing the coredump and the pc at that time always showed correct instructions. Due to the Harvard architecture, my shellcode is copied into the data cache. But in order to execute it, it needs to reach the instruction cache, which requires it to be written back from the data cache to main memory first. Because that hadn't happened, the coredump displayed instructions that weren't actually in the instruction cache, and the SIGILL resulted from whatever was executed as instructions at this point.

It turns out that there are two solutions to this problem. The first one is a simple instruction (MCR). However, it is limited to kernel mode. The other option is a clear cache syscall that takes 3 arguments: a start address, a range and flags. This seemed nice. What was even nicer is that wsal links against libgcc, which provides a wrapper to do that:
Crafting the 0x009f0002 by ROP would've been a bit painful I suppose so this wrapper was nice. So before jumping to our shellcode, we need to call this syscall. A small excerpt from linux-2.6/arch/arm/traps.c to better understand this syscall:
Some places suggest that you can pass 0 as a start and -1 (0xffffffff) as a range to this syscall and flush everything. However, this doesn't seem to work, and looking at this function I also don't understand why it should. find_vma() (from mmap.c) will traverse the kernel's internal tree representation until it finds the first virtual memory area that satisfies start < vma->vm_end. So if the start address is zero, this should hardly ever end up in the area of attacker controlled payload (unless you are very lucky). Also, flushing the complete memory range doesn't work. As we see, end will be set to vma->vm_end if it is bigger than the actual vma end.

To sum up, we really need proper values. We need a heap address lower than or equal to our shellcode address in r1 and a length larger than our payload in r2. As __clear_cache() returns using the link register, we furthermore have to fill that with a proper value to regain control after flushing the cache. So the plan is: overflow the buffer, lift our stack to a place where we can pop arbitrary addresses (these two steps could also be exchanged), flush the cache, jump to shellcode. The following shows the required ROP sequences to perform this. Searching for these instructions was also simply done using objdump and grep:
Mission accomplished. The shellcode used then executes a connect-back shell! As a result, this is a remote root for SFR femtocells. The complete exploit is available here. It needs slight modification in case you modified your firmware, e.g. with library hooking.... As mentioned before, depending on how shttpd/mongoose/yassl embedded webserver have been compiled, they may be affected by the problem themselves. The exact code for them differs slightly, but all of them contain the same bug if compiled with the right options.

Slides of our presentation: http://femto.sec.t-labs.tu-berlin.de/bh2011.pdf

UPDATE: it seems they have fixed the issue in the latest firmware release (V2.0.24.1) by disabling the PUT functionality completely.