Black Hat USA 2012: Owning Firefox’s Heap
by argp on May.14, 2012, under exploitation, research, security
Continuing our work on jemalloc exploitation, Chariton Karamitas (intern at Census, Inc) and I are presenting “Owning Firefox’s Heap” at Black Hat USA 2012. This presentation extends our recently published Phrack paper by focusing specifically on the most widely used jemalloc application, namely the Mozilla Firefox web browser.
The abstract of our talk will give you a good preview of the content:
jemalloc is a userland memory allocator that is being increasingly adopted by software projects as a high performance heap manager. It is used in Mozilla Firefox for the Windows, Mac OS X and Linux platforms, and as the default system allocator on the FreeBSD and NetBSD operating systems. Facebook also uses jemalloc in various components to handle the load of its web services. However, despite such widespread use, there is no work on the exploitation of jemalloc.
Our research addresses this. We will begin by examining the architecture of the jemalloc heap manager and its internal concepts, while focusing on identifying possible attack vectors. jemalloc does not utilize concepts such as ‘unlinking’ or ‘frontlinking’ that have been used extensively in the past to undermine the security of other allocators. Therefore, we will develop novel exploitation approaches and primitives that can be used to attack jemalloc heap corruption vulnerabilities. As a case study, we will investigate Mozilla Firefox and demonstrate the impact of our developed exploitation primitives on the browser’s heap. In order to aid the researchers willing to continue our work, we will also release our jemalloc debugging tool belt.
For updates on this talk, information on my research and my work at Census, Inc in general you can follow me on Twitter.
Pseudomonarchia jemallocum
by argp on Apr.16, 2012, under code, exploitation, freebsd, linux, mac os x, research, security
Phrack 0x44 is out with my and huku’s research on exploiting jemalloc, titled “Pseudomonarchia jemallocum”.
I have been working on jemalloc on and off since mid 2010 and it’s very nice to see the work published. In Phrack :)
As the main paper mentions, the development of our gdb/Python utility unmask_jemalloc will continue on github.
The Linux kernel memory allocators from an exploitation perspective
by argp on Jan.03, 2012, under exploitation, kernel, linux
In anticipation of Dan Rosenberg’s talk on exploiting the Linux kernel’s SLOB memory allocator at the Infiltrate security conference and because I recently had a discussion with some friends about the different kernel memory allocators in Linux, I decided to write this quick introduction. I will present some of the allocators’ characteristics and also provide references to public work on exploitation techniques.
At the time of this writing, the Linux kernel has three different memory allocators in the official code tree, namely SLAB, SLUB and SLOB. These allocators are on a memory management layer that is logically on top of the system’s low level page allocator and are mutually exclusive (i.e. you can only have one of them enabled/compiled in your kernel). They are used when a kernel developer calls kmalloc() or a similar function. Unsurprisingly, they can all be found in the mm directory. All of them follow, to various extents and by extending or simplifying, the traditional slab allocator design (notice the lowercase “slab”; that’s the term for the general allocator design approach, while “SLAB” is a slab implementation in the Linux kernel). Slab allocators allocate prior to any request, for example at kernel boot time, large areas of virtual memory (called “slabs”, hence the name). Each one of these slabs is then associated with a kernel structure of a specific type and size. Furthermore, each slab is divided into the appropriate number of slots for the size of the kernel structure it is associated with. As an example, consider that a slab for the structure task_struct has 31 slots. The size of a task_struct is 1040 bytes, so assuming that a page is 4096 bytes (the default), a task_struct slab is 8 pages long. Apart from the structure-specific slabs, like the one above for task_struct, there are also the so-called general purpose slabs which are used to serve arbitrary-sized kmalloc() requests. These requests are adjusted by the allocator for alignment and assigned to a suitable slab.
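The task_struct arithmetic above can be checked with a few lines of userland C. This is only a sketch: the helper names are made up for illustration, the page size is assumed to be 4096 bytes as in the text, and per-slab metadata and alignment padding are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed default page size, as in the text */

/* Number of whole objects of obj_size bytes that fit in a slab spanning
 * the given number of pages (metadata and alignment padding ignored). */
static size_t slots_per_slab(size_t pages, size_t obj_size)
{
    assert(obj_size > 0);
    return (pages * PAGE_SIZE_BYTES) / obj_size;
}

/* Bytes left unused at the end of such a slab. */
static size_t slab_slack(size_t pages, size_t obj_size)
{
    assert(obj_size > 0);
    return (pages * PAGE_SIZE_BYTES) % obj_size;
}
```

For the example in the text, slots_per_slab(8, 1040) gives 31: eight pages are 32768 bytes, 31 objects take 32240 of them, and the remaining 528 bytes are too few for a 32nd task_struct.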
Let’s take a look at the slabs of a recent Linux kernel:
$ cat /proc/slabinfo
slabinfo - version: 2.1
...
fat_inode_cache       57     57    416   19    2 : tunables    0    0    0 : slabdata      3      3      0
fat_cache            170    170     24  170    1 : tunables    0    0    0 : slabdata      1      1      0
VMBlockInodeCache      7      7   4480    7    8 : tunables    0    0    0 : slabdata      1      1      0
blockInfoCache         0      0   4160    7    8 : tunables    0    0    0 : slabdata      0      0      0
AF_VMCI                0      0    704   23    4 : tunables    0    0    0 : slabdata      0      0      0
fuse_request          80     80    400   20    2 : tunables    0    0    0 : slabdata      4      4      0
fuse_inode         21299  21690    448   18    2 : tunables    0    0    0 : slabdata   1205   1205      0
...
kmalloc-8192          94     96   8192    4    8 : tunables    0    0    0 : slabdata     24     24      0
kmalloc-4096         118    128   4096    8    8 : tunables    0    0    0 : slabdata     16     16      0
kmalloc-2048         173    208   2048   16    8 : tunables    0    0    0 : slabdata     13     13      0
kmalloc-1024         576    640   1024   16    4 : tunables    0    0    0 : slabdata     40     40      0
kmalloc-512          904    992    512   16    2 : tunables    0    0    0 : slabdata     62     62      0
kmalloc-256          540    976    256   16    1 : tunables    0    0    0 : slabdata     61     61      0
kmalloc-128          946   1408    128   32    1 : tunables    0    0    0 : slabdata     44     44      0
kmalloc-64         13013  13248     64   64    1 : tunables    0    0    0 : slabdata    207    207      0
kmalloc-32         23624  27264     32  128    1 : tunables    0    0    0 : slabdata    213    213      0
kmalloc-16          3546   3584     16  256    1 : tunables    0    0    0 : slabdata     14     14      0
kmalloc-8           4601   4608      8  512    1 : tunables    0    0    0 : slabdata      9      9      0
kmalloc-192         3659   4620    192   21    1 : tunables    0    0    0 : slabdata    220    220      0
kmalloc-96         10137  11340     96   42    1 : tunables    0    0    0 : slabdata    270    270      0
kmem_cache            32     32    128   32    1 : tunables    0    0    0 : slabdata      1      1      0
kmem_cache_node      256    256     32  128    1 : tunables    0    0    0 : slabdata      2      2      0
Here you can see structure-specific slabs, for example fuse_inode, and general purpose slabs, for example kmalloc-96.
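To make the idea of "assigning a request to a suitable general purpose slab" concrete, here is a small userland sketch that picks the smallest kmalloc-N cache from the listing above that can hold a request. The function name and the table are illustrative only; the table simply mirrors the cache sizes shown in this particular /proc/slabinfo output, and the kernel's real selection logic (minimum sizes, zero-sized requests, requests larger than the biggest cache) is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Object sizes of the kmalloc-N caches that appear in the listing above. */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the object size of the smallest listed cache that fits a request
 * of `size` bytes, or 0 when the request is not modeled by this sketch
 * (size 0, or larger than the biggest listed cache). */
static size_t kmalloc_cache_size(size_t size)
{
    size_t n = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    if (size == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}
```

Under these assumptions a 100-byte request lands in kmalloc-128 and a 65-byte request in kmalloc-96. This matters for exploitation because two structures only become neighbours if their sizes map to the same cache.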
When it comes to the exploitation of overflow bugs in the context of slab allocators, there are three general approaches to corrupt kernel memory:
- Corruption of the adjacent objects/structures of the same slab.
- Corruption of the slab allocator’s management structures (referred to as metadata).
- Corruption of the adjacent physical page of the slab your vulnerable structure is allocated on.
The ultimate goal of the above approaches is of course to gain control of the kernel’s execution flow and divert/hijack it to your own code. In order to be able to manipulate the allocator and the state of its slabs, arranging structures on them in your favor (i.e. next to each other on the same slab, or at the end of a slab), it is nice (but not strictly required ;) to have some information on the allocator’s state. The proc filesystem provides us with a way to get this information. Unprivileged local users can simply cat /proc/slabinfo (as shown above) and see the allocator’s slabs, the number of used/free structures on them, etc. Is your distribution still allowing this?
For each one of the Linux kernel’s allocators I will provide references to papers describing practical attack techniques and examples of public exploits.
SLAB
Starting with the oldest of the three allocators, SLAB organizes physical memory frames in caches. Each cache is responsible for a specific kernel structure. Also, each cache holds slabs that consist of contiguous pages and these slabs are responsible for the actual storing of the kernel structures of the cache’s type. A SLAB’s slab can have both allocated (in use) and deallocated (free) slots. Based on this and with the goal of reducing fragmentation of the system’s virtual memory, a cache’s slabs are divided into three lists: a list with full slabs (i.e. slabs with no free slots), a list with empty slabs (slabs on which all slots are free), and a list with partial slabs (slabs that have slots both in use and free).
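The rule that decides which of the three lists a slab belongs to is simple enough to write down. The sketch below is a userland illustration, not kernel code; the enum and function names are invented for the example.

```c
#include <assert.h>

/* Hypothetical names for the three per-cache lists described above. */
enum slab_list { SLAB_LIST_EMPTY, SLAB_LIST_PARTIAL, SLAB_LIST_FULL };

/* Decide which list a slab belongs to from the number of slots in use
 * and the total number of slots on the slab. */
static enum slab_list classify_slab(unsigned int inuse, unsigned int total)
{
    assert(total > 0 && inuse <= total);

    if (inuse == 0)
        return SLAB_LIST_EMPTY;    /* all slots are free */
    if (inuse == total)
        return SLAB_LIST_FULL;     /* no free slots */
    return SLAB_LIST_PARTIAL;      /* both used and free slots */
}
```

With the 31-slot task_struct slab from the introduction, classify_slab(0, 31) is empty, classify_slab(31, 31) is full, and anything in between is partial.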
A SLAB’s slab is described by the following structure:
struct slab
{
    union
    {
        struct
        {
            struct list_head list;
            unsigned long colouroff;
            void *s_mem;            /* including colour offset */
            unsigned int inuse;     /* num of objs active in slab */
            kmem_bufctl_t free;
            unsigned short nodeid;
        };

        struct slab_rcu __slab_cover_slab_rcu;
    };
};
The list variable is used to place the slab in one of the lists I described above. Coloring and the variable colouroff require some explanation in case you haven’t seen them before. Coloring or cache coloring is a performance trick to reduce processor L1 cache misses. This is accomplished by making sure that the first “slot” of a slab (which is used to store the slab’s slab structure, i.e. the slab’s metadata) is not placed at the beginning of the slab (which is also at the start of a page) but at an offset colouroff from it. s_mem is a pointer to the first slot of the slab that stores an actual kernel structure/object. free is an index to the next free object of the slab.
As I mentioned in the previous paragraph, a SLAB’s slab begins with a slab structure (the slab’s metadata) and is followed by the slab’s objects. The stored objects on a slab are contiguous, with no metadata in between them, which makes the exploitation approach of corrupting adjacent objects easier. Easier means that when we overflow from one object into its adjacent one we don’t corrupt management data, which could make the system crash.
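The adjacent-object case can be simulated entirely in userland. The following sketch models a slab as a flat byte array of equally sized slots with nothing between them, and a hypothetical buggy copy routine that does not check the length against the slot size. All names and sizes are invented for illustration; the copy is clamped to the end of the toy slab only so that the demonstration itself stays well-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 32u
#define NUM_SLOTS 4u

/* A toy slab: contiguous, equally sized slots with no metadata in between. */
struct toy_slab {
    unsigned char mem[SLOT_SIZE * NUM_SLOTS];
};

static unsigned char *slot_addr(struct toy_slab *s, size_t idx)
{
    assert(idx < NUM_SLOTS);
    return s->mem + idx * SLOT_SIZE;
}

/* Hypothetical buggy copy into slot idx: len is never compared with
 * SLOT_SIZE, so a long input spills into the following slot(s). */
static void buggy_copy(struct toy_slab *s, size_t idx,
                       const void *src, size_t len)
{
    unsigned char *dst = slot_addr(s, idx);
    size_t room = sizeof(s->mem) - idx * SLOT_SIZE;

    if (len > room)     /* keep the demo inside the toy slab */
        len = room;
    memcpy(dst, src, len);
}

/* Read the first machine word of a slot; it stands in for a pointer
 * field at the start of a victim object. */
static uintptr_t first_word(struct toy_slab *s, size_t idx)
{
    uintptr_t w;

    memcpy(&w, slot_addr(s, idx), sizeof(w));
    return w;
}
```

Writing SLOT_SIZE plus one machine word of 'A' bytes into slot 0 replaces the first word of slot 1 with 0x41 bytes while slot 2 stays untouched, and no allocator bookkeeping is damaged on the way.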
By manipulating SLAB through controlled allocations and deallocations from userland that affect the kernel (for example via system calls) we can arrange that the overflow from a vulnerable structure will corrupt an adjacent structure of our own choosing. The fact that SLAB’s allocations and deallocations work in a LIFO manner is of course to our advantage in arranging structures/objects on the slabs. As qobaiashi has presented in his paper “The story of exploiting kmalloc() overflows”, the system calls semget() and semctl(..., ..., IPC_RMID) are one way to make controlled allocations and deallocations respectively. The term “controlled” here refers to both the size of the allocation/deallocation and the fact that we can use them directly from userland. Another requirement that these system calls satisfy is that the structure they allocate can help us in our quest for code execution when used as a victim object and corrupted from a vulnerable object. Other ways/system calls that satisfy all the above requirements do exist.
Another resource on attacking SLAB is “Exploiting kmalloc overflows to 0wn j00” by amnesia and clflush. In that presentation the authors explained the development process for a reliable exploit for vulnerability CVE-2004-0424 (which was an integer overflow leading to a kernel heap buffer overflow found by ihaquer and cliph). Neither the presentation nor the exploit is public as far as I know. However, a full exploit was published by twiz and sgrakkyu in Phrack #64 (castity.c).
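Why LIFO behaviour helps can be seen with a toy free-slot stack. This is a userland sketch with invented names, not SLAB's actual bookkeeping: free slot indices are kept on a stack, so the slot released most recently is the one handed out next.

```c
#include <assert.h>

#define TOY_SLOTS 8

/* Toy cache: indices of free slots kept on a stack (top at free_top - 1). */
struct toy_cache {
    int free_stack[TOY_SLOTS];
    int free_top;
};

static void toy_cache_init(struct toy_cache *c)
{
    /* Push in reverse order so that slot 0 is handed out first. */
    c->free_top = 0;
    for (int i = TOY_SLOTS - 1; i >= 0; i--)
        c->free_stack[c->free_top++] = i;
}

/* Allocate: pop the most recently freed slot, or -1 when none is free. */
static int toy_alloc(struct toy_cache *c)
{
    if (c->free_top == 0)
        return -1;
    return c->free_stack[--c->free_top];
}

/* Free: push the slot so that it is the next one to be reused. */
static void toy_free(struct toy_cache *c, int slot)
{
    assert(c->free_top < TOY_SLOTS);
    c->free_stack[c->free_top++] = slot;
}
```

After allocating slots 0, 1 and 2 and freeing slot 1, the next allocation returns slot 1 again. In exploitation terms, freeing a placeholder and then triggering the allocation of the vulnerable object puts that object exactly where the placeholder was, between two objects we already control.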
SLUB
SLUB is currently the default allocator of the Linux kernel. It follows the SLAB allocator I have already described in its general design (caches, slabs, full/empty/partial lists of slabs, etc.), however it has introduced simplifications with respect to management overhead to achieve better performance. One of the main differences is that SLUB has no metadata at the beginning of each slab like SLAB, but instead it has added its metadata variables to the Linux kernel’s page structure to track the allocator’s data on the physical pages.
The following excerpt includes only the relevant parts of the page structure, see here for the complete version.
struct page
{
    ...

    struct
    {
        union
        {
            pgoff_t index;      /* Our offset within mapping. */
            void *freelist;     /* slub first free object */
        };

        ...

        struct
        {
            unsigned inuse:16;
            unsigned objects:15;
            unsigned frozen:1;
        };

        ...
    };

    ...

    union
    {
        ...
        struct kmem_cache *slab;    /* SLUB: Pointer to slab */
        ...
    };

    ...
};
Since there are no metadata on the slab itself, a page’s freelist pointer is used to point to the first free object of the slab. A free object of a slab has a small header with metadata that contain a pointer to the next free object of the slab. The index variable holds the offset to these metadata within a free object. inuse and objects hold, respectively, the allocated and total number of objects of the slab. frozen is a flag that specifies whether the page can be used by SLUB’s list management functions. Specifically, if a page has been frozen by a CPU core, only this core can retrieve free objects from the page, while the other available CPU cores can only add free objects to it. The last variable of interest for our discussion is slab, a pointer of type struct kmem_cache * to the slab on the page.
The function on_freelist() is used by SLUB to determine if a given object is on a given page’s freelist and provides a nice introduction to the use of the above elements. The following snippet is an example invocation of on_freelist() (taken from here):
slab_lock(page);

if(on_freelist(page->slab, page, object))
{
    object_err(page->slab, page, object, "Object is on free-list");
    rv = false;
}
else
{
    rv = true;
}

slab_unlock(page);
Locking is required to avoid inconsistencies since on_freelist() makes some modifications and it could be interrupted. Let’s take a look at an excerpt from on_freelist() (the full version is here):
static int on_freelist(struct kmem_cache *s, struct page *page, void *search)
{
    int nr = 0;
    void *fp;
    void *object = NULL;
    unsigned long max_objects;

    fp = page->freelist;
    while(fp && nr <= page->objects)
    {
        if(fp == search)
            return 1;

        if(!check_valid_pointer(s, page, fp))
        {
            if(object)
            {
                object_err(s, page, object, "Freechain corrupt");
                set_freepointer(s, object, NULL);
                break;
            }
            else
            {
                slab_err(s, page, "Freepointer corrupt");
                page->freelist = NULL;
                page->inuse = page->objects;
                slab_fix(s, "Freelist cleared");
                return 0;
            }

            break;
        }

        object = fp;
        fp = get_freepointer(s, object);
        nr++;
    }

    ...
}
The function starts with a simple piece of code that walks the freelist and demonstrates the use of SLUB internal variables. Of particular interest is the call of the check_valid_pointer() function which verifies that a freelist’s object’s address (variable fp) is within a slab page. This is a check that safeguards against corruptions of the freelist.
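The freelist walk is easier to follow in a self-contained userland model. The sketch below is not the kernel's code: the struct, the toy_ and helper names, the fixed object size and the -1 return value for a corrupt chain are all assumptions made for illustration. Free objects store a pointer to the next free object at a configurable offset, and the validity check accepts only addresses that lie inside the toy page and on an object boundary (the boundary test is a choice of this sketch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE 64u
#define NUM_OBJS 8u

/* Toy stand-in for a SLUB slab page and the metadata that describes it. */
struct toy_slub_page {
    unsigned char mem[OBJ_SIZE * NUM_OBJS];
    void *freelist;     /* first free object, or NULL */
    size_t offset;      /* where a free object keeps its next-free pointer */
};

/* Read the next-free pointer stored inside a free object. */
static void *get_fp(const struct toy_slub_page *p, const void *obj)
{
    void *next;

    memcpy(&next, (const unsigned char *)obj + p->offset, sizeof(next));
    return next;
}

/* Store a next-free pointer inside a free object. */
static void set_fp(const struct toy_slub_page *p, void *obj, void *next)
{
    memcpy((unsigned char *)obj + p->offset, &next, sizeof(next));
}

/* Accept only addresses inside the page that fall on an object boundary. */
static int valid_pointer(const struct toy_slub_page *p, const void *ptr)
{
    uintptr_t base = (uintptr_t)p->mem;
    uintptr_t addr = (uintptr_t)ptr;

    if (addr < base || addr >= base + sizeof(p->mem))
        return 0;
    return (addr - base) % OBJ_SIZE == 0;
}

/* Returns 1 if search is on the freelist, 0 if it is not, and -1 if the
 * walk runs into a pointer that fails the validity check. */
static int toy_on_freelist(const struct toy_slub_page *p, const void *search)
{
    const void *fp = p->freelist;
    size_t nr = 0;

    while (fp != NULL && nr <= NUM_OBJS) {
        if (fp == search)
            return 1;
        if (!valid_pointer(p, fp))
            return -1;
        fp = get_fp(p, fp);
        nr++;
    }
    return 0;
}
```

Unlike the kernel function, this model only reports corruption and does not try to repair the list. The nr bound plays the same role as in the excerpt: it stops the walk if a corrupted chain loops back on itself.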
This brings us to attacks against the SLUB memory allocator. The attack vector of corrupting adjacent objects on the same slab is fully applicable to SLUB and largely works like in the case of the SLAB allocator. However, in the case of SLUB there is an added attack vector: exploiting the allocator’s metadata (the ones responsible for finding the next free object on the slab). As twiz and sgrakkyu have demonstrated in their book on kernel exploitation, the slab can be misaligned by corrupting the least significant byte of the metadata of a free object that holds the pointer to the next free object. This misalignment of the slab allows us to create an in-slab fake object and by doing so to a) satisfy safeguard checks such as the one I explained in the previous paragraph when they are used, and b) hijack the kernel’s execution flow to our own code.
An example of SLUB metadata corruption and slab misalignment is the exploit for vulnerability CVE-2009-1046 which was an off-by-two kernel heap overflow. In this blog post, sgrakkyu explained how he turned this vulnerability into a reliable exploit (tiocl_houdini.c) by using only a one-byte overflow. If you’re wondering why a one-byte overflow is more reliable than a two-byte overflow, think about little-endian representation.
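The little-endian hint can be made concrete with a short sketch. The function below (its name and the example address are made up for illustration) builds the little-endian memory image of a 64-bit pointer value, overwrites its first n bytes as a linear overflow from the preceding object would, and returns the resulting pointer value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overwrite the first n bytes of the little-endian memory image of a
 * 64-bit pointer value with `fill` and return the resulting value. */
static uint64_t overflow_into_pointer(uint64_t ptr, size_t n, uint8_t fill)
{
    uint8_t image[8];
    uint64_t out = 0;

    assert(n <= sizeof(image));

    /* Byte 0 of the image is the least significant byte of the pointer. */
    for (size_t i = 0; i < sizeof(image); i++)
        image[i] = (uint8_t)(ptr >> (8 * i));

    /* A linear overflow reaches the lowest-addressed bytes first. */
    for (size_t i = 0; i < n; i++)
        image[i] = fill;

    for (size_t i = 0; i < sizeof(image); i++)
        out |= (uint64_t)image[i] << (8 * i);
    return out;
}
```

Taking the hypothetical next-free pointer 0xffff880012345640, a one-byte overwrite with 0x10 yields 0xffff880012345610, which stays within the same 256-byte window and therefore most likely on the same slab. A two-byte overwrite yields 0xffff880012341010, which can move the pointer by up to 64 KB and well outside the slab's page. One reading of the hint, then, is that the first overflowing byte lands on the least significant byte of the pointer, and limiting the corruption to it keeps the redirected pointer predictable.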
A public example of corrupting adjacent SLUB objects is the exploit i-can-haz-modharden.c by Jon Oberheide for vulnerability CVE-2010-2959 discovered by Ben Hawkes. In this blog post you can find an overview of the exploit and the technique.
SLOB
Finally, SLOB is a stripped down kernel allocator implementation designed for systems with limited amounts of memory, for example embedded versions/distributions of Linux. In fact its design is closer to traditional userland memory allocators rather than the slab allocators SLAB and SLUB. SLOB places all objects/structures on pages arranged in three linked lists, for small, medium and large allocations. Small are the allocations of size less than SLOB_BREAK1 (256 bytes), medium those less than SLOB_BREAK2 (1024 bytes), and large are all the other allocations:
#define SLOB_BREAK1 256
#define SLOB_BREAK2 1024

static LIST_HEAD(free_slob_small);
static LIST_HEAD(free_slob_medium);
static LIST_HEAD(free_slob_large);
Of course this means that in SLOB we can have objects/structures of different types and sizes on the same page. This is the main difference between SLOB and SLAB/SLUB. A SLOB page is defined as follows:
struct slob_page
{
    union
    {
        struct
        {
            unsigned long flags;    /* mandatory */
            atomic_t _count;        /* mandatory */
            slobidx_t units;        /* free units left in page */
            unsigned long pad[2];
            slob_t *free;           /* first free slob_t in page */
            struct list_head list;  /* linked list of free pages */
        };

        struct page page;
    };
};
The function slob_alloc() is SLOB’s main allocation routine and based on the requested size it walks the appropriate list trying to find if a page of the list has enough room to accommodate the new object/structure (the full function is here):
static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
{
    struct slob_page *sp;
    struct list_head *prev;
    struct list_head *slob_list;
    slob_t *b = NULL;
    unsigned long flags;

    if (size < SLOB_BREAK1)
        slob_list = &free_slob_small;
    else if (size < SLOB_BREAK2)
        slob_list = &free_slob_medium;
    else
        slob_list = &free_slob_large;

    ...

    list_for_each_entry(sp, slob_list, list)
    {
        ...
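The list selection at the top of slob_alloc() can be isolated into a tiny userland function, which makes the boundary cases easy to check. The enum and the function name are invented for this sketch; the two thresholds are the ones defined in the excerpt above.

```c
#include <assert.h>
#include <stddef.h>

#define SLOB_BREAK1 256
#define SLOB_BREAK2 1024

/* Hypothetical identifiers for the three SLOB free lists. */
enum slob_list_id { SLOB_LIST_SMALL, SLOB_LIST_MEDIUM, SLOB_LIST_LARGE };

/* Mirror the size comparisons used by slob_alloc() to pick a list. */
static enum slob_list_id slob_list_for(size_t size)
{
    if (size < SLOB_BREAK1)
        return SLOB_LIST_SMALL;
    else if (size < SLOB_BREAK2)
        return SLOB_LIST_MEDIUM;
    else
        return SLOB_LIST_LARGE;
}
```

Note that the comparisons are strict: a 255-byte request goes to the small list, while a request of exactly 256 bytes already goes to the medium one, and 1024 bytes to the large one. Two objects can only end up on the same page if their sizes select the same list.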
I think this is a good place to stop since I don’t want to go into too many details and because I really look forward to Dan Rosenberg’s talk.
Edit: Dan has published a whitepaper to accompany his talk with all the details on SLOB exploitation; you can find it here.
Notes
Wrapping this post up, I would like to mention that there are other slab allocators proposed and implemented for Linux apart from the above three. SLQB and SLEB come to mind, however as the benevolent dictator has ruled they are not going to be included in the mainline Linux kernel tree until one of the existing three has been removed.
Exploitation techniques and methodologies like the ones I mentioned in this post can be very helpful when you have a vulnerability you’re trying to develop a reliable exploit for. However, you should keep in mind that every vulnerability has its own set of requirements and conditions and therefore every exploit is a different story/learning experience. Understanding a bug and actually developing an exploit for it are two very different things.
Thanks to Dan and Dimitris for their comments.
References
The following resources were not linked directly in the discussion, but would be helpful in case you want to look more into the subject.
http://lxr.linux.no/linux+v3.1.6/
http://lwn.net/Articles/229984/
http://lwn.net/Articles/311502/
http://lwn.net/Articles/229096/
http://phrack.org/issues.html?issue=66&id=15#article
http://phrack.org/issues.html?issue=66&id=8#article
may 2011 0day
by argp on May.24, 2011, under exploitation, research, security
md5: bcaac03a882cb10ac7c01acdb36b6287
sha1: f653e2d13f97bc83b343ebc7d8e97d8890d8be8f
sha256: a76c77ba8f5b8f545f48de76b5613fa93c9f9a1eed2d691d6d8a6c01bf8f59bb
Short Black Hat Europe 2011 review
by argp on Mar.21, 2011, under security
I am not really the kind of person that writes conference reviews, but I will give you a short, almost telegraphical one of this year’s Black Hat Europe.
Day 1
Contrary to conference tradition, day 1 began without a keynote as Bruce Schneier was unable to be in Barcelona in the morning. The first talk I attended was Nitesh Dhanjani’s “New Age Attacks Against Apple’s iOS”. This talk mainly focused on abusing mobile Safari’s protocol handlers and, although well delivered, wasn’t that interesting to me. Next was Tom Keetch’s “Escaping From Microsoft Windows Sandboxes” in which Tom gave an overview of the Windows sandbox implementation, and explained how Internet Explorer, Google Chrome and Adobe Reader X use it. Furthermore, he presented mechanisms for bypassing the Windows sandbox both generically and specific to the above applications. I really liked this talk and it is my second choice for best talk of day 1. Then I presented my and Dimitris Glynos’ “Protecting the Core” talk; I got some nice feedback which we will incorporate in the updated version of the slide deck. Next was Mihai Chiriac’s “Rootkit Detection via Kernel Code Tunneling”, an interesting approach for detecting kernel rootkits on Windows with lightweight heuristic analysis and interesting features for live-system disinfection. The final talk of the day was also the best one, Chris Valasek’s and Ryan Smith’s “Exploitation in the Modern Era”. They successfully abstracted the process of exploitation into building blocks, or primitives according to the terminology they used in the talk, and showed how these building blocks were used during the development of two server-side Windows exploits. Day 1 ended with Bruce Schneier’s keynote on cyber war incidents, how countries prepare (or not) for it, and how cyber war will eventually and inevitably become part of all military confrontations.
Day 2
Day 2 started with an amazing talk, Sebastian Muniz’s and Alfredo Ortega’s “Fuzzing and Debugging Cisco IOS”. These two highly talented and competent researchers explained their setup and modifications to the Dynamips emulator for debugging, reversing and fuzzing IOS. Then I went to Tom Parker’s “Stuxnet Redux” talk, but I left after 20 minutes or so since the talk was too abstract and non-technical for my taste. I was able to attend the last 30 minutes of George Hedfors’ “Owning the data centre using Cisco NX-OS based switches”. George explained how he was able to break out of the restrictive shell of Cisco’s NX (which is Linux-based), as well as how the old CDP denial-of-service bug is unpatched on NX. FX’s “Building Custom Disassemblers” was, as I expected, nothing short of brilliant. He documented step-by-step the process of building a custom disassembler for the binary code of a certain Siemens microcontroller. At the end of the talk he revealed that this was actually one of the microcontrollers that Stuxnet was carrying code to attack. To be honest, the Stuxnet connection was not the interesting part of the talk (although it was fun). The insight on FX’s work process, and his tips and approaches on reverse engineering were the most interesting parts for me. I then attended Don Bailey’s “Attacking Microcontroller Environments from a Software Perspective”. Apart from the investment tips, he also gave us exploitation methodologies for attacking two different microcontrollers, one of the von Neumann and one of the Harvard architecture. The day ended with Vincenzo Iozzo’s “Mac Exploit Kitchen” workshop which was great fun.
In closing I would say that when compared to 2010, this year’s Black Hat Europe had more interesting (to me) talks. It was my second Black Hat Europe as a presenter in a row and it was great meeting new people as well as seeing those I met last year.
In between the technical sessions and after them I partied non-stop. The partying was hot.
Yes, I am being sarcastic. I am a nerd. I don’t go out much actually.
Protecting the Core – Black Hat Europe 2011
by argp on Feb.27, 2011, under exploitation, kernel, research, security
In about three weeks I will be presenting “Protecting the Core: Kernel Exploitation Mitigations” at Black Hat Europe 2011 in Barcelona, Spain. This is joint work with my good friend and co-researcher at Census, Inc, Dimitris Glynos. Our abstract follows:
The exploitation of operating system kernel vulnerabilities has received a great deal of attention lately. In userland most generic exploitation approaches have been defeated by countermeasure technologies. Contrary to userland protections, exploitation mitigation mechanisms for kernel memory corruptions have not been widely adopted. Recently this has started to change. Most operating system kernels have started to include countermeasures against NULL page mappings, stack and heap corruptions, as well as for other vulnerability classes. At the same time, researchers have concentrated on developing ways to bypass certain kernel protections on various operating systems. This presentation will describe in detail the state-of-the-art in kernel exploitation mitigations adopted (or not) by various operating systems (Windows, Linux, Mac OS X, FreeBSD) and mobile platforms (iOS, Android). Moreover, it will also provide approaches, notes, hints and references to existing work for bypassing some of these kernel protections.
This talk basically collects our joint experience in dealing with and researching kernel exploitation mitigations during kernel exploit development on various operating systems. Unfortunately Dimitris will not be able to travel to Barcelona, so I will present the talk alone. You can follow Dimitris or myself on Twitter to get updates relevant to our talk and our research in general. I am really looking forward to traveling to Barcelona (again) and meeting all the great people participating in Black Hat.
The title of our talk (Protecting the Core: Kernel Exploitation Mitigations) is, of course, a wordplay on the Phrack article “Attacking the Core: Kernel Exploiting Notes” by twiz and sgrakkyu. Their article focused on kernel attacks, while our talk focuses on the kernel defenses employed by popular operating systems. twiz and sgrakkyu have also written a highly recommended book on the attack side of things which greatly expands their Phrack article, namely A Guide to Kernel Exploitation: Attacking the Core.
CVE-2010-3081 exploit for Debian Linux
by argp on Sep.25, 2010, under exploitation, kernel, linux
I have ported the public exploit for CVE-2010-3081 to work on Debian Linux.
FreeBSD kernel NFS client local vulnerabilities
by argp on May.23, 2010, under advisories, exploitation, freebsd, kernel, research, security
| census ID: | census-2010-0001 |
| CVE ID: | CVE-2010-2020 |
| Affected Products: | FreeBSD 8.0-RELEASE, 7.3-RELEASE, 7.2-RELEASE |
| Class: | Improper Input Validation (CWE-20) |
| Remote: | No |
| Discovered by: | Patroklos Argyroudis |
We have discovered two improper input validation vulnerabilities in the FreeBSD kernel’s NFS client-side implementation (FreeBSD 8.0-RELEASE, 7.3-RELEASE and 7.2-RELEASE) that allow local unprivileged users to escalate their privileges, or to crash the system by performing a denial of service attack.
Details
FreeBSD is an advanced operating system which focuses on reliability and performance. More information about its features can be found here.
FreeBSD 8.0-RELEASE, 7.3-RELEASE and 7.2-RELEASE employ an improper input validation method in the kernel’s NFS client-side implementation. Specifically, the first vulnerability is in function nfs_mount() (file src/sys/nfsclient/nfs_vfsops.c) which is reachable from the mount(2) and nmount(2) system calls. In order for them to be enabled for unprivileged users the sysctl(8) variable vfs.usermount must be set to a non-zero value.
The function nfs_mount() employs an insufficient input validation method for copying data passed in a structure of type nfs_args from userspace to kernel. Specifically, the file handle buffer to be mounted (args.fh) and its size (args.fhsize) are completely user-controllable. The unbounded copy operation is in file src/sys/nfsclient/nfs_vfsops.c (the excerpts are from 8.0-RELEASE):
if (!has_fh_opt) {
        error = copyin((caddr_t)args.fh, (caddr_t)nfh,
            args.fhsize);
        if (error) {
                goto out;
        }
The declaration of the variables args and nfh is at:
static int
nfs_mount(struct mount *mp)
{
        struct nfs_args args = {
            .version = NFS_ARGSVERSION,
            .addr = NULL,
            .addrlen = sizeof (struct sockaddr_in),
            .sotype = SOCK_STREAM,
            .proto = 0,
            .fh = NULL,
            .fhsize = 0,
            .flags = NFSMNT_RESVPORT,
            .wsize = NFS_WSIZE,
            .rsize = NFS_RSIZE,
            .readdirsize = NFS_READDIRSIZE,
            .timeo = 10,
            .retrans = NFS_RETRANS,
            .maxgrouplist = NFS_MAXGRPS,
            .readahead = NFS_DEFRAHEAD,
            .wcommitsize = 0,                   /* was: NQ_DEFLEASE */
            .deadthresh = NFS_MAXDEADTHRESH,    /* was: NQ_DEADTHRESH */
            .hostname = NULL,
            /* args version 4 */
            .acregmin = NFS_MINATTRTIMO,
            .acregmax = NFS_MAXATTRTIMO,
            .acdirmin = NFS_MINDIRATTRTIMO,
            .acdirmax = NFS_MAXDIRATTRTIMO,
        };
        int error, ret, has_nfs_args_opt;
        int has_addr_opt, has_fh_opt, has_hostname_opt;
        struct sockaddr *nam;
        struct vnode *vp;
        char hst[MNAMELEN];
        size_t len;
        u_char nfh[NFSX_V3FHMAX];
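The missing step is a bounds check on the user-supplied length before the copy into the fixed-size buffer. The userland sketch below shows the general shape of such a check. It is not the FreeBSD patch: copy_file_handle() and TOY_FH_MAX are invented for illustration, memcpy() stands in for copyin(), the 64-byte buffer size is only an example value, and treating the length as a signed int is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_FH_MAX 64   /* hypothetical size of the destination buffer */

/* Copy a caller-supplied file handle into a fixed-size buffer, rejecting
 * negative lengths and lengths that do not fit in the destination. */
static int copy_file_handle(unsigned char dst[TOY_FH_MAX],
                            const void *src, int fhsize)
{
    assert(dst != NULL && src != NULL);

    if (fhsize < 0 || fhsize > TOY_FH_MAX)
        return EINVAL;          /* refuse instead of overflowing dst */

    memcpy(dst, src, (size_t)fhsize);
    return 0;
}
```

With this check in place a 128-byte handle is refused with EINVAL, while a handle of at most TOY_FH_MAX bytes is copied as before. Checking for a negative value matters when the length is signed, because it would otherwise be converted to a very large unsigned size at the copy.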
This vulnerability can cause a kernel stack overflow which leads to privilege escalation on FreeBSD 7.3-RELEASE and 7.2-RELEASE. On FreeBSD 8.0-RELEASE the result is a kernel crash/denial of service due to the SSP/ProPolice kernel stack-smashing protection which is enabled by default. Versions 7.1-RELEASE and earlier do not appear to be vulnerable since the bug was introduced in 7.2-RELEASE. In order to demonstrate the impact of the vulnerability we have developed a proof-of-concept privilege escalation exploit. A sample run of the exploit follows:
[argp@julius ~]$ uname -rsi
FreeBSD 7.3-RELEASE GENERIC
[argp@julius ~]$ sysctl vfs.usermount
vfs.usermount: 1
[argp@julius ~]$ id
uid=1001(argp) gid=1001(argp) groups=1001(argp)
[argp@julius ~]$ gcc -Wall nfs_mount_ex.c -o nfs_mount_ex
[argp@julius ~]$ ./nfs_mount_ex
[*] calling nmount()
[!] nmount error: -1030740736
nmount: Unknown error: -1030740736
[argp@julius ~]$ id
uid=0(root) gid=0(wheel) egid=1001(argp) groups=1001(argp)
The second vulnerability exists in the function mountnfs() that is called from function nfs_mount():
error = mountnfs(&args, mp, nam, args.hostname, &vp,
    curthread->td_ucred);
The function mountnfs() is reachable from the mount(2) and nmount(2) system calls by unprivileged users. As with the nfs_mount() case above, this requires the sysctl(8) variable vfs.usermount to be set to a non-zero value.
The file handle to be mounted (argp->fh) and its size (argp->fhsize) are passed to function mountnfs() from function nfs_mount() and are user-controllable. These are subsequently used in an unbounded bcopy() call (file src/sys/nfsclient/nfs_vfsops.c):
bcopy((caddr_t)argp->fh, (caddr_t)nmp->nm_fh, argp->fhsize);
The above can cause a kernel heap overflow when argp->fhsize is larger than 128 bytes (the size of nmp->nm_fh), since nmp is an item allocated from the Universal Memory Allocator (UMA, the FreeBSD kernel’s heap allocator) zone nfsmount_zone (again from src/sys/nfsclient/nfs_vfsops.c):
static int
mountnfs(struct nfs_args *argp, struct mount *mp, struct sockaddr *nam,
    char *hst, struct vnode **vpp, struct ucred *cred)
{
        struct nfsmount *nmp;
        struct nfsnode *np;
        int error;
        struct vattr attrs;

        if (mp->mnt_flag & MNT_UPDATE) {
                nmp = VFSTONFS(mp);
                printf("%s: MNT_UPDATE is no longer handled here\n", __func__);
                free(nam, M_SONAME);
                return (0);
        } else {
                nmp = uma_zalloc(nfsmount_zone, M_WAITOK);
On FreeBSD 8.0-RELEASE, 7.3-RELEASE and 7.2-RELEASE this kernel heap overflow can lead to privilege escalation and/or a kernel crash/denial of service attack. Similarly to the first vulnerability, FreeBSD 7.1-RELEASE and earlier versions do not appear to be vulnerable. We have developed a proof-of-concept DoS exploit to demonstrate the vulnerability. Furthermore, we have also developed a privilege escalation exploit for this second vulnerability which will not be released at this point.
FreeBSD has released an official advisory and a patch to address both vulnerabilities. All affected parties are advised to follow the upgrade instructions included in the advisory and patch their systems.
FreeBSD kernel exploitation mitigations
by argp on Apr.26, 2010, under exploitation, freebsd, kernel, research, security
In my recent Black Hat Europe 2010 talk I gave an overview of the kernel exploitation prevention mechanisms that exist on FreeBSD. A few people at the conference have subsequently asked me to elaborate on the subject. In this post I will collect all the information from my talk and the various discussions I had in the Black Hat conference hallways.
Userland memory corruption protections (also known as exploitation mitigations) have made most of the generic exploitation approaches obsolete. This is true both on Windows and Unix-like operating systems. In order to successfully achieve arbitrary code execution from a vulnerable application nowadays, a researcher needs to study the memory layout and the code structure of the particular application.
On the other hand, exploitation mitigation mechanisms for kernel code have not seen the same level of adoption mostly due to the performance penalty they introduce. This has increased the interest in viewing the operating system kernel as part of the attack surface targeted in a penetration test. Therefore, many operating systems have started to introduce kernel exploitation mitigations. The recent CanSecWest talk by Tavis Ormandy and Julien Tinnes titled “There’s a party at Ring0, and you’re invited” presented an overview of such mitigations on Windows and Linux.
FreeBSD also has a number of memory corruption protections for kernel code. Not all of these were developed with the goal of thwarting attacks; some were written primarily as debugging mechanisms. Some are enabled by default in the latest stable version (8.0-RELEASE) and some are not.
Stack-smashing
Kernel stack-smashing protection for FreeBSD was introduced in version 8.0 via ProPolice/SSP. Specifically, the file src/sys/kern/stack_protector.c is compiled with gcc’s -fstack-protector option and registers an event handler called __stack_chk_init that generates a random canary value (the “guard” variable in SSP terminology) placed between the local variables and the saved frame pointer of a kernel process’s stack during a function’s prologue. Below is the relevant part of the stack_protector.c file:
    __stack_chk_guard[8] = {};
    ...
    #define __arraycount(__x) (sizeof(__x) / sizeof(__x[0]))
    static void
    __stack_chk_init(void *dummy __unused)
    {
        size_t i;
        long guard[__arraycount(__stack_chk_guard)];

        arc4rand(guard, sizeof(guard), 0);
        for (i = 0; i < __arraycount(guard); i++)
            __stack_chk_guard[i] = guard[i];
    }
During the protected function’s epilogue the canary is checked against its original value. If it has been altered the kernel calls panic(9) bringing down the whole system, but also stopping any execution flow redirection caused by manipulation of the function’s saved frame pointer or saved return address (again from the stack_protector.c file):
    void
    __stack_chk_fail(void)
    {

        panic("stack overflow detected; backtrace may be corrupted");
    }
ProPolice/SSP also performs local variable and pointer reordering in order to protect against the corruption of variables and pointers due to stack buffer overflow vulnerabilities.
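The prologue/epilogue canary check described in this section can be modelled in userland. The following is only a sketch, not compiler-generated SSP code: the "stack frame" is a plain byte array, the guard is a fixed constant instead of a random value from arc4rand(), and the names demo_guard, BUF_SIZE and copy_with_canary() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16

/* Hypothetical stand-in for the random guard; a real canary is not a constant. */
static const unsigned long demo_guard = 0x5a17c0deUL;

/*
 * Model a frame as [ local buffer | canary ] and copy n bytes into the buffer.
 * Returns 1 if the canary is intact afterwards, 0 if it was overwritten
 * (the point at which the kernel would call panic(9)).
 */
static int
copy_with_canary(const unsigned char *src, size_t n)
{
    unsigned char frame[BUF_SIZE + sizeof(demo_guard)];
    unsigned long check;

    if (n > sizeof(frame))
        n = sizeof(frame);      /* keep the demonstration itself in bounds */

    /* "Prologue": place the canary right after the local buffer. */
    memcpy(frame + BUF_SIZE, &demo_guard, sizeof(demo_guard));

    /* The "vulnerable" copy: n may exceed BUF_SIZE. */
    memcpy(frame, src, n);

    /* "Epilogue": compare the canary against its original value. */
    memcpy(&check, frame + BUF_SIZE, sizeof(check));
    return (check == demo_guard);
}
```

A copy of up to BUF_SIZE bytes leaves the canary intact, while any longer copy has to pass through the canary before it could reach whatever lies beyond it, which is what makes the overflow detectable.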
NULL page mappings
Also in version 8.0, FreeBSD introduced a protection against user mappings at address 0 (NULL). This exploitation mitigation mechanism is exposed through the sysctl(8) variable security.bsd.map_at_zero and is enabled by default (i.e. the variable has the value 0). When the feature is enabled and a user process requests a mapping of the NULL page, the request fails with an error. Obviously this protection is ineffective against vulnerabilities in which the attacker can (directly or indirectly) control the kernel dereference offset. For an applicable example see the exploit for vulnerability CVE-2008-3531 I have previously published.
Heap-smashing
FreeBSD has introduced kernel heap-smashing detection in 8.0-RELEASE via an implementation called RedZone. RedZone is oriented towards debugging the kernel memory allocator rather than detecting and stopping deliberate attacks against it. If enabled (it is disabled by default), RedZone places a static canary value of 16 bytes above and below each buffer allocated on the heap. The canary value consists of the hexadecimal value 0x42 repeated in these 16 bytes.
During a heap buffer’s deallocation the canary value is checked and if it has been corrupted the details of the corruption (address of the offending buffer and stack traces of the buffer’s allocation and deallocation) are logged. The code that performs the check for a heap overflow is the following (from file src/sys/vm/redzone.c):
    ncorruptions = 0;
    for (i = 0; i < REDZONE_CFSIZE; i++, faddr++) {
        if (*(u_char *)faddr != 0x42)
            ncorruptions++;
    }
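Since the canary is a static, publicly known value, an attacker who controls the overflowing data can simply rewrite the 0x42 bytes as part of the overflow, so this protection mechanism is easily bypassed.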
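The scheme, and why a static canary is weak, can be sketched in userland. This is not the kernel's redzone.c: the wrappers rz_alloc(), rz_count_corruptions() and rz_free() are hypothetical names built on malloc(3), and only the 16-byte size and the 0x42 fill value are taken from the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RZ_SIZE 16      /* canary length on each side, per the text */
#define RZ_BYTE 0x42    /* static canary byte, per the text */

/* Allocate n usable bytes surrounded by two 16-byte canaries. */
static unsigned char *
rz_alloc(size_t n)
{
    unsigned char *raw = malloc(n + 2 * RZ_SIZE);

    if (raw == NULL)
        return (NULL);
    memset(raw, RZ_BYTE, RZ_SIZE);                  /* canary below */
    memset(raw + RZ_SIZE + n, RZ_BYTE, RZ_SIZE);    /* canary above */
    return (raw + RZ_SIZE);
}

/* Count canary bytes on either side that no longer hold 0x42. */
static int
rz_count_corruptions(const unsigned char *buf, size_t n)
{
    const unsigned char *below = buf - RZ_SIZE;
    const unsigned char *above = buf + n;
    int i, ncorruptions = 0;

    for (i = 0; i < RZ_SIZE; i++) {
        if (below[i] != RZ_BYTE)
            ncorruptions++;
        if (above[i] != RZ_BYTE)
            ncorruptions++;
    }
    return (ncorruptions);
}

/* Release a buffer obtained from rz_alloc(). */
static void
rz_free(unsigned char *buf)
{
    free(buf - RZ_SIZE);
}
```

Overflowing a 32-byte buffer by four bytes of 0x41 is counted as four corruptions, but the same overflow written with 0x42 bytes leaves the count at zero and goes unnoticed.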
Use-after-free
MemGuard is a replacement kernel memory allocator introduced in FreeBSD version 6.0 and designed to detect use-after-free bugs in kernel code. Like RedZone, MemGuard mainly targets debugging scenarios and does not constitute a mechanism to mitigate deliberate attacks. Moreover, MemGuard is not compatible with, and cannot replace, calls to the Universal Memory Allocator (UMA, the default kernel allocator in FreeBSD). For this reason (and also due to the overhead it introduced even before UMA was developed), it is not enabled by default.
Black Hat Europe 2010 update
by argp on Apr.22, 2010, under exploitation, freebsd, kernel, research, security
Black Hat Europe 2010 is now over and, after a brief delay caused by the ash cloud, I am back in Greece. It has been a great conference, flawlessly organised and with many outstanding presentations. I would like to thank everyone who attended my presentation, as well as all the kind people who spoke to me before and afterwards. I hope to meet all of you again at a future event.
My presentation, titled “Binding the Daemon: FreeBSD Kernel Stack and Heap Exploitation”, was divided into four parts. In the first part I gave an overview of the published work on the subject of kernel exploitation for Unix-like operating systems. The second and third parts were the main body of the presentation. Specifically, in the second part I explained how a kernel stack overflow vulnerability on FreeBSD can be leveraged to achieve arbitrary code execution. The third part focused on a detailed security analysis of the Universal Memory Allocator (UMA), the FreeBSD kernel’s memory allocator. I explored how UMA overflows can lead to arbitrary code execution in the context of the latest stable FreeBSD kernel (8.0-RELEASE), and I developed an exploitation methodology for privilege escalation and kernel continuation.
In the fourth and final part I gave a demo of a FreeBSD kernel local 0day vulnerability that I have discovered. However, I did not release the details of the vulnerability in my Black Hat presentation. The details of this vulnerability (plus the proof-of-concept exploit) will be released shortly, once the relevant code is patched and the official advisory is out.
Below you may find all the material of my presentation, updated with some extra information and minor corrections:
- Slides: bheu-2010-slides.pdf
- White paper: bheu-2010-wp.pdf
- Source code: bheu-2010-src.tar.gz

