CVE-2016-6187: Exploiting Linux kernel heap off-by-one

by Vitaly Nikolenko
  Posted on October 16, 2016 at 8:38 PM
  Introduction

   I guess the reason I decided to write about this vulnerability is that when I posted it on Twitter, I received a few DMs saying that either this kernel path wasn't vulnerable (i.e., people couldn't see where the off-by-one was) or that it wasn't exploitable. The other reason is that I wanted to try the userfaultfd() syscall in practice and needed a real UAF to play with.
   First, I don't know if this vulnerability got into any upstream kernels on any major distributions. I've only checked the Ubuntu line and Yakkety wasn't affected. But hey, backports happen quite often :]. The bug was introduced by the bb646cdb12e75d82258c2f2e7746d5952d3e321a commit and fixed in 30a46a4647fd1df9cf52e43bf467f0d9265096ca .
  Since I couldn't find a vulnerable Ubuntu kernel, I've compiled the 4.5.1 kernel on Ubuntu 16.04 (x86_64). It's worth mentioning that this vulnerability only affects distributions that use AppArmor by default (such as Ubuntu).
  Vulnerability

   Writing into /proc/self/attr/current triggers the proc_pid_attr_write() function. The following is the code before the vulnerability was introduced:
static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
                                   size_t count, loff_t *ppos)
{
        struct inode * inode = file_inode(file);
        char *page;
        ssize_t length;
        struct task_struct *task = get_proc_task(inode);
        length = -ESRCH;
        if (!task)
                goto out_no_task;
        if (count > PAGE_SIZE)                            [1]
                count = PAGE_SIZE;
        /* No partial writes. */
        length = -EINVAL;
        if (*ppos != 0)
                goto out;
        length = -ENOMEM;
        page = (char*)__get_free_page(GFP_TEMPORARY);     [2]
        if (!page)
                goto out;
        length = -EFAULT;
        if (copy_from_user(page, buf, count))             [3]
                goto out_free;
        /* Guard against adverse ptrace interaction */
        length = mutex_lock_interruptible(&task->signal->cred_guard_mutex);
        if (length < 0)
                goto out_free;
        length = security_setprocattr(task,
                                      (char*)file->f_path.dentry->d_name.name,
                                      (void*)page, count);
...
  The buf parameter represents the user-supplied buffer (of length count) that's being written to /proc/self/attr/current. In [1], a check ensures that this buffer fits into a single page (4096 bytes by default). In [2] and [3], a single page is allocated and the user-space buffer is copied into the newly allocated page. This page is then passed to security_setprocattr, which represents the LSM hook (AppArmor, SELinux, Smack). In the case of Ubuntu, this hook triggers the apparmor_setprocattr() function shown below:
static int apparmor_setprocattr(struct task_struct *task, char *name,
                                void *value, size_t size)
{
        struct common_audit_data sa;
        struct apparmor_audit_data aad = {0,};
        char *command, *args = value;
        size_t arg_size;
        int error;
        if (size == 0)
                return -EINVAL;
        /* args points to a PAGE_SIZE buffer, AppArmor requires that
         * the buffer must be null terminated or have size <= PAGE_SIZE -1
         * so that AppArmor can null terminate them
         */
        if (args[size - 1] != '\0') {                     [4]
                if (size == PAGE_SIZE)
                        return -EINVAL;
                args[size] = '\0';
        }
...
  In [4], if the last byte of the user-supplied buffer is not null and the size of the buffer is less than the page size, the buffer is terminated with a null. On the other hand, if the user-supplied buffer fills the entire page (allocated in [2]), the function bails out and returns -EINVAL.
   The following shows the change to proc_pid_attr_write() (see [5]) after the vulnerability was introduced:
static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf,
                                   size_t count, loff_t *ppos)
{
        struct inode * inode = file_inode(file);
        void *page;
        ssize_t length;
        struct task_struct *task = get_proc_task(inode);
        length = -ESRCH;
        if (!task)
                goto out_no_task;
        if (count > PAGE_SIZE)
                count = PAGE_SIZE;
        /* No partial writes. */
        length = -EINVAL;
        if (*ppos != 0)
                goto out;
        page = memdup_user(buf, count);                   [5]
        if (IS_ERR(page)) {
                length = PTR_ERR(page);
                goto out;
        }
        /* Guard against adverse ptrace interaction */
        length = mutex_lock_interruptible(&task->signal->cred_guard_mutex);
        if (length < 0)
                goto out_free;
        length = security_setprocattr(task,
                                      (char*)file->f_path.dentry->d_name.name,
                                      page, count);
...
  Unlike __get_free_page(), memdup_user() allocates a block of memory of exactly count bytes and copies the user-supplied data into it. Hence, the size of the object being allocated is no longer fixed at 4096 bytes (even though that's still the maximum buffer size). Let's assume that the user-supplied data is 128 bytes in size and the last byte of this buffer is not null. When apparmor_setprocattr() is triggered, args[128] will be set to 0 because the check is still against PAGE_SIZE and not the actual size of the buffer:
if (args[size - 1] != '\0') {
        if (size == PAGE_SIZE)
                return -EINVAL;
        args[size] = '\0';
}
  Since the objects are allocated dynamically on the heap, the first (least-significant) byte of the next object will be overwritten with a null. The standard technique of placing a target object (containing a function pointer as its first member) right after the vulnerable object won't work here. One idea was to overwrite a reference counter in some object (of the same size as the vulnerable object) and then trigger a UAF (thanks to Nicolas Trippar for suggesting this). While on the subject of counter overflows: if you'll be at Ruxcon next week (yeah, not Kiwicon, because this talk was just too lame for their lineup this year :), check out my talk on exploiting counter overflows in the kernel. Object reference counters (represented by the atomic_t type = signed int) are generally the first members of the struct. Since counter values are typically under 255 for most objects, overwriting the least-significant byte of such an object would clear the counter and result in a standard UAF. However, to exploit this vulnerability, I decided to go with a different approach: overwriting SLUB freelist pointers.
  Exploitation

  The nice thing about this vulnerability is that we control the size of the target object. To trigger the vulnerability, the object size should be set to one of the cache sizes (i.e., 8, 16, 32, 64, 96, etc.). We won't go into details on how the SLUB allocator (the default kernel memory allocator on Linux) works. All we need to know is that (different) objects of the same size are accumulated into the same caches, for both general-purpose and special-purpose allocations. Slabs are basically pages in caches that contain objects of the same size. Free objects have a "next free" pointer at offset 0 (by default) pointing to the next free object in the slab.
   The idea is to place our vulnerable object (A) next to a free object (B) in the same slab and then clear the least-significant byte of the "next free" pointer of object B. When two new objects are allocated in the same slab, the last object will be allocated over objects A and/or B, depending on the original value of the "next free" pointer:
     

[diagram: the target object overlapping objects A and B in the slab]
     The scenario above (overlapping both A and B objects) is just one of the possible outcomes. The "shift" value for the target object is 1 byte (0 to 255) and the final target object's position would depend on the original "next free" pointer value and the object size.
   Assuming that the target object will overlap both objects A and B , we would like to control the contents of both of these objects.
  At a high level, the exploitation procedure is as follows:
  
       
  • Place the vulnerable object A next to free object B in the same slab   
  • Overwrite the least-significant byte of the "next free" pointer in B   
  • Allocate two new objects in the same slab: the first object will be placed in B and the second object will represent our target object C   
  • If we control the contents of objects A and B , we can force object C to be allocated in user space   
  • Assuming object C has a function pointer that can be triggered from user space, set this pointer to our privilege escalation payload in user space or possibly a ROP chain (to bypass SMEP).  
  To perform steps 1-3, sequential object allocations can be achieved using a standard heap exhaustion technique.
   Next, we need to pick the right object size. Objects that are larger than 128 bytes (i.e., kmalloc caches 256, 512, 1024, etc.) won't work here. Let's assume that the start slab address is 0x1000 (note that slab start addresses are aligned to the page size and sequential object allocations are contiguous). The following C program lists the allocations for a single page given the object size:
// page_align.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
        int i;
        char *page_begin = (char *)0x1000;
        for (i = 0; i < 0x1000; i += atoi(argv[1]))
                printf("%p\n", (void *)(page_begin + i));
        return 0;
}
For objects that are 256 bytes (or > 128 and <= 256 bytes), we have the following pattern:
$ ./align 256
0x1000
0x1100
0x1200
0x1300
0x1400
0x1500
0x1600
0x1700
0x1800
...
The least significant byte for all allocations in the slab is 0 and overwriting the "next free" pointer of the adjacent free object with a null will have no effect:
     

[diagram: 256-byte slab layout; the least-significant byte of every "next free" pointer is already 0]
    For the 128-byte cache, there're two possible options:
$ ./align 128
0x1000
0x1080
0x1100
0x1180
0x1200
0x1280
0x1300
0x1380
0x1400
...
   

[diagram: the two possible "next free" pointer positions in the 128-byte slab]
     The first option is similar to the 256-byte example above (the least-significant byte of the "next free" pointer is already 0). The second option is interesting because overwriting the least-significant byte of the "next free" pointer will point it to the free object itself. Allocating some object (A) with the first 8 bytes set to some (fixed) user-space memory address, followed by the allocation of the target object (B), will place object B at the user-controlled memory address in user space. This is probably the best option both in terms of reliability and ease of exploitation:
  
       
  • There's a 50/50 chance of success. If it's the first option, there's no crash and we can try again.   
  • Finding an object with some user-space address (first 8 bytes) that would be placed in the kmalloc-128 cache is not that hard.  
   Despite this being the best approach, I've decided to go with 96-byte objects and glue it all together with msgsnd() heap exhaustion/spraying. The main (and only) reason for this is that I've already found a target object that I wanted to use and the size of that object happened to be 96 bytes. Thanks to Thomas Pollet for helping find the right heap objects and automate this tedious process with gdb/python at runtime!
   However, there're obvious downsides to using 96-byte objects; the main one is exploit reliability. The idea is to exhaust the slabs (i.e., fill in the partial slabs) using the standard msgget() technique with 48-byte objects (the other 48 bytes are used for the message header). This will serve as a heap spray as well, since we control half (48 bytes) of the msg object. We also control the contents of the vulnerable object (the data written to /proc/self/attr/current from user space). If the target object is allocated so that its first 8 bytes overlap with our controlled data, the exploit will succeed. On the other hand, if these 8 bytes overlap with the msg header (which we don't control), this will result in a page fault, but the kernel is likely to recover by itself. Based on my analysis, there're a couple of cases where the "next free" pointer would overlap with the random msg header of the previously allocated object.
  There're some tricks to improve the reliability of the exploit however.
  Target object

   For the target object, I've used struct subprocess_info which is exactly 96 bytes in size. To trigger the allocation of this object, the following socket operation can be used with a random protocol family:
socket(22, AF_INET, 0);
Socket family 22 doesn't exist, but module autoloading will still be triggered, reaching the following function in the kernel:
int call_usermodehelper(char *path, char **argv, char **envp, int wait)
{
        struct subprocess_info *info;
        gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
        info = call_usermodehelper_setup(path, argv, envp, gfp_mask,    [6]
                                         NULL, NULL, NULL);
        if (info == NULL)
                return -ENOMEM;
        return call_usermodehelper_exec(info, wait);                    [7]
}
  call_usermodehelper_setup [6] will then allocate the object and initialise its fields:
struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
                char **envp, gfp_t gfp_mask,
                int (*init)(struct subprocess_info *info, struct cred *new),
                void (*cleanup)(struct subprocess_info *info),
                void *data)
{
        struct subprocess_info *sub_info;
        sub_info = kzalloc(sizeof(struct subprocess_info), gfp_mask);
        if (!sub_info)
                goto out;
        INIT_WORK(&sub_info->work, call_usermodehelper_exec_work);
        sub_info->path = path;
        sub_info->argv = argv;
        sub_info->envp = envp;
        sub_info->cleanup = cleanup;
        sub_info->init = init;
        sub_info->data = data;
  out:
        return sub_info;
}
  Once the object is initialised, it will be passed to call_usermodehelper_exec in [7]:
int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
{
        DECLARE_COMPLETION_ONSTACK(done);
        int retval = 0;

        if (!sub_info->path) {                            [8]
                call_usermodehelper_freeinfo(sub_info);
                return -EINVAL;
        }
...
  If the path variable is null [8], then the cleanup function is executed and the object is freed:
static void call_usermodehelper_freeinfo(struct subprocess_info *info)
{
        if (info->cleanup)
                (*info->cleanup)(info);
        kfree(info);
}
  If we could overwrite the cleanup function pointer (remember that this object is now allocated in user space), then we'll have arbitrary code execution with CPL=0. The only problem is that subprocess_info allocation and freeing happen on the same path. One way to modify the object's function pointer is to somehow suspend execution before (*info->cleanup)(info) gets called and set the function pointer to our privilege escalation payload. I could have found other objects of the same size with two "separate" paths for allocation and function triggering, but I needed a reason to try userfaultfd() and the page splitting idea.
   The userfaultfd syscall can be used to handle page faults in user space. We can allocate a page in user space and set up a handler (as a separate thread); when this page is accessed either for reading or writing, execution will be transferred to the user-space handler to deal with the page fault. There's nothing new here and this was mentioned by Jann Horn.
   The SLUB allocator accesses the object (first 8 bytes to update the cache freelist pointer) before it's allocated. Hence, the idea is to split the subprocess_info object over two contiguous pages so that all object fields except say the last one (i.e., void *data ) will be placed in the same page:
     

[diagram: the subprocess_info object split across two contiguous pages, with the last member (void *data) in the second page]
     Then we would set up the user-space page fault handler to deal with PFs in the second page. When call_usermodehelper_setup gets to assigning sub_info->data , execution will be transferred to the user-space PF handler (where we can change previously assigned sub_info->cleanup function pointer). This approach would've worked if the target object was allocated with kmalloc . Unlike kmalloc , kzalloc uses memset(..., 0, size(...)) to zero out the object after the allocation. Unlike glibc, kernel's memset implementation is pretty straightforward (i.e., setting single bytes sequentially):
void *memset(void *s, int c, size_t count)
{
        char *xs = s;

        while (count--)
                *xs++ = c;
        return s;
}
This means that setting the user-space PF handler on the second page will no longer work because a PF will be triggered by memset. However, it's still possible to bypass this by chaining user-space page faults:
  
       
  • Allocate two consecutive pages, split the object over these two pages (as before) and set up the page handler for the second page.   
  • When the user-space PF is triggered by memset, set up another user-space PF handler but for the first page.   
  • The next user-space PF will be triggered when object variables (located in the first page) get initialised in call_usermodehelper_setup . At this point, set up another PF for the second page.   
  • Finally, the last user-space PF handler can modify the cleanup function pointer (by setting it to our privilege escalation payload or a ROP chain) and set the path member to 0 (since these members are all located in the first page and already initialised).  
   Setting up user-space PF handlers for already "page-faulted" pages can be accomplished by munmapping/mapping these pages again and then passing them to userfaultfd(). The PoC for 4.5.1 can be found here. There's nothing specific to the kernel version though (it should work on all vulnerable kernels). There's no privilege escalation payload, but the PoC will execute instructions at the user-space address 0xdeadbeef.
  Conclusion

   There're possibly easier ways to exploit this vulnerability, but I just wanted to make my found target object "work" with userfaultfd. Clean-up is missing, but since we're allocating IPC msg objects, it's not very important, and there're a few easy ways to stabilise the system.
  To Quihoo 360 teams

   Don't just put your name on my exploits and say they're yours. At least acknowledge the original authors. Bet Quihoo will take this PoC, add a standard commit_creds(prepare_kernel_cred()) payload and submit it to OSS :]


