AsProtect - A reverse engineering approach

Wednesday, December 13 2006 @ 08:49 AM CET

Contributed by: ColdT

***** ***** by crUsAdEr ***** ****

This tutorial aims to discuss the internal workings of AsProtect, more than just how to unpack it. So if you just want to unpack it and don't want to spend your time on reverse engineering, forget the second part of this tutorial!

TOOLS used :
IDA 4.15
SoftICE on Win2k
Revirgin (for unpacking only)
WinHex (for unpacking only)

Targets : ReGet Deluxe 3.0 beta (build 117) (but I think any program protected with the same version of AsProtect will do)
Included : [file:20061209170544579 asprotect dll]

1. Unpacking AsProtect
This gets boring after a while, so here I only summarise the steps you should take when unpacking an AsProtected program
(please read Spl/j's tutorial on commview; not much has changed since!)

  1. - Run the program

  2. - Run WinHex and open the program's memory space, then search for the AsProtect signature bytes "61 FF E0"

  3. - If you are on Win98, run super bpm. (Win2k users can use Solomon's trick of "bpx 80464C50" to prevent AsProtect from clearing your bpm.)

  4. - Close the program, put a bpx on GetVersion and run the program again. SoftICE should break; press F12 and you should be in the AsProtect code. Trace with F8 and take note of where the result of GetVersion is stored in memory. AsProtect stores this result and uses it later to emulate the API. The other emulated APIs are GetModuleHandleA, GetCurrentProcess, GetCurrentProcessId and GetCommandLineA, so trace and watch where their results are stored too. Write down these memory addresses.

  5. - Do a bpm on the address of "61 FF E0" you found earlier

  6. - Let the program run and SoftICE will break again at "popad ; jmp eax", where eax is your OEP. Dump here.

  7. - Run Revirgin and let it resolve the IAT (read the Revirgin manual on how to do this)

  8. - There are a few missing APIs which are either emulated (remember the addresses we wrote down earlier?) or redirected. You need to fix these manually and they should be dealt with case by case. (again, read Spl/j's tutorial)

  9. - Sometimes AsProtect dips inside the program code before hitting the OEP to trick /tracex, set various flags or encrypt/decrypt some code, but these dips should be dealt with individually.
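
Step 2 above is a plain byte-pattern search ("61 FF E0" is simply the encoding of "popad ; jmp eax"). If you would rather script it against a raw memory dump than use WinHex, here is a minimal Python sketch. The function name find_signature and the idea of working on a dump held in a bytes object are my own assumptions for illustration, they are not part of the original workflow.

```python
# "61 FF E0" = popad ; jmp eax
SIGNATURE = bytes.fromhex("61FFE0")


def find_signature(dump, base=0, pattern=SIGNATURE):
    """Return the address (base + offset) of every occurrence of pattern.

    dump : raw bytes of the dumped memory region
    base : address at which the dumped region starts (hypothetical parameter)
    """
    hits = []
    pos = dump.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        # continue right after the previous hit so overlapping matches are kept
        pos = dump.find(pattern, pos + 1)
    return hits
```

Each address returned is a candidate for the bpm in step 5; expect to check them by hand, since three bytes can also match by accident.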

Phew, done with the boring part. Now, if you are interested in how AsProtect really works and are willing to work on your own to learn the art of reverse engineering, then read on. The next few sections discuss how AsProtect decrypts the program, how APIs are really emulated, how APIs are mangled, how dippings are done etc... and of course how your bpm are cleared!

2. AsProtect library
Yep, you are right! AsProtect has a dll that performs all of its tasks of decrypting and loading the target. This dll is decrypted at runtime, so this section discusses how to obtain it from memory, rebuild it and use it to study AsProtect. Use IDA to disassemble the program; you should be able to overcome most of the obfuscation code (grab some IDA tutorials or read the IDA manual). My approach is to disassemble and debug bit by bit in parallel.

Trace with SoftICE from the beginning of the protected program and you will soon be brought into the AsProtect code in the last section of the program. F8 a few steps and you will be in the first decryption loop :

0067B083                   ; ----------------------------------------------------------------
0067B083                   loc_67B083:                     ; CODE XREF: 0067B06Fp
0067B083 8A DD                mov     bl, ch
0067B085 5E                   pop     esi                  ; esi := 67B074
0067B086 8A C3                mov     al, bl
0067B088 81 C6 C2 07 00 00    add     esi, 7C2h            ; esi := 67B836
0067B08E 56                   push    esi
0067B08F 8A E2                mov     ah, dl
0067B091 5B                   pop     ebx                  ; ebx := 67B836
0067B092 68 B5 01 00 00       push    1B5h
0067B097 59                   pop     ecx                  ; ecx := 1B5
0067B097                                                   ; Number of time loop perform
0067B097                                                   ; or number of dwords to be decrypted
0067B098                   loop1:                          ; CODE XREF: 0067B154j
0067B098 FF 36                push    dword ptr [esi]
0067B09A 66 B8 8B 7F          mov     ax, 7F8Bh
0067B09E 5A                   pop     edx                  ; edx := [esi]
0067B09F 0F 89 0C 00 00 00    jns     loc_67B0B1
0067B09F                   ; ----------------------------------------------------------------
0067B0A5 0F                   db  0Fh ;  
0067B0B0 00                   db    0 ;  
0067B0B1                   ; ----------------------------------------------------------------
0067B0B1                   loc_67B0B1:                     ; CODE XREF: 0067B09Fj
0067B0B1 81 C2 51 CF 9C 42    add     edx, 429CCF51h       ; add edx
0067B0B7 66 BF 0A 05          mov     di, 50Ah
0067B0BB 81 F2 B6 23 95 0D    xor     edx, 0D9523B6h       ; xor edx
0067B0C1 80 CB 2D             or      bl, 2Dh
0067B0C4 81 EA B7 D8 25 0E    sub     edx, 0E25D8B7h       ; sub edx
0067B0CA 68 B0 D5 A1 60       push    60A1D5B0h
0067B0CF E8 14 00 00 00       call    sub_67B0E8
0067B0CF                   ; ----------------------------------------------------------------
0067B0D4 DC                   db 0DCh ; _
0067B0E7 5B                   db  5Bh ; [
0067B0E8                   ;  S U B R O U T I N E 
0067B0E8                   sub_67B0E8 proc near         ; CODE XREF: 0067B0CFp
0067B0E8 66 BF A4 D9          mov     di, 0D9A4h
0067B0EC 58                   pop     eax               ; eax := 67B0D4
0067B0ED 5F                   pop     edi
0067B0EE 89 16                mov     [esi], edx        ; store back edx into [esi]
0067B0F0 80 F7 27             xor     bh, 27h
0067B0F3 81 EE 7D 99 CB 4F    sub     esi, 4FCB997Dh
0067B0F9 66 8B D9             mov     bx, cx
0067B0FC 81 C6 79 99 CB 4F    add     esi, 4FCB9979h     ; sub esi, 4
0067B102 66 B8 35 7B          mov     ax, 7B35h
0067B106 49                   dec     ecx                ; decrease counter
0067B107 0F 85 22 00 00 00    jnz     continue_decrypt
0067B10D 0F 8A 06 00 00 00    jp      loc_67B119
0067B10D                   ; --------------------------------------------------------------
0067B113 81                   db  81h ; 
0067B118 74                   db  74h ; t
0067B119                   ; --------------------------------------------------------------
0067B119                   loc_67B119:                   ; CODE XREF: sub_67B0E8+25j
0067B119 E9 48 00 00 00                                  ; 0067B190j
0067B119                      jmp     loc_67B166
0067B119                   ; --------------------------------------------------------------
0067B11E A5                   db 0A5h ; 
0067B12E 15                   db  15h ;  
0067B12F                   ; --------------------------------------------------------------
0067B12F                   continue_decrypt:             ; CODE XREF: sub_67B0E8+1Fj
0067B12F E8 0D 00 00 00       call    loc_67B141
0067B12F                   ; --------------------------------------------------------------
0067B134 91                   db  91h ; 
0067B140 85                   db  85h ; 
0067B140                   sub_67B0E8 endp
0067B141                   ; --------------------------------------------------------------
0067B141                   loc_67B141:                  ; CODE XREF: sub_67B0E8+47p
0067B141 E9 0D 00 00 00       jmp     loc_67B153
0067B141                   ; --------------------------------------------------------------
0067B146 01                   db    1 ;  
0067B152 F5                   db 0F5h ; )
0067B153                   ; --------------------------------------------------------------
0067B153                   loc_67B153:                  ; CODE XREF: 0067B141j
0067B153 5F                   pop     edi               ; edi := 67D134
0067B154 E9 3F FF FF FF       jmp     loop1

There is a lot of obfuscation code, but I list it in full here once so that hopefully you will get accustomed to it. There is more to come!

I hope the dead listing with comments is good enough to understand. As you can see, AsProtect is decrypting something, and that something happens to be the VERY next block of code. Do trace through it once or twice and you will get the hang of it. This is important as it will help you understand the AsProtect structure better as you go on. The key is to use IDA to disassemble at the right place and ignore the obfuscation code. Also, just a personal opinion: comment like mad. I commented everything I saw even though sometimes I didn't know what it was; I simply renamed those offsets "some_shit", so the next time you see "some_shit" you'll know that this variable has been accessed before, and it helps....
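
Stripped of the junk instructions, the loop above does very little. Here is a minimal Python sketch of it, using only the three constants and the counter visible in the listing. The function names, the bytearray buffer and the encrypt_dword inverse (handy for testing only) are my own assumptions, not something taken from AsProtect.

```python
import struct

MASK = 0xFFFFFFFF  # keep every result inside 32 bits, like the registers do


def decrypt_dword(value):
    """Apply the add / xor / sub sequence from the listing to one dword."""
    value = (value + 0x429CCF51) & MASK   # add edx, 429CCF51h
    value ^= 0x0D9523B6                   # xor edx, 0D9523B6h
    return (value - 0x0E25D8B7) & MASK    # sub edx, 0E25D8B7h


def encrypt_dword(value):
    """Hypothetical inverse of decrypt_dword, only useful for testing."""
    value = (value + 0x0E25D8B7) & MASK
    value ^= 0x0D9523B6
    return (value - 0x429CCF51) & MASK


def decrypt_block(buf, start_offset, count=0x1B5):
    """Decrypt count dwords in place, walking DOWN from start_offset.

    buf          : bytearray holding the encrypted bytes
    start_offset : offset of the first dword processed (the highest one),
                   playing the role of esi = 67B836 in the listing
    """
    offset = start_offset
    for _ in range(count):
        (value,) = struct.unpack_from("<I", buf, offset)
        struct.pack_into("<I", buf, offset, decrypt_dword(value))
        offset -= 4                        # net effect of the sub/add on esi
    return buf
```

If I read the listing right, the last dword touched sits at 67B836 - 4*(1B5 - 1) = 67B166, which is exactly where the loop jumps once ecx reaches zero, so the loop really is decrypting the code it is about to run.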

Once you have understood how this loop works, bpx on the exit of the loop and you will soon see the next loop, which uses the same algorithm but a different key and size to decrypt the block after itself. This decryption is repeated a few times (I think 4), and then a block of data is copied to high memory and decrypted there (our dll). It is also quite interesting to watch how AsProtect searches for its imports, namely GetProcAddress, GetModuleHandleA, LoadLibraryA, VirtualAlloc and VirtualFree, by scanning the export directory of kernel32.dll instead of using the pre-loaded IAT.
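
For readers who have never walked an export directory by hand, here is a rough Python sketch of a lookup by name. It follows the standard PE32 layout and is not a transcription of the AsProtect code. The helper names and the tiny synthetic image built by build_demo_image (with made-up RVAs) are assumptions for illustration, and forwarded exports are ignored.

```python
import struct


def u16(image, offset):
    return struct.unpack_from("<H", image, offset)[0]


def u32(image, offset):
    return struct.unpack_from("<I", image, offset)[0]


def find_export(image, wanted_name):
    """Return the RVA of an export found by name, or None.

    image is a PE32 module as mapped in memory, so every RVA can be used
    directly as an index into the buffer.
    """
    pe_header = u32(image, 0x3C)                 # e_lfanew
    export_dir = u32(image, pe_header + 0x78)    # export directory RVA (PE32)
    number_of_names = u32(image, export_dir + 0x18)
    functions = u32(image, export_dir + 0x1C)    # AddressOfFunctions
    names = u32(image, export_dir + 0x20)        # AddressOfNames
    ordinals = u32(image, export_dir + 0x24)     # AddressOfNameOrdinals
    wanted = wanted_name.encode("ascii")
    for i in range(number_of_names):
        name_rva = u32(image, names + 4 * i)
        end = image.index(b"\x00", name_rva)     # names are zero terminated
        if image[name_rva:end] == wanted:
            index = u16(image, ordinals + 2 * i)
            return u32(image, functions + 4 * index)
    return None


def build_demo_image():
    """Build a tiny synthetic image with two exports (made-up RVAs)."""
    image = bytearray(0x200)
    struct.pack_into("<I", image, 0x3C, 0x40)            # e_lfanew
    struct.pack_into("<I", image, 0x40 + 0x78, 0x100)    # export directory RVA
    struct.pack_into("<I", image, 0x100 + 0x18, 2)       # NumberOfNames
    struct.pack_into("<III", image, 0x100 + 0x1C, 0x140, 0x150, 0x160)
    struct.pack_into("<II", image, 0x140, 0x1111, 0x2222)  # function RVAs
    struct.pack_into("<II", image, 0x150, 0x170, 0x180)    # name RVAs
    struct.pack_into("<HH", image, 0x160, 1, 0)            # name ordinals
    image[0x170:0x17D] = b"LoadLibraryA\x00"
    image[0x180:0x18D] = b"VirtualAlloc\x00"
    return image
```

On a real module you would add the module base to the returned RVA to get the address of the API.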

Once the dll is loaded into high memory, I made a dump, attached it to the end of the program and adjusted the section headers so that the virtual address is the same. At first glance it looks like just a data block with some code in it, but once I started tracing this code I saw something fishy. Here comes the OS loader :
(I removed the relocation code as it is too long to list here. Only the import loading and the OEP calculation are listed.)

00A4A488                   loc_A4A488:       
00A4A488                      mov     esi, dword ptr ss:unk_442A61[ebp] ; esi = [A4A11D]
00A4A48E 8B 95 D8 30 44 00    mov     edx, dword ptr ss:unk_4430D8[ebp] ; add image base
00A4A494 03 F2                add     esi, edx                ; esi now points to Import Directory
00A4A496                   load_next_dl_import:               
00A4A496 8B 46 0C             mov     eax, [esi+0Ch]          ; get dll name offset
00A4A499 85 C0                test    eax, eax
00A4A49B 0F 84 0A 01 00 00    jz      finish_import_loading   ; eax := [A4A121]
00A4A4A1 03 C2                add     eax, edx                ; add image base
00A4A4A3 8B D8                mov     ebx, eax
00A4A4A5 50                   push    eax
00A4A4A6 FF 95 EC 31 44 00    call    dword ptr ss:unk_4431EC[ebp] ; GetModuleHandleA
00A4A4AC 85 C0                test    eax, eax
00A4A4AE 75 07                jnz     short library_loaded
00A4A4B0 53                   push    ebx
00A4A4B1 FF 95 F0 31 44 00    call    dword ptr ss:unk_4431F0[ebp] ; LoadLibraryA
00A4A4B7                   library_loaded:                     
00A4A4B7 89 85 4D 29 44 00    mov     dword ptr ss:unk_44294D[ebp], eax
00A4A4BD C7 85 51 29 44 00+   mov     dword ptr ss:unk_442951[ebp], 0 ; initialise Import Counter
00A4A4C7                   next_first_thunk_entry:            
00A4A4C7 8B 95 D8 30 44 00    mov     edx, dword ptr ss:unk_4430D8[ebp] ; get image base
00A4A4CD 8B 06                mov     eax, [esi]              ; check Original_First_Thunk
00A4A4CF 85 C0                test    eax, eax
00A4A4D1 75 03                jnz     short original_first_thunk_found ; 
00A4A4D3 8B 46 10             mov     eax, [esi+10h]          ; get first thunk offset
00A4A4D6                   original_first_thunk_found:             ; CODE XREF: 00A4A4D1j
00A4A4D6 03 C2                add     eax, edx                ;  add image base
00A4A4D8 03 85 51 29 44 00    add     eax, dword ptr ss:unk_442951[ebp] ; add counter
00A4A4DE 8B 18                mov     ebx, [eax]              ; get Import ASCII
00A4A4E0 8B 7E 10             mov     edi, [esi+10h]
00A4A4E3 03 FA                add     edi, edx                ; edi => first thunk
00A4A4E5 03 BD 51 29 44 00    add     edi, dword ptr ss:unk_442951[ebp] ; add counter
00A4A4EB 85 DB                test    ebx, ebx
00A4A4ED 0F 84 A2 00 00 00    jz      dll_done                
00A4A4F3 F7 C3 00 00 00 80    test    ebx, 80000000h          ; import by ordinal?
00A4A4F9 75 04                jnz     short loc_A4A4FF
00A4A4FB 03 DA                add     ebx, edx                ; add image base
00A4A4FD 43                   inc     ebx
00A4A4FE 43                   inc     ebx                     ; add 2 to point to API ASCII
00A4A4FF                   loc_A4A4FF:                             ; CODE XREF: 00A4A4F9j
00A4A4FF 53                   push    ebx
00A4A500 81 E3 FF FF FF 7F    and     ebx, 7FFFFFFFh
00A4A506 53                   push    ebx
00A4A507 FF B5 4D 29 44 00    push    dword ptr ss:unk_44294D[ebp] ; modulehandle
00A4A50D FF 95 E8 31 44 00    call    dword ptr ss:unk_4431E8[ebp] ; get Proc address
00A4A513 85 C0                test    eax, eax
00A4A515 5B                   pop     ebx
00A4A516 75 6F                jnz     short API_add_found     
00A4A518 F7 C3 00 00 00 80    test    ebx, 80000000h          ; import by ordinal ?
00A4A51E 75 19                jnz     short loc_A4A539
00A4A520 57                   push    edi
00A4A521 8B 46 0C             mov     eax, [esi+0Ch]
00A4A524 03 85 D8 30 44 00    add     eax, dword ptr ss:unk_4430D8[ebp]
00A4A52A 50                   push    eax
00A4A52B 53                   push    ebx
00A4A52C 8D 85 53 31 44 00    lea     eax, unk_443153[ebp]
00A4A532 50                   push    eax
00A4A533 57                   push    edi
00A4A534 E9 99 00 00 00       jmp     loc_A4A5D2
00A4A539                  ; -------------------------------------------------------------
00A4A539                   loc_A4A539:                             ; CODE XREF: 00A4A51Ej
00A4A539 81 E3 FF FF FF 7F    and     ebx, 7FFFFFFFh
00A4A53F 8B 85 DC 30 44 00    mov     eax, dword ptr ss:unk_4430DC[ebp]
00A4A545 39 85 4D 29 44 00    cmp     dword ptr ss:unk_44294D[ebp], eax
00A4A54B 75 24                jnz     short loc_A4A571
00A4A54D 57                   push    edi
00A4A54E 8B D3                mov     edx, ebx
00A4A550 4A                   dec     edx
00A4A551 C1 E2 02             shl     edx, 2
00A4A554 8B 9D 4D 29 44 00    mov     ebx, dword ptr ss:unk_44294D[ebp]
00A4A55A 8B 7B 3C             mov     edi, [ebx+3Ch]
00A4A55D 8B 7C 3B 78          mov     edi, [ebx+edi+78h]
00A4A561 03 5C 3B 1C          add     ebx, [ebx+edi+1Ch]
00A4A565 8B 04 13             mov     eax, [ebx+edx]
00A4A568 03 85 4D 29 44 00    add     eax, dword ptr ss:unk_44294D[ebp]
00A4A56E 5F                   pop     edi
00A4A56F EB 16                jmp     short API_add_found    
00A4A571                   ;--------------------------------------------------------------
00A4A571                   loc_A4A571:                             ; CODE XREF: 00A4A54Bj
00A4A571 57                   push    edi
00A4A572 8B 46 0C             mov     eax, [esi+0Ch]
00A4A575 03 85 D8 30 44 00    add     eax, dword ptr ss:unk_4430D8[ebp]
00A4A57B 50                   push    eax
00A4A57C 53                   push    ebx
00A4A57D 8D 85 A4 31 44 00    lea     eax, unk_4431A4[ebp]
00A4A583 50                   push    eax
00A4A584 57                   push    edi
00A4A585 EB 4B                jmp     short loc_A4A5D2
00A4A587                   ; -----------------------------------------------------------------
00A4A587                   API_add_found:                          ; CODE XREF: 00A4A516j
00A4A587 89 07                                                     ; 00A4A56Fj
00A4A587                      mov     [edi], eax              ; update first thunk
00A4A589 83 85 51 29 44 00+   add     dword ptr ss:unk_442951[ebp], 4
00A4A590 E9 32 FF FF FF       jmp     next_first_thunk_entry  ; get image base
00A4A595                   ; ------------------------------------------------------------------
00A4A595                   dll_done:                               ; CODE XREF: 00A4A4EDj
00A4A595 89 06                mov     [esi], eax              ; clear Import Directory Entry
00A4A597 89 46 0C             mov     [esi+0Ch], eax
00A4A59A 89 46 10             mov     [esi+10h], eax
00A4A59D 83 C6 14             add     esi, 14h                ; next Import Directories Entry
00A4A5A0 8B 95 D8 30 44 00    mov     edx, dword ptr ss:unk_4430D8[ebp]
00A4A5A6 E9 EB FE FF FF       jmp     load_next_dl_import     ; get dll name offset
00A4A5AB                   ; ------------------------------------------------------------------
00A4A5AB                   finish_import_loading:                  ; CODE XREF: 00A4A49Bj
00A4A5AB 8B 85 65 2A 44 00    mov     eax, dword ptr ss:unk_442A65[ebp] ; eax := [A4A121]
00A4A5B1 50                   push    eax
00A4A5B2 03 85 D8 30 44 00    add     eax, dword ptr ss:unk_4430D8[ebp] ; add image base
00A4A5B8 5B                   pop     ebx
00A4A5B9 0B DB                or      ebx, ebx
00A4A5BB 89 85 11 2F 44 00    mov     dword ptr ss:unk_442F11[ebp], eax ; update instruction at A4A5CC
00A4A5C1 61                   popa
00A4A5C2 75 08                jnz     short OEP_OK            
00A4A5C4 B8 01 00 00 00       mov     eax, 1
00A4A5C9 C2 0C 00             retn    0Ch
00A4A5CC                   ; ------------------------------------------------------------------
00A4A5CC                   OEP_OK:                                 ; CODE XREF: 00A4A5C2j
00A4A5CC 68 00 00 00 00       push    0                       ; this will be changed to push OEP
00A4A5D1 C3                   retn                            ; go to OEP of the Delphi prog
00A4A5D2                   ; -------------------------------------------------------------------

I know it is a long dead listing, but do read through it, or at least the comments, as I think I wrote enough of them for readers to get an idea of what is happening. I have removed most of the code (the decompression and relocation code). Basically, it decompresses the main block, then performs what looks to me like a relocation of the image, then loads the imports and finally gets the OEP and jumps to it. Hence I suspected this was either a dll or an exe file (packed with AsPack?) or something similar, and I decided to dump this image (before all the reloc and IAT loading is done). It was hard to build the file as the header is completely ripped off, and it took me some time to get it right so that IDA would disassemble it. First I noticed that it has 6 sections (most if not all Delphi apps have 6 sections), and then I remembered that AsProtect likes Delphi very much! Thus I opened some Delphi exe, ripped its header completely, pasted it into this dump and fixed the section info, the reloc info and the import info. (If you are not familiar with the PE header, read some of the excellent PE tutorials by Iczelion and others at krobar's site.)
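
Going back to the import loop in the listing above, the way each thunk entry is interpreted (the test against 80000000h, the two "inc ebx") can be sketched in a few lines of Python. The function name decode_thunk and the tuple it returns are my own invention for illustration; the masks and the +2 come straight from the listing.

```python
ORDINAL_FLAG = 0x80000000


def decode_thunk(thunk, image_base):
    """Classify one thunk dword the way the loader loop does.

    Returns ("end", None) for the zero terminator, ("ordinal", n) for an
    import by ordinal, or ("name", address) where address points at the
    ASCII name (the two "inc ebx" skip the 2-byte hint in front of it).
    """
    if thunk == 0:
        return ("end", None)                          # jz dll_done
    if thunk & ORDINAL_FLAG:
        return ("ordinal", thunk & 0x7FFFFFFF)        # and ebx, 7FFFFFFFh
    return ("name", image_base + thunk + 2)           # add ebx, edx ; inc ; inc
```

Whatever comes out of this is what gets pushed for GetProcAddress together with the module handle.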

Finally, I got IDA to disassemble the file nicely and discovered that it was a console dll written in Delphi. IDA (great tool!! Really helpful here) can actually apply its FLIRT feature to detect lots of Delphi functions, and that saved me lots of time. I must say that if IDA had not been able to disassemble the dll properly, this tutorial would not have been done so easily. I attach the dll here so you can disassemble and study it yourself. You should also try to dump and rebuild one from any AsProtect program as practice.

Yep, once you obtain the dll, you will realise that all of AsProtect's mysteries are inside that dll... its seh clearing debug registers, its crc check, IAT mangling etc... Here comes the fun!!!

3. AsProtect.dll seh tricks

Here is how a typical seh is set up in AsProtect; seh is used no fewer than 30 times in this dll. It used to stop newbies like me from tracing AsProtect code, but not anymore once you understand what is going on. Before you read this, please read some essential information about seh (I suggest Jeremy Gordon's excellent paper on seh).

004106B2 E8 49 C6 FF FF      call    Clear_API_emu_code      ; upper limit
004106B7 E8 25 00 00 00      call    set_seh_3
004106BC                   seh_3_handler:                          ; 
004106BC 8B 44 24 0C          mov     eax, [esp+0Ch]          ; get context to eax
004106C0 83 80 B8 00 00 00+   add     dword ptr [eax+0B8h], 2 ; add context.eip by 2
004106C7 51                   push    ecx
004106C8 31 C9                xor     ecx, ecx
004106CA 89 48 04             mov     [eax+4], ecx            ; clear debug register 0
004106CD 89 48 08             mov     [eax+8], ecx            ; clear debug register 1
004106D0 89 48 0C             mov     [eax+0Ch], ecx          ; clear debug register 2
004106D3 89 48 10             mov     [eax+10h], ecx          ; clear debug register 3
004106D6 C7 40 18 55 01 00+   mov     dword ptr [eax+18h], 155h ; context.dr7 := 155
004106DD 59                   pop     ecx
004106DE 31 C0                xor     eax, eax                ; exception handled, continue
004106E0 C3                   retn
004106E0                   set_seh_1 endp ; sp =  4
004106E1                   ;  S U B R O U T I N E 
004106E1                   set_seh_3 proc near                     ; CODE XREF: set_seh_1+25p
004106E1 31 C0                xor     eax, eax
004106E3 64 FF 30             push    dword ptr fs:[eax]      ; set up seh 3
004106E6 64 89 20             mov     fs:[eax], esp
004106E9 31 00                xor     [eax], eax              ; cause seh 3
004106E9                                                      ; this seh clear debug register
004106EB 64 8F 05 00 00 00+   pop     large dword ptr fs:0    ; remove seh struc
004106F2 58                   pop     eax
004106F3 E8 CC 1F FF FF       call    @System@Randomize$qqrv  ; System::Randomize(void)

The seh is set up in a slightly different way which makes it harder to detect but with IDA everything becomes very clear!

004106B7 call set_seh_3 ; this is equivalent to "push 4106BC" (our seh handler) followed by moving eip to 4106E1

004106E1 xor eax, eax
004106E3 push dword ptr fs:[eax] ; set up seh 3
004106E6 mov fs:[eax], esp
004106E9 xor [eax], eax ; cause seh 3
004106E9 ; this seh clear debug register
004106EB pop large dword ptr fs:0 ; remove seh struc
004106F2 pop eax
Thus when we trace over 4106E9, the seh is triggered and our context is retrieved into eax. The handler increases context.eip by 2 so that it points to the next working instruction, and clears the debug registers in that same context. (This was posted by R!sc before.)

This is interesting indeed! For example, suppose you are tracing and you are at 4106B7. If you trace into the call with F8, you will meet the faulty instruction at 4106E9 and be lost in kernel seh code! If you trace over it with F10, SoftICE will place a breakpoint at the next instruction, which happens to be the seh handler; you will soon meet the "ret" at 4106E0 and again be lost in kernel seh code! Now that we know how this whole seh scheme works, we can trace anywhere we want: simply do an "r eip eip+2" at the faulty instruction and the seh handler will be skipped altogether!
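
To make the handler's effect concrete, here is a small Python simulation of what seh_3_handler does to the CONTEXT record. The offsets are the ones used in the listing and they match the x86 CONTEXT layout (Dr0-Dr3 at +4..+10h, Dr7 at +18h, Eip at +B8h). Modelling CONTEXT as a bytearray and the name seh_3_handler as a Python function are of course assumptions made only for illustration.

```python
import struct

# Offsets inside the 32-bit x86 CONTEXT structure, as used by the handler
CONTEXT_DR0 = 0x04
CONTEXT_DR1 = 0x08
CONTEXT_DR2 = 0x0C
CONTEXT_DR3 = 0x10
CONTEXT_DR7 = 0x18
CONTEXT_EIP = 0xB8
CONTEXT_SIZE = 0x2CC

EXCEPTION_CONTINUE_EXECUTION = 0   # what "xor eax, eax ; retn" reports


def seh_3_handler(context):
    """Mimic the handler at 4106BC on a CONTEXT held in a bytearray."""
    (eip,) = struct.unpack_from("<I", context, CONTEXT_EIP)
    # skip the 2-byte faulting instruction "xor [eax], eax"
    struct.pack_into("<I", context, CONTEXT_EIP, (eip + 2) & 0xFFFFFFFF)
    # wipe the four hardware breakpoint addresses
    for offset in (CONTEXT_DR0, CONTEXT_DR1, CONTEXT_DR2, CONTEXT_DR3):
        struct.pack_into("<I", context, offset, 0)
    struct.pack_into("<I", context, CONTEXT_DR7, 0x155)
    return EXCEPTION_CONTINUE_EXECUTION
```

Feed it a context with eip = 4106E9 and a bpm in Dr0, and you get back eip = 4106EB with your bpm gone, which is exactly why "r eip eip+2" keeps your breakpoints alive: the handler never gets a chance to run.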

Analysing the dll is slightly harder as it is a long, full-blown dll with a hell of a lot of seh and bloated Delphi code, but IDA eases the job quite a bit. You should analyse it slowly: always trace over a call and guess what it does first before actually stepping inside it.

4. AsProtect.dll internal

OK, I am not going to discuss the whole dll here; that would take too long and it is pointless to paste long Delphi code. I will just roughly describe its structure. Here is the skeleton of the dll

00410C32 68 4C E7 40 00   push    offset self_hash           
00410C37 68 5C 0D 41 00   push    offset Dips_DriveHash_Date_Registry_CodeDecrypt
00410C3C 68 90 02 41 00   push    offset nothing_important ; nothing really important here
00410C3C                                                   ; just some file header checking
00410C41 68 44 FF 40 00   push    offset complete_API_mangling
00410C46 68 1C F9 40 00   push    offset Fully_decrypt_file ; decrypt the file fully,
00410C46                                                    ; IAT is untouched.. virgin!!!!!
00410C46                                                    ; import ASCII stripped but
00410C46                                                    ; hint are still there
00410C4B 68 10 F3 40 00   push    offset Decrypt_file_first_time
00410C50 68 2C 06 41 00   push    offset emu_API_n_file_hash
00410C55 C3               retn

Yep, so AsProtect pushes the addresses of all its main routines and then executes a "retn": it first runs the API emulation routine (emu_API_n_file_hash), and when that returns it jumps to the next routine, Decrypt_file_first_time, and so on down the list. API emulation is pretty easy to understand, especially if you have unpacked a few AsProtected programs. AsProtect also opens the exe and hashes the whole file as a CRC check. You should also read the thread by me, Mike and Dakien at Fravia's Board (crypto forum, "stream cipher??" and "Tutorial: finding encryption code") on the encryption routines used. You will find that in fact all the main decryption routines throughout this dll are similar to that one, with MD5 heavily used!
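
If the push/retn trick in the skeleton is not obvious, this tiny Python sketch shows the resulting execution order. It assumes that every routine ends with a plain "retn" and leaves the stack balanced, which is my reading of the skeleton and not something I verified for each routine; the names are the labels from the listing.

```python
# Routine labels in the order they are PUSHED in the skeleton
SKELETON = [
    "self_hash",
    "Dips_DriveHash_Date_Registry_CodeDecrypt",
    "nothing_important",
    "complete_API_mangling",
    "Fully_decrypt_file",
    "Decrypt_file_first_time",
    "emu_API_n_file_hash",
]


def run_ret_chain(pushed):
    """Return the order in which the pushed routines actually execute."""
    stack = list(pushed)
    order = []
    while stack:
        # each "retn" pops the most recently pushed address and jumps to it
        order.append(stack.pop())
    return order
```

So the routines run in the reverse of the order they appear in the listing: emu_API_n_file_hash first, self_hash last.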

The third routine decrypts the exe fully and performs relocation on the code section if required. Here you will see how the call to the mysterious "ret" at 401014 (which appears in all AsProtect programs) is made, I guess to test whether the code section is executable.

It should be noted that after the 3rd routine we can actually dump the file and obtain a nice clean dump with the IAT first thunks intact. The import ASCII names are stripped together with the dll names, but the hints are still there, so we can actually rebuild the program from there!

Disassembling the API mangling routine, we see exactly how AsProtect decrypts the import information, classifies the imports into different categories (6 if I am not wrong) and treats each of them differently. Again, if you have unpacked AsProtect then you will know these routines, or at least have an idea of how imports are treated differently by AsProtect. I will not post the full code here as the offsets given above should be enough for you to find where the routines are.

00410033 A1 A4 49 41 00     mov     eax, ds:Encrypted_data_offset
00410038 8B 55 F0           mov     edx, [ebp-10h]
0041003B 89 42 04           mov     [edx+4], eax
0041003E 8B 15 94 32 41 00  mov     edx, ds:sign_13_import_data
00410044 8B 45 F0           mov     eax, [ebp-10h]
00410047 E8 A4 D0 FF FF     call    Decrypt_n_Find         ; eax points to mem_struc
00410047                                                   ; mem_struc+4 : original data
00410047                                                   ; mem_struc+8 : decrypted data
00410047                                                   ; it also traverse the decrypted data
00410047                                                   ; block to search for data block with 
00410047                                                   ; signature in edx
0041004C 8B D8              mov     ebx, eax
0041004E 85 DB              test    ebx, ebx
00410050 74 1B              jz      short import_data_not_found
00410052 8B 43 04           mov     eax, [ebx+4]
00410055 E8 3A 25 FF FF     call    @System@@GetMem$qqrv    ; size in eax
0041005A 89 45 FC           mov     [ebp-4], eax            ; [ebp-4] points to import data

The call at 410047 decrypts a block of data in the exe that contains all the information about the protected program, such as import data, dips to be performed, various hash results and decryption keys, and each of these pieces of data is assigned a signature tag. In this case I named the signature tag "sign_13_import_data". The procedure searches for the signature passed in edx and returns eax pointing to the desired block of data. Rename this procedure and you will see that it is called everywhere in the dll!
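
The search half of Decrypt_n_Find can be pictured with the Python sketch below. Be warned that the layout is an ASSUMPTION: I take each block to be a signature dword, then a length dword counting only the payload, then the payload itself, which is a guess based on the import block described later in this tutorial. The real routine also does the decryption and returns a pointer to a mem_struc rather than the data itself, and the signature values in the usage example are made up.

```python
import struct


def find_tagged_block(data, wanted_sig):
    """Walk [signature][length][payload] blocks and return a payload.

    ASSUMED layout (not confirmed from the dll): two little-endian dwords
    followed by `length` payload bytes, blocks stored back to back.
    Returns None when the signature is absent, like the
    "jz import_data_not_found" branch in the listing.
    """
    offset = 0
    while offset + 8 <= len(data):
        signature, length = struct.unpack_from("<II", data, offset)
        payload_start = offset + 8
        if signature == wanted_sig:
            return data[payload_start:payload_start + length]
        offset = payload_start + length   # jump to the next block header
    return None
```

For example, with a blob holding a block tagged 7 followed by a block tagged 0x13, asking for 0x13 skips over the first payload and returns the second one.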

Looking into the second last routine, we see how pre-OEP dipping is done. Using the Decrypt_n_Find call above, it searches the large chunk of decrypted data for data blocks with dip signatures. Each of these data blocks contains a dipping address in the main program's code section, and the signature indicates the kind of duty the dipping performs. Here are examples of the dips :

00411133 83 3D A4 45 41 00+   cmp     ds:dip_B_address, 0
0041113A 74 0B                jz      short loc_411147   ; dip_B : redirect TApplication$Initialise
0041113C 68 00 CE 40 00       push    offset @Forms@TApplication@Initialize$qqrv ; 
00411141 FF 15 A4 45 41 00    call    ds:dip_B_address
00411147                   loc_411147:                     ; CODE XREF: set_seh_2A+2C2j
00411147 C7 45 C0 F7 27 00+   mov     dword ptr [ebp-40h], 27F7h
0041114E 8D 85 B5 D7 FF FF    lea     eax, [ebp-284Bh]

OR here :

004111C9 83 3D 84 45 41 00+   cmp     ds:Dip_3_address, 0
004111D0 74 16                jz      short loc_4111E8
004111D2 68 54 CD 40 00       push    offset nullsub_3
004111D7 FF 15 84 45 41 00    call    ds:Dip_3_address     ; dip_3 : simple return
004111DD 68 54 CD 40 00       push    offset nullsub_3
004111E2 FF 15 88 45 41 00    call    ds:dip_4_address     ; dip_4 : simple return
004111E8                   loc_4111E8:                     ; CODE XREF: set_seh_2A+358j
004111E8 83 7D C0 00          cmp     dword ptr [ebp-40h], 0
004111EC 75 4C                jnz     short registry_data_found_already

As you can see, there are different types of dips, ranging from @Forms@TApplication@Initialize$qqrv (which is probably only applicable to Delphi apps) and decrypting parts of the code section to a simple "ret". I think there are about 11 types of dips, but some of them are very similar. ReGet only uses 2 dips, to decrypt some parts of the code section, so I was not able to debug much of this D-D business and mainly analysed the dead listing. As you can see from the name of the routine, it uses your hard disk information as a hash key, stores the hash in the registry, accesses the system date etc... these dippings can get really wild :>

Finally, the last routine listed above is the self hash routine, which checks for changes in the dll's own code: if you place a breakpoint somewhere there, the opcode at that address is replaced by "CC", and AsProtect will be able to detect it and exit. Here is one example :

0040E74C 68 54 C3 84 15     push    1584C354h           ; [esp] value
0040E751 68 AC 0F 00 00     push    0FACh               ; end of area to be hashed???
0040E756 68 9C D7 00 00     push    0D79Ch              ; start area to be hashed???
0040E75B 68 00 90 01 00     push    19000h              ; RVA of rsrc section (hash data)
0040E760 FF 35 14 40 41 00  push    ds:hInstance        ; base image of dll
0040E766 E8 31 E8 FF FF     call    self_hash
0040E76B 31 04 24           xor     [esp], eax          ; test hash ([esp] = 1584C354h)
0040E76E 8B 05 14 40 41 00  mov     eax, ds:hInstance
0040E774 01 04 24           add     [esp], eax          ; if hash is wrong, ret goes to wrong place
0040E774                                                ; seh is trigger and program quit
0040E777 C3                 retn                        ; go to 4117D4 if correct

I am not quite sure exactly how the parameters are used in the hash routine; I was too lazy to trace into the routine in detail to figure everything out, but I do have a rough idea of what is going on. As you can see, the hash result is used to decide where the program goes next, so if you are tracing and the hash is wrong, you will not notice it at all and will continue tracing until you are caught in seh and the program exits!
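
The arithmetic behind that hidden jump is easy to replay. The Python sketch below reproduces the "xor [esp], eax / add [esp], hInstance / retn" sequence; the function names are mine, and the image base of 400000 for hInstance is an assumption (it fits the addresses of my rebuilt dll, but the dll runs from high memory in the real thing).

```python
MASK = 0xFFFFFFFF


def return_target(hash_value, pushed_const=0x1584C354, hinstance=0x400000):
    """Address the final "retn" jumps to for a given hash result.

    pushed_const : the dword pushed at 40E74C
    hinstance    : ASSUMED image base of the dll
    """
    # xor [esp], eax  then  add [esp], hInstance
    return ((pushed_const ^ hash_value) + hinstance) & MASK


def expected_hash(target, pushed_const=0x1584C354, hinstance=0x400000):
    """Hash value that would make the "retn" land on target."""
    return ((target - hinstance) & MASK) ^ pushed_const
```

Under that assumption, landing on 4117D4 requires the hash to come out as 1585D480; flip a single bit of it (for instance by leaving a "CC" in the hashed area) and the retn goes somewhere else entirely.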

The above routine is repeated one more time to test the other half of the dll, to make sure that no bpx escapes its grasp ... but now we know how to defeat it :)... Once the hash checks are OK, AsProtect proceeds to another memory area (loaded and decrypted by the dll) to perform the final task of calculating the OEP and the famous "popad ; jmp eax"! I did not bother tracing and dumping this routine as it looks like a long, boring, nonlinear MD5 ... nah, it wasn't MD5, but I don't see that I can learn much from it. AsProtect is more or less fully reversed.

5. API Mangling, a closer analysis

Okie, this section discusses how AsProtect redirects APIs in more detail, so that hopefully you will be able to analyse other protectors the same way. After AsProtect decrypts the import data block, it has a Delphi-like structure: the first dword is the signature (remember?), the next dword is the length, followed by one block per library. Each library block starts with the position of the first thunk offset, then the dll name, followed by the import entries. Each import entry consists of a first byte giving the group classification, a second byte giving the length of the entry, then the import ASCII name (encrypted of course).

If the first byte is 01 there is no mangling, 03 is GetProcAddress, 04 is import by ordinal, 05 is a redirected API and 06 is an emulated API!!! You can find this whole API mangling routine starting from 41011A.
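
A rough Python sketch of how one import entry could be walked, just to fix the idea. The names ImportClass and parse_entry are mine, and I am assuming that the length byte counts the bytes that follow the two header bytes; the real block may differ, and the name stays encrypted here.

```python
from enum import IntEnum


class ImportClass(IntEnum):
    """Class byte values as described above (names are hypothetical)."""
    PLAIN = 0x01             # no mangling
    GET_PROC_ADDRESS = 0x03
    ORDINAL = 0x04           # import by ordinal
    REDIRECTED = 0x05
    EMULATED = 0x06


def parse_entry(buf, pos):
    """Read one entry at buf[pos]; return (class, payload, next position).

    ASSUMPTION: the length byte counts only the payload bytes.
    """
    kind = ImportClass(buf[pos])
    length = buf[pos + 1]
    payload = buf[pos + 2 : pos + 2 + length]   # still-encrypted name bytes
    return kind, payload, pos + 2 + length


# Made-up sample data: one redirected and one emulated entry.
sample = bytes([0x05, 3]) + b"abc" + bytes([0x06, 2]) + b"xy"
kind, payload, nxt = parse_entry(sample, 0)
print(kind.name, payload, nxt)
```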

I would like to discuss class 05 :> (it is quite interesting how AsProtect scans the first few instructions of the API, copies them to the redirected API location, etc.)...

0040FD57                   check_next_instruction:  ; CODE XREF: Mangle_IAT+76j
0040FD57 E8 84 FF FF FF      call  get_instruction_Table
0040FD5C 8B D8               mov   ebx, eax
0040FD5E C6 44 24 0C 00      mov   [esp+10h+copied_flag], 0 ; clear flag
0040FD63                   check_instruction_start_byte: ; CODE XREF: Mangle_IAT+6Fj
0040FD63 0F B6 33            movzx esi, byte ptr [ebx]   ; first byte to esi
0040FD66 8D 43 01            lea   eax, [ebx+1]          ; the second byte is stored in edi
0040FD66                                                 ; for later usage
0040FD69 0F B6 38            movzx edi, byte ptr [eax]
0040FD6C 8D 53 02            lea   edx, [ebx+2]          ; third byte onwards
0040FD6F 8B CE               mov   ecx, esi              ; first byte
0040FD71 8B C5               mov   eax, ebp              ; original API address
0040FD73 E8 E8 C6 FF FF      call  CompareBinary         ; compare binary string in eax and edx
0040FD73                                                 ; with length ecx
0040FD78 84 C0               test  al, al
0040FD7A 74 18               jz    short not_equal
0040FD7C 8B CF               mov   ecx, edi              ; second byte
0040FD7E 8B D5               mov   edx, ebp              ; original API address
0040FD80 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer] ; newly allocated memory address
0040FD84 E8 7F 47 FF FF      call  Move_memory           ; copy a few bytes from original API
0040FD84                                                 ; address to the newly allocated
0040FD84                                                 ; memory to redirect the API
0040FD89 01 7C 24 08         add   [esp+10h+new_mem_pointer], edi
0040FD8D 03 EF               add   ebp, edi              ; calculate the position for
0040FD8D                                                 ; the redirected API to jump back
0040FD8D                                                 ; to the original API
0040FD8F C6 44 24 0C 01      mov   [esp+10h+copied_flag], 1 ; set the API redirected flag
0040FD94                   not_equal:                    ; CODE XREF: Mangle_IAT+46j
0040FD94 83 C6 02            add   esi, 2
0040FD97 03 DE               add   ebx, esi              ; point ebx to next API redirection
0040FD97                                                 ; data block
0040FD99 80 7C 24 0C 00      cmp   [esp+10h+copied_flag], 0
0040FD9E 75 05               jnz   short loc_40FDA5      ; is API already redirected?
0040FD9E                                                 ; jump if yes
0040FDA0 80 3B 00            cmp   byte ptr [ebx], 0     ; the end of the API redirection
0040FDA0                                                 ; block??
0040FDA3 75 BE               jnz   short check_instruction_start_byte ; 
0040FDA5                   loc_40FDA5:                   ; CODE XREF: Mangle_IAT+6Aj
0040FDA5 80 7C 24 0C 00      cmp   [esp+10h+copied_flag], 0
0040FDAA 75 AB               jnz   short check_next_instruction

Basically, the routine above checks the first few bytes of each instruction in the original API routine, compares them with a pre-stored table of instructions and decides how much to copy over to the redirected API. The first call of the routine at 40FD57, "call get_instruction_Table", simply points eax to the beginning of the instruction table. The loop goes on until an "unknown" instruction is found, that is, an instruction that is not defined in the pre-stored table. The pre-stored table looks something like this:

0040FCFB 01                  db    1 ;                   ; no. of bytes to compare
0040FCFC 01                  db    1 ;                   ; number of bytes to copy
0040FCFD                   ; ---------------------------------------------------------------
0040FCFD 57                  push  edi
0040FCFD                   ; ---------------------------------------------------------------
0040FCFE 01                  db    1 ;                   ; no. of bytes to compare
0040FCFF 02                  db    2 ;                   ; number of bytes to copy
0040FD00 6A                  db  6Ah ; j                 ; 6Axx  ==> push  xx
0040FD01 01                  db    1 ;                   ; no. of bytes to compare
0040FD02 05                  db    5 ;                   ; number of bytes to copy
0040FD03 68                  db  68h ; h                 ; 68xxxxxxxx ==> push xxxxxxxx
0040FD04 02                  db    2 ;                   ; no. of bytes to compare
0040FD05 03                  db    3 ;                   ; number of bytes to copy
0040FD06 FF                  db 0FFh ;                   ; FF75xx ==> push dword ptr [ebp+xx]
0040FD07 75                  db  75h ; u

Look back at the loop above and you will understand how it scans through the table to find the right instruction. For example, when the scan reaches 40FD04, the first byte is 02, which means it compares the first 2 bytes of the current instruction in the original API with the 2 bytes starting at 40FD06. If they are the same then the instruction is "push dword ptr [ebp+xx]", and the second byte (03) says to copy 3 bytes, i.e. the whole instruction, over to the redirected routine. This is how it can copy full instructions without using a disassembler.
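
The table walk can be modelled in a few lines of Python. This is a sketch, not AsProtect's code: TABLE holds only the four entries listed above (the real table is probably longer), the position of the 00 terminator is my assumption, and match_one / leech_prologue are names I made up.

```python
# Entries as listed at 0040FCFB..0040FD07: compare length, copy length, pattern.
TABLE = bytes([
    0x01, 0x01, 0x57,        # push edi
    0x01, 0x02, 0x6A,        # push xx
    0x01, 0x05, 0x68,        # push xxxxxxxx
    0x02, 0x03, 0xFF, 0x75,  # push dword ptr [ebp+xx]
    0x00,                    # end of table (ASSUMED to follow directly)
])


def match_one(code, pos, table=TABLE):
    """Return the copy length of the entry matching code[pos:], or 0 if none."""
    i = 0
    while table[i] != 0:
        cmp_len, copy_len = table[i], table[i + 1]
        pattern = table[i + 2 : i + 2 + cmp_len]
        if code[pos : pos + cmp_len] == pattern:
            return copy_len
        i += cmp_len + 2     # same step as 'add esi, 2 / add ebx, esi'
    return 0


def leech_prologue(code):
    """Copy whole instructions until one is not found in the table."""
    pos = 0
    while pos < len(code):
        n = match_one(code, pos)
        if n == 0:
            break            # "unknown" instruction, stop copying
        pos += n
    return code[:pos]


# Made-up API start: push 0 / push 12345678h / push [ebp+8] / push edi / call ...
api = bytes.fromhex("6A00" "6878563412" "FF7508" "57" "E800000000")
copied = leech_prologue(api)
print(copied.hex(), len(copied))   # 6a006878563412ff750857 11
```

The call (E8) is not in the table, so the copy stops there and the redirected API has to jump back to the original at that point.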

After leeching as much as possible from the original API, AsProtect sets about creating the final jump that takes the redirected API back to the original API... this is again a mundane calculation of the number of bytes copied etc... The only interesting bit left is that it uses a random number to decide which kind of jump back to use, "push xxxxxxxx ; ret" or a long jump!

0040FDDA B8 02 00 00 00      mov   eax, 2
0040FDDF E8 D0 29 FF FF      call  @System@@RandInt$qqrv ; System __linkproc__ RandInt(void)
0040FDE4 83 E8 01            sub   eax, 1                ; randomize between 0 and 1 :>??
0040FDE4                                                 ; 2 options of returning to the original
0040FDE4                                                 ; API?? a push ret or a long jump?
0040FDE7 73 1A               jnb   short push_return
0040FDE9 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer]
0040FDED C6 00 E9            mov   byte ptr [eax], 0E9h  ; setting up the long jump E9xxxxxxxx
0040FDF0 83 C5 05            add   ebp, 5
0040FDF3 8B 04 24            mov   eax, [esp+10h+API_address]
0040FDF6 2B C5               sub   eax, ebp              ; calculate the relative distance
0040FDF6                                                 ; to put in after E9
0040FDF8 03 F0               add   esi, eax
0040FDFA 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer]
0040FDFE 40                  inc   eax
0040FDFF 89 30               mov   [eax], esi            ; update the redirected API with the
0040FDFF                                                 ; distance found
0040FE01 EB 1B               jmp   short done
0040FE03                   ; ------------------------------------------------------------------
0040FE03                   push_return:                  ; CODE XREF: Mangle_IAT+B3j
0040FE03 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer]
0040FE07 C6 00 68            mov   byte ptr [eax], 68h   ; setting up a push, 68xxxxxxxx
0040FE0A 03 34 24            add   esi, [esp+10h+API_address] ; calculate the return address
0040FE0D 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer]
0040FE11 40                  inc   eax
0040FE12 89 30               mov   [eax], esi            ; update the redirected API
0040FE14 8B 44 24 08         mov   eax, [esp+10h+new_mem_pointer]
0040FE18 83 C0 05            add   eax, 5
0040FE1B C6 00 C3            mov   byte ptr [eax], 0C3h  ; finally put a "ret", C3
0040FE1B                                                 ; to go to the address pushed above
0040FE1E                   done:                         ; CODE XREF: Mangle_IAT+CDj
0040FE1E 8B C7               mov   eax, edi
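
For reference, here is what the two kinds of tail look like as bytes, sketched in Python. The function names and the two addresses are hypothetical; the encodings themselves (E9 + rel32, 68 + imm32 + C3) are plain x86.

```python
import struct


def long_jump(stub_addr, target):
    """E9 rel32: the distance is counted from the end of the 5-byte jmp."""
    rel = (target - (stub_addr + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def push_ret(target):
    """68 imm32 / C3: push the absolute address, then ret to it."""
    return b"\x68" + struct.pack("<I", target & 0xFFFFFFFF) + b"\xC3"


stub_tail = 0x00C40010   # made-up address just after the copied instructions
resume_at = 0x77E80123   # made-up address inside the original API

print(long_jump(stub_tail, resume_at).hex())   # e90e012477
print(push_ret(resume_at).hex())               # 682301e877c3
```

Both tails end up at the same place; picking one at random just makes the redirected stubs a little harder to recognise by a fixed byte pattern.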

Rather interesting! I hope everything is clear. There is one more routine that deals with type 06, emulated APIs; locating and analyzing it is left as an exercise for the reader... it is not that simple though!

6. Finally

I hope you have learnt something from this long tutorial. This is my first, so there are bound to be mistakes; PLEASE contact me and help me correct them (I can be found at the RCE board most of the time). I hope that you will be able to use this info to unpack AsProtect better, or to inline patch it, to remove the CRC check and all...


This tutorial would not have been possible without the following people : Spl/j, evaluator, Solomon, SpeKKel, FoxThree, sv (for helping me unpack my first AsProtected program), Tsehp (for revirgin), Clandestiny (for answering seh stuff), Daemon (for anti-debugging/tracing stuff on his site) and last but not least R!sc (the old genius) for his excellent tutorials on unpacking.... I am sorry if I missed someone, but you know that I am always grateful for your help, it is the thought that counts heh :>

Special thanks to Kayaker for some analysis on API mangling and, of course, for giving me more work to do :>

Last Edited : 30 April 2002