Reverse Engineering the eicar.com Anti-Malware Test File

Some of my friends have used the EICAR Standard Anti-Virus Test File to test the antivirus software on their PCs. Anti-malware programs detect the test file as a virus named EICAR…, which sometimes leads my friends to suspect that it is a real virus. In this blog post I reverse engineer the eicar.com test file to see whether it has any characteristics of malicious software.

First, I downloaded the EICAR Anti-Virus Test File from the following link: http://www.eicar.org/download/eicar.com and opened it in IDA.

Here’s the disassembly of the file after IDA finishes its initial auto-analysis:

seg000:0100                                         public start
seg000:0100                         start           proc near
seg000:0100 58                                      pop     ax
seg000:0101 35 4F 21                                xor     ax, 214Fh
seg000:0104 50                                      push    ax
seg000:0105 25 40 41                                and     ax, 4140h
seg000:0108 50                                      push    ax
seg000:0109 5B                                      pop     bx
seg000:010A 34 5C                                   xor     al, 5Ch
seg000:010C 50                                      push    ax
seg000:010D 5A                                      pop     dx
seg000:010E 58                                      pop     ax
seg000:010F 35 34 28                                xor     ax, 2834h
seg000:0112 50                                      push    ax
seg000:0113 5E                                      pop     si
seg000:0114 29 37                                   sub     [bx], si
seg000:0116 43                                      inc     bx
seg000:0117 43                                      inc     bx
seg000:0118 29 37                                   sub     [bx], si
seg000:011A 7D 24                                   jge     short loc_10140
seg000:011C 45                                      inc     bp
seg000:011D 49                                      dec     cx
seg000:011E 43                                      inc     bx
seg000:011F 41                                      inc     cx
seg000:0120 52                                      push    dx
seg000:0121 2D 53 54                                sub     ax, 5453h
seg000:0124 41                                      inc     cx
seg000:0125 4E                                      dec     si
seg000:0126 44                                      inc     sp
seg000:0127 41                                      inc     cx
seg000:0128 52                                      push    dx
seg000:0129 44                                      inc     sp
seg000:012A 2D 41 4E                                sub     ax, 4E41h
seg000:012D 54                                      push    sp
seg000:012E 49                                      dec     cx
seg000:012F 56                                      push    si
seg000:0130 49                                      dec     cx
seg000:0131 52                                      push    dx
seg000:0132 55                                      push    bp
seg000:0133 53                                      push    bx
seg000:0134 2D 54 45                                sub     ax, 4554h
seg000:0137 53                                      push    bx
seg000:0138 54                                      push    sp
seg000:0139 2D 46 49                                sub     ax, 4946h
seg000:013C 4C                                      dec     sp
seg000:013D 45                                      inc     bp
seg000:013E 21 24                                   and     [si], sp
seg000:0140
seg000:0140                         loc_10140:                              ; CODE XREF: start+1Aj
seg000:0140 48                                      dec     ax
seg000:0141 2B 48 2A                                sub     cx, [bx+si+2Ah]
seg000:0141                         start           endp ; sp-analysis failed

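A COM file is loaded at offset 0x100 of its segment and execution starts at cs:0x100. The first instruction there is pop ax, which pops a word from the stack into ax. Because the stack pointer is initially 0xFFFE, the last 2 bytes of the segment at 0xFFFE and 0xFFFF (which were 0 in my experiments) are popped into ax. This matches the DOS convention of pushing a zero word onto the stack before starting a COM program. Hence ax equals 0.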

At address seg000:0101, instruction xor ax, 214Fh makes ax = 0 ^ 0x214F = 0x214F.

At address seg000:0104, instruction push ax pushes 0x214F onto the stack.

At address seg000:0105, instruction and ax, 4140h makes ax = 0x214F & 0x4140 = 0x140 and al = 0x40 (low byte of ax).

At address seg000:0108, instruction push ax pushes 0x140 onto the stack.

At address seg000:0109, instruction pop bx pops 0x140 off the stack into bx.

At address seg000:010A, instruction xor al, 5Ch makes al = 0x40 ^ 0x5C = 0x1C and ax = 0x11C.

At address seg000:010C, instruction push ax pushes 0x11C onto the stack.

At address seg000:010D, instruction pop dx pops 0x11C off the stack into dx. Now dx = 0x11C, which points to the $-terminated string aEicarStandard.

seg000:011C 45 49 43 41 52 2D 53 54+aEicarStandard  db 'EICAR-STANDARD-ANTIVIRUS-TEST-FILE!'
seg000:013F 24                                      db  24h ; $

At address seg000:010E, instruction pop ax pops 0x214F (pushed earlier at seg000:0104) off the stack into ax.

At address seg000:010F, instruction xor ax, 2834h makes ax = 0x214F ^ 0x2834 = 0x97B and ah = 0x9 (high byte of ax).

The next 2 instructions at seg000:0112 and seg000:0113 push 0x97B onto the stack and pop it into si:

seg000:0112 50                                      push    ax
seg000:0113 5E                                      pop     si
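The register values up to this point can be double-checked by replaying the arithmetic outside of DOS. The following is only a minimal Python sketch, not an emulator: the function name trace_registers and the list used as a stack are my own illustration, and it assumes the word popped at 0100h is zero.

```python
# Replay the register arithmetic from 0100h to 0113h using 16-bit values.
MASK16 = 0xFFFF


def trace_registers(initial_stack_word=0x0000):
    """Return the register values after the instruction at 0113h."""
    stack = [initial_stack_word]          # assumption: a zero word is on top of the stack
    ax = stack.pop()                      # 0100: pop ax
    ax = (ax ^ 0x214F) & MASK16           # 0101: xor ax, 214Fh
    stack.append(ax)                      # 0104: push ax
    ax &= 0x4140                          # 0105: and ax, 4140h
    stack.append(ax)                      # 0108: push ax
    bx = stack.pop()                      # 0109: pop bx
    al = (ax & 0xFF) ^ 0x5C               # 010A: xor al, 5Ch (only the low byte changes)
    ax = (ax & 0xFF00) | al
    stack.append(ax)                      # 010C: push ax
    dx = stack.pop()                      # 010D: pop dx
    ax = stack.pop()                      # 010E: pop ax (the value pushed at 0104)
    ax = (ax ^ 0x2834) & MASK16           # 010F: xor ax, 2834h
    stack.append(ax)                      # 0112: push ax
    si = stack.pop()                      # 0113: pop si
    return {"ax": ax, "bx": bx, "dx": dx, "si": si}


regs = trace_registers()
print({name: hex(value) for name, value in regs.items()})
```

With a zero word on the stack this gives ax = si = 0x97B, bx = 0x140 and dx = 0x11C, so ah already holds the DOS function number 9.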

Now, bx = 0x140 points to the word value 0x2B48 (stored as the bytes 48 2B, low byte first) and si = 0x97B.

seg000:0140                         loc_10140:                              ; CODE XREF: start+1Aj
seg000:0140 48                                      dec     ax
seg000:0141 2B 48 2A                                sub     cx, [bx+si+2Ah]

At address seg000:0114, instruction sub [bx], si sets the word pointed to by bx to 0x2B48 - 0x97B = 0x21CD. Because the word is stored low byte first, the bytes at 0x140 are now CD 21, so the instruction at 0x140 becomes int 21h. This is self-modifying code, a notable characteristic of the EICAR test file.

The next two instructions increment bx to 0x142, so bx now points to the word value 0x2A48.

seg000:0116 43                                      inc     bx
seg000:0117 43                                      inc     bx

At address seg000:0118, instruction sub [bx], si sets the word pointed to by bx to 0x2A48 - 0x97B = 0x20CD. The bytes at 0x142 are now CD 20, so the instruction at 0x142 becomes int 20h. This is the second piece of self-modifying code in the EICAR test file.

The next instruction at address seg000:011A, jge short loc_10140, always jumps to seg000:0140. jge is taken when the sign flag equals the overflow flag, and the preceding sub leaves a positive result (0x20CD) with no signed overflow because 0x2A48 is greater than 0x97B.
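The two patches and the jump condition can be reproduced with a few lines of Python. This is a rough sketch under my own assumptions: the helper sub16_with_flags is a hypothetical name, it models only the sign and overflow flags, and the four bytes are copied from the listing at 0140h.

```python
import struct


def sub16_with_flags(dest, src):
    """16-bit SUB: return (result, sign_flag, overflow_flag)."""
    result = (dest - src) & 0xFFFF
    sf = (result >> 15) & 1
    # Signed overflow on subtraction: the operands have different signs
    # and the sign of the result differs from the sign of dest.
    of = (((dest ^ src) & (dest ^ result)) >> 15) & 1
    return result, sf, of


tail = bytearray.fromhex("482B482A")      # bytes at 0140h..0143h before patching
si = 0x097B
jge_taken = None
for offset in (0, 2):                     # bx = 0140h, then bx = 0142h
    (word,) = struct.unpack_from("<H", tail, offset)   # little-endian word read
    result, sf, of = sub16_with_flags(word, si)
    struct.pack_into("<H", tail, offset, result)       # sub [bx], si writes back
    jge_taken = (sf == of)                # flags left by the most recent sub

print(tail.hex(), jge_taken)
```

After both subtractions the buffer holds CD 21 CD 20, which is int 21h followed by int 20h, and the flags left by the second sub satisfy the jge condition.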

Now at address seg000:0140 the instruction int 21h is executed with the function code in ah = 0x9 (WRITE STRING TO STANDARD OUTPUT). The $-terminated string to be written is pointed to by DS:DX, and dx = 0x11C is the address of aEicarStandard. So, the string ‘EICAR-STANDARD-ANTIVIRUS-TEST-FILE!’ is written to STDOUT.

The next instruction at address seg000:0142 (int 20h) terminates the EICAR program.
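To make the role of the $ terminator concrete, here is a small Python sketch of what the print-string call does with DS:DX. It is an illustration, not DOS code: dos_print_string is a name I made up, and the 64 KiB bytearray is a stand-in for the program's segment with only the message filled in.

```python
def dos_print_string(memory, dx):
    """Return the text int 21h / AH=09h would write: bytes from dx up to, not including, '$'."""
    end = memory.index(b"$", dx)          # the '$' terminator itself is not printed
    return memory[dx:end].decode("ascii")


# Hypothetical 64 KiB segment; only the string at 011Ch is populated.
segment = bytearray(0x10000)
message = b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$"
segment[0x11C:0x11C + len(message)] = message

text = dos_print_string(segment, 0x11C)
print(text)
```

The message occupies 0x11C through 0x13F, so the patched code at 0x140 starts immediately after the terminator.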

Full disassembly of EICAR Anti-Virus Test File:

seg000:0100                                         public start
seg000:0100                         start           proc near
seg000:0100 58                                      pop     ax              ; ax = 0
seg000:0101 35 4F 21                                xor     ax, 214Fh       ; ax = 214Fh
seg000:0104 50                                      push    ax
seg000:0105 25 40 41                                and     ax, 4140h       ; ax = 0x140
seg000:0108 50                                      push    ax
seg000:0109 5B                                      pop     bx              ; bx = 0x140
seg000:010A 34 5C                                   xor     al, 5Ch         ; al = 0x1C
seg000:010A                                                                 ; ax = 0x11C
seg000:010C 50                                      push    ax
seg000:010D 5A                                      pop     dx              ; dx = 0x11C points to aEicarStandard string
seg000:010E 58                                      pop     ax              ; ax = 0x214F
seg000:010F 35 34 28                                xor     ax, 2834h       ; ax = 0x97B; ah = 0x9
seg000:0112 50                                      push    ax
seg000:0113 5E                                      pop     si              ; si = 0x97B
seg000:0114 29 37                                   sub     [bx], si        ; 0x2B48-0x97b = 0x21CD
seg000:0116 43                                      inc     bx
seg000:0117 43                                      inc     bx
seg000:0118 29 37                                   sub     [bx], si        ; 0x2A48-0x97B = 0x20CD
seg000:011A 7D 24                                   jge     short loc_10140
seg000:011A                         ; ---------------------------------------------------------------------------
seg000:011C 45 49 43 41 52 2D 53 54+aEicarStandard  db 'EICAR-STANDARD-ANTIVIRUS-TEST-FILE!'
seg000:013F 24                                      db  24h ; $
seg000:0140                         ; ---------------------------------------------------------------------------
seg000:0140
seg000:0140                         loc_10140:                              ; CODE XREF: start+1Aj
seg000:0140 48                                      dec     ax
seg000:0141 2B 48 2A                                sub     cx, [bx+si+2Ah]
seg000:0141                         start           endp ; sp-analysis failed
seg000:0141
seg000:0141                         seg000          ends
seg000:0141
seg000:0141
seg000:0141                                         end start

Here are some other interesting points from the analysis of the disassembly:

  • All the opcodes/characters in the EICAR test file are ASCII printable characters (from 0x20 to 0x7E).
  • The program starts with pop ax to set ax to 0. Relying on the initial stack contents is not good programming practice compared with xor ax, ax, but pop ax takes only 1 byte (58), whereas xor ax, ax takes 2 bytes (31 C0), and the byte C0 is not an ASCII printable character.
  • To execute int 21h and int 20h without storing the byte CD (not an ASCII printable character) in the file, EICAR uses the self-modifying code technique.
  • The instruction at seg000:011A is jge instead of jmp because the jge opcode 7D is an ASCII printable character, while the short jmp opcode EB is not.
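The points about printable bytes can be checked mechanically. The sketch below rebuilds the 68-byte file image from the hex bytes in the listing above (the variable names and the non_printable helper are mine) and tests each byte against the 0x20 to 0x7E range, along with the encodings the file avoids.

```python
# Code bytes 0100h..011Bh, copied from the listing.
code_head = bytes.fromhex(
    "58354F21" "50254041" "505B345C" "505A5835"
    "3428505E" "29374343" "29377D24"
)
message = b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$"   # 011Ch..013Fh
code_tail = bytes.fromhex("482B482A")               # 0140h..0143h, before patching
image = code_head + message + code_tail


def non_printable(data):
    """Return the bytes in data that fall outside the printable ASCII range 0x20..0x7E."""
    return [b for b in data if not 0x20 <= b <= 0x7E]


print(len(image), non_printable(image))
print(non_printable(bytes.fromhex("CD21CD20")))     # int 21h / int 20h after patching
print(non_printable(bytes.fromhex("31C0")))         # xor ax, ax
print(non_printable(bytes.fromhex("EB24")))         # jmp short with the same displacement
```

The file on disk contains no byte outside the printable range, while each of the three avoided encodings contains one or more (CD, C0 and EB).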

Conclusion:

The EICAR Standard Anti-Virus Test File is a benign program: it only prints a string and exits. It does use self-modifying code, which is a common technique in the VX scene, but nothing in it is malicious.
