Your filters are bypassed: Rustock.C in the kernel

2008-11-01

Chandra Prakash

Sunbelt Software, USA
Editor: Helen Martin

Abstract

Chandra Prakash describes the step-by-step operational characteristics of Rustock.C in kernel mode from its startup to the point at which its spambot code (botdll) is activated in user mode.


Following earlier articles on Rustock.A (see VB, September 2006, p.6) and Rustock.C (see VB, August 2008, p.4), this article describes the step-by-step operational characteristics of Rustock.C in kernel mode from its startup to the point at which its spambot code (botdll) is activated in user mode. (Unless otherwise stated: any information on operating system routines or data structures applies to 32-bit Windows XP SP2; any reference to ntoskrnl also implies a reference to ntkrnlpa, ntkrnlmp and ntkrpamp; a file-mapped PE image refers to a PE file as laid out on disk; a virtual-mapped PE image refers to a PE image in virtual memory as loaded by the Windows loader.)

Understanding the operational characteristics of Rustock.C through static analysis is a very cumbersome process because the malware executes only after several stages of unpacking. Furthermore, multiple threads are created right from the malware’s startup, which increases the complexity of dynamic analysis. The analysis presented here is based on a June 2008 sample.

Stage 1: unpacking

In its initial stage Rustock.C uses a simple XOR algorithm to unpack its code to a designated area. Once unpacking is complete, it transfers control to the unencrypted code as shown below:

lea  esp, [esp-4]
mov  dword ptr [esp], offset byte_13000
retn
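The XOR pass itself can be pictured with a minimal C sketch. The single-byte key, the separate source and destination buffers and the name xor_unpack are assumptions made for illustration; the article does not state the key width or value used by the sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: XOR-decode 'len' bytes from 'src' into 'dst'.
 * The one-byte key is an assumption; the real key is not given here. */
void xor_unpack(uint8_t *dst, const uint8_t *src, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = (uint8_t)(src[i] ^ key);
    }
}
```

Because XOR is its own inverse, running the same routine over the output with the same key reproduces the packed bytes.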

A different sample of Rustock.C demonstrates an anti-debugging trick when Stage 1 unpacking is complete:

popad
sub esp,4 ; Grow the stack by one dword (ESP decreases).
mov dword ptr [esp],offset rustockC+0x3000
add esp,4 ; Shrink the stack again (ESP increases).
push dword ptr [esp-4]
           ; Access to a value beyond current top
           ; of stack. In a debugging session,
           ; this stack location may very well
           ; contain previous register eflags
           ; value stored by debug trace
           ; interrupt. As a result EIP after
           ; ‘ret’ can point to an invalid
           ; location.
ret

This shows that sub-variants of Rustock.C exist with slight differences in operational behaviour.

After initial unpacking, one of the first things the malware does is to locate the load address of ntoskrnl via some pointer arithmetic on the interrupt descriptor table (IDT) using the following set of instructions:

mov  eax,dword ptr fs:[00000038h] ; Get IDT address.
mov  eax,dword ptr [eax+4]
xor  al,al
find_ntos_base:
sub  eax,100h
cmp  word ptr [eax],5A4Dh
jne  find_ntos_base

The loop starts from an address taken from the first IDT entry, clears its low byte and steps backwards 100h bytes at a time until it finds the MZ signature that marks the base of ntoskrnl. The base address is then used to scan the ntoskrnl export table for the following functions:

ExAllocatePool
ExFreePool
ZwQuerySystemInformation
_stricmp

These functions are used for unpacking and loading as described in the next sections.
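The backward scan for the ntoskrnl base can be modelled in user mode as follows. This is a toy sketch: a byte array stands in for kernel memory, offsets stand in for virtual addresses, and the name find_image_base is hypothetical. Unlike the original loop, the sketch stops at offset 0 instead of running until a match is found.

```c
#include <stddef.h>
#include <stdint.h>

#define SCAN_STEP 0x100u   /* step used by the listing: sub eax,100h */
#define MZ_MAGIC  0x5A4Du  /* 'MZ' read as a little-endian 16-bit word */

/* Returns the offset of the first MZ header found below 'start' (after
 * clearing its low byte), or (size_t)-1 if none is found. */
size_t find_image_base(const uint8_t *mem, size_t start)
{
    size_t off = start & ~(size_t)0xFF;              /* xor al,al */
    while (off >= SCAN_STEP) {
        off -= SCAN_STEP;                            /* sub eax,100h */
        uint16_t word = (uint16_t)(mem[off] | (mem[off + 1] << 8));
        if (word == MZ_MAGIC) {                      /* cmp word ptr [eax],5A4Dh */
            return off;
        }
    }
    return (size_t)-1;
}
```

The 100h step works because the image base is aligned to a much coarser boundary, so the header always falls on one of the probed addresses.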

Stage 2: decompression and decryption

In Stage 2, Rustock.C allocates a temporary buffer using ExAllocatePoolWithTag to decompress and decrypt data from Stage 1. All memory allocation calls in the kernel are made through this API with the tag name ‘Ddk ’ (note the trailing space). Decompression is carried out using the aPLib algorithm, followed by decryption using RC4. These decompression and decryption mechanisms are well documented elsewhere [1], [2].
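For reference, the RC4 half of this stage can be sketched in a few lines of C. This is the textbook algorithm, not code recovered from the sample; the key used by Rustock.C is not given in this article, so the key in the usage note is a placeholder. The aPLib step is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4 over data[0..len) in place. Encryption and decryption
 * are the same operation. key_len must be non-zero. */
void rc4_crypt(const uint8_t *key, size_t key_len, uint8_t *data, size_t len)
{
    uint8_t s[256];
    uint8_t tmp;
    size_t i, j;

    /* Key-scheduling algorithm. */
    for (i = 0; i < 256; i++) {
        s[i] = (uint8_t)i;
    }
    for (i = 0, j = 0; i < 256; i++) {
        j = (j + s[i] + key[i % key_len]) & 0xFF;
        tmp = s[i]; s[i] = s[j]; s[j] = tmp;
    }

    /* Pseudo-random generation algorithm, XORed into the data. */
    i = 0;
    j = 0;
    for (size_t n = 0; n < len; n++) {
        i = (i + 1) & 0xFF;
        j = (j + s[i]) & 0xFF;
        tmp = s[i]; s[i] = s[j]; s[j] = tmp;
        data[n] ^= s[(s[i] + s[j]) & 0xFF];
    }
}
```

Calling rc4_crypt twice with the same placeholder key, for example the three bytes "Key", restores the original buffer.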

Loading

The unpacked data from Stage 2 is the final file-mapped PE image of the driver ready to be loaded. The image is loaded with its image base as the start address of the location from which it was originally unpacked, wiping out Stage 1 decrypted data. Loading is carried out in three steps:

  • The PE headers and sections are copied over. The starting virtual address of every section is aligned as per the section alignment.

  • The import address table (IAT) is patched.

  • The relocations are fixed.

Imports are mainly from ntoskrnl and hal.dll, and are resolved via a lookup in the export table of each module using its load information. The load information is obtained through ZwQuerySystemInformation using the SystemModuleInformation class. After the IAT fix-up, the relocations are completed in place. Once relocations are completed, the MZ and PE signatures are zeroed out to obfuscate the loaded image and hinder its detection by kernel debuggers. The loader then zeroes out and frees the temporary buffer that contains the file-mapped PE image.
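The relocation step can be illustrated with a short C sketch that walks the standard PE base relocation blocks and applies HIGHLOW fixups. It is a generic loader fragment under stated assumptions, not code taken from the sample: the function name and parameters are hypothetical, and a little-endian host is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_ABSOLUTE 0   /* padding entry, skipped */
#define REL_BASED_HIGHLOW  3   /* add the delta to a 32-bit value */

/* Apply base relocations to a virtually mapped image.
 * image      : start of the mapped image
 * reloc_rva  : RVA of the relocation directory
 * reloc_size : size of that directory in bytes
 * delta      : actual image base minus preferred image base */
void apply_relocations(uint8_t *image, uint32_t reloc_rva,
                       uint32_t reloc_size, uint32_t delta)
{
    uint32_t pos = 0;
    while (pos + 8 <= reloc_size) {
        uint32_t page_rva, block_size;
        memcpy(&page_rva, image + reloc_rva + pos, 4);
        memcpy(&block_size, image + reloc_rva + pos + 4, 4);
        if (block_size < 8 || pos + block_size > reloc_size) {
            break;   /* malformed block: stop rather than overrun */
        }
        for (uint32_t e = 8; e + 2 <= block_size; e += 2) {
            uint16_t entry;
            memcpy(&entry, image + reloc_rva + pos + e, 2);
            if ((entry >> 12) == REL_BASED_HIGHLOW) {
                uint8_t *target = image + page_rva + (entry & 0x0FFF);
                uint32_t value;
                memcpy(&value, target, 4);
                value += delta;
                memcpy(target, &value, 4);
            }
        }
        pos += block_size;
    }
}
```

Each 16-bit entry holds a type in its top four bits and a page offset in the remaining twelve, which is why one block header covers a single 4 KB page.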

After the image is virtually mapped, control is transferred to the entry point of the final loaded image as shown below:

mov  dword ptr [esp+1Ch],esi ; ESI has entry point.
popad
jmp  eax

The activities of the two threads created at startup and a third thread that is created conditionally (see Figure 1) are described in the following sections.

Figure 1. 

Activities of Thread1

Setting up hooks

Thread1 starts by creating a named event via ZwCreateEvent with the name \BaseNamedObjects\{C8453B23-1087-27d9-1394-CDBF03EC72D8}. The use of the \BaseNamedObjects directory indicates that this event object is intended to be shared with user mode. The thread then searches for the NULL-terminated ASCII string ‘FATAL_UNHANDLED_HARD_ERROR’ in the resource section of ntoskrnl. If the string is found, a page-locking test is performed on the page that contains it, using the pseudo code shown below:

PMDL mdl = NULL;
__try
{
    mdl = IoAllocateMdl(
       vaFatalUnhandledHardErrorStr,
       0x1b,   // Length of str including the NULL terminator.
       FALSE,  // SecondaryBuffer
       FALSE,  // ChargeQuota
       NULL);  // Irp
    MmProbeAndLockPages(
       mdl,
       KernelMode,
       IoReadAccess);
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
    IoFreeMdl(mdl);
}

The call to MmProbeAndLockPages raises an access violation exception if the requested access to the pages is not granted. In Rustock.C there is no further reference to the MDL allocated by IoAllocateMdl, which raises questions as to the purpose of this code. However, there is a connection to the Rustock.A KiFastCallEntry hook (see below), whose jump instruction is written over part of this same string. This indicates that Rustock.C is very likely an enhanced version of the Rustock.A code base, and that this code is simply left over from the previous version [3], [4].

; IA32_SYSENTER_EIP: msr[176] = 806afd59
; 806afd59                 e9 ec 2e e6 77 4e 44 4c   ....wNDL
; 806afd61                 45 44 5f 48 41 52 44 5f   ED_HARD_
; 806afd69                 45 52 52 4f 52 0d 0a 00   ERROR...

806afd59 e9ec2ee677 jmp rustockA+0x4c4a
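The destination of the hook can be checked by hand from the bytes in the dump. The sketch below decodes a 5-byte near jump (opcode E9 followed by a 32-bit displacement); the function name is hypothetical.

```c
#include <stdint.h>

/* Decode the destination of a 5-byte x86 near jump (E9 + rel32).
 * Returns 0 if the bytes do not start with E9. */
uint32_t jmp_rel32_target(uint32_t address, const uint8_t code[5])
{
    if (code[0] != 0xE9) {
        return 0;
    }
    uint32_t rel = (uint32_t)code[1]
                 | ((uint32_t)code[2] << 8)
                 | ((uint32_t)code[3] << 16)
                 | ((uint32_t)code[4] << 24);
    return address + 5u + rel;   /* wraps modulo 2^32, like EIP */
}
```

For the bytes e9 ec 2e e6 77 at 806afd59 this gives 806afd5e + 77e62eec = f8512c4a, which is consistent with the rustockA+0x4c4a label if the Rustock.A driver was loaded at f850e000 in that session.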

The thread then calls ZwCreateFile to open a handle to the ntdll file using the \SystemRoot\System32\ntdll.dll path. Following this, a call is made to ZwQueryInformationFile to obtain the on-disk size of ntdll using FILE_INFORMATION_CLASS FileStandardInformation. Using the ntdll file size a new buffer is allocated with ExAllocatePoolWithTag and the ntdll file is read off disk using ZwReadFile.

The disk buffer containing the ntdll data is then virtually mapped into a new buffer. The new buffer is also allocated via ExAllocatePoolWithTag, and once the image has been virtually mapped, the previous buffer containing the on-disk data is freed. The virtually mapped ntdll is used to obtain the SSDT service index of each Zw function to be hooked, by looking up the function's entry in the ntdll export table. When the virtually mapped ntdll is ready, the driver stores its own load address, size and full driver path in designated memory locations for subsequent use. This self-load information is used to map its own driver into user space, as described later. It then sets up its process creation notification routine via PsSetCreateProcessNotifyRoutine and creates a second thread, Thread2, as shown in Figure 1. All sub-keys and values under \registry\machine\system\CurrentControlSet\Enum\Root\LEGACY_<rustockC_driver_name> are deleted recursively.
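The article does not show how the service index is read once the export has been located. A common approach on 32-bit Windows, assumed here, relies on each Zw stub in ntdll beginning with 'mov eax, imm32', where the immediate is the service index. The function name and the example bytes in the usage note are hypothetical.

```c
#include <stdint.h>

/* Returns the service index encoded in a stub that begins with
 * 'mov eax, imm32' (opcode B8), or -1 if the stub has another shape. */
int32_t service_index_from_stub(const uint8_t *stub)
{
    if (stub[0] != 0xB8) {
        return -1;
    }
    return (int32_t)((uint32_t)stub[1]
                   | ((uint32_t)stub[2] << 8)
                   | ((uint32_t)stub[3] << 16)
                   | ((uint32_t)stub[4] << 24));
}
```

Given a stub starting with the bytes B8 01 01 00 00, the function returns 0x101; the index is then used to locate the matching slot in the service table.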

Rustock.C hooks the registry in a way that has not been seen in previous Rustock variants [5]. It hooks the registry key parse procedure in the kernel that is registered by the configuration manager with the object manager (see below). The parse procedure is employed to parse a registry path in registry-related APIs.

 _OBJECT_TYPE_INITIALIZER
 +0x000 Length          : 0x4c
 .
 .
 +0x030 OpenProcedure   : (null)
 +0x034 CloseProcedure  : 0x8056bf9e nt!CmpCloseKeyObject+0
 +0x038 DeleteProcedure : 0x8056c072 nt!CmpDeleteKeyObject+0
 +0x03c ParseProcedure  : 0xf9b4fdd3 <-- Rustock.C address (normally nt!CmpParseKey).
 +0x040 SecurityProcedure     : 0x8056bfd6 nt!CmpSecurityMethod+0
 +0x044 QueryNameProcedure    : 0x805a935e nt!CmpQueryKeyName+0
 +0x048 OkayToCloseProcedure  : (null)

ZwOpenKey and ZwCreateKey are also hooked. After setting up the registry hooks, the malware gets a handle to the directory containing its driver file using ZwCreateFile, and passes that handle to ObReferenceObjectByHandle to get a FILE_OBJECT pointer. It then calls IoGetRelatedDeviceObject on the DeviceObject field of the file object to obtain the highest-level device object in the file system filter driver stack. Typically, on machines that support the filter manager, the highest-level device object is the device object of the filter manager driver (FltMgr.sys). Starting from the highest-level device object, the malware walks down the device stack until it finds the lowest-level device object, created by the NTFS driver. The device object of the NTFS driver is used to hook its IRP_MJ_CREATE dispatch routine. In one Rustock.C variant, the mechanics of this create hook allowed the driver file to be copied from the Windows command prompt, but the copy was not identical to the original driver file.
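The walk down the device stack can be modelled with a simple linked list. This is a toy sketch: the toy_device structure, its fields and the driver names in the usage note are stand-ins chosen for illustration and do not reproduce the real DEVICE_OBJECT layout or the way the sample reaches the lower device.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a DEVICE_OBJECT: only the fields this sketch needs. */
struct toy_device {
    const char *driver_name;    /* owning driver, e.g. "\\FileSystem\\Ntfs" */
    struct toy_device *lower;   /* next device down the stack, or NULL */
};

/* Walk down from the top of the stack and return the device owned by
 * 'driver_name', or NULL if the stack does not contain one. */
struct toy_device *find_device_in_stack(struct toy_device *top,
                                        const char *driver_name)
{
    for (struct toy_device *dev = top; dev != NULL; dev = dev->lower) {
        if (strcmp(dev->driver_name, driver_name) == 0) {
            return dev;
        }
    }
    return NULL;
}
```

With a three-level stack of a filter manager device, a hypothetical security filter and an NTFS device, the search returns the bottom entry, which is the object whose dispatch routine the malware targets.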

ZwTerminateProcess is then hooked and a function dispatch table is set up, which is used to serve commands from botdll in an unusual way:

NTSTATUS NTAPI NtTerminateProcess(
      IN HANDLE hProcess,
      IN NTSTATUS ExitCode
);

00012339 cmp     dword ptr [ebp+0Ch], 0FCC7975Bh
            ; ExitCode parameter contains special
            ; encoded value for botdll and
            ; driver communication.
00012340 jnz     short OrigNtTerminateProcess
      .
      .
OrigNtTerminateProcess:
            ; Normal process termination requests
            ; come here.
000123AF push    dword ptr [ebp+0Ch]
000123B2 push    dword ptr [ebp+8]
000123B5 mov     eax, OrigNtTerminateProcess
000123BA call    dword ptr [eax]

The ExitCode parameter of ZwTerminateProcess is set to a specific value that indicates a message from botdll to the driver. The message parameters are encoded in the first hProcess parameter. Normal process termination requests are routed to the original NtTerminateProcess routine address stored in memory as shown above.

Setting up botdll: step 1

Services.exe is used as a goat process for hosting botdll. The process id of the services.exe process is obtained using the SystemProcessAndThreadsInformation class in the ZwQuerySystemInformation call. This process id is used to get the EPROCESS object associated with services.exe. The EPROCESS object is used in the KeAttachProcess call to attach to the virtual address space of services.exe. Then Rustock.C maps its own driver’s PE image into the services.exe address space using the IoAllocateMdl, MmBuildMdlForNonPagedPool, MmMapLockedPages sequence of calls. By mapping its own driver image in user space, the malware makes its code and data available to user-mode processes, as described later in this section.

Before calling KeDetachProcess, Rustock.C calls NtSetInformationProcess on services.exe with the PROCESS_INFORMATION_CLASS parameter set to ProcessExecuteFlags (0x22) and the mask value MEM_EXECUTE_OPTION_ENABLE (0x2). The purpose of this call is to disable no-execute (NX) enforcement on data pages protected by DEP [6]. The malware then gets information about all services.exe threads from the ZwQuerySystemInformation call made earlier with the SystemProcessAndThreadsInformation class, and sends an asynchronous procedure call (APC1) to each of the threads. The APC mechanism is designed to execute a function in the context of a target thread. The API calls used for APC are KeInitializeApc and KeInsertQueueApc:

NTKERNELAPI
VOID
KeInitializeApc (
    IN PRKAPC Apc,
    IN PKTHREAD Thread,
    IN KAPC_ENVIRONMENT Environment,
    IN PKKERNEL_ROUTINE KernelRoutine,
    IN PKRUNDOWN_ROUTINE RundownRoutine OPTIONAL,
    IN PKNORMAL_ROUTINE NormalRoutine OPTIONAL,
    IN KPROCESSOR_MODE ApcMode,
    IN PVOID NormalContext
    );

The NormalRoutine and NormalContext parameters are the user-mode virtual addresses of the APC1 start routine and its context respectively. Note the values for these user-mode virtual addresses are set earlier by mapping the malware’s own kernel PE image into user space. The KernelRoutine parameter in KeInitializeApc is the address of a function in kernel space that frees up the APC object (first parameter) allocated from a non-paged pool. The primary purpose of the APC1 call is to set up the import address table of function names referenced in the NormalContext field:

LoadLibraryA
GetProcAddress
SetEvent
Init
CreateThread
SleepEx

The virtual addresses of these functions are resolved using the load address of kernel32.dll, taken from the DLL load information stored in the process environment block (PEB). The address of the PEB is obtained from FS:[30h]. The Init function is resolved from the exports of botdll, which is injected into services.exe by a second APC (APC2), as described later. APC1 also creates a new thread in user mode, whose startup routine is shown below:

ThreadStartRoutine:
push 1
push 0FFFFFFFFh
call dword ptr [esp+0Ch]
            ; SleepEx(INFINITE, TRUE)
jmp  ThreadStartRoutine

This thread seems to be doing nothing but sleep forever! The purpose of the sleep is to keep the thread in an alertable state, by passing TRUE as the bAlertable parameter, so that future APCs can be executed promptly:

DWORD SleepEx(
   DWORD dwMilliseconds,
   BOOL bAlertable
);

If the thread is not in an alertable state, user-mode APCs remain queued until it next enters an alertable wait [7].
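The effect of the alertable sleep can be shown with a toy model of a per-thread APC queue. Nothing here calls the Windows API: the structure, the function names and the fixed queue size are all hypothetical, and the model only captures the rule that queued routines run when the thread waits alertably.

```c
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*apc_routine)(void *context);

/* Toy model of one thread's user-mode APC queue. */
struct toy_thread {
    apc_routine routine[MAX_PENDING];
    void *context[MAX_PENDING];
    size_t pending;
};

/* Queue an APC; returns 0 on success, -1 if the toy queue is full. */
int toy_queue_apc(struct toy_thread *t, apc_routine fn, void *ctx)
{
    if (t->pending == MAX_PENDING) {
        return -1;
    }
    t->routine[t->pending] = fn;
    t->context[t->pending] = ctx;
    t->pending++;
    return 0;
}

/* Model of a wait: pending APCs run only if the wait is alertable.
 * Returns the number of APCs delivered. */
size_t toy_wait(struct toy_thread *t, int alertable)
{
    size_t delivered = 0;
    if (!alertable) {
        return 0;
    }
    for (size_t i = 0; i < t->pending; i++) {
        t->routine[i](t->context[i]);
        delivered++;
    }
    t->pending = 0;
    return delivered;
}

/* Sample routine for exercising the model: counts its invocations. */
void toy_count_apc(void *context)
{
    (*(int *)context)++;
}
```

A thread that loops on an alertable wait, as the routine above does with SleepEx, drains its queue each time round, which is why APC2 is delivered without further help from the driver.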

Setting up botdll: step 2

The next step in setting up the user-mode botdll is for the malware to read its own driver file from disk. It first creates an empty file object using ObCreateObject and sets the file name to refer to its own driver file. It then gets the device object of the lowest file system driver, i.e. the NTFS driver, and, using the new file object and device object, generates IRP_MJ_CREATE to open its own driver file.

The file is read in two steps. First, the file size is obtained using IRP_MJ_QUERY_INFORMATION with FILE_INFORMATION_CLASS set to FileStandardInformation. In the second step, IRP_MJ_READ is sent with a buffer allocated by ExAllocatePoolWithTag. Rustock.C then sends IRP_MJ_CLEANUP and IRP_MJ_CLOSE directly to the NTFS driver to undo the actions associated with IRP_MJ_CREATE. The memory location containing the malware’s own file data is saved for later use (for example, in a separate worker thread to write its copy to disk at regular intervals for resuscitation).

Typically, IRP_MJ_CREATE, IRP_MJ_CLEANUP and IRP_MJ_CLOSE are generated implicitly by I/O Manager inside the Windows kernel, and by rolling out these IRPs on its own, Rustock.C showcases the sophistication of its authors. Generating its own IRP_MJ_CREATE is a non-trivial task involving several intricate steps, especially relating to setting parameters for the caller’s security context. Since it rolls out its own IRP_MJ_CREATE, the Rustock.C driver is able to send direct read and write requests (IRP_MJ_READ and IRP_MJ_WRITE) to the NTFS driver. This allows the malware to bypass any filter drivers that are typically used by security vendors to provide kernel-based on-access security against malicious files.

From the data buffer containing the on-disk driver the next step is to get botdll. The botdll code is stored encrypted and compressed in the original driver file, as shown in Figure 2.

Figure 2. 

The encryption consists of a simple XOR and the compression algorithm used is aPLib [1]. After the botdll code is uncompressed into a new memory buffer, it is virtually mapped into yet another new buffer. Relocations of the botdll code are fixed in kernel mode as its user-mode base address has already been obtained from the MmMapLockedPages call. PE and MZ signatures in the final virtually mapped buffer are also zeroed out and the start address of that buffer is set in the NormalContext field of the second APC (APC2). Like APC1, APC2 is also queued to threads of the services.exe process. The NormalRoutine parameter of APC2 consists of code that performs fixups of imports of botdll. The imports are fixed up using the LoadLibraryA and GetProcAddress APIs that have already been set up via APC1.

An anti-emulation/anti-debugging trick is used for import table fixups in botdll. First, it looks up the virtual address of the byte sequence C2 04 00 in the kernel32!SetEvent function, which is the machine code for the instruction ‘ret 4’:

00011093 call    loc_00011098
00011098 pop     edx
00011099 add   edx, 10h      ; Save 000110A8 in EDX.
0001109C push  0 ; Extra push1.
                ; PUSH to compensate for
                ; additional sizeof(dword)=4
                ; byte increment in esp,
                ; because of ‘ret 4’.
0001109E push  edx     ; Extra push2.
                ; Pushing location 000110A8.
                ; Location 000110A8 is where
                ; ret 4 instruction
                ; will transfer control.
0001109F push  [ebp+Va_ImpDllName]
                ; e.g. “kernel32.dll”.
000110A2 push  [ebp+Va_RET_4]      ; Extra push3.
                ; Pushing address of location
                ; in kernel32!SetEvent whose
                ; opcode is ret 4.
000110A5 jmp   [ebp+Va_LoadLibraryA]
                ; This is where the return call
                ; trick is executed.
000110A8 mov   [ebp-38h], eax
                ; Save return from LoadLibraryA

The calls to GetProcAddress are made in a similar way, by using the address of the ‘ret 4’ instruction taken from the kernel32!SetEvent function. After relocations, the entry point of botdll is called, followed by a call to its export function, named Init, that performs the bot activity.
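The stack bookkeeping behind the three extra pushes can be replayed with a toy stack model. The sketch assumes that LoadLibraryA is a stdcall function with one argument, so that it also ends in a 'ret 4'; the structure, function names and the addresses in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy 32-bit stack: slot[top-1] is the current top of stack. */
struct toy_stack {
    uint32_t slot[16];
    size_t top;
};

void toy_push(struct toy_stack *s, uint32_t value)
{
    s->slot[s->top++] = value;
}

/* Model of 'ret imm16': pop the return address, then discard
 * imm_bytes / 4 further dwords. Returns the new instruction pointer. */
uint32_t toy_ret(struct toy_stack *s, unsigned imm_bytes)
{
    uint32_t eip = s->slot[--s->top];
    s->top -= imm_bytes / 4;
    return eip;
}

/* Replays the listing's pushes followed by two 'ret 4' instructions:
 * the first ends LoadLibraryA, the second is the one inside SetEvent. */
uint32_t replay_ret4_trick(struct toy_stack *s, uint32_t resume_addr,
                           uint32_t dll_name, uint32_t ret4_addr,
                           uint32_t *first_hop)
{
    toy_push(s, 0);              /* extra push 1: eaten by the second ret 4 */
    toy_push(s, resume_addr);    /* extra push 2: where execution resumes */
    toy_push(s, dll_name);       /* the real LoadLibraryA argument */
    toy_push(s, ret4_addr);      /* extra push 3: fake return address */
    *first_hop = toy_ret(s, 4);  /* LoadLibraryA returns into SetEvent */
    return toy_ret(s, 4);        /* that ret 4 returns to resume_addr */
}
```

Replaying the sequence with resume_addr set to 000110A8 ends with the instruction pointer at 000110A8 and an empty stack, so the caller sees a balanced stack even though the API returned through kernel32 rather than directly to the calling code.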

Activities of Thread2

Thread2 is created from Thread1 and writes its own driver file to disk every five seconds in a loop. This is most likely its persistence strategy against any deletions of its on-disk driver file. Its own driver file is saved in memory during the startup phase. Similar to the reading of its driver file, it performs its write (IRP_MJ_WRITE) by direct access to the NTFS driver, bypassing the file system filter device stack.

Activities of Thread3

Thread3 is created conditionally from the process create notify routine. This thread does the same work as that carried out towards the end of Thread1, which involves reading its own driver, sending APC1, decompressing botdll and sending APC2. The notify routine reacts only to process-creation notifications for services.exe. The condition for creating the new thread is that the botdll code has not already been spawned into services.exe. Most likely Thread3 is employed as a backup mechanism to kick off APC1 and APC2, since there may be a race condition in the boot phase between the driver’s startup and the startup of services.exe. If services.exe has not started by the time the driver’s startup has completed, then Thread3 can kick off APC1 and APC2.

Dispatch routines

The Rustock.C driver has no dispatch routines set up in its DRIVER_OBJECT, as there would be in the DriverEntry routine of a typical device driver. However, it accomplishes a similar objective using an array of 11 functions set up in memory. For example:

  • Dispatch function 0 frees up the current driver in memory, reads its own disk driver afresh and subsequently sends APC1 and APC2 as described earlier.

  • Dispatch function 1 writes a new driver using IRP_MJ_WRITE. This can potentially be used to activate a completely new driver downloaded from botdll.

  • Dispatch function 2 deletes a disk file, using IRP_MJ_SET_INFORMATION and FileInformationClass as FileDispositionInformation.

All disk access in these dispatch functions is also achieved via direct calls to the NTFS driver as described earlier. Each of these functions is called through the hooked ZwTerminateProcess API, by setting a function index along with the corresponding input/output parameters. The layout of the input/output structure is described below:

struct ZwTermProcDispatchIOParam
{
+0x00 FunctionIndex     // Index into function array
+0x04 InputBuffer       // Input buffer, if applicable
+0x08 InputBufferSize   // Input buffer size, if applicable
+0x0C OutputBuffer      // Output buffer, if applicable
+0x10 OutputBufferSize  // Output buffer size, if applicable
}

The address of this structure is passed in as the first parameter to the ZwTerminateProcess API and the second parameter (ExitCode) consists of the special encoded value as described earlier.
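The command channel can be summarised with a user-mode model of the hook. The sketch below follows the magic value, the table size and the field order given in this article; everything else (names, return values, the bounds check and the sample handlers) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAGIC_EXIT_CODE 0xFCC7975Bu   /* value compared in the hook listing */
#define DISPATCH_COUNT  11            /* the article mentions 11 functions */

/* Field order follows the layout above; on 32-bit Windows every field
 * is 4 bytes wide, which this portable sketch does not reproduce. */
struct dispatch_io_param {
    uint32_t function_index;
    void *input_buffer;
    uint32_t input_buffer_size;
    void *output_buffer;
    uint32_t output_buffer_size;
};

typedef int32_t (*dispatch_fn)(struct dispatch_io_param *param);

/* Hypothetical globals standing in for the driver's private state. */
dispatch_fn g_dispatch_table[DISPATCH_COUNT];
int32_t (*g_original_terminate)(void *process, uint32_t exit_code);

/* Model of the hook: a magic exit code turns the call into a command,
 * anything else is forwarded to the saved original routine. */
int32_t hooked_terminate_process(void *process, uint32_t exit_code)
{
    if (exit_code != MAGIC_EXIT_CODE) {
        return g_original_terminate(process, exit_code);
    }
    struct dispatch_io_param *param = process;
    if (param == NULL || param->function_index >= DISPATCH_COUNT ||
        g_dispatch_table[param->function_index] == NULL) {
        return -1;   /* arbitrary error value for this sketch */
    }
    return g_dispatch_table[param->function_index](param);
}

/* Sample handlers so that the model can be exercised. */
int32_t sample_original(void *process, uint32_t exit_code)
{
    (void)process;
    return (int32_t)exit_code;
}

int32_t sample_command(struct dispatch_io_param *param)
{
    return (int32_t)param->input_buffer_size;
}
```

Reusing an existing system service in this way means the driver needs no device object or symbolic link for botdll to open, which fits the absence of conventional dispatch routines noted above.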

Removal

The Rustock.C variant researched in this paper was removed by deleting its driver service registry keys under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\<Rustock_Driver_Name>, followed by a system reboot and deletion of its driver file. But the registry hiding mechanism it employs needs to be defeated before access can be gained to its driver service registry keys.

Conclusion

Rustock.C has the ability to operate with its bare minimum driver file containing another driver file and botdll, both stored compressed and encrypted. The mapping of its own driver image in the context of a user-mode goat process, combined with the use of the APC mechanism, obviates the need to have the botdll on disk. Some of the dispatch functions implemented via the ZwTerminateProcess hook demonstrate its ability to activate a completely different botdll and driver on the fly. Any access (read, write, query or set information) to its own driver file on disk is done surreptitiously, bypassing the file system filter device stack.

Bibliography

[1] Kwiatek, L.; Litawa, S. Yet another Rustock analysis... Virus Bulletin, August 2008, p.4.

[2] Kwiatek, L. Rustock.C – kernel mode protector. http://www.eset.com/threat-center/blog/?p=127.

[5] Skape; Skywing. A Catalog of Local Windows Kernel-mode Backdoor Techniques. http://www.uninformed.org/?v=8&a=2&t=txt.

[6] Bypassing Windows hardware-enforced data execution prevention. http://www.uninformed.org/?v=2&a=4.
