How to Make an antivirus engine

https://www.adlice.com/making-an-antivirus-engine-the-guidelines/

Introduction

When roaming around the techies forums, I often see some people (and many not very experienced) asking for “How do I make an antivirus”, sometimes with not very adapted languages (bat, PHP, …) and having a wrong idea of what an antivirus is, and how it should be built.

I’ve also seen many “Antivirus softwares” made by kiddies, with very few still-at-school people and about 4 hours per day of coding on several weeks. I’m not telling kiddies are not skilled, but I’m telling building an antivirus engine needs either lot of skilled people with full time job plus lot of time to release a decent software or lot of money to pay them (in case they are not volunteer).

So, I’ll cover here the guidelines for a basic antivirus coding, for Windows and in C/C++. One can found here the pointers to design an antivirus engine, or simply learn how most of them are built.

Protection

For a good protection, an Antivirus must have at least one driver, to be able to run code in kernel and overall have access to kernel APIs. Starting with Vista, Microsoft understood that the Antivirus industry needed keys to enter the kernel and activate filters in strategic places, such as file system, registry and network. Don’t be stunned if building an antivirus for pre-Vista systems can be a real pain, because it was not designed for this.

However, on Pre-Vista systems, Antivirus companies used to use rootkit-like features to guard the doors (even if it was not recommended at all by Microsoft) and be able to protect your system. They used what we call “Hooks” (API detours for filtering purpose).
On Vista+, Microsoft provided APIs to insert our low level driver between userland calls and kernel APIs. That way, it’s easy to register an antivirus product into the kernel. More, that kind registration based system allows us to dispatch our system security into layers, where several products with different aims can cohabit. This was not the case for hooks, as the implementation was totally product dependant.

NOTE: I will not cover the workarounds with hooks for pre-Vista systems , because it’s easy to find on the internet, and because it would need a whole chapter to explain how to hook, where to hook and so… But you have to know it’s the same idea than the kernel APIs, except that you have to implement yourself what Microsoft provided on Vista+ systems.

To learn about coding drivers, you can check that useful links:
http://msdn.microsoft.com/en-us/library/windows/hardware/gg490655.aspx
http://www.codeproject.com/Articles/9504/Driver-Development-Part-1-Introduction-to-Drivers

To learn about hooks, you can check that basic example:
http://www.unknowncheats.me/forum/c-and-c/59147-writing-drivers-perform-kernel-level-ssdt-hooking.html

Process

The first thing to protect the user from, is the launching of malicious processes. This is the basic thing. Antivirus should register a PsSetCreateProcessNotifyRoutineEx callback. By doing this, on each process creation, and before the main thread starts to run (and cause malicious things) the antivirus callback is notified and receives all the necessary information.

It receives the process name, the file object, the PID, and so. As the process is pending, the driver can tell its service to analyse the process’s memory for anything malicious. It it founds something, the driver will simply set CreationStatus to FALSE and return.

NTSTATUS PsSetCreateProcessNotifyRoutineEx(
  _In_  PCREATE_PROCESS_NOTIFY_ROUTINE_EX NotifyRoutine,
  _In_  BOOLEAN Remove
);

VOID CreateProcessNotifyEx(
  _Inout_   PEPROCESS Process,
  _In_      HANDLE ProcessId,
  _In_opt_  PPS_CREATE_NOTIFY_INFO CreateInfo
);

typedef struct _PS_CREATE_NOTIFY_INFO {
  SIZE_T              Size;
  union {
    ULONG  Flags;
    struct {
      ULONG FileOpenNameAvailable  :1;
      ULONG Reserved  :31;
    };
  };
  HANDLE              ParentProcessId;
  CLIENT_ID           CreatingThreadId;
  struct _FILE_OBJECT  *FileObject;
  PCUNICODE_STRING    ImageFileName;
  PCUNICODE_STRING    CommandLine;
  NTSTATUS            CreationStatus;
} PS_CREATE_NOTIFY_INFO, *PPS_CREATE_NOTIFY_INFO;

Threads

In the same idea than for processes, threads can be a way for malicious things to cause damages. For example, one can inject some code into a legit process, and start a remote thread on that code inside the process’s context (easy to follow? ). That way, a legit process can do malicious things.

We can filter new threads with the PsSetCreateThreadNotifyRoutine callback. Each time a thread is created, the antivirus is notified with the TID and the PID. Thus, it’s able to look into the thread’s start address code, analyse it and either stop the thread or resume it.

NTSTATUS PsSetCreateThreadNotifyRoutine(
  _In_  PCREATE_THREAD_NOTIFY_ROUTINE NotifyRoutine
);

VOID
(*PCREATE_THREAD_NOTIFY_ROUTINE) (
    IN HANDLE  ProcessId,
    IN HANDLE  ThreadId,
    IN BOOLEAN  Create
    );

Images

The third dynamic threat is about images that can be loaded into memory. An image is a PE file, either a EXE, a DLL or SYS file. To be notified of loaded images, simply register PsSetLoadImageNotifyRoutine. That callback allows us to be notified when the image is loaded into virtual memory, even it’s never executed. We can then detect when a process attempts to load a DLL, to load a driver, or to fire a new process.

The callback gets information about the full image path (useful for static analysis), and the more important in my opinion, the Image base address (for in-memory analysis). If the image is malicious the antivirus can use little tricks to avoid the execution, like parsing the in-memory image and go to the entrypoint, then call the assembly opcode “ret” to nullify it.

NTSTATUS PsSetLoadImageNotifyRoutine(
  _In_  PLOAD_IMAGE_NOTIFY_ROUTINE NotifyRoutine
);

VOID
  (*PLOAD_IMAGE_NOTIFY_ROUTINE)(
    __in_opt PUNICODE_STRING  FullImageName,
    __in HANDLE  ProcessId,
    __in PIMAGE_INFO  ImageInfo
    );

typedef struct  _IMAGE_INFO {
    union {
        ULONG  Properties;
        struct {
            ULONG ImageAddressingMode  : 8; //code addressing mode
            ULONG SystemModeImage      : 1; //system mode image
            ULONG ImageMappedToAllPids : 1; //mapped in all processes
            ULONG Reserved             : 22;
        };
    };
    PVOID  ImageBase;
    ULONG  ImageSelector;
    ULONG  ImageSize;
    ULONG  ImageSectionNumber;
} IMAGE_INFO, *PIMAGE_INFO;

Filesystem

Once every dynamic thing is secured, an antivirus should be able to notify user for malicious things on-the-fly, not only when they are about to start. An antivirus should be able to scan files when user opens a folder, an archive, or when it’s downloaded on the disk. More, an antivirus should be able to protect himself, by forbidding any program to delete its files.

The way to do all of this, is to install a driver into the file system, and more specifically a minifilter of a legacy filter (old way). Here we will talk about minifilter.

A minifilter is a specific kind of driver, able to register callbacks on every Read/Write operation made on the file system (IRP major functions). An IRP (Interrupt Request Paquet) is an object used to describe a Read/Write operation on the disk, which is transmitted along with the driver stack. The minifilter will simply be inserted into that stack, and receive that IRP to decide what to do with it (allow/deny operation).

For a little example of minifilter, please check that useful link or that one. The microsoft guidelines are here.

You’ll find also 2 examples of the WDK documentation here and here.

A basic minifilter callback look like this. There are 2 kinds of callback, Pre operation and Post operation, which are able to filter before of after the query. Here’s a preOperation pseudo code:

FLT_PREOP_CALLBACK_STATUS    PreOperationCallback    (__inout    PFLT_CALLBACK_DATA    Data,    
    __in    PCFLT_RELATED_OBJECTS    FltObjects,    
    __deref_out_opt    PVOID    *CompletionContext)
{
    ...

    if    (    all_good    )
    {
        return    FLT_PREOP_SUCCESS_NO_CALLBACK;
    }
    else
    {
        //    Access    denied                
        Data-&gt;IoStatus.Information    =    0;
        Data-&gt;IoStatus.Status    =    STATUS_ACCESS_DENIED;
        return    FLT_PREOP_COMPLETE;
    }    
}

Registry

The registry is one of the most critical place to guard. There are many many ways for a malware to keep persistent hand on the system by registering a single (or few) keys/values into the registry. The most known places are Run keys, and Services. This is also the place where the antivirus can be defeated (along with the file system), by simply removing its driver/service keys, so that it will no longer restart at system boot.

This is not a real necessity for an antivirus to guard restart places, most of them don’t. But they must guard their install registry keys, to avoid being defeated easily by malwares. This can be done by registering CmRegisterCallback.

The callback gives enough information to get the full key name, the kind of access (Create, Rename, Delete, … ) and the caller PID. That way it’s easy to grant access or not to the call, by setting the Status field of Post Operation callback.

NTSTATUS CmRegisterCallbackEx(
_In_        PEX_CALLBACK_FUNCTION Function,
_In_        PCUNICODE_STRING Altitude,
_In_        PVOID Driver,
_In_opt_    PVOID Context,
_Out_       PLARGE_INTEGER Cookie,
_Reserved_  PVOID Reserved
);

EX_CALLBACK_FUNCTION RegistryCallback;

NTSTATUS RegistryCallback(
  _In_      PVOID CallbackContext,
  _In_opt_  PVOID Argument1,
  _In_opt_  PVOID Argument2
)
{ ... }

Argument1 = typedef enum _REG_NOTIFY_CLASS {
  RegNtDeleteKey,
  RegNtPreDeleteKey = RegNtDeleteKey,
...

Argument2 = typedef struct _REG_POST_OPERATION_INFORMATION {
  PVOID    Object;
  NTSTATUS Status;
  PVOID    PreInformation;
  NTSTATUS ReturnStatus;
  PVOID    CallContext;
  PVOID    ObjectContext;
  PVOID    Reserved;
} REG_POST_OPERATION_INFORMATION, *PREG_POST_OPERATION_INFORMATION;

Network (Firewall)

To guard the doors of the whole internet traffic which can be huge on certain systems (servers, huge bandwidth users) without being slowed down by the context switching that takes place in userland, it’s totally not recommended to install a firewall that have no underlying driver, except for some web browser filters that can be enough for http traffic, but that will not protect against malware communication in/out.

In order to have a correct implementation of firewall, one should code a NDIS, TDI or another method for low level IP filtering driver. NDIS/TDI is a bit tricky to do, and would require lot of knowledge (more than other filters in my opinion).

Anyway, here’s some pointers to start coding such a driver, the microsoft guidelines, and old codeproject tutorial (but still good to read), an example of NDIS firewall, and an example of TDI firewall. Here’s also a good writing about NDIS firewall bypass trick, and a little explanation about the network driver stack,

Userland protection

The userland protection is not a necessity, but can be an additional module against Trojan Bankers, and more specifically against process spies. They are generally injected into every process, for several reasons.

First, they are able (on demand) to kill the process if it has been identified as malware (this should not happen, because AVs are supposed to stop it before it starts). It’s always easier to stop a process when you are into its context.

Second they are able to guard critical process, like web browsers, against hooking malwares able to detour and filter API calls in order to gather passwords, banking information, and redirect internet flow to malware servers. They only watch for IAT modification, for splicing, and can also set hooks themselves to avoid LoadLibray of a malware DLL, and thus forbid certain methods of code injection.

An easy way to inject a protector DLL into all processes is to use the AppInitDll registry key to register the protector DLL. It will load the DLL into every process started on the system, as soon as they link the User32.dll image (most of them do).

Analysis Engine

The analysis engine is one of the most important part, it’s responsible for analysing file/memory samples coming from the drivers. If must be fast (even with a huge database), and should be able to handle most of the file types (Self-extracted executables, Archives – RAR, ZIP, Packed files – UPX, … ) and thus should have many modules to do this:

Unpacker : That module must be able to detect and unpack most of the known packers (UPX, Armadillo, ASPack, …)
Signature engine: The database of an antivirus contains millions of signatures, and the engine should be able to fast search for them into a sample. Thus, a very powerful algorithm should be part of it. Some examples : AhoCorasick, RabinKarp, string matching algorithms.
Sandbox : That module is not necessary, but would be a plus to be able to run samples into a limited memory, with no effect on the system. That could help to unpack samples packed with unknown packers, and help the heuristic engine (see after) to detect API calls that could be considered as suspicious or malicious. Some good sandbox here.
Heuristic engine : As said above, the heuristic engine does not search for signatures, but rather look for suspicious behaviour (ie. sample that opens a connexion on the website hxxp://malware_besite.com). That can be done by static analysis, or through a sandbox.

Signature syntax

The signature syntax is the “dictionary” of the language that the signature engine understands. It’s a way to formalize what is the pattern to find, how to search it and where to search it into the sample. The syntax has to be simple enough for the researchers to understand, powerful enough to handle every use case, and easy to parse for better engine performances.

VirusTotal has developed a good syntax and engine (Yara project), which is open source. That should be a good pointer to make your own syntax, or simply use it. Here’s also a good post blog on how create signatures for antivirus.

Example of signature:

rule silent_banker : banker
{
    meta:                                        
        description = "This is just an example"
        thread_level = 3
        in_the_wild = true

    strings: 
        $a = {6A 40 68 00 30 00 00 6A 14 8D 91}  
        $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}
        $c = "UVODFRYSIHLNWPEJXQZAKCBGMT"

    condition:
        $a or $b or $c
}

Self-protection

The self protection is very important for an antivirus, to avoid being defeated by a malware and continue to protect the user. This is why an antivirus should be able to guard its own installation and keep persistence at reboot.

There are several place to protect: Files, Registry keys, Processes/Threads, Memory.

File protection is implemented into the minifilter, with particular rules on the files of the antivirus (No access in deletion, renaming, moving, writing).
Registry protection is made into the registry filter, with access denied for registry keys of the driver and the service.
The drivers threads are protected, because it’s quite impossible to unload kernel module without crashing the system
To be able to protect the service, which is a userland process, 2 solutions:

The easiest would be to add rules for failures in the service manager, and set every failure rule to “restart service”. That way, when the service is not stopped by service manager, it restarts. Of course, the service should not be able to accept commands until the system is not restarting (or stopping).

The second method, which is more generic, would be to set callbacks on process handles with ObRegisterCallbacks.

By setting the ObjectType to PsProcessType and the Operation to OB_OPERATION_HANDLE_CREATE, you receive a Pre and Post operation callback, and you are able to return ACCESS_DENIED into ReturnStatus if the process handle queried has GrantedAccess which have process terminate rights (or process write rights, or anything that can lead to a crash/kill), and of course if the process is one the antivirus needs to guard (its service for example).

Of course, one also needs to guard Duplicate handle and the PsThreadType to avoid any termination method that requires to grab a handle on the process or a thread. Here’s a little example of usage of that callback.

NTSTATUS ObRegisterCallbacks(
  _In_   POB_CALLBACK_REGISTRATION CallBackRegistration,
  _Out_  PVOID *RegistrationHandle
);

typedef struct _OB_CALLBACK_REGISTRATION {
  USHORT                    Version;
  USHORT                    OperationRegistrationCount;
  UNICODE_STRING            Altitude;
  PVOID                     RegistrationContext;
  OB_OPERATION_REGISTRATION *OperationRegistration;
} OB_CALLBACK_REGISTRATION, *POB_CALLBACK_REGISTRATION;

typedef struct _OB_OPERATION_REGISTRATION {
  POBJECT_TYPE                *ObjectType;
  OB_OPERATION                Operations;
  POB_PRE_OPERATION_CALLBACK  PreOperation;
  POB_POST_OPERATION_CALLBACK PostOperation;
} OB_OPERATION_REGISTRATION, *POB_OPERATION_REGISTRATION;

VOID ObjectPostCallback(
  _In_  PVOID RegistrationContext,
  _In_  POB_POST_OPERATION_INFORMATION OperationInformation
);

typedef struct _OB_POST_OPERATION_INFORMATION {
  OB_OPERATION                  Operation;
  union {
    ULONG  Flags;
    struct {
      ULONG KernelHandle  :1;
      ULONG Reserved  :31;
    };
  };
  PVOID                         Object;
  POBJECT_TYPE                  ObjectType;
  PVOID                         CallContext;
  NTSTATUS                      ReturnStatus;
  POB_POST_OPERATION_PARAMETERS Parameters;
} OB_POST_OPERATION_INFORMATION, *POB_POST_OPERATION_INFORMATION;

typedef union _OB_POST_OPERATION_PARAMETERS {
  OB_POST_CREATE_HANDLE_INFORMATION    CreateHandleInformation;
  OB_POST_DUPLICATE_HANDLE_INFORMATION DuplicateHandleInformation;
} OB_POST_OPERATION_PARAMETERS, *POB_POST_OPERATION_PARAMETERS;

typedef struct _OB_POST_CREATE_HANDLE_INFORMATION {
  ACCESS_MASK GrantedAccess;
} OB_POST_CREATE_HANDLE_INFORMATION, *POB_POST_CREATE_HANDLE_INFORMATION;

GUI (Graphical User Interface)

This is the visible part of the iceberg. In my opinion, one (maybe THE) most important part if you want to sell your product. Users love what is beautiful, easy to use, intuitive. Even if it’s not 100% efficient. The GUI must be sexy.

The GUI is only an empty shell, it does only graphical treatments, and sends/receive commands to the core (the service). It also displays progress bars, what is being analysed, provides configuration, and so… Here’s the Avast UI. Sexy, right?

Architecture

The global architecture could be something that look like this:

GUI : No administrative rights, WEAK
Guard DLL(s) : web browser protection, MEDIUM
Service : Admin rights. Serves as a gateway to kernel code and take decisions along with some database, STRONG
Driver(s) : Kernel filters, STRONG

The GUI doesn’t need any administrative right, it only takes user actions and transmits them to the service. It also displays product status. Nothing more, this is not its aim. If the GUI is killed, this is not a problem as the service should be able to restart it.

The guard DLLs (if any), are massively injected into all processes, and should be able to look for IAT hooks and/or malicious threads. They should be quite hard to unload or defeat. They are not critical but important.

The service is the core of the product. It should be unkillable, or at least should be able to self-restart on kill. The service is responsible for communication between all modules of the product, it sends commands to drivers, takes commands from user, and queries the database for sample analysis. This is the brain.

The kernel drivers are also critical. They are the tentacles that gather information on everything that happen on the system, and transmit them to the service for decision. They are also able to deny access to guarded places, based on service’s decision.

Conclusion

Making a strong, reliable and stable antivirus engine is a complicated task, that needs experimented people with a very strong knowledge in windows kernel programming, windows application programming, GUI design, software architecture, malware analysis, …

Building stable drivers is also a complicated task, because a little grain of sand can crash the whole system. It needs testing, testing, and lot of testing. When your engine is done and working, you’ll just need to hire researchers to analyse malwares and add signatures to your database What are you waiting for? Go on!