The malware analyst’s guide to aPLib decompression
aPLib is a compression library that is very easy to use and integrate with C/C++ projects. It is a pure LZ-based compression library. There is also an executable packer based on it called aPACK. Due to its ease of use and tiny footprint, it’s a very popular library utilized by many malware families like ISFB/Ursnif, Rovnix, and many more. Knowledge about aPLib detection and aPLib decompression is crucial for every malware analyst.
This blog post dives into many internals of aPLib, explains how to detect aPLib compression with your bare eyes as well as with YARA, and finally shows you how to decompress aPLib compressed blobs with several tools.
aPLib is a library implementing the compression algorithm found in the executable compressor aPACK. It is a pure LZ compression implementation. The library aPLib is known for its fast decompression speed and the tiny footprint of its decompression code. As of writing, the current version is v1.1.1.
The objective of this section is not to explain how the actual compression algorithm at aPLib’s core works, but rather to give a quick overview of the library itself (project structure, relevant functions, relevant structs).
Since you’ll likely encounter aPLib decompression during malware analysis, I’ll focus on the decompression portion of aPLib in the following. Nevertheless, this blog post should also be valuable for those who look into aPLib compression.
The aPLib library
The library can be found at the author’s website. It comprises a lot of valuable information. First and foremost, there is the source code for decompression (but not compression!):
└── src
├── 32bit
│ ├── crc32.asm
│ ├── depack.asm
│ ├── depackf.asm
│ ├── depacks.asm
│ ├── scheck.asm
│ ├── sdepack.asm
│ ├── sgetsize.asm
│ └── spack.asm
├── 64bit
│ ├── crc32.asm
│ ├── depack.asm
│ ├── depackf.asm
│ ├── depacks.asm
│ ├── scheck.asm
│ ├── sdepack.asm
│ ├── sgetsize.asm
│ └── spack.asm
├── depack.c
├── depack.h
├── depacks.c
└── depacks.h
Second, there is some additional user documentation (html and chm). Third, there are libraries to statically or dynamically link against. Several platforms are supported, including Windows:
├── lib
│ ├── dll
│ │ ├── aplib.dll
│ │ ├── aplib.h
│ │ └── aplib.lib
│ ├── dll64
│ │ ├── aplib.dll
│ │ ├── aplib.h
│ │ └── aplib.lib
The library offers several decompression functions that fall into three classes:
- aP_depack, aP_depack_asm, and aP_depack_asm_fast assume valid input data and may crash if the input is invalid
- aP_depack_safe and aP_depack_asm_safe catch decompression errors and do not crash on invalid input data
- aPsafe_depack is a wrapper function that expects a header in front of the compressed data
At their core, they all utilize the LZ-based compression algorithm. In the following sections, I’ll have a look at each of the three classes.
aP_depack*
The decompression function aP_depack decompresses a compressed binary blob. The counterpart of aP_depack is the function aP_pack.
The following function signature is defined in depack.h:
unsigned int aP_depack(const void *source, void *destination);
The functions of this class assume that the compressed data is valid. This saves some sanity checks, which in turn results in faster decompression and a smaller footprint of the decompression code. However, it is likely that they crash if an error is encountered.
aP_depack_safe*
The functions of this class solely add additional sanity checks. If they encounter an error condition, they return APLIB_ERROR (defined as 0xFFFFFFFF). Furthermore, they require the length of the input (compressed data) and the size of the output (decompressed data), as seen in the signature of the aP_depack_safe function:
unsigned int aP_depack_safe(const void *source,
unsigned int srclen,
void *destination,
unsigned int dstlen);
Otherwise, they are equivalent to the functions from the aP_depack* class.
aPsafe_depack
The decompression function aPsafe_depack decompresses a compressed binary blob safely. It is a wrapper around the functions of the aP_depack_safe* class. The counterpart of aPsafe_depack is the function aPsafe_pack.
The function signature of aPsafe_depack resembles the signature of aP_depack_safe:
unsigned int aPsafe_depack(const void *source,
unsigned int srclen,
void *destination,
unsigned int dstlen);
aPsafe_depack requires that the compressed blob starts with a header. This header carries additional information about the blob, which is very useful, for example, if we want to send an aPLib compressed blob over the network. The header looks like the following struct:
struct aPLib_header {
    DWORD tag;
    DWORD header_size;
    DWORD packed_size;
    DWORD packed_crc;
    DWORD orig_size;
    DWORD orig_crc;
};
The struct aPLib_header has a size of 24 bytes. This holds on x86 and x64 systems. In addition, several checks in the library ensure that header_size is at least 24 bytes. The following screenshot shows a PE executable packed with appack:
We can see the magic AP32 (0x32335041), directly followed by the header size of 0x18 / 24 bytes. The next four DWORDs are the packed_size, packed_crc, orig_size, and orig_crc. After the header comes the payload, which starts in this case with M8Z since we compressed a PE executable. We will use this fact later on for detection.
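The header layout described above can be parsed with a few lines of Python. This is a minimal sketch, not code from aPLib: the names AplibHeader, parse_aplib_header, and packed_crc_matches are made up for illustration, and the CRC check assumes that packed_crc is a standard CRC-32 over the packed payload (as computed by zlib.crc32), which you should verify against your samples.

```python
import struct
import zlib
from typing import NamedTuple

APLIB_MAGIC = 0x32335041  # "AP32" read as a little-endian DWORD
HEADER_FORMAT = "<6I"     # six little-endian DWORDs, 24 bytes in total


class AplibHeader(NamedTuple):
    tag: int
    header_size: int
    packed_size: int
    packed_crc: int
    orig_size: int
    orig_crc: int


def parse_aplib_header(blob: bytes) -> AplibHeader:
    """Parse the header at the start of blob or raise ValueError."""
    if len(blob) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("blob is too short for an aPLib header")
    header = AplibHeader(*struct.unpack_from(HEADER_FORMAT, blob, 0))
    if header.tag != APLIB_MAGIC:
        raise ValueError("missing AP32 magic")
    if header.header_size < 24:
        raise ValueError("header_size is smaller than 24 bytes")
    return header


def packed_crc_matches(blob: bytes, header: AplibHeader) -> bool:
    """Assumption: packed_crc is a standard CRC-32 over the packed payload."""
    start = header.header_size
    payload = blob[start:start + header.packed_size]
    if len(payload) != header.packed_size:
        return False  # the blob is truncated
    return zlib.crc32(payload) == header.packed_crc
```

A failed CRC check does not necessarily mean that the blob is not aPLib compressed, since malware authors are free to leave these fields empty or to fill them with garbage.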
Detect if a binary statically links against aPLib
Detecting whether a binary is capable of aPLib compression/decompression is straightforward if it dynamically links against aPLib and has a valid IAT (Import Address Table). Custom or malicious binaries, however, typically link against aPLib statically.
Still, there are many ways to detect this. First, we’ve already seen constants like AP32 (0x32335041), which we could leverage to detect aPLib. There are also several strings that refer to aPLib itself or its author (Jørgen Ibsen), as seen in the following screenshot:
These ASCII strings are:
- “aPLib v1.1.1 – the smaller the better :)”
- “Copyright (c) 1998-2014 Joergen Ibsen, All Rights Reserved.”
- More information: http://www.ibsensoftware.com/
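A quick way to check a sample for these strings and the AP32 constant is a plain byte search. The following is only a sketch: the function name find_aplib_indicators and the shortened needles are my own choice, picked so that they match regardless of the exact version number or punctuation in the banner.

```python
from typing import Dict

# Shortened needles taken from the strings and the constant mentioned above.
APLIB_INDICATORS = {
    "magic": b"AP32",
    "banner": b"aPLib v",
    "author": b"Joergen Ibsen",
    "url": b"ibsensoftware.com",
}


def find_aplib_indicators(data: bytes) -> Dict[str, int]:
    """Map each indicator name to the first offset where it occurs in data."""
    hits = {}
    for name, needle in APLIB_INDICATORS.items():
        offset = data.find(needle)
        if offset != -1:
            hits[name] = offset
    return hits
```

An empty result only means that none of these plain-text indicators is present, for example because the author of the binary removed them.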
Another way would be detection via matching the assembly code. Tools like mkYARA can help you to generate (strict/relaxed) YARA rules for assembly functions/algorithms.
However, at this point, I do not want to reinvent the wheel and I just refer to one of the freely available YARA rules like the one from “_pusher_”:
rule aPLib : Jorgen Ibsen
{
meta:
author="_pusher_"
date="2016-09"
description="www.ibsensoftware.com/products_aPLib.html"
strings:
$a0 = { 60 8B 74 24 24 8B 7C 24 28 8B 44 24 2C FC 33 DB B2 80 39 18 74 42 A4 B3 02 E8 6D 00 00 00 73 F6 33 C9 E8 64 00 00 00 73 }
$a1 = { 60 8B 74 24 24 8B 7C 24 28 FC 33 DB 33 D2 A4 B3 02 E8 6D 00 00 00 73 F6 33 C9 E8 64 00 00 00 73 1C 33 C0 E8 5B 00 00 00 }
$a3 = { B2 80 33 DB A4 B3 02 E8 6D 00 00 00 73 F6 33 C9 E8 64 00 00 00 73 1C 33 C0 E8 5B 00 00 00 73 23 B3 02 41 B0 10 E8 4F 00 00 00 12 C0 73 F7 75 3F AA EB D4 E8 4D 00 00 00 2B CB 75 10 E8 42 00 00 00 EB 28 AC D1 E8 74 4D 13 C9 EB 1C 91 48 C1 E0 08 AC E8 2C 00 00 00 3D 00 7D 00 00 }
$a4 = { 61 94 55 B6 80 A4 FF 13 73 F9 33 C9 FF 13 73 16 33 C0 FF 13 73 1F B6 80 41 B0 10 FF 13 12 C0 73 FA 75 3A AA EB E0 FF 53 08 02 F6 83 D9 01 75 0E FF 53 04 EB 24 AC D1 E8 74 2D 13 C9 EB 18 91 48 C1 E0 08 AC FF 53 04 3B 43 F8 73 0A 80 FC 05 73 06 83 F8 7F 77 02 41 41 95 8B C5 }
$a5 = { B2 80 A4 B6 80 FF 13 73 F9 33 C9 FF 13 73 16 33 C0 FF 13 73 1F B6 80 41 B0 10 FF 13 12 C0 73 FA 75 3C AA EB E0 FF 53 08 02 F6 83 D9 01 75 0E FF 53 04 EB 26 AC D1 E8 74 2F 13 C9 EB 1A 91 48 C1 E0 08 AC FF 53 04 3D 00 7D 00 00 73 0A 80 FC 05 73 06 83 F8 7F 77 02 }
$a6 = { B2 80 31 DB A4 B3 02 E8 6D 00 00 00 73 F6 31 C9 E8 64 00 00 00 73 1C 31 C0 E8 5B 00 00 00 73 23 B3 02 41 B0 10 E8 4F 00 00 00 10 C0 73 F7 75 3F AA EB D4 E8 4D 00 00 00 29 D9 75 10 E8 42 00 00 00 EB 28 AC D1 E8 74 ?? 11 C9 EB 1C 91 48 C1 E0 08 AC E8 2C 00 00 00 3D 00 7D 00 00 73 0A 80 FC 05 73 06 83 F8 7F 77 02 }
$a7 = { 33 C9 FF D3 73 16 33 C0 FF D3 73 23 B6 80 41 B0 10 FF D3 12 C0 73 FA 75 42 AA EB E0 E8 46 00 00 00 02 F6 83 D9 01 75 10 E8 38 00 00 00 EB 28 AC D1 E8 74 48 13 C9 EB 1C 91 48 C1 E0 08 AC E8 22 00 00 00 3D 00 7D 00 00 73 0A 80 FC 05 73 06 83 F8 7F 77 02 41 41 95 }
$a8 = { 33 C9 FF 14 24 73 18 33 C0 FF 14 24 73 21 B3 02 41 B0 10 FF 14 24 12 C0 73 F9 75 3F AA EB DC E8 43 00 00 00 2B CB 75 10 E8 38 00 00 00 EB 28 AC D1 E8 74 41 13 C9 EB 1C 91 48 C1 E0 08 AC E8 22 00 00 00 3D 00 7D 00 00 73 0A 80 FC 05 73 06 83 F8 7F 77 02 41 41 95 }
$a9 = { 33 C0 FF 13 73 1F B6 80 41 B0 10 FF 13 12 C0 73 FA 75 3A AA EB E0 FF 53 08 02 F6 83 D9 01 75 0E FF 53 04 EB 24 AC D1 E8 74 2D 13 C9 EB 18 91 48 C1 E0 08 AC FF 53 04 3B 43 F8 73 0A 80 FC 05 73 06 83 F8 7F 77 02 41 41 95 }
$a10 = { 60 8B 74 24 24 8B 7C 24 28 FC B2 80 33 DB A4 B3 02 E8 6D 00 00 00 73 F6 33 C9 E8 64 00 00 00 73 1C 33 C0 E8 5B 00 00 00 73 23 B3 02 41 B0 10 E8 4F 00 00 00 12 C0 73 F7 75 3F AA EB D4 E8 4D 00 00 00 2B CB 75 10 E8 42 00 00 00 EB 28 AC D1 E8 74 4D 13 C9 EB 1C 91 48 C1 E0 08 AC E8 2C 00 00 00 3D 00 7D 00 00 73 0A 80 FC 05 73 06 83 F8 7F 77 02 41 41 95 }
//taken from r!sc aspr unpacker,
$a11 = { B2 80 8A 06 46 88 07 47 02 D2 75 05 8A 16 46 12 D2 73 EF 02 D2 75 05 8A 16 46 12 D2 73 4A 33 C0 02 D2 75 05 8A 16 46 12 D2 0F 83 D6 00 00 00 02 D2 75 05 8A 16 46 12 D2 13 C0 02 D2 75 05 8A 16 46 12 D2 13 C0 02 D2 75 05 8A 16 46 12 D2 13 C0 02 D2 75 05 8A 16 46 12 D2 13 C0 74 06 57 2B F8 8A 07 5F 88 07 47 EB A0 B8 01 00 00 00 02 D2 75 05 8A 16 46 12 D2 13 C0 02 D2 75 05 8A 16 46 12 D2 72 EA 83 E8 02 75 28 B9 01 00 00 00 02 D2 75 05 8A 16 46 12 D2 13 C9 02 D2 75 05 8A 16 46 12 D2 72 EA 56 8B F7 2B F5 F3 A4 5E E9 58 FF FF FF 48 C1 E0 08 8A 06 46 8B E8 B9 01 00 00 00 02 D2 75 05 8A 16 46 12 D2 13 C9 02 D2 75 05 8A 16 46 12 D2 72 EA 3D 00 7D 00 00 73 1A 3D 00 05 00 00 72 0E 41 56 8B F7 2B F0 F3 A4 5E E9 18 FF FF FF 83 F8 7F 77 03 83 C1 02 56 8B F7 2B F0 F3 A4 5E E9 03 FF FF FF 8A 06 46 33 C9 D0 E8 74 12 83 D1 02 8B E8 56 8B F7 2B F0 F3 A4 5E E9 E8 FE FF FF 5D 2B 7D 0C 89 7D FC 61 }
condition:
any of them
}
Nevertheless, it is still possible that all strings are overwritten or that constants like AP32 are changed or dynamically computed. Just remember that the absence of a match does not rule out aPLib usage completely, but it makes it very unlikely.
Detect aPLib compression with your bare eyes and YARA
The following three sections show you how to detect aPLib compression with your bare eyes and suggest several YARA rules to automate detection.
aPLib header
If the compressed blob is safely packed, then it is quite easy to find it within larger blobs. All we need to do is look for the aPLib magic AP32 and the default header size of 0x18:
This boils down to searching for the byte sequence 0x4150333218000000. We can write a quick and dirty YARA signature:
rule aplib_compressed_blob_with_header {
meta:
author = "Thomas Barabosch"
version = "20200109"
description = "Detects aPLib compressed blobs that comprise an aPLib header."
strings:
$aplib_compressed_with_header = { 41 50 33 32 18 00 00 00 }
condition:
$aplib_compressed_with_header
}
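If you want to work with the matches in a script, the same search can be sketched in Python. The function find_aplib_headers below is a hypothetical helper, not part of any library. On top of the byte sequence from the rule, it adds one plausibility check of my own: the packed_size field must be non-zero and the payload must fit into the rest of the blob.

```python
import struct
from typing import List

# "AP32" followed by header_size == 0x18, as in the YARA rule above.
HEADER_PREFIX = bytes.fromhex("4150333218000000")
HEADER_SIZE = 24


def find_aplib_headers(blob: bytes) -> List[int]:
    """Return offsets of header candidates whose packed payload fits in blob."""
    offsets = []
    pos = blob.find(HEADER_PREFIX)
    while pos != -1:
        if pos + HEADER_SIZE <= len(blob):
            # packed_size is the third DWORD of the header.
            (packed_size,) = struct.unpack_from("<I", blob, pos + 8)
            if 0 < packed_size <= len(blob) - pos - HEADER_SIZE:
                offsets.append(pos)
        pos = blob.find(HEADER_PREFIX, pos + 1)
    return offsets
```

The size check filters out random occurrences of the magic, but it also drops truncated dumps, so you may want to relax it when you carve from partial memory dumps.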
Compressed PE executables without aPLib header
However, as a malware analyst, you will very frequently stumble upon aPLib compressed blobs that do not have an aPLib header. The good news is that the trained eye can easily spot aPLib compressed PE files. These blobs do not start with the classic MZ magic but with M8Z:
Once you know that M8Z means aPLib compression, it makes sense to write a small YARA signature to detect this in the future. Here I assume that we dumped a blob from, for instance, the heap, and the magic is at the beginning of the file:
rule aplib_compressed_pe {
meta:
author = "Thomas Barabosch"
version = "20201226"
description = "Detects aPLib compressed PE files, e.g. from memory dumps."
strings:
$mz_compressed = "M8Z"
condition:
$mz_compressed at 0
}
Compressed ELF executables without aPLib header
Furthermore, I compressed a couple of ELF x64 files with appack. I wanted to see if there is also a pattern that gives aPLib compression away. If we byte-compare the output of two compressed files, then we will see the following:
Both compressed blobs start with the same AP32 magic and the default header size of 0x18. Of course, the next four DWORDs (packed_size, packed_crc, orig_size, and orig_crc) are completely different. But then there are 10 bytes that are equal: 0x7F07454C4602011E1501. These bytes include the ELF magic ELF, which is not at the beginning but does not get disfigured like the PE magic MZ to M8Z. Right now, I am not sure if this is a consistent behavior across all ELF files or just for ELF x64 files.
Again, we can write a YARA rule for this:
rule aplib_compressed_elf_executable {
meta:
author = "Thomas Barabosch"
version = "20200109"
description = "Detects aPLib compressed ELF executables. Note that there may be an aPLib header starting 24 bytes BEFORE the match!"
strings:
$aplib_compressed_elf = { 7F 07 45 4C 46 02 01 1E }
condition:
$aplib_compressed_elf
}
The YARA rules that I’ve presented here leave room for improvement. For example, we can check for cases where we have an aPLib header followed by a compressed PE executable, and so on. Be creative and let me know what you’ve found out!
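As a starting point for such a combination, here is a small Python sketch that checks for an aPLib header followed by a compressed PE or ELF payload. The function classify_aplib_blob and its labels are hypothetical. It only reuses the byte patterns from the rules above and assumes the default header size of 24 bytes.

```python
from typing import Tuple

HEADER_PREFIX = bytes.fromhex("4150333218000000")  # AP32 + header size 0x18
PE_PREFIX = b"M8Z"                                 # compressed MZ magic
ELF_PREFIX = bytes.fromhex("7f07454c4602011e")     # seen for ELF x64 files


def classify_aplib_blob(blob: bytes) -> Tuple[bool, str]:
    """Return (has_header, kind) with kind being 'pe', 'elf' or 'unknown'."""
    has_header = blob.startswith(HEADER_PREFIX)
    # Skip the 24-byte default header before looking at the payload.
    payload = blob[24:] if has_header else blob
    if payload.startswith(PE_PREFIX):
        kind = "pe"
    elif payload.startswith(ELF_PREFIX):
        kind = "elf"
    else:
        kind = "unknown"
    return has_header, kind
```

Note that a result of (False, 'unknown') says nothing about whether the blob is aPLib compressed; it only means that none of the patterns discussed here matched at the start of the blob.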
aPLib decompression
By now, we’ve learned a lot about aPLib compression, how to spot it with our bare eyes, and how to detect it with YARA. But there is one final piece missing: we need to talk about aPLib decompression. The following sections show you how to achieve aPLib decompression with three different tools.
aPLib decompression with appack
Before looking at more complex scenarios, we can always resort to the tools that come with aPLib. The library comes with an example tool called appack. The source of this tool is stored under examples/appack.c and there are several makefiles for various platforms. appack offers two commands, c and d:
appack, aPLib compression library example
Copyright 1998-2014 Joergen Ibsen (www.ibsensoftware.com)
Syntax:
Compress : appack c <file> <packed_file>
Decompress : appack d <packed_file> <depacked_file>
For instance, we can decompress an aPLib compressed blob with the d command, as the following snippet illustrates:
> appack d bin_ls.bin bin_ls
appack, aPLib compression library example
Copyright 1998-2014 Joergen Ibsen (www.ibsensoftware.com)
Decompressed 66101 -> 151352 bytes in 0.01 seconds
That’s it, pretty simple. But this is not suitable for more complex scenarios, e.g. automation. Here, we have two options. First, we can use the aPLib library itself and write C programs. Second, we can automate aPLib decompression using Python.
aPLib decompression with malduck
Lately, I utilize Malduck a lot. So, let’s see how we can decompress an aPLib compressed blob with it. We can decompress these files with the following script based on Malduck:
import sys

import malduck


def main(argv):
    if len(argv) != 2:
        print('Usage: aplib.py PATH_TO_APLIB_COMPRESSED_BUFFER')
        return

    # Read the whole aPLib compressed buffer from disk.
    with open(argv[1], 'rb') as f:
        data = f.read()

    try:
        res = malduck.aplib(data)
        if res:
            # Store the decompressed data next to the input file.
            with open(argv[1] + '_aplib_decompressed', 'wb') as g:
                g.write(res)
        else:
            print('Malduck did not decompress the buffer.')
    except Exception as e:
        print(f'Could not aplib decompress: {e}')


if __name__ == '__main__':
    main(sys.argv)
The function malduck.aplib may take a flag called headerless. This flag forces headerless decompression and does not check for the AP32 magic. It defaults to True.
Even though this script seems trivial, it is a perfect skeleton for more complex tasks that revolve around aPLib compression. For instance, you may have to write a script that extracts the malware configuration of a specific family, which happens to use aPLib as part of the way it stores its configuration.
aPLib decompression with aplib-ripper
Another great tool is aplib-ripper by herrcore. It rips one or several aPLib compressed PE executables from a blob. It searches for the string M8Z, forces a headerless decompression, and finally verifies and trims the output with pefile.
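The general workflow (search for M8Z, decompress from there, check the result) can be sketched as follows. This is not aplib-ripper's code: rip_compressed_pes is a hypothetical helper, the decompressor is passed in as a callable so that the sketch does not depend on a specific library, and a simple check for the MZ magic stands in for the verification and trimming that aplib-ripper does with pefile.

```python
from typing import Callable, Iterator, Optional, Tuple

Decompressor = Callable[[bytes], Optional[bytes]]


def rip_compressed_pes(blob: bytes,
                       decompress: Decompressor) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, data) for each M8Z candidate that unpacks to an MZ image."""
    pos = blob.find(b"M8Z")
    while pos != -1:
        try:
            data = decompress(blob[pos:])
        except Exception:
            data = None  # not a valid stream at this offset, keep scanning
        if data and data[:2] == b"MZ":
            yield pos, data
        pos = blob.find(b"M8Z", pos + 1)
```

With Malduck installed, you could pass something like lambda buf: malduck.aplib(buf, headerless=True) as the decompress argument. Since the output is not trimmed here, a proper PE parser is still needed to cut off trailing data.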