Spring20Cs361sLab2
| Assigned | 2/5/2020 |
| Due | 2/19/2020 |
| Points | 100 |
The goal of this assignment is to gain hands-on experience with the effect of buffer overflow and format string bugs. Your work in this project will be done in an emulated 32-bit x86 machine. In our experiments, we will run a recent release of NetBSD, one of a family of operating systems based on BSD Unix. We have supplied a virtual machine image of a NetBSD/x86 machine, in OVA format, that you can run using VMware Workstation Player or VirtualBox. (Note that your computer probably has an x86 chip, too, but it runs in 64-bit mode, whereas the VM ensures that this project will run in 32-bit mode.)
Code is provided in the class github repository under 2020_cs316s/labs/lab2. It includes the source code for two exploitable programs, target1.c and target2.c. These programs are to be compiled and run on your x86 VM. Your goal is to write two exploit programs (sploit1 and sploit2). Program sploit[i] will send a command to program target[i] that exploits a vulnerability in the target program and causes it to execute shellcode.
We have supplied you with shellcode (in 2020_cs316s/labs/lab2/shellcode.h) that will spawn a shell and connect to localhost on port 6666. If you have a program listening for connections on this port, for example by running nc -l -p 6666 on the x86 VM, and if your exploit causes the target program to execute the supplied shellcode, then you should see a connection to nc, and you should be able to interact with the shell through nc. (Unfortunately, the shell won't print a prompt, because it can tell that it isn't connected directly to a terminal.) Note that you'll need to install netcat first, along with any other tool you need, using pkgin as described below.
The skeletons for sploit1 and sploit2 are provided as sploit1.c and sploit2.c. Note that correct solutions for the exploit programs can be very short, so there is no need to write a lot of code here.
WARNING! Although we encourage you to read the paper on buffer overflows as helpful background for this assignment, the buffer overflow necessary here cannot be done exactly as described in the paper. As you may remember from class, it is common to make the STACK NOT EXECUTABLE, and that is the case here. If you try to execute code at an arbitrary location, the target will segfault. The target code (both target1 and target2) allocates space for a buffer that is EXPLICITLY MARKED AS EXECUTABLE, so your overwritten return address must jump into this memory. This is important, so please make sure that you understand it. This is described in a bit more detail in the Hints section below.
You may work together in pairs on this assignment. However, there are a number of students that have created buffer overflows before and many that have not. If you have written a buffer overflow before, and are working in a pair, you must make an attempt to work with a student that has not done this type of thing before. The staff reserve the right to break up and assign pairs to even out experience levels in the class.
You will test your exploit programs within a virtual machine designed to run on an emulated 32-bit x86 machine. This VM is available for download on the course website. You will need a virtual machine monitor capable of importing and running VMs in OVA format. VMware Workstation Player and VirtualBox should both work.
The x86 virtual machine we provide is configured with NetBSD 7.2. You should be able to use NetBSD’s pkgin facility (or ports, or packages) to add software to the virtual machine, should you need to. For example, to install netcat, which provides the nc tool mentioned above, run the following command as root:
pkgin install netcat
Other packages you may wish to install include curl, rsync, nano, and git. BSD-derived operating systems like NetBSD have the vi editor already installed; we recommend you try the system vi before installing vim with pkgin.
The x86 virtual machine is configured to use NAT (Network Address Translation) for networking. You should be able to make outgoing connections from the virtual machine to any server you control. In addition, the x86 virtual machine runs sshd. This sshd should either be accessible from the host using the IP address of the VM (if you are running it under VMware) or you can set up forwarding (if you are running it under VirtualBox). Connecting to the VM from the host machine using ssh is likely to be more pleasant than working at the 25-line virtual console. (Though you do have multiple virtual consoles and can switch between them using Alt-Ctrl-F1, Alt-Ctrl-F2, etc.)
The download tarball includes a Makefile that specifies how to build the targets.
The Makefile also provides a "make pipes" command that will create named pipes /tmp/t1pipe and /tmp/t2pipe. The targets will wait to read a command from the pipe. Your exploit will write a command to the pipe, which will be received and processed by the target. (It is a property of named pipes, also called FIFOs, that a reader will block until a writer arrives, and a writer will block until a reader arrives.) You may need to rerun make pipes each time you restart the x86 virtual machine.
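As a rough illustration of the writer side of a named pipe, here is a minimal C sketch. The helper names `write_all` and `send_command` are hypothetical, and the provided sploit skeletons may already contain equivalent code, so treat this only as a picture of the mechanism: `open()` on a FIFO blocks until a reader is present, and `write()` may need to be retried if it writes fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const unsigned char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

/* Open the FIFO at path for writing and send one command.
 * open() blocks here until some process has the FIFO open for reading. */
int send_command(const char *path, const unsigned char *cmd, size_t len)
{
    assert(path != NULL && cmd != NULL);
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, cmd, len);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

A caller would pass the pipe path (for example /tmp/t1pipe) and a byte buffer with an explicit length, since the command may contain bytes that a string function would treat as a terminator.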
You should not run the targets directly. Instead, run them through the run-target wrapper. The run-target wrapper will ask for two arguments. The first is either 1 to run target1 or 2 to run target2. The second is a four-digit ID based on your Texas EID. If your EID is more than four digits, your four-digit ID is its last four digits. Otherwise, it is your EID padded on the left with as many 0's as necessary to make it four digits. Always use the same, correct ID; this ensures that addresses for stack variables remain consistent each time you examine your target, making your exploits reliable and repeatable. If you work with a partner, you may use either of your two four-digit IDs.
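The ID rule can be written out as a small C sketch. The function name `four_digit_id` is hypothetical, and the sketch assumes the rule applies to the EID as a character string whose last four characters are digits; check your own EID by hand rather than relying on this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Derive the four-character ID from an EID string:
 *   longer than four characters -> keep the last four;
 *   otherwise                   -> pad on the left with '0' up to four.
 * out must have room for 5 bytes (4 characters plus the terminator). */
void four_digit_id(const char *eid, char out[5])
{
    assert(eid != NULL && out != NULL);
    size_t len = strlen(eid);
    if (len > 4) {
        memcpy(out, eid + len - 4, 4);
    } else {
        size_t pad = 4 - len;
        memset(out, '0', pad);
        memcpy(out + pad, eid, len);
    }
    out[4] = '\0';
}
```

For instance, under these assumptions an EID of abc12345 gives 2345, and an EID of 42 gives 0042.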
The assignment tarball also contains skeleton source for the exploits which you are to write, along with a Makefile for building them. Also included is shellcode.h, which gives shellcode for NetBSD/x86 by minervini.
This shellcode is a “reverse shell” or “connect-back” shellcode: it causes the process that executes it to make a TCP connection to a particular IP address and port, then expose a shell on that connection. If you set up a program to listen for incoming connections on that IP and port, then you can use that program to interact with the shell and give it commands to execute.
The shellcode you are given is hard-coded to connect to the localhost (127.0.0.1) IP address and to port 6666. If you install netcat (using the pkgin tool described above), you can use it to listen on that port: nc -l -p 6666
If you successfully exploit a target, you should be able to type commands like ls and hit RETURN in netcat, then see their output (in the case of ls, a directory listing). You will not see a shell prompt.
Read Aleph One’s “Smashing the Stack for Fun and Profit.” Carefully. Also read the class lecture notes — have a good understanding of what happens to the stack, program counter, and relevant registers before and after a function call. It will be helpful to have a solid understanding of the basic buffer overflow exploits before reading the more advanced exploits.
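To get a feel for what a stack frame holds, a small C sketch like the one below can be compiled and stepped through in gdb. It is not part of the assignment code. The names `frame_view`, `inspect_frame` and `print_frame` are hypothetical, and `__builtin_frame_address` and `__builtin_return_address` are GCC/Clang extensions rather than standard C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Addresses observed inside one call to inspect_frame(). */
struct frame_view {
    uintptr_t local;  /* address of a local buffer in the callee */
    uintptr_t frame;  /* callee's frame address (the ebp value on 32-bit x86) */
    uintptr_t ret;    /* return address back into the caller */
};

/* noinline keeps this a real call with its own frame. */
__attribute__((noinline))
struct frame_view inspect_frame(void)
{
    volatile char local[16];
    struct frame_view v;
    local[0] = 0;
    v.local = (uintptr_t)&local[0];
    v.frame = (uintptr_t)__builtin_frame_address(0);
    v.ret   = (uintptr_t)__builtin_return_address(0);
    return v;
}

/* Print the three addresses so they can be compared with gdb's view. */
void print_frame(const struct frame_view *v)
{
    assert(v != NULL);
    printf("local buffer   at %#lx\n", (unsigned long)v->local);
    printf("frame address     %#lx\n", (unsigned long)v->frame);
    printf("return address    %#lx\n", (unsigned long)v->ret);
}
```

Calling `inspect_frame()` from `main` and passing the result to `print_frame()` shows where a local buffer sits relative to the frame address, which is the relationship a buffer overflow relies on. The exact distances depend on the compiler and its flags.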
Please note, however, that you cannot implement Aleph's attack exactly as described. You will be working around a non-executable stack. Only the original command buffer that is read in from your sploit command is marked as executable. You must jump to code within this region. Importantly, exactly where this memory is will depend on your 4-digit code. So please make sure you use the same 4-digit code, or your attack will stop working.
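The general idea of placing a run of no-op instructions in front of the code you want executed can be sketched in C as follows. This is a generic illustration, not a solution: the function name `fill_sled` is hypothetical, the buffer sizes and the position of the return address in the targets are not given on this page, and the bytes used in the usage note are placeholders rather than the contents of shellcode.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define X86_NOP 0x90  /* one-byte x86 no-op instruction */

/* Fill out[0..out_len) with NOPs and place code at the end of the region,
 * so that a jump landing anywhere in the NOP run slides into the code.
 * Returns the offset at which code starts, or (size_t)-1 if it does not fit. */
size_t fill_sled(unsigned char *out, size_t out_len,
                 const unsigned char *code, size_t code_len)
{
    assert(out != NULL && code != NULL);
    if (code_len > out_len)
        return (size_t)-1;
    size_t code_off = out_len - code_len;
    memset(out, X86_NOP, code_off);
    memcpy(out + code_off, code, code_len);
    return code_off;
}
```

With a 16-byte region and 3 placeholder code bytes, the code starts at offset 13 and offsets 0 through 12 hold NOPs. In this lab the region that matters is the executable command buffer, whose address depends on your four-digit ID.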
GDB is your best friend in this assignment, particularly to understand what's going on. Specifically, note the “disassemble” and “nexti” commands. You may find the “x” command useful to examine memory (and the different ways you can print the contents such as /a or /i after x). The "info register" command is helpful in printing out the contents of registers.
A useful way to run gdb is to attach it to the target process after the run-target wrapper has started it.
Start early. Theoretical knowledge of exploits does not readily translate into the ability to write working exploits. The first target is relatively simple and the other problems are quite a bit more complicated.
The format string bug part of the lab is considerably trickier than the buffer overflow and you should start as soon as possible. The paper about the bug has everything you need, but a few critical components are not explained very well. In particular, 90% of the paper is about the "format string" being on the stack. Your format string is not on the stack because it's in the global data structure cmdbuf (globals are not on the stack).
You need to read section 6.4 of the format bug paper entitled, "Format strings within the heap." This applies specifically to this assignment. Unfortunately, this section is very short and scant on details. The key sentence is, "...we do not access those addresses from the format string itself, but from the target buffer." What does this mean?
Let's make sure we understand what's going on.
- We have our RWX memory in `cmdbuf`
- Within the `cmdbuf` is the evil format string we sent
- The usual approach to these attacks is to adjust the stack pointer until it gets back to the format string on the stack
- But the `cmdbuf` (format string) isn't on the stack.
- So instead, we will read from the `buf` that `cmdbuf` is being copied into.
How can we do that last step, you might ask? The answer is in how these format bugs work at all. In Section 2.5 of the paper, it states, "The format function now parses the format string ‘A’, by reading a character a time..." Please note the "a character at a time" part.
When snprintf runs, the data is copied from cmdbuf to buf one character at a time. That means that if you write some addresses into buf at the beginning of the format string, you can read them back later by advancing the stack pointer (that is, the pointer the format function uses to fetch its arguments) until it reaches buf.
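A benign C sketch can make the bookkeeping concrete. Here the format string is a fixed literal with matching arguments, unlike in the target, where the format string comes from your command. The function name `demo_count` is hypothetical. The sketch shows that `%08x` consumes one argument and prints it as 8 hex digits, and that `%n` stores the number of characters produced so far. Some hardened C libraries restrict `%n`, so behavior outside an ordinary libc is an open question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a 4-character marker, one argument as 8 hex digits, and record
 * with %n how many characters had been produced at that point.
 * Returns snprintf's return value; *count receives the %n result. */
int demo_count(char *buf, size_t buf_len, unsigned int value, int *count)
{
    assert(buf != NULL && count != NULL);
    *count = -1;
    return snprintf(buf, buf_len, "AAAA%08x%n", value, count);
}
```

With a 32-byte buffer and the value 0xdeadbeef, the buffer ends up holding AAAAdeadbeef and both the return value and the stored count are 12: four marker characters plus eight hex digits.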
This is probably very confusing to some of you. I highly recommend some or all of the following:
- Read the paper. Read it over and over. If you don't understand it, ask for help on Slack.
- For rapid testing, modify your `target2.c` to not read from the pipe and, instead, hard-code format strings you want to try (switch back to the class-default file when you've got it figured out).
- Try all kinds of format strings to test out different parts of the paper
- Do NOT try a memory read (`%s`) or write (`%n`) without first getting the address. That is, try a format string with `%08x` first, verify that it is the address you want to read from or write to, and then replace the `%08x` with the `%s` or `%n`
- You will need to use both testing/exploratory format strings AND GDB to get a clear picture of what's going on
- Disassembly is especially helpful here for seeing what is going onto the stack and where
- Your nop slide will be more important in this phase because using `%n` to write is less precise for low numbers. For example, if your `cmdbuf` is at an address that ends in `00` you may struggle to jump exactly there. But if your `cmdbuf` is filled with nops followed by the shell code, it won't matter
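The imprecision for low numbers comes from arithmetic that can be sketched in C. The helper below is hypothetical and reflects one common approach from the format string literature, writing only the low byte or low half-word of the running character count. It is not necessarily the approach you will need here. It computes how many padding characters a conversion such as `%Nx` must print so that the count reaches a desired value modulo some power of two. Because `%Nx` never prints fewer characters than the value's own digits (up to 8 for a 32-bit value), a requested padding below that minimum is bumped by a full modulus.

```c
#include <assert.h>

/* Number of characters the next padded conversion should print so that
 * (written + result) % modulus == target % modulus, with result >= min_width.
 * written   : characters output so far
 * target    : desired count value (only its value mod modulus matters)
 * modulus   : 0x100 for a low-byte write, 0x10000 for a low-half-word write
 * min_width : smallest width the conversion can honor reliably (e.g. 8) */
unsigned pad_width(unsigned written, unsigned target,
                   unsigned modulus, unsigned min_width)
{
    assert(modulus > 0);
    unsigned pad = (target % modulus + modulus - written % modulus) % modulus;
    while (pad < min_width)
        pad += modulus;
    return pad;
}
```

For example, with 16 characters already written, a target low byte of 0x40, modulus 0x100 and minimum width 8, the result is 48. With 0x3e characters written the raw difference is only 2, which is below the minimum, so the result becomes 258. A target of 0x00 right at the start cannot be hit with fewer than 256 characters, which is why landing somewhere inside a nop slide is more forgiving than hitting one exact address.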
Aleph One gives code that calculates addresses on the target’s stack based on addresses on the exploit’s stack. Addresses on the exploit’s stack can change based on how the exploit is executed (working directory, arguments, environment, etc.); in our testing, we do not guarantee to execute your exploits as bash does.
You must therefore hard-code target stack locations in your exploits. You should not use a function such as get_sp() in the exploits you hand in.
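A hard-coded address typically appears as a constant that is stored into the command buffer byte by byte. The sketch below assumes a placeholder value: `TARGET_ADDR` is hypothetical and must be replaced with an address you observed in gdb for your own four-digit ID. The helper name `put_le32` is also hypothetical. It stores the value in little-endian order, which is how 32-bit x86 reads a 4-byte address from memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL placeholder, not a real target address: replace it with
 * a value you observed in gdb for your own four-digit ID. */
#define TARGET_ADDR 0x41424344u

/* Store a 32-bit value in little-endian byte order, independent of the
 * byte order of the machine this code runs on. */
void put_le32(unsigned char *dst, uint32_t value)
{
    assert(dst != NULL);
    dst[0] = (unsigned char)(value & 0xff);
    dst[1] = (unsigned char)((value >> 8) & 0xff);
    dst[2] = (unsigned char)((value >> 16) & 0xff);
    dst[3] = (unsigned char)((value >> 24) & 0xff);
}
```

With the placeholder above, the four bytes written are 0x44, 0x43, 0x42, 0x41, in that order. Writing the bytes explicitly avoids any dependence on how the exploit process itself lays out integers.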
Download the Box virtual machine, box.ova (warning: 590 MB!). Import the virtual machine into VMware or VirtualBox and run it.
The virtual machine has two accounts, root and user. The root account's password is "root" (without the quotes); the user account's password is "user" (again without the quotes). You will do most of your work as user, but remember that you can use the pkgin tool to install additional software if you log in as root.
While you can interact with the VM through the console, you will probably have a better experience if you SSH into it from your physical host computer. Expose port 22 on the VM to the host and SSH into it. You can make multiple SSH connections simultaneously.
As a reminder, the class repo has the code you need to get started under 2020_sp316s/labs/lab2.
To shut down the virtual machine, type "poweroff" as root.
Submissions are per-team. To submit:
- Create a subdirectory in your github repository called `labs/lab2`. This should be the same repo that you used in the first lab.
- Add a file called `labs/lab2/ID.txt` that contains either one line (if you worked by yourself) or two lines (if you worked with a partner), each in the following format: ProjID EID FirstName LastName. Here ProjID is the four-digit ID that you were assigned over e-mail. If you worked with a partner, make sure that the ProjID that you used in developing your sploits is on the first line, and the other ProjID is on the second line.
- Add the files sploit1.c and sploit2.c, which should be the only files where you change code, as `labs/lab2/sploit1.c` and `labs/lab2/sploit2.c`
- Tag your commit using git tag lab2-1.0. If you make a change after submission, re-commit and tag with git tag lab2-1.x where x is one greater than the last submission
Your submission will be graded as follows:
- 50 points for the buffer overflow
- 50 points for the format string bug