Analyzing the Pentium stack, part 1
How a Pentium stack works
The purpose of this article is to explain very briefly what a Pentium stack is. To follow it, the reader needs a basic knowledge of how processes manage memory (e.g. segmentation), of assembly, and of C.
For this article I have used the following software and hardware:
- GNU gdb 6.1
- GNU gcc 3.4.6 (for the series of 2.4.x)
- Suse 9.1 (with a patched kernel)
- Perl 5.8
- ANSI C
- Aspire 2001WLCI (with Intel Centrino Mobile Technology)
- For command prompt input (keyboard typing): grey color is used
- For command prompt output (stdout): italic font is used
- For code: purple and light blue colors are used
- Use a root terminal (super user) for your convenience
- The system I used does not have exec-shield technology (used in later builds such as Fedora ….)
Stack Structure
A stack is a first in, last out (the well known FILO) data structure in memory, used to allocate working space for user and operating system processes. To manipulate the stack you need to know three registers (this is a very simplistic approach):
- EIP (Extended Instruction pointer)
- ESP (Extended Stack pointer)
- EBP (Extended Base pointer)
EBP marks the base of the current stack frame, in other words where the frame begins. It stays fixed while a function runs: once the function prologue has set it, it does not change until the function returns. Because the saved EBP and the return address sit at predictable offsets from EBP, the stack is a convenient target for buffer overflows (heaps, for example, are much more difficult to corrupt). ESP is not fixed: its value changes over time and it always points to the top of the stack, the end from which the stack grows and shrinks. The most important of the three is EIP, because it holds the address of the next instruction to execute. When a function is called, the current EIP is saved on the stack as the return address. If someone can locate and overwrite that saved EIP, they get the opportunity to execute their own code.
Stack structure (lower to higher addresses): esp <----> ebp <----> saved eip
What a stack looks like in assembly
Initially we will compile and run simple programs in C, use gdb to disassemble the main function, and see how the main function looks in assembly
Step 1: Write and compile a simple main function:
int main(){;} //This will allocate a stack to run our main
Step 2: Use gcc with -mpreferred-stack-boundary=2 to keep the stack aligned on dword boundaries (2^2 = 4 bytes, instead of the default 2^4 = 16 bytes), which makes the generated code simpler
# gcc -mpreferred-stack-boundary=2 -ggdb t.c -o t
Step 3: Disassemble the main function, which takes no parameters and calls no other functions
# gdb t
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) disas main
Dump of assembler code for function main:
0x0804835c <main+0>: push %ebp
0x0804835d <main+1>: mov %esp,%ebp
0x0804835f <main+3>: pop %ebp
0x08048360 <main+4>: ret
End of assembler dump.
(gdb) q
Explanation: The extended base pointer (ebp) is pushed onto the stack and then the extended stack pointer (esp) is copied into it. The push assembly instruction places data on top of the stack (a FILO stack starting at ebp and ending at esp), while the mov assembly instruction copies a source to a destination. gdb prints AT&T syntax, where the source comes first, so mov %esp,%ebp copies esp into ebp.
In our example ebp becomes equal to esp (more simply, ebp becomes a copy of esp). Because ebp and esp define the limits of the current stack frame, it is logical to conclude that the frame is empty at this point. On x86 the stack grows toward lower addresses, so esp always points to the top of the stack, which is the lowest address in use: a push decrements esp and a pop increments it.
The pop instruction pops data off the stack. In our example the frame is empty, so esp still equals ebp, and pop %ebp restores the caller's ebp that was pushed at the start. After that the function ends with the ret instruction, which pops the return address off the stack into eip, returning the execution flow to the caller and terminating the current procedure.
Step 4: Now we move to something more complicated
int main(){int a;} //This will allocate a stack running main
Step 5: Use gdb to see the assembly
(gdb) disas main
Dump of assembler code for function main:
0x0804835c <main+0>: push %ebp
0x0804835d <main+1>: mov %esp,%ebp
0x0804835f <main+3>: sub $0x4,%esp
0x08048362 <main+6>: leave
0x08048363 <main+7>: ret
End of assembler dump.
(gdb) q
Explanation: Variable a is declared but not initialized in the code example above. The first two instructions set up the frame as before, copying esp into ebp. The last two instructions (leave and ret) are used to clean up the stack.
The sub instruction subtracts the source from the destination and stores the result in the destination. In AT&T syntax the source is $0x4 and the destination is esp, so esp drops by 4 bytes, which reserves space on the stack for the int a. The leave instruction undoes the frame setup: it copies ebp back into esp, which releases the local variables, and then pops the saved ebp off the stack.
Step 6: Now we move to something even more complicated
int main(){int a=0;} //This will allocate a stack running main
(gdb) disas main
Dump of assembler code for function main:
0x0804835c <main+0>: push %ebp
0x0804835d <main+1>: mov %esp,%ebp
0x0804835f <main+3>: sub $0x4,%esp
0x08048362 <main+6>: movl $0x0,0xfffffffc(%ebp)
0x08048369 <main+13>: leave
0x0804836a <main+14>: ret
End of assembler dump.
(gdb) q
Explanation: Everything is like the example in step 5, but this time the variable a is initialized. The value used to initialize a is 0, written as the immediate $0x0. The value 0xfffffffc is a displacement: read as a signed 32-bit number it is -4, so 0xfffffffc(%ebp) is the memory address ebp-4, the stack slot where a is stored.
References: