Shellcode Development

Pallavi S. Murudkar

Olufemi A. Oloyede

CMPE 296 T

November 27, 2007



2 Introduction
3 Overview of Shellcode
4 Capability of Shellcode
5 Tools required for Shellcode development
6 Understanding Shellcode
6.1 Machine Architecture
6.1.1 X86 Machine Architecture
6.1.2 Virtual Memory concepts
6.1.3 Basic knowledge of Compilers and Debuggers
6.2 Program flow dynamics
6.3 Introduction to Stack based buffer overflows
7 Developing the Shellcode
7.1 Finding the vulnerability / Buffer Overflow
7.2 Writing the Shellcode
8 Solutions for detecting Shellcode
9 Conclusion
10 References



Shellcode is machine code, usually written in assembly, that is used as the payload in the exploitation of a software bug. When an exploit alters the program flow, as in buffer overflow exploitation, the shellcode becomes the new flow of the program and thereby accomplishes the attack. In this paper we present the development process of shellcode on the x86 architecture, using C/C++ and assembly. Writing shellcode requires an in-depth understanding of assembly language. Shellcode depends on the operating system and the architecture of the target, so it is difficult to reuse the same shellcode on a system with a different architecture or OS.

3 Overview of Shellcode

Shellcode is common in the exploitation of vulnerabilities such as stack- and heap-based buffer overflows and format string attacks. Shellcode has to fit into buffers whose size is set by the protocol or by other constraints, so authors try to make it as small as possible. As a result, less logic can be implemented within it, which at times reduces its fault tolerance. Because size matters so much, shellcode is usually written in assembly language.

4 Capability of Shellcode

Shellcode is not only used to open a command shell; it is also used in various other exploits. Typical capabilities include:

  1. Providing access to the attacked system

  2. Spawning /bin/sh [or] cmd.exe (local shell)

  3. Binding a shell to a port (remote shell)

  4. Adding root/admin user to the system

  5. Using chmod() to make /etc/shadow writable

The term shellcode is derived from its original purpose: it was the specific portion of an exploit used to spawn a root shell.

Shellcode directly manipulates registers and the behavior of a program, so it is ultimately delivered as raw machine opcodes, usually expressed in hexadecimal.

Understanding how the registers work on an IA32 processor and how they are manipulated via assembly is essential for vulnerability development and exploitation. Registers can be accessed, read, and changed with assembly.

5 Tools required for Shellcode development

During shellcode development you will need tools to write, compile, convert, test and debug shellcode:

Nasm: A portable Intel-syntax assembler. The nasm package contains the assembler as well as a disassembler. You need nasm to write the assembly code for your shellcode.
Objdump: A tool to disassemble files and obtain important information about them.
Gdb: The GNU debugger. It is used to analyze core dump files. It can also be used to disassemble compiled code using the disassemble command.
Ktrace: This utility creates a trace file that can be viewed with the kdump utility to see all the system calls a program makes.

6 Understanding Shellcode

The key to understanding and using Shellcode is:

1. An understanding of Machine Architecture (assembly and machine registers)

2. Program flow dynamics: process memory organization and context switching during function calls and interrupt processing.

3. Stack-based buffer overflows: the crux of applying shellcode is modifying the return address of a function by way of a stack-based buffer overflow.

6.1 Machine Architecture

Since shellcode is essentially assembly code, the following concepts are necessary to understand [4]:

  1. Knowledge of x86 machine instruction sets and registers

  2. Virtual Memory concepts.

  3. Knowledge of Compilers and debuggers.

6.1.1 X86 Machine Architecture

“Today, the x86 architecture holds an effective monopoly among desktop and notebook processors, as well as a growing majority among servers and workstations … A large amount of computer software supports the platform, including operating systems such as MS-DOS, Windows, Linux, BSD, Solaris, and Mac OS X [7].” IA-32 is a 32-bit instruction set that extends the original 16-bit x86 architecture. IA-32 processors have 8 general purpose registers, 8 floating point stack registers and a number of segment, control, debug and test registers. For shellcode programming, the following 5 addressing registers are important to know:

EBP: Base pointer. Primarily used to hold the address of the current stack frame. Also sometimes used as a general data or address register.

ESI: General register or "source index" for string operations. Also has a one-byte LODS[size] instruction for loading data from memory to the accumulator.

EDI: General register or "destination index" for string operations. Also has a one-byte STOS[size] instruction to write data out of the accumulator.

ESP: Stack pointer. Used to hold the top address of the stack.

EIP: Instruction pointer. Holds the current instruction address.

General registers (accumulators): an accumulator is a register in which intermediate arithmetic and logic results are stored.

EAX: Primary accumulator (some instructions have shorter opcodes when used with this register).

EBX: Accumulator and general purpose register.

ECX: Accumulator and general purpose register (also the loop counter implicitly used by the loop instruction).

EDX: Accumulator and general purpose register (also an extension to the EAX register for some 64-bit values used by mul, div, etc.).

For more information the "IA-32 Intel® Architecture Software Developer's Manual Volume 1: Basic Architecture" is a good starting point.
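As a small illustration of the EDX:EAX pairing described above, the following C sketch splits a 64-bit product into the two 32-bit halves that a 32-bit unsigned `mul` would leave in EDX (high half) and EAX (low half). The struct and function names are hypothetical, and the code only models the arithmetic; it does not read or write the real registers.

```c
#include <stdint.h>

/* Hypothetical model of the register pair after a 32-bit unsigned MUL. */
struct edx_eax {
    uint32_t edx; /* high 32 bits of the product */
    uint32_t eax; /* low 32 bits of the product */
};

/* Multiply two 32-bit values and split the 64-bit result into halves. */
struct edx_eax mul32(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * (uint64_t)b;
    struct edx_eax r;
    r.edx = (uint32_t)(product >> 32);
    r.eax = (uint32_t)(product & 0xFFFFFFFFu);
    return r;
}
```

For instance, multiplying 0xFFFFFFFF by 2 gives 0x1FFFFFFFE, so the model returns 1 in `edx` and 0xFFFFFFFE in `eax`.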

6.1.2 Virtual Memory concepts

Virtual memory is a construct implemented as a subsystem of the processor. It allows each process to operate as if it were alone on the system [5] and presents each process with what appears to be a contiguous range of memory.

Figure 1 - Virtual Memory [6]

6.1.3 Basic knowledge of Compilers and Debuggers

A basic understanding of C/C++ compilers is also necessary in order to anticipate the assembly code they generate, the constructs they use to optimize code, and the layout of the program in memory, which depends on the addressing mode. Experience with debuggers will prove useful for generating and analyzing disassembled code. Illustrations and examples in section 5.3 use the gdb debugger.

6.2 Program flow dynamics

We assume the reader understands the fundamental concepts of computing and the x86 architecture. Because shellcode interrupts the normal flow of a process, it is important to understand how the instructions of a program are executed, how the process uses registers and memory, and how function calls are handled using a call stack. The stack is a memory area allocated by the operating system and is used to hold the execution state of a process. Each level of the stack represents an execution context: the variables in that state and the return address of the next instruction to execute. Context transitions occur when a function is called, or when a hardware or software interrupt is serviced. The following diagram illustrates how a program is organized in memory.

The text region is determined by the program; it includes code and read-only data. The data region contains static variables and dynamic memory allocated by the program. The stack region is the focus area for shellcode, because a stack overflow is the means by which shellcode can be injected into a running program. “A stack is an abstract data type frequently used in computer science. A stack of objects has the property that the last object placed on the stack will be the first object removed. This property is commonly referred to as last in, first out queue, or a LIFO. [4]” The following illustration, reworked from “Smashing The Stack For Fun And Profit [4]”, demonstrates how a process uses the stack.

We start off with a simple program that makes a call to function ‘A’, which takes 3 arguments. Function A declares 2 character arrays and returns. The diagram below shows the C code and corresponding assembly code:

When function A is called, its arguments are first pushed on the stack, followed by the RET address, which is the address of the instruction following the function call. Next comes the saved frame pointer (SFP), followed by the local variables Buffer1 and Buffer2. The following diagram shows the stack right before the epilog portion of function A, which is not shown in the assembly code above.

Notice that Buffer2 and Buffer1 use more space than was declared in the function code. The additional bytes are necessary because memory can only be addressed in multiples of the 4-byte word size, so the compiler compensates by adding extra bytes.
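The amount of padding can be estimated with a simple rounding rule. The sketch below assumes a 4-byte word size and uses a hypothetical function name; the text above does not state the declared array sizes, so the sizes of 5 and 10 bytes in the note that follows are assumptions. Real compilers may add further padding or reorder locals.

```c
#include <stddef.h>

/* Round a requested size up to the next multiple of the 4-byte word size. */
size_t round_up_to_word(size_t n)
{
    const size_t word = 4; /* assumption: 32-bit x86 word size */
    return (n + word - 1) / word * word;
}
```

Under these assumptions, `round_up_to_word(5)` yields 8 and `round_up_to_word(10)` yields 12, so two such arrays would occupy 20 bytes of stack rather than the 15 bytes declared.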

6.3 Introduction to Stack based buffer overflows

To illustrate the buffer overflow, we use a simple program that initializes a string of 8 bytes and passes it to a function A, which takes a character pointer and copies the string into a local buffer. Because there is no bounds checking in function A, it copies the entire string, overwriting the SFP and RET locations on the stack.

This mechanism can be used to modify the RET address. If the string were populated with machine opcodes and followed by an address that points to the first instruction (the start of Buffer1 in this case), we would have a means to execute foreign code. This is the crux of how shellcode can be injected into a running program. In this scenario, the address AAAA may lie outside the process’s memory area, in which case a segmentation violation will be triggered.
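The missing bounds check described above can be contrasted with a checked copy. The following is a minimal sketch, not the code from the example program: the function name `copy_checked` and its return convention are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success and -1 if the copy would overflow dst.
 */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1; /* bytes required, with terminator */
    if (dst == NULL || dst_size == 0 || needed > dst_size) {
        return -1; /* refuse: an unchecked strcpy would write past dst here */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

With a 4-byte destination, an 8-character string is rejected instead of spilling over the saved frame pointer and return address.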

7 Developing the Shellcode

7.1 Finding the vulnerability / Buffer Overflow

Systematic methods to detect a buffer overflow are available, but a trial-and-error approach is always an option. Simply execute the target application with very large parameters; a segmentation fault indicates that a buffer overflow has probably occurred.

7.2 Writing the Shellcode

Shellcode is a sequence of machine instructions, or opcodes. The opcodes can be written directly in hexadecimal, written in assembly and converted to opcodes, or written in C, in which case the assembly and then the opcodes must be extracted. To take advantage of the injected code and gain access to the target system, system calls must be used. System calls are “a special case of a software initiated trap … the machine instruction used to initiate a system call typically causes a hardware trap that is handled specially by the kernel [8].”

The kernel invokes a system call handler when such an interrupt is executed. On Linux there are two ways of invoking a system call: lcall7/lcall27 call gates and the INT 0x80 software interrupt. These are machine instructions that cause the processor to jump to a system call entry point. The EAX register contains the number that identifies the system call. Refer to “Designing Shellcode Demystified” [8] for details about accessing system calls in assembly. Depending on the value in the EAX register, other registers are used to carry the arguments.
The example that follows demonstrates the development of shellcode that spawns a shell.
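The idea that a system call is selected by a number can be seen from C without writing any assembly. The sketch below assumes a Linux system with glibc and uses the generic `syscall()` wrapper with the harmless `getpid` call; the function name `raw_getpid` is hypothetical. On 32-bit x86, the number passed here plays the role of the value placed in EAX before INT 0x80.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Invoke getpid through the generic syscall() wrapper, selecting the
 * call by its number (SYS_getpid) instead of by a named libc function.
 */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result should match what the ordinary `getpid()` library call returns, since both end up in the same kernel handler.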


The following example is extracted from “Buffer Overflow and ShellCode [4].” The assembly code generated is for an Intel-based Linux system. We start with knowledge of a program that has a buffer overflow vulnerability to exploit.

The process is as follows:

  1. Write C code

  2. Extract the assembly code

  3. Extract the opcode

  4. Append function exit opcodes to allow the function to exit gracefully

  5. Initialize a buffer with the opcode.

The C code looks as shown:
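A minimal sketch of such a program, reconstructed from the disassembly below rather than copied from the original listing, is given here. The split into two functions and their names are assumptions made for illustration; the two-element pointer array and the `execve` call mirror what the disassembly shows. Passing NULL as the environment is accepted on Linux but is not portable.

```c
#include <stddef.h>
#include <unistd.h>

/* Fill the two-element argument vector: program path, then NULL terminator. */
void build_shell_argv(char *name[2])
{
    name[0] = "/bin/sh";
    name[1] = NULL;
}

/* Replace the current process image with /bin/sh; returns only on failure. */
int spawn_shell(void)
{
    char *name[2];
    build_shell_argv(name);
    return execve(name[0], name, NULL);
}
```

Calling `spawn_shell()` from `main` gives an ordinary program that starts a shell; nothing here is injected code yet.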

The next step is to compile the program and extract the assembly code. GDB is used in this case to disassemble the compiled program, producing the assembly code shown below. Refer to the referenced article for a full explanation of the assembly code.

Dump of assembler code for function main:
0x8000130 <main>: pushl %ebp
0x8000131 <main+1>: movl %esp,%ebp
0x8000133 <main+3>: subl $0x8,%esp
0x8000136 <main+6>: movl $0x80027b8,0xfffffff8(%ebp)
0x800013d <main+13>: movl $0x0,0xfffffffc(%ebp)
0x8000144 <main+20>: pushl $0x0
0x8000146 <main+22>: leal 0xfffffff8(%ebp),%eax
0x8000149 <main+25>: pushl %eax
0x800014a <main+26>: movl 0xfffffff8(%ebp),%eax
0x800014d <main+29>: pushl %eax
0x800014e <main+30>: call 0x80002bc <__execve>
0x8000153 <main+35>: addl $0xc,%esp
0x8000156 <main+38>: movl %ebp,%esp
0x8000158 <main+40>: popl %ebp
0x8000159 <main+41>: ret
End of assembler dump.

To compose the shellcode, a few adjustments are needed:

  1. Move the string “/bin/sh” somewhere into the buffer used for the injection; this is the parameter passed to the “execve” system call.

  2. Update the address of the string passed to execve so that it points to where the string is actually located. Because the location of the shellcode is not known until run time, JMP and CALL instructions can be used to obtain a relative address.

A condensed version of the shellcode will look like this:

And this corresponding shell code can be used to initialize the buffer that will be used.

Note that null bytes in the character buffer cause the string to be considered terminated when they are processed, by strcpy for example. The shellcode must therefore be written so that it contains no null bytes.
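A candidate byte sequence can be checked for this problem before it is used. The helper below is a hypothetical sketch that reports where a copy through strcpy would stop.

```c
#include <stddef.h>

/*
 * Return the index of the first 0x00 byte in buf, or -1 if there is none.
 * A payload copied with strcpy would be truncated at that index.
 */
long first_null_byte(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00) {
            return (long)i;
        }
    }
    return -1;
}
```

For example, the encoding of `movl $0x0,%eax` (b8 00 00 00 00) contains null bytes, whereas `xorl %eax,%eax` (31 c0) produces the same zero value without any.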

To test the shellcode the following construct can be used.

Notice that we are modifying the return address with the address of the opcode buffer.

When main returns the shellcode will be executed.

8 Solutions for detecting Shellcode

A NIDS (Network Intrusion Detection System) can be used to identify shellcode on the wire using signature databases and protocol analysis methods. This approach targets the shellcode bytes themselves and can expose the whole attack by pointing out the shellcode in the traffic.
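Signature matching in its simplest form is a search for a known byte sequence inside a captured payload. The sketch below is an assumption-laden illustration, not how any particular NIDS product works: the function name is hypothetical, and real engines combine many signatures with protocol decoding.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the byte sequence sig occurs anywhere in payload, else 0.
 * This is a plain linear scan over every possible starting offset.
 */
int contains_signature(const unsigned char *payload, size_t payload_len,
                       const unsigned char *sig, size_t sig_len)
{
    if (sig_len == 0 || sig_len > payload_len) {
        return 0;
    }
    for (size_t i = 0; i + sig_len <= payload_len; i++) {
        if (memcmp(payload + i, sig, sig_len) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A signature such as the bytes of “/bin/sh” would flag the unmodified payload discussed in section 7.2, which is also why a fixed signature alone is easy to evade by encoding the string differently.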

An IPS (Intrusion Prevention System) identifies shellcode by running the code in a sandbox or virtualized environment to determine whether it is malicious. This approach targets the shellcode's actions and can detect that the shellcode is trying to perform actions that are considered malicious.


9 Conclusion

Shellcode is a powerful mechanism for the exploitation of software vulnerabilities. Software engineers with a mind for security should be aware of how shellcode is developed and understand how it can be injected through system vulnerabilities. Shellcode can also be employed to automate software security tests, where it is written to expose and draw attention to security holes.


10 References

1. Wikipedia -

2. Foster James C. & M Stuart (March 2003). Sockets, Shellcode, Porting, & Coding: Reverse Engineering Exploits and Tool Coding for Security Professionals.

3. R Angelo (September 2004). The Basics of ShellCoding. Retrieved November 26.

4. Aleph One (November 08, 1996). Smashing The Stack For Fun And Profit. Retrieved November 26, 2007.

5. D Ulrich (October 9, 2007). What every programmer should know about memory. Retrieved November 26.

6. Wikipedia (November 2007). Virtual Memory. Retrieved November 26.

7. Wikipedia (November 2007). X86 Architecture. Retrieved November 26.

8. Aleph One (November 08, 1996). Designing Shellcode Demystified. Retrieved November 26.
