Obfuscating instructions executed
Mallory cannot obtain a control flow graph (CFG) or perform program analysis on the executable code of C provided the instructions being executed by C cannot be determined. Trent changes instructions inside the executable code such that analysis tools produce incorrect results. C contains a section (Crestore) which changes these modified instructions back to their original contents when it executes. For each modified instruction, Crestore holds the offset from the current location and the value to be placed at that offset; Trent places this correction information inside Crestore. Crestore is executed before any other instruction inside C, so the modified instructions are corrected before they run.
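The restore step can be pictured with a minimal sketch, assuming Crestore keeps a table of (offset, original byte) pairs relative to a base location. The names restore_entry and apply_restore and the one-byte granularity are assumptions made for illustration; the layout actually used by Trent is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one per modified byte. */
struct restore_entry {
    ptrdiff_t offset;  /* distance from the base location to the modified byte */
    uint8_t   value;   /* original byte to write back */
};

/* Write every original value back into the code buffer. */
void apply_restore(uint8_t *base, const struct restore_entry *tab, size_t n)
{
    assert(base != NULL && tab != NULL);
    for (size_t i = 0; i < n; i++)
        base[tab[i].offset] = tab[i].value;
}
```

Running this loop before any other instruction of C mirrors the ordering described above: the bytes that mislead static analysis are only corrected at run time.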
Execution of C on Client’s Machine
The executable code is received by the client’s (Alice’s) machine. The received information contains the length of the code and the location where it should be placed and executed. Normally it is not possible to introduce new code into a process at run time. However, Alice’s software (P) can use a Linux library call to place C at the required location and execute it. C communicates the results of the verification back to Trent without relying on P. The details of its execution are discussed below.
Injection of code by P on itself
P makes a connection request to Trent. Trent grants the request, provides the number of bytes of the challenge to be received, and follows it with the executable code of C. Trent also sends the location inside P where C should be placed. P receives the code and prepares the area for injection by executing the library utility mprotect on the area. The code section of a process in the Intel x86 architecture is write-protected. This utility changes the protection on the specified area of the code section and allows this area to be overwritten with new values. Once the injection is complete, P creates a function pointer which points to the address of the location where the code was injected and calls the function using the pointer, transferring control to C.
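The injection step can be sketched as follows. This is a simplified illustration, not P's actual code: it assumes a freshly mapped anonymous page stands in for the target area inside P's code section, and the function name inject_and_run is hypothetical. The mprotect call and the function-pointer call are the two elements taken from the description above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Copy `len` bytes of machine code into a page, make the page executable
 * with mprotect, and call it through a function pointer.
 * Returns the callee's result, or -1 on any failure. */
int inject_and_run(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return -1;

    /* Assumption: an anonymous page replaces the area inside P's code section. */
    void *area = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;

    memcpy(area, code, len);                        /* the injection itself */
    if (mprotect(area, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(area, (size_t)page);
        return -1;
    }

    int (*entry)(void) = (int (*)(void))area;       /* pointer to injected code */
    int result = entry();                           /* transfer control */
    munmap(area, (size_t)page);
    return result;
}
```

On an x86 machine the function could be exercised with the six bytes B8 2A 00 00 00 C3 (load 42 into EAX and return). In the scheme described above the area lies inside P's existing code section, so mprotect is used in the opposite direction, to make a write-protected region writable before it is overwritten.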
Obtaining measurements on the target machine
C obtains certain identifiers on MAlice that allow Trent to identify whether it indeed executed on the correct machine and in the correct process. These identifiers have to be located outside the process space of P; therefore C computes the following values and sends them to Trent: the IP address of MAlice, an arithmetic checksum on the MD5 code residing inside P, MD5 hash values of overlapping sub-regions inside P, and the process state that allows C to determine whether it was executed inside a sandbox.
The first measurement identifies the machine on which C is executing. Trent received an incoming connection from Alice, hence it is possible to keep track of the IP address of MAlice. Although most IP addresses are dynamic, there is little probability of an IP address changing in the small time window between a request being sent and C taking its measurements. C does not use the system call libraries to obtain values. Instead, it uses interrupts to execute system calls. This involves loading the stack with the correct operands for the system call, placing the system call number in the EAX register and the remaining arguments in the other registers, and executing the interrupt instruction. The sample code for creating a socket is shown in Fig. 6.
Reading the IP address involves creating a socket on the network interface and obtaining the address from the socket by means of another system call, ioctl. The obtained address is in the form of an integer, which is converted to the standard A.B.C.D format. After this, the address is sent to Trent using the send routine inside the socketcall system call. It must be noted that the send is done using the socket provided by P and not using a new socket. This is done so that Mallory cannot bounce C to another machine. If Mallory did that, Mallory would have to provide an existing connection to Trent. However, as connections to any machine can exist only with Trent’s knowledge, this situation cannot arise.
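The conversion from the integer form to A.B.C.D can be sketched as below. The sketch assumes the integer is held in network byte order, as the kernel returns it in a socket address structure; the function name ip_to_dotted is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format an IPv4 address held in network byte order as "A.B.C.D".
 * Returns 0 on success, -1 if the output buffer is too small. */
int ip_to_dotted(uint32_t addr_net, char *out, size_t out_len)
{
    uint8_t b[4];
    assert(out != NULL);
    memcpy(b, &addr_net, sizeof b);  /* memory order equals network order */
    int n = snprintf(out, out_len, "%u.%u.%u.%u",
                     (unsigned)b[0], (unsigned)b[1],
                     (unsigned)b[2], (unsigned)b[3]);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

Reading the bytes in memory order rather than shifting the integer keeps the result independent of the host's endianness.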
Trent verifies the address of the machine and sends a response to C, which then takes a checksum on some portions of the code and follows up with an MD5 hash of the entire code section. As discussed in sections 4.2 and 6.3, the sub-regions are defined randomly and such that they overlap. C sends the checksum and MD5 results to Trent using the system interrupt method for send as discussed above. C obtains the pid of the process (P0) under which it is executing using the system interrupt for getpid. It then locates all the remote connections established to Trent from MAlice. This is done by reading the contents of the ‘/proc/net/tcp’ file. The file has the structure shown in Fig. 7.
As seen in the figure, there is remote address and port information for every connection, which allows C to identify any open connection to Trent. Once all the connections are identified, C uses the inode of each socket descriptor to locate any process using it. This is done by scanning the ‘/proc/ /fd’ folder for all the running processes on MAlice. In the ideal situation there should be only one process id (P1) using the identified inode. If C encounters more than one such process, it sends an error message back to Trent. Once the process id P1 is obtained, C checks whether P0 and P1 are the same. If so, C sends an affirmative to Trent. These measurements allow Trent to be certain that C executed on P residing on MAlice.
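As an illustration of the lookup described above, the sketch below extracts the remote address, remote port, state, and inode from one line of the connection table. The struct tcp_entry and the function parse_tcp_line are hypothetical names, and the sketch uses the C library for brevity, whereas C itself issues system calls through interrupts.

```c
#include <assert.h>
#include <stdio.h>

struct tcp_entry {
    unsigned      rem_addr;  /* remote IPv4 address as printed in hex */
    unsigned      rem_port;  /* remote port */
    unsigned      state;     /* connection state, e.g. 0x0A for a listening socket */
    unsigned long inode;     /* inode of the socket */
};

/* Parse one data line of the table; returns 0 on success, -1 otherwise.
 * Skipped fields: sl, local address and port, queues, timer, retransmits,
 * uid, timeout. */
int parse_tcp_line(const char *line, struct tcp_entry *e)
{
    assert(line != NULL && e != NULL);
    int n = sscanf(line,
                   " %*d: %*x:%*x %x:%x %x %*x:%*x %*x:%*x %*x %*u %*u %lu",
                   &e->rem_addr, &e->rem_port, &e->state, &e->inode);
    return n == 4 ? 0 : -1;
}
```

On Linux, a file descriptor that refers to a socket appears in a process's fd directory as a symbolic link of the form socket:[inode], so the parsed inode can be compared against the link targets to find the owning process.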
7. Remote kernel attestation
To measure the integrity of the kernel we implement a scheme similar to the user application attestation scheme. Trent′ is a trusted server that provides code (Ckernel) to MAlice. It is assumed that Alice has a means, such as a digital signature verification scheme, to determine whether Ckernel was sent by Trent′. Alice receives Ckernel using a user level application Puser, verifies that it was sent by Trent′ and places it in the kernel of the OS executing on MAlice. Ckernel is then executed and obtains integrity measurements (Hkernel) on the OS text section, the system call table, and the interrupt descriptor table. Ckernel passes these results to Puser, which returns them to Trent′. If required, Ckernel can encrypt the integrity measurement results using a one time pad or a simple substitution cipher; however, as the test case generated is different in every instance, this is not a required operation. Figure 8 depicts this process. Trent′ also provides a kernel module Pkernel that provides ioctl calls to Puser. As seen in figure 8a, Puser receives Ckernel from Trent′. In figure 8b, Puser forwards the code to Pkernel. It is assumed that Pkernel has the ability to verify that the code was sent by Trent′. Pkernel places the received code in its code section at a location specified by Trent′ and executes it. Ckernel obtains an arithmetic and MD5 checksum on the specified regions of the kernel on MAlice and returns the results to Puser as seen in figure 8c. Puser then forwards the results to Trent′, who determines whether the measurements obtained from the OS on MAlice match existing computations (figure 8d). Since Trent′ is an OS vendor or a corporate network administrator, it can be assumed that Trent′ has local access to a pristine copy of the kernel executing on MAlice from which to obtain the expected integrity measurement values generated by Ckernel.
Although it may seem that Trent′ would need unbounded memory to keep track of every client, most OS installations are identical as they are off the shelf. In addition, if Trent′ is a system administrator of a number of machines on a corporate network, Trent′ would have knowledge of the OS on every client machine.
7.1 Implementation
The kernel attestation was implemented on an x86 based 32 bit Ubuntu 8.04 machine running the 2.6.24-28-generic kernel. In Linux an identical copy of the kernel is mapped into every process in the system. Since we use system calls and software interrupts for the application attestation part, this section describes the integrity measurement of the text section (which contains the code for system calls and other kernel routines), the system call table and the interrupt descriptor table.
The /boot/System.map-2.6.24-28-generic file on the client platform was used to locate the symbols to be used for kernel measurement. The kernel text section started at virtual address 0xC0100000 and ended at 0xC03219CA, which corresponded to the symbol '_etext'. The system call table was located at 0xC0326520 and the next symbol in the map file was located at 0xC0326B3C, a difference of 1564 bytes. The 'arch/x86/include/asm/unistd_32.h' file for the kernel build showed the number of system calls to be 337. Since MAlice was a 32 bit system, the space required for the address mappings would be 1348 bytes. We took integrity measurements from 0xC0326520 to 0xC0326B3B. The interrupt descriptor table (IDT) was located at 0xC0410000 and the next symbol was located at 0xC0410800, which gives the IDT a size of 2048 bytes. A fully populated IDT has 256 entries of 8 bytes each, which gives a 2 KB IDT; this is consistent with the System.map file on the client machine.
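The region sizes quoted above follow directly from the symbol addresses. A small helper makes the arithmetic explicit; the name region_size is hypothetical and the addresses are the ones read from the map file on this particular client.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of the half-open range [start, next_symbol). */
uint32_t region_size(uint32_t start, uint32_t next_symbol)
{
    assert(next_symbol >= start);
    return next_symbol - start;
}
```

For example, region_size(0xC0326520, 0xC0326B3C) covers the system call table up to the next symbol, and region_size(0xC0410000, 0xC0410800) covers the IDT. The measured system call table range is larger than the 337 × 4 = 1348 bytes needed for the table entries alone because it extends to the next symbol.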
Trent′ also provides a kernel module (Pkernel) to the client platform, which is installed as a device driver for a character device. Pkernel offers its functionality through the ioctl call. Puser receives the code from the trusted authority and opens the char device. Puser then executes an ioctl which allows the kernel module to receive the executable code. As in the user application attestation case, Trent′ does not send the MD5 code for every attestation instance. Instead the trusted authority sends driver code which populates a data array and provides it to the MD5 code, which stays resident in Pkernel. To prevent Mallory from exploiting this, the trusted authority also provides an arithmetic checksum computation routine which is downloaded for every attestation instance. This adds a degree of unpredictability to the results generated by the integrity measurement code.
Kernel modules can be relocated during compile time. This means that Trent′ would not know where the MD5 code was relocated during installation of the module. In order to execute the MD5 code, Trent′ requests the location of the MD5 function in the kernel module from the client end. After obtaining the address, Trent′ generates the executable code Ckernel, which has numerous calls to the MD5 code. At generation, the call address may not match the actual function address at the client end. Once Ckernel is generated, the call instructions are identified in the code and the correct target address is patched into each call instruction. Once this patching is done, Trent′ sends the code to the client end. The call address calculation is done as follows:
call_target = -( (address_injected_driver + call_locations[0] + length_ofcall ) - address_mdstring );
memcpy(&code_in_file[call_locations[0] + 1], &call_target, sizeof(call_target));
Ckernel is loaded in a char array code_in_file. The location where Ckernel is to be injected is determined by Trent′ by selecting a location from a number of 'nop' locations in the module; this address is termed address_injected_driver in the above code snippet. The call location in the generated executable code is determined by scanning the code for the presence of the call instruction. The length of the call instruction is a constant value which depends on the current architecture. Finally the address of mdstring (which is the location of the MD5 code) is obtained from the client machine as described above. The second statement changes the code array by placing the correct target address after the call opcode. This procedure is repeated for all the call instructions in the generated code. It must be noted that Ckernel calls only the MD5 code and no other function. If obfuscation is required, Trent′ can place some junk function calls which are guarded by an ‘if statement’. Trent′ can construct several if statements such that they never evaluate to true. It can be noted that even if the client does not communicate the address of the MD5 code, Pkernel can be designed such that the MD5 driver provided by the trusted authority and the MD5 code reside on the same page. This means that the higher 20 bits of the address of the MD5 code and the downloaded code will be the same and only the lower 12 bits will be different. This allows Trent′ to determine where Ckernel will reside on the client machine, and automatically calculate the target address for the MD5 code. This is possible because the C compiler produces the lower 12 bits of function addresses while creating a kernel module and allows the higher 20 bits to be populated during module insertion.
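The call patching can be illustrated with a self-contained sketch. It assumes the x86 near call encoding (opcode 0xE8 followed by a 32 bit displacement relative to the next instruction, five bytes in total); the function name patch_call and its parameters are hypothetical and stand in for the variables used in the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CALL_OPCODE 0xE8u  /* x86 near call with 32 bit relative displacement */
#define CALL_LENGTH 5u     /* opcode byte plus four displacement bytes */

/* Patch the call at `call_offset` inside `code` so that, once the code is
 * placed at `inject_addr`, the call lands on `target_addr`.
 * Returns 0 on success, -1 if there is no call at that offset. */
int patch_call(uint8_t *code, size_t code_len, size_t call_offset,
               uint32_t inject_addr, uint32_t target_addr)
{
    assert(code != NULL);
    if (call_offset + CALL_LENGTH > code_len || code[call_offset] != CALL_OPCODE)
        return -1;

    /* Displacement = target - address of the instruction after the call. */
    uint32_t rel = target_addr
                 - (inject_addr + (uint32_t)call_offset + CALL_LENGTH);

    /* Store the displacement in little-endian order, as x86 expects. */
    for (unsigned i = 0; i < 4; i++)
        code[call_offset + 1 + i] = (uint8_t)(rel >> (8 * i));
    return 0;
}
```

Unsigned wrap-around gives the correct two's complement displacement when the target lies below the call site. Scanning for the byte 0xE8 to find call locations is only reliable when the generator knows that the byte cannot occur inside another instruction's operands, which is an assumption about how Ckernel is produced.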
Once the code is injected, Trent′ issues a message to the user application requesting the kernel integrity measurements. Puser executes another ioctl which causes Pkernel to execute the injected code. Ckernel reads various memory locations in the kernel and passes the data to the MD5 code. The MD5 code returns the MD5 checksum value to Ckernel, which in turn returns the value to the ioctl handler in Pkernel. Pkernel then passes the MD5 and arithmetic checksum computations back to Puser, which forwards the results to Trent′.
If required, Ckernel can issue the disable interrupt instruction to prevent any other process from taking hold of the processor. It must be noted that in multiprocessor systems the disable interrupt instruction may not prevent a second processor from smashing kernel integrity measurement values. However, as the test cases are different for every attestation instance, Mallory may not gain anything by smashing integrity measurement values.
8. Results
The time threshold (T) is an important parameter in this implementation. We aim to prevent an attacker Mallory from intercepting C and providing fake results to Trent. If T is too large then Mallory may be able to obtain some information about the execution of C. The value of T must take into account network delays. Network delays between cities in IP networks are of the order of a few milliseconds [32]. Hence measuring the overall time required for one instance of Remote Attestation and adding a few seconds to the execution time can suffice for the value of T.
We obtained the source code for the VLC media player interface [33]. We removed some sections of the interface code and left close to 1000 lines of C code in the program. We measured various stages of the integrity measurement process. We took 2 pairs of machines running Ubuntu 8.04. One pair consisted of legacy machines with an Intel Pentium 4 processor and 1 GB of RAM, and the second pair consisted of Intel Core 2 Quad machines with 3 GB of RAM. The quantities measured were the time taken to generate code including compile time, the time taken by the server to do a local integrity check on a clean copy of the application, and the time taken by the client to perform the integrity measurement and send a response back to the server.
To obtain an average measurement for code generation we executed the program in a loop of 1000 iterations and measured the time taken using a watch. We also measured the time reported by the system clock and found a slight variation (of the order of 1 second) between the time perceived by the human eye using the watch and that reported by the system clock at the end of the loop. The time taken for compiling the freshly generated code was measured similarly. These two times are reported in Table 1.
We then executed the integrity measurement code C locally on the server and sent it to the client for injection and execution. Because both machines in each pair had the same configuration, the time taken on the server is the compute time the code needs to generate the integrity measurement. These times are reported in Table 2. It must be noted that the client requires a higher threshold to report results because it has to receive the code from the network stack, inject the code, execute it, and return the results through the network stack to the server. Network delays also affect the time threshold.
We can see from the two tables that the server takes on the order of a few hundred milliseconds to generate code, while the integrity measurement is very lightweight and returns results on the order of a few milliseconds. Because of this, the code generation process can be viewed as a large overhead. However, the server need not generate new code for every client connection. It can generate the measurement code periodically, every second, and ship the same integrity measurement code to all clients connecting within that second. This can alleviate the workload on the server. A value for T can be computed from the tables, taking into consideration the network hops required, and be set to a value less than 5 seconds.
9. Conclusion and Future work
This paper presents a method for implementing Remote Attestation entirely in software. We also presented a number of other schemes in the literature that address the problem of program integrity checking. We reduced the window of opportunity for the attacker Mallory to provide fake results to the trusted authority Trent by implementing various forms of obfuscation and providing new executable code for every run. We implemented this scheme on the Intel x86 architecture and set a time threshold for the response.
As future work we plan to implement this scheme using the virtualization extensions. We also plan to extend this work to find out whether the client process continued executing after the Remote Attestation was successful.
References
[1] Web link. In brief and statistics: The H open source. Retrieved on October 4, 2010, http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-35-1047707.html?page=5
[2] T. Ball, E. Bounimova, B. Cook, V. Levin, J. Lichtenberg, C. McGarvey, B. Ondrusek, S. K. Rajamani and A. Ustuner, "Thorough static analysis of device drivers," ACM SIGOPS Operating Systems Review, vol. 40, pp. 73-85, 2006.
[3] A. Chou, J. Yang, B. Chelf, S. Hallem and D. Engler, "An empirical study of operating systems errors," in Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles, 2001, pp. 73-88.
[4] A. Seshadri, M. Luk, E. Shi, A. Perrig, L. Van Doorn and P. Khosla, "Pioneer: Verifying code integrity and enforcing untampered code execution on legacy systems," in ACM SIGOPS Operating Systems Review, 2005, pp. 1-16.
[5] A. Seshadri, A. Perrig, L. van Doorn and P. Khosla. SWATT: SoftWare-based ATTestation for embedded devices. 2004 IEEE Symposium on Security and Privacy. pp. 272-282.
[6] R. Kennel and L. H. Jamieson, "Establishing the genuinity of remote computer systems," in Proceedings of the 12th USENIX Security Symposium, 2003, pp. 295-308.
[7] J. A. Garay and L. Huelsbergen, "Software integrity using timed executable agents," in Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, 2006, pp. 189-200.
[8] U. Shankar, M. Chew and J. D. Tygar, "Side effects are not sufficient to authenticate software," in Proceedings of the 13th USENIX Security Symposium, 2004, pp. 89-102.
[9] R. Kennel and L. H. Jamieson, "An Analysis of proposed attacks against GENUINITY tests," CERIAS Technical Report, Purdue University, 2004.
[10] F. Stumpf, O. Tafreschi, P. Röder and C. Eckert, "A robust integrity reporting protocol for remote attestation," in Second Workshop on Advances in Trusted Computing (WATC’06 Fall), 2006.
[11] R. Sailer, X. Zhang, T. Jaeger and L. Van Doorn, "Design and implementation of a TCG-based integrity measurement architecture," in SSYM'04: Proceedings of the 13th Conference on USENIX Security Symposium, 2004, pp. 223-228.
[12] K. Goldman, R. Perez and R. Sailer, "Linking remote attestation to secure tunnel endpoints," in STC '06: Proceedings of the First ACM Workshop on Scalable Trusted Computing, 2006, pp. 21-24.
[13] L. Wang and P. Dasgupta, "Coprocessor-based hierarchical trust management for software integrity and digital identity protection," Journal of Computer Security, vol. 16, pp. 311-339, 2008.
[14] N. L. Petroni Jr, T. Fraser, J. Molina and W. A. Arbaugh, "Copilot-a coprocessor-based kernel runtime integrity monitor," in Proceedings of the 13th Conference on USENIX Security Symposium-Volume 13, 2004.
[15] R. Sailer. IBM research - integrity measurement architecture. Retrieved on November 3, 2010, http://domino.research.ibm.com/comm/research_people.nsf/pages/sailer.ima.html
[16] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum and D. Boneh, "Terra: A virtual machine-based platform for trusted computing," ACM SIGOPS Operating Systems Review, vol. 37, pp. 193 - 206, 2003.
[17] R. Sahita, U. Savagaonkar, P. Dewan and D. Durham, "Mitigating the lying-endpoint problem in virtualized network access frameworks," 18th IFIP/IEEE international conference on Managing virtualization of networks and services, 2007, pp. 135-146.
[18] V. Haldar, D. D. Chandra and M. M. Franz, "Semantic remote attestation: A virtual machine directed approach to trusted computing," in USENIX Virtual Machine Research and Technology Symposium, 2004, pp. 29-41.
[19] G. Wurster, P. C. van Oorschot and A. Somayaji, "A generic attack on checksumming-based software tamper resistance," in 2005 IEEE Symposium on Security and Privacy, 2005, pp. 127-138.
[20] B. Schwarz, S. Debray and G. Andrews, "Disassembly of executable code revisited," in Proceedings of Working Conference on Reverse Engineering, 2002, pp. 45-54.
[21] C. Collberg, C. Thomborson and D. Low, "Manufacturing cheap stealthy opaque constructs," in Proceedings of Working Conference on Reverse Engineering, 1998, pp. 184-196.
[22] C. Linn and S. Debray, "Obfuscation of executable code to improve resistance to static disassembly," in Proceedings of the 10th ACM Conference on Computer and Communications Security, 2003, pp. 290-299.
[23] K. D. Cooper, T. J. Harvey and T. Waterman, "Building a control flow graph from scheduled assembly code,"
[24] J. F. Levine, J. B. Grizzard and H. L. Owen, "Detecting and categorizing kernel-level rootkits to aid future detection," IEEE Security & Privacy, pp. 24-32, 2006.
[25] Web link, "Information about the knark rootkit," Retrieved on November 9 2010. http://www.ossec.net/rootkits/knark.php
[26] D. Sd. (2001), Linux on-the-fly kernel patching without LKM.
[27] P. A. Loscocco, P. W. Wilson, J. A. Pendergrass and C. D. McDonell, "Linux kernel integrity measurement using contextual inspection," in 2007 ACM Workshop on Scalable Trusted Computing, 2007, pp. 21-29.
[28] Web link, "Address space layout randomization," Retrieved on April 25, 2010. http://pax.grsecurity.net/docs/aslr.txt
[29] Web link, "Linux man pages online - kernel random number generator," Retrieved on August 30, 2010. http://linux.die.net/man/4/random
[30] Web link. Hackers discover HD DVD and blu-ray processing key - all HD titles now exposed. Retrieved on November 3, 2009. http://www.engadget.com/2007/02/13/hackers-discover-hd-dvd-and-blu-ray-processing-key-all-hd-t/
[31] Web link, "Hi-Def DVD Security is bypassed," Retrieved on November 3, 2009. http://news.bbc.co.uk/2/hi/technology/6301301.stm
[32] Web link, "Global IP Network Latency," Retrieved on January 17, 2010. http://ipnetwork.bgtmo.ip.att.net/pws/network_delay.html
[33] Web link, "VLC media player source code FTP repository," Retrieved on February 24, 2010. http://download.videolan.org/pub/videolan/vlc/
Machine | Test generation | Compilation time | Total time
Pentium 4 | 12.3 | 320 | 332
Quad Core | 5.2 | 100 | 105
Table 1: Average code generation time in milliseconds on server end for Intel Pentium 4 and Core 2 Quad machines for one instance of the measurement
Machine | Server side execution time | Client side execution time
Pentium 4 | 0.6 | 22
Quad Core | 0.4 | 16
Table 2: Time taken in milliseconds to compute the measurements on server and on the remote client
Figure Captions
Figure Number | Caption
1 | Challenge response overview
2 | Protocol overview
3 | Hash obtained on overlapping sub-regions. Two instances have different sub-regions
4 | Procedure for obtaining the MD5 hash of the entire code section
5 | Snippet from the checksum code
6 | ASM code for creating a socket
7 | Contents of /proc/net/tcp file
8 | Kernel remote attestation scheme
8a | User application initiates attestation request
8b | User application sends attestation code to kernel
8c | Kernel returns integrity values to user application
8d | Verification of kernel integrity by trusted server
Figures
[Diagram labels: Trent, MAlice, P, C, Request, Measurements, Results]
Fig. 1
Verification Request
Trent → Alice: Inject code at location, execute it
C → Trent: Machine Identifier
Trent → C: Proceed
C → Trent: Initial Checksum
Trent → C: Proceed
C → Trent: MD5 Hash of specified regions
Trent → C: Proceed
C → Trent: Test of correct process ID
Trent → C: Proceed/Halt
Fig. 2
[Diagram: two instances, each with Checksum 1 to Checksum 4; position markers 0, 50, 110, 160, 200 in one instance and 0, 60, 80, 150, 200 in the other]
Fig. 3
[Diagram labels: Region 1, Region 3, Region N, MD 5, Concatenation (+), H1, H2, H1H2, H12, H3, H12H3, Result]
Fig. 4
{
……
x =
a = 0;
while (a < 400) {
checksum1 += Mem[a];
if ((a % 55) == 0)
{ checksum2 += checksum1 / x; }
a++;
}
send checksum2;
…..
}
Fig. 5
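A runnable variant of the snippet in Fig. 5 is sketched below. The figure leaves the value of x and the surrounding code elided, so here x is passed in as a parameter, the memory region is a byte array, and the function name fig5_checksum is hypothetical. The loop bound of 400 and the modulus 55 from the figure are generalised to a length argument and kept as a constant respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running sum over the region (checksum1), folded into checksum2 at every
 * 55th position after division by x. Returns checksum2. */
uint32_t fig5_checksum(const uint8_t *mem, size_t len, uint32_t x)
{
    uint32_t checksum1 = 0, checksum2 = 0;
    assert(mem != NULL && x != 0);   /* x is elided in the figure; must be non-zero */
    for (size_t a = 0; a < len; a++) {
        checksum1 += mem[a];
        if (a % 55 == 0)
            checksum2 += checksum1 / x;
    }
    return checksum2;
}
```

Calling fig5_checksum(region, 400, x) reproduces the loop bounds of the figure; the result would then be sent to Trent.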
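__asm__("sub $12, %%esp\n"          /* reserve three 4-byte arguments */
"movl $2, (%%esp)\n"                /* domain: AF_INET */
"movl $1, 4(%%esp)\n"               /* type: SOCK_STREAM */
"movl $0, 8(%%esp)\n"               /* protocol: default */
"movl $102, %%eax\n"                /* system call number: socketcall */
"movl $1, %%ebx\n"                  /* socketcall sub-function: socket */
"movl %%esp, %%ecx\n"               /* pointer to the argument block */
"int $0x80\n"                       /* enter the kernel */
"add $12, %%esp\n"                  /* release the argument block */
: "=a" (new_socket)
: /* no inputs */
: "ebx", "ecx", "memory"
);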
Fig. 6
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 0100007F:1F40 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5456 1 f6eb0980 299 0 0 2 -1
1: 00000000:C3A9 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 4533 1 f6ec0000 299 0 0 2 -1
2: 00000000:006F 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 4473 1 f6f60000 299 0 0 2 -1
3: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5690 1 f6ec0980 299 0 0 2 -1
4: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5358 1 f6ec04c0 299 0 0 2 -1
5: 0100007F:743A 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5411 1 f6eb04c0 299 0 0 2 -1
Fig. 7
[Diagram, four panels (Fig. 8a to Fig. 8d), each showing Userland (Puser) and Operating System (Pkernel) on the client; other labels: Trent′, Kernel attestation request, Ckernel, Hkernel, Kernel integrity measurements, OK]
Figure 8