PwnOS Design Document Version 0a




Threads and Processes

Process Management


The Windows Portable Executable Format is used as the executable format for PwnOS. It has support for dynamic linking via relocation entries and x86-64 code. Programs can then also be tested in Windows using a simple library simulating PwnOS. For details on the PE format, see (12). Programs loaded from disk are completely loaded immediately, instead of waiting for page faults to occur, because page faults are expensive with 2MB pages.
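
As a rough illustration of what eager loading costs, the number of 2MB pages an image occupies is its size rounded up to the next page boundary. The sketch below is illustrative only; `pages_needed` is a hypothetical helper, not a PwnOS API function.

```c
#include <stdint.h>

/* Size of one large page on x86-64 (2 MB), the page size the text assumes. */
#define PAGE_SIZE_2MB ((uint64_t)2 * 1024 * 1024)

/* Number of 2 MB pages needed to hold `bytes` bytes, rounding up.
 * Written as divide-plus-remainder so it cannot overflow. */
uint64_t pages_needed(uint64_t bytes)
{
    return bytes / PAGE_SIZE_2MB + (bytes % PAGE_SIZE_2MB != 0 ? 1 : 0);
}
```

Under this arithmetic, even a large program spans only a handful of pages, which is why loading everything up front is cheap compared with taking a fault per page.
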
The only explicit support for Inter-Process Communication (IPC) in PwnOS is via files (i.e. pipes), so it is discussed in the Files section. Elaborate IPC is unnecessary because the system will most commonly run only 1 or 2 processes.
The default page-level permissions on the user-accessible pages of a process are as follows.


Type                            Permissions
----------------------------    -------------
Code                            Read, Execute
Global Data                     Read, Write
Heap                            Read, Write
Stack                           Read, Write
Allocated with AllocatePages    Custom
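
The defaults in the table above could be expressed as a small lookup, sketched below. The flag values, type names, and the function `default_permissions` are assumptions made for illustration; the actual PwnOS encoding is not given in this document. Pages obtained through AllocatePages are left out because their permissions are chosen by the caller.

```c
#include <stdint.h>

/* Hypothetical permission bits (not the real PwnOS encoding). */
enum {
    PERM_READ    = 1u << 0,
    PERM_WRITE   = 1u << 1,
    PERM_EXECUTE = 1u << 2
};

/* Region types taken from the table of default permissions. */
typedef enum {
    REGION_CODE,
    REGION_GLOBAL_DATA,
    REGION_HEAP,
    REGION_STACK
} region_type;

/* Default page-level permissions for a user-accessible region. */
uint32_t default_permissions(region_type type)
{
    switch (type) {
    case REGION_CODE:
        return PERM_READ | PERM_EXECUTE;
    case REGION_GLOBAL_DATA:
    case REGION_HEAP:
    case REGION_STACK:
        return PERM_READ | PERM_WRITE;
    }
    return 0; /* unknown region type: no access */
}
```

No default region is both writable and executable, so injected data cannot be run as code unless a process explicitly asks for such pages.
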

The arrangement of these pages in virtual memory is as follows.



Figure: User-Accessible Process Pages (processvm.png)

The red blocks are guard pages (no access allowed) for accident prevention. The heap may or may not immediately follow the guard page after the stack, to allow for heaps that are larger than 2GB (by placing them after the 4GB mark). If the heap does not immediately follow the guard page after the stack, it must be preceded by its own guard page.


Each process also has its own page table tree and reverse page tables. These sets, although in different parts of physical memory, occupy the same virtual memory space, as described in the Page Memory Management section.
New processes are created with one thread starting at the main code entry point. This thread may have default properties, or some of these properties can be specified to CreateProcess. Each new thread will have its own stack with a guard page on each side.
For details on process management on x86-64, see (9).

Thread Management


Support for many threads in PwnOS requires careful use of the Global Descriptor Table (GDT). The layout of the GDT entries is as follows.

Figure: Global Descriptor Table (GDT) (gdt.png)

The careful use relates to the Task State Segment (TSS) descriptors. Only one CPU can be running a given task (thread) at a time, and the number of tasks (including idle tasks) may exceed the maximum number of GDT entries (8,192). Also, the Thread Scheduler must be in a task separate from all others to effectively make use of the built-in task switching and state saving.


The solution is to have a single TSS descriptor for each processor, plus one for the Thread Scheduler. When a CPU is to switch threads, for any reason, the following occurs.

  1. (All switches to the Thread Scheduler must be done from PL0, including LAPIC timer handlers, so being in PL0 is assumed.)

  2. Thread disables interrupts (if not already disabled).

  3. Thread does FXSAVE to save its extended state.

  4. Thread sets its own status information to indicate why and when it is going to the Thread Scheduler.

  5. Thread spinlocks for access to the Thread Scheduler (since little time will be spent in it, and switching tasks is required for more elaborate synchronisation).

  6. Thread switches tasks to the Thread Scheduler. (The general CPU state is automatically saved in the TSS.)

  7. Thread Scheduler stops APIC timer for the previous thread’s timeout if it wasn’t already stopped.

  8. Thread Scheduler selects a thread to run.

  9. Thread Scheduler writes the descriptor for that thread’s TSS to the GDT entry for the current processor.

  10. Thread Scheduler does FXRSTOR to restore the extended state for the next thread.

  11. Thread Scheduler sets new thread status information to indicate that it is running.

  12. Thread Scheduler starts APIC timer for the new thread’s timeout.

  13. Thread Scheduler switches to the new thread’s task. (The general CPU state is automatically restored.)

  14. New thread releases the lock on Thread Scheduler. (Since this is always in PL0 code, this is not dependent on the application.)

  15. New thread enables interrupts (if returning to PL3).
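
The spinlock used in steps 5 and 14 can be sketched with a single atomic flag. This is a minimal illustration using C11 atomics, not the PwnOS implementation; the names `spinlock`, `spinlock_acquire`, and so on are hypothetical. Because the flag records no owner, it can be taken by the outgoing thread and released by the incoming thread, as the steps require.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A spinlock with no owner tracking: whoever clears the flag releases it. */
typedef struct {
    atomic_flag held;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was obtained. */
bool spinlock_try_acquire(spinlock *l)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is obtained (interrupts are assumed to be disabled). */
void spinlock_acquire(spinlock *l)
{
    while (!spinlock_try_acquire(l)) {
        /* busy-wait; a kernel would typically execute PAUSE here */
    }
}

/* Release the lock; may be called by a different thread than the acquirer. */
void spinlock_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```
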

This approach ensures proper and efficient functionality for even very large numbers of tasks. The structure used for threads encompasses the TSS for the thread and the extended state saved by FXSAVE, making the most efficient use of the structures built into the CPU, instead of reorganising the data therein. However, the Thread Scheduler does not use any extended state, and has no independent execution context, so it is not a full thread; it only needs a TSS. These structures, along with all scheduling data, are kept on the PL0 heap.


Each thread also contains information on its current status, priority, any lock, notification, or I/O operation or device that it might be waiting for, and the time of the last status change. This allows the Thread Scheduler to implement any number of a wide variety of scheduling algorithms since it has enough information to make good decisions. For example, suppose that one thread has access to a device but is not waiting for an I/O operation to complete, and another thread of higher priority is waiting for access to the device. The first thread can be given a priority boost (possibly just temporarily) so that the higher priority thread is not left waiting too long. A similar situation occurs with locks, but this is discussed in the Synchronisation section.
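
To show how the per-thread information might feed a scheduling decision, the sketch below picks the ready thread with the highest priority, breaking ties in favour of the thread whose status changed longest ago. It is one possible policy among many, not the PwnOS algorithm; the structure layout, the convention that larger numbers mean higher priority, and the function name are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { THREAD_READY, THREAD_RUNNING, THREAD_WAITING } thread_status;

/* Hypothetical subset of the per-thread scheduling data. */
typedef struct {
    thread_status status;
    int           priority;     /* larger value = more urgent (assumed) */
    uint64_t      last_change;  /* time of the last status change */
} thread_info;

/* Index of the ready thread to run next, or -1 if none is ready.
 * Highest priority wins; ties go to the thread that became ready first. */
int select_next_thread(const thread_info *threads, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        const thread_info *t = &threads[i];
        if (t->status != THREAD_READY)
            continue;
        if (best < 0 ||
            t->priority > threads[best].priority ||
            (t->priority == threads[best].priority &&
             t->last_change < threads[best].last_change))
            best = (int)i;
    }
    return best;
}
```

A priority boost of the kind described above then amounts to raising the `priority` field of the thread holding the device before the next selection.
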
All I/O interrupts will be assigned to the bootstrap processor, so that preference can be given to the other processors when scheduling higher priority threads, for example. This is done using the I/O APIC’s software interface.
Thread time slice timeouts are implemented using the Local APIC (LAPIC) timer on each CPU. Both the Programmable Interval Timer (PIT) and the CMOS Timer go through the I/O APIC, and so they cannot be used for an arbitrary number of CPUs concurrently. The LAPIC timer is local to each CPU, and so does not need intervention from another CPU to work for thread scheduling. The handler for the LAPIC timer is in PL0, and it simply spinlocks for access to the Thread Scheduler, calls the Thread Scheduler, then after returning from the Thread Scheduler (the next time that this thread is run), it releases Thread Scheduler access.
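
The value loaded into the LAPIC timer for a time slice follows from the timer's input frequency and its divider. The sketch below assumes the frequency has already been measured (it is not fixed by the architecture) and that the divider is one of the powers of two the LAPIC supports; the function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Initial count for a one-shot LAPIC timer interval.
 *   timer_hz : measured timer input frequency in Hz (assumed known)
 *   divider  : configured divide value (1, 2, 4, ... 128)
 *   slice_us : desired time slice in microseconds
 * The count register is 32 bits wide, so the result is clamped. */
uint32_t lapic_initial_count(uint64_t timer_hz, uint32_t divider,
                             uint32_t slice_us)
{
    if (divider == 0)
        return 0; /* invalid divider: no interval */
    uint64_t ticks = (timer_hz / divider) * slice_us / 1000000u;
    return ticks > UINT32_MAX ? UINT32_MAX : (uint32_t)ticks;
}
```

For instance, with a 100 MHz input, a divider of 16, and a 10 ms slice, the timer decrements 6,250,000 times per second, so the initial count is 62,500.
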
For more information on task switching, LAPICs, and the I/O APIC, see (9) and (10).

Device I/O


Although it is planned that PwnOS will support PCI and USB protocols (as given by (16) and (15)), the abstractions for these protocols have not yet been designed. As such, they are not discussed extensively in this document.
The driver for ATA devices (e.g. hard drives) supports the following operations.

  • Reading sectors with 28-bit and 48-bit addressing (with DMA once PCI is supported)

  • Writing sectors with 28-bit and 48-bit addressing (same)

  • Device identification

  • Removable media identification

In order to support these operations, the driver strictly follows the protocols presented in (13).
Reading and writing of sectors is done with blocking DMA I/O, and as such, they are followed by an I/O interrupt indicating that the operation has finished and that the requesting thread can be run again. Device identification and removable media identification are done with programmed I/O to avoid the overhead of setting up DMA, so the I/O interrupt is not needed.
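
The choice between the 28-bit and 48-bit addressing listed above depends on the starting sector and the transfer length. The sketch below is one plausible way a driver could decide, preferring the shorter 28-bit commands when the request fits; the names are hypothetical and the policy is an assumption, though the limits (256 sectors per 28-bit command, 65,536 per 48-bit command) come from the ATA command format.

```c
#include <stdint.h>

#define LBA28_SECTOR_LIMIT ((uint64_t)1 << 28) /* sectors addressable in 28 bits */
#define LBA28_MAX_COUNT    256u                /* 8-bit count field, 0 means 256 */
#define LBA48_SECTOR_LIMIT ((uint64_t)1 << 48) /* sectors addressable in 48 bits */
#define LBA48_MAX_COUNT    65536u              /* 16-bit count field, 0 means 65536 */

typedef enum { ATA_ADDR_INVALID, ATA_ADDR_LBA28, ATA_ADDR_LBA48 } ata_addr_mode;

/* Pick the addressing mode for a transfer of `count` sectors starting at `lba`. */
ata_addr_mode ata_choose_mode(uint64_t lba, uint32_t count)
{
    if (count == 0 || count > LBA48_MAX_COUNT)
        return ATA_ADDR_INVALID;
    if (lba >= LBA48_SECTOR_LIMIT || count > LBA48_SECTOR_LIMIT - lba)
        return ATA_ADDR_INVALID;
    if (count <= LBA28_MAX_COUNT && lba + count <= LBA28_SECTOR_LIMIT)
        return ATA_ADDR_LBA28;
    return ATA_ADDR_LBA48;
}
```
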
The driver for PS/2 (and the driver for USB keyboards and mice) supports the following operations.

  • Receive key press/release

  • Receive mouse button press/release

  • Receive mouse movement data

  • Receive mouse scroll data

In order to support these operations, the driver is based upon information presented in (14) and much testing.
All of the PS/2 driver operations are interrupt-driven input operations. The API for the I/O module of PwnOS allows threads to register listeners in PL3 for these input events. These threads may be given a temporary priority boost to quickly handle the user input.
Graphics in PwnOS is done using a fixed linear frame buffer, as set up by the boot loader using the VBE functions (17). As such, to have the ability to display graphics output, a process need only have the linear frame buffer’s pages present in its virtual memory. The GetGraphicsAccess and ReleaseGraphicsAccess API functions just allocate and deallocate these pages, so in a sense, they are more closely related to the Page Memory module than the I/O module.
Separate Ethernet drivers are required for each type of Ethernet card, and due to lack of standardisation, these drivers may be very different from each other. However, they must all provide the Internet Protocol (IP) abstraction for the TCP driver (and UDP driver, if present). Likewise, the TCP driver must manage its abstraction for reading and writing over TCP connections (see the Files section).
Other useful references on devices and their configuration are (18) and (11).

Files


The NTFS filesystem is the only hard disk storage filesystem supported by PwnOS. This decision was made based on its quality of documentation compared to other filesystems, its performance, its extensibility, and its compatibility with Windows. Details on NTFS can be found at (19), so searching and manipulating the NTFS data structures is beyond the scope of this document.
Caching of Master File Table (MFT) entries for open files, and for recently used directories/files takes place, in addition to read/write caching of file clusters. The number of clusters cached or prefetched for reading, or cached for writing, depends on the size and number of previous requests for the file. Write caching is done until either some number of clusters has been filled, the next write is outside the cache, or a certain amount of time has passed.
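
The three conditions that end write caching can be written as a single predicate. The sketch below is illustrative: the structure, the function `should_flush`, and the idea of passing the thresholds as parameters are assumptions, since the document does not fix the cluster limit or the timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state of one file's write cache. */
typedef struct {
    uint64_t first_cluster; /* first cluster covered by the cache window */
    uint32_t capacity;      /* number of clusters the window covers */
    uint32_t filled;        /* clusters holding data not yet written */
    uint64_t oldest_write;  /* time of the oldest unwritten data */
} write_cache;

/* True if cached writes should go to disk before accepting a write
 * to `next_cluster` at time `now`. */
bool should_flush(const write_cache *c, uint64_t next_cluster, uint64_t now,
                  uint32_t fill_limit, uint64_t max_age)
{
    if (c->filled == 0)
        return false;                 /* nothing is pending */
    if (c->filled >= fill_limit)
        return true;                  /* enough clusters have been filled */
    if (next_cluster < c->first_cluster ||
        next_cluster >= c->first_cluster + c->capacity)
        return true;                  /* next write falls outside the cache */
    return now - c->oldest_write >= max_age; /* data has waited too long */
}
```
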
Use of file path names in PwnOS is similar (and in some cases, identical) to that of Windows. PwnOS accepts either “/” or “\” as a directory separator, and paths are case-insensitive Unicode strings, as the following examples show.


Path                                   Meaning
-----------------------------------    -------------------------------------------------------------
hd00:\PwnOS\Core.bin                   harddrive 0, partition 0, directory “PwnOS”, file “Core.bin”
hd00:PwnOS/Core.bin                    same as above
hd00://PwnOS\Core.bin                  same as above
hd00:\\\pwnos\core.Bin                 same as above
hd312:\MyDirectory\SubDir\Cool.doc     harddrive 3, partition 1 (extended), subpartition 2, directory “MyDirectory”, subdirectory “SubDir”, file “Cool.doc”
http://www.codecortex.com/index.php    HTTP protocol, domain “www.codecortex.com”, file “index.php”
http:www.codecortex.com\index.php      same as above
usb00:\MyFile.txt                      USB port 0, partition 0, file “MyFile.txt”
\PwnOS\Core.bin                        root of partition of current directory, directory “PwnOS”, file “Core.bin”
/PwnOS                                 root of partition of current directory, directory “PwnOS”
Core.bin                               current directory, file “Core.bin”
PwnOS\Core.bin                         current directory, subdirectory “PwnOS”, file “Core.bin”
C:\PwnOS\Core.bin                      partition mapped to “C”, directory “PwnOS”, file “Core.bin”
tcp:\127.0.0.1:1234                    TCP protocol, IP address 127.0.0.1, port 1234
prog:\MyProgram\MyPipe                 Pipe (program) protocol, virtual directory “MyProgram”, virtual file “MyPipe”
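
The equivalences in the examples above suggest a canonical form: one separator character, no repeated separators, a separator after the device or protocol prefix, and a single letter case. The sketch below shows one way to compute it. It is not the PwnOS implementation: the function names and the 260-character buffer are hypothetical, and it lowercases ASCII only, whereas real Unicode case folding is more involved.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_sep(char c) { return c == '/' || c == '\\'; }

/* Write a canonical form of `in` to `out` (capacity `cap` bytes).
 * Returns false if `out` is too small. ASCII-only case folding. */
bool normalize_path(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    bool prev_sep = false, seen_colon = false, seen_sep = false;

    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i], emit;
        if (is_sep(c)) {
            seen_sep = true;
            if (prev_sep)
                continue;            /* collapse repeated separators */
            emit = '\\';
            prev_sep = true;
        } else {
            emit = (char)tolower((unsigned char)c);
            prev_sep = false;
        }
        if (n + 1 >= cap)
            return false;
        out[n++] = emit;
        /* The first colon before any separator ends the device or
         * protocol prefix; always follow it with one separator. */
        if (c == ':' && !seen_colon && !seen_sep) {
            seen_colon = true;
            if (n + 1 >= cap)
                return false;
            out[n++] = '\\';
            prev_sep = true;
        }
    }
    if (cap == 0)
        return false;
    out[n] = '\0';
    return true;
}

/* True if two path strings name the same location after normalisation. */
bool same_path(const char *a, const char *b)
{
    char na[260], nb[260];   /* hypothetical maximum path length */
    return normalize_path(a, na, sizeof na) &&
           normalize_path(b, nb, sizeof nb) &&
           strcmp(na, nb) == 0;
}
```

With this sketch, the first four rows of the examples compare equal, as do the two HTTP forms, while a relative path and a rooted path remain distinct.
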

Virtual files will be supported for abstractions of network communication protocols (such as HTTP, FTP, TCP), and for pipes. Both network abstractions and pipes work very similarly, except that in the case of pipes, both ends of the virtual file are accessed by local programs, and in the case of a network abstraction, (usually) only one end of the virtual file is accessed by a local program. The network abstractions also depend on the I/O module, whereas pipes do not. Both pipes and network abstractions are fully cached, i.e. all data not yet retrieved/sent is kept in memory buffers (which may increase/decrease in size).


The data structures used by the filesystem management code are kept in the PL0 heap.

Synchronisation


The synchronisation module of PwnOS provides mechanisms for mutual exclusion and other coordination of multiple threads. The most important of these mechanisms is the lock, and a wait-notify queue mechanism is also provided.
A lock has two main operations: get and release. It keeps track of which thread currently has access to it (if any), and which threads are waiting to have access to it (if any). Both get and release have a case where they can be done entirely from PL3, and another case where they must be done in PL0. The get operation goes as follows.

  1. If the current thread already has the lock, return.

  2. Atomically, do the following (using LOCK CMPXCHG16B):

    1. If no thread currently has access and no thread has exclusive access to the list of waiting threads,

      1. Claim access for the current thread

  3. If the current thread gained access, return.

  4. Switch to PL0.

  5. Disable interrupts.

  6. Spinlock for exclusive access to list of threads waiting for access.

  7. If the lock was released from PL3 before exclusive access was obtained (can only happen if this is the first thread to go in the list),

    1. Claim access for the current thread (since no other thread can claim it while this one has exclusive access to the list)

    2. Release access to the list of waiting threads.

  8. Else,

    1. Add current thread to list of threads waiting for access.

    2. Set the current thread status to indicate that this thread is waiting for access to this lock.

    3. Release access to list of threads waiting for access.

    4. Go to the Thread Scheduler (see Thread Management section). Upon returning, this thread will have gotten the lock.

  9. Enable interrupts.

  10. Return to PL3, then return to application.

In the cases where either the current thread already has the lock or the lock is free, the operation can complete without switching to PL0. Otherwise, the operation must be done in PL0. Similarly, the release operation is as follows.



  1. If the current thread does not have the lock, fail.

  2. Atomically do the following (using LOCK CMPXCHG16B):

    1. If the list of threads waiting for access is empty and no thread has exclusive access to the list,

      1. Set no threads to currently have the lock.

  3. If the lock just got released by the atomic operation, return.

  4. Switch to PL0.

  5. Disable interrupts.

  6. Spinlock for exclusive access to list of threads waiting for access.

  7. Remove the next thread to run from the list of waiting threads.

  8. Give that thread the claim to the lock’s access.

  9. Set that thread’s status to reflect that it is now able to resume running.

  10. Release access to the list of waiting threads.

  11. Enable interrupts.

  12. Return to PL3, then return to application.

The order of these steps is absolutely critical to their proper execution. Changing the order or function of these steps could break certain cases, and the explanation of how would be too long and detailed for this document. These operations would be much simpler if done completely in PL0, but allowing the operations to occur in PL3 for the very common cases (getting a free lock, and releasing an unwatched lock) can yield a large performance benefit where locks are used often.
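
The two PL3 fast paths (steps 1 to 3 of each operation) can be sketched with a single compare-and-exchange on a packed lock word. This is a simplified illustration, not the PwnOS code: it packs the state into 64 bits with C11 atomics rather than using the 16-byte LOCK CMPXCHG16B named above, the bit layout and all names are hypothetical, and thread id 0 is reserved to mean "unowned". The PL0 slow paths are omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Packed lock word (hypothetical layout):
 *   bits  0-31  owner thread id (0 = unowned)
 *   bits 32-62  number of waiting threads
 *   bit  63     set while a thread has exclusive access to the waiter list */
typedef struct {
    _Atomic uint64_t word;
} pl3_lock;

static uint32_t lock_owner(uint64_t w) { return (uint32_t)w; }

/* Fast path of "get". Returns true if `tid` holds the lock on return;
 * false means the caller must continue with the PL0 steps. */
bool lock_get_fast(pl3_lock *l, uint32_t tid)
{
    uint64_t cur = atomic_load_explicit(&l->word, memory_order_relaxed);
    if (lock_owner(cur) == tid)
        return true;                    /* already the owner */
    uint64_t expected = 0;              /* unowned, no waiters, list idle */
    return atomic_compare_exchange_strong_explicit(
        &l->word, &expected, (uint64_t)tid,
        memory_order_acquire, memory_order_relaxed);
}

typedef enum { RELEASE_DONE, RELEASE_NEEDS_PL0, RELEASE_NOT_OWNER } release_result;

/* Fast path of "release". Succeeds only if nobody is waiting and
 * nobody is modifying the waiter list. */
release_result lock_release_fast(pl3_lock *l, uint32_t tid)
{
    uint64_t cur = atomic_load_explicit(&l->word, memory_order_relaxed);
    if (lock_owner(cur) != tid)
        return RELEASE_NOT_OWNER;       /* caller does not hold the lock */
    uint64_t expected = (uint64_t)tid;  /* owned by tid, no waiters, list idle */
    if (atomic_compare_exchange_strong_explicit(
            &l->word, &expected, 0,
            memory_order_release, memory_order_relaxed))
        return RELEASE_DONE;
    return RELEASE_NEEDS_PL0;           /* waiters exist or list is busy */
}
```

In both operations the common case costs one atomic instruction and no privilege-level switch, which is the performance benefit the text describes.
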


The other, much simpler synchronisation mechanism provided by PwnOS is the wait-notify queue. A thread indicates that it wants to be notified of something after waiting in line to be notified. This can be used for larger constructs such as a simple inter-thread message coordination system. Since waiting requires being in PL0 anyway, and notifying requires modifying the shared data structure, both must have some component executing in PL0 in order to avoid problems. Thus, for simplicity, they both are completely implemented in PL0. This makes the operations so much simpler than getting and releasing locks that their steps are not discussed here.
Because PwnOS is aware of the constructs that applications will use for mutual exclusion, some problems can be averted or at least identified. For example, cycles in the graph of locks owned by threads and threads waiting for locks represent deadlocks, and the Thread Scheduler has full access to this graph. Because deadlock cycles are almost always very small (2 or 3 locks in each cycle), and because few threads at any given time would be both waiting for a lock and owning a lock, detection is an inexpensive operation that could be performed periodically but not often (~1 minute to 1 hour) without a significant performance hit.
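
Since a thread waits for at most one lock and a lock has at most one owner, deadlock detection reduces to following a chain: from a thread to the lock it waits for, to that lock's owner, and so on. The sketch below checks whether a given thread lies on such a cycle. The data layout and names are assumptions for illustration; the document does not specify how the Thread Scheduler stores this graph.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_ID (-1)

/* Hypothetical flat view of the lock/thread graph. */
typedef struct {
    const int *waiting_for;  /* per thread: lock it waits for, or NO_ID */
    size_t     thread_count;
    const int *owner;        /* per lock: thread that owns it, or NO_ID */
} wait_graph;

/* True if `start` is part of a cycle of threads waiting on each other. */
bool thread_is_deadlocked(const wait_graph *g, int start)
{
    int cur = start;
    /* A cycle through `start` visits each thread at most once, so
     * thread_count steps are enough; the bound also guarantees
     * termination if the chain runs into a cycle that excludes `start`. */
    for (size_t step = 0; step < g->thread_count; step++) {
        int lock = g->waiting_for[cur];
        if (lock == NO_ID)
            return false;    /* chain ends at a runnable thread */
        int next = g->owner[lock];
        if (next == NO_ID)
            return false;    /* lock is free; the waiter will get it */
        if (next == start)
            return true;     /* followed the chain back to the start */
        cur = next;
    }
    return false;
}
```

Each check is linear in the length of the chain, which fits the observation that cycles are short and that few threads both own a lock and wait for another.
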
The Thread Scheduler can also take into account the knowledge that, for example, 20 threads are waiting for a lock owned by a particular thread, so the owning thread should be given a priority boost (at least temporarily) to release the lock sooner. It is not necessary that all such information be used, but it opens up possibilities for a more intelligent scheduler.

References


1. Dickson, Neil. PwnOS Code Documentation. [Online] August 26, 2007. [Cited: October 16, 2007.] http://www.neildickson.com/os/documentation/.

2. ITRON Committee, TRON Association. μITRON4.0 Specification. Tokyo, Japan : TRON Association, 2002. 4.00.00.

3. International Data Corporation. HP-UX: A Foundation for Enterprise Workloads. s.l. : IDC, 2007. #206607.

4. Silicon Graphics, Inc. Cellular IRIX™ 6.4 Technical Report. 1996.

5. MINIX 3: A Highly Reliable, Self-Repairing Operating System. Jorrit N. Herder, Herbert Bos, et al. July 2006, s.l. : Operating Systems Review, 2006.

6. Sun Microsystems. Reference Materials. Solaris Operating System. [Online] November 2007. [Cited: November 3, 2007.] http://www.sun.com/software/solaris/reference_resources.jsp.

7. Scalability of Microkernel-Based Systems. Uhlig, Volkmar. June 2005, s.l. : Operating Systems Review, 2005.

8. Robert V. Baron, David Black, et al. Mach Kernel Interface Manual. 1990.

9. Intel Corporation. Intel® 64 and IA-32 Architectures Software Developer's Manuals. Intel. [Online] May 2007. [Cited: August 13, 2007.] http://www.intel.com/products/processor/manuals/. 253665-253669.

10. —. 82093AA I/O Advanced Programmable Interrupt Controller (I/O APIC) Datasheet. 1996. 29056601.

11. Hewlett-Packard Company, Intel Corporation, et al. Advanced Configuration and Power Interface Specification, Revision 3.0. 2004.

12. Microsoft Corporation. Microsoft Portable Executable and Common Object File Format Specification, Revision 8.0. 2006.

13. Technical Committee T13. AT Attachment with Packet Interface - 6 (ATA-ATAPI-6). 2002. 1410D.

14. Hyde, Randall. Chapter 20 - The PC Keyboard. The Art of Assembly Language Programming, DOS 16-bit Edition. 2000.

15. Compaq Computer Corporation, Hewlett-Packard Company, et al. Universal Serial Bus Specification, Revision 2.0. 2000.

16. PCI Special Interest Group. PCI Local Bus Specification, Revision 2.2. 1998.

17. Video Electronics Standards Association. VESA BIOS Extension (VBE) Core Functions Standard, Version 3.0. 1998.

18. Gook, Michael. PC Hardware Interfaces. Wayne, Pennsylvania : A-List Publishing, 2004. 193176929X.

19. Richard Russon, Yuval Fledel. NTFS Documentation. Linux-NTFS. [Online] 2005. [Cited: October 21, 2007.] http://data.linux-ntfs.org/ntfsdoc.pdf.


Neil Dickson, PwnOS Design Document