Jeffrey Rodriguez Professor Seaver



Hyper Threading
By

Jeffrey Rodriguez


Professor Seaver

CST 123


December 6, 2004

Hyper Threading

Hyper Threading is Intel’s implementation of simultaneous multithreading on Pentium 4 processors. It allows multiple threads to execute at the same time on one processor. Hyper Threading was first announced in the fall of 2001 and made available in early 2002. Since then it has become widely popular on desktop PCs. According to Intel, Hyper Threading can increase performance by up to 30% over an identical processor without it.

Originally codenamed ‘Jackson’, Hyper Threading was first announced at the annual Intel Developer Forum in 2001. Intel was not, however, the first company to develop simultaneous multithreading. In 1999, at the Microprocessor Forum in San Jose, CA, Compaq announced it had achieved just that with its EV-8 Alpha processor. Unfortunately, the project was terminated prematurely and the processor was never made available. The technology was brought back and improved when Intel introduced it in its Xeon line of processors in 2002. In November of 2002, Hyper Threading was brought to the desktop PC market. The 3.06-gigahertz (GHz) Pentium 4 was the first desktop processor to support Hyper Threading.

To understand hyper threading you must first understand the basics of how a processor works. A diagram of a very basic CPU can be found in Appendix A. For example, let us use a program that will add 7 and 10, and store the result in the accumulator.

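MVI A, 7    ; move the immediate value 7 into the accumulator
ADI 10      ; add the immediate value 10 (0A hexadecimal) to the accumulator
HLT         ; halt the processor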


The program is stored in RAM. In hexadecimal code, it looks like 3E, 07, C6, 0A, 76. First, the CPU fetches the first byte, 3E, and stores it in a data register. The instruction decoder then decodes it. It recognizes it as a 'move immediate' instruction and moves the next byte, 07, into the accumulator. Each time a byte is fetched from RAM, the program counter is incremented to point to the next byte to fetch. Next, C6, the next instruction, is fetched. It is recognized as an 'add immediate' instruction. The control logic then tells the arithmetic logic unit (ALU) to add the byte that follows to whatever is in the accumulator. That byte, 0A, is fetched and added to the 07 that is already in the accumulator, resulting in 11 hexadecimal, or 17 in base 10. The last instruction, 76, is then fetched, which tells the processor to halt.
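The fetch, decode, and execute steps described above can be sketched in a few lines of Python. This is only an illustrative toy, not a real 8085 emulator: it models just the three opcodes used in the example, ignores the flag register, and the names `run`, `MVI_A`, `ADI`, and `HLT` are chosen here for readability.

```python
# Toy fetch-decode-execute loop for the three-instruction program above.
# Only the opcodes used in the example are modeled; flags are ignored.

MVI_A = 0x3E  # move immediate byte into the accumulator
ADI = 0xC6    # add immediate byte to the accumulator
HLT = 0x76    # halt


def run(memory):
    """Execute the program in `memory` and return the final accumulator value."""
    pc = 0           # program counter: address of the next byte to fetch
    accumulator = 0
    while True:
        opcode = memory[pc]   # fetch the instruction byte
        pc += 1
        if opcode == MVI_A:   # decode, then execute
            accumulator = memory[pc]
            pc += 1
        elif opcode == ADI:
            accumulator = (accumulator + memory[pc]) & 0xFF  # keep to 8 bits
            pc += 1
        elif opcode == HLT:
            return accumulator
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x} at address {pc - 1}")


program = [0x3E, 0x07, 0xC6, 0x0A, 0x76]
result = run(program)
print(f"accumulator = {result:#04x} hex = {result} decimal")
# prints: accumulator = 0x11 hex = 17 decimal
```

Note that the program counter advances once for every byte fetched, whether that byte is an opcode or the operand that follows it.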

Modern processors are much more complicated and have many more registers than the one used in the above example. A diagram of an Intel Pentium 4 processor can be found in Appendix B. You can see the differences between a basic CPU and a modern one by comparing the two diagrams.


Single Threaded CPU

http://arstechnica.com/paedia/images/figure-1.html


The diagram on the right represents a single threaded processor. The colored boxes in RAM signify different threads waiting to be executed. The ‘front end’ section inside the CPU is where instructions are fetched, decoded, and re-ordered. The ‘Execution Core’ is where the instructions are executed.

With this type of processor, only one thread may be executing at once, represented by the red blocks in the CPU.

Also, notice the empty blocks. These blocks, called pipeline bubbles, are where the CPU was unable to do any useful work. There are many reasons why this happens, including instructions decoded improperly or threads not ready to be executed. These empty slots cannot be recovered and remain for the rest of the process's execution.


Single threaded SMP

http://arstechnica.com/paedia/images/figure-2.html
One solution to speed up execution is to have multiple processors. For each processor we have, another thread can execute at the same time. This is called Symmetric Multiprocessing (SMP).

In the diagram above, each CPU can access RAM and is executing a different thread. The biggest problem with this solution is the number of empty boxes, or pipeline bubbles. While adding more CPUs increases performance, it does not improve efficiency.

To help alleviate the problem of pipeline bubbles, a CPU must be able to execute more than one thread at once. Such a CPU is said to be multithreaded. One method of doing this is called super threading.

Super threaded CPU

http://arstechnica.com/paedia/images/figure-3.html

The diagram on the right illustrates this technique. First, notice that there are fewer pipeline bubbles. Right away this improves the efficiency of the processor. Also, notice the arrows to the left of the diagram. These arrows emphasize how the processor can mix instructions from different threads. Each processor pipeline can only hold instructions from one thread. The CPU, however, can execute multiple pipelines each clock cycle. This allows multiple threads to execute with each CPU clock cycle.


Hyper Threading, or Simultaneous Multithreading (SMT), takes this idea even further. It allows instructions from different threads to share the same pipeline. This minimizes the number of bubbles and maximizes CPU efficiency.
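The difference between the three approaches can be illustrated with a toy model that counts wasted issue slots. Everything in this sketch is an assumption made for illustration: the issue width of four, the per-cycle counts of ready instructions, and the function names are all hypothetical, and instructions that are not issued in a cycle are simply dropped rather than carried over. The super threading policy here picks, on each cycle, the single thread with more ready instructions.

```python
# Toy model: count wasted issue slots ("pipeline bubbles") under three policies.
# ISSUE_WIDTH and the ready counts below are made-up illustration values.

ISSUE_WIDTH = 4  # hypothetical number of instructions issued per cycle

# Instructions each thread has ready in each of six cycles (hypothetical).
thread_a = [2, 0, 3, 1, 0, 2]
thread_b = [1, 3, 0, 2, 4, 1]


def bubbles_single(ready):
    """Single threaded: only one thread runs, so unused slots are lost."""
    return sum(ISSUE_WIDTH - min(r, ISSUE_WIDTH) for r in ready)


def bubbles_super(a, b):
    """Super threading: each cycle issues from one thread only (the fuller one)."""
    return sum(ISSUE_WIDTH - min(max(ra, rb), ISSUE_WIDTH) for ra, rb in zip(a, b))


def bubbles_smt(a, b):
    """SMT: instructions from both threads may share the same cycle."""
    return sum(ISSUE_WIDTH - min(ra + rb, ISSUE_WIDTH) for ra, rb in zip(a, b))


print("single threaded:", bubbles_single(thread_a))           # 16
print("super threaded: ", bubbles_super(thread_a, thread_b))  # 8
print("SMT:            ", bubbles_smt(thread_a, thread_b))    # 5
```

With these made-up numbers, out of 24 available slots the single threaded case wastes 16, super threading wastes 8, and SMT wastes 5. The exact figures mean nothing on their own; only the ordering reflects the argument in the text.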


Hyper Threaded CPU

http://arstechnica.com/paedia/images/figure-4.html


This is the biggest strength of Hyper Threading. It allows one CPU to take on work that would otherwise be spread across two CPUs, and to do so with greater efficiency. To achieve this, a hyper threaded CPU is divided into two logical CPUs. Each logical CPU has its own architectural state, which includes some general purpose registers, control registers, the program counter, the advanced programmable interrupt controller (APIC), and some machine state registers. Other resources, such as cache, control logic, and buses, are shared by the two logical processors. Once the architectural state is duplicated, the operating system sees two processors.
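One simple way to observe this from software is to ask the operating system how many processors it sees. The short sketch below uses Python's standard `os.cpu_count()`, which reports logical processors rather than physical ones. On a single Pentium 4 with Hyper Threading enabled, one would expect it to report 2, though that expectation is an assumption and was not part of the tests described in this paper.

```python
import os

# os.cpu_count() reports logical processors; it may return None if unknown.
logical_cpus = os.cpu_count()

if logical_cpus is None:
    print("Logical processor count is not available on this platform.")
else:
    print(f"The operating system reports {logical_cpus} logical processor(s).")
```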

The operating system can schedule processes on both logical processors as if they were two physical processors. This can greatly increase performance, up to thirty percent, according to Intel. Many people have tested hyper threading technology on their own and come to their own conclusions. I, too, have done my own tests.

For the tests, I used my current PC. Complete specifications of the test computer can be found in Appendix C. To perform the tests, I used PCMark 2004 v1.2. First, I restarted the PC and changed the BIOS configuration to disable hyperthreading. The PC then started up. I then stopped all processes that run automatically on startup. This left a total of 23 processes running that are part of Windows XP. I then ran the testing software. The same procedure was used for testing with hyperthreading enabled. Both tests were performed twice on different days.

After the first round of testing, there was an overall improvement of 12.3% with Hyper Threading enabled. More specifically, there was a 16.9% improvement in the CPU category, according to PCMark. The second round of testing showed even greater results, with an overall 13.1% improvement with Hyper Threading and an 18.5% improvement in the CPU category. Complete results can be found in Appendix D. While 13% is good, it’s clearly not the 30% that Intel claims. Perhaps the biggest performance improvement is when a user multitasks. According to some, increases of up to 47% can be seen when running two applications at once, such as a virus scan and a video encoder.
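The improvement figures can be recomputed from the overall PCMark and CPU scores listed in Appendix D. The sketch below measures the improvement relative to the score with Hyper Threading disabled and rounds to one decimal place; the function name `improvement_percent` is chosen here for illustration.

```python
def improvement_percent(ht_score, non_ht_score):
    """Percentage gain of the HT score over the non-HT score."""
    return (ht_score - non_ht_score) / non_ht_score * 100


# (HT score, non-HT score) pairs taken from Appendix D.
scores = {
    "Round 1 overall": (4861, 4329),
    "Round 1 CPU": (4804.0, 4110.0),
    "Round 2 overall": (4833, 4274),
    "Round 2 CPU": (4704.0, 3969.0),
}

for label, (ht, non_ht) in scores.items():
    print(f"{label}: {improvement_percent(ht, non_ht):.1f}%")
```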

Since Hyper Threading became available in 2002, it has become increasingly popular among home PC users. Its effective use of technology increases performance, which benefits the home user the most. Since it was incorporated into the Pentium 4 processors, the product line has grown to include processors from 2.8 GHz up to 3.8 GHz.

Appendix A – Simple CPU



8085 Microprocessor Programming, Textbook.

© 2001 Heathkit Company, Inc., Benton Harbor, Michigan.

Appendix B – Pentium 4

Hinton, Glenn, Dave Sager, Mike Upton, Darrell Boggs, Doug Carmean, Alan Kyker, and Patrice Roussel. “The Microarchitecture of the Pentium® 4 Processor.” Intel. <http://developer.intel.com/technology/itj/q12001/articles/art_2.htm>

Appendix C – System Specifications





Central Processing Unit
    Manufacturer:               Intel
    Family:                     Intel(R) Pentium(R) 4 CPU 3.20GHz
    HyperThreadingTechnology:   Available - 2 Logical Processors

Motherboard Info
    Manufacturer:               ASUSTeK Computer Inc.
    Model:                      P4C800-E
    Version:                    Rev 1.xx
    BIOS Vendor:                American Megatrends Inc.
    BIOS Version:               A M I - 9000302

Memory Info
    Total Physical Memory:      5 x 512MB DDR PC3200
    Manufacturer:               Corsair

Display Device
    Description:                ASUS A9800XT
    Manufacturer:               ATI Technologies Inc.
    Driver Version:             6.14.10.6476
    Total Local Video Memory:   256 MB

Sound Device
    Description:                SB Audigy 2 ZS Audio [DF00]
    Driver Version:             5.12.5.441
    Manufacturer:               Creative Technology, Ltd.

Hard Disk Drives
    IDE:                        Western Digital 120GB
                                Western Digital 80GB
    SATA:                       Western Digital 200GB

Operating System Info
    Operating System:           Microsoft Windows XP
    Version:                    5.1.2600
    Service Pack:               Service Pack 2

Appendix D – Benchmark Results



PCMark04 Results

Test                             HT Result 1 non-HT Result 1     HT Result 2 non-HT Result 2  Unit
PCMark                                  4861            4329            4833            4274  PCMarks
CPU                                   4804.0          4110.0          4704.0          3969.0
Memory                                4639.0          4518.0          4556.0          4558.0
Graphics                              4430.0          4454.0          4440.0          4406.0
HDD                                   3851.0          3182.0          3443.0          3428.0
File Compression                         5.5             4.1             5.4             4.0  MB/s
File Encryption                         51.8            45.6            51.1            44.3  MB/s
File Decompression                      38.0            27.1            37.8            27.5  MB/s
Image Processing                        14.3            13.2            14.6            13.4  MPixels/s
Virus Scanning                        2466.6          1565.8          2729.7          1599.8  MB/s
Grammar Check                            2.0             2.2             2.1             2.4  KB/s
File Decryption                         91.1            90.8            84.8            81.3  MB/s
Audio Conversion                      2827.2          2819.9          2814.0          2814.9  KB/s
Web Page Rendering                       5.6             5.5             5.6             5.4  Pages/s
WMV Video Compression                   56.2            49.6            52.0            46.4  FPS
DivX Video Compression                  63.3            55.2            62.9            51.7  FPS
Physics Calculation and 3D             180.5           173.2           176.0           178.6  FPS
Graphics Memory - 64 lines            2710.1          2712.6          2632.3          2628.7  FPS
File Compression                         5.4             4.0             5.4             4.0  MB/s
File Encryption                         50.6            44.9            49.9            37.8  MB/s
File Decompression                      38.1            27.1            38.3            27.4  MB/s
Image Processing                        14.7            13.2            14.4            13.2  MPixels/s
Grammar Check                            4.3             4.2             4.3             4.3  KB/s
File Decryption                         88.1            88.4            81.8            69.6  MB/s
Audio Conversion                      2814.4          2820.8          2816.2          2828.4  KB/s
WMV Video Compression                   56.0            49.6            56.1            49.3  FPS
DivX Video Compression                  63.4            41.9            57.8            45.0  FPS
Raw Block Read - 8 MB                 4759.4          4522.9          4801.0          4793.5  MB/s
Raw Block Read - 4 MB                 4500.1          4312.4          4830.2          4771.3  MB/s
Raw Block Read - 192 KB              24224.0         24159.4         22637.9         21438.6  MB/s
Raw Block Read - 4 KB                45175.2         44512.6         45484.5         42864.6  MB/s
Raw Block Write - 8 MB                3986.3          3991.8          3992.6          3996.5  MB/s
Raw Block Write - 4 MB                3985.1          3991.7          4001.7          4002.2  MB/s
Raw Block Write - 192 KB             13867.3         13849.1         13894.0         13893.3  MB/s
Raw Block Write - 4 KB               13816.2         13797.3         13841.8         13794.1  MB/s
Raw Block Copy - 8 MB                 1593.6          1476.6          1445.0          1434.2  MB/s
Raw Block Copy - 4 MB                 1636.0          1489.2          1478.7          1474.5  MB/s
Raw Block Copy - 192 KB              11955.5         11578.4         11697.7         11571.0  MB/s
Raw Block Copy - 4 KB                13735.7         13667.5         13839.4         13828.9  MB/s
Random Access - 8 MB                  2401.4          2449.5          2588.5          2596.4  MB/s
Random Access - 4 MB                  2581.3          2480.0          2337.3          2550.5  MB/s
Random Access - 192 KB                8204.8          8098.3          7070.0          7232.5  MB/s
Random Access - 4 KB                 12518.9         12502.5         12539.6         12417.6  MB/s
Transparent Windows                   1608.6          1615.6          1609.1          1618.8  Windows/s
File Copying                            25.4            15.2            19.3            17.7  MB/s

