Microprocessors
von Neumann (Princeton) architecture
  instructions and data reside in the same memory space (same address and data buses)
  cannot fetch a new instruction and read/write data at the same time (bottleneck)
Harvard architecture
  separate memories for instructions and data
  data stored in RAM, program often stored in EEPROM with external serial load
Microprocessor history
Apollo guidance computer = 12,300 transistors (4,100 chips, one 3-input NOR gate per chip)
| Manufacturer   | Processor   | Instructions | Data | Addr | Transistors | Year | Notes                                         |
|----------------|-------------|--------------|------|------|-------------|------|-----------------------------------------------|
| Intel          | 4004        | 46           | 4b   | 12b  | 2,300       | 1971 |                                               |
| Intel          | 8008        | 48           | 8b   | 14b  | 3,500       | 1972 |                                               |
| Intel          | 8080        | 256          | 8b   | 16b  | 4,500       | 1974 | several support chips with different voltages |
| Intel          | 8085        | 256          | 8b   | 16b  | 6,500       | 1977 | single +5V supply                             |
| Intel          | 8086        |              | 16b  | 20b  | 29,000      | 1978 |                                               |
| Intel          | 8087        |              |      |      |             | 1980 | first floating point coprocessor              |
| Intel          | 8088        |              | 8b   | 20b  | 29,000      | 1979 | IBM-PC                                        |
| Intel          | 80186       |              | 16b  | 20b  | 55,000      | 1982 |                                               |
| Intel          | 80286       |              | 16b  | 24b  | 134,000     | 1983 |                                               |
| Intel          | 80386       |              | 32b  | 32b  | 275,000     | 1985 |                                               |
| Intel          | 80486       |              | 32b  | 32b  | 1,180,235   | 1989 |                                               |
| Intel          | Pentium     |              | 64b  | 32b  | 3,100,000   |      |                                               |
| Intel          | Pentium Pro |              | 64b  | 32b  | 5,500,000   |      |                                               |
| Intel          | Pentium II  |              | 64b  | 32b  | 7,500,000   |      |                                               |
| Intel          | Pentium III |              |      |      | 9,500,000   |      |                                               |
| Intel          | Pentium IV  |              |      |      | 42,000,000  |      |                                               |
| MOS Technology | 6502        | 256          | 8b   | 16b  | 3,510       | 1975 | Apple II                                      |
| Zilog          | Z80         | 256          | 8b   | 16b  | 8,500       | 1976 | 8085 clone                                    |
| Motorola       | 6800        | 72           | 8b   | 16b  | 4,100       | 1974 |                                               |
| Motorola       | 68000       |              | 16b  | 24b  | 68,000      | 1979 | Macintosh, UNIX                               |
| Motorola       | 68020       |              | 32b  | 32b  | 200,000     | 1984 | UNIX                                          |
Main functions of Central Processing Unit (CPU)
  Arithmetic logic unit (ALU)
  Registers for temporary storage
    Typically size of external data bus
      Accumulator or Working register (input and output of ALU)
      Temporary (second input to ALU)
      Flag status register (Carry, Zero, Negative, Interrupt Disable, etc.)
      Instruction buffer
      Control buffer
      Data buffer
      General purpose
    Typically size of external address bus
      Address buffer
      Program counter (PC)
      Stack pointer (SP)
  Instruction decode lookup table (LUT)
  Bus drivers
6502 CPU
Classic one byte instruction (8b internal data, 8b external data, 16b external address)
  move contents of PCL to address buffer low
  move contents of PCH to address buffer high
  set MEMR
  fetch instruction byte from memory into data buffer
  move contents of data buffer into instruction buffer
  decode instruction to set internal control lines
  increment PC by one
  typical functions
    a) arithmetic operations - result in W
      single operand on W - zero, NOT, increment/decrement, two's complement, set/clear bit, rotate/shift
      two operand on W and TEMP - add, AND, OR, XOR
    b) internal move from register source to register destination
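The one byte sequence above can be sketched in C as a rough model, not a cycle-accurate 6502. The `Cpu` struct, the opcode values, and the function names are all invented for illustration; only the order of the steps comes from the notes.

```c
#include <stdint.h>

/* Hypothetical register set, named after the register list in these notes. */
typedef struct {
    uint16_t pc;    /* program counter (PCH:PCL) */
    uint16_t addr;  /* address buffer */
    uint8_t  data;  /* data buffer */
    uint8_t  ir;    /* instruction buffer */
    uint8_t  w;     /* accumulator / working register */
    uint8_t  temp;  /* second ALU input */
    uint8_t  zero;  /* Zero flag */
} Cpu;

/* Invented opcodes for illustration; these are not real 6502 encodings. */
enum { OP_INC_W = 0x01, OP_NOT_W = 0x02, OP_ADD_TEMP = 0x03, OP_MOV_W_TEMP = 0x04 };

/* Store an ALU result in W and update the Zero flag. */
static void set_w(Cpu *cpu, uint8_t result)
{
    cpu->w = result;
    cpu->zero = (uint8_t)(result == 0);
}

/* Run one one-byte instruction; returns 0 if the opcode is unknown. */
static int step_one_byte(Cpu *cpu, const uint8_t *mem)
{
    cpu->addr = cpu->pc;         /* PCL, PCH -> address buffer low, high */
    cpu->data = mem[cpu->addr];  /* MEMR: memory -> data buffer */
    cpu->ir = cpu->data;         /* data buffer -> instruction buffer */
    cpu->pc++;                   /* increment PC by one */

    switch (cpu->ir) {           /* decode sets the internal control lines */
    case OP_INC_W:      set_w(cpu, (uint8_t)(cpu->w + 1));          break;
    case OP_NOT_W:      set_w(cpu, (uint8_t)~cpu->w);               break;
    case OP_ADD_TEMP:   set_w(cpu, (uint8_t)(cpu->w + cpu->temp));  break;
    case OP_MOV_W_TEMP: cpu->temp = cpu->w;                         break;
    default:            return 0;
    }
    return 1;
}
```

Calling `step_one_byte` repeatedly on a byte array walks PC through the program one opcode at a time. In this sketch the arithmetic cases update the Zero flag and the register move leaves it alone, which is an assumption rather than something the notes specify.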
Classic two byte instruction (8b internal data, 8b external data, 16b external address)
  fetch first byte as for one byte instruction
  decode instruction to set internal control lines
  increment PC by one
  fetch a second byte into data buffer (usually contains a constant value)
  move contents of data buffer into W (or another register)
  increment PC again
  typical functions
    load constant into register destination
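As a minimal sketch of the two byte case, the following C fragment loads a constant into W. The struct, the `OP_LOAD_W_IMM` opcode value, and the function names are hypothetical; the point is only that the second fetch uses the same path as the first and that PC ends up two bytes further on.

```c
#include <stdint.h>

/* Hypothetical minimal register set for this example. */
typedef struct {
    uint16_t pc;    /* program counter */
    uint8_t  ir;    /* instruction buffer */
    uint8_t  data;  /* data buffer */
    uint8_t  w;     /* working register */
} Cpu;

enum { OP_LOAD_W_IMM = 0x10 };  /* invented opcode, not a real encoding */

/* Fetch the byte at PC into the data buffer, then increment PC. */
static uint8_t fetch_byte(Cpu *cpu, const uint8_t *mem)
{
    cpu->data = mem[cpu->pc];
    cpu->pc++;
    return cpu->data;
}

/* Run one two-byte "load constant into W" instruction; returns 0 if the opcode is unknown. */
static int step_two_byte(Cpu *cpu, const uint8_t *mem)
{
    cpu->ir = fetch_byte(cpu, mem);   /* first byte: opcode */
    if (cpu->ir != OP_LOAD_W_IMM)
        return 0;
    cpu->w = fetch_byte(cpu, mem);    /* second byte: the constant */
    return 1;
}
```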
Classic three byte instruction (8b internal data, 8b external data, 16b external address)
  fetch first byte as for one byte instruction
  decode instruction to set internal control lines
  increment PC by one
  fetch a second byte into data buffer (usually contains low byte of an address)
  move contents of data buffer into W (or another register)
  increment PC again
  fetch a third byte into data buffer (usually contains high byte of an address)
  move contents of data buffer into TEMP (or another register)
  increment PC again
  move W into address buffer low
  move TEMP into address buffer high
  if read from memory
    set MEMR
    fetch byte from memory location into data buffer
    move data buffer into W
  if write to memory
    move contents of a register into data buffer
    set MEMW
    write data buffer into memory location
  typical functions
    a) read variable from memory
    b) write variable to memory
    c) conditional jump - JZ, JNZ, JC, JNC, JPOS, JNEG
      move W and TEMP into PCL and PCH instead of address buffer if test is true
    d) unconditional jump - JMP
    e) jump to subroutine - JSR (see below)
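The three byte sequence can be modelled the same way. This is a sketch under stated assumptions: the opcode values and names are invented, and the two address bytes are held in local variables (standing in for the "or another register" option) so that W still holds the value to be written during a store.

```c
#include <stdint.h>

/* Hypothetical register set for this example. */
typedef struct {
    uint16_t pc;    /* program counter */
    uint16_t addr;  /* address buffer */
    uint8_t  ir;    /* instruction buffer */
    uint8_t  data;  /* data buffer */
    uint8_t  w;     /* working register */
    uint8_t  zero;  /* Zero flag */
} Cpu;

/* Invented opcodes for illustration; not real 6502 encodings. */
enum { OP_LOAD_ABS = 0x20, OP_STORE_ABS = 0x21, OP_JMP = 0x30, OP_JZ = 0x31 };

/* Fetch the byte at PC into the data buffer, then increment PC. */
static uint8_t fetch_byte(Cpu *cpu, const uint8_t *mem)
{
    cpu->data = mem[cpu->pc];
    cpu->pc++;
    return cpu->data;
}

/* Run one three-byte instruction; returns 0 if the opcode is unknown. */
static int step_three_byte(Cpu *cpu, uint8_t *mem)
{
    uint8_t lo, hi;
    uint16_t target;

    cpu->ir = fetch_byte(cpu, mem);       /* byte 1: opcode */
    lo = fetch_byte(cpu, mem);            /* byte 2: low byte of address */
    hi = fetch_byte(cpu, mem);            /* byte 3: high byte of address */
    target = (uint16_t)(lo | (hi << 8));

    switch (cpu->ir) {
    case OP_LOAD_ABS:                     /* read variable from memory */
        cpu->addr = target;
        cpu->data = mem[cpu->addr];       /* MEMR */
        cpu->w = cpu->data;
        cpu->zero = (uint8_t)(cpu->w == 0);
        break;
    case OP_STORE_ABS:                    /* write variable to memory */
        cpu->addr = target;
        cpu->data = cpu->w;
        mem[cpu->addr] = cpu->data;       /* MEMW */
        break;
    case OP_JMP:                          /* unconditional jump */
        cpu->pc = target;
        break;
    case OP_JZ:                           /* conditional jump: taken only if Zero is set */
        if (cpu->zero)
            cpu->pc = target;
        break;
    default:
        return 0;
    }
    return 1;
}
```

For the jump cases the assembled address goes into PC instead of the address buffer, so the next fetch simply starts from the new location.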
Arduino program compiled into AVR instructions
int a, b, c; // 16b signed
a = 4;
b = 6;
c = a + b;
first pass of compiler assigns memory locations for data
hex location   data
0x0100         lsB(a)
0x0101         msB(a)
0x0102         lsB(b)
0x0103         msB(b)
0x0104         lsB(c)
0x0105         msB(c)
second pass of compiler creates instructions (hex opcode)
hex opcode     command, args      comment
a = 4;
84 e0          ldi r24, 0x04      Load Immediate - load lsB(a) into register 24 (r24)
90 e0          ldi r25, 0x00      Load Immediate - load msB(a) into register 25 (r25)
90 93 01 01    sts 0x0101, r25    Store Direct to Data Space - store msB(a) to mem
80 93 00 01    sts 0x0100, r24    Store Direct to Data Space - store lsB(a) to mem
b = 6;
86 e0          ldi r24, 0x06      Load Immediate - load lsB(b) into r24
90 e0          ldi r25, 0x00      Load Immediate - load msB(b) into r25
90 93 03 01    sts 0x0103, r25    Store Direct to Data Space - store msB(b) to mem
80 93 02 01    sts 0x0102, r24    Store Direct to Data Space - store lsB(b) to mem
c = a + b;
80 91 02 01    lds r24, 0x0102    Load Direct from Data Space - load lsB(b) from mem into r24
90 91 03 01    lds r25, 0x0103    Load Direct from Data Space - load msB(b) from mem into r25
20 91 00 01    lds r18, 0x0100    Load Direct from Data Space - load lsB(a) from mem into r18
30 91 01 01    lds r19, 0x0101    Load Direct from Data Space - load msB(a) from mem into r19
82 0f          add r24, r18       Add without Carry - add lsB(b) plus lsB(a), result in r24
93 1f          adc r25, r19       Add with Carry - add msB(b) plus msB(a) plus carry, result in r25
90 93 05 01    sts 0x0105, r25    Store Direct to Data Space - store msB(c) to mem
80 93 04 01    sts 0x0104, r24    Store Direct to Data Space - store lsB(c) to mem
11 bytes of text (excluding spaces and semicolons), 6 bytes of data, 52 bytes of opcode instructions
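The add / adc pair in the listing is how an 8-bit ALU builds a 16-bit sum: add the low bytes, then add the high bytes plus the carry out of the low bytes. A minimal C sketch of that idea follows; `add8` and `add16_bytewise` are illustrative names, and the values are handled as unsigned 16-bit patterns, which gives the same bits as two's complement signed addition.

```c
#include <stdint.h>

/* 8-bit add with carry in and carry out, like AVR add (carry_in = 0) or adc. */
static uint8_t add8(uint8_t x, uint8_t y, uint8_t carry_in, uint8_t *carry_out)
{
    uint16_t sum = (uint16_t)(x + y + carry_in);
    *carry_out = (uint8_t)(sum >> 8);   /* bit 8 of the sum is the carry */
    return (uint8_t)sum;                /* low 8 bits are the result */
}

/* 16-bit add built from two 8-bit adds, mirroring the add/adc pair above. */
static uint16_t add16_bytewise(uint16_t a, uint16_t b)
{
    uint8_t carry;
    uint8_t lo = add8((uint8_t)b, (uint8_t)a, 0, &carry);                     /* add r24, r18 */
    uint8_t hi = add8((uint8_t)(b >> 8), (uint8_t)(a >> 8), carry, &carry);   /* adc r25, r19 */
    return (uint16_t)(lo | (hi << 8));
}
```

With a = 4 and b = 6 the low-byte add produces 10 with no carry, so the high byte stays 0. With 0x00FF + 0x0001 the low-byte add wraps to 0x00 and the carry raises the high byte to 0x01.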
Historic evolution of CPU
  More registers
  More arithmetic functions
  More instructions in instruction set
  Bigger instruction decoder LUT
  Slower performance per instruction due to propagation delay through the larger decoder
Reduced instruction set computer (RISC)
  Fewer registers
  Fewer instructions
  Smaller decoder LUT
  Faster performance per instruction
  PIC and Atmel AVR are RISC with modified Harvard architecture
Call subroutine
  1) push contents of PC onto stack and increment SP
  2) load address of instructions for subroutine into PC
  3) execute subroutine
Return from subroutine
  4) pop old value of PC from stack and decrement SP
  5) load value from stack into PC
Maximum size of stack limits number of levels of subroutine nesting
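The call and return steps can be sketched with a small fixed-depth stack. This is an illustrative model only: `STACK_DEPTH`, the struct, and the function names are assumptions, and each stack slot holds a whole 16-bit PC, whereas an 8-bit CPU would push it as two bytes. PC is assumed to already point at the instruction after the call.

```c
#include <stdint.h>

#define STACK_DEPTH 8   /* assumed stack size; sets the maximum nesting depth */

/* Hypothetical CPU state for this example. */
typedef struct {
    uint16_t pc;                  /* program counter */
    uint8_t  sp;                  /* stack pointer (index of next free slot) */
    uint16_t stack[STACK_DEPTH];  /* return address stack */
} Cpu;

/* Call: returns 0 if the stack is full (nesting too deep). */
static int call_sub(Cpu *cpu, uint16_t target)
{
    if (cpu->sp >= STACK_DEPTH)
        return 0;
    cpu->stack[cpu->sp] = cpu->pc;  /* 1) push contents of PC onto stack */
    cpu->sp++;                      /*    and increment SP */
    cpu->pc = target;               /* 2) load subroutine address into PC */
    return 1;
}

/* Return: returns 0 if there is no saved PC to restore. */
static int return_sub(Cpu *cpu)
{
    if (cpu->sp == 0)
        return 0;
    cpu->sp--;                      /* 4) decrement SP to the saved entry */
    cpu->pc = cpu->stack[cpu->sp];  /* 5) load saved value into PC */
    return 1;
}
```

The overflow check in `call_sub` is where the nesting limit shows up: once `STACK_DEPTH` calls are outstanding, the next call has nowhere to save its return address.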
Interrupt
  hardware generated subroutine - use stack to remember old PC
  need special interrupt handler instructions
  need list of addresses for different handler routines (interrupt vector)
  mask off other interrupts
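A rough C model of interrupt entry and exit, assuming a small table of handler addresses indexed by interrupt source. `NUM_VECTORS`, `STACK_DEPTH`, the struct, and the function names are all hypothetical. Real CPUs typically also save the status flags on entry, which this sketch leaves out.

```c
#include <stdint.h>

#define NUM_VECTORS 4   /* assumed number of interrupt sources */
#define STACK_DEPTH 8   /* assumed stack size */

/* Hypothetical CPU state for this example. */
typedef struct {
    uint16_t pc;                    /* program counter */
    uint8_t  sp;                    /* stack pointer (index of next free slot) */
    uint8_t  int_disable;           /* Interrupt Disable flag */
    uint16_t stack[STACK_DEPTH];    /* return address stack */
    uint16_t vectors[NUM_VECTORS];  /* handler address per interrupt source */
} Cpu;

/* Enter a handler; returns 0 if the request is masked, invalid, or the stack is full. */
static int enter_interrupt(Cpu *cpu, unsigned source)
{
    if (cpu->int_disable || source >= NUM_VECTORS || cpu->sp >= STACK_DEPTH)
        return 0;
    cpu->stack[cpu->sp++] = cpu->pc;  /* use stack to remember old PC */
    cpu->int_disable = 1;             /* mask off other interrupts */
    cpu->pc = cpu->vectors[source];   /* look up handler in the vector list */
    return 1;
}

/* Leave a handler: restore the old PC and allow interrupts again. */
static int return_from_interrupt(Cpu *cpu)
{
    if (cpu->sp == 0)
        return 0;
    cpu->pc = cpu->stack[--cpu->sp];
    cpu->int_disable = 0;
    return 1;
}
```

Apart from the vector lookup and the mask flag, entry and exit follow the same push and pop pattern as an ordinary subroutine call.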