In the previous module we discussed the processor and its basic building blocks (the ALU, instruction memory, data memory, and register bank) and how they are connected to form the data path of a processor.
In this module we will study a specific processor that is very widely used in mobile and low-power applications.
It is known as ARM, which stands for Advanced RISC Machines. We will cover the components of the ARM processor, its data path, and its basic instruction set.
An ARM processor is one of a family of CPUs based on the RISC (reduced instruction set computer) architecture developed by Advanced RISC Machines (ARM).
ARM designs 32-bit and 64-bit RISC multi-core processors. RISC processors are designed around a smaller number of instruction types so that they can operate at a higher speed, executing more millions of instructions per second (MIPS).
By stripping out unneeded instructions and optimizing pathways, RISC processors provide outstanding performance at a fraction of the power demand of CISC (complex instruction set computing) devices.
The relative simplicity of ARM processors makes them suitable for low power applications. As a result, they have become dominant in the mobile and embedded electronics market, as relatively low-cost, small microprocessors and microcontrollers.
In 2005, about 98% of the more than one billion mobile phones sold each year used at least one ARM processor.
As of 2009, ARM processors accounted for approximately 90% of all embedded 32-bit RISC processors and were used extensively in consumer electronics, including PDAs, mobile phones, digital media and music players, hand-held game consoles, calculators, and computer peripherals such as hard drives and routers.
In this module you will study the ARM processor architecture and implement it in Verilog.
As discussed in the previous slide, ARM processors are used extensively in consumer electronic devices such as smartphones, tablets, multimedia players, and other mobile devices such as wearables. Because of their reduced instruction set, they require fewer transistors, which enables a smaller die size for the integrated circuit (IC).
The ARM processor's smaller size, reduced complexity, and lower power consumption make it suitable for increasingly miniaturized devices.
64-bit ARM processors are now on the market as well, but in this course we study the 32-bit processor. It also supports a 16-bit instruction mode (Thumb), so it can execute both 32-bit and 16-bit instructions.
The ARM datapath can be pipelined in 3 or 5 stages: ARM7 has a 3-stage pipeline and ARM9 has a 5-stage pipeline. Here we study ARM7, which has three stages:
The first stage is fetch, in which the instruction is fetched from the instruction memory and placed in the instruction register.
The figure shows the blocks of the processor in use during this stage.
Instruction fetch gets the next instruction from memory and increments the program counter (PC).
It also stores the new address back into the PC register.
This stage takes only one clock cycle.
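The fetch step described above can be sketched in a few lines of Python. This is only an illustrative model, not the Verilog implementation built in this module: the names `fetch`, `instruction_memory`, and `WORD_BYTES` are hypothetical, and the sketch assumes a byte-addressed PC with one 32-bit instruction per word.

```python
WORD_BYTES = 4  # assumption: each instruction occupies one 32-bit word


def fetch(instruction_memory, pc):
    """Return (instruction, next_pc) for the instruction at byte address pc."""
    # The memory is modelled as a list of words, so convert the byte address.
    instruction = instruction_memory[pc // WORD_BYTES]
    # The incremented address is what gets written back into the PC register.
    next_pc = pc + WORD_BYTES
    return instruction, next_pc


# Hypothetical instruction memory holding three placeholder 32-bit words.
memory = [0xE0810002, 0xE0423001, 0xEAFFFFFE]
instruction_register, pc = fetch(memory, 0)
print(hex(instruction_register), pc)
```

Calling `fetch` again with the returned `pc` reads the next sequential word, which is how the stage behaves when no branch is taken.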
The next stage is decode. In this stage the instruction is decoded and the datapath control signals are prepared for the next cycle. The registers used by the instruction are identified in this stage, and the instruction is broken down into control signals that are ready to be executed.
This stage also takes one cycle. The figure shows the blocks used during the decode stage.
The third and final stage of the datapath is execute.
In this stage the register bank is read, an operand is shifted, and the ALU result is generated and written back to a destination register.
This stage can take multiple cycles, depending on the instruction being executed. The figure shows the blocks used during this stage.
Execution retrieves the operands from the register file, which keeps track of the registers and the data in each; performs ALU operations; reads from or writes to memory; and writes back to the register file if necessary.
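To see how the three stages overlap in a pipeline, the following Python sketch lists which instruction occupies which stage in each clock cycle. It is a simplified model that assumes every stage takes exactly one cycle (so it ignores multi-cycle execute), and the names `STAGES` and `pipeline_schedule` are made up for illustration.

```python
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(num_instructions):
    """List, per clock cycle, the (instruction index, stage) pairs that are active.

    Assumes each stage takes one cycle and there are no stalls or branches.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = []
        for stage_index, stage in enumerate(STAGES):
            # Instruction i enters fetch in cycle i, decode in i+1, execute in i+2.
            instr = cycle - stage_index
            if 0 <= instr < num_instructions:
                active.append((instr, stage))
        schedule.append(active)
    return schedule


for cycle, active in enumerate(pipeline_schedule(3)):
    print(cycle, active)
```

Under these assumptions, three instructions finish in five cycles rather than nine, because a new instruction is fetched while the previous two are being decoded and executed.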
Now let us look at the instruction set architecture (ISA) of ARM.
ARM instructions are all 32 bits long (except in Thumb mode), so there are 2^32 possible bit patterns for a machine instruction. Fortunately, they are structured into fields.
The first 5 bits are the opcode, which can be one of 3 types: data processing, load/store, or jump/branch.
The next 4 bits are flag bits, whose values depend on the type of instruction.
Next come 2 register fields of 5 bits each. There are 8 general registers and 8 conditional registers.
The last 14 bits are used as an immediate address or to address a register.
The ARM instruction set can be broadly divided into 3 types of instruction:
Data processing, such as ADD, SUBTRACT, MULTIPLY, SHIFT LEFT, SHIFT RIGHT, and LOGICAL operations.
Load and store, which is the most commonly used type.
Jump, or conditional branch (as occurs in an if/else statement), which changes the program counter value.
Once again, data processing instructions can be of many types, such as ADD, SUBTRACT, MULTIPLY, SHIFT LEFT, SHIFT RIGHT, and LOGICAL operations.
For multiplication, the ARM processor uses a Booth's multiplier, which you have already studied and implemented in previous modules.
One unique feature of the ARM design is that a SHIFT and an ALU operation can be combined in a single instruction, because one of the ALU inputs comes via the barrel shifter.
For shifting, the ARM processor uses a barrel shifter, which we have already designed in previous modules.
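As a rough illustration of combining a shift with an ALU operation, the Python sketch below models an instruction of the form `ADD Rd, Rn, Rm, LSL #n`, where the second operand passes through the shifter before reaching the adder. The helper names `barrel_shift_left` and `add_with_shift` are hypothetical, and only a logical left shift with a 32-bit result is modelled.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to the 32-bit register width


def barrel_shift_left(value, amount):
    """Logical shift left, keeping only the low 32 bits."""
    return (value << amount) & MASK32


def add_with_shift(rn, rm, shift_amount):
    """Model rd = rn + (rm << shift_amount) as one combined operation."""
    shifted = barrel_shift_left(rm, shift_amount)  # second operand goes through the shifter
    return (rn + shifted) & MASK32                 # then into the ALU adder


print(add_with_shift(10, 3, 2))  # 10 + (3 << 2) = 22
```

A common use of this pattern is array indexing: shifting an index left by 2 multiplies it by 4, the size of a 32-bit word, before adding it to a base address.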
Move instructions cause data from one register to be moved or copied to another register.
Load instructions put data from an external source, such as memory, into a register.
Store instructions move data from a register to an external destination.
Instructions that move (or copy) data from one place to another are the most frequently used instructions in most programs.
These instructions load or store the value of a single register from or to memory. They can load or store a 32-bit word, a 16-bit half-word, or an 8-bit unsigned byte. Byte and half-word loads can either be sign-extended or zero-extended to fill the 32-bit register.
A few instructions are also defined that can load or store 64-bit double word values into two 32-bit registers.
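The difference between zero-extending and sign-extending a loaded byte or half-word can be shown with a short Python sketch. The function names `zero_extend` and `sign_extend` are illustrative only, and registers are assumed to be 32 bits wide.

```python
MASK32 = 0xFFFFFFFF


def zero_extend(value, bits):
    """Keep the low `bits` bits; the upper register bits become 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Copy the top bit of the `bits`-wide value into all upper register bits."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # XOR then subtract yields the two's-complement value; mask to 32 bits.
    return ((value ^ sign_bit) - sign_bit) & MASK32


print(hex(zero_extend(0x80, 8)))  # 0x80
print(hex(sign_extend(0x80, 8)))  # 0xffffff80
```

The same byte `0x80` therefore reads as 128 when zero-extended and as -128 (in two's complement) when sign-extended.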
Branching and jumping refer to the ability to load the PC register with a new address that is not the next sequential address.
In general, a "jump" or "call" occurs unconditionally, and a "branch" occurs on a given condition. In this module we will generally refer to both as branches, with a "jump" being an unconditional branch.
A branch can be conditional, depending on the values of two registers such as R1 and R2, as shown in the figure. If the opcode is 010 and R1 > R2, the branch is taken.
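The conditional branch just described can be modelled as a choice between two next-PC values. In this Python sketch the opcode value `0b010` and the R1 > R2 condition come from the text; everything else (the names `next_pc` and `BRANCH_GT_OPCODE`, and the 4-byte instruction size) is an assumption made for illustration.

```python
WORD_BYTES = 4             # assumption: instructions are 4 bytes apart
BRANCH_GT_OPCODE = 0b010   # opcode value taken from the text; the name is hypothetical


def next_pc(pc, opcode, r1, r2, branch_target):
    """Return the address of the next instruction to fetch."""
    if opcode == BRANCH_GT_OPCODE and r1 > r2:
        return branch_target   # branch taken: PC is loaded with the new address
    return pc + WORD_BYTES     # otherwise fall through to the next sequential address


print(next_pc(0x100, 0b010, 7, 3, 0x200))  # taken, since 7 > 3
print(next_pc(0x100, 0b010, 2, 3, 0x200))  # not taken, since 2 <= 3
```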