Книга: Introduction to Microprocessors and Microcontrollers

An outline of Pentium operation

An outline of Pentium operation

See Figure 12.1.


Figure 12.1 The Pentium processor

Data and code caches

Connections to the outside world are via a 64-bit external data bus and a 32-bit address bus. The incoming data that consists of numerical data and instruction codes are loaded very quickly into two internal caches – an 8 kbyte data cache and an 8 kbyte code cache. These caches shift data very rapidly on the internal pathways that are 128 and 256 bits wide.

Whenever possible, the Pentium uses burst mode to read and write data. The burst mode system loads a cache for example, with more data than the width of the data bus. If a cache line is 128 bits wide and it is fed from a 64-bit data bus, then we could completely fill the line by transferring 64 bits and then another 64 bits. Burst mode loads all 128 bits very rapidly without further intervention from the microprocessor. Putting more new data into the cache will increase the chances of the cache holding the required information. This is called a cache ‘hit’.

Prefetch buffer

The prefetch buffer is a small internal memory that holds a list of instructions that are waiting to be executed. This ensures that the instruction decoder is never waiting for a new instruction from the external (slow) memory and it makes more efficient use of the external data bus since the new instructions can be loaded whenever the opportunity arises. When it gets a moment, the Pentium shifts an instruction from the external program into the cache and transfers one instruction from the cache into the prefetch buffer and also sends a signal to the microcode circuit to prepare the code for the next instruction. So, with all the housekeeping done, the instruction decoder can be fed with instructions and data at its maximum rate. The prefetch buffer is actually two independent 32-bit buffers, each providing input to one of the ALUs.

The instruction decoder

The instruction decoder performs much the same function as in other microprocessors. It has two outputs that are fed to the two ALUs called ‘u’ and ‘v’.

Arithmetic and logic units

These units are under the control of the aptly named control unit. The blocks, shown in the diagram as ALU ‘u’ and ALU ‘v’ are actually five step pipelines that can operate in parallel to execute two instructions in a single clock cycle. All commands other than floating point arithmetic can be executed in the ‘u’ pipeline and a more limited range can be carried out in the ‘v’ pipeline. The five-stage pipeline can speed the throughput to one instruction per clock cycle. In the correct conditions, both pipelines can be used simultaneously to handle two instructions in a single clock cycle. Sometimes this is not possible. Perhaps both instructions need access to the same piece of hardware, perhaps the result of an instruction is needed before the next instruction can be started. As a rather simplistic example, if we wished to add two numbers then divide the result by 10, we cannot start dividing anything until the first answer is available. One minor drawback is that instructions cannot overtake each other even if the second one could have been finished very rapidly and they are not dependent on each other.

Floating-point unit (FPU)

For floating-point arithmetic, the FPU has an 8-bit pipeline that is further enhanced by using a hardware multiplier and divider. This is a significant advance over the 80486, which was not pipelined in the FPU. Between them, the pipeline and the hardware, the FPU runs about ten times faster than the 80486 with equivalent clock speeds. You may remember from earlier discussions that one of the benefits of the RISC designs was the use of hardware for the execution of arithmetic operations.

The ‘u’ pipeline has some overlap with the floating-point pipeline so there are restrictions on the occasions when two instructions can be executed at the same time.

There are eight FPU registers 80-bits wide, arranged as a stack. Bits 0 to 63 hold a 64-bit mantissa. Bits 64 to 78 hold a 15-bit exponent and the last bit holds a sign bit.


Notice how the layout of the floating-point number differs from the example that we saw in Chapter 4.

Branch prediction

When the program reaches a ‘branch’ or ‘jump’ instruction, the microprocessor is sent to another part of the program. These instructions are usually ‘conditional’ as in ‘jump to address xxxx if the value in the accumulator is not zero’. When this jump happens, the next few instructions that are loaded into the pipelines are all incorrect and the pipeline has to be emptied and restocked with the new information. This is called ‘flushing’ the pipeline and causes an irritating delay of four or five clock cycles.

The branch prediction logic holds about 256 entries in a cache to aid the Pentium in guessing the next instruction. If we can guess what is coming next before it happens, then the data and instructions can be loaded ready to go.

But how do we guess? There are two likely outcomes: either the branch will be taken and we jump to another part of the program, or we don’t take the branch and we continue with the next instruction. The branch prediction logic argues that what the microprocessor did last time, it will probably do again. This is true more often than not. The reasoning behind this is that when a loop occurs, the program is sent back to repeat a section several, or many, times. It can only NOT take the branch once, so on average it will take a branch more often than it doesn’t.

In the cache are stored the instructions immediately before the branch or jump together with the target address assuming the branch is taken. It also stores statistical information of how often the branch was taken in the past. This information is used to predict the likely outcome of the current situation and is correct for about 85% of the time. When the branch has occurred, the history information is updated to make the next guess even better.

General purpose registers

The Pentium has seven general-purpose registers, all 32-bits wide. One of them is used as an accumulator and to maintain compatibility with the 80386 and the 80486, it can be addressed as a single 32-bit register, two 16-bit or four 8-bit registers. There are three other general-purpose registers that can be similarly split and three that only offer the choice of 32-and 16-bit use.

Interrupts

The handling of interrupts has not changed beyond all recognition since we were looking at the Z80.

There are two hardware interrupts available. The NMI or nonmaskable interrupt is activated by the pin voltage going to a logic 1 or high-level. Immediately on the completion of the current instruction, the Pentium puts the content of the flag register and the current address onto the stack. It then goes to the flag register and resets the interrupt flag to prevent any further interrupts. It then services the interrupt. The NMI normally occurs as a result of hardware failures to quickly limit the damage caused.

The IRQ or interrupt request is also activated by the appropriate pin going to a logic 1 or high-level but in this case remember that it is only a request and can be blocked by resetting the interrupt flag in the flag register. If more than one interrupt is received they are checked for priority and the highest one wins. IRQs are generally initiated by peripheral equipment such as a printer.

Exceptions

These interrupts are issued by the microprocessor itself and occur when the microprocessor has found itself in a difficulty that it cannot resolve.

When an exception occurs, an on-screen message often appears announcing that an exception has occurred and the Pentium attempts  the instruction again. Asking the Pentium for an impossible answer causes some exceptions. This could be ‘division by zero’. Dividing any number by zero is not possible and the Pentium cannot respond.

Another one, which often strikes terror into the heart of the user, is ‘General Protection Error’. The software has sent the Pentium off to an address that doesn’t exist and obviously, therefore, no instructions are available.

Оглавление книги


Генерация: 1.025. Запросов К БД/Cache: 3 / 1
поделиться
Вверх Вниз