Ken Shirriff Continues Reverse Engineering Intel's 8086, Now Focusing on Pipelining
Following on from his discovery of a hardware patch to the chip, Shirriff's reversing work now turns to performance-boosting pipelining.
Noted reverse engineer Ken Shirriff is continuing his efforts to peer into the inner workings of the Intel 8086, taking a look under the package at the silicon die and walking through the chip's microcode pipeline based on the circuitry therein.
"As the 8086 documentation will tell you, this instruction takes four clock cycles to execute," Shirriff explains by way of introduction to the part, launched by Intel as the first in what would become the x86 processor family in 1978. "But looking internally shows seven clock cycles of activity. How does the 8086 fit seven cycles of computation into four cycles? As I will show, the trick is pipelining."
To illustrate, Shirriff turns to a high-resolution microscope image of a decapsulated 8086 chip, its packaging dissolved to reveal the silicon die and its circuitry, laid out like a farmer's fields. Thanks to its relatively limited hardware compared to modern entries in the long-running x86 series, individual components are more easily spotted, and the parts responsible for the pipelining process are clearly visible to those who know where to look.
"[The] ADD instruction is implemented in the 8086's microcode as four micro-instructions," Shirriff explains. "The 8086 documentation says this ADD instruction takes four clock cycles, and as we have seen, it is implemented with four micro-instructions. One micro-instruction is executed per clock cycle, so the timing seems straightforward. The problem, however, is that a micro-instruction can't be completed in one clock cycle. It takes a clock cycle to read a micro-instruction from the microcode ROM. Sending signals across an internal bus typically takes a clock cycle and other actions take more time.
"One solution would be to slow down the clock, so the micro-instruction can complete in one cycle, but that would drastically reduce performance. A better solution is pipelining the execution so a micro-instruction can complete every cycle. Execution of a micro-instruction [in the 8086] is pipelined, with three full clock cycles from the arrival of an instruction until the first micro-instruction completes in cycle 4. Although this system is complex, in the best case it achieves the goal of running a micro-instruction each cycle, without gaps."
This isn't the first time Shirriff has looked at the Intel 8086: late last year he took a gander at the very same chip and came across evidence of a hardware bug fix — matching "an obscure part" of Intel's official documentation, in which the company admitted to a flaw that could cause memory corruption if an interrupt should immediately follow certain MOV or POP instructions.
Shirriff's full write-up is available on his website; he has also confirmed plans to continue reverse engineering the 8086's operation through analysis of the die image, with more articles to follow.