MIPS architecture

MIPS, for Microprocessor without interlocked pipeline stages, is a RISC microprocessor architecture developed by MIPS Computer Systems Inc. MIPS designs are used in SGI's computer product line, and have found broad application in embedded systems, Windows CE devices, and Cisco routers. The Nintendo 64 and Sony PlayStation 2 video game consoles use MIPS processors. By the late 1990s it was estimated that one in three of all RISC chips produced were MIPS-based designs.

The early MIPS architectures were 32-bit implementations (generally with 32-bit-wide registers and data paths); later versions were 64-bit implementations. Five backward-compatible revisions of the MIPS instruction set exist, named MIPS I, MIPS II, MIPS III, MIPS IV and MIPS 32/64. The latest of these, MIPS 32/64, defines a control register set as well as the instruction set. Several "add-on" extensions are also available, including MIPS-3D, a simple set of floating-point SIMD instructions dedicated to common 3D tasks; MDMX, a more extensive integer SIMD instruction set; MIPS16, which adds compression to the instruction stream to make programs take up less room; and the recent addition of MIPS MT, new multithreading additions to the system similar to Hyper-Threading in the latest Intel lineup.

Because the designers created such a clean instruction set (see Instructions), computer architecture courses in universities and technical schools often study the MIPS architecture. The design of the MIPS CPU family, together with SPARC, another early RISC architecture, greatly influenced later RISC designs like HP Precision Architecture and DEC Alpha.

Table of contents

1 History
2 MIPS CPU family
3 Other models and future plans
4 MIPS cores
5 Further reading
6 MIPS Programming

History

In 1981, a team led by John Hennessy at Stanford University started work on what would become the first MIPS processor. The basic concept was to dramatically increase performance through the use of deep instruction pipelines, a technique that was well known but difficult to implement. Generally, a pipeline spreads out the task of running an instruction into several steps, and then starts working on "step one" of an instruction even before the preceding instruction is complete. In contrast, traditional designs of the era waited to complete the entire instruction before moving on, thereby leaving large areas of the CPU idle as the process continued.
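The benefit of overlapping instructions can be sketched with a simple cycle count. The following Python sketch is a minimal illustration, assuming an idealized pipeline in which every stage takes one clock cycle and no instruction ever stalls; the five-stage depth, the workload size, and the function names are assumptions for illustration, not details of the Stanford design.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Cycles needed when each instruction finishes every stage before the next starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Cycles for an ideal pipeline: fill it once, then finish one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


if __name__ == "__main__":
    n, k = 100, 5  # hypothetical workload size and pipeline depth
    print("unpipelined:", unpipelined_cycles(n, k))
    print("pipelined:  ", pipelined_cycles(n, k))
```

Under these assumptions, 100 instructions take 500 cycles without pipelining and 104 cycles with it, so throughput approaches one instruction per cycle as the instruction count grows.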

One major barrier to pipelining was that it required interlocks to be set up to ensure that instructions that took multiple clock cycles to complete would stop the pipeline from loading more data; in effect, the pipeline had to pause while such an instruction completed. These interlocks can take a long time to set up, and were thought to be a major barrier to future speed improvements. A major aspect of the MIPS design was to demand that all instructions take only one cycle to complete, thereby removing any need for interlocking.

Although this design eliminated a number of useful instructions, notably things like multiply and divide which take multiple steps, it was felt that the overall performance of the system would be dramatically improved by running the chips at much higher clock rates. This ramping of the speed would be difficult with interlocking involved, as the time needed to set up locks is as much a function of die size as clock rate.

The elimination of these instructions became a contentious point, which many observers used to claim the design (and RISC in general) would never live up to its hype. If one simply replaces the one complex multiply instruction with many simpler additions, where is the speed increase? This simpleminded analysis ignored the fact that the speed of the design was in the pipelines, not the instructions.
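To make the objection concrete, here is a minimal Python sketch of how a multiply can be composed from shifts and additions, each of which is a simple single-step operation. The function name and the 32-bit word size are assumptions for illustration; this is the textbook shift-and-add method, not a routine taken from any MIPS compiler or library.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size


def shift_add_multiply(a: int, b: int) -> int:
    """Return the low 32 bits of a * b using only shifts, adds, and bit tests."""
    a &= MASK32
    b &= MASK32
    product = 0
    while b:
        if b & 1:
            # Low bit of the multiplier is set: add the shifted multiplicand.
            product = (product + a) & MASK32
        a = (a << 1) & MASK32  # move the multiplicand to the next bit position
        b >>= 1                # consume one bit of the multiplier
    return product


if __name__ == "__main__":
    print(shift_add_multiply(6, 7))
```

The loop runs at most 32 times, so one multiply becomes a few dozen simple steps. The argument in the text is that a pipelined design can still come out ahead, because each of those steps completes quickly and overlaps with its neighbours.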

In 1984 Hennessy was convinced of the future commercial potential of the design, and left Stanford to form MIPS Computer Systems. The company released its first design, the R2000, in 1985, and improved it as the R3000 in 1988. These 32-bit CPUs formed the basis of the company through the 1980s, and were used primarily in SGI's series of workstations. These commercial designs deviated from the Stanford academic research by implementing most of the interlocks in hardware.

In 1991 MIPS released the first 64-bit CPU design, the R4000. The design was so important to SGI, at the time one of MIPS's few major customers, that SGI bought the company outright in 1992 in order to guarantee the design would not be lost given the financial difficulties MIPS had while building it. Having become an internal group at SGI, the company was now known as MIPS Technologies.

In the early 1990s MIPS started licensing their designs to third-party vendors. This proved fairly successful due to the simplicity of the core, which allowed it to be used in a number of applications that would have formerly used much less capable CISC designs of similar gate count, and therefore price. Sun Microsystems attempted to follow their success by licensing their SPARC core, but have never been anywhere near as successful. By the late 1990s MIPS was a powerhouse in the embedded processor field, and in 1997 the 48 millionth MIPS-based CPU shipped, making it the first RISC CPU to outship the famous Motorola 68000 family. They were so successful that SGI spun off MIPS Technologies in 1998.

In 1999 MIPS formalized their licensing system around two basic designs, the 32-bit MIPS32 and 64-bit MIPS64. NEC, Toshiba and SiByte (later acquired by Broadcom) each obtained licenses for the MIPS64 as soon as it was announced. Success followed success, and today the MIPS cores are one of the most-used "heavyweight" cores in the marketplace for computer-like devices (hand-held computers, set-top boxes, etc.), with other designers fighting it out for other niches. Some indication of their success is the fact that Motorola uses MIPS cores in their set-top box designs, instead of their own PowerPC-based cores.

Fully half of MIPS's income today comes from licensing their designs, while much of the rest comes from contract design work on cores that will then be produced by third parties.

MIPS CPU family

The first commercial MIPS CPU model, the R2000, was announced in 1985. It added multiple-cycle multiply and divide instructions in a somewhat independent on-chip unit. New instructions were added to retrieve the results from this unit back to the execution core. Ironically, the result-retrieving instructions were interlocked, which improved compiled code density but made the MIPS name meaningless.

The R2000 could be booted either big-endian or little-endian. It had thirty-two 32-bit general purpose registers, but no condition code register, which the designers considered a potential bottleneck; it shares this feature with the AMD 29000. Unlike the general purpose registers, the program counter is not directly accessible.
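The difference between the two byte orders the R2000 could be booted in can be sketched in a few lines of Python. This is only an illustration of byte ordering in general; the helper name and the sample value are hypothetical and nothing here models R2000 hardware.

```python
def word_bytes(value: int, byteorder: str) -> bytes:
    """Return the four bytes of a 32-bit word in memory order, lowest address first."""
    return (value & 0xFFFFFFFF).to_bytes(4, byteorder)


if __name__ == "__main__":
    word = 0x12345678  # hypothetical sample value
    print("big-endian:   ", word_bytes(word, "big").hex())     # 12345678
    print("little-endian:", word_bytes(word, "little").hex())  # 78563412
```

In big-endian mode the most significant byte sits at the lowest address; in little-endian mode the least significant byte does.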

The R2000 also had support for up to four co-processors, one of which was built into the main CPU and handled exceptions and traps, while the other three were left for other uses. One of these could be filled by the optional R2010 FPU, which had thirty-two 32-bit registers that could be used as sixteen 64-bit registers for double-precision.
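The pairing of 32-bit registers into 64-bit double-precision registers can be sketched as follows. This is a minimal Python model, not a description of the R2010's internals: the names `fpr`, `store_double` and `load_double` are hypothetical, and placing the low word in the even-numbered register is an assumption of the sketch.

```python
import struct

fpr = [0] * 32  # model of thirty-two 32-bit floating-point registers


def store_double(pair: int, value: float) -> None:
    """Write a 64-bit double into registers `pair` (even) and `pair + 1`."""
    if pair % 2 or not 0 <= pair < 32:
        raise ValueError("a double occupies an even/odd register pair")
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    fpr[pair] = bits & 0xFFFFFFFF  # assumption: low word in the even register
    fpr[pair + 1] = bits >> 32     # assumption: high word in the odd register


def load_double(pair: int) -> float:
    """Reassemble the double held in registers `pair` and `pair + 1`."""
    bits = (fpr[pair + 1] << 32) | fpr[pair]
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


if __name__ == "__main__":
    store_double(2, 1.5)
    print(hex(fpr[2]), hex(fpr[3]), load_double(2))
```

Because every double consumes two of the thirty-two registers, only sixteen double-precision values fit at once, which matches the count given above.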

The R3000 succeeded the R2000 in 1988, adding 32kB caches for instructions and data, 64kB total, along with cache coherency support for multi-processor use. However, it turned out that the multiprocessor support was flawed, and the R3000 was not widely used in this way. The R3000 also included a built-in MMU, a common feature on CPUs of the era. The R3000 was the first successful MIPS design in the marketplace, and eventually over 1 million were made. The R3000A was a speed-bumped version running at 40MHz that delivered 32 VUPS (basically equivalent to millions of instructions per second). Like the R2000 with its R2010, the R3000 was paired with its own FPU, the R3010.

The R4000 series, released in 1991, extended the MIPS instruction set to a full 64-bit architecture, moved the FPU onto the main die to create a single-chip system, and operated at a radically high internal clock speed (it was introduced at 100MHz). However, in order to achieve the clock speed the caches were reduced to 8kB each and took three cycles to access. The high operating frequencies were achieved through the technique of deep pipelining (called super-pipelining at the time). With the introduction of the R4000 a number of improved versions soon followed, including the R4400 of 1993 which included 16kB caches, largely bug-free 64-bit operation, and a controller for another 1MB external (level 2) cache.

MIPS, now a division of SGI called MTI, designed the lower-cost R4200, and later the even lower-cost R4300, which was the R4200 with a 32-bit external bus. The R4300 was used in the Nintendo 64.

Quantum Effect Devices (QED), a separate company started by refugees from MIPS, designed the R4600, the R4700, the R4650 and the R5000. Where the R4000 had pushed clock frequency and sacrificed cache capacity, the QED designs emphasized efficient use of silicon area and large caches which could be accessed in just two cycles. The R4600 and R4700 were used in low-cost versions of the SGI Indy workstation as well as the first MIPS-based Cisco routers. The R4650 was used in the original WebTV set-top boxes (now Microsoft TV). The R5000 FPU had more flexible single-precision floating-point scheduling than the R4000, and as a result, R5000-based SGI Indys had much better graphics performance than similarly clocked R4400 Indys with the same graphics hardware. SGI gave the old graphics board a new name when it was combined with the R5000 in order to emphasize the improvement. QED later designed the RM7000 and RM9000 family of devices for embedded markets like networking and laser printers.

The R8000 (1994) was the first superscalar MIPS design, able to execute two ALU and two memory operations per cycle. The design was spread over six chips: an integer unit (with 16KB instruction and 16KB L1 data caches), a floating-point unit, three full-custom secondary cache tag RAMs (two for secondary cache accesses, one for bus snooping), and a cache controller ASIC. The design had two fully pipelined double-precision multiply-add units, which could stream data from the 4MB off-chip secondary cache. The R8000 powered SGI's Power Challenge computer servers in the mid 1990s and later became available in the Indigo2 Impact workstation. Its FPU performance suited scientific users quite well, but its limited integer performance and high cost dampened its appeal for most users. The R8000 was in the marketplace for only a year and remains fairly rare.

In 1995, MIPS released the R10000. This processor was a single-chip design, ran at a faster clock speed than the R8000, and had larger 32KB instruction and data caches. It was also superscalar, but its major innovation was out-of-order execution. Even with a single memory pipeline and simpler FPU, the vastly improved integer performance, lower price, and higher density made the R10000 preferable for most customers.

More recent designs have all been built on the R10000 core. The R12000 used an improved process to shrink the chip and run it at higher clock rates. The R14000 bumped the speed again to up to 600MHz, added support for DDR SRAM in the cache, and increased the computer bus speed to 200MHz for better throughput. The most recent version, the R16000, doubles the size of the caches to 64kB for both the instruction and data cache, adds support for up to 8MB of level 2 cache, and bumps the clock rates once again, to 700MHz.

Other models and future plans

Other members of the MIPS family include the R6000, a bipolar (ECL) implementation of the MIPS architecture. The R6000 did not deliver the promised performance benefits, and although it saw some use in Control Data machines, it quickly disappeared. The RM7000 was a version of the R5000 with a built-in 256kB level 2 cache and a controller for an optional level 3 cache. It was primarily targeted at embedded designs, including SGI's graphics processors and various networking solutions, primarily by Cisco. The R9000 name was never used.

At one time SGI had intended to move off the MIPS platform to the Intel Itanium, and development was to have ended with the R10000. The ever-longer delays in introducing the Itanium meant that the installed base of MIPS-based machines continued to increase. By 1999 it was clear that development had ended too soon, and the R14000 and R16000 were created as a result. SGI has hinted at a more complex R8000-style FPU for later R-series processors, and a dual-core processor is probable. Low power consumption and heat dissipation will continue to be a focus.

MIPS cores

In recent years most of the technology used in the various MIPS generations has been offered as building blocks for embedded processor designs. Both 32-bit and 64-bit basic cores are offered, known as the 4K and 5K respectively, and the design itself can be licensed as MIPS32 and MIPS64. These cores can be mixed with add-in units such as FPUs, SIMD systems, various input/output devices, etc.

MIPS cores have been very successful; they form the basis of many newer Cisco routers, cable modems and ADSL modems, smartcards, laser printer engines, set-top boxes, handheld computers, and the Sony PlayStation 2.

MIPS Programming

A freely available "MIPS R2000/R3000 Simulator" called SPIM runs on several operating systems (UNIX or GNU/Linux; MS Windows 95, 98, NT, 2000, XP; and DOS). It is good for learning MIPS assembly language programming and the general concepts of RISC assembly language programming: http://www.cs.wisc.edu/~larus/spim.html

A summary of the R3000 instruction set can be found here.