Extreme processors use multiple architectures
At the annual In-Stat/MDR (Scottsdale, AZ, USA; www.instat.com) Analyst Choice Awards (Jan. 5, 2004), a number of new semiconductor vendors announced what In-Stat is touting as "extreme" processors. Targeted at compute-intensive DSP applications such as embedded image processing, these processors use established SIMD (single instruction/multiple data), MIMD (multiple instruction/multiple data), RISC, and VLIW (very-long instruction word) architectures, and sometimes a combination of them, to achieve speeds as high as 25 GFLOPS. At the Extreme Processor session, Intrinsity (Austin, TX, USA; www.intrinsity.com), Cradle Technologies (Mountain View, CA, USA; www.cradle.com), and ClearSpeed (Los Gatos, CA, USA; www.clearspeed.com) were recognized for their excellence in technology innovation, design, and implementation.
Targeted at embedded-video and image-processing applications, Cradle Technologies' single-chip ECE 3400 multiprocessor features no fewer than eight independent DSPs, four RISC processors, on-board RAM, and programmable I/O. Unlike the combined RISC/SIMD approach taken by recently introduced processors such as the FastMATH processor from Intrinsity (see Vision Systems Design, December 2003, p. 8), Cradle's device is a shared-memory MIMD design that uses a single 32-bit address space for all register and memory elements. According to Cradle, by using a repeated sequence of DSP and RISC processors, the multiprocessor DSP (MDSP) hardware architecture also achieves a processing density superior to that of VLIW architectures while delivering performance that scales linearly.
Cradle's MDSP architecture consists of multiple processors hierarchically connected by two levels of buses (see Fig. 1). Each processor subsystem consists of four RISC-like processors called processing engines, eight DSP processors, and one memory-transfer engine used for data movement. The chip is programmed in standard ANSI C or a C-like assembly language and is supplied with GNU-based optimizing C compilers, assemblers, linkers, debuggers, a functionally accurate and performance-accurate simulator, code profilers, and analysis tools. For real-time implementations, real-time kernels such as the open-source, royalty-free eCos, available from Red Hat (Raleigh, NC, USA; sources.redhat.com/ecos/), can be used.
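The shared-memory MIMD idea can be illustrated with a minimal sketch. It is not Cradle's programming model or toolchain: the flat `memory` list, the `scale` and `threshold` routines, and the region sizes are all hypothetical, and Python threads stand in for hardware processors only conceptually (they do not run in true parallel). The point is simply that two different instruction streams work on different regions of one address space.

```python
import threading

# Hypothetical flat memory shared by every worker; an "address" is a list index.
memory = list(range(16))


def scale(base, count, gain):
    """First instruction stream: multiply each word in a region by a gain."""
    for addr in range(base, base + count):
        memory[addr] *= gain


def threshold(base, count, limit):
    """Second, unrelated instruction stream: binarize a region against a limit."""
    for addr in range(base, base + count):
        memory[addr] = 1 if memory[addr] >= limit else 0


# Each worker runs its own program on its own, non-overlapping region
# (multiple instructions, multiple data) inside the same address space.
workers = [
    threading.Thread(target=scale, args=(0, 8, 3)),
    threading.Thread(target=threshold, args=(8, 8, 12)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(memory)
```

Because the two regions do not overlap, the result does not depend on how the workers are scheduled; overlapping regions would require explicit synchronization.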
Combining different types of processing architecture was also on the minds of ClearSpeed designers, who were likewise honored at the In-Stat/MDR Awards. The company's processor, the ClearSpeed CS30, sports an architecture that combines a RISC-like control unit with a SIMD-like array of processing elements. In a white paper entitled "Multithreaded Array Processor Architecture," available on the company's Web site, the details of the design are compared with those of a conventional RISC-based processor (see Fig. 2).
FIGURE 2. In developing the CS30, ClearSpeed has built a processor architecture that couples a standard RISC-like control unit to a series of parallel execution units. While one execution unit handles program flow control, the other elements process parallel data.
The device appears as a single processor running a single C program. The RISC-like control unit executes a single instruction stream, sending each instruction to the execution unit that handles flow control, to the other processing elements, or to one of the I/O controllers. Like Intrinsity and Cradle Technologies, ClearSpeed offers a C compiler, a graphical debugger, a suite of supporting tools and libraries, and PC-based add-in development boards for developers wishing to evaluate the part.
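The dispatch pattern described here, in which one instruction stream is steered either to a flow-control unit or to an array of processing elements, can be sketched in a few lines. This is a conceptual model only, not ClearSpeed's instruction set or tools: the `ArrayProcessor` and `ProcessingElement` classes, the `"scalar"` and `"array"` targets, and the three-instruction program are all assumptions made for illustration, and I/O dispatch is left out.

```python
import operator
from dataclasses import dataclass


@dataclass
class ProcessingElement:
    """One lane of the parallel array, holding its own local data word."""
    value: int


@dataclass
class ArrayProcessor:
    """Hypothetical model: one control unit feeding a scalar unit and an array."""
    pes: list
    scalar: int = 0  # state owned by the flow-control execution unit

    def run(self, program):
        """Walk a single instruction stream and dispatch each instruction."""
        for target, op, operand in program:
            if target == "scalar":
                # Handled once, by the unit responsible for program flow.
                self.scalar = op(self.scalar, operand)
            elif target == "array":
                # Same instruction, applied by every element to its own data.
                for pe in self.pes:
                    pe.value = op(pe.value, operand)
            else:
                raise ValueError(f"unknown target: {target}")


PROGRAM = [
    ("scalar", operator.add, 1),  # e.g. advance a loop counter
    ("array", operator.mul, 2),   # every element doubles its value
    ("array", operator.add, 3),   # every element adds an offset
]

machine = ArrayProcessor(pes=[ProcessingElement(v) for v in (1, 2, 3, 4)])
machine.run(PROGRAM)
print(machine.scalar, [pe.value for pe in machine.pes])
```

From the programmer's side there is only one program and one control flow; the data parallelism comes entirely from the array instructions being applied across all elements.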
The development of such parts is a shining achievement in semiconductor design. As many industry analysts have pointed out, however, technical innovation alone does not guarantee that such parts will succeed. Success is also a function of price/performance, the ease of porting established C code, time-to-market, market timing, third-party source availability, company reputation, and numerous other factors. Such architectures do show, however, that future designs will not be a matter of choosing a single architecture. More than likely, future systems will rely on a combination of two or more architectures.