Quantcast

Next-generation, high-performance processor unveiled

The prototype for a revolutionary new general-purpose computer processor, which has the potential of reaching trillions of calculations per second, has been designed and built by a team of computer scientists at The University of Texas at Austin.

The new processor, known as TRIPS (Tera-op, Reliable, Intelligently adaptive Processing System), could be used to accelerate industrial, consumer and scientific computing.

Professors Stephen Keckler, Doug Burger and Kathryn McKinley have been working on underlying technology that culminated in the TRIPS prototype for the past seven years. Their research team designed and built the hardware prototype chips and the software that runs on the chips.

“The TRIPS prototype is the first on a roadmap that will lead to ultra-powerful, flexible processors implemented in nanoscale technologies,” said Burger, associate professor of computer sciences.

TRIPS is a demonstration of a new class of processing architectures called Explicit Data Graph Execution (EDGE). Unlike conventional architectures that process one instruction at a time, EDGE can process large blocks of information all at once and more efficiently.

Current “multicore” processing technologies increase speed by adding more processors, which individually may not be any faster than previous processors.

Adding processors shifts the burden of obtaining better performance to software programmers, who must assume the difficult task of rewriting their code to run well on a potentially large number of processors.

“EDGE technology offers an alternative approach when the race to multicore runs out of steam,” said Keckler, associate professor of computer sciences.

Each TRIPS chip contains two processing cores, each of which can issue 16 operations per cycle with up to 1,024 instructions in flight simultaneously. Current high-performance processors are typically designed to sustain a maximum execution rate of four operations per cycle.

Though the prototype contains two 16-wide processors per chip, the research team aims to scale this up with further development.

Source University of Texas at Austin




The material in this press release comes from the originating research organization. Content may be edited for style and length. Want more? Sign up for our daily email.

43 thoughts on “Next-generation, high-performance processor unveiled”

  1. I suspect it is linked on some high traffic site somewhere. Also, it ranks high in search engines for the ‘phrase next generation processor.’

    Weird.

  2. This “Next-generation, high-performance processor unveiled”
    has been posted as one of the day’s “Top stories”
    since April 2007 !

    Does someone have a vested interest?

  3. Reading some of these comments, it is quite clear to me that there is often an astoundingly common correlation between the amount of knowledge a person has in a specialized area [such as CPU design], and their inability to conceptualize or accept possibly superior solutions. That isn’t to say I think this will be the next big thing, I don’t know enough about CPU architecture to make such a claim. I just find the lack of knowledge people have about this new architecture combined with the assumptions they are making about it’s viability to be somewhat onerous.

  4. It seems to me that your new computer should be very good for “simulated evolution” such as for instance “Gaussian adaptation” because you may test all individuals in a pululation of 1000 individuals in parallel.

    But the relatively small number os individuals in a population limits the number of degrees of freedom in the process, because the statistical certainty in the elements of the moment matrix of the Gaussian must be determined with sufficient precision.

    Gkm

  5. TRILLIONS of calculations in mathematical
    equations is very minor compared to 100,000 to 999,000 times in data calculations that
    is the true supremacy in data communications.

  6. Intel started demonstrating a chip capable of delivering a teraflop of performance last winter. (see http://techresearch.intel.com/articles/Tera-Scale/1449.htm)

    Theirs isn’t x86 compatible either, which means it wouldn’t have mass market appeal, even if Intel offered it as a product (which they don’t plan to do).

    Sustaining performance in the TF and PF domains takes more than a clever core architecture. Memory capacity and bandwidth, packaging and software all play key roles. The UT charts don’t say much about any of these aspects of their design.

  7. Why aren’t there processors that can compute in straight hex? There has to be a way to make a 0 state and 15 clean voltage ranges to create digital hex. It just seems to me that processing FFFF at an adress of ffff is a lot faster than processing 111111111 at an adress of 111111111 or whatever..

    And for optical media, the pits can be circular and match the maximum size readable by the beam. Then have 0 be empty, 1-8 ascending pit sizes, 9-F being inverse conectrentric circles unpitted. 9 would be the next circle edge size down from the other edge of the pit unpitted. F would be the outer ring of the pit with nothing pitted inside that ring.

  8. Advanced branch prediction requires many cycles and large in-core cache, both mean very expensive h/w compared to current CPU technologies. Furthermore, this can only be truly effective when automatic deep branch prediction is required, like when using very high-level programming languages (usually for AI).

    The current trend in desktop h/w is quite the opposite, that is to embed as many parallel cores as possible inside the PC, including general-purpose programmable GPU (graphics card) h/w. This can ease the burden of compatibility of instruction sets from classic x86 while exploiting the current CPU technologies to the max via massive parallelism.

    Also, it should be noted that most heavy-processing applications today, like climate simulations, weather prediction, molecular dynamics, pattern recognition, etc, are designed DSP-like, mainly focusing on simple math instructions that can be easily ported to parallel or vector machines. Hence, a new instruction set for graph-like branch prediction seems too specialized and cost-inefficient for now.

Comments are closed.