Scientific Information Bulletin


THE CYCLIC PIPELINED COMPUTER AND ERATO

A brief description of ERATO (Exploratory Research for Advanced Technology) and one of its projects, QMFL (quantum magneto flux logic), with particular emphasis on the cyclic pipelined computer (CPC), is given. CPC is a shared pipelined memory, single processor, multiple instruction stream architecture, originally designed to be compatible with Josephson junction devices. This ERATO project ends this year.

The Exploratory Research for Advanced Technology (ERATO) projects were started in 1981 by the Research Development Corporation of Japan (JRDC). JRDC is set up by Japanese law under the administration of the Science and Technology Agency (STA), which is a ministerial agency reporting directly to the Prime Minister's office (see Kahaner's E-mail report japgovt.udt, 30 July 1990). ERATO's objective is to conduct interesting basic research. Essentially it is an experiment in the management of research and development (R&D) in which mostly young researchers from industry, government, and universities gather and conduct multidisciplinary research on high risk projects. A great deal has already been written about ERATO (see, for example, Ref 1 and 2). In this article we want to focus on one particular program; nevertheless, for completeness, we present a thumbnail sketch of the general program.

There are about a dozen ERATO projects at any time; the total budget is around $30M, so support for individual projects ranges from about $2M to $5M per year. The staff also varies, but may be as large as about 20 researchers during the most active phase of a project.

by David K. Kahaner and Paul Spee

One of the most unusual things about ERATO is that all the projects are of fixed duration, 5 years. Although the program does not allow for extensions, promising activities might be continued by other organizations. To emphasize the temporary nature, each project rents whatever office and laboratory space it needs at a university, corporation, or research institute.

ERATO focuses on young researchers; the average age is just slightly more than 31. They are given good facilities and good salaries. A JRDC study showed that starting salaries exceed those of 75% of U.S. Ph.D. chemists in industry, and that salaries of ERATO researchers with three or more years of experience exceed those of 90% of U.S. Ph.D. chemists in industry.

A key ingredient of each ERATO project is its director. The perfect person is charismatic, with a dynamic personality, eminent in his field, and capable of attracting and inspiring his coworkers. Once found, the director is more or less free to recruit and organize the team as he sees fit. In fact, the projects are informally referred to by the director's name, e.g., "the Goto Project."

Eiichi Goto, who directs the QMFL project, typifies this profile. Goto, who retired from the University of Tokyo in April 1991 and is now at the University of Kanagawa, invented the Parametron about 30 years ago. He is an extremely extroverted person and still bristles with new ideas. In fact, one of the younger scientists complained to me that Goto has so many ideas that it was difficult to keep up with his thinking. The proceedings of the latest project symposium (the Eighth RIKEN Symposium on Josephson Electronics, 15 March 1991) list Goto as a coauthor on all but one of the papers, including one on a new type of refrigerator.

About half of the ERATO researchers are seconded from industry, and a few are from universities or national laboratories. The remainder are hired as individuals. Most of these are Japanese, but about 10% are foreign. The seconding system preserves the researcher's seniority and benefits because ERATO reimburses the company for the researcher. The non-Japanese researchers give the projects a definite international flavor. Several of them speak little or no Japanese, and papers in the proceedings of the symposium mentioned above are almost entirely in English, although most of this was done as a preparation for presentations in the United States in August.

Patents for ERATO projects are jointly owned by the inventors and JRDC. Researchers share legal expenses for patents they own with JRDC, but they may also assign ownership of the patent to JRDC. Company researchers may assign patent ownership to their company. Through 1988 there were 415 patent applications filed in Japan and 82 outside Japan. Up to 1988 the 338 ERATO researchers had written almost 1,400 papers, and of these more than one-third were published or presented outside of Japan.

Each year there is an ERATO symposium held in Tokyo. In each of four afternoon sessions, researchers from four different projects present the progress in their respective programs. Individual projects can also have symposia, although these are more informal.

A foreign researcher has, in principle, a 1-year contract, which may be renewed. In fact, the ERATO budget explicitly allows for foreign researchers to stay for the full length of a project, 5 years. Through 1989, 27 researchers had participated, but only a few remained the full 5 years. (Perhaps there is some concern among these young non-Japanese researchers about the incremental benefit of staying all 5 years. Employment opportunities exist within Japanese corporations, but upward mobility is questionable.) A few foreign companies have also sent researchers, including Allelix (Canada), Celltech (U.K.), Intel (U.S.), and 3M (U.S.). Some formal recruiting occurs, but most of the foreign researchers apply because of word-of-mouth recruiting. In 1989 there were five researchers from the United States. Foreign researchers receive the same base salary as Japanese, but they also receive moving expenses, a housing allowance, and some provision for Japanese language training. Researchers must locate their own housing; there are no special housing facilities because the ERATO projects are widely dispersed.

QUANTUM MAGNETIC FLUX PROJECT (GOTO-QMFL PROJECT)

This project began in 1986 and is directed by Professor E. Goto, recently retired as Professor of Information Science at Tokyo University. Goto is famous for patenting the Parametron in the 1950s; it uses resonating circuits in which current phase is used to store information. In fact, the first Japanese computers were based on the Parametron, e.g., the Hitachi HIPAC-xxx (P = Parametron). However, Hitachi eventually changed to transistor technology (the Hitachi HITAC-xxx).

In 1983 Goto proposed a Parametron-like element using Josephson junctions. The binary states of the element are the two locations of magnetic flux. This idea is a natural step in Josephson technology in which devices use a single quantum of flux. In 1982 IBM's Josephson program was abandoned; several Japanese companies have continued their research and have been reporting steady progress (see, for example, the comments about Hitachi in Kahaner's E-mail report parallel.903, 6 Nov 1990).

The current Goto-QMFL project is divided into three groups:

• Fundamental Property
• Magnetic Shielding
• Computer Architecture

The first group within the project is working on a new Josephson device called QFP (quantum flux parametron) (Ref 3 and 4), in which the unit of information is represented not by voltage but by magnetic flux. The second group is researching a helium liquefying process and magnetic shielding. The third group is researching a new type of architecture called the cyclic pipelined computer (CPC) (Ref 5). Furthermore, software for this highly pipelined parallel computer is being developed.

The three groups illustrate the temporary nature of ERATO projects. When I first went to visit the Computer Architecture Group, it was housed in an ordinary office building in central Tokyo. Last fall the group moved to the Hitachi Central Research Laboratory in suburban Tokyo. The Fundamental Property Group is also at Hitachi and the Magnetic Shielding Group is at ULVAC.

The overall project's aims are (1) to demonstrate that QFP devices can operate in the range of 10 GHz, (2) to demonstrate the capability of removing magnetic flux from superconductors, and (3) to develop a computer architecture suitable for a QFP computer.

The Fundamental Property Group has six to seven persons, and the Magnetic Shielding and Architecture Groups each have about four people, excluding secretaries. A discussion of the Fundamental Property and Magnetic Shielding Groups, which are essentially associated with building Josephson devices, was given in a recent Japan Technology Evaluation Program (JTECH) report (Ref 6). The Architecture Group was not in that author's (Rowell) area of expertise and was only mentioned in his report. His summary with respect to the Josephson technology is that the project is "plowing new ground (or old ground with new devices), and it will be most interesting to see the magnitude of its impact in 10 years' time." A second JTECH study in 1989, "High Temperature Superconductivity in Japan," also has a short summary of the Goto project written by M. Dresselhaus, again only focusing on the Josephson aspects and concluding that "this technology benefits from very high speeds and extremely small power consumption and is being examined for a variety of digital applications including next generation computers." The potential for high performance using Josephson devices comes from this combination of very high clock speeds (tens of GHz) and low power (10^-9 W/gate). Another advantage of the QFP device is its flux transfer characteristics, and it has just been reported that a prototype of three-dimensional integration was proven by stacking two chips together and by observing signal transfer between these chips (Ref 7). The hope, of course, is to replace the silicon with Josephson devices to build a three-dimensional package that is a computer in a 1-cm cube.

The Computer Architecture Group investigates new architectures to take advantage of specific features of Josephson devices. The main difference between Josephson devices and conventional devices is that a Josephson device acts as a latch. Because no separate latches, and hence no latch delays, are needed between the pipeline stages of a pipelined computer, the processor may be deeply pipelined. In pipelining, the execution of multiple instructions is overlapped. Each instruction is broken into parts, called stages. Pipelining is a key implementation technique used to make today's fast CPUs. Figure 1 shows a simple (and ideal) example of pipelining. In the figure five instructions execute in sequence. The stage of the instruction denoted with x's represents the actual execution (EX), as opposed to instruction fetch (IF), decode, etc.

In a super-pipelined computer, each stage is divided into smaller pipeline segments, as in Figure 2, which is also idealized.
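The idealized timing of Figures 1 and 2 can be put into numbers. The following is a minimal sketch, not taken from the project's own material: it assumes no stalls, one instruction entering the pipeline per cycle, and zero latch overhead. The function names and the stage labels used in the example are illustrative assumptions.

```python
def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed when each instruction finishes every stage before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles for an ideal pipeline: fill it once, then one instruction completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def superpipelined_time(n_instructions: int, n_stages: int, segments_per_stage: int) -> float:
    """Elapsed time, in units of the original stage time, when every stage is
    split into equal segments and the clock runs that many times faster
    (assumption: splitting a stage adds no overhead)."""
    if n_instructions == 0:
        return 0.0
    total_segments = n_stages * segments_per_stage
    return (total_segments + n_instructions - 1) / segments_per_stage


def timing_diagram(n_instructions: int, stage_names: list[str]) -> list[str]:
    """One text row per instruction; instruction i enters the pipeline at cycle i."""
    rows = []
    for i in range(n_instructions):
        cells = ["    "] * i + [f"{name:<4}" for name in stage_names]
        rows.append("".join(cells).rstrip())
    return rows


if __name__ == "__main__":
    # Stage labels are assumed for illustration; the text names only IF, decode, and EX.
    for row in timing_diagram(5, ["IF", "ID", "EX", "MEM", "WB"]):
        print(row)
    print(sequential_cycles(5, 5), pipelined_cycles(5, 5), superpipelined_time(5, 5, 2))
```

Under these assumptions, five instructions in a five-stage pipeline finish in 9 cycles instead of 25, and splitting each stage into two segments reduces the elapsed time to 7 of the original stage times.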

Several factors limit the performance that deeper pipelining can deliver:

(1) The extra overhead associated with a large number of segments. Circuitry, called latches, is needed between the segments.

(2) Pipeline hazards, that is, situations that prevent the next instruction in the instruction stream from executing during its clock cycle. This could be a hardware resource conflict, a data conflict when an instruction depends on the results of an unfinished instruction, or a control problem when the program counter is changed because of a branch instruction (Ref 8).

(3) The memory system. Hennessy and Patterson (Ref 9) claim that the "biggest impact of pipelining on the machine resources is in the memory system." Highly pipelined processors require a much higher memory bandwidth than nonpipelined processors because instructions and data are fetched from and stored to memory at a much higher rate.

Concerning (1), as mentioned above, one of the distinct characteristics of Josephson logic is that each basic logic device acts as its own latch and, in principle, this permits a very large number of segments with little overhead.

Concerning (2), the CPC has two main characteristics: pipelined memory and a fixed number of instruction streams, which share the functional units and main memory. Only the hardware that can be considered part of the context of a particular instruction stream is duplicated. This hardware includes the program counter, processor status, registers, etc. By alternating the instruction streams in a cyclic manner, distinct virtual processors are created. In effect, the CPC implements a multiple instruction multiple data (MIMD) computer. Figure 3 illustrates this idea with three distinct instruction streams in a pipelined computer. An analogous figure could be given for a super-pipelined CPC.
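The cyclic alternation of instruction streams in the CPC can be sketched in a few lines. This is an illustrative model only, not the group's design: the `StreamContext` class, its fields, and the condition in `same_stream_never_overlaps` are assumptions made for the example. Each stream keeps a private program counter, and the shared pipeline accepts one instruction per cycle from the streams in fixed rotation.

```python
from dataclasses import dataclass


@dataclass
class StreamContext:
    """Per-stream state that would be duplicated in hardware (hypothetical layout)."""
    name: str
    n_instructions: int
    pc: int = 0  # program counter private to this stream


def cyclic_issue(streams: list[StreamContext], n_cycles: int) -> list[str]:
    """Issue one instruction per cycle, rotating through the streams in fixed order."""
    issued = []
    for cycle in range(n_cycles):
        ctx = streams[cycle % len(streams)]
        if ctx.pc < ctx.n_instructions:
            issued.append(f"{ctx.name}{ctx.pc + 1}")
            ctx.pc += 1
        else:
            # The slot belongs to this stream even when it has nothing left to run.
            issued.append("idle")
    return issued


def same_stream_never_overlaps(n_streams: int, pipeline_depth: int) -> bool:
    """Simplifying assumption: an instruction occupies the pipeline for
    pipeline_depth cycles, and a stream issues once every n_streams cycles.
    If n_streams >= pipeline_depth, a stream's previous instruction has left
    the pipeline before its next one enters."""
    return n_streams >= pipeline_depth


if __name__ == "__main__":
    streams = [StreamContext("A", 3), StreamContext("B", 3), StreamContext("C", 3)]
    print(" ".join(cyclic_issue(streams, 9)))
```

In this model, consecutive instructions of one stream are separated by as many cycles as there are streams, which is why a dependence between them need not stall the shared pipeline when the rotation is at least as long as the pipeline is deep.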

Concerning (3), if the performance of the CPU can be increased by pipelining, then why not increase the performance, that is, the access rate of the memory, by pipelining as well? If a memory access can be divided into successive independent operations, for example, decode column, decode row, access cell, output data, such operations could be executed in parallel, thus pipelining memory. In Josephson computers, the main memory is to be built with the same Josephson logic devices as those used in the processor. For such a computer, both the processor and the main memory would be naturally pipelined with the same pipeline pitch.

Memory is often a bottleneck in high performance computer systems. By increasing the machine-level parallelism, the number of memory accesses (instruction fetch, operand fetch, operand store) increases, making further demands on the design of efficient memory systems. High performance computers often use techniques such as n-way low-order interleaving (the low-order address bits select one of n memory modules) and n-bank memory, where the high-order bits specify the bank and the low-order bits are offsets into the bank. Low-order interleaving is especially efficient for array and vector processors, where memory is often addressed sequentially (vector access), while n-bank memory is used in shared memory multiprocessors, where processors and memory modules are connected through an interconnection network.
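The two address mappings can be contrasted in a short sketch. The function names are illustrative, and integer division and remainder stand in for selecting address bits (the two are equivalent when the module count and bank size are powers of two).

```python
def low_order_interleave(address: int, n_modules: int) -> tuple[int, int]:
    """Low-order part of the address selects the module; the rest is the offset."""
    return address % n_modules, address // n_modules


def high_order_banks(address: int, bank_size: int) -> tuple[int, int]:
    """High-order part of the address selects the bank; the rest is the offset."""
    return address // bank_size, address % bank_size


if __name__ == "__main__":
    addresses = range(8)
    # Sequential addresses rotate through the modules under low-order interleaving...
    print([low_order_interleave(a, 4)[0] for a in addresses])
    # ...but stay within one bank at a time under high-order bank selection.
    print([high_order_banks(a, 4)[0] for a in addresses])
```

With four modules, consecutive addresses land in modules 0, 1, 2, 3, 0, 1, 2, 3, so a sequential vector access keeps all modules busy; with banks of four words, the same addresses fall in bank 0 four times and then bank 1 four times.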

The pipelined memory of the CPC has the advantage that it does not suffer from performance degradation caused by memory access conflicts. Neither does the CPC require an interconnection network, which may suffer either from path conflicts or memory access conflicts (Ref 10).

Current high performance computers require cache memory that can keep up with the memory access rate. When the processor requests data that are not in the cache, a cache miss occurs and the data must be fetched from memory. For super-pipelined and superscalar computers, a cache miss can easily cause an overhead of a factor of 10. (In a superscalar machine, the hardware can issue a small number, two to four, of independent instructions in a single clock cycle.) In the CPC, the pipeline pitch of the main memory is the same as the pipeline pitch of the processor. The CPC does not currently implement a cache, but the group is still researching this question.

On the other hand, one disadvantage of a CPC is that the random memory access pattern of different instruction streams decreases locality of memory reference, but this is not a problem if a cache is not used. The Architecture Group feels that a CPC can be very well suited for random memory access patterns such as neural network simulations.

CPC STATUS AND PROSPECTS

The work of the Computer Architecture Group has been overshadowed by the attention drawn to the hardware. The Architecture Group has been designing a computer architecture that is specifically suited for implementation on a machine in which Josephson devices are used both for the main processor and for the memory. The inherent rapid switching capability of Josephson devices means that it might be profitable to rethink some fundamental assumptions about the relationship of memory to processing. To implement their ideas most effectively, it is necessary to have Josephson technology in place, but all other aspects of the research are essentially independent of it. In other words, using basic assumptions about this technology, the group can design and simulate using silicon integrated circuits (ICs). Furthermore, the group feels that it would be reasonable to use CPC even without Josephson technology.
