Scientific Information Bulletin

for other data transmissions to take place concurrently. The ability to slow down the HDTV signals could provide an effective means of "time sharing" among various signals while reducing data loss or contamination, a major problem in high-speed ATM transmissions.--David K. Kahaner, ONRASIA

SUPERCOMPUTING JAPAN'92 CONFERENCE

Supercomputing Japan'92, held from 23-25 April 1992, is described, with particular emphasis on Hitachi's new 32-GFLOP supercomputer.

INTRODUCTION

Once each year Supercomputing Japan'xx is held. The two preceding conferences were in central Tokyo. This year's was held in a new conference center at the harbor in Yokohama. At the beginning of this century Yokohama was the center of international business in Japan but has been eclipsed by growth in Tokyo (which is now 30 minutes away by local train). Yokohama is currently trying to regain some of this activity with a major building program and harbor redevelopment. The convention center is spectacularly situated and well equipped, but attendance on the conference's opening day was very low, notwithstanding the presence of U.S. Ambassador Armacost at the opening ceremonies.

by David K. Kahaner

This year's conference was organized around 3 days of technical program of which the first was all in English, with speakers almost entirely from the West, and two subsequent days of Japanese speakers. In the past, the technical part of the program has been poorly attended and this year's opening day was no exception, with one estimate of fewer than 100 listeners. Emphasis in all the technical talks is on overview and general applications; the very specialized research papers that are often presented at Western meetings such as Supercomputing '91 (held in November 1991 in Albuquerque) are almost entirely missing. The second and third day's papers covered applications in physics, biology, structures, automobiles, computational fluid dynamics, weather, and electromagnetics, but with only one or two papers in each section. A separate exhibition of vendor products runs for all 3 days. In previous years the exhibition has been thronged, and last year there were almost 9,000 visitors. While I can't speak about the second and third day, on the first day there seemed to be more vendor staff than visitors. This might have been due to bad weather, economy, or some other factors. There were fewer than 60 vendors represented, including publishers, societies, etc.; I assume that this was related to a slowdown in the Japanese computer industry.

I found only a very few items of special interest and almost nothing really new, save for Hitachi's exhibit describing its new supercomputer (announced a few weeks earlier) and Sanyo's new dataflow machine. Overall, one Western visitor commented, "Is this all there is?" I will mention a few Japan-related exhibits but omit any discussion of the U.S. vendors who were present, as expected.

NEC

NEC described its upgraded version of the SX-3, named SX-3R. The major change here is that the clock cycle has been reduced to 2.5 ns (from 2.9 ns), increasing the peak performance of a four-processor system to about 26 GFLOPS. I was told by NEC scientists that the company feels that it has a need to continue to develop a high-end vector supercomputer using bipolar large scale integration (LSI) and liquid cooling technologies, although new technologies such as GaAs are being studied. They also feel the need to develop a highly parallel computer using complementary metal oxide semiconductors (CMOS) and BiCMOS. But a key aspect of these plans is that connections between a parallel and vector supercomputer need to be strong; in other words, NEC feels there is a strong need to keep the logical architecture the same for both types of machines. My own discussions with NEC staff working on parallel computing have not suggested that much has been done along these lines yet.
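The SX-3R clock figures can be checked with simple arithmetic, since peak performance scales inversely with the clock cycle. The sketch below assumes that each processor completes 16 floating-point results per cycle. That number is not given in this report; it is used only because it is consistent with the quoted peak of about 26 GFLOPS. The function and variable names are illustrative.

```python
def peak_gflops(processors, flops_per_cycle, cycle_ns):
    """Peak rate in GFLOPS: results per cycle divided by the cycle time in ns."""
    return processors * flops_per_cycle / cycle_ns

# Assumption (not from the report): 16 floating-point results per cycle per processor.
FLOPS_PER_CYCLE = 16

sx3 = peak_gflops(4, FLOPS_PER_CYCLE, 2.9)   # four processors, 2.9-ns clock
sx3r = peak_gflops(4, FLOPS_PER_CYCLE, 2.5)  # four processors, 2.5-ns clock

print(f"2.9-ns clock: {sx3:.1f} GFLOPS")
print(f"2.5-ns clock: {sx3r:.1f} GFLOPS")
print(f"gain from the faster clock alone: {sx3r / sx3:.2f}x")
```

Under this assumption the whole improvement comes from the shorter cycle; nothing else about the machine needs to change to reach the quoted peak.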

NEC is involved in quite a large amount of research and development (R&D) that was not represented at this conference. For example, they have developed a number of parallel machines for special purpose applications including the following.

• TIP: Dataflow pipeline image processor (ring structured, data-driven processing, pipeline processing, 128 CPUs, connected by hierarchical rings, suitable for image and neuro processing).

• Video signal processor (using 288 custom VLSI chips, broadcast bus, video-rate real time processing, including raster/subregion parallelism).

• HAL, HAL II, HAL III: Logic simulation machines for VLSI (64 CPUs, multistage interconnection network, function level simulation capability, hardware implemented simulation algorithm, subcircuit parallelism).

• Cenju: Circuit simulation (transient circuit analysis) machine (72 CPUs, bus/cluster, multistage interconnection network, quasi-shared memory. Cenju is scheduled to be upgraded this year to Cenju II with faster, custom chips (currently 68000s). I reported on Cenju earlier ["Two Japanese Approaches to Circuit Simulation," Scientific Information Bulletin 16(1), 21-26 (1991)]. NEC is now attempting to experiment with this machine on other applications including concurrent fault simulation, line-search router, magnetohydrodynamic (MHD) plasmas, and finite element analysis. Cenju is the closest that NEC has to a general parallel computer.)

• CHI, CHI2: Parallel inference machine (part of the Institute for New Generation Computer Technology (ICOT) Fifth Generation Computer Project, implements a parallel genetic algorithm, with a new object-oriented language A'Um90. Current applications are to DNA sequence searching.)

• VPP: Vector pipeline processor chip (using 0.8-μm BiCMOS, this 64-bit chip runs at 100 MHz and has a peak performance of 200 MFLOPS; addition or subtraction can be done in parallel with multiplication, division, or logs, each at 100 MFLOPS.)

NEC has also collaborated with Hitachi and Fujitsu in the National Supercomputer Project that I have written about earlier. There is also activity on pipelined memory chips. References:

• K. Umezawa, T. Mizuno, and H. Nishimori, "GaAs Multichip Package for Supercomputer," IEICE Transactions E 74(8), 2309-2316 (August 1991).

• F. Okamoto, Y. Hakihara, C. Ohkubo, N. Nishi, H. Yamada, and T. Enomoto, "A 200-MFLOPS 100-MHz 64-bit BiCMOS Vector-Pipelined Processor (VPP) ULSI," IEEE J. Solid-State Circuits 26(12), 1885-1893 (December 1991).

• N. Nishi, Y. Seo, R. Nakazaki, and N. Ohno, "A Pipelined Storage for Vector Processors," Proc. 4th Int. Conf. on Supercomputing and Third World Supercomputer Exhibition 1, 253-260 (April 1989).

HITACHI

On 31 March 1992, Hitachi Ltd. announced a new series of vector supercomputers, HITAC S-3800 and HITAC S-3600. The S-3800 can be obtained as a multiprocessor system, with a pipeline pitch estimated to be 2 ns using silicon. These are the third generation of HITAC supercomputers, following HITAC S-810 and S-820. For the first time, Hitachi supercomputers support an OSF (Unix) operating system. As I reported earlier, Hitachi has not sold many of their 810/820s in the past year because performance lagged that of NEC and Fujitsu products, so this product has been needed.

The S-3800 is water cooled and has six models: 160, 180, 182, 260, 280, and 480. The first digit gives the number of processors, the second relates to the speed, and the last digit '2' in model 182 shows that the system has two scalar units in one processor. The S-3600 is an air-cooled supercomputer with four models: 120, 140, 160, and 180. The specifications for both of these supercomputers are given in the Appendix.

Readers should note that I have not yet spoken to anyone who has run on this machine, although delivery is scheduled for the end of this year. As far as I know Hitachi has no plans to market them in the United States.

A large number of languages and software products focused on engineering applications have been announced for these machines. One of my favorites is a combination graphical user interface coupled with a powerful scientific programming environment called DEQSOL (Differential Equation Solver Language) that I reported on at length ["DEQSOL and ELLPACK: Problem-Solving Environments for Partial Differential Equations," 16(1), 7-19 (1991)]. ELLPACK is the most similar U.S. software effort that I am aware of, but DEQSOL is designed much more as an engineering tool than as a research environment for algorithms.

The Hitachi exhibit also showed various computer graphics demos, including turbines, tidal waves, flow around a cylinder, molecular dynamics, eddy current for maglev train, etc., all illustrating the power of the supercomputer in modelling applications. One of the most interesting (to me) was almost a perfect copy of a Cray demo showing air flow inside the body of a large commercial airliner; air "particles" vent from the ceiling and could be viewed dispersing throughout the interior. A number of cross sections were also shown. The simulations were done using rational Runge-Kutta integration. What made this fascinating was that Hitachi was modelling air flow inside a Shinkansen (train) body, rather than in an aircraft.

In March 1992 I accompanied Mr. Lloyd M. Thorndyke on a visit to Hitachi. Thorndyke is the founder of ETA Systems in the United States. At that time we were given an overview and explanation of Hitachi packaging technology, some of which is used in these new systems. Both of us were extremely impressed, especially Thorndyke, who has many years of experience in this field. Packaging refers to multichip module construction, boards, cooling, etc. One of the most interesting aspects of Hitachi's work was that the packaging technology looked to us like it was capable of being used cost effectively on products below their highest end machine. Another point to note here is how much life there seems to be left in silicon (2 ns) and that Japanese vendors have clearly demonstrated that they will obtain high performance by capitalizing on their expertise in fast technologies. For example, Hitachi scientists feel that current CMOS technology leads to seemingly high variability in performance, but that this may be related to lithography variability; thus improvements in the latter will translate into performance improvements.

Reducing machine cycle time is a very direct way to increase performance, but doing this requires a substantial number of technological improvements. Machine cycle time is a direct function of the switching time delay or fanout delay per gate for logic or the access time for memory times the number of circuits in the loop. Logic and memory improvements require semiconductor technology, such as developing faster circuit families, or circuits with larger fanout capability. Shortening signal paths, using lower dielectric constant materials, using pins and other conductors with lower inductance, and generally using LSI, modules, boards, and cables with better and more uniform electric characteristics are areas in which to look for improvements. Technologies needed here are design automation, packaging, materials, and componentry. Increasing the number of circuits in the loop requires logic design technology and "smart" logic. Reducing the average number of cycles per instruction is typically done with pipelined (staged) execution leading to shorter execution pitch, parallel execution, and long instruction words.

The Hitachi scientists we spoke to emphasized to us that they feel that their products are competitive with IBM technology and lead in many important areas. (Quite different from the Western view of Hitachi as an IBM follower.) A good overview of the hardware technology used by Hitachi was presented in the paper "Hardware Technology for Hitachi M-880 Processor Group." This was given to us by

Mr. Fumiyuki Kobayashi
Chief Engineer, Hardware Technology Supervisory Center
Kanagawa Works, Hitachi Ltd.
1 Horiyamashita
Hadano, Kanagawa 259-13, Japan
Tel: +81-463-88-1311
Fax: +81-463-87-6866

FUJITSU

Fujitsu was showing its VP series machines. The VPX line incorporates some manufacturing cost reductions and the X signifies that it runs Unix. I was shown a number of Fujitsu software products, a crystal structure design system (Crystruct), a computational material design package (COMDEP), as well as various computer-aided design (CAD) packages. While these have English names, they were entirely developed in Japan and looked competitive with Western products. Surprisingly, Fujitsu did not have any exhibit space devoted to its AP1000 parallel computer.

MATSUSHITA

Matsushita (Panasonic, National) had a large exhibit showing off its ADENART. I have reported on this several times in the past, but to repeat, this is a 256-processor parallel machine built on 16 boards. Each board is built with a crossbar switch and communication between boards is designed to make three-dimensional (3D) alternating direction implicit (ADI) computation very efficient. The original design was from Kyoto University (Nogi), but Matsushita is commercializing the product. The current version is based on 68000 chips and has a peak performance of 2.5 GFLOPS. This is a running system, complete with English language user's manual. One ADENART is to be installed at Tokyo University this year. The leader of the project,

Dr. Shunsuke Matsuda

ADENART Task Force Leader
Matsushita Electric Industrial Co.,
Ltd.

1-260, Yagumo-Higashimachi
Moriguchi, Osaka 570, Japan
Tel: +81-6-904-5147
Fax: +81-6-904-5594
E-mail: smatsuda@drl.mei.co.jp

told me that a newer version (based on a custom RISC chip) will have substantially higher performance. I've been told that the internal name for this new machine is either Ohm or Omega.
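For readers unfamiliar with ADI, the following is a minimal sketch of why data movement between directions matters for this kind of computation. It is not ADENART code. It assumes a two-dimensional heat equation on the unit square with zero boundary values (the machine targets the 3D case), uses the Peaceman-Rachford form of ADI, and all function names are hypothetical. Each half step solves many independent tridiagonal systems along one direction, and a transpose is needed between sweeps; that transpose is presumably the kind of exchange the inter-board communication is meant to make cheap.

```python
import math

def solve_implicit_line(a, rhs):
    """Thomas algorithm for (1 + 2a) x[k] - a x[k-1] - a x[k+1] = rhs[k], zero boundaries."""
    n = len(rhs)
    diag = 1.0 + 2.0 * a
    c = [0.0] * n  # modified super-diagonal
    g = [0.0] * n  # modified right-hand side
    c[0] = -a / diag
    g[0] = rhs[0] / diag
    for k in range(1, n):
        denom = diag + a * c[k - 1]
        c[k] = -a / denom
        g[k] = (rhs[k] + a * g[k - 1]) / denom
    x = [0.0] * n
    x[-1] = g[-1]
    for k in range(n - 2, -1, -1):
        x[k] = g[k] - c[k] * x[k + 1]
    return x

def apply_explicit_line(a, line):
    """Return line + a * (second difference of line), with zero boundary values."""
    n = len(line)
    out = []
    for k in range(n):
        left = line[k - 1] if k > 0 else 0.0
        right = line[k + 1] if k < n - 1 else 0.0
        out.append(line[k] + a * (left - 2.0 * line[k] + right))
    return out

def transpose(grid):
    """Swap the roles of the two directions (the all-to-all exchange between sweeps)."""
    return [list(col) for col in zip(*grid)]

def adi_step(u, a):
    """One Peaceman-Rachford step; u[i][j] holds the value at x index i, y index j."""
    rhs = [apply_explicit_line(a, row) for row in u]                 # explicit in y
    half = [solve_implicit_line(a, line) for line in transpose(rhs)]  # implicit in x
    rhs = [apply_explicit_line(a, line) for line in half]            # explicit in x
    return [solve_implicit_line(a, line) for line in transpose(rhs)]  # implicit in y

# Illustrative parameters (assumptions, not from the report).
n = 15                      # interior grid points per direction
h = 1.0 / (n + 1)           # grid spacing
dt = 0.01                   # time step
a = dt / (2.0 * h * h)      # half-step diffusion number

u0 = [[math.sin(math.pi * (i + 1) * h) * math.sin(math.pi * (j + 1) * h)
       for j in range(n)] for i in range(n)]
u1 = adi_step(u0, a)

mid = n // 2
print("ADI decay factor at centre:", u1[mid][mid] / u0[mid][mid])
print("exact decay factor exp(-2 pi^2 dt):", math.exp(-2.0 * math.pi ** 2 * dt))
```

All the line solves within one sweep are independent, so they can be spread across processors; only the transpose requires communication.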

SANYO

Sanyo displayed an interesting dataflow machine (Cyberflow). This is a 64-processor unit, a high speed display control unit, and an input/output (I/O) interface, packed together in a 0.1 m³ body. Peak performance is 640 MFLOPS, based on 10 MFLOPS/10 MIPS per chip. Each single VLSI
