for other data transmissions to take place concurrently. The ability to slow down the HDTV signals could provide an effective means of "time sharing" among various signals while reducing data loss or contamination, a major problem in high-speed ATM transmissions.--David K. Kahaner, ONRASIA
Supercomputing Japan'92, held from 23-25 April 1992, is described, with
particular emphasis on Hitachi's new 32-GFLOP supercomputer.
by David K. Kahaner
Once each year Supercomputing Japan'xx is held. The two preceding conferences were in central Tokyo. This year's was held in a new conference center at the harbor in Yokohama. At the beginning of this century Yokohama was the center of international business in Japan but has been eclipsed by growth in Tokyo (which is now 30 minutes away by local train). Yokohama is currently trying to regain some of this activity with a major building program and harbor redevelopment. The convention center is spectacularly situated and well equipped, but attendance on the conference's opening day was very low, notwithstanding the presence of U.S. Ambassador Armacost at the opening ceremonies.

This year's conference was organized around 3 days of technical program, of which the first was all in English, with speakers almost entirely from the West, and two subsequent days of Japanese speakers. In the past, the technical part of the program has been poorly attended and this year's opening day was no exception, with one estimate of fewer than 100 listeners. Emphasis in all the technical talks is on overview and general applications; the very specialized research papers that are often presented at Western meetings such as Supercomputing '91 (held in November 1991 in Albuquerque) are almost entirely missing. The second and third day's papers covered applications in physics, biology, structures, automobiles, computational fluid dynamics, weather, and electromagnetics, but with only one or two papers in each section. A separate exhibition of vendor products runs for all 3 days. In previous years the exhibition has been thronged, and last year there were almost 9,000 visitors. While I can't speak about the second and third day, on the first day there seemed to be more vendor staff than visitors. This might have been due to bad weather, the economy, or some other factors. There were fewer than 60 vendors represented, including publishers, societies, etc.; I assume that this was related to a slowdown in the Japanese computer industry.

I found only a very few items of special interest and almost nothing really new, save for Hitachi's exhibit describing its new supercomputer (announced a few weeks earlier) and Sanyo's new dataflow machine. Overall, one Western visitor commented, "Is this all there is?" I will mention a few Japan-related exhibits but omit any discussion of the U.S. vendors who were present, as expected.

NEC

NEC described its upgraded version of the SX-3, named SX-3R. The major change here is that the clock cycle has been reduced to 2.5 ns (from 2.9 ns), increasing the peak performance of a four-processor system to about 26 GFLOPS. I was told by NEC scientists that the company feels that it has a need to continue to develop a high end vector supercomputer using bipolar large scale integration (LSI) and liquid cooling technologies, although new technologies such as GaAs are being studied. They also feel the need to develop a highly parallel computer using complementary metal oxide semiconductors (CMOS) and BiCMOS. But a key aspect of these plans is that connections between a parallel and vector supercomputer need to be strong; in other words, NEC feels there is a strong need to keep the logical architecture the same for both types of machines. My own discussions with NEC staff working on parallel computing have not suggested that much has been done along these lines yet.

NEC is involved in quite a large amount of research and development (R&D) that was not represented at this conference. For example, they have developed a number of parallel machines for special purpose applications including the following.

• TIP: Dataflow pipeline image processor (ring structured, data-driven processing, pipeline processing, 128 CPUs, connected by hierarchical rings, suitable for image and neuro processing).
• Video signal processor (using 288 custom VLSI chips, broadcast bus, video-rate real time processing, including raster/subregion parallelism).

• HAL, HAL II, HAL III: Logic simulation machines for VLSI (64 CPUs, multistage interconnection network, function level simulation capability, hardware implemented simulation algorithm, subcircuit parallelism).

• Cenju: Circuit simulation (transient circuit analysis) machine (72 CPUs, bus/cluster, multistage interconnection network, quasi-shared memory. Cenju is scheduled to be upgraded this year to Cenju II with faster, custom chips (currently 68000s). I reported on Cenju earlier ["Two Japanese Approaches to Circuit Simulation," Scientific Information Bulletin 16(1), 21-26 (1991)]. NEC is now attempting to experiment with this machine on other applications including concurrent fault simulation, line-search router, magnetohydrodynamic (MHD) plasmas, and finite element analysis. Cenju is the closest that NEC has to a general parallel computer.)

• CHI, CHI2: Parallel inference machine (part of the Institute for New Generation Computer Technology (ICOT) Fifth Generation Computer Project, implements a parallel genetic algorithm, with a new object-oriented language A'Um90. Current applications are to DNA sequence searching.)

• VPP: Vector pipeline processor chip (using 0.8-µm BiCMOS, this 64-bit chip runs at 100 MHz and has a peak performance of 200 MFLOPS; addition or subtraction can be done in parallel with multiplication, division, or logs, each at 100 MFLOPS.)

NEC has also collaborated with Hitachi and Fujitsu in the National Supercomputer Project that I have written about earlier. There is also activity on pipelined memory chips.

References:

• K. Umezawa, T. Mizuno, and H. Nishimori, "GaAs Multichip Package for Supercomputer," IEICE Transactions E 74(8), 2309-2316 (August 1991).

• F. Okamoto, Y. Hakihara, C. Ohkubo, N. Nishi, H. Yamada, and T. Enomoto, "A 200-MFLOPS 100-MHz 64-bit BiCMOS Vector-Pipelined Processor (VPP) ULSI," IEEE J. Solid-State Circuits 26(12), 1885-1893 (December 1991).

• N. Nishi, Y. Seo, R. Nakazaki, and N. Ohno, "A Pipelined Storage for Vector Processors," Proc. 4th Int. Conf. on Supercomputing and Third World Supercomputer Exhibition 1, 253-260 (April 1989).

HITACHI

On 31 March 1992, Hitachi Ltd. announced a new series of vector supercomputers, HITAC S-3800 and HITAC S-3600. The S-3800 can be obtained as a multiprocessor system, with a pipeline pitch estimated to be 2 ns using all silicon. These are the third generation of HITAC supercomputers, following HITAC S-810 and S-820. For the first time, Hitachi supercomputers support an OSF (Unix) operating system. As I reported earlier, Hitachi has not sold many of their 810/820s in the past year because performance lagged that of NEC and Fujitsu products, so this product has been needed.

Readers should note that I have not yet spoken to anyone who has run on this machine, although delivery is scheduled for the end of this year. As far as I know Hitachi has no plans to market them in the United States.

The S-3800 is water cooled and has six models: 160, 180, 182, 260, 280, and 480. The first digit gives the number of processors, the second relates to the speed, and the last digit '2' in model 182 shows that the system has two scalar units in one processor. The S-3600 is an air-cooled supercomputer with four models: 120, 140, 160, and 180. The specifications for both of these supercomputers are given in the [...]

A large number of languages and software products focused on engineering applications have been announced for these machines. One of my favorites is a combination graphical user interface coupled with a powerful scientific programming environment called DEQSOL (Differential Equation Solver Language) that I reported on at length ("DEQSOL and ELLPACK: Problem-Solving Environments for Partial Differential Equations," 16(1), 7-19 (1991)). ELLPACK is the most similar U.S. software effort that I am aware of, but DEQSOL is designed much more as an engineering tool than as a research environment for algorithms.

The Hitachi exhibit also showed various computer graphics demos, including turbines, tidal waves, flow around a cylinder, molecular dynamics, eddy current for maglev train, etc., all illustrating the power of the supercomputer in modelling applications. One of the most interesting (to me) was almost a perfect copy of a Cray demo showing air flow inside the body of a large commercial airliner; air "particles" vent from the ceiling and could be viewed dispersing throughout the interior. A number of cross sections were also shown. The simulations were done using rational Runge-Kutta integration. What made this fascinating was that Hitachi was modelling air flow inside a Shinkansen (train) body, rather than in an aircraft.
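The S-3800 model numbering described in this section (first digit for the number of processors, second digit related to speed, a trailing '2' for two scalar units per processor) can be illustrated with a minimal sketch. The class and function names are hypothetical, and treating a trailing '0' as "one scalar unit per processor" is an assumption; the text only states the meaning of the '2' in model 182.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3800Model:
    """Decoded fields of an S-3800 model number (hypothetical helper type)."""
    name: str
    processors: int                  # first digit: number of processors
    speed_digit: int                 # second digit: "relates to the speed"
    scalar_units_per_processor: int  # derived from the last digit


def decode_s3800_model(name: str) -> S3800Model:
    """Split a three-digit S-3800 model number into its documented parts."""
    if len(name) != 3 or not name.isdigit():
        raise ValueError(f"expected a three-digit model number, got {name!r}")

    last_digit = name[2]
    if last_digit == "2":
        scalar_units = 2  # stated in the text for model 182
    elif last_digit == "0":
        scalar_units = 1  # ASSUMPTION: not stated in the text
    else:
        raise ValueError(f"unrecognized last digit in model {name!r}")

    return S3800Model(
        name=name,
        processors=int(name[0]),
        speed_digit=int(name[1]),
        scalar_units_per_processor=scalar_units,
    )


if __name__ == "__main__":
    # The six S-3800 models listed in the text.
    for model in ("160", "180", "182", "260", "280", "480"):
        print(decode_s3800_model(model))
```

The speed digit is kept as a raw integer because the text does not say how it maps to an actual clock rate or performance figure.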
In March 1992 I accompanied Mr. [...] Systems in the United States. At that time we were given an overview and explanation of Hitachi packaging technology, some of which is used in these new systems. Both of us were extremely impressed, especially Thorndyke, who has many years of experience in this field. Packaging refers to multichip module construction, boards, cooling, etc. One of the most interesting aspects of Hitachi's work was that the packaging technology looked to us like it was capable of being used cost effectively on products below their highest end machine. Another point to note here is how much life there seems to be left in silicon (2 ns) and that Japanese vendors have clearly demonstrated that they will obtain high performance by capitalizing on their expertise in fast technologies. For example, Hitachi scientists feel that current CMOS technology leads to seemingly high variability in performance, but that this may be related to lithography variability; thus improvements in the latter will translate into performance improvements.

Reducing machine cycle time is a very direct way to increase performance, but doing this requires a substantial number of technological improvements. Machine cycle time is a direct function of the switching time delay or fanout delay per gate for logic, or the access time for memory, times the number of circuits in the loop. Logic and memory improvements require semiconductor technology, such as developing faster circuit families, or circuits with larger fanout capability. Shortening signal paths, using lower dielectric constant materials, using pins and other conductors with lower inductance, and generally using LSI, modules, boards, and cables with better and more uniform electric characteristics are areas in which to look for improvements. [...] componentry. Increasing the number of circuits in the loop requires logic design technology and "smart" logic. Reducing the average number of cycles per instruction is typically done with pipelined (staged) execution leading to shorter execution pitch, parallel execution, and long instruction words.

The Hitachi scientists we spoke to emphasized to us that they feel that their products are competitive with IBM technology and lead in many important areas. (Quite different from the Western view of Hitachi as an IBM follower.) A good overview of the hardware technology used by Hitachi was presented in the paper "Hardware Technology for Hitachi M-880 Processor Group." This was given to us by

Mr. Fumiyuki Kobayashi
Chief Engineer, Hardware Technology Supervisory Center
Kanagawa Works, Hitachi Ltd.
1 Horiyamashita
Hadano, Kanagawa 259-13, Japan
Tel: +81-463-88-1311

FUJITSU

Fujitsu was showing its VP series machines. The VPX line incorporates some manufacturing cost reductions and the X signifies that it runs Unix. I was shown a number of Fujitsu software products, a crystal structure design system (Crystruct), a computational material design package (COMDEP), as well as various computer-aided design (CAD) packages. While these have English names, they were entirely developed in Japan and looked competitive with Western products. Surprisingly, Fujitsu did not have any exhibit space devoted to its AP1000 parallel computer.

MATSUSHITA

Matsushita (Panasonic, National) had a large exhibit showing off its ADENART. I have reported on this several times in the past, but to repeat, this is a 256-processor parallel machine built on 16 boards. Each board is built with a crossbar switch and communication between boards is designed to make three-dimensional (3D) alternating direction implicit (ADI) computation very efficient. The original design was from Kyoto University (Nogi), but Matsushita is commercializing the product. The current version is based on 68000 chips and has a peak performance of 2.5 GFLOPS. This is a running system, complete with English language user's manual. One ADENART is to be installed at Tokyo University this year. The leader of the project,

Dr. Shunsuke Matsuda
ADENART Task Force Leader
Matsushita Electric Industrial Co., Ltd.
1-260, Yagumo-Higashimachi
Moriguchi, Osaka 570, Japan
Tel: +81-6-904-5147
Fax: +81-6-904-5594
E-mail: firstname.lastname@example.org

told me that a newer version (based on a custom RISC chip) will have substantially higher performance. I've been told that the internal name for this new machine is either Ohm or Omega.

SANYO

Sanyo displayed an interesting dataflow machine (Cyberflow). This is a 64-processor unit, a high speed display control unit, and an input/output (I/O) interface, packed together in a 0.1 m³ body. Peak performance is 640 MFLOPS, based on 10 MFLOPS/10 MIPS per chip. Each single VLSI