COMPUTING IN HIGH ENERGY PHYSICS '91

INTRODUCTION

The international Computing in High Energy Physics meeting, held
11-15 March in Tsukuba Science City, Japan, is summarized.

High energy physicists are engaged in "big ticket" physics. These are the people whose experiments require the large accelerators at CERN, Fermi National Accelerator Laboratory (FNAL), the National Laboratory for High Energy Physics in Japan (KEK), the Superconducting Super Collider (SSC) in Texas, etc. The experiments generate massive amounts of data: an experiment can produce 100 MB/s of data, a terabyte a day, and a petabyte a year. Acquiring, moving, and storing these data require high-speed, high-bandwidth networks, libraries of tapes and other external storage devices, and automated retrieval systems.
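
As a rough check on these figures, the back-of-the-envelope Python sketch below shows how a 100 MB/s acquisition rate translates into daily and yearly volumes. The live fraction and the number of running days per year are assumptions chosen only to reproduce the order of magnitude quoted above; they are not figures from the conference.

    # Rough data-volume arithmetic for a detector acquiring data at 100 MB/s.
    # The live fraction and running days are assumptions chosen only to show
    # how per-day and per-year volumes of the quoted order of magnitude arise.
    RATE_MB_PER_S = 100        # quoted peak acquisition rate
    LIVE_FRACTION = 0.12       # assumed fraction of each day spent recording at peak
    RUN_DAYS_PER_YEAR = 200    # assumed running days per year

    seconds_per_day = 24 * 3600
    bytes_per_day = RATE_MB_PER_S * 1e6 * seconds_per_day * LIVE_FRACTION
    bytes_per_year = bytes_per_day * RUN_DAYS_PER_YEAR

    print(f"per day : {bytes_per_day / 1e12:.1f} TB")    # ~1 TB per day
    print(f"per year: {bytes_per_year / 1e15:.2f} PB")   # ~0.2 PB; petabytes accumulate over a few running years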

Processing the data is required first in real time during the experiments and then as postprocessing afterwards for analysis. For both of these requirements, the computing needs have always outstripped the capabilities of whatever was the current fastest supercomputer. Related theoretical analysis, such as lattice gauge theory, which does not depend on the experimental data, also requires tremendous computing resources, which will barely be satisfied by teraflop computers. In fact, special purpose computers are being built specifically for some of these analyses.

Each year high energy physicists from around the world who are interested in computing come together for their annual meeting [Computing in High Energy Physics (CHEP)]. This year it was held in Tsukuba Science City, about 1 hour outside Tokyo, from 11-15 March 1991. This was truly an international meeting, with 272 participants from countries including Israel, Italy, Japan, Malaysia, Spain, Switzerland, the U.K., the U.S., and the U.S.S.R.

There were 30 plenary talks and 84 presentations in the parallel and poster sessions, and there was also a small exhibition by vendors. Because several other meetings were being held during the same week, I was only able to attend the first day and a half of this conference. The purpose of this summary is to provide my general impressions of the work as far as I was able to assess it. Many thanks to Professor Yoshio Oyanagi (University of Tokyo) and Mr. Sverre Jarp (CERN), who participated in all of CHEP '91, read this report, and made important suggestions. A proceedings volume is not yet available but will be published this summer.


(1) High energy physics computing demands are at least as great as those in some better known fields, such as fluid dynamics, molecular modeling, etc.

(2) The scientists working in high energy physics are already using large, interconnected, state-of-the-art hardware for their experiments. Thus the use of complicated computer networks and collections of distributed computers for data processing and analysis does not put them off. Rather, they have been doing distributed and parallel computing for years using "farms" of minicomputers, typically Vax computers. Vax computers are so ingrained into the culture that performance is measured in VUPs (Vax units of performance). Postprocessing of data is also done on whatever is the largest machine available (in Japan these are typically FACOM or Hitachi mainframes). A good deal of the "tracking" computation can be vectorized, but not much else. However, there is a definite movement toward RISC workstations and parallel computers.


In fact, the computing environment surrounding some of these experiments may be more sophisticated (although often homegrown) than in laboratories that are famous for supercomputing. Also, at the software level these laboratories are already dealing with some extremely large source programs, often millions of lines. Few software tools are being used, and most of those are either homegrown or vendor-supplied utilities. Thomas Nash (FNAL, nash@fnal.fnal.gov) emphasized the need for research in software engineering to aid in software management.
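
The "farm" model mentioned in this observation works because individual collision events can be reconstructed independently of one another, so a collection of loosely coupled machines can simply divide the events among themselves. The Python sketch below is a minimal illustration of that idea only; reconstruct_event and the fake event data are hypothetical stand-ins, not code from any experiment.

    # Toy illustration of farm-style processing: collision events are
    # independent, so a pool of workers (historically a farm of
    # minicomputers, here ordinary processes) reconstructs them in parallel.
    # reconstruct_event and the fake event data are hypothetical stand-ins.
    import random
    from multiprocessing import Pool

    def reconstruct_event(event):
        # Stand-in for real track finding and fitting.
        event_id, raw_hits = event
        n_tracks = sum(1 for hit in raw_hits if hit > 0.5)  # pretend "tracking"
        return event_id, n_tracks

    if __name__ == "__main__":
        # Fake raw data: 1,000 events, each a list of 200 "hits".
        events = [(i, [random.random() for _ in range(200)]) for i in range(1000)]
        with Pool(processes=8) as farm:              # the "farm" of workers
            results = farm.map(reconstruct_event, events)
        print(f"reconstructed {len(results)} events")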

(3) The high energy physics (HEP) community has more or less spurned mainframes and related centralized services. HEP code has never been cost-justified on expensive supercomputers or mainframes because the programs (1) are often small and (2) rarely vectorize; hence the interest in cheaper systems (Unix and RISC) that promise the huge quantities of computing that the community needs. At the same time, they realize that supercomputer companies can offer some services that cannot be duplicated elsewhere.

David O. Williams from CERN (davidw@cernvm.cern.ch) discussed the relationships between mainframes and workstations as seen by his constituency. With respect to the question "Is the role of the mainframe terminated?" he made the following conclusions.

• General purpose mainframes as we know them in HEP are at the start of their run-down phase. This phase will take about 5 years in HEP and longer in the general marketplace.

• The services provided by these mainframes are essential and over time will be provided by more specialized systems.

He urged mainframe builders to realign prices towards the workstation server market; emphasize integration; push the mainframe's input/output (I/O) advantage relative to workstations; perform research related to quickly accessing vast quantities of data on a worldwide basis; and emphasize dependability, service, ease of use, and other things that will have a big payoff for scientists. [Robert Grossman from the University of Illinois (grossman@uicbert.eecs.uic.edu) echoed a part of this by pointing out that the performance of database systems will have to be dramatically improved. For example, the "distance" between two physical events in a 10^15-item database can be very great. Thus the first query will always be expensive, but research needs to be done on methods to speed up subsequent queries; a toy sketch of this idea appears after the discussion of Williams' points.]

Williams' advice for workstation builders is to maintain aggressive pricing, emphasize integration, push I/O capacity, and develop good peripherals and multiprocessors.

I believe that most of the audience agreed with these points.
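
One way to picture Grossman's point about expensive first queries and cheap follow-ups is an index built as a side effect of the first full scan, which later queries on the same attribute can reuse. The Python sketch below is only a toy along these lines; the EventStore class, its data, and the selection attribute are invented for illustration and were not presented at the meeting.

    # Toy illustration of Grossman's point: the first query over a huge
    # event store pays for a full scan, but an index built as a side effect
    # makes later queries on the same attribute cheap.
    # EventStore, its data, and the "trigger" attribute are invented here.
    from collections import defaultdict

    class EventStore:
        def __init__(self, events):
            self.events = events              # pretend this lives on tape
            self._index = {}                  # attribute -> value -> row ids

        def query(self, attribute, value):
            if attribute not in self._index:  # first query: expensive full scan
                index = defaultdict(list)
                for row, event in enumerate(self.events):
                    index[event[attribute]].append(row)
                self._index[attribute] = index
            rows = self._index[attribute].get(value, [])
            return [self.events[row] for row in rows]

    store = EventStore([{"trigger": i % 4, "energy": 0.1 * i} for i in range(100000)])
    first = store.query("trigger", 3)    # slow: scans everything, builds the index
    second = store.query("trigger", 3)   # fast: reuses the index
    print(len(first), len(second))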

(4) Specialized computers for simulation, in particular quantum chromodynamics (QCD), have been built or are under development in the United States, Japan, Italy, and perhaps other countries. These QCD machines include one at Columbia University, Italy's APE, Tsukuba University's QCDPAX, IBM Yorktown Heights' GF11, and FNAL's ACP-MAPS. The Japanese PAX project began in the late 1970s and is now in its fifth generation; QCDPAX is running with 480 nodes and a peak speed of about 14 GFLOPS [see D.K. Kahaner, "The PAX computer and QCDPAX: History, status, and evaluation," Scientific Information Bulletin 15(2), 57-65 (1990)]. The Columbia machine has almost as long a history and has similar performance. Table 1 (presented by Iwasaki at the meeting) gives some details of existing parallel computer projects dedicated to lattice gauge theory.

Several new machines are in the pipeline. A teraflop machine for QCD has been proposed to the U.S. Department of Energy by a collaboration of scientists from (mostly) U.S. universities and national laboratories. New machines are under development at FNAL and other places. The Japanese Ministry of Education, Science, and Culture (Monbusho) has just approved funding of the next generation PAX (about $10M from 1992-1996). All of these are estimating performance in the range of several hundred gigaflops within the next few years. The network topology of QCD machines has been getting more sophisticated, too, moving from one-dimensional (1D) (16 CPU), to two-dimensional (2D) (16x16), three-dimensional (3D) (16x16x8), and four-dimensional (4D) (16x16x8x8) meshes. There are still plenty of problems, though, as neither the topology nor the control structure [single instruction/multiple data (SIMD), multiple instruction/multiple data (MIMD), ?] is really settled. In addition, reliability [mean time between failures (MTBF)] as well as pin and cabling issues have to be addressed. Nevertheless, at the leading edge, some of these scientists are already talking about performance beyond one teraflop.
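
To give a feel for the scale of these meshes, the short Python sketch below computes the node count of each generation quoted above and the average per-node speed implied by QCDPAX's figures; the grouping into generations simply restates the sizes given in the text, and the per-node number is a straightforward division rather than a measured value.

    # Node counts for the mesh generations quoted above, plus the average
    # per-node speed implied by QCDPAX's quoted 14 GFLOPS over 480 nodes.
    from math import prod

    generations = {
        "1D (16)":        (16,),
        "2D (16x16)":     (16, 16),
        "3D (16x16x8)":   (16, 16, 8),
        "4D (16x16x8x8)": (16, 16, 8, 8),
    }
    for name, shape in generations.items():
        print(f"{name:16s} -> {prod(shape):6d} nodes")

    print(f"QCDPAX average per node: {14e9 / 480 / 1e6:.0f} MFLOPS")  # about 29 MFLOPS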

Table 1. Existing Parallel Computer Projects Dedicated to Lattice Gauge Theory


For additional details concerning APE100, contact M. Malek (mmalek@onreur-gw.navy.mil), who is writing about high performance computing in ONR's London office.

(5) The community is very international, with visits to each other's laboratories and joint projects being very common. For example, Katsuya Amako (KEK, Japan) pointed out that physicists from almost 17 institutes participate in each of the Tristan experiments (Venus, Topaz, Amy). Frankly, this is one of the most well-mixed international research communities that I have seen. Consequently, there is a great deal of data sharing and savvy about advanced computing and networking. There is not much going on within their world that is not rapidly known by all the active participants. On the other hand, there does not seem to be nearly as much communication between this group and others doing high performance computing. I see several reasons for this, including an intuitive sense by the physicists that they have the best expertise needed to treat their problems (because their computing needs are so special purpose), and an almost exclusive dependence on VMS software until recently, which isolated them from the Unix world. High energy physicists are moving heavily and rapidly from minicomputers to workstations, and a "wind of Unix" was definitely blowing through the conference.

The growth of Unix is already bringing people closer together, and I am very optimistic that all parties can learn from each other. In particular, it seems to me that as computing becomes more distributed, the experiences of the physics community, who have actually been doing this for some time, can be beneficial in more general situations. Similarly, the physicists can learn from computer scientists and algorithm developers, who have broader views. Incidentally, Japanese contributions in this area are bound to increase rapidly; once Unix is the accepted standard, the best hardware will be easily adopted worldwide.

(6) For the future, the participants see data storage, CPU power, and software as three crisis issues. Networking between remote scientists and the experiment, or among scientists, was seen as something that needed to be strengthened but was not portrayed as being at a crisis stage. In Japan, future high energy physics projects are viewed as large international collaborations, and there is a strong feeling that a more unified worldwide HEP computing environment is needed.

(7) Parallel computing is moving more into the mainstream of Japanese science. Two Japanese parallel computers that I reported on within the past year (QCDPAX and AP1000) were used to perform real work presented at this meeting. In addition, some applications of transputers were also shown. I predict that this trend will continue as Japanese-built parallel machines are installed at other "friendly" outside user sites.
