The Electronic Dictionary Research Institute is described and a potential
new project on knowledge archives is discussed.
by David K. Kahaner
Machine translation (MT) is a major activity in Japan. It is considered an investment in the future to develop and enhance Japan's national information capacity. Most Japanese electronics companies are involved in the development of MT systems; some are already in use and many others are almost ready. Many users of MT systems consider them to be extremely valuable in limited fields such as technical manual translation. The Japanese also believe that MT will promote standardization of technical writing and glossary development and increase the use of electronic media for document transmission. There is no question that MT is being more actively pursued in Japan than in any other country.

From the U.S. side, there have been numerous studies on Japanese MT. The most definitive was conducted by the Japanese Technology Evaluation Center (JTEC), with a workshop on MT conducted in March 1991, and a comprehensive report issued shortly after that. See

    Machine Translation in Japan (January 1992)
    Jamie Carbonell, Chair
    Japanese Technology Evaluation Center
    Loyola College in Maryland
    4501 North Charles Street
    Baltimore, MD 21210-2699

The basic conclusions of this study, in addition to the comments above, centered on the fact that fairly conventional approaches are being employed, a great deal of pre-editing is still in use, and more native English speakers are needed during the development stages of the projects.

The purpose of this report is not to focus on MT directly but to describe one important research activity, the Japan Electronic Dictionary Research Institute (EDR). EDR's functions are as follows:

(1) To produce computer software (programs and a database that can be used as a dictionary) and to perform research into systems utilizing such a dictionary.

(2) To license industrial ownership of the products of (1) and to license "know-how," including the copyright for the computer programs.

Thus EDR's work can be viewed as a major underpinning to MT systems.

EDR is a private company that is supported by the Japan Key Technology Center (JKTC), as well as a collection of Japanese companies: Fujitsu, NEC, Hitachi, Sharp, Toshiba, Oki, Mitsubishi, and Matsushita. JKTC is run both by the Ministries of International Trade and Industry (MITI) and Posts and Telecommunications (MPT). Funding arrangements are complex (see, for example, my article "Advanced Telecommunication Research Institute (ATR)," Scientific Information Bulletin 17(2), 19-23 (1992)), but in the case of EDR, its funding by JKTC is through MITI. EDR was set up to run as a 9-year project, ending in 1994. EDR's total budget from JKTC is ¥14B, plus about 30% from the participating companies. There is no comparable size project in the United States.

EDR has a laboratory in central Tokyo (adjacent to the Institute for New Generation Computer Technology (ICOT)) with 50-70 people, including three or four computer scientists. In addition, there are distributed laboratories at each of the industrial firms associated with the project. The total work force has been as high as 300, although it is normally about 100. Workers are often employed from commercial dictionary companies, and the central laboratory has many "company" employees, too. On my visit, I met

    Toshio Yokoi, General Manager
    Japanese Electronic Dictionary Research Institute, Ltd.
    Mita-Kokusai Building Annex
    4-28 Mita 1-chome
    Minato-ku, Tokyo 108, Japan
    Tel: +81-3-3798-5521
    Fax: +81-3-3798-5335
    Shin-ya Amano, Research Manager
    5th Research Laboratory, Research Center
    c/o Toshiba Corporation Research and Development Center
    1, Komukai Toshibacho, Saiwai-ku
    Kawasaki, Kanagawa 210, Japan
    Tel: +81-44-533-0484
    Fax: +81-44-533-0625
    E-mail: email@example.com

The current status of the project is as follows. Four major dictionaries exist: Word, Concept, Co-occurrence, and Bilingual.

    Word dictionary
        General dictionary
            Japanese: 200K words
            English: 200K words
        Specialized information processing terminology dictionary
            Japanese: 100K words
            English: 100K words
        Corpus
            Japanese: 500K sentences
            English: 500K sentences

    Concept dictionary
        Classifications: 400K concepts
        Descriptions: 400K concepts

    Co-occurrence dictionary
        Japanese: 300K words
        English: 300K words

    Bilingual dictionary
        Japanese-English: 300K words
        English-Japanese: 300K words

A schedule for access has also been published.

    01/91   Dictionary interface published (1st ed.)
    12/91   External evaluation group established
            Word, Concept, Co-occurrence dictionaries given to six universities for evaluation
    03/92   Word dictionary complete (1st ed., Japanese and English) for commercial use
            Dictionary interface published
    03/93   Complete for commercial use:
            Word dictionary (2nd ed.)
            Concept dictionary (1st ed.)
            Bilingual dictionaries
            Co-occurrence dictionaries
            Dictionary interface (3rd ed.)

The EDR project has only modest interaction with ICOT, although it is just next door. ICOT's work is seen as fairly theoretical when viewed from EDR's perspective. It is hoped that in time bridges will be built between MT work at Japanese companies through EDR to ICOT. Some early EDR work was done with Prolog, although now most of the programming is in C on standard Unix workstations. (Generally, in Japan, there is much reduced interest in LISP-based artificial intelligence (AI) related work; most successful projects are now being written in C.)

EDR researchers see themselves as providing the key technologies for MT. They freely admit that their data structures are relatively flat, but believe that successful MT over general fields cannot occur without a very large dictionary and associated software tools to access it. They believe that research on small-scale dictionaries can help to improve electronic dictionary technology, but that the minimal requirement of an electronic dictionary is that it should be of large scale. Consequently, language accumulation cannot be done manually, even with the best efforts. A large dictionary requires the development of computer, natural language, and knowledge-processing technology. For example, different language dictionaries interact through so-called head concepts, i.e., word dictionaries are connected through concepts, in a way that EDR believes will make it relatively easy to add new languages. In addition, EDR chose the information processing field as a prototype technical application field. They estimate that 30-50 million words exist in the fields of mechanics, chemistry, biology, medicine, economics, law, etc. EDR thinks of itself as providing a knowledge base, rather than a database, in the sense that each dictionary item in the EDR dictionaries is created and described by linguistic specialists. These data are verified, evaluated, and corrected in connection with text databases with computer tools. (This is not to say that EDR content is perfect; the JTEC panel found opportunities to criticize some individual entries.)

USE OF EDR DICTIONARIES AND A POTENTIAL NEW PROJECT

EDR staff members want to encourage international cooperation in the use and further development of the dictionaries. All the results of the EDR project will be sold at reasonable prices, I was told. The same conditions regarding the use of the EDR electronic dictionaries are to be applied to all users, no matter whether they are domestic (Japanese) or overseas. EDR plans to set its prices much lower than those of machine readable dictionaries currently on sale. Further, for academic users, such as universities and public research institutions, special measures are being planned, including very low prices. Details are being formulated now and should be available some time this spring.

EDR's view of the role of the electronic dictionary as a primary tool for knowledge acquisition leads naturally to an extension in order to study the accumulation of other knowledge. The EDR project formally ends in 1994, although sales or licenses of the dictionaries may provide funding for maintenance and other research. Hence
project planners are investigating what to do next. We were given a careful description of one potential project, proposed by EDR's General Manager Toshio Yokoi, under the general heading of Knowledge Archives. He explained to us that at the moment this is simply an idea, and there are no firm commitments from the Japanese Government. The project may not be implemented, or it might be rearranged in a significant way. Nevertheless, Yokoi's idea is to push forward in the area of very large-scale knowledge bases and to develop “knowledge archives.” He wants to perform research and development of various technologies in the following areas.
• The technology to acquire and collect in an automated way vast amounts of knowledge.
• The technology in which knowledge bases are self-organized so that substantial amounts of knowledge can be systematically stored.
• The technology that supports the creation of new knowledge by using vast amounts of existing knowledge.
• The development of appropriate and applicable knowledge bases that fulfill the need for various knowledge usage.
• The technology that translates and transmits knowledge to promote the interchange and common use of knowledge.
• The development of a basic knowledge base that can be shared by all applications.
A fuller description of Yokoi's proposal is given by him in the paper "Knowledge Archives -- Very Large-Scale Knowledge Bases Forming the Basis of Knowledge Processing Technology," available from him at the address above.
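
As an illustration of the point made earlier, that word dictionaries for different languages are connected through shared concepts rather than directly to each other, the following is a minimal sketch in Python. It is not EDR's data format or software: the class name, the concept identifiers (such as "c_dictionary"), and the sample entries are all hypothetical and chosen only to show why a concept pivot could make adding a language comparatively cheap.

```python
from collections import defaultdict


class ConceptLinkedDictionaries:
    """Toy model: each language's word dictionary points at shared concept ids."""

    def __init__(self):
        # language -> word -> set of concept ids (hypothetical identifiers)
        self.word_to_concepts = defaultdict(lambda: defaultdict(set))
        # language -> concept id -> set of words expressing that concept
        self.concept_to_words = defaultdict(lambda: defaultdict(set))

    def add_entry(self, language, word, concept_id):
        """Register that `word` in `language` expresses `concept_id`."""
        self.word_to_concepts[language][word].add(concept_id)
        self.concept_to_words[language][concept_id].add(word)

    def translate(self, word, source, target):
        """Return target-language words sharing any concept with `word`."""
        results = set()
        for concept_id in self.word_to_concepts[source].get(word, ()):
            results |= self.concept_to_words[target].get(concept_id, set())
        return sorted(results)


# Illustrative entries only; none of these come from the EDR dictionaries.
d = ConceptLinkedDictionaries()
d.add_entry("en", "dictionary", "c_dictionary")
d.add_entry("ja", "辞書", "c_dictionary")
d.add_entry("en", "bank", "c_bank_finance")
d.add_entry("en", "bank", "c_bank_river")
d.add_entry("ja", "銀行", "c_bank_finance")
d.add_entry("ja", "土手", "c_bank_river")

# Adding a third language needs only word-to-concept links,
# not a new bilingual table for every existing language.
d.add_entry("fr", "dictionnaire", "c_dictionary")

japanese_for_bank = d.translate("bank", "en", "ja")
french_for_jisho = d.translate("辞書", "ja", "fr")
```

In this sketch an ambiguous word such as "bank" maps to several concepts, so a lookup returns every candidate; choosing among them would need context, which the sketch does not attempt.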
NEURAL NETWORK RESEARCH AND
DEVELOPMENT IN ASIA
The 1991 International Joint Conference on Neural Networks, held on
18-21 November 1991 at Singapore, is summarized and assessed.
by Clifford Lau
In the past decade, there have been significant increases in research and development (R&D) in the area of neural networks in the United States, Europe, and Asia. In the United States much of the research activity is supported by the Office of Naval Research (ONR) and the Defense Advanced Research Projects Agency (DARPA). The impetus is provided by the work of P. Werbos on the backpropagation algorithm, the work of J. Hopfield on neural modeling, and the work of S. Grossberg on the adaptive resonance theory. In Europe, funding increases have also been seen in the European Community (EC). There the impetus is provided by the work of T. Kohonen in Finland on self-organizing maps and the work of R. Eckmiller in Germany on neural control. In Japan, neural network research is seen as the natural follow-on to the fifth generation computer program. The New Information Processing Technology (NIPT) program, also called the Real World Computing program, is a multimillion dollar, long-term program that is still in the planning stage (see the article by D. Kahaner, "First New Information Processing Technology Workshop '91," Scientific Information Bulletin 17(1), 51-60 (1992)). The impetus in Japan is provided by the work of S. Amari on the mathematics of neural computing and by the work of K. Fukushima on the neocognitron. Together with the increases in R&D activities, there have been many conferences on the subject of neural networks.

The 1991 International Joint Conference on Neural Networks was held on 18-21 November 1991 at the Westin Stamford and Westin Plaza in Singapore. At the conference, which was attended by about 530 people from all over the world, 440 papers were presented. A breakdown of the authors and attendees as well as their countries is given in Table 1. To no one's surprise, many papers were from the United States (slightly over 100) and from Japan (slightly under 100). However, there were a significant number of papers from Australia, China, Korea, Taiwan, and Singapore.

This report summarizes the research and development work in neural networks in these and other Asian countries. The research in the United States is not included in this report because the state of the art in neural network research is probably familiar to those who follow this field in the United States. The research in Europe is also not included here because the papers, even though many were presented, are not representative of the large amount of effort in Europe.

AUSTRALIA

In Australia, neural network research is spread out in many universities and industrial research laboratories such as the University of Western Australia, the University of Melbourne, the University of New South Wales, Royal Melbourne Institute of Technology, Queensland University of Technology, Monash University, and Telecom Australia Research Laboratory. Much of the work is in applying neural network technology to various problems. Table 2 lists the research topics and locations.

As can be seen from Table 2, the research interest in neural network technology in Australia is very broad. Of particular interest to the Navy is the work of Mathew J. Boek at the Royal Melbourne Institute of Technology on the application of neural networks to rotating machine fault diagnosis. A backpropagation network is used to classify the condition of an operating desk fan based on its vibration signature. Data from a set of experiments are used to train the network. The trained network is then used to detect and classify faults commonly occurring in industrial fans, such as impeller unbalance and cracked impeller blades. The results of these experiments show that the network is quite successful at distinguishing between the two types of faults