INTERNATIONAL SYMPOSIUM ON
THREE-DIMENSIONAL IMAGE
TECHNOLOGY AND ARTS

Three-dimensional imaging from computers and film is reviewed. The International Symposium on Three-Dimensional Image Technology and Arts, held from 5-7 February 1992 in Tokyo, plus industrial tours to NHK, Sony, and Matsushita are described.

INTRODUCTION

Human three-dimensional or stereo vision evolved millions of years ago as a critical survival mechanism. Euclid wrote about it in 280 B.C., and research to duplicate our sense of depth vision with artificially produced images has been going on since 1600 (in 1890 the first three-dimensional movies were shown at the Paris International Exposition). Thousands of papers have been written. This report describes current work based upon the International Symposium on Three-Dimensional Image Technology and Arts, held from 5-7 February 1992 (in Tokyo), plus industrial tours to NHK, Sony, and Matsushita.

by David K. Kahaner

The symposium was directed by

Prof. Joji Hamasaki
Institute of Industrial Science (IIS)
University of Tokyo
7-22-1 Roppongi
Minato-ku, Tokyo 106, Japan
Fax: +81-3-3402-5078
Tel/fax: +81-3-3479-2766
Tel: +81-3-3402-6231 x2314

(IIS is at the third part of the Tokyo University campus structure of Hongo, Komaba, and Roppongi.) Nearly 130 scientists participated, with 37 papers presented, 17 from outside of Japan. (The Proceedings are entirely in English.) There is an active three-dimensional (3D) research community in Japan. Last July (1991) the Institute of Television Engineers of Japan ran a small international workshop at which related research in Japan and the European Community (EC) was surveyed. Research activities related to this are also described in my recent Scientific Information Bulletin article on virtual reality ["Virtual reality," 16(4), 43-45 (1991)]. Two upcoming meetings also worth noting are the following:

1. Second International Conference on Artificial Reality and Tele-Existence, 1-3 July 1992, Tokyo

c/o Japan Technology Transfer
Association (JTTAS)
Plaza Mikado Bldg, 6F
2-14-5 Akasaka
Minato-ku, Tokyo 107, Japan
Tel: +81-3-3584-0207
Fax: +81-3-3585-9369
E-mail: icat@atr-sw.atr.co.jp

[blocks in formation]

DISCUSSION OF SYMPOSIUM

[For reasons that I didn't understand, the industrial participants from the United States were small or very small companies. The Japanese were well represented by Toshiba, Sony, NHK, Fujitsu, Hitachi, etc. Does this mean that large U.S. industry is not interested in three-dimensional imaging? Certainly not, as there are products from Bolt Beranek and Newman, Texas Instruments, Tektronix, and others on the market already. So where were they? Perhaps this symposium was not too well advertised outside of Japan. Also, there was a meeting of the Society of Photo-Optical Instrumentation Engineers (SPIE) at about the same time in San Jose, CA. SPIE is a large meeting for which 3D imaging, especially holography, is one part, and it probably kept U.S. participation lower than normal.]

This symposium was supported by various electronics, television, and robotics societies in Japan, as well as the Ministry of Posts and Telecommunications (MPT). At the opening ceremony, MPT representative Saito explained that 3D is considered the most important research project that his ministry is currently undertaking. A funded 5-year project will focus on developing 3D image transmission technology to permit transmission of these images over optical fiber. Applications are seen to be TV telephones and conference systems. MPT has allocated $1.3M in FY92 and plans to allocate at least that much in subsequent years. Some cooperation with overseas research groups is being considered. (See also the electronically distributed report on MITI's plans for an "Image" laboratory, image.lab, 20 Feb 1992.)

I would like to express my thanks to several attendees who patiently explained many aspects of this technology to me, especially M. Pusch (Heinrich-Hertz Institute, Germany, and University of Tokyo), W. Fetter (SIROCO), R. Kroiter and P. Panabaker (Imax), M. Starks (3D TV), H. Tilton (Visonics), L. Lipton (StereoGraphics), P. McLaurin (University of South Carolina), M. Holzbach (Asaca), and J. Hamasaki (University of Tokyo). Of course, any errors are entirely my own.

It is useful to make the distinction between 3D, biplano-stereoscopic, and multiplano-stereoscopic images. (This distinction is due to L. Lipton.) True 3D images not only give the sensation of depth but allow observers to "look around" to their sides and perhaps even their back. Biplano-stereo images are produced from (only) two original images. They also can give a very realistic sensation of depth but have no "look around" capability. An observer moving his/her head while viewing a stereo image causes the image to shift slightly, but no occluded visual information comes into view and the perspective remains the same. Multiplano images are composed of more than two original images and do have some look around capability. An imaging system in which the image aspect changes depending on the observer's viewpoint is called autostereoscopic. In describing systems, the distinction between 3D and stereo images is often ignored. Indeed, there are "3D" and even "4D" workstations on sale, but this almost always means three-dimensional images projected onto a two-dimensional (2D) display device. I will try to use the terms accurately but won't change terms that authors use in describing their work.

In viewing the real world, it is known that the sense of depth is the result of ten or more different factors. For example, overlapping or occlusion, where one object obscures part of another, is a depth clue that does not depend on having two eyes. Another monocular clue is the image of road edges that we expect to be parallel. Similar monocular clues are related to retinal image size, aerial perspective, shading, shadows, texture, etc. An important binocular clue to distance from the observer is the difference in angle between the viewing axes of left and right eyes when both are focusing on a point (convergence). Adjustment of the focal length of the crystalline lens (accommodation) is another clue, although this is mostly monocular. Binocular parallax is the most important binocular clue, relating to the fact that each eye sees a slightly shifted view of the image. Individuals differ greatly in their ability to use these clues either because of physical impairments, training, or some processing difficulties. This is much like color vision: people who lack it entirely discover so at an early age; others whose abilities are below average may go through their entire lives accommodating in other ways.
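The convergence and parallax clues described above follow from simple geometry, which the short sketch below illustrates. It is only an illustration: the 65-mm interocular distance is an assumed typical value (not a figure from the symposium), and the function names are hypothetical.

```python
import math

IPD_M = 0.065  # ASSUMPTION: typical adult interocular distance, in meters


def convergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two viewing axes when both eyes fixate a point
    straight ahead at the given distance."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def binocular_disparity_deg(near_m, far_m, ipd_m=IPD_M):
    """Difference in convergence angle between a near and a far point.
    This angular difference is the binocular parallax between them."""
    return convergence_angle_deg(near_m, ipd_m) - convergence_angle_deg(far_m, ipd_m)


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 10.0):
        print(f"{d:5.1f} m -> convergence {convergence_angle_deg(d):.2f} deg")
    print(f"parallax, 1 m vs 2 m: {binocular_disparity_deg(1.0, 2.0):.2f} deg")
```

The convergence angle falls off quickly with distance, which is consistent with binocular clues being most useful for nearby objects.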

Images can be viewed on electronic displays such as TVs, cathode ray tubes (CRTs), flat panels, etc. or in hard copy form such as a photograph, plot image, and so forth. Viewing images may or may not require the use of special glasses. Anaglyph images require red/green glasses and most people are familiar with these from a large number of motion pictures (in the 1950s and 1960s) that required them, but their use can be traced back to as early as 1858. Anaglyph techniques can be used for viewing either still images on paper or dynamic images such as films. However, the current trend for films, video tape, or computer screen images has been toward polarized glasses instead, or systems without glasses.
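As a rough illustration of the anaglyph idea, the sketch below combines two grayscale views into one color image and then recovers each view through an idealized filter. The assignment of red to the left eye and green to the right is an assumed convention, the filters are assumed perfect, and the function names are hypothetical.

```python
def make_anaglyph(left, right):
    """Combine two grayscale images (lists of rows of 0-255 ints) into one
    RGB image: left view in the red channel, right view in the green.
    ASSUMPTION: red filter over the left eye, green over the right."""
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right images must have the same shape")
    return [[(l, r, 0) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]


def view_through(anaglyph, channel):
    """Simulate an ideal color filter that passes only one channel
    (0 = red, 1 = green), recovering that eye's grayscale view."""
    return [[px[channel] for px in row] for row in anaglyph]


if __name__ == "__main__":
    left = [[10, 20], [30, 40]]
    right = [[11, 21], [31, 41]]
    ana = make_anaglyph(left, right)
    print(view_through(ana, 0) == left, view_through(ana, 1) == right)
```

Real filters and phosphors are not ideal, so some of each view leaks into the other eye.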

Current work seems to be primarily directed toward stereo vision, although the main technique for full 3D imaging is holography, i.e., the reconstruction of the object wavefront. The original principle involves illuminating the object with a laser and simultaneously recording the reflected (or diffused) light from the object and a reference beam from the laser, creating an interference fringe pattern. The recorded pattern can later be illuminated with the same laser to reproduce the image. Work in holographic techniques has recently focused on using conventional light rather than a laser and the creation of holographic stereograms, in which multiple images of an object are recorded by ordinary cameras at different positions and a hologram of each image is recorded sequentially. Holograms can provide a very high resolution and geometrically accurate image that can be viewed without glasses and in principle is indistinguishable from the original object. But initial enthusiasm for holography has waned somewhat as practical problems reproducing color and providing dynamic displays have not yet been effectively solved. Also, many techniques for holographic imaging produce images smaller than observers would like to see. As one researcher commented, holograms provide too much information, i.e., it isn't really necessary to completely reconstruct the wavefront to have an effective image. However, some of the most exciting developments in this area are being carried out at the Massachusetts Institute of Technology (MIT) media laboratory under the direction of Prof. Stephen Benton. Benton's early claim to fame was the invention of the white light transmission (dubbed rainbow) hologram and, more recently, the practical demonstration of holographic video. The laboratory is also working on holograms that are in full color, large size, animated, and can be totally synthesized by computer. Perhaps most importantly, this work has re-energized the field and forced researchers to take a more serious look at ongoing related work. At this symposium papers discussed applications of holography to medicine, cameras for holographic imaging of moving objects, and holographic TV using liquid crystal displays (LCDs). All the speakers pointed to Benton's work as years ahead, although he did not attend the symposium. (The work is known from his publications as well as his recent visit to Japan.)
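The recording principle described above (object beam plus reference beam producing interference fringes) can be sketched numerically. This is a textbook-style illustration under stated assumptions, not a description of any system shown at the symposium: the beams are idealized plane waves, the 633-nm wavelength and 30° crossing angle are example values, and the function names are hypothetical.

```python
import cmath
import math


def recorded_intensity(object_amp, reference_amp):
    """Intensity stored by the recording medium, |O + R|^2, for complex
    object and reference amplitudes at one point."""
    return abs(object_amp + reference_amp) ** 2


def fringe_spacing_m(wavelength_m, angle_between_beams_deg):
    """Spacing of the fringes formed where two plane waves cross at the
    given full angle: wavelength / (2 * sin(angle / 2))."""
    half_angle = math.radians(angle_between_beams_deg) / 2.0
    return wavelength_m / (2.0 * math.sin(half_angle))


if __name__ == "__main__":
    # Intensity swings between bright and dark as the relative phase changes.
    for phase_deg in (0, 90, 180):
        obj = cmath.exp(1j * math.radians(phase_deg))
        print(phase_deg, round(recorded_intensity(obj, 1.0), 6))
    # ASSUMPTION: 633 nm laser, beams crossing at 30 degrees.
    print(fringe_spacing_m(633e-9, 30.0))
```

Fringe spacings on the order of a micrometer suggest why holographic recording and display demand far higher resolution than conventional imaging.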

Nevertheless, we are still years away from practical holography in our homes such as holographic TV. For example, any practical holographic display device relying on Benton's approach will require time-bandwidth products far exceeding those available with single channel acousto-optic modulators and will require other techniques such as multichannel modulators, parallelism, etc.

There are some other 3D techniques. An interesting one (S. Yamada, Shibaura Institute of Technology) involves projecting a sequence of cross-sectional "slices" on a screen whose diameter is changing. If the moving speed of the screen and the scanning speed of the images are fast enough and synchronized properly, 3D images can be recognized by an afterimage effect of human eyes. Yamada's main application is to viewing of medical images.

The "Pulfrich" effect is another technique that can be used for stereo vision on a normal television with glasses. It uses the brightness-dependent delay time of the human eye-brain system to create the impression of a pseudo-stereo picture. This was demonstrated in Germany a year ago to broadcast pseudo-stereo via normal television equipment. Observers saw (with the help of a dark glass filter on the right eye-glass lens) stereo pictures when the objects moved from the right to the left on the screen. Because objects at different depth levels move at different velocities, a stereo picture was seen. One advantage of this technique is that the observer can choose between watching with glasses (stereo) or without (2D). The major disadvantage is that everything must be in motion. (There were no papers presented at this symposium on the Pulfrich effect.)
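A simple way to see why motion is required is to treat the filter-induced delay as an equivalent disparity: the darkened eye sees a moving object where it was a moment earlier. The sketch below is a first-order model only. The 10-ms delay is an assumed illustrative value (the actual delay depends on filter density and scene brightness), and the function name is hypothetical.

```python
def pulfrich_disparity_deg(angular_speed_deg_s, delay_s):
    """Equivalent angular disparity produced when one eye's signal lags by
    delay_s while an object moves across the screen at the given speed."""
    return angular_speed_deg_s * delay_s


if __name__ == "__main__":
    ASSUMED_DELAY_S = 0.010  # ASSUMPTION: illustrative delay, not a measured value
    for speed in (0.0, 5.0, 10.0, 20.0):
        d = pulfrich_disparity_deg(speed, ASSUMED_DELAY_S)
        print(f"{speed:5.1f} deg/s -> {d:.3f} deg equivalent disparity")
```

A stationary object yields zero disparity in this model, which matches the observation that everything must be in motion.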

A serious issue with either 3D or stereo vision is viewer comfort. Discomfort can range from little or none to physiological disturbances that can cause nausea in some observers. Anaglyph images are often too dark or the color is poor. The process is simple in principle but tricky in practice. For example, when viewing computer screens, the color filters need to be matched to the screen phosphors. Otherwise there is cross-talk and the observer becomes confused by the stereoscopic information.

If right then left eye images are displayed sequentially from a source, and a synchronized shutter system in front of the eyes allows the right eye image to only enter the right eye, etc., then stereo vision can be observed. The shutter can be mounted in glasses that are matched with a display in which two constituent pictures are presented in alternation instead of simultaneously. The glasses occlude one eye and then the other in synchronism with the image presentation. This is often called "field sequential." This method avoids the retinal rivalry caused by anaglyph viewing but can introduce other discomfort such as the increase of flicker (on 60-Hz displays), the introduction of time parallax between the two images, or the possibility of "ghosting" between the images due to phosphor persistence. On computer displays flicker can be solved by increasing from 60 to 120 frame refreshes per second, although this is accomplished by halving the number of pixels that are painted per frame, perhaps leading to lower resolution. (Also see the comment by Starks below concerning "apparent" resolution of 3D images.) Most glasses-based shutter systems use LCDs, which work with polarized light. Currently, glasses using LCDs can provide good switching speed and reasonable extinction of the alternating lenses. The electrooptical polarizing shutters now in use transmit about 30% of the unpolarized input light (rather than 50% for perfect polarizers), and this reduces the image brightness a little, but in practice this does not appear to be a major problem. Some eye-glass shutters are connected by wires to the monitor (tethered); others are controlled by infrared and are wireless. Another system uses a polarizing shutter mounted on the display device and eye-glasses with fixed (circularly) polarized lenses. While this reduces the complexity of the eyeglass system, the large screen-covering shutter is expensive to produce and is fragile. Lipton (StereoGraphics) commented that his company produced such systems for Tektronix but had many difficulties and is now using shutter glasses. For images that are generated by film (as opposed to those generated by computer), it is extremely important that all aspects of the production process, including film registration, color balance during the developing, balanced illumination during projection, etc., be maintained.
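The trade-offs in field-sequential viewing come down to simple arithmetic, sketched below. The refresh rates and the 30% shutter transmission are taken from the text above; the 1024-line display is an assumed example, and the function names are hypothetical.

```python
def per_eye_rate_hz(display_refresh_hz):
    """Each eye sees every other field, so its rate is half the display rate."""
    return display_refresh_hz / 2.0


def lines_per_field(lines_at_base_rate, base_hz, new_hz):
    """With fixed pixel throughput, raising the refresh rate reduces the
    number of lines that can be painted per field in proportion."""
    return lines_at_base_rate * base_hz / new_hz


def transmitted_luminance(source_luminance, shutter_transmission=0.30):
    """Luminance passed by the shutter while it is open (about 30% of
    unpolarized input light, per the text)."""
    return source_luminance * shutter_transmission


if __name__ == "__main__":
    print(per_eye_rate_hz(60))             # 30 Hz per eye: visible flicker
    print(per_eye_rate_hz(120))            # 60 Hz per eye
    print(lines_per_field(1024, 60, 120))  # ASSUMPTION: 1024-line display
    print(transmitted_luminance(100.0))
```

Doubling the refresh rate restores a 60-Hz rate for each eye at the cost of half the lines per field, which is the resolution penalty noted above.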

Flat panel displays are frequently used as part of a stereo imaging system. Among these, liquid crystal displays are popular. The "standard" type LCD is a direct matrix display, which has a structure of light modulating material sandwiched by horizontal and vertical stripe electrodes. The horizontal electrodes are scanned one by one and signal voltages are applied to all the vertical electrodes according to the image on the selected horizontal electrode. Plasma display panels and electroluminescent displays have fast response but LCDs do not, although they have high resolution and good color reproducibility. Increasing the number of scan lines rapidly decreases the average brightness of LCDs. But great strides have been made in this technology, and at least one symposium participant, T. Uchida (Tohoku University), felt that they will be mass produced and suitable for portable PCs and workstations in the near future.
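The brightness penalty for adding scan lines can be seen from the row-at-a-time addressing just described. The sketch below uses a deliberately simplified model in which each row is driven only during its own select interval; real panels behave in a more complicated way. The 60-Hz frame rate and the line counts are assumed examples, and the function names are hypothetical.

```python
def row_duty_cycle(n_scan_lines):
    """Fraction of each frame during which any one row is selected."""
    return 1.0 / n_scan_lines


def row_select_time_s(n_scan_lines, frame_rate_hz=60.0):
    """Time available to drive one row in each frame."""
    return 1.0 / (frame_rate_hz * n_scan_lines)


if __name__ == "__main__":
    for lines in (100, 240, 480, 1000):  # ASSUMPTION: example line counts
        print(f"{lines:5d} lines: duty {row_duty_cycle(lines):.5f}, "
              f"select time {row_select_time_s(lines) * 1e6:.1f} us")
```

An active matrix display avoids this limit by holding each pixel's state between scans.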

An active matrix display has a diode or thin film transistor (TFT) at each pixel. These displays can provide fast response time (less than 10 ms), good display quality, and contrast better than

[blocks in formation]

He also noted that among the most promising ideas was the use of optically addressed, amorphous silicon plus ferroelectric liquid crystal, with 5-10 μm resolution, 70 μs response time, and 60:1 contrast ratio. The main problem here seems to be that the active area is only about 1.4 cm, currently.

As mentioned above, 3D vision is partly physical, and partly psychological, and hence it is difficult to accurately quantify and resolve problems associated with it. For example, M. Starks points out that "ordinary consumer NTSC TVs with well done VHS stereoscopic tapes look equal or superior to any HDTV I've seen ... due to the fact that stereo will usually have a greater information content than mono and the highly sophisticated image processing systems in the brain have been evolved to take advantage of this." Another psychological issue is related to perception of images. We are accustomed to seeing close-up pictures of people framed by the edge of our viewing screen, photograph, etc. R. Kroiter (Imax) wondered if we will be as ready to accept heads cut off at the neck floating in space? (This may be a real concern, but not to everyone. McLaurin commented to me that "we have all adjusted to new presentation formats since their inception in cave dwellings and will continue to adapt. [Kroiter's] work is limited mostly to entertainment and not the distribution or presentation of information. When one leaves the realm of entertainment and enters scientific applications, one finds that it is information that is most important, not the general public's acceptance of an image for entertainment.")

Nevertheless, in the past few years many objective studies of binocular vision and perception have been performed and there is now a substantial collection of data that are being used in the planning of real systems. For example, work to determine the angular field over which 3D vision was effective has led to the design of large screen theaters by the Canadian company Imax. Also, the Russian author S. Amelianova wrote a paper on a very careful experiment to develop methods to evaluate observer's 3D vision thresholds in 3D perception and parallax. Amelianova's thesis was a 1991 Oscar winner for technical achievements. (Starks notes that Russian/Soviet work in 3D is substantial.) Other studies are even more focused on perception. For example, a strange but common visual sensation, named "Ganzfeld," concerns what happens when our total field of view is obscured by a single light color, such as in a fog or a white-out. This has been a basic topic for psychologists since the 1930s, and an experiment was reported at the symposium by M. Hara (Asahi University). The problem is far from settled, and much more research is needed to understand how we process binocular images.

I was surprised to discover the quality of work going on at the Industrial Products Research Institute of the Ministry of International Trade and Industry (MITI), located in the city of Tsukuba and presented by T. Takeda. This project is designed to measure basic ocular functions, eye movement, accommodation, and pupil diameter during actual vision of 3D displays. The fundamental motivation here is to determine if there are any hazards to users, such as the concerns once voiced that the electromagnetic radiation from video display terminals (VDTs) would be harmful. The group has developed a 3D optometer that is capable of measuring these quantities yet allowing the eye to move freely with a 40° horizontal and 30° vertical range. One important result of their experiments is that the greater the depth stimulation presented by the stereo pictures, the more the observer's focus point shifts. Observers then perceive blurred images and struggle to reduce the blur and adjust to the depth perception. The adjustment process produces more visual fatigue than when viewing 2D images. Creators of stereo images strive to evoke as much depth sensation as possible to heighten their viewer appeal and may unwittingly add to viewer fatigue.

There is no doubt that entertainment has been a prime mover in this subject. Imax, mentioned above, builds large stereo motion picture systems that are used in theaters around the world. NHK, the Japanese governmental TV system, is hoping to develop stereo TV, and presumably its main customers will be ordinary viewers, not scientists. NHK scientists are already talking about broadcasting of stereo images in the 21st century. All these applications involve enhancing the sense of "presence" or "reality" and are often accompanied by enhancements in audio presence as well. Similarly, high definition TV (HDTV) is another related technology, and it is obvious that stereo TV is being viewed as the next step beyond HDTV. NHK estimates that in 1990 there were about 80,000 viewers of HDTV in Japan but that the number would be 10 million by the year 2000. It is also not surprising that research in "virtual reality" (VR) is associated with stereo imaging. At the moment, however, VR systems, especially those using helmet-mounted displays, suffer severe limitations due to the poor resolution of their LCD displays. However, imaging applications are also rife in science. In 1982, C. Smith wrote that "future generations will be astonished that for a few decades in the 20th century we were happy to accept these small flat images as a representation of the real three-dimensional world." It seems obvious that in robotics, photogrammetry, pattern recognition, etc., three-dimensional imaging would be a great help.

P. Panabaker from Imax made an interesting presentation about commercial applications of 3D in theaters. Imax is a Canadian company established about 15 years ago with the assistance of the National Film Board (NFB) of Canada. (Those readers who, like me, are film buffs will know that NFB has been at the forefront of film making since the earliest days of cinema.) Imax builds complete theaters and shows films that it produces itself. Almost 2 years ago, I wrote about Fujitsu's 3D computer generated movie at the Osaka Expo'90 (see the electronic report fujitsu.3d, 25 April 1990). This exhibit was created by Imax, which also did the Suntory exhibit (which I did not see). Panabaker described the care that is required to produce 3D films that are not only very realistic but also comfortable to watch for long periods of time. For computer graphics, Imax generates and records images at 4000x3000 pixel resolution, recorded on separate left and right 70-mm film frames, and has shown such films as early as the mid-1980s. For the Fujitsu exhibit Imax used lesser resolution (2048x1500) but produced 11 minutes of full color, fully shaded rasterized images, with separate film strips for each eye. Each frame required several minutes of computation on VP200 class machines and the project spanned 21 months. The two Imax projectors are huge--each requires more than 15 kW of arc illumination--and project onto a screen 29 meters wide that is specially metallically coated after erection. This kind of technology is still far too expensive to be available in the home, but it is gratifying to see such hi-tech ideas made into successful products.
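To get a feel for the scale of the Fujitsu project, the figures above can be turned into a back-of-the-envelope estimate. The 11-minute running time, the two film strips, and the 2048x1500 resolution come from the text. The frame rate of 24 frames per second and the 3 minutes of computation per frame are assumptions chosen only for illustration ("several minutes" is all that was stated), and the function names are hypothetical.

```python
def total_frames(minutes, frames_per_second, eyes=2):
    """Number of frames to render, counting a separate frame for each eye."""
    return minutes * 60 * frames_per_second * eyes


def pixels_per_frame(width, height):
    """Pixel count of one rendered frame."""
    return width * height


def compute_days(n_frames, minutes_per_frame):
    """Machine time for all frames on a single processor, in days."""
    return n_frames * minutes_per_frame / 60.0 / 24.0


if __name__ == "__main__":
    FPS = 24               # ASSUMPTION: frame rate not given in the text
    MIN_PER_FRAME = 3.0    # ASSUMPTION: stands in for "several minutes"
    frames = total_frames(11, FPS)
    print(frames, pixels_per_frame(2048, 1500), compute_days(frames, MIN_PER_FRAME))
```

Under these assumptions the rendering alone is a matter of months of machine time, so most of the 21-month schedule presumably went to other work.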

L. Lipton (StereoGraphics) described the infrastructure necessary for practical use of electronic stereo displays by working scientists. StereoGraphics markets one of the most successful of these. Symposium participants from the West were well aware of this and spoke highly of it, but Japanese attendees were strangely unaware of it. Lipton made another point that in some fields of science, stereo imaging is part of the training and hence professionals in these disciplines use it as a standard tool, cartography being an obvious example. However, in other areas, students and young researchers are not similarly exposed and these people have to be alerted to the possibilities. I admit to being persuaded by this and feel that there are two points here. First, that research in the general area of stereo or 3D imaging will go forward. Second, that it has already progressed to the stage where it is now practical; computers that can generate one view perspective can just as easily generate a slightly different one. Moderately priced systems exist that can convert this to stereo images on most workstations and even many PC monitors, and these can be viewed comfortably by any one of several polarized glass systems, wired or wireless. I believe that scientists now using computer graphics could put these techniques to excellent use immediately. Also useful, but for a different constituency, there are not only stereo slide and print cameras available but a $3,500 Toshiba camcorder as well. (M. Starks pointed out that the Korean Goldstar Company has recently received a patent for a 3D VCR.) All of the above are barely more expensive to purchase and process
