WO1996025710A1

WO1996025710A1 - Multiple camera system for synchronous image recording from multiple viewpoints

Info

Publication number: WO1996025710A1
Application number: PCT/US1996/002166
Authority: WO
Inventors: Harry Q. Mok; Robert B. Rowe; Paul D. Lewis
Original assignee: Atari Games Corporation
Priority date: 1995-02-14
Filing date: 1996-02-14
Publication date: 1996-08-22

Abstract

A system and method for creating animation sequences including a plurality of video cameras (108, 110, 112, 114, 116), a plurality of video storage devices (118, 120, 122, 124, 126), a video synchronizer (130), a remote control (132), a digitizer (162) and a graphics processing computer (168). While a plurality of video cameras (108, 110, 112, 114, 116) capture an object's image from multiple viewpoints, a video synchronizer (130) generates a plurality of time codes. The remote control (132) directs the plurality of storage devices (118, 120, 122, 124, 126) to synchronously record the multiple viewpoints and time codes. One viewpoint is then processed by the digitizer (162) and the graphics processing computer (168). An animator analyzes the processed images in order to identify a plurality of key frames. After identification of the plurality of key frames, the digitizer (162) and graphics processing computer (168) then process the corresponding key frames associated with the remaining viewpoints.

Description

MULTIPLE CAMERA SYSTEM FOR SYNCHRONOUS IMAGE RECORDING FROM MULTIPLE VIEWPOINTS

Field of the Invention The present invention generally relates to a method and apparatus for developing animation sequences, and more particularly, to a multi-camera system that simultaneously captures information about or images of an object from multiple viewpoints. The invention further relates to a method for digitizing and processing such images to produce an animation sequence that corresponds to each viewpoint.

Background of the Invention

Digital video animation is a technology that is often used for developing animated video games and simulators. It's use includes a technique wherein a video camera is employed to film an actor performing a live action motion. The video camera captures the live action motion by recording a series of video frames on a video tape. The video frames are then digitized and processed to create an animation sequence. An animation sequence is therefore a series of frames that show a video game character performing the recorded motion.

The animation sequence is stored in the computer memory of a video game. The video game retrieves each frame of the animation sequence from its memory and displays the frame on a video monitor. When the video game displays each frame of the animation sequence rapidly, a displayed character appears to perform the live action motion in a lifelike manner.

Digital video animation techniques typically require a video camera to capture video images, a digitizer to digitize the video images, a graphics processing computer to process the digitized video images and an animator that uses the processed, digitized video images to create an animation sequence. Thus the creation of an animation sequence with digital video techniques is a costly and time consuming process.

For example, conventional video cameras capture thirty video frames each second. An image capture computer then must digitize each video frame into a two-dimensional array of picture elements, or pixels. Since each video frame contains over 300,000 pixels, large amounts of memory are required to store an entire animation sequence. In order to reduce costs, video game manufactures have attempted to reduce the amount of memory required to store and display an animation sequence. One common approach is to use "key frame" techniques to reduce the number of video frames necessary to simulate lifelike motion. In other words, every video frame isn't necessary to create the illusion of lifelike movement.

For example, a one-second video segment might show an actor throwing a punch. In a typical punch, the actor clinches the fingers of his hand into a fist, recoils his hand and then quickly extends his hand and arm forward. If it takes one second to throw a punch, the video segment will consist of thirty frames. In many of these frames, however, the actor's movement will only vary slightly from one frame to the next.

Using key frame techniques, an animator reduces the number of frames by selecting only those video frames that are necessary to show lifelike motion. The selected video frames are called key frames. Often an animator can reduce a one second video segment from thirty frames to less than ten key frames.

For example, the animator searches the frames at the beginning of the video segment to select a first key frame that shows the actor's fingers and hand clinched to form a fist. The animator then skips a number of frames to select a second key frame that displays the character's hand and arm partially recoiled. The animator then skips an number of frames to show the actor's hand fully recoiled. The animator then skips a number of frames to show the actor's hand partially extended. This selection process continues until the character's hand and arm are fully extended.

In operation, the video game displays the key frames at a specified speed during the video game. Thus, the key frames provide lifelike movement while reducing the amount of computer memory necessary to store the animation sequence.

The selection of key frames is a time consuming, trial and error process. First an experienced animator must view each frame to select the likely key frames. The key frames are then extracted and viewed. If the movement does not appear lifelike, the animator must select different key frames, reorganize the key frames and view the animation sequence again.

This problem is exacerbated in applications such as live action video games where a character can be viewed from a variety of different angles or viewpoints. For example, in conventional video games, a video game player can move a video game character to the right by pushing a joystick to the right. As the character moves to the right, the video game will display the right side of the character. Likewise, as the video game player moves his joystick to the left, the video game displays the left side of the character as the character reverses direction and moves left.

In more realistic video games, a character can move in many directions. Thus, the video game will display a rear view when a character moves away from the video game player. Likewise, the video game will display the front view when a character is moving towards the video game player. Thus, the video game displays a variety of viewpoints based on the character's position.

Typically, conventional live action video games display as much as eight different viewpoints of a character. The eight different viewpoints may include the front, the rear and different side views of the character. Conceptionally the different viewpoints can be thought of as different points on a compass. North represents the front viewpoint, south represents the rear viewpoint, east and west represent the side viewpoints, while southeast, southwest, northeast and northwest represent the remaining viewpoints. The viewpoints captured at the southeast, southwest, northeast and northwest viewpoints are often called quarter viewpoints.

The animation techniques associated with multiple viewpoints typically consist of using a video camera to film each viewpoint separately. Thus, an animator must capture and process an animation sequence for each viewpoint. If a video game requires eight viewpoints, the animator must film an actor from eight different viewpoints, digitize the video frames of each viewpoint and process the digitized video frames to create an animation sequence that corresponds to each viewpoint.

In other words, to display eight different viewpoints of a character, an animator must film, digitize and process eight different live action movements. One filming of the front view, one filming of the back view, one filming of each side view and one filming of each quarter view. For example, if an animator needs eight different viewpoints of a character throwing a punch, the animator first films a character punching directly at the camera. The character then rotates to the next viewpoint and the animator films the character repeating the same punch. This continues until the animator has filmed the character repeating the same punch in all eight viewpoints.

As explained above, the live action movement filmed at each viewpoint must be digitized and processed. One problem that adds to the time and expense of processing the images recorded at each viewpoint is that a character cannot identically repeat his movements for each viewpoint. For example, the velocity and acceleration of a punch will vary each time the punch is repeated. Thus, when animating the live action movement of each viewpoint, an animator typically has to select different key frames to compensate for the variations in an actor's movements. Another problem that adds to the time and expense of processing the live action movement filmed at each viewpoint is that an actor's body parts in one viewpoint may not align with the actor's body parts in another viewpoint. For instance, if an actor fails to properly align his feet when filming each viewpoint, the feet of the animated video game character in one viewpoint will not align with the character's feet in another viewpoint. Instead of showing the video game character pivoting on his feet as the character rotates into a different position, the feet will suddenly shift when the video game displays the new viewpoint. Sudden shifts of the feet or other body parts cause unwanted distractions that do not appear lifelike.

Consequently, an animator must select key frames that minimize sudden movements when shifting from one viewpoint to another. The animator selects the best key frames by comparing the alignment of body parts in each viewpoint. The animator must compare the alignment of body parts in each viewpoint and revise the key frames selected for each viewpoint to minimize misaligned body parts. As a result, the animator may have to replace a key frame that shows lifelike movement, with a key frame that minimizes misaligned body parts. This trial and error process is lengthy and difficult.

Due to increasing realism and complexity in video game technology, many conventional video games display eight different viewpoints of up to fifty different live action motions. The live action motions can include punches, jumps, kicks, back flips, bends, twists, etc. Thus, in order to animate these movements, an animator will need to process each of the fifty different live action movements from eight different viewpoints resulting in over four hundred different video segments. The four hundred video segments are then digitized and processed to create four hundred different animation sequences.

Summary of the Invention The problems outlined above are in large part solved by the method and apparatus of the present invention.

That is, the multiple camera system of the present invention reduces the time and effort required to animate a character performing a desired action while at the same time is speedy, creates more lifelike movement and reduces the amount of computer processing equipment.

According to the present invention, a multiple video camera system is provided that simultaneously captures the action of a character from multiple viewpoints. One feature of the invention is to reduce the costly and time consuming process of animating the video segments associated with each viewpoint. When an animator uses only one camera, or unsyπchronized cameras to capture different viewpoints of an object, the animator must determine the key frames for each viewpoint. On the other hand, when the animator uses synchronized video capture equipment, the key frames for one viewpoint are identical to the key frames of any other viewpoint.

Instead of digitizing and processing eight different video sequences, the animator only digitizes and processes a single video sequence. Once the animator selects the key frames for one viewpoint, the identical key frames are selected from the other viewpoints. In other words, synchronized filming frees the animator from repeating the process of determining the key frames for each viewpoint.

Another feature of this invention is to reduce the amount of computer hardware necessary to process multiple viewpoints. With the present invention, the animator selects the key frames by viewing the video frames from a single viewpoint. Once the animator has selected the key frames, the present invention only digitizes and processes the key frames associated with the other viewpoints. In other words, the present invention does not digitize and process unneeded video frames.

Consequently, the present invention does not need the additional computer hardware and memory required to digitize and store the unneeded video frames captured from the other viewpoints. Since key frames discard nearly 66% of a video sequence, this invention results in approximately a 66% decrease in memory and computer processing time.

A further feature of the invention is to reduce the time and effort required to minimize the variations that result from filming each viewpoint separately. For example, when filming multiple viewpoints with one camera, an actor must rotate and repeat the same action for each viewpoint. The multiple camera system of the present invention, on the other hand, simultaneously captures multiple viewpoints of a single performance. Since the present invention captures a single performance, each viewpoint contains the same movements.

Thus, an animator does not need to engage in the time consuming process of selecting key frames for each viewpoint. Since each viewpoint contains the identical movements, the key frames from one viewpoint match the key frames for every other viewpoint. A further feature of the invention is to reduce the time and effort required to minimize the misalignments that result from filming each viewpoint separately. The present invention properly aligns each camera prior to filming an actor's movements. Thus, the actor's body parts align in each viewpoint. Consequently, the animator does not need to select key frames that minimize sudden movements that result from misalignments of a character's bodily features. Yet another feature of the invention is to reduce the number of viewpoints needed to animate a character.

The present invention reduces the number of viewpoints by reusing certain video sequences. For example, a viewpoint of a character's right side can be altered via computer processing and reused to represent the character's left side. Altering a viewpoint to represent another viewpoint is called "mirror imaging."

While mirror imaging does not apply to front and back viewpoints, mirror imaging can reduce the number of quarter and side viewpoints necessary to capture a particular live action motion. For example, where eight viewpoints are desired, the animator must typically film one viewpoint that captures the character's front, one viewpoint of the character's back, three viewpoints along the character's right side and three viewpoints along the character's left side.

With mirror imaging, instead of filming all eight viewpoints, the cameras are positioned to capture only capture five viewpoints, one viewpoint that captures the character's front, one viewpoint that captures the character's back and three viewpoints to capture two quarter viewpoints and one side viewpoint of the character. The present invention then processes the two quarter viewpoints and the side viewpoint to create mirror images. For example, if an animator films three viewpoints along the right side of a character, the present invention processes the right side images to create mirror images that represent the viewpoints along the left side of the character.

Brief Description of the Drawings These and other aspects, advantages and novel features of the invention will become apparent upon reading the following detailed description of the invention and upon reference to the accompanying drawings in which: Figure 1 is a block diagram of the basic configuration of the multiple camera system; Figure 2 illustrates a side view of an actor and the rear camera in the multiple camera system; Figure 3 is a block diagram of the image processing system of the present invention; and Figure 4, comprising Figures 4.A.-4.E), is a flow chart illustrating the basic operation of the present invention as it processes the key frames of each viewpoint.

Detailed Descriptions of the Preferred Embodiment

Reference will now be made to the drawings wherein like numerals refer to like parts. Figure 1 illustrates the preferred embodiment of the multiple camera chro akey system 100 of the present invention. This preferred embodiment captures live action animation sequences of an actor 102 positioned on a platform 104 and surrounded by wall panels 106. The platform 104 functions as a stage for the actor 102 while the wall panels 106 function as a backdrop. The platform 104 is preferably centered within the wall panels 106.

The color of the platform 104 and the wall panels 106 are chosen to accentuate the actor 102 and to improve the image processing required to produce a final animation sequence. Typically, the platform 104 and the wall panels 106 are the same color. In the preferred embodiment, the platform 104 and the wall panels 106 are chromakey green.

The multiple camera system of the present invention films the actor 102 with five professional quality chromakey cameras identified as front camera 108, front-side camera 110, side camera 112, side-rear camera 114 and rear camera 116. The five cameras 108, 110, 112, 114 and 116 are connected to five video tape recorders 118, 120, 122, 124 and 126. In addition, a video synchronizer 130 and a remote control 132 connect to the five video tape recorders 1 18, 120, 122, 124 and 126.

More particularly, the front camera 108 sends video images to the video tape recorder 1 18. The front-side camera 110 sends video images to the video tape recorder 120. The side camera 112 sends video images to the video tape recorder 122. The side-rear camera 114 sends video images to the video tape recorder 124 and the rear camera 116 sends video images to the video tape recorder 126.

The five cameras 108, 110, 112, 114 and 116 film thirty frames per second which are recorded on the video tape recorders 118, 120, 122, 124 and 126. The five cameras 108, 110, 112, 114 and 116 are arranged in a semi-circle about the actor 102. Each camera 108, 110, 112, 114 and 116 is spaced so as to capture a different viewpoint of a live action sequence. In the preferred embodiment, each camera 108, 110, 112, 114 and 116 is positioned to correspond to the viewpoints displayed by a video game. Thus, for a video game that displays the front back, side and quarter viewpoints, each camera 108, 110, 112, 114 and 116 is positioned 45 degrees apart from the adjacent camera. Hence, the front camera 108 is positioned 45 degrees from the front-side camera 110, the front-side camera 110 is positioned 45 degrees from the side camera 112, the side camera 112 is positioned 45 degrees from the side-rear camera 114 and the side-rear camera 114 is spaced 45 degrees from the rear camera 116. Figure 2 illustrates a side view of the rear camera 116 and the actor 102. The rear camera 116 is positioned a distance R from the center of the platform 104 and at a height H above the platform 104. In the preferred embodiment, the distance R is approximately 180 inches from the center of the platform 104 and height H is approximately 105 inches above the platform 104. The four other cameras 108, 110, 112 and 114 are positioned in a similar manner. These dimensions keep the cameras out of each other's line of sight, while allowing the cameras to film a pleasing view of the actor 102. One of ordinary skill in the art, however, will appreciate that cameras 108, 110, 112, 114 and 116 can be positioned to capture different angles of the live actor 102.

In addition to receiving the video images generated by the five cameras 108, 110, 112, 114 and 116, the five video tape recorders 118, 120, 122, 124 and 126 also record a time code generated by the video synchronizer 130. Thus, each video frame recorded on the video tape recorders 118, 120, 122, 124 and 126, also includes a corresponding time code.

In the preferred embodiment, the video synchronizer 130 uses commonly known techniques to generate and deliver a time code reference signal (not shown) to the video tape recorders 118, 120, 122, 124 and 126. The time code reference signal is typically referred to as the "black burst" synchronizing signal or the "house" synchronizing signal for a video production facility. This time code reference signal identifies the beginning of each video frame by denoting the end of each scan line in a video frame {horizontal sync) and the end of the last scan line (vertical sync) in a video frame.

Thus, the time code reference signal generated by the video synchronizer 130 provides the data necessary to substantially synchronizes the video tape recorders 118, 120, 122, 123 and 126. Once the video synchronizer 130 generates the time code reference signal, the time code reference signal is sent to the video tape recorders 118, 120, 122, 124 and 126 via a cable 136. The time codes identify each video frame recorded by video tape recorders 118, 120, 122, 124 and 126.

Further, the video tape recorders 118, 120, 122, 124 and 126 are controlled by the remote control 132. The remote control 132 simultaneously activates and deactivates the video tape recorders 118, 120, 122, 124 and 126. In the preferred embodiment, the five cameras 108, 110, 112, 114 and 116 continuously generate and send video frames to the video tape recorders 118, 120, 122, 124 and 126. The video tape recorders 118, 120, 122, 124 and 126 do not begin recording the video frames until the remote control 132 generates a "record signal" (not shown) that is sent to the video tape recorders 188, 120, 122, 124 and 126 via a cable 134.

The "record signal" from remote control 132 activates each video tape recorder 118, 120, 122, 124 and 126 to begin recording the video frames captured by the cameras 108, 110, 112, 114 and 116. Thus, the remote control 132, allows an animator or video lab technician to activate each video tape recorder 118, 120, 122, 124 and 126 at approximately the same time.

In addition, the video tape recorders 118, 120, 122, 124 and 126 stop recording the video frames when the remote control 132 generates a "stop signal" (not shown) via the cable 134. The "stop signal" from the remote control 132 directs the video tape recorders 118, 120, 122, 124 and 126 to stop recording video frames captured by cameras 108, 110, 112, 114 and 116. Since the remote control 132 substantially simultaneously activates the video tape recorders 118, 120, 122,

124 and 126 the first video frame recorded by one video tape recorder has the same time code as the first video frame recorded by the other video tape recorders. Furthermore, each subsequent video frame is recorded at substantially the same time by the video tape recorders 118, 120, 122, 124 and 126. For example, each recorded frame is numbered sequentially. Thus the second frame recorded by video tape recorders 118, 120, 122, 124 and 126 will correspond to the second video frame generated by the cameras 108, 110, 112, 114 and 116.

In the preferred embodiment, the cameras 108, 110, 112, 114 and 116 are JVC KY27 Beta Cams that capture broadcast quality video images. The video tape recorders 118, 120, 122, 124 and 126 are Sony model 1800 video tape recorders. The remote control is a SVRM-1000 remote control unit manufactured by Sony. The video synchronizer 130 is a model 9560 manufactured by Grass Valley Inc. One of ordinary skill in the art, however, will appreciate that other commercially available video cameras, video tape recorders, remote controls and video synchronizers can be used.

The following example illustrates the operation of the multiple camera chromakey system. Before filming, all cameras must be focused and pointed towards the same location. In addition, the lighting for each camera angle must be properly adjusted. The animator then directs the actor 102 to assume a desired pose on the platform 104. The actor 102 then rehearses the desired live action motion. For example, the actor 102 may throw a few punches.

After a few rehearsals, the animator directs the actor 102 to prepare for filming.

Once the actor 102 assumes the correct pose, the animator or a video technician uses the remote control 132 to generate a "record signal." Typically, the animator or video technician presses the record button on the remote control 132. The "record signal" in turn activates the video tape recorders 118, 120, 122, 124 and 126. At substantially the same point in time, the video tape recorders 118, 120, 122, 124 and 126 begin recording the video frames generated by the cameras 108, 110, 112, 114 and 116. In addition, the video tape recorders 118, 120, 122, 124 and 126 record the time codes generated by the video synchronizer 130.

The actor 102 then performs the desired live action motion. Upon completion of the live action motion the animator deactivates the video tape recorders 118, 120, 122, 124 and 126 by pressing the stop button on the remote control 132.

While the actor 102 performs, the five cameras 108, 110, 112, 114 and 116 film the live action motion from five different viewpoints. The video tape recorders 118, 120, 122, 124 and 126 synchronously record th video images filmed by the cameras 108, 110, 112, 114 and 116 and the time codes generated by the vide synchronizer 130. Referring to Figure 3, the video frames and time codes recorded by the video tape recorders 118, 120, 122, 124 and 126 are separately stored on five video tapes 152, 154, 156, 158 and 160. The arrangement of the platform 104, the wall panels 106, and cameras 108, 110, 112, 114 and 116, is dependent upon a number of factors such as the number of viewpoints desired, the type of live action motion and other artistic considerations. For example, a three camera system could capture the front, back and side of a character. In addition to live action motion, the system could also be used for stop motion animation and other types of motion capture. Figure 3 also illustrates the image processing system 150 of the present invention. The image processing system 150 includes a video tape player 166, a digitizer 162, a hard drive 164, a graphics processing computer 168 and an animation computer 170.

The viewpoints captured by the five video tape recorders 118, 120, 122, 124 and 126 are recorded on the video cassettes 152, 154, 156, 158 and 160. More particularly, the front viewpoint is stored on the front video tape 152. The front-side viewpoint is stored on the front-side video tape 154. The side viewpoint is stored on the side video tape 156. The side-rear viewpoint is stored on the side-rear video tape 158 and the rear viewpoint is stored on the rear video tape 160.

In the present invention, only one viewpoint is digitized and processed to determine the ideal key frames. Typically, the animator selects the viewpoint that a video game player will most often observe while playing a video game. In the preferred embodiment, the animator digitizes and processes the side viewpoint stored on the side video tape 156 since a video game player most often views the side of a character while playing a video game.

Since the cameras in the preferred embodiment are Sony Beta Cams, the video images are captured on Beta Cam video tapes in a Beta Cam format. The information stored in the Beta Cam format is then transferred to VHS format and stored on a VHS video tape. The transferring of the captured video images from the Beta Cam format to the VHS format is called a window dub. The window dub also transfers the time codes to a VHS format.

The animator loads side video tape 156 into video tape player 166. Using the time codes, the digitizer 162, locates the desired video segment, digitizes each video frame of the desired video segment and stores the digitized video frames on the hard drive 164.

In the preferred embodiment, the animator inputs the beginning time code and ending time code for a desired video segment into the digitizer 162. The digitizer 162 then locates the beginning time code and digitizes each video frame into a two-dimensional grid of picture elements or pixels where each pixel represents the color displayed at a particular point on the video screen. The digitizer 162 digitizes the video frame into a standard format defined by the NTSC (National Television System Committee). The NTSC standard for a video frame is 640 pixels by 480 pixels which totals over 300,000 pixels. The pixel array for each frame is stored in a file that represents the video frame. The digitizer 162 continues to digitize each frame until the digitizer 162 reaches the last frame as identified by the ending time code. The files on the hard drive 164 are networked to the graphics processing computer 168 and the animation computer 170. Accordingly, the graphics processing computer 168 and the animation computer 170 can access the files on the hard drive 164.

In this preferred embodiment, the digitizer 162 is an Accom WSD digital disk recorder. The hard drive 164 is a remote hard drive unit manufactured by Western Scientific. The hard drive unit contains 60 gigabytes of storage space. The graphics processing computer 168 is a Silicon Graphics Indy workstation using a Unix operating system. The animation computer 170 is an Apple Macintosh. A person skilled in the art, however, can appreciate that the basic hardware of the digitizer 162, the hard drive 164, the graphics processing computer 168 and the animation computer 170 can be implemented using any number of different computer configurations and models without departing from the scope of the present invention. For example, an animator could use other image capture computers such as an Abekas Disscus instead of an Accom WSD. In addition, an animator could use other graphics workstations instead of a Silicon Graphics Indy workstation. Also, the animator could use an IBM PC compatible instead of an Apple Macintosh.

The animator uses the graphics processing computer 168, to perform well known video processing techniques. First, the animator processes the first digitized video frame to determine optimum process parameters 172. The optimum process parameters 172 include the contrast, scale factor, darkness and background color. These parameters are stored and used to process the key frames associated with each viewpoint.

Using well known techniques, the graphics processing computer 168 reduces the number of available colors, adjusts the contrast, scales the image, adjusts the darkness and removes the background color in each digitized frame. In particular, the graphics processing computer 168 reduces the number of available colors from 17 million to sixty-four, increases the contrast between the actor 102 and the background before removing the background color in each digitized frame. In the preferred embodiment, the graphics processing computer 168 runs a commercially available program called Debabelizer" developed by Equilibrium Technologies of Sausalito, California which is one of a number of suitable graphics processing programs available on the market. In order to reduce the large memory requirements needed to store each digitized image, the graphics processing computer 168 also uses well known techniques to scale each digitized video frame to about half the size of the original image. Accordingly, the scaled-down video frames reduce the amount of memory required to store each video frame by about fifty percent.

The graphics processing computer 168 provides files representative of each processed video frame to the animation computer 170. As described in more detail below, the animator uses the animation computer 170 to select the ideal key frames. Once the animator has selected the key frames, the animation computer 170 deletes the unneeded digitized frames from the hard drive 164.

As is explained in more detail below, the animator uses the key frames from the side video tape 156 (the side viewpoint) to automatically select the key frames for the video tapes 152, 154, 158 and 160 (the other viewpoints). Using the time code that identifies each key frame, the graphics processing computer 168 directs the digitizer 162 to only digitize and store the key frames associated with the other viewpoints. In other words, instead of digitizing and storing every frame from the other viewpoints, the graphics processing computer 168 directs the digitizer 162 to only digitize and store the desired key frames of other viewpoints.

Figure 4, comprising Figures 4{a)-4(e), illustrate a flow chart of the basic operation of the multiple camera chromakey system 100 used to create key frame animation sequences for each viewpoint. From a start state 200, the animator operating the system proceeds to state 202 where the animator directs the actor 102 to assume the proper position for a live action motion. For example, the animator directs the actor 102 to assume the proper standing pose prior to throwing a punch.

Once the actor 102 has assumed the proper pose, the animator or video lab technician proceeds to state 204 where the animator or video lab technician activates the remote control 132. The remote control 132 generates the "record signal" that activates the five video tape recorders 118, 120, 122, 124 and 126. Control then passes from state 204 to state 206 wherein each video tape recorder 118, 120, 122, 124 and 126 captures a different viewpoint of the live action motion performed by the actor 102. In state 206, the video tape recorders 118, 120, 122, 124, and 126 also record the time codes generated by the video synchronizer 130. The video tape recorders 118, 120, 122, 124 and 126 combine the time codes with the video frame generated by the five cameras 108, 110, 112, 1 14 and 116.

While in state 206 the cameras 108, 110, 112, 114 and 116 capture the entire live action motion performed by the actor 102. The five cameras 108, 110, 112, 114 and 116 store the captured video on separate video tapes 152, 154, 156, 158 and 160. Thus each video tape 152, 154, 156, 158 and 160 contains the time codes generated by the video synchronizer 130 and the video frames captured by its corresponding camera. More particularly, front camera 108 captures the front viewpoint and records the captured video frames on the front video tape 152. The front-side camera 110 captures the front-side viewpoint and records the captured video frames on the front-side video tape 154. The side camera 112 captures the side viewpoint and records the captured video frames on the side video tape 156. The side-rear camera 114 captures the side-rear viewpoint and records the captured video frames on the side-rear video cassette 158 and the rear camera 116 captures the rear viewpoint and records the captured video frames on the rear video tape 160.

When the actor 102 completes the live action motion, the animator deactivates the video tape recorders 118, 120, 122, 124 and 126 in state 210. In state 210, the animator or video lab technician directs the remote control 132 to generate a stop signal that simultaneously deactivates video tape recorders 118, 120, 122, 124 and 126. Once the actor 102 has completed the live action motion in state 210, the animator can then decide in decision state 212 whether additional shots of the live action motion are needed. For example, if the actor 102 failed to properly perform the live action motion, the animator can decide to film another shot in decision state 212. If the animator decides to capture an additional shot in decision state 212, the animator repeats state 202, state 204, state 206, state 208, state 210 and state 212. Once the animator is satisfied that a sufficient number of shots of the live action motion have been taken, the animator decides whether to record a different live action motion in decision state 214. For example, if the animator has captured the actor 102 performing a punch, the animator now may desire to capture the actor 102 performing a different live action motion such as a kick, back flip, etc.

If the animator decides to capture additional live action motions in decision state 214, the animator repeats state 202, state 204, state 206, state 208, state 210 and state 212. Once the animator is satisfied that a sufficient number of live action motions have been captured, the animator removes the video tapes 152, 154, 156, 158 and 160 from the video tape recorders 118, 120, 122, 124 and 126.

Referring to Figure 4(B), the animator chooses the optimum viewpoint for selecting key frames in state 216.

Typically, the optimum viewpoint is the side viewpoint. In the preferred embodiment, the side viewpoint is displayed most often; therefore, for purposes of example, in state 216 the animator selects the side viewpoint captured by the side camera 112 and recorded on the side video tape 156. The animator then loads the side video tape 156 into the video tape player 166.

In state 218 the animator views each shot of a particular live action motion to determine the best shot.

For instance, if the animator asked the actor 102 to repeat a particular live action motion, the side video tape 156 will contain multiple shots of the same live action motion. The animator looks at each shot recorded on the side video tape 156 to see which one looks the best. The animator determines the best shot based on a number of factors including the proper positioning, the fluidity of movement, visual impact, etc.

The best shot is identified with a beginning time code and an ending time code. The animator uses the best shot time code to direct the digitizer 162 to digitize each frame of the best shot in state 220.

In state 220 the animator inputs the best shot time code into the digitizer 162. The digitizer 162 locates the first f ame of the best shot on the side video tape 156 and digitizes each video frame of the live action motion captured in the best shot. The digitizer 162 stores the digitized video frames on the hard drive 164.

After the digitizer 162 digitizes and stores each video frame on the hard drive 164, the animator proceeds to state 222, where the animator directs the graphics processing computer 168 to determine the optimum process parameters 172. In state 222, the animator selects the optimum process parameters 172 that optimize contrast, enhance color, set brightness and correct for other discrepancies. The optimum process parameters 172 are stored in the graphics processing computer 168 for later processing.

Control passes from state 222 to state 224 wherein the graphics processing computer 168 retrieves a digitized video frame from the hard drive 164. In state 226, the graphics processing computer 168 removes the background that surrounds the actor 102, scales the frame to about fifty percent of its original size and reduces the number of colors.

The removal of the background, the reduction in size and the reduction in the number of colors is accomplished using well known graphic processing techniques. In the preferred embodiment, the graphics processing computer 168 runs a commercially available program called Debabelizer" developed by Equilibrium Technologies of Sausalito, California which is one of a number of suitable graphics processing programs available on the market. More particularly, the graphics processing computer 168 reduces the number of available colors from 17 million to sixty-four, increases the contrast between the actor 102 and the background and removes the background color in each digitized frame. Furthermore, the graphics processing computer 168 also scales each digitized vide frame to about half the size of the original image. Accordingly, the scaled-down video frames reduce the amoun of memory required to store each video frame by about fifty percent.

In state 228, the graphics processing computer 168 transfers the scaled frame to the animation compute 170. The graphics processing computer 168 then decides in decision state 230 whether it has processed each fram of the best shot in the above described fashion. If the graphics processing computer 168 has not processed all o the frames, the graphics processing computer 168 returns to state 224 where the next frame is retrieved.

Referring to Figure 4(C), once the graphics processing computer 168 has processed all of the frames, th animator begins the process of selecting "key frames" in state 232. In state 232, the animator uses the animatio computer 170 to retrieve and display a scaled frame. The animator then uses well known techniques to determin the proper key frames.

As explained above, key frame techniques reduce the number of frames necessary to simulate lifelike motion In state 234 the animator views one frame at a time and selects only those frames that are necessary to sho lifelike motion. In the preferred embodiment the animator can use a number of publicly available animation program such as Fractal Design Painter 3.0 by Fractal Design Corporation, or other similar animation programs.

By using key frame techniques, an animator can often reduce a second of full-motion video from thirt frames to less than ten key frames. For example, a one-second, full-motion video segment might show the acto

102 performing a back flip. If it takes one second to perform a back flip, the video segment will consist of thirt video frames. Since many of the video frames only show slight changes from one video frame to the next vide frame, an experienced animator can remove unnecessary video frames when he selects the key frames.

When selecting the key frames for a back flip, for instance, the animator begins by searching the frame at the beginning of the video segment to select a first key frame that shows the actor 102 in a crouched position The animator then skips a number of frames to select a second key frame that displays the actor 102 springing u while arching his back. The animator then skips a number of frames to show the actor 102 with his feet off th ground with his body and head leaning backwards. This selection process continues until the actor 102 has lande back on his feet.

The key frames are further processed to create an animation sequence that displays only the key frames. By only displaying the key frames, the animation sequence reduces the amount of video frames and computer memor needed to show lifelike movement. In order to select the optimum key frames, the animator must view each frame. If the animator decide in decision state 234 that the frame displayed in state 232 is not a key frame, the animator returns to state 23 and retrieves the next frame. If, on the other hand the animator decides in decision state 234 to select the fram as a key frame, the animator proceeds to state 236.

In the present invention, each key frame is identified by its corresponding frame number. As explaine above, each frame is numbered sequentially. Thus, the first frame is frame number one, the second frame is fram number two and so on. When the animator selects a key frame, the animator simply records the frame number In state 236, the animator records the frame number for each key frame.

In state 238, the animation computer 170 then decides whether it has retrieved each frame of the live action motion. If the animation computer 170 has not retrieved all of the frames, the animation computer 170 returns to state 232 where the next frame is retrieved. If the animation computer 170 has retrieved each frame, the animator proceeds to state 240 where the animator uses the animation computer 170 to view the entire key frame sequence in state 240. The animator can view each key frame individually, multiple key frames, or the entire key frame sequence. When viewing the entire key frame sequence, the animator will often determine that he needs to add or delete certain key frames to optimize the live action motion. If the animator decides to add or delete a key frame in decision state 242, the animation computer 170 returns to state 232 and repeats the key frame selection process in states 234, 236 and 238. If, on the other hand, the animator decides not to revise the key frame sequence in decision state 242, the animation computer 170 proceeds to state 244.

In state 244, the animation computer 170 only retains the selected key frames. In other words, the animation computer 170 deletes all of the unneeded frames. After deleting all of the unneeded frames, only the key frames remain on the hard drive 164.

Referring to Figure 4(D), after the animation computer 170 has deleted the unnecessary frames, the animator proceeds to state 246 where the animator enters the key frame sequence into the graphics processing computer 168. The animator enters the starting time code of the live action motion, each key frame number and the name of the storage file. The starting time code identifies the location of the first frame of the live action motion. The key frame numbers identify which frames are key frames. The storage file name is used to store the key frames on the hard drive 164 in the proper file location.

Once the beginning time code, key frame numbers and file name are input into the graphics processing computer 168, the animator puts the next video tape into the video tape player. Proceeding to state 248, the graphics processing computer 168 directs the digitizer 162 to retrieve and digitize a key frame associated with the next viewpoint. In the present invention, the animator can use the key frames he selected on the side video tape

156 (the side viewpoint) to identify the key frames on the video tapes 152, 154, 158 and 160.

Since the key frames of one viewpoint match the key frames of the other viewpoints, the graphics processing computer 168 automates the rest of the key frame selection process. As explained above, the first frame of each live action motion is identified with a time code. Every subsequent video frame is then numbered sequentially. For example, to find frame number three, the digitizer 162 locates the time code of the first frame and counts each subsequent frame until it locates frame number three.

Thus in state 248, the animation computer 170 uses a time code and a frame number to retrieve a corresponding key frame stored on the other video tapes 152, 154, 158 and 160. The digitizer 162 identifies the proper key frame, digitizes the key frame and stores the key frame on the hard drive 164.

In state 250 the graphics processing computer 168 accesses the digitized key frames stored on the hard -14- drive 164 and proceeds to state 250. Based on well known image processing techniques, the graphics processing computer 168 uses the optimum process parameters 172 to reduce the colors of each key frame from 17 million to sixty-four, enhance the contrast, adjust the darkness, remove the background and scale each key frame to approximately fifty percent of its original size. In decision state 252, the animation computer 170 decides whether the digitizer 162 has retrieved and digitized each key frame. If the digitizer 162 has not retrieved all of the key frames, the animation computer 170 returns to state 248 where the next key frame is retrieved. If the animation computer 170 decides that the digitizer 162 has retrieved and digitized each key frame, the animation computer 170 proceeds to decision state 254.

In decision state 254, the animation computer 170 decides whether the digitizer 162 has retrieved and digitized the key frames for each viewpoint. If the digitizer 162 has not retrieved and digitized the key frame sequence for each viewpoint, the graphics processing computer 168 proceeds to state 256 where the graphics processing computer 168 directs the animator or video lab technician to load the next video tape. After the animator or video lab technician loads the next video tape, the graphics processing computer 168 returns to state 248 where it repeats the process of digitizing the key frames for the next viewpoint. Referring to Figure 4(E), after the graphics processing computer 168 completes processing each key frame, the animator proceeds to state 258. In state 258, the animator specifies the viewpoint that the graphics processing computer 168 will "mirror image" to create another viewpoint. In the preferred embodiment, the mirror image processing is typically transferred to a graphics programmer. The graphics programmer then directs the graphics processing computer in states 258 to 270. Thus, in the preferred embodiment, once the animator identifies the key frames, the graphic programmer uses the key frames to create mirror images.

Mirror imaging reduces the number of viewpoints needed to animate the actor 102. The present invention reduces the number of viewpoints by reusing certain viewpoints. For example, a viewpoint of a character's right side can be altered via computer processing and reused to represent the character's left side.

While mirror imaging does not apply to front and back viewpoints, mirror imaging can reduce the number of side viewpoints. For example, where eight viewpoints are desired, the animator must typically film one viewpoint that captures the character's front, one viewpoint of the character's back, three viewpoints of the character's right side and three viewpoints of the character's left side.

With mirror imaging, instead of filming all eight viewpoints, the animator films only five viewpoints, one viewpoint that captures the character's front, one viewpoint that captures the character's back and three viewpoints to capture one side of the character. In the preferred embodiment, the front-side viewpoint captured by front-side camera 110, the side viewpoint captured by the side camera 112, and the side-rear viewpoint captured by side-rear camera 114 are mirror imaged to create three other viewpoints.

In state 258, the graphics programmer specifies one of the three viewpoints. For example, the graphics programmer may specify the front-side viewpoint. The graphics processing computer 168 then proceeds to state 260 where it retrieves a first key frame of the specified viewpoint. In this example, the graphics processing computer 168 retrieves a front-side key frame. Proceeding to state 262, the graphics processing computer 168 creates a mirror image of the key frame.

The graphics processing computer 168 uses well known techniques to create a mirror image of the key frame.

Essentially, the graphics processing computer 168 translates every pixel in the key frame about a vertical axis. This is similar to creasing a piece of paper in half so that the resulting a vertical axis runs through the center of the paper. The image on one side of the paper is then "folded" into the other side of the paper.

In the present invention, the graphics processing computer 168 reversely arranges every pixel about a vertical axis that runs through the center of an item being processed. For example, a pixel that on the right edge of the key frame is translated to a pixel at the same height on the left side of the key frame.

In the preferred embodiment, the graphics processing computer 168 runs a commercially available program called Debabelizer" developed by Equilibrium Technologies of Sausalito, California which is one of a number of suitable graphics processing programs available on the market that can "mirror image" a video frame.

After mirror imaging the key frame the graphics programmer directs the graphics processing computer 168 to proceed to state 264 where the graphics processing computer 168 stores the mirror image into a new file. When complete, the new file will contain the mirror images for the entire mirror imaged viewpoint. After storing the mirror image key frame, the graphics programmer proceeds to decision state 266. In decision state 266, the graphics programmer decides whether the graphics processing computer 168 has retrieved every key frame for the specified viewpoint. If not, the graphics programmer directs the graphics processing computer 168 to return to state 260 to retrieve the next key frame. If the graphics processing computer 168 has processed the last key frame, the graphics programmer directs the graphics processing computer 168 to proceed to decision state 268.

In decision state 268, the graphics programmer determines if additional viewpoints need mirror imaging. If the additional viewpoints need mirror imaging, the graphics programmer directs the graphics processing computer 168 to return to state 258 where the graphics processing computer 168 begins the process of mirror imaging the next viewpoint. Once the graphics programmer determines in the preferred embodiment that the graphics processing computer 168 has created a mirror image of the front-side viewpoint, the side viewpoint, and the side-rear viewpoint, the graphics programmer proceeds to end state 270.

Hence, in the preferred embodiment, the multiple camera system frees the animator from selecting key frames for each viewpoint. Further this system eliminates the variations that occur when each viewpoint is filmed separately with unsynchronized cameras. Finally, the present invention allows an animator to reuse certain viewpoints by mirror imaging existing viewpoints into corresponding viewpoints.

While the above detailed description has shown, described and pointed out the fundamental novel features of the invention as applied to a preferred embodiment, it will be understood that various omissions and substitutions and changes in the form and details of the illustrated device may be made by those skilled in the art without departing from the spirit of the invention. Consequently, the scope of the invention should not be limited to the foregoing discussion but should be defined by the appended claims.

Claims

WHAT IS CLAIMED IS:

1. A system for creating an animation sequence comprising: a plurality of video cameras coupled to at least one video storage device; a video synchronizer coupled to said video storage device; and a controller coupled to said video storage device such that in response to a signal from sai controller said video storage device to synchronously stores images captured by said plurality of vide cameras.

2. The system of Claim 1, wherein said video synchronizer provides a time code that indicates the beginnin of said images captured by said plurality of video cameras.

3. The system of Claim 1, wherein said plurality of video cameras are positioned to capture multiple viewpoint of an object.

4. The system of Claim 3, wherein said plurality of video cameras are positioned to capture multiple viewpoint of an object performing a live action motion.

5. The system of Claim 4, wherein said plurality of video cameras are positioned to capture a front viewpoint a back viewpoint, and at least one side viewpoint of an object performing a live action motion.

6. The system of Claim 1 , wherein said video storage device comprises a plurality of video tape recorders.

7. A system for creating an animation sequence comprising: a plurality of video cameras, each of said video cameras positioned to capture a differen viewpoint of an object; a plurality of video storage devices, wherein one of said plurality of video storage devices i coupled to one of said plurality of video cameras; a controller coupled to said plurality of video storage devices such that said controller activate said plurality of video storage devices to substantially synchronously record a plurality of images capture by said plurality of video cameras; and a video synchronizer coupled to said plurality of video storage devices, said video synchronize generating a plurality of identification codes that identify said plurality of images, wherein said plurality o video storage devices record said plurality of identification codes in combination with said plurality o images.

8. The system of Claim 7, wherein said plurality of video cameras are positioned to capture multiple viewpoint of an object performing a live action motion.

9. The system of Claim 8, wherein said plurality of video cameras are positioned to capture a front viewpoint, a back viewpoint, and at least one side viewpoint of an object.

10. The system of Claim 7 further comprising a graphics processing system, said graphics processing system comprising: a first plurality of key frames from a first viewpoint; an image digitizer in communication with said storage medium and at least one of said plurality of video storage devices, wherein said image digitizer utilizes said first plurality of key frames to locate and digitize a second plurality of key frames stored on said video storage device and wherein said image digitizer stores said second plurality of digitized key frames on said storage medium.

11. The system of Claim 10 wherein said image digitizer digitizes a plurality of key frames from a plurality of viewpoints.

12. The system of Claim 10 wherein said graphics processing system further comprises: a graphics processor in communication with said storage medium, wherein said graphics processor scales the horizontal and vertical dimensions of each digitized key frame by reducing the number of horizontal and vertical pixels that represent each digitized key frame.

13. The system of Claim 12 wherein said graphics processor processes said plurality of digitized key frames to create a plurality of mirror image key frames and stores said plurality of mirror image key frames on said storage medium.

14. A system for creating an animation sequence comprising: a storage medium; a first plurality of key frames stored on said storage medium; at least one video storage device that stores a plurality of substantially synchronously recorded images, wherein said plurality of substantially synchronously recorded images define an object from a plurality of viewpoints; an image digitizer in communication with said storage medium and said video storage device, wherein said image digitizer utilizes said first plurality of key frames to locate and digitize a second plurality of key frames stored on said video storage device.

15. The system of Claim 14 wherein said image digitizer accesses said video storage device to retrieve and digitize a plurality of key frames from a plurality of viewpoints.

16. The system of Claim 15 wherein said graphics processor processes said plurality of digitized key frames to create a plurality of mirror image key frames and stores said plurality of mirror image key frames on said storage medium.

17. A method of creating an animation sequence comprising the steps of: positioning an object in a desired pose; providing a plurality of electronic capture devices that are positioned to electronically capture images from multiple viewpoints of said object; generating a plurality of time code signals; generating an activation signal; and synchronously recording said images and said plurality of time code signals with said plurality of electronic capture devices in response to said activation signal.

18. The method as defined in Claim 17, wherein said plurality of electronic capture devices comprise a plurality of video cameras and a plurality of video tape recorders.

19. The method as defined in Claim 17, wherein said multiple viewpoints include a front viewpoint, a rear viewpoint, and at least one side viewpoint.

20. The method as defined in Claim 19, further comprising the step of processing at least one of said multiple viewpoints to identify a first plurality of key frames.

21. A method of creating an animation sequence comprising the steps of: positioning an object in a desired pose; providing a plurality of electronic capture devices that are positioned to electronically capture images from multiple viewpoints of said object; generating a plurality of time code signals; generating an activation signal; synchronously storing said images and said plurality of time code signals with a plurality of image storage devices in response to said activation signal; processing a first viewpoint to identify a first plurality of key frames; and using said first plurality of key frames to select and process a second plurality of key frames from a second viewpoint.

22. The method of Claim 21 further comprising the step of using said first plurality of key frames to select and process the key frames of the remaining viewpoints.

23. The method of Claim 21 further comprising the step of processing said second plurality of key frames to create a plurality of mirror image key frames.