CA2297344A1 - Look direction microphone system with visual aiming aid - Google Patents
Look direction microphone system with visual aiming aid
- Publication number
- CA2297344A1 (application number CA002297344A)
- Authority
- CA
- Canada
- Prior art keywords
- look direction
- microphone system
- aimer
- transducers
- direction microphone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/40—Arrangements for obtaining a desired directivity characteristic
- H04R25/407—Circuits for combining signals of a plurality of transducers
-
- G—PHYSICS
- G02—OPTICS
- G02C—SPECTACLES; SUNGLASSES OR GOGGLES INSOFAR AS THEY HAVE THE SAME FEATURES AS SPECTACLES; CONTACT LENSES
- G02C11/00—Non-optical adjuncts; Attachment thereof
- G02C11/06—Hearing aids
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
Landscapes
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- Ophthalmology & Optometry (AREA)
- Optics & Photonics (AREA)
- Neurosurgery (AREA)
- Studio Devices (AREA)
Abstract
A wearable microphone system with visual aiming aid is described. The system may be used as a hearing aid, or as an audio memory prosthetic for those suffering from a memory impairment, or the like. In one embodiment, the system includes a viewframe for aiming the directional microphone system. In another embodiment, the look direction of the system is controlled by an eyetracker. In other embodiments the system further includes a camera, to provide a combined audiovideo memory recall system. In this form, the invention is useful for creating a personal documentary, or for creating an annotated conversation and capture system.
Description
Patent Application
of
W. Steve G. Mann
for
LOOK DIRECTION MICROPHONE SYSTEM WITH VISUAL
AIMING AID
of which the following is a specification:
FIELD OF THE INVENTION
The present invention pertains generally to an apparatus for use as a hearing aid, audio memory recall system, or the like.
BACKGROUND OF THE INVENTION
In U.S. Pat. No. 6002776, Directional acoustic signal processor and method therefor, December 14, 1999, Neal Ashok Bhadkamkar and John-Thomas Ngo describe a two-stage directional microphone system, in which a first stage attempts to extract sources of signals as if the acoustic environment were anechoic, and a second stage removes any residual crosstalk between the channels, which may be caused by echoes and reverberation.
In "Advanced Beamforming Concepts: Source Localization Using the Bispectrum, Gabor Transform, Wigner-Ville Distribution, and Nonstationary Signal Representations", in Proceedings of the 25th Asilomar Conference on Signals, Systems, and Computers, vol. 2, 818-824, 1991, Jeffery C. Allen provides a good overview of beamforming, the Wigner distribution, and the like.
In "Signal Processing for a Cocktail Party Effect", Journal of the Acoustical Society of America 50(2), pp. 656-660, 1971, O.M. Mracek Mitchell, Carolyn A. Ross, and G.H. Yates provide a good overview of the so-called "Cocktail Party Effect", in which a human listener can focus attention on any one desired sound source, amid a confusing plurality of sound sources, when the listener is actually present at an event such as a cocktail party, or the like. This ability is often lost or greatly reduced when a human listener only hears a recording of the event, or listens to the event through a hearing aid or other similar device.
Numerous other articles describe time-delayed correlations and the like.
A boom mic located directly in front of each speaker's mouth, each transmitting on a unique channel, would facilitate listening by the hearing impaired: the listener could switch from one channel to another to hear different individuals. This arrangement, however, requires that all, or at least some, participants wear microphones. While suitable in a lecture hall, where a hearing-impaired student might request that the professor wear a microphone, it is unrealistic in most normal day-to-day situations.
Microphone-array processing methods have been in existence for many years. Using multiple microphones with spatial separation, a beamforming effect can create a strong "listen direction" by separately delaying the outputs of each microphone relative to one another. These delays are chosen to remove differences in time of flight of sounds arriving from the listen direction (for simplicity, a "far field" model is often assumed, in which sound sources are modeled as producing planar sound waves). Signals arriving from the listen direction are aligned in time and add constructively, producing a strong output, whereas signals arriving from other directions are temporally misaligned, do not add as constructively, and thus produce a weaker output.
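The far-field delay computation described above can be sketched as follows. This is an illustrative sketch only: the microphone coordinates, speed of sound, and function name are assumptions for demonstration, not values or code from this patent.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in room-temperature air (assumed)

def steering_delays(mic_positions, look_direction):
    """Per-microphone delays (seconds) that time-align a plane wave
    arriving from look_direction (a vector pointing toward the source)."""
    d = np.asarray(look_direction, dtype=float)
    d = d / np.linalg.norm(d)
    # A mic farther along the look direction meets the wavefront earlier,
    # so its output must be delayed more to align with the other channels.
    advances = np.asarray(mic_positions, dtype=float) @ d / SPEED_OF_SOUND
    return advances - advances.min()  # shift so the smallest delay is zero
```

For a line of microphones with the look direction along the array axis (endfire), the delays grow with position; for a source broadside to the array, all delays are zero, matching the time-of-flight argument above.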
Beamforming was used in a directional hearing aid invented by Widrow and Brearley.
Numerous other beamforming microphone systems have been proposed, many of which take into account the multiple echoes and reverberations arising from a typical indoor environment, or the like.
In photography (and in movie and video production), it is desirable to capture events in a natural manner with minimal disruption and disturbance. Current state-of-the-art photographic or video apparatus, even in its most simple form, creates a disturbance to others and attracts considerable attention on account of the gesture of bringing the camera up to the eye, especially if a large microphone boom is used by an additional person responsible for sound quality. Even if the size of the camera could be reduced to the point of being negligible (e.g. no bigger than the eyecup of a typical camera viewfinder), the very gesture of bringing a device up to the eye is unnatural and attracts considerable attention, especially in establishments such as gambling casinos or department stores where photography is often prohibited. Moreover, if good quality sound is desired, a large cumbersome directional microphone is needed, perhaps with a sound technician or sound crew, or the participants need to wear microphones. Although there exist a variety of covert video cameras, such as a camera concealed beneath the jewel of a necktie clip, cameras concealed in baseball caps, and cameras concealed in eyeglasses, these cameras tend to produce inferior sound quality, not just because of the technical limitations imposed by their small size, but, more importantly, because they lack a means of accurately aiming both the camera and the microphone systems.
Because of this lack of compositional control and sound quality, investigative video and photojournalism made with such cameras suffers from poor composition.
SUMMARY OF THE INVENTION
In one aspect, this invention can be used to provide a hearing aid, where the user can aim the hearing aid by way of a visual aiming aid comprised of a viewscreen, so that the hearing direction of the hearing aid corresponds to the look direction of the wearer of the hearing aid when the wearer is looking at the desired sound source through the viewscreen.
In another aspect, a hearing aid is provided where the visual aiming aid comprises an eyetracker, so that the hearing direction corresponds to the look direction regardless of whether or not the wearer is looking at the desired sound source through any such viewscreen.
In another embodiment, a wearable camera system with viewfinder provides the visual hearing-direction aiming aid, so that the system can be used as an audiovideographic memory recall system, or the like.
In another aspect of this invention there is provided a wearable eyeglass-based device allowing the wearer to view data, such as from the screen of a wearable computer (WearComp) system, and use the raster of the wearable computer display screen as a hearing-direction aiming aid.
In another aspect of this invention there is provided a wearable eyeglass based device allowing the wearer to view electronically stored pictures captured along with speech recordings from the individual so pictured, such that the wearer can remember faces and names, and remember how to pronounce a person's name, by listening to a flashback of that person pronouncing his or her name.
In another aspect, this invention provides a method of simultaneously aiming a camera and a microphone system listen direction in which both hands may be left free, and in which the direction of the camera and the listen direction of the microphone system are clearly indicated to the wearer of the apparatus by means of a marking that appears as if it were superimposed on the real subject matter within the scene.
In another aspect, this invention provides a user with a visual information display embedded in a clear transparent material, where the user can see the display while simultaneously looking through the clear transparent material, and where the user can use the display of this visual information as a listen direction aiming aid.
One of the intended uses of the invention is as a wearable camera for capturing video of exceptionally high sound quality. The device need not necessarily be covert.
In fact, it may be manufactured as a fashionable device that serves both as a visible crime deterrent and as a self-explanatory (through its overt obviousness) tool for documentary videomakers and photojournalists.
There are several reasons why it might be desired to wear a videographic memory aid over a sustained period of time:
1. There is the notion of a personal visual diary of sorts.
2. There is the idea of being always ready. By constantly recording into a circular buffer, a retroactive record function, such as a button that instructs the device to "begin recording from five minutes ago" may be useful in personal safety (crime reduction) as well as in ordinary everyday usage, such as remembering how to pronounce a person's name by listening several times to that person pronouncing their own name.
3. There is the fact that the wearable videographic memory recall system, after being worn for a long period of time, begins to behave as a true extension of the wearer's mind and body. As a result, the composition of video shot with the device is often impeccable without even the need for conscious thought or effort on the part of the user. Also, one can engage in other activities and record the experience without the need to be encumbered by a camera, or even the need to remain aware, at a conscious level, of the camera's existence.
This lack of the need for conscious thought or effort suggests a new genre of documentary video characterized by long-term psychophysical adaptation to the device. The result is a very natural first-person perspective documentary, whose artistic style is very much as if a recording could be made from a video tap of the optic nerve of the eye itself, together with a tap into the ears of the wearer. Events that may be so recorded include involvement in activities such as a cocktail party, which cannot normally be well recorded unobtrusively from a first-person perspective using cameras of the prior art. Moreover, a very natural personal memory device results, in which the process and use of the device is much more like one's own memory than like a separate recording device.
4. A computational system, either built into the wearable system, or worn elsewhere on the body and connected to the system, may be used to enhance sound. This may be of value to the hearing impaired. The computer may also perform other tasks such as sound recognition, so that it may be of value to those who are completely deaf. A speech recognizer, combined with a listen direction aiming aid, would allow a deaf person to aim the apparatus in the desired listen direction. Moreover, by combining the listen direction aiming aid with an output device, such as a wearable computer screen, the system functions as a true extension of the mind and body. For example, an eyetracker that observes where the wearer is looking at a wearable text screen focuses the microphone listen direction there, under the realistic assumption that the wearer is looking at the person he or she is trying to listen to. Because the device is worn constantly, it may also function as a photographic/videographic memory prosthetic, e.g. to help those with Alzheimer's disease (a degenerative disease in which mild forgetfulness progresses to severe memory loss), or the like, recall and listen to previously captured material.
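The retroactive record function in item 2 above (a button meaning "begin recording from five minutes ago") can be sketched as a fixed-size ring buffer that is continuously filled with incoming audio. The class and parameter names below are hypothetical illustrations, not part of the patent:

```python
from collections import deque

class RetroactiveRecorder:
    """Continuously buffers the most recent history_seconds of audio,
    so a button press can save sound that has already happened."""

    def __init__(self, sample_rate, history_seconds=300):
        self.sample_rate = sample_rate
        # deque with maxlen silently discards the oldest samples.
        self.buffer = deque(maxlen=sample_rate * history_seconds)

    def push(self, samples):
        """Called continuously with each new chunk of audio samples."""
        self.buffer.extend(samples)

    def retro_record(self):
        """'Begin recording from five minutes ago': snapshot the buffered
        past as the start of the saved recording."""
        return list(self.buffer)
```

A real device would then keep appending new audio to the snapshot; the sketch only shows the buffered-past retrieval.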
It is possible with this invention to provide a means for a user to experience additional information overlaid on top of his or her audio perception such that the information is relevant to the imagery being viewed by a camera included in the system.
The invention facilitates a new form of visual art, in which the artist may capture, with relatively little effort, a personal experience as experienced from his or her own perspective. With some practice, it is possible to develop a very steady body posture and mode of movement that best produces memories of the genre pertaining to this invention. Because the apparatus may be made lightweight and situated close to the head, there is not the protrusion associated with carrying a hand-held camera and microphone. Also, because components of the apparatus are mounted very close to the head, in a manner that balances the weight distribution, the apparatus does not restrict the wearer's head movement or encumber the wearer appreciably. Mounting close to the head minimizes the moment of inertia about the rotational axis of the neck, so that the head can be turned quickly while wearing the apparatus. This arrangement allows one to easily record the experiences of ordinary day-to-day activities from a first-person perspective. Moreover, because both hands are free, much better balance and posture is possible while using the apparatus.
Anyone skilled in the arts of body movement control as is learned in the martial arts such as karate, as well as in dance, most notably ballet, will have little difficulty capturing exceptionally high quality memories using the apparatus of this invention.
With known video or movie cameras, the best operators tend to be very large people who have trained for many years in the art of smooth control of the cumbersome video or motion picture film cameras used, together with a sound technician who has also trained in the art of controlling a microphone boom or the like. In addition to requiring very large people to optimally operate such production systems, various stabilization devices are often used, which make the apparatus even more cumbersome.
The apparatus of the invention may be optimally operated by people of any size.
Even young children can become quite proficient in the use of the wearable memory capture system.
A typical embodiment of the invention comprises one or two spatial light modulators or other display means built into a pair of bifocal eyeglasses or reading glasses, together with a microphone array. Sometimes one or more CCD (charge coupled device) image sensor arrays and appropriate optical elements are also included to provide a camera function. In the bifocal embodiment of the invention, typically a beamsplitter or a mirror silvered on both sides is used to combine the image of the aiming aid with the apparent position of the listen direction, or to display the extent of coverage of the camera. The beamsplitter is typically embedded into the eyeglass lens, and is made much wider than it needs to be, so that it appears to be a cut line in the eyeglass lens typical of bifocal eyeglasses. The aiming aid may comprise one or more of:
- A reticle, graticule, rectangle, or other marking that appears to float within a portion of the field of view;
- A display device that shows a video image, or some other dynamic information, perhaps related to the video image coming from the camera;
- An eye tracker that determines where the wearer is looking.
The invention allows a person to wear the apparatus continuously and therefore always end up with the ability to produce a memory from something that was experienced a couple of minutes ago. This may be useful to everyone, in the sense that we may not want to miss a great memory opportunity, and often great memory opportunities only become known to us after we have had time to think about something we previously experienced.
Such an apparatus might also be of use in personal safety. Although there are a growing number of video surveillance cameras installed in the environment allegedly for "public safety", there have been recent questions as to the true benefit of such centralized surveillance infrastructures. Most notably, there have been several examples in which such centralized infrastructure has been abused by its owners (as in roundups, detainment, and execution of peaceful demonstrators). Moreover, "public safety" systems may fail to protect individuals against crimes committed by the organizations that installed the systems. The apparatus of this invention allows the storage and retrieval of memories by transmitting and recording them at one or more remote locations. Memories may be transmitted and recorded in different countries, so that they would be difficult for the perpetrator of a crime to destroy, in the event that the perpetrator might wish to do so through the political machinery of a single country, or even through murder of the person who experienced the event. Thus a shared memory system helps to prevent the destruction of an individual's own memory, and may even prevent so-called "hear no evil, see no evil, speak no evil" murders.
The apparatus of the invention allows memories to be captured in a natural manner, without giving an unusual appearance to others (such as a potential killer or human rights violator).
Vitrionics provides an important aspect of the invention, namely the use of light sources embedded directly in glass, plastic, or the like, that a user can look through.
For example, a very small light emitting diode die (having a size on the order of a grain of sand) embedded within a sheet of clear glass, when held close to the eye, will be sufficiently out of focus that it will not be seen by the wearer, or by other people observing the wearer, when it is turned off. Such a light source will behave like a grain of sand in the glass, and be relatively imperceptible. When switched on, it will appear to the wearer as an out-of-focus point, and the wearer will see a large circular blob of light rather than a point of light. This point source of light may be made directional so that only the wearer can see it. The apparent direction of this blob of light may be used by the wearer to aim a directional microphone system.
The size of the blob can be used to remind the wearer of the field of response of the microphone array.
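Why the embedded point source appears as a large blob can be estimated with a thin-lens defocus approximation: a point at distance d_src, viewed by an eye focused at d_focus through a pupil of diameter A, subtends a blur of roughly A·|1/d_src − 1/d_focus| radians. The numbers below (4 mm pupil, LED 2 cm from the eye) are illustrative assumptions, not dimensions from the patent:

```python
import math

def blur_angle_deg(pupil_diameter_m, d_src_m, d_focus_m):
    """Approximate angular diameter (degrees) of the defocus blur disc
    for a point source, thin-lens small-angle approximation."""
    blur_rad = pupil_diameter_m * abs(1.0 / d_src_m - 1.0 / d_focus_m)
    return math.degrees(blur_rad)

# An LED 2 cm from the eye, with the eye focused at 1 m through a 4 mm
# pupil, blurs to a blob roughly 11 degrees across.
blob = blur_angle_deg(0.004, 0.02, 1.0)
```

This is why the source cannot be resolved as a point: the eye simply cannot accommodate to so short a distance.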
Within this circular blob, the wearer can see sharply defined markings placed on or embedded in the glass between the light source and the eye. A display in which material is seen sharply or somewhat sharply within the blurry appearance of a light source is called a blurry display or blurry information display. In many ways, a conventional in-focus display is to a blurry information display what a conventional camera with lens is to a pinhole camera. Blurry information displays typically have infinite depth of field, in the sense that as long as the eye cannot focus on the light source, the subject matter is visible in equally sharp focus or detail regardless of where the eye is focused.
A display in which there is at least one light source embedded in a transparent or partially transparent material through which the user of the display looks at other objects is called a vitrionic display. A vitrionic display may also be a blurry display if, for example, the light source is a point source of light in the field of view of the user and is too close to the user's eye for the user to focus on, or if there are optics to ensure that the source of light is out of focus.
These kinds of displays may be used as aiming aids, for aiming microphone arrays included in the eyeglass frames, or elsewhere on the body.
Accordingly, the present invention in one aspect comprises an aimer for aiming a microphone array and a processor for processing to form a beam with listen direction corresponding to the look direction of the aimer.
According to another aspect of the invention, there is provided a separate microphone array and aimer housing, the relative orientation of which is known to the processor and used to adjust the beamforming in accordance with the look direction.
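When array and aimer are in separate housings, one way the processor might use the known relative orientation is a simple frame rotation of the look-direction vector before the beamforming delays are computed. The sketch below assumes the relative orientation reduces to a single yaw angle about the vertical axis; the function name and this single-angle simplification are illustrative, not from the patent:

```python
import math

def aimer_to_array_direction(look_dir, yaw_deg):
    """Rotate a look-direction vector (x, y, z), expressed in the aimer's
    frame, about the vertical z axis by the known yaw offset between the
    aimer housing and the microphone array housing."""
    x, y, z = look_dir
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return (c * x - s * y, s * x + c * y, z)
```

A full implementation would use a 3x3 rotation matrix (yaw, pitch, and roll), but the principle is the same: the beamformer always receives the look direction in the array's own coordinate frame.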
According to another aspect of the invention, there is provided a symbiotic community of users, each wearing the apparatus of the invention and using each other's microphones to collectively beamform for at least one member of the community.
According to another aspect of the invention, there is provided a method of sound capture comprising the steps of sighting a desired target in the aimer, and locking onto the target with a beamforming system.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will now be described in more detail, by way of examples which in no way are meant to limit the scope of the invention, but, rather, these examples will serve to illustrate the invention with reference to the accompanying drawings, in which:
FIG. 1 illustrates the principle and components of the sound capture system fitted into typical flat-top bifocal eyeglasses.
FIG. 1a illustrates the invention using microphones pointed off axis.
FIG. 1b illustrates the microphone processor of the invention.
FIG. 1c illustrates separate microphone array and aimer.
FIG. 1d illustrates a community of users in symbiosis.
FIG. 2 illustrates an older style of bifocal eyeglasses in which the cut line runs along the entire width of the lens.
FIG. 3 illustrates the concealment of an LED inside an eyeglass lens, for use as a listen direction aiming aid.
FIG. 4 illustrates how the concealed LED can be used to center a sound source in the field of maximum sensitivity of a microphone array also concealed within the eyeglasses.
FIG. 5 shows an embodiment of the invention in which there are two LEDs concealed within an eyeglass lens, so that the left and right edges of a rectangular boundary of sound sensitivity can be made visible to the wearer of a microphone array.
FIG. 5a shows a top view, while FIG. 5b shows an inside view and FIG. 5c shows the view seen by the wearer when the apparatus is in operation.
FIG. 6 shows an embodiment of the wearable listen direction aiming aid invention in which the display contains four LEDs concealed in what appears to others like an ordinary eyeglass lens of ordinary trifocal construction.
FIG. 6a shows the inside view, and FIG. 6b shows the view as seen by the wearer, when the device is in operation.
FIG. 7a shows a vitrionic aimer based on reflected light.
FIG. 7b shows a vitrionic aimer based on reflected light from boundary points of aimer region.
FIG. 8 shows a manufacturing process for a vitrionic aimer.
FIG. 9 shows a camera-based head tracking aimer.
FIG. 10 shows an eye tracker aimer installed in a reality mediator.
FIG. 11 shows some examples of aimer viewscreens.
FIG. 12 shows a camera based aimer.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
While the invention shall now be described with reference to the preferred embodiments shown in the drawings, it should be understood that the intention is not to limit the invention only to the particular embodiments shown, but rather to cover all alterations, modifications, and equivalent arrangements possible within the scope of the appended claims.
In all aspects of the present invention, references to "camera" mean any device or collection of devices capable of simultaneously determining a quantity of light arriving from a plurality of directions and/or at a plurality of locations, or determining some other attribute of light arriving from a plurality of directions and/or at a plurality of locations. Similarly, references to "aiming aid" shall not be limited to just viewfinders or miniaturized television monitors or computer monitors, but shall also include computer data display means, as well as fixed display means, where such fixed display means include crosshairs, graticules, reticles, brackets, etc., and other video display devices, still picture display devices, ASCII text display devices, terminals, and systems that directly scan light onto the retina of the eye to form the perception of an image, as well as eye tracking devices.
References to "processor" or "computer" shall include sequential instruction, parallel instruction, and special purpose architectures such as digital signal processing hardware, Field Programmable Gate Arrays (FPGAs), and programmable logic devices, as well as analog signal processing devices.
References to "lens", in the context of eyeglasses, shall include the degenerate case of a "lens" of infinite focal length (zero reciprocal focal length, often denoted as zero diopters of strength). Therefore, even the "pseudo-intellectual" glasses sold strictly for fashion (e.g. having no corrective power) are considered to contain "lenses".
References to "aimer" refer to both passive and active aiming devices for aiming a microphone array at a desired sound source; such aimers may include active ultrasonic pointing devices, passive viewscreens, head trackers, and eyetrackers.
When it is said that object "A" is "borne" by object "B", this shall include the possibilities that A is attached to B, that A is part of B, that A is built into B, or that A is B.
FIG. 1 is a diagram depicting a pair of bifocal eyeglasses containing an embodiment of the invention. Eyeglasses are normally held upon the head of the wearer by way of temple side pieces 110. Microphones 120 are concealed inside the temple side pieces, and face outward through holes in the temple side pieces. These holes have the appearance of ordinary pop rivets, or the like, as might ordinarily be expected to hold the temple side pieces onto the frames 130. Alternatively, the microphones, or microphone openings, may take on the appearance of other decorative or functional components of the eyeglasses. Within frames 130 are two lenses 140. Lenses 140 contain inset lenses 150, which may have different prescriptions than lenses 140 in which they are set. There is typically a cut line 160 between lens 140 and lens 150.
Typically lens 140 will provide a prescription for distant objects while inset lens 150 will provide a prescription for nearby objects. Lens 150 is commonly used for reading. Accordingly, lens 140 may, in some cases, have infinite focal length (zero power) and simply serve as a support for lens 150, which is typically a lens of positive focal length (e.g. a magnifying glass).
In some cases, lens 140 may be nonexistent, as in typical reading glasses, in which lens 150 is mounted directly to frame 130, and the wearer looks over frame 130 to see distant objects and through lens 150 to see nearby objects.
In many modern multifocal eyeglasses, there is no visible cut line 160; instead there is a gradual transition from the prescription of lens 140 to that of lens 150. Such multifocal lenses are known as progressive. The purpose of such a gradual transition is to accommodate a variety of distances, in situations where the wearer normally has an inability to focus over any appreciable range of distances, as well as to improve appearance. Since the need for multifocal lenses shows a deficiency on the part of the user, there has been a trend toward hiding this deficiency, just as there has been a trend toward contact lenses and laser eye treatments to eliminate the need for eyewear altogether. However, amid the desire among some to hide the fact that they need bifocals or even just ordinary glasses, there are others who like to wear eyeglasses.
Even some people who do not need eyeglasses often wear so-called pseudo-intellectual glasses, which are glasses in which lens 140 has infinite focal length.
Moreover, bifocal eyeglasses and reading glasses are often associated with intellectuals, and thus there is a portion of society that would readily wear glasses having the general appearance of those depicted in FIG. 1, even if they did not require a prescription of any kind.
The eyeglass lenses 140 may also contain markings 170 made directly on the glass.
Such markings, for example, may be the manufacturer's name or an abbreviation (such as the letters "GA" engraved on the left lens of Giorgio Armani glasses). For illustration in this disclosure, the markings "L" and "R" denote the left and right lenses, as labeled from the perspective of the wearer. Such markings will help make it clearer which lens, and which side of the lens, is being shown.
Vitrionics may be concealed in eyeglass lenses 140. Cut line 160 may be used to conceal vitrionics, or to mask the artifacts and regular structures of vitrionics and associated matter embedded in or on eyeglass lens 140.
Alternatively, frames 130 may be constructed of dark smoked plastic or the like, and be somewhat transparent but quite dark. In this way, vitrionics may be concealed in frames 130. This method of concealment is particularly useful when the eyeglasses are half glasses (reading glasses) in which the top portion of frames 130 passes directly in front of the central region of the eye of the wearer.
Concealment of vitrionics and associated optics allows an aimer to be built into eyeglass lens 140, so that the aimer can be used to aim the microphone array comprising microphones 120. In this way, the wearer of the eyeglasses can look directly at another person, and use the aimer to orient the eyeglasses so that the other person is centered in the region for which a processor maximizes the sensitivity of the microphone array, at least as an initial guess for processing parameters related to the outputs of microphones 120.
A drawback of the apparatus of FIG. 1 is the limited baseline of the microphones 120. The maximum spacing between microphones corresponds to the width of eyeglass frames 130, which is quite limited, especially in relation to lower frequency audio signals that might be of interest.
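The effect of a limited baseline can be made concrete with the usual diffraction-style estimate that a delay-and-sum array's beamwidth scales roughly as wavelength divided by aperture. The 14 cm frame width and the cap at 180 degrees are illustrative assumptions for this sketch, not figures from the patent:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)

def beamwidth_deg(baseline_m, frequency_hz):
    """Rough beamwidth estimate ~ wavelength / aperture, in degrees,
    capped at 180 degrees (i.e. no useful directivity)."""
    wavelength = SPEED_OF_SOUND / frequency_hz
    return min(180.0, math.degrees(wavelength / baseline_m))

# For a ~0.14 m eyeglass-frame baseline: useful directivity at high audio
# frequencies, essentially none at low frequencies where wavelengths are
# several metres.
hi_band = beamwidth_deg(0.14, 8000.0)   # a few tens of degrees
lo_band = beamwidth_deg(0.14, 200.0)    # capped: omnidirectional
```

This is why the off-axis and body-mounted microphones discussed next can help: they effectively enlarge the aperture and add diversity beyond what the frame width alone allows.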
FIG. 1a is a diagram depicting a pair of bifocal eyeglasses having at least some microphones 120 mounted off-axis, so that they do not necessarily point in the forward direction toward a probable source of sound to be captured. In FIG. 1a, microphones 120 have a directionality depicted by rays 121 emanating therefrom. Some of microphones 120 may be mounted so they face upward and capture echoes off the ceiling of a typical room. Although there will not always be a ceiling present (e.g. during outdoor use), there will be times when a ceiling is present, during which these upward-facing microphones will assist. Other microphones may be aimed downward to capture echoes off the floor or ground. Additionally, some microphones may be mounted at the back of temple side pieces 110 to point directly away from the sound source of interest, to capture sound information from unwanted interference and thus help steer nulls toward, and cancel, that interference.
Microphones that are upward or downward facing may be concealed behind the eyeglass frames.
FIG. 1b depicts the processing system for processing outputs from the microphones 120. In this example, four microphones are depicted. Four microphones are often sufficient for many purposes (e.g. the well known documentary video ShootingBack of http://wearcam.org/shootingback was shot using four microphones followed by a process of delay and summation of their outputs), but of course more may be added for further improvements in sound quality. For simplicity, the process will be described for four microphones, where it is understood that this number is arbitrary. The first step is to apply a delay to the outputs of at least three of the four microphones. In the preferred embodiment, the delays are done with four delay units, D1, D2, D3, and D4, since using only three delays would require switching them around. With four delays, one of them can simply be set to zero delay.
For simplicity, later in this disclosure, D1, D2, D3, and D4 will be referred to as delays D to denote the set of delay elements Dk, k ∈ {1, 2, 3, 4}, as well as for other numbers of delay elements not necessarily equal to four.
In the preferred embodiment, the delays are accomplished by moving samples for a discrete delay (to within the nearest sample) and by interpolation for subsample delays. An interpolation filter is designed using methods well known in the art.
The outputs of delays D1, D2, D3, and D4, if simply summed, would maximize the signal strength in the look direction as aimed by the aimer in lens 140, provided that both the person wearing the apparatus and the sound source (e.g. the other person speaking) were in an anechoic chamber. However, due to multiple echoes and reverberations in a typical room, the outputs of delays D1, D2, D3, and D4 can be improved by further processing prior to summation.
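A minimal sketch of the delay-and-sum step described above, assuming a Hann-windowed sinc interpolator for the subsample delays (the disclosure does not name a particular interpolation filter, so the tap count and window here are assumptions):

```python
import numpy as np

def fractional_delay(x, delay_samples, taps=33):
    """Delay x by a possibly non-integer number of samples (>= 0):
    integer part by shifting samples, fractional part by a
    Hann-windowed sinc interpolation filter."""
    n = int(np.floor(delay_samples))
    frac = delay_samples - n
    k = np.arange(taps) - (taps - 1) // 2
    h = np.sinc(k - frac) * np.hanning(taps)   # interpolation kernel
    h /= h.sum()                               # unity gain at DC
    y = np.convolve(x, h)[(taps - 1) // 2:(taps - 1) // 2 + len(x)]
    if n > 0:                                  # integer part: shift samples
        y = np.concatenate([np.zeros(n), y[:len(y) - n]])
    return y

def delay_and_sum(channels, delays):
    """Apply per-channel delays (the role of D1..D4) and average:
    the anechoic-chamber beamformer described in the text."""
    return sum(fractional_delay(x, d)
               for x, d in zip(channels, delays)) / len(channels)
```

With delays chosen so that sound from the look direction arrives aligned, the channels add coherently while off-axis sound and noise add incoherently.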
A filter FIL is applied to these outputs to cancel echoes from each of the microphones, and to arrive at a best approximation to echo-cancelled outputs.
Within FIL, echoes and reverberations entering the first microphone are removed, not only by way of the signal from this first microphone, but also by way of signals from the other microphones, which help in gathering information about these echoes.
Filter FIL also assists in localizing interference sources, for steering nulls thereto.
Additionally, other microphones, such as those mounted facing away from the desired sound source, further help in this null steering.
A plurality of outputs from FIL, which may number more than, less than, or the same as the number of inputs to filter FIL, emerge, and are individually filtered by filters F1, F2, F3, and F4. Filters F1, F2, F3, and F4, at the very least, restore the
low frequency components that are lost by FIL. Beamforming, null steering, and the like tend to give sound lacking in low frequency components (e.g. what the layperson might describe as "tinny" or "thin"). This lack of low frequency components is known to anyone in the art, and indeed to many laypersons, and was portrayed in the movie "The Conversation" (director Francis Ford Coppola), in which three microphones were used to capture sound, the sound being passed through delay (by way of manually starting and stopping three open reel tape recorders at different times), the output being summed through an audio mixer.
Therefore, at the very least, filters F1, F2, F3, and F4 boost low frequencies with respect to high frequencies. Preferably, filters F1, F2, F3, and F4 more accurately undo any undesirable effects of filter FIL, while leaving the signals free of echo and reverberation.
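The disclosure does not give the form of filters F1 through F4; a first-order low-shelf boost is one plausible sketch of the low-frequency restoration they perform (the corner frequency and gain below are illustrative assumptions):

```python
import math

def low_shelf(x, fs, f_c=300.0, gain_db=6.0):
    """First-order low-shelf filter: boost content below f_c by
    gain_db while leaving high frequencies nearly untouched
    (bilinear transform of H(s) = (s + w0*A)/(s + w0/A), whose
    DC gain is A^2)."""
    A = 10.0 ** (gain_db / 40.0)                   # sqrt of shelf gain
    w0 = 2.0 * fs * math.tan(math.pi * f_c / fs)   # prewarped corner
    k = 2.0 * fs
    b0, b1 = k + w0 * A, -k + w0 * A
    a0, a1 = k + w0 / A, -k + w0 / A
    y, xp, yp = [], 0.0, 0.0
    for xn in x:                                   # direct-form difference eq.
        yn = (b0 * xn + b1 * xp - a1 * yp) / a0
        y.append(yn)
        xp, yp = xn, yn
    return y
```

A constant (DC) input converges to roughly twice its amplitude at a 6 dB shelf, which is the low-frequency boost the text calls for.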
Because microphones 120 are pointing in different directions, they will capture very different signal strengths. Rather than simply averaging the delayed and processed signals of interest, or merely selecting the best desired signal of interest, a comparameterizer 190 combines the signals optimally, based on relative certainty.
The comparameterizer 190 processes the signals based on comparametric equations, to capture any nonlinearities in the signals processed thus far. The output of the comparameterizer comprises at least one clear sound signal, and preferably two clear sound signals, OL and OR, for supplying to left and right earphones respectively.
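The disclosure does not reproduce the comparametric equations themselves; as a hypothetical stand-in for comparameterizer 190, a certainty-weighted combination in the spirit described (weights proportional to relative certainty, modelled here as inverse noise variance) might look like:

```python
def certainty_weighted_combine(signals, noise_vars):
    """Combine time-aligned channel estimates, weighting each by its
    certainty, here modelled as inverse noise variance. This is a
    hypothetical stand-in: the disclosure's comparametric equations
    are not reproduced, only the certainty-weighting idea."""
    w = [1.0 / v for v in noise_vars]        # certainty of each channel
    total = sum(w)
    w = [wi / total for wi in w]             # normalize weights to 1
    n = len(signals[0])
    return [sum(wi * ch[i] for wi, ch in zip(w, signals))
            for i in range(n)]
```

A channel with very low certainty (high noise variance) contributes almost nothing, which degrades gracefully into "select the best signal" while still pooling information when several channels are comparably good.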
Some or all of the processing components, such as:
- delays D1, D2, D3, and D4;
- filter FIL;
- filters F1, F2, F3, and F4;
- comparameterizer 190
may be incorporated into a wearable processor 199. In the preferred embodiment this wearable processor 199 is a WearComp as defined and described in the lead article of Proceedings of the IEEE, Vol. 86, No. 11, on Intelligent Signal Processing.
So far the microphones 120 have been fixed with respect to the aimer in lens 140.
Since the aimer and microphones are both held by frames 130, they cannot move with respect to one another. However, in other embodiments, some or all of the microphones are separate from the aimer.
FIG. 1c depicts an aimer separate from the microphone array. Microphones 120 are mounted in a chest plate 198 for wearing on the chest, and for being concealed under an acoustically transparent shirt. Chest plate 198 is ordinarily curved to fit the body, and may be flexible or rigid. In the preferred embodiment it is rigid so that the spatial relationship between microphones 120 is fixed. Chest plate 198 contains relative position and rotation sensors that will be called orienters.
Orienters 196 allow the position and rotation of chest plate 198 with respect to eyeglass frames 130 to be determined. Eyeglass frames 130 contain relative orienters 195. By way of communications link 110, processor 199 calculates the relative orientation between chest plate 198 and frames 130. Orienters 195 and 196 are connected to inputs of processor 199. Link 110 may be a radio link, infrared, or otherwise. In the preferred embodiment, link 110 is a wire concealed inside eyeglass safety straps. A
satisfactory safety strap is that which is sold under the trade name Croakies (TM). In the preferred embodiment, link 110 also carries electrical power to the eyeglasses to operate an aimer comprised of a vitrionic display.
In the preferred embodiment, orienters 195 comprise three relative rotation sensors and a relative displacement sensor. Likewise for orienters 196, each set being responsive to relative yaw, pitch, and roll, with a fourth sensor providing for the measurement of relative translation between chest plate 198 and frames 130.
The microphone placement on plate 198 can be regular, random, or structured (e.g. using the Golomb ruler principle to maximize the number of unique microphone spacing distances). Delays D are responsive to the relative orientation between frames 130 and plate 198, so that the initial guess for the beam look direction of microphones 120 is that which is sighted in the aimer incorporated into lens 140. The aimer provides a guess that would form an ideal beam in the look direction if both wearer and sound source were located in an anechoic chamber. However, to compensate for typical room acoustics, a least mean squares algorithm, or the like, may then be used to refine this guess, and further echo cancellation may subsequently be applied.
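The initial, anechoic guess for delays D can be sketched as a far-field (plane-wave) computation; the chest-plate geometry below is an illustrative assumption, and the look direction is taken as already rotated into the plate's frame using the orienter readings:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def steering_delays(mic_positions, look_direction):
    """Far-field (plane-wave) initial guess for delays D: delay each
    microphone so that a wavefront arriving from look_direction sums
    coherently. Positions are metres in the chest-plate frame;
    look_direction is a vector already rotated into that frame using
    the orienter readings. The geometry is an assumption."""
    norm = math.sqrt(sum(c * c for c in look_direction))
    u = [c / norm for c in look_direction]
    # projection of each microphone onto the look direction:
    proj = [sum(p * c for p, c in zip(m, u)) for m in mic_positions]
    base = min(proj)
    # microphones nearer the source hear it earlier, so delay them more;
    # the farthest microphone gets exactly zero delay, as in the text
    return [(p - base) / SPEED_OF_SOUND for p in proj]
```

These values would seed delays D; the least-mean-squares refinement the text mentions would then adjust them (and filter FIL) to account for real room acoustics.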
FIG. 1d depicts a community of users in which at least two users combine efforts to assist at least one of them. Suppose that a first user wearing eyeglass frames 130 and a second user wearing eyeglass frames 130a both combine efforts to assist the first user.
Two sets of microphones produce signals that are transmitted by antennas 110 and 110a to receiver Rx. Receiver Rx accepts signals from at least one microphone on each user's apparatus, and preferably multiple microphones from each user.
These microphone signals are delayed by D1 and D2, assuming only one microphone from each user (noting that in the preferred embodiment there are more microphones from each user). The delayed microphone signals are processed by filter FIL, the output of which goes to comparameterizer COM, to produce a final signal audible by the first user.
A processor determines the relative orientation between orienter 195 in the first user's frames 130 and orienter 195a in the second user's frames 130a. This orientation information is used for the initial conditions of a beamforming process, as started by the first user sighting a target sound source. Ordinarily, with only one user, a far field approximation (e.g. that the arriving sound waves are plane waves) is sufficient.
However, with multiple users, a nearfield model is preferable. Accordingly, the aimer preferably provides range information (distance from the first user to the target sound source of interest). This may be found by way of an ultrasonic range finder.
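With the aimer's range estimate in hand, the near-field initial guess replaces the plane-wave projection with exact spherical-wave path lengths; the coordinates below are illustrative assumptions:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def nearfield_delays(mic_positions, source_position):
    """Near-field initial guess: use the aimer's range estimate to
    place the source at a point and compute exact spherical-wave
    path lengths, instead of the plane-wave approximation.
    Coordinates are metres; the geometry is an assumption."""
    dists = [math.dist(m, source_position) for m in mic_positions]
    farthest = max(dists)
    # nearer microphones hear the source earlier, so delay them more;
    # the farthest microphone gets zero delay
    return [(farthest - d) / SPEED_OF_SOUND for d in dists]
```

For widely separated users (as in FIG. 1d), the per-microphone distances differ substantially, which is exactly why the far-field approximation breaks down and range information becomes necessary.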
An ultrasonic rangefinder apparatus can also be used to beam ultrasonic waves at the target sound source, and to model the room acoustics at ultrasonic frequencies.
This rough model can then help the model at audible frequencies.
FIG. 2 depicts the left lens of an older style of bifocal eyeglasses. The cut line extends all the way across. Older styles of bifocal eyeglasses such as that depicted in FIG. 2 are becoming popular among some individuals, so that eyeglasses with lenses like those depicted in FIG. 2 would not look particularly out of place.
Lens 240 ordinarily has a prescription suitable for looking at distant objects, while lens 250 is suitable for looking at near objects. Lens 240 may have infinite focal length (zero power) if the glasses are intended to be worn by a person who has normal vision (e.g. does not need corrective eyewear). Such a person may wear these glasses simply to facilitate doing fine work such as soldering, needlework, or the like, by virtue of lenses 250, which may simply serve the same purpose as ordinary magnifying glasses.
Alternatively, lenses 240 and 250 may both be of infinite focal length, and the cut line 260 therebetween may be simply for cosmetic purposes, e.g. to provide the wearer with the appearance of a traditional intellectual who might wear old-style bifocal eyeglasses.
Thus eyewear as depicted in FIG. 2 may be constructed to meet the prescription of one who requires bifocal glasses, or one who requires only ordinary unifocal glasses, or one who requires no glasses at all.
Accordingly, the eyewear as depicted in FIG. 2 may be constructed for nearly anyone, and may also be used as a basis in which to conceal additional apparatus.
FIG. 3 shows how an LED (light emitting diode) may be concealed within the lens material (such as glass or plastic) of an eyeglass lens 340 to form part of a vitrionic aimer. Lens 340 may comprise two separate prescriptions, one prescription, or no prescription at all, depending on the wearer's needs or lack thereof. A thin wire 310, preferably made of stainless steel or other material that is silver in colour, is embedded on or within the lens material 340. Wire 310 carries electrical current to the anode terminal of an LED 320 also embedded in the lens material. Wire 330 carries electrical current from the cathode terminal of the LED. (For purposes of this disclosure conventional current, in which electricity flows from plus to minus, is used rather than electron current, in which electrons flow from minus to plus.) LEDs normally come in clear plastic housings, which are quite large (e.g. on the order of three millimeters in diameter). However, the internals of an LED may be embedded directly into the lens material 340, so that the lens material itself forms the protective housing around these internals. In this way, the LED 320 is too small to be easily seen by the unaided eye of someone who might be looking at the wearer's eyeglasses. Thus, so long as the LED 320 is not illuminated, it will remain essentially invisible to others.
A miniature shroud 321 is typically placed over LED 320 so that people other than the wearer of the glasses cannot easily see the light from LED 320. This shroud is preferably made in an irregular shape, so that it has the appearance of a speck of dust or small particle of dirt. The side facing the wearer (and behind the LED from the wearer's perspective) is preferably black, while the side facing away from the wearer is preferably dust- or dirt-coloured, and may comprise dirt or dust particles or other imperfections embedded into the glass.
As an alternative to an LED, a scintillating fiber optic or other light source embedded in the glass may be used.
Moreover, a cut line on the inside of the glass, such as crosshairs scratched into the glass, may be used to project an image of an aimer directly onto the retina of the wearer, so that it is in focus for all focal distances of the wearer's own eye lens.
The lens depicted in FIG. 3 is a left lens, denoted by the letter "L". In actual embodiments of the invention, the vitrionic aimer may be incorporated into either the left lens, the right lens, or both lenses. When the vitrionic aimer is incorporated into only one lens, typically the other lens will be a dummy lens having the same physical appearance as the lens containing the vitrionic aimer. Typically the dummy lens will contain at least copies of wires 310 and 330. These dummy copies of the wires will give the other lens the same appearance as the one that contains the vitrionic aimer.
In particular, both lenses will look like ordinary bifocal lenses.
Ordinarily, hearing aids, and the like, should be as unobtrusive as possible. Conventional hearing aids are often hidden in the ear, concealed by long hair, and colored to match the skin and hair of the wearer for better concealment.
Accordingly, the purpose of the dummy copies of the wires, etc., is either to make the apparatus more normal looking (e.g. so that it does not look different from traditional bifocal eyeglasses) or to provide visual symmetry even in noncovert versions of the apparatus.
Visual symmetry is a well known concept. For example, in kitchen countertops with sinks, when there are drawers under the counter, there are often "dummy" drawers under the sink that do not actually work, but that provide a visually appealing balance to the actual functioning drawers on the other side of the counter. Thus dummy units provide a pleasing visual symmetry regardless of whether or not the apparatus of the invention is to be covert or concealed in function.
FIG. 4 depicts subject matter being captured using an embodiment of the vitrionic aimer of the invention. In this example, the subject matter may, for example, comprise a person, defined by point 410.
An arbitrary point 410 on the sound source of interest radiates light in all directions, and some of this light may be collected by a customer wearing eyeglasses in which a hearing aid is concealed.
Light from 410 passes through the customer's eyeglasses, in particular through a lens 340 of the customer's eyeglasses, and then through lens 420 of the customer's eye 430. This light converges to a point 440 on the customer's retina. To the left of eye 430 is shown the image of the clerk upon the customer's retina, and point 450 of this image corresponds to point 410 of the subject matter.
LED 320 is located in eyeglass lens 340 which is very close to the wearer's eye.
Humans with normal healthy vision can focus on objects that are between about 4 inches (approximately 10 cm) and infinity away from the lens 420 in the eye 430.
Thus objects such as LED 320 which are closer than 4 inches must appear out of focus. LED 320 is so close to the eye, in fact, that it will appear extremely out of focus. The customer will not be able to see the LED in his eyeglasses, and in fact the customer will see a very large circular-shaped blob which is the out-of-focus image of the point source LED 320.
The circular disk that one sees from a point source of light that is out of focus is known as the circle of confusion.
Rays of light from LED 320 are denoted by dashed lines, which eye lens 420 is too weak to focus, so that they spread out and strike the retina at 460, defining a circular disk of light 470. LED 320 is typically red or green, so that disk 470 appears as a large circle of red or green light.
The exact shape of this disk 470 is determined by the shape of the opening in the eye, and disk 470 will also show imperfections in the eye lens 420, such as dust on eye lens 420, or any irregularities in the iris of the eye.
However, despite these irregularities, the circular blob 470 will indicate to the wearer the direction in which the apparatus of the invention will initially begin searching.
Thus the customer may make use of LED 320 embedded in eyeglass lens 340 to orient his wearable hearing aid in the direction of the subject matter, and to know that the subject matter is centered at the point of maximal gain.
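The apparent size of the blob can be estimated with a thin-lens, small-angle approximation (not part of the disclosure; the pupil diameter and LED-to-eye distance below are illustrative assumptions):

```python
import math

def blur_angle_deg(pupil_diam_m, source_dist_m, focus_dist_m=float('inf')):
    """Approximate angular diameter of the circle of confusion for a
    point source at source_dist_m when the eye is focused at
    focus_dist_m (thin-lens, small-angle approximation)."""
    focus_power = 0.0 if math.isinf(focus_dist_m) else 1.0 / focus_dist_m
    theta = pupil_diam_m * abs(1.0 / source_dist_m - focus_power)
    return math.degrees(theta)

# LED ~15 mm from the eye, 4 mm pupil, eye focused on a distant talker:
blob = blur_angle_deg(0.004, 0.015)   # ~15 degrees: a large coloured blob
```

An LED millimetres from the cornea is orders of magnitude inside the near point, which is why the wearer sees a broad disk rather than a point, exactly as the text describes.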
Moreover, the colour and state of the LED (e.g. whether the LED is flashing, and at what rate, and in the case of a multicolour LED whether it is red or green or whichever other colours it may assume) may convey additional information to the wearer of the apparatus, such as sound levels, and the like, which might be especially useful to the deaf, who could still determine that the apparatus was correctly capturing sound, and that the sound was of good quality and level. Sound quality and correct recording level are particularly important to speech recognizers and speech memory systems.
Alternatively, LED 320 may, for example, turn red to indicate that a recording device, speech recognizer, or transmitter is active. The LED may begin flashing when hard disk space on the recording device is almost full, or to notify a deaf wearer that radio contact with an assistant has been made, so that the assistant can provide remote help. The rate of flashing may be used to indicate to the wearer how much disk space remains, or the quality of the connection, or may be used to produce Morse code output of a speech recognizer or remote assistant.
In order that light from the LED that might be reflected off of the inside surface of the glass 340, or off of the wearer's eye 430, is not seen by others (such as the clerk or the like), LED 320 is automatically adjusted in brightness in accordance with ambient light levels. Typically the camera is capable of measuring the quantity of light received, and also of estimating the scene contrast, and from this information provides a control voltage to LED 320 so that it becomes bright when necessary (such as outdoors on a sunny day) and dimmer when it does not need to be so bright (such as in a dimly lit corridor or stairwell of a department store).
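One plausible form for this brightness control is a logarithmic mapping from measured ambient light to drive voltage, since natural light levels span several orders of magnitude; all constants below are illustrative assumptions, not values from the disclosure:

```python
import math

def led_drive_level(ambient_lux, lo_lux=10.0, hi_lux=10000.0,
                    v_min=1.9, v_max=2.4):
    """Map measured ambient light to an LED drive voltage: brighter
    surroundings call for a brighter LED, clamped to a safe range.
    A log scale suits the wide dynamic range of natural light.
    All constants here are illustrative assumptions."""
    lux = min(max(ambient_lux, lo_lux), hi_lux)   # clamp to working range
    t = (math.log10(lux) - math.log10(lo_lux)) \
        / (math.log10(hi_lux) - math.log10(lo_lux))
    return v_min + t * (v_max - v_min)
```

A dim corridor maps near the bottom of the range (barely visible to bystanders), while a sunny day maps to full drive so the aimer remains visible to the wearer.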
FIG. 5 depicts another embodiment of the vitrionic aimer in which two LEDs are concealed with wiring along the cut line of (or where the cut line would be in) a bifocal eyeglass lens 340. LEDs 520 and 521 may be of similar construction to LED 320 and may be similarly shrouded so that others facing the wearer do not readily see the light from LEDs 520 and 521.
Wire 510 carries electric current to LED 520, which is connected in series with LED 521 by wire 511, followed by wire 512 which completes the circuit. It is preferable that the LEDs be wired in series, so that a single current limiting resistor or drive circuit can drive both of them at equal brightness. (Wiring LEDs in parallel is known to provide unreliable and sometimes unpredictable results.)
FIG. 5a is a top view of FIG. 5, looking at the eyeglass lens 340 on edge. The surface of the glass 340 that faces away from the wearer is designated 341, while that facing toward the wearer is designated 342. LEDs 520 and 521 are embedded inside the glass but located near surface 341. On the other surface 342 are scratch marks 550 and 551, which are constructed to look like part of the optical cut lines around normal bifocal insets. These cut lines are made in an inside-out bracket shape.
FIG. 5b is an inside view of FIG. 5, looking at the eyeglass lens from the wearer's side.
FIG. 5c is an inside view of FIG. 5, looking at the eyeglass lens from the wearer's side, but showing how it appears when the LEDs 520 and 521 are turned on, and the glass is too close to a wearer's eye to focus on. Instead, light from LED 520 projects an image of scratch mark 550 directly onto the retina of the wearer's eye.
Since the image of scratch mark 550 is not inverted (e.g. since it is projected directly onto the retina), it will appear to the wearer as if it is inverted. This is because upright objects are normally presented inverted (upside down) on the retina, and this is what we are used to. (See, for example, Stratton, 1896.) It is for this reason that the two halves of the brackets are each made backwards.
What the wearer sees is inward-facing brackets as shown in FIG. 5c. These are seen as dark lines within the circles of confusion 570 and 571. Circles of confusion 570 and 571 arise from LEDs 520 and 521 respectively, since each is a point source that is too close to the eye for the eye lens to focus on.
Brackets 550 and 551 are sufficient to provide an aimer for the wearer to see which sound sources will be within the microphone array's field of maximum coverage and which will not. Most notably, brackets 550 and 551 are made to match the 3 dB points in the central lobe of the array, when aimed dead center, as if the apparatus were in an anechoic chamber. Of course the actual performance will vary in typical situations.
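The angle the brackets would need to subtend can be estimated from a standard beamwidth approximation for a uniform line array (the microphone count, spacing, and frequency below are illustrative assumptions, not figures from the disclosure):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def half_power_beamwidth_deg(n_mics, spacing_m, freq_hz):
    """Approximate 3 dB (half-power) main-lobe width of a uniform,
    broadside delay-and-sum line array:
        HPBW ~ 0.886 * wavelength / (N * d)  radians,
    a standard approximation, reasonable when the array is at least
    a few wavelengths long."""
    lam = SPEED_OF_SOUND / freq_hz
    return math.degrees(0.886 * lam / (n_mics * spacing_m))

# e.g. four microphones spaced 4 cm apart, evaluated at 4 kHz:
width = half_power_beamwidth_deg(4, 0.04, 4000.0)   # roughly 27 degrees
```

Since beamwidth scales with wavelength, the brackets can only match the 3 dB points exactly at one design frequency; at lower frequencies the true main lobe is wider than the marked frame.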
FIG. 6 shows an embodiment of the aimer concealed in eyeglass lens 340, configured to appear as if it were an ordinary trifocal eyeglass lens.
The same series configuration of LEDs as that depicted in FIG. 5 is used, with a second row added higher up, in which wire 610 carries electricity to the anode terminal of LED 620, which is connected in series with LED 621 by way of wire 611, and in which wire 612 completes the circuit.
Each pair of LEDs has its own current limiting resistor or the like, which is typically mounted in the eyeglass frames so that a single set of wires concealed within the frames can power the LEDs. These wires are typically connected to a waist-worn power supply, and the wiring from the glasses to the power supply is typically concealed within an eyeglass safety strap. A satisfactory eyeglass safety strap for concealment of wiring is one sold under the trade name "Croakies".
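Sizing the current limiting resistor for each series pair is a direct application of Ohm's law; the supply voltage, forward voltage, and drive current below are illustrative assumptions:

```python
def series_led_resistor_ohms(v_supply, v_forward, n_leds, current_a):
    """Ohm's-law sizing of the single current-limiting resistor for a
    series LED string: R = (Vsupply - n * Vf) / I. Because the same
    current flows through every LED in the string, both glow at
    equal brightness, as the text notes."""
    return (v_supply - n_leds * v_forward) / current_a

# e.g. a 5 V waist-worn supply, two red LEDs (~1.8 V each), 10 mA drive:
r = series_led_resistor_ohms(5.0, 1.8, 2, 0.010)   # 140 ohms
```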
FIG. 6a shows the inside surface 342 of lens 340 after it has been marked for use with the four LEDs depicted in FIG. 6. Four "L"-shaped scratches or similar marks are made on the inside surface 342 of eyeglass lens 340. L 650 will be seen in the upper left corner, L 651 will be seen in the upper right corner, L 652 will be seen in the lower left corner, and L 653 will be seen in the lower right corner of the aimer's field of view.
FIG. 6b shows the inside surface 342 of lens 340 after it has been marked for use with the four LEDs depicted in FIG. 6, when it is placed too close to the eye to focus on, and when, further, all four LEDs are turned on.
Although each L appears in its proper place (e.g. the upper left L appears to the wearer to be situated at the upper left corner of the frame), each of them is inverted within its corresponding circle of confusion. LED 620 defines a circle of confusion 670. LED 621 defines a circle of confusion 671. LED 520 defines a circle of confusion 672. LED 521 defines a circle of confusion 673.
Note that it is acceptable if these circles of confusion overlap. For example, circle of confusion 670 may overlap with 672, as may circle of confusion 671 overlap with 673.
However, so long as the overlap does not extend into the "L"-shaped markings, the apparatus will work fine. For example, as long as circle of confusion 670 does not extend into L 652, then L 652 will continue to be clearly defined. (Otherwise, L 652 will be seen as a double image.) Accordingly, an improvement to the invention may be made by making the camera collect rays of light that are collinear with the rays of light entering the eye from the viewfinder.
FIG. 7a shows an embodiment of the vitrionic aimer of the invention in which very small mirrors 710 and 750 are used to reflect light from a single point source 700 (embedded, at least partially, in the lens material of an eyeglass lens) into an eye 430 of the wearer of the glasses. This arrangement is similar to that depicted in FIG. 5 and FIG. 6, and thus there may also be markings made on the inside surface 705 of the glass or plastic lens material as illustrated in FIGS. 5a-5c and FIGS. 6a-6b.
Alternatively, mirrors 710 and 750 may be made either slightly curved, or made small enough that they generate rays of light 712 and 752 that behave sufficiently like single rays of light that they no longer form large circles of confusion, and instead pass directly through the center of the lens 420 of eye 430 to form small points of light 715 and 725 on the retina of eye 430.
The dashed lines 713 and 753 denote the surface normals of mirrors 710 and 750.
Because light is shining directly into eye 430, it can be very low in intensity such that others will not likely see the light that leaks out of the apparatus.
Alternatively, light source 700 may be designed to direct light primarily along rays 711 and 751 in the directions of mirrors 710 and 750, so that there is no stray light, as might be the case if light source 700 radiated equally in all directions over the full 4π steradians (1 spat) of solid angle.
There may be two mirrors 710 and 750 for the arrangement of light as was illustrated in FIG. 5, or there may be four mirrors 710 and 750. In the case of four mirrors, 710 denotes two mirrors one above the other, and so does 750, so that the arrangement of light is as depicted in FIG. 6.
FIG. 7b shows another embodiment of the wearable vitrionic aimer. Five mirrors 710, 720, 730, 740, and 750, define a row across the top of a rectangular viewframe.
Mirrors directly below these, also denoted by 710, 720, 730, 740, and 750, define the bottom of a rectangular viewframe. Mirrors 710 also denote a stack, one above the other, defining multiple vertical points of light that create the left edge of the viewframe, and similarly for mirrors 750. In this case, point source 700 shines out along rays 711, 721, 731, 741, and 751 to illuminate respectively mirrors 710, 720, 730, 740, and 750.
The viewframe is comprised of rays of light converging at point 790 in the center of lens 420 of eye 430.
Thus the number of mirrors may be increased, or a single blazed mirror or blazed reflection grating may be used to generate each of the top and bottom of the viewframe. For the left and right sides, single tall slender mirrors 710 and 750 may be used.
Moreover, the vitrionics of the aimer may also be made responsive to the eye itself, as for example, the apparatus being made responsive to light in the reverse direction as well (coming back from the eye).
FIG. 8 shows an alternate embodiment of the wearable vitrionic aimer system in which a light sensitive material is used along surface 800, and is illuminated by point source 700 for an exposure. Point source 700 creates a cone of light between and including rays 811 and 821. During the illumination of source 700, a point source 805 that is coherent with source 700 creates a cone of light between and including rays 806 and 807. These rays of diverging light are converted to rays of converging light by lens 810, which is part of the manufacturing system. Since sources 700 and 805 are coherent and mutually coherent (they are normally part of a single laser source split two ways during manufacture, after which light source 700 becomes a single source used differently from the way it was used during manufacture), an interference pattern is created along emulsion 800. Eye 430 is shown in FIG. 8 as dashed lines, indicating that it is absent during manufacture (exposure) of the device (emulsion).
After exposure, emulsion 800 is developed. A satisfactory development process is a curing process such as may be obtained with DuPont photopolymer, so that it is not necessary to remove emulsion 800 from within the glass or to soak it in conventional photographic film developer.
After development, emulsion 800 becomes a grating that will give rise to light rays entering the eye when illuminated with light source 700, so that light source 805 and lens 810 are no longer needed. Thus in actual usage, lens 810 and source 805 will be absent, having been present only for manufacture.
Preferably emulsion 800 will take the form of two thin lines, one above the other, to form the top and bottom of the rectangular aimer screen, and there will also be two lines up and down to form the left and right sides of the aimer screen.
During manufacture, a high power laser of good coherence length may be used, while use after manufacture may be with a lower power laser 700, or even a noncoherent (but still narrowband, of similar colour) light source of lesser coherence length.
The process of making the reflection grating 800 is similar to the process of making a Denisyuk reflection hologram. However, an alternative embodiment of the viewfinder may be constructed for use with an edge-lit process, or some other process.
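For the Denisyuk-style geometry, the spacing of the fringe planes recorded in emulsion 800 follows the standard two-beam interference relation; the laser wavelength and refractive index below are illustrative assumptions, not values from the disclosure:

```python
import math

def fringe_spacing_nm(wavelength_nm, refractive_index=1.5,
                      half_angle_deg=90.0):
    """Spacing of two-beam interference fringes inside a medium:
        Lambda = lambda / (2 n sin(theta)),
    where theta is the half-angle between the recording beams and n
    is the refractive index. theta = 90 degrees corresponds to the
    counter-propagating (Denisyuk reflection) geometry."""
    return wavelength_nm / (2.0 * refractive_index
                            * math.sin(math.radians(half_angle_deg)))

# e.g. a 532 nm recording laser in glass of index 1.5:
spacing = fringe_spacing_nm(532.0)   # ~177 nm between fringe planes
```

Such sub-wavelength fringe planes act as a wavelength-selective reflector, which is why a narrowband source of similar colour suffices for playback even if its coherence length is modest.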
FIG. 9 shows an alternate embodiment of the microphone system. This microphone system also includes the added feature of a wearable camera system suitable for augmented reality and mediated reality (as described at http://wearcam.org/mr.htm), as well as for more traditional forms of videography or audiovisual memory prosthesis or recall.
The wearable camera system serves double duty, functioning as both a head tracker for the aimer of the microphone beam and as a reality mediator. A camera 910, concealed within the nosebridge of the eyeglass frames 130, or a pair of cameras 910 concealed within the temple side pieces 110 of eyeglass frames 130, is connected to a head tracker 931 within a WearComp 999. The WearComp 999 is typically worn around the waist, in a shirt pocket, or comprises components spread throughout an undershirt, as described in http://wearcam.org/procieee.html.
The output of head tracker 931 is a scene description, which is supplied to a coordinate transformer 932. The coordinate transformer maps from the camera coordinates of camera 910, or camera pair 910, to eye coordinates. These eye coordinates are passed on to a graphics rendering system 933, and rendered in eye coordinates onto viewscreen 970. Again, viewscreen 970 is preferably embedded in lens 140 so that a light source also embedded in lens 140 can drive the vitrionic display of the invention.
A complete machine vision analysis of the scene, followed by a complete graphics rendering of the entire scene, is typically too formidable a task for a small battery powered WearComp 999.
Accordingly, in typical use of the invention, only a small fraction of the scene details are rendered. For example, in a speech recognizer, or in a wearable face recognizer, only a virtual name tag is rendered. Because of the coordinate transformer, this name tag appears to the wearer as if it were attached to the subject, even though there is a discrepancy between the location of camera 910 and the location of the wearer's eyeball. In the stereo version, two views are rendered, one for each eye, so that the name tag appears to hover in the same depth plane as the subject matter.
Light from viewscreen 970 is projected down through the glass to beamsplitter 960.
Some will pass through beamsplitter 960, and unfortunately some will be reflected outward where people might be able to see it. Because of the desire that the apparatus be unobtrusive (ideally invisible to others), beamsplitter 960 must be a polarizing beamsplitter, and the viewscreen 970 must be polarizing and oriented appropriately, to minimize light reflecting off the beamsplitter 960.
A curved concave mirrorlike surface 961 reflects some light back up to the beamsplitter 960 and into the eye, while providing magnification as concave mirrorlike surfaces do. This mirrorlike surface is disguised as part of a bulge 950 in the eyeglass lens. This bulge may be the actual prescription of the wearer, or may be simply a magnifying region useful to anyone who does not normally wear glasses but wishes to be able to do fine work. Alternatively, the bulge may be made so that it does not provide magnification, but simply looks like the inset lens of bifocal eyeglasses.
Mirrors 960 and 961 are partially silvered, and embedded in the glass in such a way as to appear as cut lines for inset bifocal lens 950. Although mirrors 960 and 961 only need to be about as wide as they are deep, they may be extended across much further than necessary, so that they will look like normal bifocal cut lines.
What is described in Fig. 9 is a reality mediator, in which there is a head tracker that also serves as the aimer. Microphone processor 935 initially directs what would be the maximal response at subject matter corresponding to the subject matter imaged at the center of viewscreen 970, were the wearer and subject matter both in an anechoic room. However, subsequent processing cancels echoes produced by room acoustics.
A model of the room acoustics is constructed for this purpose. The room acoustics model is responsive to the head tracker, so that the directions from which various echoes come are updated by way of the user's rotation of his or her head.
FIG. 10 shows an aimer comprising an eye tracker built into an EyeTap (TM) reality mediator, in eyeglasses having eight microphones. Only four of the eight microphones are shown, since only the left half of the eyeglasses is shown.
Microphones 120 are in various orientations including forward pointing, downward pointing, and upward pointing. A microphone 120r is also rearward pointing. All eight microphones have separate connections to a microphone processor 1099.
The eye tracker serves double duty as both the aimer and the focus and vergence controller. Eyetracker assembly 1010 (comprising camera and infrared light sources) illuminates and observes the eyeball by way of rays of light 1011 that partially reflect off beamsplitter 1020. Beamsplitter 1020 also allows the wearer to see straight through to mirror 1021 and thus see virtual light from viewfinder 1034. The eyetracker 1010 reports the direction of eye gaze and conveys this information as a signal 1012 to eye tracker processor 1030, which converts this direction into "X" and "Y" coordinates that correspond to the screen coordinates of viewfinder screen 1034. These "X"
and "Y" coordinates, which are expressed as signal 1031, indicate where on the viewfinder screen 1034 the wearer is looking. The eyetracker processor 1030 also supplies control signals to microphone processor 1099. Signal 1031 and the video output 1032 of camera 1073 are both passed to focus analyzer 1040. Focus analyzer 1040 selects a portion of the video signal 1032 in the neighbourhood around the coordinates specified by signal 1031. This neighbourhood is where the wearer is looking, and is assumed to be where the audiovisual attention should be focused. In this way, focus analyzer 1040 ignores video except in the vicinity of where the wearer of the apparatus is looking. Because the coordinates of the camera match the coordinates of the display (by way of the virtual light principle), the portion of video analyzed by focus analyzer 1040 corresponds to where the wearer is looking. The focus analyzer 1040 examines the high-frequency content of the video in the neighbourhood of where the wearer is looking, to derive an estimate of how well focused that portion of the picture is. This degree of focus is conveyed by way of focus sharpness signal 1041 to focus controller 1050, which drives, by way of focus signal 1051, the servo mechanism 1072 of camera 1073. Focus controller 1050 is such that it causes the servo mechanism 1072 to hunt around until its sharpness signal 1041 reaches a global or local maximum.
The focus analyzer 1040 and focus controller 1050 thus create a feedback control system around camera 1073, so that it tends to focus on whatever object(s) is (are) in the vicinity of camera and screen coordinates 1031. Thus camera 1073 acts as an automatic focus camera, but instead of always focusing on whatever is in the center of its viewfinder, it focuses on whatever is being looked at by the left eye of the wearer.
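The sharpness-driven focus loop described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation: the Laplacian-based sharpness measure, the exhaustive sweep, and all function names are assumptions for illustration only.

```python
import numpy as np

def sharpness(patch):
    """Estimate focus quality of an image patch from its high-frequency
    content, here the mean squared response of a discrete Laplacian
    (sharper edges -> stronger high frequencies -> larger value)."""
    p = patch.astype(float)
    lap = (-4.0 * p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
           + p[1:-1, :-2] + p[1:-1, 2:])
    return float(np.mean(lap ** 2))

def hunt_focus(measure, lo, hi, steps=40):
    """Sweep the focus setting over [lo, hi] and keep the position where
    the sharpness measure peaks, mimicking a servo that 'hunts around'
    until the sharpness signal reaches a maximum."""
    best_pos, best_val = lo, measure(lo)
    for pos in np.linspace(lo, hi, steps):
        val = measure(pos)
        if val > best_val:
            best_pos, best_val = pos, val
    return best_pos
```

A real controller would hill-climb incrementally rather than sweep the whole range, but the feedback principle, driving the servo toward higher sharpness, is the same.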
In addition to driving the focus of the left camera 1073, focus controller 1050 also provides a control voltage 1052 identical to control voltage 1051.
Control signal 1052 drives servo mechanism 1078 of lens 1079, so that the apparent depth of the entire screen 1034 appears focused at the same depth as whatever object the wearer is looking at. In this way, all objects in the viewfinder appear in the depth plane of the one the wearer is looking at.
Focus controller 1050 provides further control voltages, 1053 and 1054, for the right eye camera and right eye viewfinder, where these signals 1053 and 1054 are identical to that of 1051. Moreover, focus controller 1050 provides the same control voltage to the vergence controller 1094 so that it can provide the control signal to angle the left and right assemblies inward by the correct amount, so that all focus and vergence controls are based on the depth of the object the left eye is looking at.
It is assumed left and right eyes are looking at the same object, as is normal for any properly functioning human visual system.
In other embodiments of the invention, it may be desired to know which object is of interest when there are multiple objects in the same direction of gaze, as might happen when the wearer is looking through a dirty glass window. In this case there are three possible objects of interest: the object beyond the window, the object reflected in the glass, and the dirt on the window. All three may be at different depth planes but in the same gaze direction.
An embodiment of the wearable system with a human-driven autofocus camera (e.g. driven by eye focus) could be made from an eye tracker that measures the focus of the wearer's left eye. Preferably, however, two eyetrackers may be used, one on the left eye and one on the right eye, in order to track each eye independently and obtain a better estimate of the desired focus by way of the vergence of the wearer's eyes.
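As a rough illustration of how vergence could yield a depth estimate (the symmetric-fixation geometry and function name below are assumptions for illustration, not taken from the disclosure): the two eyes and the fixation point form an isosceles triangle whose base is the interpupillary distance, so depth follows directly from the vergence angle.

```python
import math

def depth_from_vergence(ipd_m, vergence_rad):
    """Fixation depth (metres) from the vergence angle between the two
    eyes' gaze directions, assuming symmetric fixation straight ahead:
    tan(vergence / 2) = (ipd / 2) / depth."""
    return (ipd_m / 2.0) / math.tan(vergence_rad / 2.0)
```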
In operation, this embodiment of the invention is quite simple. Consider the situation where the wearer is trying to capture an investigative documentary in which the wearer is talking to two corrupt security guards. If it were not for the eye tracker, the sound quality might be impaired by the fact that the wearer, wanting to place both guards in the field of view, would have neither of them centered in the field of view. This would confuse the first stage of the beamforming microphone processor.
However, with the embodiment of the invention depicted in Fig. 10, the wearer can place both guards within the field of view of the camera 1073, and direct the attention of both the microphone processor and the camera focus back and forth between the two guards, as they take turns explaining to the wearer why video surveillance cameras have helped to reduce crime.
FIG. 11 shows twenty examples of aimer screens, arranged four across and five down. The upper left example corresponds to a display raster viewfinder in which an xterm (a text screen) fills the entire display field and matches the field of coverage of the microphone system, insofar as this is the field of coverage that the beamforming system will scan. The next example to the right of the xterm is one in which nine circles of confusion (out of focus light sources) form a perceived crosshair.
The upper rightmost viewframe corresponds to an embodiment of the invention in which two rows of five mirror fragments are embedded in the lens material of the eyeglasses to reflect light from a single point source into an eye of the wearer of the glasses, creating a total of ten apparent light sources that provide a clearly defined boundary in which the wearer can easily imagine a rectangular frame of the microphone beamforming scanning region.
Some of the other examples of viewframes depicted in this figure include aimer frames that have "L" shaped or rounded shapes to define the four corners of the rectangular field of beam scanning. Others are simply crosshairs in which the length and width of the crosshairs provide the wearer with an awareness of the rectangular field of interest.
The four examples along the bottom row are examples of rough shapes that are formed on the retina by shining a point source of light at diffraction gratings (e.g. scratchings on or in the eyeglass lens material), which also convey a sense of a boundary.
Some viewframes, like the one in the lower right hand corner of the figure, are quite a bit more intense than others, so in these cases the amount of light needs to be reduced (and in fact controlled more carefully to balance the amount of light in the scene) so that the viewframe does not obliterate the subject matter when it is too intense. Likewise, the intensity of the light must be sufficient that the viewframe does not fade from visibility when the scene brightness is much higher. It is for this reason, especially when using some of the more dense viewframes like the one in the lower right hand corner of the figure, that ABC (automatic brightness control) of the aimer is so important to an optimal embodiment of the listen direction aiming and compositional apparatus of the invention.
The aimer screens of the invention are often not visible in their entirety, but, rather, are partially visible depending on where the eye is pointing. Thus when the wearer of the glasses looks up toward the upper left hand corner of the viewframe, this corner of the viewframe becomes visible but the other portions of the viewframe may not necessarily be visible. Nevertheless, the aimer screen in its entirety can be imagined, and it forms a basis for composition in the sense that as the wearer looks around the aimer screen, objects entering or leaving the region over which the beamformer will scan can be readily discerned.
FIG. 12a shows the camera based aimer and why it is desirable that it direct the microphone processor to adjust processing in response to, for example, the turning of the head of the wearer of the apparatus. It is assumed that wearer 1290 and sound source 1200 will both move slowly relative to the ability of the sound processor to keep updating any changes in the model of the room acoustics or the like.
However, it is also assumed that rapid rotation of the wearer's head 1291 may outpace the processing capability of WearComp 999, or require a larger more capable WearComp 999, or use more of its limited resources. Accordingly, especially insofar as WearComp 999 may already be part of a reality mediator system that includes a camera based headtracker anyway, it is prudent to use this information also to update the room model. Camera 910, which forms part of wearer 1290's computer mediated reality system, can therefore also be used to directly affect the processing of acoustic signals.
Camera 910 is depicted as head mounted, for clarity, while it will be understood that camera 910 would preferably be concealed inside eyeglasses, or be an EyeTap (TM) system in which the eye itself is, in effect, the camera 910.
Direct sound 1201 from source 1200 enters microphones 120. Indirect sound 1211, from sound 1210 heading toward the walls of the room, also enters microphones 120. A rear-facing microphone 120r receives primarily indirect sound and is used by processors or processes in WearComp 999 to enhance the direct sound 1201 while attenuating indirect sound 1211. A relatively complicated acoustic model is created by WearComp 999 to account for the multiple reflections and the processing for microphones 120 and 120r. Changes in head orientation are determined from camera 910 using the VideoOrbits methodology of http://wearcam.org/orbits, also described in the lead article of Proceedings of the IEEE, Vol. 86, No. 11.
These changes in head orientation adjust the listen direction of the microphone array, so that it matches the look direction change determined by camera 910.
Alternatively, a face recognizer may also home in on the sound source 1200.
However, in the preferred embodiment, the VideoOrbits method is used.
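One way a head-orientation update could be mapped onto the array is to recompute the far-field steering delays for the new look direction in head-fixed coordinates. The sketch below is illustrative only: the two-dimensional geometry, the function names, and the whole treatment are assumptions, not the VideoOrbits method itself.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # metres per second, approximate

def steering_delays(mic_positions, look_azimuth_rad):
    """Per-microphone delays (seconds) that time-align a far-field plane
    wave arriving from look_azimuth_rad (0 = straight ahead).
    mic_positions is an (N, 2) array in head-fixed metres."""
    direction = np.array([np.sin(look_azimuth_rad), np.cos(look_azimuth_rad)])
    # A mic farther along the arrival direction hears the wavefront sooner.
    lead = mic_positions @ direction / SPEED_OF_SOUND
    return lead.max() - lead  # non-negative delays that align all channels

def azimuth_after_head_turn(target_azimuth_rad, head_yaw_rad):
    """When the head (and the array fixed to it) yaws by head_yaw_rad,
    a stationary target's azimuth in head coordinates shifts oppositely."""
    return target_azimuth_rad - head_yaw_rad
```

When the head tracker reports a yaw, the target azimuth is updated and the delays recomputed, so the listen direction stays on the target without reacquiring it acoustically.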
Additionally, a microphone 1212 worn close to the mouth of wearer 1290, captures the wearer's own voice much more strongly than that of the sound source 1200.
A signal from microphone 1212 is thus used by processor 999 to determine, by way of comparison to signals from the other microphones, when the sound is coming from the wearer. Although the sound from the wearer's voice may be stronger than that of sound source 1200 on all microphones, the microphone 1212 may also be used to null out the wearer's own voice, in the same way that microphone 120r, which primarily receives unwanted echoes, helps improve sound from microphones 120.
Thus a signal that is only weakly responsive to the wearer's own voice may be obtained. A separate signal that is strongly responsive to the wearer's voice may also be obtained. These separate signals may be processed separately if, for example, processor 999 is to begin to understand a conversation between the wearer and another person. Each of these two signals may, of course, be used to help steer a null at the other, and therefore purify itself further. These two separate sides of a conversation may be transmitted over a standard video channel, as the left and right sides of the stereo sound subcarrier portion of the video channel. Therefore, by way of antenna 1280, a remote entity, which may be a computer program or another person, may assist wearer 1290, either by way of speech for hearing in headset 1250, or a display of information in eyeglasses 1233.
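One conventional way to realize such nulling of the wearer's own voice is an adaptive canceller: the mouth microphone serves as a reference, and a normalised LMS filter subtracts whatever part of the array output it can predict from that reference. This is a generic NLMS sketch under assumed names and parameters, not the processor 999 design itself.

```python
import numpy as np

def nlms_cancel(primary, reference, taps=16, mu=0.2, eps=1e-8):
    """Normalised LMS: adaptively subtract the component of `primary`
    predictable from `reference` (e.g. the wearer's voice picked up by
    the mouth microphone), leaving the external sound in the residual."""
    w = np.zeros(taps)            # adaptive filter weights
    buf = np.zeros(taps)          # recent reference samples, newest first
    out = np.zeros(len(primary))
    for n in range(len(primary)):
        buf = np.concatenate(([reference[n]], buf[:-1]))
        estimate = w @ buf                 # predicted own-voice leakage
        residual = primary[n] - estimate   # external sound plus error
        w += mu * residual * buf / (buf @ buf + eps)
        out[n] = residual
    return out
```

The same structure, with microphone 120r as the reference, corresponds to the echo-suppression role described for the rear-facing microphone.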
Both the beamforming and null steering are of course responsive to the aimer, comprised of the VideoOrbits head tracker applied to the signal from camera 910.
FIG. 12b is a block diagram of the camera based aimer of Fig. 12a. Head tracker 1220 is responsive to an input from camera 910. Camera 910 and microphones 120 are fixed with respect to one another. Therefore, when camera 910 turns (with the wearer's head), so do microphones 120. Moreover, microphones 120 maintain the same spatial relationship with one another. Initially, room modeler 1230 has no effect, and models the room as an anechoic room, so that microphone processor 1240 accepts input from a plurality of microphones 120 and processes these inputs by delay, etc. (what would be optimal processing in an anechoic chamber), to obtain signals 1241 for the wearer's speech and 1242 for the sound source 1200. These signals are initially processed assuming that sound source 1200 is visually sighted and aligned directly in the center of wearer 1290's aimer. After an initial "lock on" of the source 1200, the room modeler 1230 optimizes for this locked on source 1200.
Then, when head tracker 1220 detects a rotation of the head, a direction of arrival update signal, DOA, is sent immediately to microphone processor 1240 to track the target sound source 1200. At the same time, head tracker 1220 also updates the room model of room modeler 1230. In one preferred embodiment, head tracker 1220 does not update the room model in the room modeler 1230, but instead simply informs the room modeler 1230 that the wearer's head has rotated substantially, such that the model should be considered stale and therefore should be rebuilt.
This preferred embodiment is suitable for investigative documentary of multiple subjects, such as capturing a conversation between two corrupt security guards or the like. Such discussions often take place standing still. The wearer stands at one place in a room, and two or more guards stand apart, facing the wearer. As the wearer turns to a first guard, the wearer centers the first guard in the aimer, and presses a small "lock on" button concealed in his or her jacket. This target sound source remains locked for a portion of the conversation until a second guard begins to speak. Then the wearer turns his or her head to the other guard, and the head tracker 1220 senses this abrupt turning of the head, and informs the room modeler and microphone processor to update. Since it is customary for multiple speakers to let each other speak (e.g. it is unusual for two or more people to speak at once), the system can adapt to (learn) the two sides of the conversation, as well as a third side, namely that of the wearer, by way of microphone 1212. Thus the system can output three signals, 1241, 1242, and 1243, where signal 1241 is the wearer's voice, 1242 is the first guard's, and 1243 is the second guard's.
Since the first and second guard are not normally speaking at the same time, the system can simply switch an output between 1242 and 1243 as a crude but functional embodiment of the conversation separation aspect of the invention. This allows a three track recording system to capture separately the three sides of the conversation.
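Such a crude switch could be realized by comparing short-term energy in the two beam outputs and passing through whichever is currently louder. The frame size and names below are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def switch_output(beam_a, beam_b, frame=256):
    """Crude conversation separation: frame by frame, output whichever
    beamformer signal carries more energy, on the assumption that only
    one speaker talks at a time."""
    n = min(len(beam_a), len(beam_b))
    out = np.zeros(n)
    for start in range(0, n - frame + 1, frame):
        sl = slice(start, start + frame)
        a, b = beam_a[sl], beam_b[sl]
        out[sl] = a if np.sum(a * a) >= np.sum(b * b) else b
    return out
```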
A speech recognizer running on WearComp 999, or offline, can then produce an annotated transcript of the conversation and synchronize this with notes typed by wearer 1290 on a one handed keyboard, such as that manufactured by Handykey Corporation (www.handykey.com), concealed in a pocket. Thus a documentary video can be captured in which conversational elements are separated for easier indexing and annotation.
Moreover, such a system is of great benefit to one with memory disability, with regard to the photographic memory recall possible.
From the foregoing description, it will thus be evident that the present invention provides a design for a wearable camera with a viewfinder. As various changes can be made in the above embodiments and operating methods without departing from the spirit or scope of the invention, it is intended that all matter contained in the above description or shown in the accompanying drawings should be interpreted as illustrative and not in a limiting sense.
Variations or modifications to the design and construction of this invention, within the scope of the invention, may occur to those skilled in the art upon reviewing the disclosure herein. Such variations or modifications, if within the spirit of this invention, are intended to be encompassed within the scope of any claims to patent protection issuing upon this invention.
Patent Application
of
W. Steve G. Mann
for
LOOK DIRECTION MICROPHONE SYSTEM WITH VISUAL AIMING AID
of which the following is a specification:
FIELD OF THE INVENTION
The present invention pertains generally to an apparatus for use as a hearing aid, audio memory recall system, or the like.
BACKGROUND OF THE INVENTION
In U.S. Pat. No. 6002776, Directional acoustic signal processor and method therefor, December 14, 1999, Neal Ashok Bhadkamkar and John-Thomas Ngo describe a two stage directional microphone system, in which a first stage attempts to extract sources of signals as if the acoustic environment were anechoic, and a second stage removes any residual crosstalk between the channels, which may be caused by echoes and reverberation.
In "Advanced Beamforming Concepts: Source Localization Using the Bispectrum, Gabor Transform, Wigner-Ville Distribution, and Nonstationary Signal Representations", in Proceedings of the 25th Asilomar Conference on Signals, Systems, and Computers, vol. 2, 818-824, 1991, Jeffery C. Allen provides a good overview of beamforming, the Wigner distribution, and the like.
In "Signal Processing for a Cocktail Party Effect", Journal of the Acoustical Society of America 50(2), pp. 656-660, 1971, O.M. Mracek Mitchell, Carolyn A. Ross, and G.H. Yates provide a good overview of the so-called "Cocktail Party Effect", in which a human listener can focus attention onto any one desired sound source, amid a confusing plurality of sound sources, when the human listener is actually present at an event such as a cocktail party or the like. This ability is often lost or greatly reduced when a human listener hears only a recording of the event, or listens to the event through a hearing aid or other similar device.
Numerous other articles describe time-delayed correlations and the like.
A boom mic, located directly in front of each speaker's mouth, each transmitting on a unique channel, would facilitate listening by the hearing impaired, in that the listener could switch from one channel to another to hear different individuals. This situation, however, requires that all, or at least some, participants wear microphones.
While suitable in a lecture hall where a hearing impaired student might request that the professor wear a microphone, this situation is unrealistic in most normal day-to-day situations.
Microphone-array processing methods have been in existence for many years. Using multiple microphones with spatial separation, a beamforming effect can create a strong "listen direction" by separately delaying the outputs of each microphone relative to one another. These delays are chosen to remove differences in time of flight of sounds arriving from the listen direction (for simplicity, a "far field" model is often assumed, in which sound sources are modeled as producing planar sound waves).
Signals arriving from the listen direction are aligned in time, and add constructively, to produce a strong output, whereas signals arriving from other directions do not add as constructively, and thus produce a weaker output, owing to the fact that they are temporally misregistered (misaligned).
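The delay-and-sum principle just described can be shown in a minimal sketch (whole-sample delays only, hypothetical names; practical systems use fractional delays via interpolation or frequency-domain phase shifts):

```python
import numpy as np

def delay_and_sum(channels, delays_samples):
    """Delay-and-sum beamformer: advance each microphone channel by its
    steering delay (whole samples, for simplicity) and average, so that
    sound from the listen direction adds coherently while off-axis sound
    stays misaligned and partially cancels."""
    n = min(len(c) - d for c, d in zip(channels, delays_samples))
    aligned = [np.asarray(c, dtype=float)[d:d + n]
               for c, d in zip(channels, delays_samples)]
    return np.mean(aligned, axis=0)
```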
Beamforming was used in a directional hearing aid invented by Widrow and Brearley.
Numerous other beamforming microphone systems have been proposed, many that take into account multiple echoes and reverberations arising from a typical indoor environment, or the like.
In photography (and in movie and video production), it is desirable to capture events in a natural manner with minimal disruption and disturbance. Current state-of-the-art photographic or video apparatus, even in its most simple form, creates a disturbance to others and attracts considerable attention on account of the gesture of bringing the camera up to the eye, especially if a large microphone boom is used by an additional person who is responsible for sound quality. Even if the size of the camera could be reduced to the point of being negligible (e.g. no bigger than the eyecup of a typical camera viewfinder), the very gesture of holding a device up to, or bringing a device up to, the eye is unnatural and attracts considerable attention, especially in establishments such as gambling casinos or department stores where photography is often prohibited. Moreover, if good quality sound is desired, a large cumbersome directional microphone is needed, perhaps with a sound technician or sound crew, or the participants need to wear microphones. Although there exist a variety of covert video cameras, such as a camera concealed beneath the jewel of a necktie clip, cameras concealed in baseball caps, and cameras concealed in eyeglasses, these cameras tend to produce inferior sound quality, not just because of the technical limitations imposed by their small size, but, more importantly, because they lack a means of accurately aiming both the camera and the microphone systems.
Because of this lack of accurate aiming, investigative video and photojournalism made with such cameras suffers from poor composition and sound quality.
SUMMARY OF THE INVENTION
In one aspect, this invention can be used to provide a hearing aid, where the user can aim the hearing aid by way of a visual aiming aid comprised of a viewscreen, so that the hearing direction of the hearing aid corresponds to the look direction of the wearer of the hearing aid when the wearer is looking at the desired sound source through the viewscreen.
In another aspect, a hearing aid is provided where the visual aiming aid comprises an eyetracker, so that the hearing direction corresponds to the look direction regardless of whether or not the wearer is looking at the desired sound source through any such viewscreen.
In another embodiment, a wearable camera system with viewfinder provides the visual hearing direction aiming aid, so that the system can be used as an audiovideographic memory recall system, or the like.
In another aspect of this invention there is provided a wearable eyeglass based device allowing the wearer to view data, such as from the screen of a wearable computer (WearComp) system, and use the raster of the wearable computer display screen as a hearing direction aiming aid.
In another aspect of this invention there is provided a wearable eyeglass based device allowing the wearer to view electronically stored pictures captured along with speech recordings from the individual so pictured, such that the wearer can remember faces and names, and remember how to pronounce a person's name, by listening to a flashback of that person pronouncing his or her name.
In another aspect this invention provides a method of simultaneously aiming a camera and a microphone system listen direction in which both hands may be left free, and in which the direction of the camera and the listen direction of the microphone system are clearly indicated to the wearer of the apparatus of the invention by means of some marking that appears as if it were superimposed on the real subject matter within the scene.
In another aspect, this invention provides a user with a visual information display embedded in a clear transparent material, where the user can see the display while simultaneously looking through the clear transparent material, and where the user can use the display of this visual information as a listen direction aiming aid.
One of the intended uses of the invention is for a wearable camera for capturing video of exceptionally high sound quality. The device need not necessarily be covert.
In fact, it may be manufactured as a fashionable device that serves as both a visible crime deterrent, as well as a self-explanatory (through its overt obviousness) tool for documentary videomakers and photojournalists.
There are several reasons why it might be desired to wear a videographic memory aid over a sustained period of time:
1. There is the notion of a personal visual diary of sorts.
2. There is the idea of being always ready. By constantly recording into a circular buffer, a retroactive record function, such as a button that instructs the device to "begin recording from five minutes ago" may be useful in personal safety (crime reduction) as well as in ordinary everyday usage, such as remembering how to pronounce a person's name by listening several times to that person pronouncing their own name.
3. There is the fact that the wearable videographic memory recall system, after being worn for a long period of time, begins to behave as a true extension of the wearer's mind and body. As a result, the composition of video shot with the device is often impeccable without even the need for conscious thought or effort on the part of the user. Also, one can engage in other activities, and one is able to record the experience without the need to be encumbered by a camera, or even the need to remain aware, at a conscious level, of the camera's existence.
This lack of the need for conscious thought or effort suggests a new genre of documentary video characterized by long-term psychophysical adaptation to the device. The result is a very natural first-person perspective documentary, whose artistic style is very much as if a recording could be made from a video tap of the optic nerve of the eye itself, together with a tap into the ears of the wearer. Events that may be so recorded include involvement in activities, such as a cocktail party, that cannot normally be well recorded unobtrusively from a first-person perspective using cameras of the prior art. Moreover, a very natural personal memory device results, in which the process and use of the device is much more like one's own memory than like a separate recording device.
4. A computational system, either built into the wearable system, or worn on the body elsewhere and connected to the system, may be used to enhance sound. This may be of value to the hearing impaired. The computer may also perform other tasks such as sound recognition, such that it may be of value to those who are completely deaf. A speech recognizer, combined with a listen direction aiming aid, would allow a deaf person to aim the apparatus in the desired listen direction. Moreover, by combining the listen direction aiming aid with an output device, such as a wearable computer screen, the system functions as a true extension of the mind and body. For example, an eyetracker that observes where the wearer is looking at a wearable text screen focuses the microphone listen direction there, under the realistic assumption that the wearer is looking at the person he or she is trying to listen to. Because the device is worn constantly, it may also function as a photographic/videographic memory prosthetic, e.g. to help those with Alzheimer's disease (a degenerative disease in which mild forgetfulness progresses to severe memory loss), or the like, recall and listen to previously captured material.
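The retroactive record function of item 2 above (the "begin recording from five minutes ago" button) is naturally served by a circular buffer. The following is a small illustrative sketch, with hypothetical class and method names.

```python
import numpy as np

class RetroactiveRecorder:
    """Keep only the most recent `capacity` samples; save() returns
    whatever is still in the buffer, oldest first, which is the
    'record from a few minutes ago' behaviour."""
    def __init__(self, capacity):
        self.buf = np.zeros(capacity)
        self.capacity = capacity
        self.write = 0       # next slot to overwrite
        self.filled = 0      # how many valid samples the buffer holds

    def push(self, samples):
        for x in samples:
            self.buf[self.write] = x
            self.write = (self.write + 1) % self.capacity
            self.filled = min(self.filled + 1, self.capacity)

    def save(self):
        if self.filled < self.capacity:
            return self.buf[:self.filled].copy()
        # Buffer has wrapped: the oldest sample sits at the write pointer.
        return np.concatenate([self.buf[self.write:], self.buf[:self.write]])
```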
It is possible with this invention to provide a means for a user to experience additional information overlaid on top of his or her audio perception such that the information is relevant to the imagery being viewed by a camera included in the system.
The invention facilitates a new form of visual art, in which the artist may capture, with relatively little effort, a personal experience as experienced from his or her own perspective. With some practice, it is possible to develop a very steady body posture and mode of movement that best produces memories of the genre pertaining to this invention. Because the apparatus may be made lightweight and situated close to the head, there is not the protrusion associated with carrying a hand-held camera and microphone. Also, because components of the apparatus of the invention are mounted very close to the head, in a manner that balances the weight distribution, the apparatus does not restrict the wearer's head movement or encumber the wearer appreciably. Mounting close to the head minimizes the moment of inertia about the rotational axis of the neck, so that the head can be turned quickly while wearing the apparatus. This arrangement allows one to easily record the experiences of ordinary day-to-day activities from a first-person perspective. Moreover, because both hands are free, much better balance and posture is possible while using the apparatus.
Anyone skilled in the arts of body movement control as is learned in the martial arts such as karate, as well as in dance, most notably ballet, will have little difficulty capturing exceptionally high quality memories using the apparatus of this invention.
With known video or movie cameras, the best operators tend to be very large people who have trained for many years in the art of smooth control of the cumbersome video or motion picture film cameras used, together with a sound technician who has also trained in the art of controlling a microphone boom or the like. In addition to requiring very large people to optimally operate such production systems, various stabilization devices are often used, which make the apparatus even more cumbersome.
The apparatus of the invention may be optimally operated by people of any size.
Even young children can become quite proficient in the use of the wearable memory capture system.
A typical embodiment of the invention comprises one or two spatial light modulators or other display means built into a pair of bifocal eyeglasses or reading glasses, together with a microphone array. Sometimes one or more CCD (charge coupled device) image sensor arrays and appropriate optical elements are also included to provide a camera function. In the bifocal embodiment of the invention, typically a beamsplitter or a mirror silvered on both sides is used to combine the image of the aiming aid with the apparent position of the listen direction, or to display the extent of coverage of the camera. The beamsplitter is typically embedded into the eyeglass lens, and is made much wider than it needs to be, so that it appears to be a cut line in the eyeglass lens typical of bifocal eyeglasses. The aiming aid may comprise one or more of:
• A reticle, graticule, rectangle, or other marking that appears to float within a portion of the field of view;
• A display device that shows a video image, or some other dynamic information perhaps related to the video image coming from the camera;
• An eye tracker that determines where the wearer is looking.
The invention allows a person to wear the apparatus continuously and therefore always end up with the ability to produce a memory from something that was experienced a couple of minutes ago. This may be useful to everyone in the sense that we may not want to miss a great memory opportunity, and often great memory opportunities only become known to us after we have had time to think about something we previously experienced.
Such an apparatus might also be of use in personal safety. Although there are a growing number of video surveillance cameras installed in the environment allegedly for "public safety", there have been recent questions as to the true benefit of such centralized surveillance infrastructures. Most notably, there have been several examples in which such centralized infrastructure has been abused by the owners of it (as in roundups, detainment, and execution of peaceful demonstrators). Moreover, "public safety" systems may fail to protect individuals against crimes committed by the organizations that installed the systems. The apparatus of this invention allows the storage and retrieval of memories by transmitting and recording them at one or more remote locations. Memories may be transmitted and recorded in different countries, so that they would be difficult for the perpetrator of a crime to destroy, in the event that the perpetrator of a crime might wish to do so through the political machinery of a single country, or even through murder of the person who experienced the event.
Thus a shared memory system helps to prevent the destruction of an individual's own memory, and may even prevent so-called "hear no evil, see no evil, speak no evil" murders.
The apparatus of the invention allows memories to be captured in a natural manner, without giving an unusual appearance to others (such as a potential killer or human rights violator).
Vitrionics provides an important aspect of the invention, namely the use of light sources embedded directly in glass, plastic, or the like, that a user can look through.
For example, a very small light emitting diode die (having a size that is on the order of the size of a grain of sand) embedded within a sheet of clear glass, when held close to the eye, will be sufficiently out of focus that it will not be seen by the wearer, or by other people observing the wearer, when it is turned off. Such a light source will behave like a grain of sand in the glass, and be relatively imperceptible. When switched on, it will appear to the wearer as an out-of-focus point, and the wearer will see a large circular blob of light, rather than a point of light. This point source of light may be made directional so that only the wearer can see it. The apparent direction of this blob of light may be used by the wearer to aim a directional microphone system.
The size of the blob can be used to remind the wearer of the field of response of the microphone array.
Within this circular blob, the wearer can see sharply defined markings placed on or embedded in the glass between the light source and the eye. A display in which material is seen sharply or somewhat sharply within the blurry appearance of a light source is called a blurry display or blurry information display. In many ways, a conventional in-focus display is to a blurry information display what a conventional camera with lens is to a pinhole camera. Blurry information displays typically have infinite depth of field, in the sense that as long as the eye cannot focus on the light source, the subject matter is visible in equally sharp focus or detail regardless of where the eye is focused.
A display in which there is at least one light source embedded in a transparent or partially transparent material through which the user of the display looks at other objects is called a vitrionic display. A vitrionic display may also be a blurry display, if, for example, the light source is a point source of light in the field of view of the user and is too close to the user's eye for the user to focus on, or if there are optics to ensure that the source of light is out of focus.
These kinds of displays may be used as aiming aids, for aiming microphone arrays included in the eyeglass frames, or elsewhere on the body.
SUMMARY OF THE INVENTION
Accordingly, the present invention in one aspect comprises an aimer for aiming a microphone array, and a processor for forming a beam whose listen direction corresponds to the look direction of the aimer.
According to another aspect of the invention, there is provided a separate microphone array and aimer housing, the relative orientation of which is known to the processor, and used to adjust the beamforming in accordance with the look direction.
According to another aspect of the invention, there is provided a symbiotic community of users each wearing the apparatus of the invention, and using each other's microphones to collectively beamform for at least one member of the community.
According to another aspect of the invention, there is provided a method of sound capture comprising the steps of sighting a desired target in the aimer, and locking onto the target with a beamforming system.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will now be described in more detail, by way of examples which in no way are meant to limit the scope of the invention, but, rather, these examples will serve to illustrate the invention with reference to the accompanying drawings, in which:
FIG. 1 illustrates the principle and components of the sound capture system fitted into typical flat-top bifocal eyeglasses.
FIG. 1a illustrates the invention using microphones pointed off axis.
FIG. 1b illustrates the microphone processor of the invention.
FIG. 1c illustrates separate microphone array and aimer.
FIG. 1d illustrates a community of users in symbiosis.
FIG. 2 illustrates an older style of bifocal eyeglasses in which the cut line runs along the entire width of the lens.
FIG. 3 illustrates the concealment of an LED inside an eyeglass lens, for use as a listen direction aiming aid.
FIG. 4 illustrates how the concealed LED can be used to center a sound source in the field of maximum sensitivity of a microphone array also concealed within the eyeglasses.
FIG. 5 shows an embodiment of the invention in which there are two LEDs concealed within an eyeglass lens, so that the left and right edges of a rectangular boundary of sound sensitivity can be made visible to the wearer of a microphone array.
FIG. 5a shows a top view, while FIG. 5b shows an inside view and FIG. 5c shows the view seen by the wearer when the apparatus is in operation.
FIG. 6 shows an embodiment of the wearable listen direction aiming aid invention in which the display contains four LEDs concealed in what appears to others like an ordinary eyeglass lens of ordinary trifocal construction.
FIG. 6a shows the inside view, and FIG. 6b shows the view as seen by the wearer, when the device is in operation.
FIG. 7a shows a vitrionic aimer based on reflected light.
FIG. 7b shows a vitrionic aimer based on reflected light from boundary points of aimer region.
FIG. 8 shows a manufacturing process for a vitrionic aimer.
FIG. 9 shows a camera-based head tracking aimer.
FIG. 10 shows an eye tracker aimer installed in a reality mediator.
FIG. 11 shows some examples of aimer viewscreens.
FIG. 12 shows a camera based aimer.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
While the invention shall now be described with reference to the preferred embodiments shown in the drawings, it should be understood that the intention is not to limit the invention only to the particular embodiments shown but rather to cover all alterations, modifications and equivalent arrangements possible within the scope of the appended claims.
In all aspects of the present invention, references to "camera" mean any device or collection of devices capable of simultaneously determining a quantity of light arriving from a plurality of directions and/or at a plurality of locations, or determining some other attribute of light arriving from a plurality of directions and/or at a plurality of locations. Similarly, references to "aiming aid" shall not be limited to just viewfinders or miniaturized television monitors or computer monitors, but shall also include computer data display means, as well as fixed display means, where such fixed display means include crosshairs, graticules, reticles, brackets, etc., and other video display devices, still picture display devices, ASCII text display devices, terminals, and systems that directly scan light onto the retina of the eye to form the perception of an image, as well as eye tracking devices.
References to "processor", or "computer" shall include sequential instruction, par-allel instruction, and special purpose architectures such as digital signal processing hardware, Field Programmable Gate Arrays (FPGAs), programmable logic devices, as well as analog signal processing devices.
References to "lens", in the context of eyeglasses, shall include the degenerate case of a "lens" of infinite focal length (zero reciprocal focal length which is often denoted as zero diopters of strength). Therefore, even the "pseudo intellectual" glasses sold strictly for fashion (e.g. having no corrective power) are considered to contain "lenses" .
References to "aimer" , refer to both passive as well as active aiming devices for aiming a microphone array at a desired sound source; such aimers may include active ultrasonic pointing devices, passive viewscreens, head trackers, and eyetrackers.
When it is said that object "A" is "borne" by object "B", this shall include the possibilities that A is attached to B, that A is part of B, that A is built into B, or that A is B.
FIG. 1 is a diagram depicting a pair of bifocal eyeglasses, containing an embodiment of the invention. Eyeglasses are normally held upon the head of the wearer by way of temple side pieces 110. Microphones 120 are concealed inside the temple side pieces, and face outward through holes in the temple side pieces. These holes have the appearance of ordinary pop rivets, or the like, as might ordinarily be expected to hold the temple side pieces onto the frames 130. Alternatively, the microphones, or microphone openings, may take on the appearance of other decorative or functional components of the eyeglasses. Within frames 130 are two lenses 140. Lenses 140 contain inset lenses 150 which may have different prescriptions than lenses 140 in which they are set. There is typically a cut line 160 between lens 140 and lens 150.
Typically lens 140 will provide a prescription for distant objects while inset lens 150 will provide a prescription for nearby objects. Lens 150 is commonly used for reading. Accordingly, lens 140 may, in some cases, have infinite focal length (zero power) and simply serve as a support for lens 150, which is typically a lens of positive focal length (e.g. a magnifying glass).
In some cases, lens 140 may be nonexistent, as in typical reading glasses, in which lens 150 is mounted directly to frame 130, and the wearer looks over frame 130 to see distant objects and looks through lens 150 to see nearby objects.
In many modern multifocal eyeglasses, there is no visible cut line 160, and instead there is a gradual transition from the prescription of lens 140 to that of lens 150. Such multifocal lenses are known as progressive. The purpose of such gradual transition is to accommodate a variety of distances, in situations where the wearer normally has an inability to focus over any appreciable range of distances, as well as for improved appearance. Since the need for multifocal lenses shows a deficiency on the part of the user, there has been a trend toward hiding this deficiency, just as there has been a trend toward contact lenses and laser eye treatments to eliminate the need for eyewear altogether. However, amid the desire among some to hide the fact that they need bifocals or even just ordinary glasses, there are others who like to wear eyeglasses.
Even some people who do not need eyeglasses often wear so-called pseudo-intellectual glasses, which are glasses in which lens 140 has infinite focal length.
Moreover, bifocal eyeglasses and reading glasses are often associated with intellectuals, and thus there is a portion of society that would readily wear glasses having the general appearance of those depicted in FIG. 1, even if they did not require a prescription of any kind.
The eyeglass lenses 140 may also contain markings 170 made directly on the glass.
Such markings, for example, may be the manufacturer's name or an abbreviation (such as the letters "GA" engraved on the left lens of Georgio Armani glasses). For illustration in this disclosure, the markings "L" and "R" denote left and right lenses, as labeled from the perspective of the wearer. Such markings will help make it clearer which lens and which side of the lens is being shown.
Vitrionics may be concealed in eyeglass lenses 140. Cut line 160 may be used to conceal vitrionics, or to mask the artifacts and regular structures of vitrionics and associated matter embedded in or on eyeglass lens 140.
Alternatively, frames 130 may be constructed of dark smoked plastic or the like, and be somewhat transparent but quite dark. In this way, vitrionics may be concealed in frames 130. This method of concealment is particularly useful when the eyeglasses are half glasses (reading glasses) in which the top portion of frames 130 passes directly in front of the central region of the eye of the wearer.
Concealment of vitrionics and associated optics allows an aimer to be built into eyeglass lens 140, so that the aimer can be used to aim the microphone array comprised of microphones 120. In this way, the wearer of the eyeglasses can look directly at another person, and use the aimer to orient the eyeglasses so that the other person is centered in the region for which a processor maximizes the sensitivity of the microphone array, in at least an initial guess for processing parameters related to outputs of microphones 120.
A drawback of the apparatus of FIG. 1 is the limited baseline of the microphones 120. The maximum spacing between microphones corresponds to the width of eyeglass frames 130, which is quite limited, especially in relation to lower frequency audio signals that might be of interest.
FIG. 1a is a diagram depicting a pair of bifocal eyeglasses, having at least some microphones 120 mounted off-axis, so that they do not necessarily point in the forward direction toward a probable source of sound to be captured. In FIG. 1a, microphones 120 have a directionality depicted by rays 121 emanating therefrom. Some of microphones 120 may be mounted so they face upward and capture echoes off the ceiling of a typical room. Although there will not always be a ceiling present (e.g. during outdoor use), there will be times when a ceiling is present, during which these upward facing microphones will assist. Other microphones may be aimed downward to capture echoes off the floor or ground. Additionally, some microphones may be mounted at the back of temple side pieces 110 to point directly away from the sound source of interest, and to help in null steering, to capture sound information from unwanted interference and help use this information to null out the interference therefrom.
Microphones that are upward or downward facing may be concealed behind the eyeglass frames.
FIG. 1b depicts the processing system for processing outputs from the microphones 120. In this example, there are depicted four microphones. Four microphones are often sufficient for many purposes (e.g. the well known documentary video ShootingBack of http://wearcam.org/shootingback was shot using four microphones followed by a process of delay and summation of outputs therefrom), but of course more may be added for further improvements in sound quality. For simplicity, the process will be described for four microphones, where it is understood that this number is arbitrary. Initially, a first step of applying a delay to the output of at least three of the four microphones is done. In the preferred embodiment, the delays are done with four delay units, D1, D2, D3, and D4, since using three delays would require switching them around. With four delays, one of them can be set to zero delay.
For simplicity, later in this disclosure, D1, D2, D3, and D4 will be referred to as delays D to denote the set of delay elements Dk, k ∈ {1, 2, 3, 4}, as well as for other numbers of delay elements not necessarily equal to four.
In the preferred embodiment, the delays are accomplished by moving samples for a discrete delay (to within the nearest sample) and by interpolation for subsample delays. An interpolation filter is designed using methods well known in the art.
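The integer-plus-fractional delay scheme described above, followed by summation, can be sketched as a minimal delay-and-sum beamformer. The windowed-sinc interpolator shown is one common textbook design; this disclosure does not specify the interpolation filter, so the function names and parameters below are illustrative assumptions.

```python
import numpy as np

def fractional_delay(x, delay_samples, num_taps=31):
    """Delay x by a possibly non-integer, non-negative number of samples:
    an integer sample shift plus a windowed-sinc interpolation filter
    for the subsample (fractional) part."""
    n = int(np.floor(delay_samples))
    frac = delay_samples - n
    taps = np.arange(num_taps) - (num_taps - 1) // 2
    h = np.sinc(taps - frac) * np.hamming(num_taps)  # fractional-delay FIR
    h /= h.sum()
    y = np.convolve(x, h, mode="same")
    # Integer part of the delay: shift right, keeping the original length.
    return np.concatenate([np.zeros(n), y])[: len(x)]

def delay_and_sum(channels, delays_samples):
    """Naive delay-and-sum beamformer: align each microphone channel to
    the look direction, then average."""
    aligned = [fractional_delay(c, d) for c, d in zip(channels, delays_samples)]
    return np.mean(aligned, axis=0)
```

In practice the per-channel delay values would come from the look direction sighted in the aimer and the known microphone geometry.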
The outputs of delays D1, D2, D3, and D4, if simply summed, would maximize the signal strength in the look direction as aimed by the aimer in lens 140, if both the person wearing the apparatus and the sound source (e.g. the other person speaking) were both in an anechoic chamber. However, due to multiple echoes and reverberations in a typical room, the outputs of delays D1, D2, D3, and D4 can be improved by further processing prior to summation.
A filter FIL is applied to these outputs to cancel echoes from each of the microphones, and to arrive at a best approximation to echo cancelled outputs.
Within FIL, echoes and reverberations entering the first microphone are removed, not only by way of the signal from this first microphone, but also by way of signals from the other microphones, which help in gathering information about these echoes.
Filter FIL also assists in localizing interference sources, for steering nulls thereto.
Additionally, other microphones, such as may be mounted facing away from the desired sound source, further help in this null steering.
A plurality of outputs from FIL, which may number more than, less than, or the same as the number of inputs to filter FIL, emerge, and are individually filtered by filters F1, F2, F3, and F4. Filters F1, F2, F3, and F4, at the very least, restore the low frequency components that are lost by FIL. Beamforming, null steering, and the like tend to give sound lacking in low frequency components (e.g. what the layperson might describe as "tinny" or "thin"). This lack of low frequency components is known to anyone in the art, or in fact, to many laypersons, and was portrayed in the movie "The Conversation" (Director Francis Ford Coppola), in which three microphones were used to capture sound, the sound being passed through delay (by way of manually starting and stopping three open reel tape recorders at different times), the output being summed through an audio mixer.
Therefore, at the very least, filters F1, F2, F3, and F4 boost low frequencies with respect to high frequencies. Preferably, filters F1, F2, F3, and F4 more accurately undo any undesirable effects of filter FIL, while leaving the signals free of echo and reverberation.
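One simple way to boost low frequencies with respect to high, as filters F1 through F4 must at minimum do, is a low-shelf biquad filter. The coefficient formulas below follow the widely used Robert Bristow-Johnson audio-EQ cookbook; the sample rate, corner frequency, and gain chosen are illustrative assumptions, not values given in this disclosure.

```python
import numpy as np

def low_shelf_biquad(fs, f0, gain_db, S=1.0):
    """Biquad low-shelf coefficients (b, a), normalized so a[0] = 1,
    following the RBJ audio-EQ cookbook formulas."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / 2.0 * np.sqrt((A + 1.0 / A) * (1.0 / S - 1.0) + 2.0)
    cosw = np.cos(w0)
    sqA2a = 2.0 * np.sqrt(A) * alpha
    b0 = A * ((A + 1) - (A - 1) * cosw + sqA2a)
    b1 = 2 * A * ((A - 1) - (A + 1) * cosw)
    b2 = A * ((A + 1) - (A - 1) * cosw - sqA2a)
    a0 = (A + 1) + (A - 1) * cosw + sqA2a
    a1 = -2 * ((A - 1) + (A + 1) * cosw)
    a2 = (A + 1) + (A - 1) * cosw - sqA2a
    return np.array([b0, b1, b2]) / a0, np.array([1.0, a1 / a0, a2 / a0])

# Example: +6 dB shelf below roughly 200 Hz at a 48 kHz sample rate.
b, a = low_shelf_biquad(48000, 200, 6.0)
```

The coefficients can be applied with any direct-form IIR routine (e.g. scipy.signal.lfilter(b, a, x)). By construction, the gain is exactly 10^(gain_db/20) at DC and exactly unity at the Nyquist frequency, which is the "boost low frequencies with respect to high" behaviour described above.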
Because microphones 120 are pointing in different directions, they will capture very different signal strengths. Rather than just average the delayed and processed signals of interest, or merely select the best desired signal of interest, a comparameterizer 190 combines the signals optimally, based on relative certainty.
The comparameterizer 190 processes the signals based on comparametric equations, to capture any nonlinearities in the signals processed thus far. The output of the comparameterizer comprises at least one clear sound signal, and preferably two clear sound signals, OL and OR, for supplying to left and right earphones respectively.
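Combining signals "based on relative certainty" can be illustrated by a certainty-weighted average, where each channel is weighted by an estimate of its reliability (inverse noise variance is one plausible choice). This is only a sketch: the comparametric equations themselves are not given in this passage, so the weighting rule and function name below are assumptions.

```python
import numpy as np

def certainty_weighted_combine(signals, certainties):
    """Combine several aligned estimates of the same sound signal,
    weighting each channel by its relative certainty (e.g. an inverse
    noise-variance estimate).  signals: (channels, samples) array."""
    s = np.asarray(signals, dtype=float)
    w = np.asarray(certainties, dtype=float)
    w = w / w.sum()                       # normalize to unit total weight
    return np.tensordot(w, s, axes=1)     # weighted sum across channels
```

A channel judged three times as certain thus contributes three times the weight, rather than being averaged equally or selected exclusively.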
Some or all of the processing components, such as:
• delays D1, D2, D3, and D4;
• filter FIL;
• filters F1, F2, F3, and F4;
• comparameterizer 190,
may be incorporated into a wearable processor 199. In the preferred embodiment, this wearable processor 199 is a WearComp as defined and described in the lead article of Proceedings of the IEEE, Vol. 86, No. 11, on Intelligent Signal Processing.
So far the microphones 120 have been fixed with respect to the aimer in lens 140.
Since the aimer and microphones are both held by frames 130 they cannot move with respect to one another. However, in other embodiments, some or all of the microphones are separate from the aimer.
FIG. 1c depicts an aimer separate from the microphone array. Microphones 120 are mounted in a chest plate 198 for wearing on the chest, and for being concealed under an acoustically transparent shirt. Chest plate 198 is ordinarily curved to fit the body, and may be flexible or rigid. In the preferred embodiment it is rigid so that the spatial relationship between microphones 120 is fixed. Chest plate 198 contains relative position and rotation sensors that will be called orienters.
Orienters 196 allow the position and rotation of chest plate 198 with respect to eyeglass frames 130 to be determined. Eyeglass frames 130 contain relative orienters 195. By way of communications link 110, processor 199 calculates the relative orientation between chest plate 198 and frames 130. Orienters 195 and 196 are connected to inputs of processor 199. Link 110 may be a radio link, infrared, or otherwise. In the preferred embodiment, link 110 is a wire concealed inside eyeglass safety straps. A satisfactory safety strap is that which is sold under the trade name Croakies (TM). In the preferred embodiment, link 110 also carries electrical power to the eyeglasses to operate an aimer comprised of a vitrionic display.
In the preferred embodiment, orienters 195 comprise three relative rotation sensors and a relative displacement sensor. Likewise for orienters 196, each being responsive to relative yaw, pitch, and roll, as well as a fourth sensor providing for the measurement of relative translation between chest plate 198 and frames 130.
The microphone placement on plate 198 can be regular, random, or structured (e.g. using the Golomb ruler principle to maximize the number of unique microphone spacing distances). Delays D are responsive to the relative orientation between frames 130 and plate 198, so that the initial guess for the beam look direction of microphones 120 is that which is sighted in the aimer incorporated into lens 140. The aimer provides a guess that would form an ideal beam in the look direction if both wearer and sound source were located in an anechoic chamber. However, to make up for typical room acoustics, a least mean squares algorithm, or the like, may then be used to refine this guess, and further echo cancellation may subsequently be applied.
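The initial-guess steering delays can be computed from the array geometry and the aimer's look direction under the far-field (plane-wave) assumption, for instance as follows. The coordinate convention, function name, and 343 m/s speed of sound are illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature

def steering_delays(mic_positions, look_direction, fs):
    """Far-field (plane-wave) steering delays, in samples, for the
    initial look-direction guess.  mic_positions: (N, 3) array of
    microphone coordinates in metres; look_direction: vector toward
    the sighted source in the same frame; fs: sample rate in Hz."""
    u = np.asarray(look_direction, dtype=float)
    u = u / np.linalg.norm(u)
    pos = np.asarray(mic_positions, dtype=float)
    # A microphone farther along u hears the plane wavefront earlier,
    # so it receives a larger compensating delay.
    adv = pos @ u / SPEED_OF_SOUND
    delays = adv - adv.min()   # non-negative delays, in seconds
    return delays * fs
```

These delays would seed delays D before any least-mean-squares refinement for the actual room acoustics.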
FIG. 1d depicts a community of users in which at least two users assist at least one of those users. Suppose that a first user wearing eyeglass frames 130 and a second user wearing eyeglass frames 130a both combine efforts to assist the first user.
Two sets of microphones produce signals that are transmitted by antennas 110 and 110a to receiver Rx. Receiver Rx accepts signals from at least one microphone on each user's apparatus, and preferably multiple microphones from each user.
These microphone signals are delayed by D1 and D2, assuming only one microphone from each user (noting that in the preferred embodiment there are more microphones from each user). The delayed microphone signals are processed by filter FIL, the output of which goes to comparameterizer COM, to produce a final signal audible by the first user.
A processor determines relative orientation between orienter 195 in the first user's frames 130 and orienter 195a in the second user's frames 130a. This orientation information is used for the initial conditions of a beamforming process, as started by the first user sighting a target sound source. Ordinarily, with only one user, a far field approximation (e.g. that the arriving sound waves are plane waves) is sufficient.
However, with multiple users, a nearfield model is preferable. Accordingly, the aimer preferably provides range information (distance from the first user to the target sound source of interest). This may be found by way of an ultrasonic range finder.
An ultrasonic rangefinder apparatus can also be used to beam ultrasonic waves at the target sound source, and model the room acoustics at ultrasonic frequencies.
This rough model can then help the model at audible frequencies.
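With a range estimate available, the near-field case replaces the plane-wave delays with spherical-wavefront delays computed from actual source-to-microphone distances. As before, this is a sketch; the function name and coordinate convention are assumptions.

```python
import numpy as np

def nearfield_delays(mic_positions, source_position, fs, c=343.0):
    """Near-field (spherical wavefront) steering delays in samples.
    source_position combines the aimer's look direction with a range
    estimate, e.g. from an ultrasonic rangefinder; coordinates in
    metres, c in m/s, fs in Hz."""
    pos = np.asarray(mic_positions, dtype=float)
    src = np.asarray(source_position, dtype=float)
    tau = np.linalg.norm(pos - src, axis=1) / c   # arrival time per mic
    # Align every channel to the latest arrival: the closest microphone
    # hears the source first and therefore gets the largest delay.
    return (tau.max() - tau) * fs
```

When microphones from multiple users are pooled, their positions and the source position must of course be expressed in a common frame, via the relative orienter readings.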
FIG. 2 depicts the left lens of an older style of bifocal eyeglasses. The cut line extends all the way across. Older styles of bifocal eyeglasses such as that depicted in FIG. 2 are becoming popular among some individuals, so that eyeglasses with lenses like those depicted in FIG. 2 would not look particularly out of place.
Lens 240 ordinarily has a prescription suitable for looking at distant objects,
while lens 250 is suitable for looking at near objects. Lens 240 may have infinite focal length (zero power) if the glasses are intended to be worn by a person who has normal vision (e.g.
does not need corrective eyewear). Such a person may wear these glasses simply to facilitate doing fine work such as soldering, needlework, or the like, by virtue of lenses 250 which may simply serve the same purpose as ordinary magnifying glasses.
Alternatively, lenses 240 and 250 may both be of infinite focal length, and the cut line 260 therebetween may be simply for cosmetic purposes, e.g. to provide the wearer with the appearance of a traditional intellectual who might wear old-style bifocal eyeglasses.
Thus eyewear as depicted in FIG. 2 may be constructed to meet the prescription of one who requires bifocal glasses, or one who requires only ordinary unifocal glasses, or one who requires no glasses at all.
Accordingly, the eyewear as depicted in FIG. 2 may be constructed for nearly anyone, and may also be used as a basis in which to conceal additional apparatus.
FIG. 3 shows how an LED (light emitting diode) may be concealed within the lens material (such as glass or plastic) of an eyeglass lens 340 to form part of a vitrionic aimer. Lens 340 may comprise two separate prescriptions, one prescription, or no prescription at all, depending on the wearer's needs or lack thereof. A thin wire 310, preferably made of stainless steel or other material that is silver in colour, is embedded on or within the lens material 340. Wire 310 carries electrical current to the anode terminal of an LED 320 also embedded in the lens material. Wire 330 carries electrical current from the cathode terminal of the LED. (For purposes of this disclosure conventional current, in which electricity flows from plus to minus, is used rather than electron current in which electrons flow from minus to plus.) LEDs normally come in clear plastic housings, which are quite large (e.g. on the order of three millimeters in diameter). However, the internals of an LED
may be embedded into the lens material 340, so that the lens material itself forms the protective housing around these internals. In this way, the LED 320 is too small to be easily seen by the unaided eye of someone who might be looking at the wearer's eyeglasses. Thus, so long as the LED 320 is not illuminated, it will remain essentially invisible to others.
A miniature shroud 321 is typically placed over LED 320 so that people other than the wearer of the glasses cannot easily see the light from LED 320. This shroud is preferably made in an irregular shape, so that it has the appearance of a speck of dust or small particle of dirt. The side facing the wearer (and behind the LED from the wearer's perspective) is preferably black, while the side facing away from the wearer is preferably dust or dirt-coloured, and may comprise dirt or dust particles or other imperfections embedded into the glass.
As an alternative to an LED, a scintillating fiber optic or other light source embedded in the glass may be used.
Moreover, a cut line on the inside of the glass, such as crosshairs scratched into the glass, may be used to project an image of an aimer directly onto the retina of the wearer, so that it is in focus for all focal distances of the wearer's own eye lens.
The lens depicted in FIG. 3 is a left lens, denoted by the letter "L". In actual embodiments of the invention, the vitrionic aimer may be incorporated into either the left lens, the right lens, or both lenses. When the vitrionic aimer is incorporated into only one lens, typically the other lens will be a dummy lens having the same physical appearance as the lens containing the vitrionic aimer. Typically the dummy lens will contain at least copies of wires 310 and 330. These dummy copies of the wire will give the other lens the same appearance as the one that contains the vitrionic aimer.
In particular, both lenses will look like ordinary bifocal lenses.
Ordinarily, hearing aids, and the like, should be as unobtrusive as possible.
Conventional hearing aids are often hidden in the ear, concealed by long hair, and colored to match the skin and hair of the wearer for better concealment.
Accordingly, the purpose of the dummy copy of the wires, etc., is either to make the apparatus more normal looking (e.g. so it doesn't look different than traditional bifocal eyeglasses) or to provide visual symmetry in even noncovert versions of the apparatus.
Visual symmetry is a well known concept. For example, in kitchen countertops with sinks, when there are drawers under the counter, there are often "dummy" drawers under the sink that don't actually work, but that provide a visually appealing balance to the actual functioning drawers on the other side of the counter. Thus dummy units provide a pleasing visual symmetry regardless of whether or not the apparatus of the invention is to be covert or concealed in function.
FIG. 4 depicts subject matter being captured using an embodiment of the vitrionic aimer of the invention. In this example, subject matter may, for example, comprise a person, defined by point 410.
An arbitrary point 410 on the sound source of interest radiates light in all directions, and some of this light may be collected by a customer wearing eyeglasses in which a hearing aid is concealed.
Light from 410 passes through the customer's eyeglasses, in particular, through a lens 340 of the customer's eyeglasses, and then through lens 420 of customer's eye 430. This light converges to a point 440 on the customer's retina. To the left of eye 430 is shown the image of the clerk upon the customer's retina, and point 450 of this image corresponds to point 410 of the subject matter.
LED 320 is located in eyeglass lens 340 which is very close to the wearer's eye.
Humans with normal healthy vision can focus on objects that are between about 4 inches (approximately 10 cm) and infinity, away from the lens 420 in the eye 430.
Thus objects such as LED 320 which are closer than 4 inches must appear out of focus. LED 320 is so close to the eye, in fact, that it will appear extremely out of focus. The customer will not be able to see the LED in his eyeglasses, and in fact the customer will see a very large circular-shaped blob which is the out-of-focus image of the point source LED 320.
The circular disk that one sees from a point source of light that is out of focus is known as the circle of confusion.
Rays of light from LED 320 are denoted by dashed lines; eye lens 420 is too weak to focus them, so they spread out and strike the retina at 460, defining a circular disk of light 470. LED 320 is typically red or green, so that disk 470 appears as a large circle of red or green light.
The exact shape of this disk 470 is determined by the shape of the opening in the eye, and disk 470 will also show imperfections in the eye lens 420, such as dust on eye lens 420, or any irregularities in the iris of eye 430.
However, despite these irregularities, the circular blob 470 will indicate to the wearer the aiming direction in which the apparatus of the invention will initially begin searching.
Thus the customer may make use of LED 320 embedded in eyeglass lens 340 to orient his wearable hearing aid in the direction of the subject matter, and to know that the subject matter is centered in the maximal gain point.
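The apparent size of this out-of-focus blob can be estimated with the thin-lens equation. The sketch below is illustrative only: the 17 mm eye focal length and retina distance and the 4 mm pupil diameter are typical textbook values, not parameters given in the patent.

```python
def blur_circle_diameter_mm(source_dist_mm, pupil_mm=4.0,
                            eye_focal_mm=17.0, retina_dist_mm=17.0):
    """Estimate the retinal circle-of-confusion diameter for a point source
    at source_dist_mm from an eye focused at infinity. All default values
    are illustrative assumptions, not figures from the patent."""
    # Thin-lens image distance for the point source; a negative value means
    # a virtual image (source inside the focal length, as LED 320 is).
    v = 1.0 / (1.0 / eye_focal_mm - 1.0 / source_dist_mm)
    # Similar triangles: blur grows with the focus mismatch at the retina.
    return pupil_mm * abs(retina_dist_mm - v) / abs(v)
```

For an LED roughly 15 mm from the eye this yields a blur disk of several millimetres on the retina, consistent with the "very large circular-shaped blob" described above, while a distant object blurs to a negligible fraction of a millimetre.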
Moreover, the colour and state of the LED (e.g. whether the LED is flashing, and at what rate, and in the case of a multicolour LED whether it is red or green or whichever other colours it may assume) may convey additional information to the wearer of the apparatus, such as sound levels, and the like, which might be especially useful to the deaf, who could still determine that the apparatus was correctly capturing sound, and that the sound was of good quality and level. Sound quality and correct recording level are particularly important to speech recognizers and speech memory systems.
Alternatively, LED 320 may, for example, turn red to indicate that a recording device, speech recognizer, or transmitter is active. The LED may begin flashing when hard disk space on the recording device is almost full, or to notify a deaf wearer that a radio contact with an assistant has been made, so that the assistant can provide remote help. The rate of flashing may be used to indicate to the wearer how much disk space remains, or the quality of the connection, or may be used to produce Morse code output of a speech recognizer or remote assistant.
In order that light from the LED that might be reflected off of the inside surface of the glass 340, or off of the wearer's eye 430, is not seen by others (such as the clerk or the like), LED 320 is automatically adjusted in brightness in accordance with ambient light levels. Typically the camera is capable of measuring the quantity of light received, and also estimating the scene contrast, and from this information, provides a control voltage to LED 320 so that it becomes bright when necessary (such as outdoors on a sunny day) and darker when it does not need to be so bright (such as in a dimly lit corridor or stairwell of a department store).
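One possible control law for this ambient-adaptive brightness is a logarithmic mapping from the measured light level to the LED drive level, since perceived brightness is roughly logarithmic in luminance. The patent does not specify the mapping, so the shape of the curve and all constants below are assumptions:

```python
import math

def led_drive_level(ambient_lux, min_level=0.05, max_level=1.0,
                    knee_lux=50.0, full_sun_lux=2000.0):
    """Map an ambient light estimate to a normalized LED drive level in
    [min_level, max_level]. Logarithmic so the LED dims sharply in dark
    corridors; all constants are illustrative assumptions."""
    span = max_level - min_level
    level = min_level + span * (math.log1p(ambient_lux / knee_lux)
                                / math.log1p(full_sun_lux / knee_lux))
    # Clamp so the LED never fully extinguishes nor exceeds full drive.
    return max(min_level, min(max_level, level))
```

A drive circuit would then convert this level to the control voltage supplied to LED 320.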
FIG. 5 depicts another embodiment of the vitrionic aimer in which two LEDs are concealed with wiring along the cut line of (or where the cut line would be in) a bifocal eyeglass lens 340. LEDs 520 and 521 may be of similar construction to LED 320 and may be similarly shrouded so that others facing the wearer do not readily see the light from LEDs 520 and 521.
Wire 510 carries electric current to LED 520, which is connected in series with LED 521 by wire 511, followed by wire 512 which completes the circuit. It is preferable that the LEDs be wired in series, so that a single current limiting resistor or drive circuit can drive both of them at equal brightness. (Wiring LEDs in parallel is known to provide unreliable and sometimes unpredictable results.)
FIG. 5a is a top view of FIG. 5, looking at the eyeglass lens 340 on edge. The surface of the glass 340 that faces away from the wearer is designated 341, while that facing toward the wearer is designated as 342. LEDs 520 and 521 are embedded inside the glass but located near surface 341. On the other surface 342 are scratch marks 550 and 551 which are constructed to look like part of the optical cut lines around normal bifocal insets.
These cut lines are made in an inside-out bracket shape.
FIG. 5b is an inside view of FIG. 5, looking at the eyeglass lens from the wearer's side.
FIG. 5c is an inside view of FIG. 5, looking at the eyeglass lens from the wearer's side, but showing how it appears when the LEDs 520 and 521 are turned on, and the glass is too close to a wearer's eye to focus on. Instead, light from LED 520 projects an image of scratch mark 550 directly onto the retina of the wearer's eye.
Since the image of scratch mark 550 is not inverted (e.g. since it is projected directly onto the retina), it will appear to the wearer as if it is inverted. This is because upright objects are normally presented inverted (upside down) on the retina, and this is what we are used to. (See for example, Stratton, 1896.) It is for this reason that the two halves of the brackets are each backwards.
What the wearer sees is inward-facing brackets as shown in FIG 5c. These are seen as dark lines within the circles of confusion 570 and 571. Circles of confusion 570 and 571 arise from LEDs 520 and 521 respectively, since each is a point source that is too close to the eye for the eye lens to focus on.
Brackets 550 and 551 are sufficient to provide an aimer for the wearer to see what sound sources will be within the microphone array's field of maximum coverage and what will not. Most notably, brackets 550 and 551 are made to match the 3 dB points in the central lobe of the array, when aimed dead center, if the apparatus were in an anechoic chamber. Of course the actual performance will vary in typical situations.
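The angular extent that such bracket marks would have to span can be approximated, for a uniform broadside line array, by the standard far-field half-power beamwidth formula, roughly 0.886 λ/(N d) radians. The patent does not disclose the array geometry, so every parameter in this sketch is an illustrative assumption:

```python
import math

def half_power_beamwidth_deg(n_mics, spacing_m, freq_hz, c_m_per_s=343.0):
    """Approximate -3 dB beamwidth of a uniform broadside line array of
    n_mics microphones at spacing_m, for a tone at freq_hz. Standard
    far-field approximation; geometry and frequency are assumptions."""
    wavelength_m = c_m_per_s / freq_hz
    return math.degrees(0.886 * wavelength_m / (n_mics * spacing_m))
```

For example, eight microphones at 2 cm spacing give a beam a little over 50 degrees wide at 2 kHz, and the beam narrows as microphones are added, which is why the bracket size must be matched to the particular array.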
FIG. 6 shows an embodiment of the aimer concealed in eyeglass lens 340, configured to appear as if it were an ordinary trifocal eyeglass lens.
The same series configuration of LEDs as that depicted in FIG. 5 is used, but with a second row higher up, in which wire 610 carries electricity to the anode terminal of LED 620, which is connected in series with LED 621 by way of wire 611, and in which wire 612 completes the circuit.
Each pair of LEDs has its own current limiting resistor or the like, which is typically mounted in the eyeglass frames so that a single set of wires concealed within the frames can power the LEDs. These wires are typically connected to a waist-worn power supply, and the wiring from the glasses to the power supply is typically concealed within an eyeglass safety strap. A satisfactory eyeglass safety strap for concealment of wiring is one sold under the trade name "Croakies".
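Sizing the current-limiting resistor for each series string follows directly from Ohm's law: the resistor must drop whatever supply voltage remains after the LED forward drops, at the chosen drive current. The supply voltage, forward voltage, and current in this sketch are illustrative assumptions, not values from the patent:

```python
def series_resistor_ohms(v_supply, led_forward_v, n_leds, current_a):
    """Current-limiting resistor for n_leds LEDs wired in series, as in
    FIG. 5 and FIG. 6. All electrical values are illustrative; real LED
    forward voltages vary with colour and part."""
    v_drop = v_supply - n_leds * led_forward_v
    if v_drop <= 0:
        # The series string needs more voltage than the supply provides.
        raise ValueError("supply voltage too low for this series string")
    return v_drop / current_a
```

For instance, two red LEDs (about 1.8 V each) on a 5 V supply at 10 mA would call for roughly a 140 Ω resistor.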
FIG. 6a shows the inside surface 342 of lens 340 after it has been marked for use with the four LEDs depicted in FIG. 6. Four "L"-shaped scratches or similar marks are made on the inside surface 342 of eyeglass lens 340. L 650 will be seen in the upper left hand corner, L 651 will be seen in the upper right corner, L 652 will be seen in the lower left corner, and L 653 will be seen in the lower right corner of the aimer's field of view.
FIG. 6b shows the inside surface 342 of lens 340 after it has been marked for use with the four LEDs depicted in FIG 6, and when it is placed too close to the eye to focus on, and when further, all four LEDs are turned on.
Although each L appears in its proper place (e.g. the upper left L appears to the wearer to be situated at the upper left corner of the frame), each of them is inverted within its corresponding circle of confusion. LED 620 defines a circle of confusion 670. LED 621 defines a circle of confusion 671. LED 520 defines a circle of confusion 672. LED 521 defines a circle of confusion 673.
Note that it is acceptable if these circles of confusion overlap. For example, circle of confusion 670 may overlap with 672, as may circle of confusion 671 overlap with 673.
However, so long as the overlap does not extend into the "L"-shaped markings, the apparatus will work fine. For example, as long as circle of confusion 670 does not extend into L 652, then L 652 will continue to be clearly defined. (Otherwise, L 652 will be seen as a double image.) Accordingly, an improvement to the invention may be made by making the camera collect rays of light that are collinear with the rays of light entering the eye from the viewfinder.
FIG. 7a shows an embodiment of the vitrionic aimer of the invention in which very small mirrors 710 and 750 are used to reflect light from a single point source 700 (embedded, at least partially, in the lens material of an eyeglass lens) into an eye 430 of the wearer of the glasses. This arrangement is similar to that depicted in Fig. 5 and Fig. 6, and thus there may also be made markings on the inside surface 705 of the glass or plastic lens material as illustrated in Fig. 5a-5c and Fig. 6a-6b.
Alternatively, mirrors 710 and 750 may be made either slightly curved, or made small enough that they generate rays of light 712 and 752 that behave sufficiently like single rays of light that they no longer form large circles of confusion, and instead pass directly through the center of the lens 420 of eye 430 to form small points of light 715 and 725 on the retina of eye 430.
The dashed lines 713 and 753 denote the surface normals of mirrors 710 and 750.
Because light is shining directly into eye 430, it can be very low in intensity such that others will not likely see the light that leaks out of the apparatus.
Alternatively, light source 700 may be designed to direct light primarily along rays 711 and 751 in the directions of mirrors 710 and 750, so that there is no stray light, as there might be if light source 700 radiated equally in all directions over the full 4π steradians (1 spat) of solid angle.
There may be two mirrors 710 and 750 for the arrangement of light as was illustrated in Fig. 5, or there may be four mirrors 710 and 750. In the case of four mirrors, 710 denotes two mirrors one above the other, and so does 750, so that the arrangement of light is as depicted in Fig. 6.
FIG. 7b shows another embodiment of the wearable vitrionic aimer. Five mirrors 710, 720, 730, 740, and 750, define a row across the top of a rectangular viewframe.
Mirrors directly below these, also denoted by 710, 720, 730, 740, and 750, define the bottom of a rectangular viewframe. Mirrors 710 also denote a stacking one above the other to define multiple vertical points of light, creating the left edge of the viewframe, and similarly for mirrors 750. In this case, point source 700 shines out along rays 711, 721, 731, 741, and 751 to illuminate respectively mirrors 710, 720, 730, 740, and 750.
The viewframe comprises rays of light converging at point 790 in the center of lens 420 of eye 430.
Thus the number of mirrors may be increased, or a single blazed mirror or blazed reflection grating may be used to generate each of the top and bottom of the viewframe. For the left and right sides, single tall slender mirrors 710 and 750 may be used.
Moreover, the vitrionics of the aimer may also be made responsive to the eye itself, as for example, the apparatus being made responsive to light in the reverse direction as well (coming back from the eye).
FIG. 8 shows an alternate embodiment of the wearable vitrionic aimer system in which a light sensitive material is used along surface 800, and is illuminated by point source 700 for an exposure. Point source 700 creates a cone of light between and including rays 811 and 821. During the illumination of source 700, a point source 805 that is coherent with source 700 creates a cone of light between and including rays 806 and 807. These rays of diverging light are converted to rays of converging light by lens 810 which is part of the manufacturing system. Since sources 700 and 805 are coherent and mutually coherent (they are normally part of a single laser source split two ways during manufacture, after which light source 700 becomes a single source for use different from that used during manufacture) an interference pattern is created along emulsion 800. Eye 430 is shown in Fig. 8 as dashed lines, indicating that it is absent during manufacture (exposure) of the device (emulsion).
After exposure, emulsion 800 is developed. A satisfactory development process is a curing process as may be obtained with DuPont photopolymer, so that it is not necessary to remove emulsion 800 from within the glass or to soak it in conventional photographic film developer.
After development, emulsion 800 becomes a grating that will give rise to light rays entering the eye when illuminated with light source 700, so that light source 805 and lens 810 are no longer needed. Thus in actual usage, lens 810 and source 805 will be absent, having been only present for manufacture.
Preferably emulsion 800 will take the form of two thin lines, one above the other, to form the top and bottom of the rectangular aimer screen, and there will also be two lines up and down to form the left and right sides of the aimer screen.
During manufacture, a high power laser of good coherence length may be used, while use after manufacture may be with a lower power laser 700, or even a noncoherent (but still narrowband, of similar colour) light source of lesser coherence length.
The process of making the reflection grating 800 is similar to the process of making a Denisyuk reflection hologram. However, an alternative embodiment of the viewfinder may be constructed for use in an edge-lit process, or some other process.
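In the counter-propagating (Denisyuk-style) exposure geometry, the fringe spacing recorded in emulsion 800 reduces to λ/(2n sin θ), which for beams meeting head-on is simply λ/(2n). The wavelength and refractive index in this sketch are illustrative assumptions, since the patent specifies neither:

```python
import math

def fringe_spacing_nm(wavelength_nm, n_medium=1.5, half_angle_deg=90.0):
    """Spacing of the interference fringes recorded in a holographic
    emulsion. half_angle_deg = 90 corresponds to counter-propagating
    (Denisyuk) beams. Values are illustrative assumptions."""
    return wavelength_nm / (2.0 * n_medium
                            * math.sin(math.radians(half_angle_deg)))
```

A 633 nm helium-neon exposure in a medium of index 1.5 thus records fringes roughly 211 nm apart, which is why such gratings act as wavelength-selective reflectors when replayed.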
FIG. 9 shows an alternate embodiment of the microphone system. This microphone system also includes an added feature of a wearable camera system suitable for augmented reality and mediated reality (as described at http://wearcam.org/mr.htm), as well as for more traditional forms of videography or audiovisual memory prosthesis or recall.
The wearable camera system serves double duty, functioning as both a head tracker for the aimer of the microphone beam, as well as a reality mediator. A camera 910, concealed within the nosebridge of the eyeglass frames 130, or a pair of cameras 910 concealed within the temple side pieces 110 of eyeglass frames 130, is connected to a head tracker 931 within a WearComp 999. The WearComp 999 is typically worn around the waist, in a shirt pocket, or comprises components spread throughout an undershirt, as described in http://wearcam.org/procieee.html.
The output of head tracker 931 is a scene description, which is supplied to a coordinate transformer 932. The coordinate transformer maps from camera coordinates of camera 910, or camera pair 910, to eye coordinates. These eye coordinates are passed on to a graphics rendering system 933, and rendered in eye coordinates onto viewscreen 970. Again, viewscreen 970 is preferably embedded in lens 140 so that a light source also embedded in lens 140 can drive the vitrionic display of the invention.
A complete machine vision analysis of the scene, followed by a complete graphics rendering of the entire scene, is typically too formidable a task for a small battery powered WearComp 999.
Accordingly, in typical use of the invention, only a small fraction of the scene details are rendered. For example, in a speech recognizer, or in a wearable face recognizer, only a virtual name tag is rendered. Because of the coordinate transformer, this name tag appears to the wearer as if it were attached to the subject, even though there is a discrepancy between camera location 910 and the eyeball location of the wearer. In the stereo version, two views are rendered, one for each eye, so that the name tag appears to hover in the same depth plane as the subject matter.
Light from viewscreen 970 is projected down through the glass to beamsplitter 960.
Some will pass through beamsplitter 960, and unfortunately some will be reflected outward where people might be able to see it. Because of the desire that the apparatus be unobtrusive (ideally invisible to others), beamsplitter 960 must be a polarizing beamsplitter and the viewscreen 970 must be polarizing, and oriented appropriately, to minimize light reflecting off the beamsplitter 960.
A curved concave mirrorlike surface 961 reflects some light back up to the beamsplitter 960 and into the eye, while providing magnification as concave mirrorlike surfaces do. This mirrorlike surface is disguised as part of a bulge 950 in the eyeglass lens. This bulge may be the actual prescription of the wearer, or may be simply a magnifying region useful to anyone who does not normally wear glasses but wishes to be able to do fine work. Alternatively, the bulge may be made so that it does not provide magnification, but simply looks like the inset lens of bifocal eyeglasses.
Mirrors 960 and 961 are partially silvered, and embedded in the glass in such a way as to appear as cut lines for inset bifocal lens 950. Although mirrors 960 and 961 only need to be about as wide as they are deep, they may be extended across much further than necessary, so that they will look like normal bifocal cut lines.
What is described in Fig. 9 is a reality mediator, where there is a head tracker which also serves as the aimer. Microphone processor 935 initially directs what would be maximal response at subject matter corresponding to subject matter imaged at the center of viewscreen 970, were the wearer and subject matter both in an anechoic room. However, subsequent processing cancels echoes produced by room acoustics.
A model of the room acoustics is constructed for this purpose. The room acoustics model is responsive to the head tracker, so that directions from which various echoes come, are updated by way of the user's rotation of his or her head.
FIG. 10 shows an aimer comprising an eye tracker built into an EyeTap (TM) reality mediator, in eyeglasses having eight microphones. Only four of the eight microphones are shown, since only the left half of the eyeglasses is shown.
Microphones 120 are in various orientations including forward pointing, downward pointing, and upward pointing. A microphone 120r is also rearward pointing. All eight microphones have separate connections to a microphone processor 1099.
The eye tracker serves double duty as both the aimer and the focus and vergence controller. Eyetracker assembly 1010 (comprising a camera and infrared light sources) illuminates and observes the eyeball by way of rays of light 1011 that partially reflect off beamsplitter 1020. Beamsplitter 1020 also allows the wearer to see straight through to mirror 1021 and thus see virtual light from viewfinder 1034. The eyetracker 1010 reports the direction of eye gaze and conveys this information as a signal 1012 to eye tracker processor 1030, which converts this direction into "X" and "Y" coordinates that correspond to the screen coordinates of viewfinder screen 1034. These "X" and "Y" coordinates, which are expressed as signal 1031, indicate where on the viewfinder screen 1034 the wearer is looking. The eyetracker processor 1030 also supplies control signals to microphone processor 1099. Signal 1031 and the video output 1032 of camera 1073 are both passed to focus analyzer 1040. Focus analyzer 1040 selects a portion of the video signal 1032 in the neighbourhood around the coordinates specified by signal 1031. This neighbourhood is where the wearer is looking, and is assumed to be where the audiovisual attention should be focused. In this way, focus analyzer 1040 ignores video except in the vicinity of where the wearer of the apparatus is looking. Because the coordinates of the camera match the coordinates of the display (by way of the virtual light principle), the portion of video analyzed by focus analyzer 1040 corresponds to where the wearer is looking. The focus analyzer 1040 examines the high-frequency content of the video in the neighbourhood of where the wearer is looking, to derive an estimate of how well focused that portion of the picture is. This degree of focus is conveyed by way of focus sharpness signal 1041 to focus controller 1050, which drives, by way of focus signal 1051, the servo mechanism 1072 of camera 1073. Focus controller 1050 is such that it causes the servo mechanism 1072 to hunt around until sharpness signal 1041 reaches a global or local maximum.
The focus analyzer 1040 and focus controller 1050 thus create a feedback control system around camera 1073 so that it tends to focus on whatever object(s) is (are) in the vicinity of camera and screen coordinates 1031. Thus camera 1073 acts as an automatic focus camera, but instead of always focusing on whatever is in the center of its viewfinder, it focuses on whatever is being looked at by the left eye of the wearer.
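The patent says only that focus analyzer 1040 examines "high-frequency content" near the gaze point. A squared-gradient measure over a small window is one common way to realize such a sharpness signal; the window size and metric below are assumptions, not details from the patent:

```python
import numpy as np

def focus_sharpness(frame, gaze_x, gaze_y, half=16):
    """Sharpness metric of the kind focus analyzer 1040 could compute:
    high-frequency (gradient) energy in a window around the gaze
    coordinates. The squared-gradient choice is an assumption."""
    # Extract the neighbourhood around where the wearer is looking.
    roi = frame[gaze_y - half:gaze_y + half,
                gaze_x - half:gaze_x + half].astype(float)
    # Gradient energy rises as edges in the window come into focus.
    gy, gx = np.gradient(roi)
    return float(np.sum(gx * gx + gy * gy))
```

Focus controller 1050 would then step servo 1072 in whichever direction increases this value, a simple hill-climbing autofocus loop.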
In addition to driving the focus of the left camera 1073, focus controller 1050 also provides a control voltage 1052 identical to control voltage 1051.
Control signal 1052 drives servo mechanism 1078 of lens 1079, so that the apparent depth of the entire screen 1034 appears focused at the same depth as whatever object the wearer is looking at. In this way, all objects in the viewfinder appear in the depth plane of the one the wearer is looking at.
Focus controller 1050 provides further control voltages, 1053 and 1054, for the right eye camera and right eye viewfinder, where these signals 1053 and 1054 are identical to that of 1051. Moreover, focus controller 1050 provides the same control voltage to the vergence controller 1094 so that it can provide the control signal to angle the left and right assemblies inward by the correct amount, so that all focus and vergence controls are based on the depth of the object the left eye is looking at.
It is assumed left and right eyes are looking at the same object, as is normal for any properly functioning human visual system.
In other embodiments of the invention, it may be desired to know which object is of interest when there are multiple objects in the same direction of gaze, as might happen when the wearer is looking through a dirty glass window. In this case there are three possible objects of interest: the object beyond the window, the object reflected in the glass, and the dirt on the window. All three may be at different depth planes but in the same gaze direction.
An embodiment of the wearable system with a human-driven autofocus camera (e.g. driven by eye focus), could be made from an eye tracker that would measure the focus of the wearer's left eye. Preferably, however, two eyetrackers may be used, one on the left eye, and one on the right eye, in order to attempt to independently track each eye, and attempt to obtain a better estimate of the desired focus by way of the vergence of the wearer's eyes.
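With two eyetrackers, the gaze depth follows from simple triangulation of the vergence angle between the eyes. A minimal sketch, assuming symmetric fixation; the 63 mm interpupillary distance is a typical adult value, not a figure from the patent:

```python
import math

def depth_from_vergence_m(vergence_deg, ipd_m=0.063):
    """Estimate fixation depth from the total vergence angle between the
    two eyes, assuming the gaze point lies on the midline. The default
    interpupillary distance is an illustrative assumption."""
    # Each eye rotates inward by half the vergence angle toward the target.
    return (ipd_m / 2.0) / math.tan(math.radians(vergence_deg) / 2.0)
```

A vergence of about 3.6 degrees corresponds to a fixation depth of roughly one metre, and the estimated depth grows as the vergence angle shrinks toward parallel gaze.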
In operation, this embodiment of the invention is quite simple. Consider the situation where the wearer is trying to capture an investigative documentary in which the wearer is talking to two corrupt security guards. If it were not for the eye tracker, the sound quality might be impaired by the fact that the wearer, wanting to place both guards in the field of view, would not have either of them centered in the field of view. This would confuse the first stage of the beamforming microphone processor.
However, with the embodiment of the invention depicted in Fig. 10, the wearer can place both guards within the field of view of the camera 1073, and direct the attention of both the microphone processor and the camera focus back and forth between the two guards, as they take turns explaining to the wearer why video surveillance cameras have helped to reduce crime.
FIG. 11 shows twenty examples of aimer screens, arranged four across and five down. The upper left example corresponds to a display raster viewfinder in which an xterm (a text screen) fills the entire display field and matches the field of coverage of the microphone system, insofar as this is the field of coverage that the beamforming system will scan. The next example to the right of the xterm is one in which nine circles of confusion (out of focus light sources) form a perceived crosshair.
The upper rightmost viewframe corresponds to an embodiment of the invention in which two rows of five mirror fragments are embedded in the lens material of the eyeglasses to reflect light from a single point source into an eye of the wearer of the glasses, creating a total of ten apparent light sources that provide a clearly defined boundary in which the wearer can easily imagine a rectangular frame of the microphone beamforming scanning region.
Some of the other examples of viewframes depicted in this figure include aimer frames that have "L" shaped or rounded shapes to define the four corners of the rectangular field of beam scanning. Others are simply crosshairs in which the length and width of the crosshairs provide the wearer with an awareness of the rectangular field of interest.
The four examples along the bottom row are examples of rough shapes that are formed on the retina by shining a point source of light at diffraction gratings (e.g. scratchings on or in the eyeglass lens material), which also convey a sense of a boundary.
Some viewframes, like the one in the lower right hand corner of the figure, are quite a bit more intense than others, so in these cases, the amount of light needs to be reduced (and in fact controlled more carefully to balance the amount of light in the scene) so that the viewframe does not obliterate the subject matter when it is too intense. Likewise the intensity of the light must be sufficient that the viewframe does not fade from visibility when the scene brightness is much higher. It is for this reason, especially when using some of the more dense viewframes like the one in the lower right hand corner of the figure, that ABC (automatic brightness control) of the aimer is so important to an optimal embodiment of the listen direction aiming and compositional apparatus of the invention.
The aimer screens of the invention are often not visible in their entirety, but, rather, are partially visible depending on where the eye is pointing. Thus when the wearer of the glasses looks up toward the upper left hand corner of the viewframe, this corner of the viewframe becomes visible but the other portions of the viewframe may not necessarily be visible. Nevertheless, the aimer screen in its entirety can be imagined, and it forms a basis for composition in the sense that as the wearer looks around the aimer screen, objects entering or leaving the region over which the beamformer will scan can be readily discerned.
FIG. 12a shows the camera based aimer and why it is desirable that it direct the microphone processor to adjust processing in response to, for example, the turning of the head of the wearer of the apparatus. It is assumed that wearer 1290 and sound source 1200 will both move slowly relative to the ability of the sound processor to keep updating any changes in the model of the room acoustics or the like.
However, it is also assumed that rapid rotation of the wearer's head 1291 may outpace the processing capability of WearComp 999, or require a larger more capable WearComp 999, or use more of its limited resource. Accordingly, especially insofar as WearComp 999 may already be part of a reality mediator system that includes a camera based headtracker anyway, it is prudent to use this information also to update the room model. Camera 910, which forms part of wearer 1290's computer mediated reality system, can therefore also be used to directly affect the processing of acoustic signals.
Camera 910 is depicted as head mounted, for clarity, while it will be understood that camera 910 would preferably be concealed inside eyeglasses, or be an EyeTap (TM) system in which the eye itself is, in effect, the camera 910.
Direct sound 1201 from source 1200 enters microphones 120. Indirect sound 1211, from sound 1210 heading toward the walls of the room, also enters microphones 120. A rear-facing microphone 120r receives primarily indirect sound and is used by processors or processes in WearComp 999 to enhance the direct sound 1201 while attenuating indirect sound 1211. A relatively complicated acoustic model is created by WearComp 999 to account for the multiple reflections and the processing for microphones 120 and 120r. Changes in head orientation are determined from camera 910 using the VideoOrbits methodology of http://wearcam.org/orbits, also described in the lead article of Proceedings of the IEEE, Vol. 86, No. 11.
These changes in head orientation adjust the listen direction of the microphone array, so that it matches the look direction change determined by camera 910.
Alternatively, a face recognizer may also home in on the sound source 1200.
However, in the preferred embodiment, the VideoOrbits method is used.
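The listen-direction steering that this aimer drives can be illustrated with a minimal delay-and-sum beamformer: each microphone's signal is delayed back into alignment for a plane wave arriving from the look direction, then the signals are averaged. The patent does not disclose its beamforming algorithm, so this is a generic sketch using integer-sample delays for brevity:

```python
import numpy as np

def delay_and_sum(signals, mic_positions_m, look_unit_vec, fs_hz, c=343.0):
    """Minimal delay-and-sum beamformer. signals is a list of equal-length
    sample arrays, one per microphone; mic_positions_m are 3-vectors
    relative to a reference point; look_unit_vec is the unit look
    direction from the aimer. Fractional-delay interpolation, which a
    real implementation would need, is omitted."""
    out = np.zeros(len(signals[0]))
    for sig, pos in zip(signals, mic_positions_m):
        # A mic displaced toward the source hears the wavefront early by
        # (pos . u) / c seconds; delay it back into alignment.
        early_s = np.dot(pos, look_unit_vec) / c
        out += np.roll(np.asarray(sig, dtype=float),
                       int(round(early_s * fs_hz)))
    return out / len(signals)
```

When the head tracker reports a rotation, only look_unit_vec needs updating, which is why the direction-of-arrival update can be applied immediately while the slower room model is rebuilt.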
Additionally, a microphone 1212 worn close to the mouth of wearer 1290 captures the wearer's own voice much more strongly than that of the sound source 1200. A signal from microphone 1212 is thus used by processor 999 to determine, by way of comparison to signals from the other microphones, when the sound is coming from the wearer. Although the sound from the wearer's voice may be stronger than that of sound source 1200 on all microphones, the microphone 1212 may also be used to null out the wearer's own voice in the same way that microphone 120r, which primarily receives unwanted echoes, helps improve sound from microphones 120.
Thus a signal that is only weakly responsive to the wearer's own voice may be obtained. A separate signal that is strongly responsive to the wearer's voice may also be obtained. These separate signals may be processed separately, if, for example, processor 999 is to begin to understand a conversation between the wearer and another person. Each of these two signals may, of course, be used to help steer a null at the other, and therefore purify itself further. These two separate sides of a conversation may be transmitted over a standard video channel, as left and right sides of the stereo sound subcarrier portion of the video channel. Therefore, by way of antenna 1280, a remote entity, which may be a computer program or another person, may assist wearer 1290, either by way of speech for hearing in headset 1250, or a display of information in eyeglasses 1233.
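The comparison between microphone 1212 and the array microphones can be sketched as a simple level-ratio test: when the close-talking microphone is far louder than the array average, the sound is presumed to be the wearer's own voice. The 10 dB threshold and function names are illustrative assumptions, not details from the patent:

```python
import numpy as np

def wearer_is_speaking(close_mic, array_mics, ratio_db=10.0):
    """Crude own-voice detector: microphone 1212, worn near the mouth,
    dominates the array microphones when the wearer speaks. The dB
    threshold is an illustrative assumption."""
    close_pow = np.mean(np.square(close_mic))
    array_pow = np.mean([np.mean(np.square(m)) for m in array_mics]) + 1e-12
    return 10.0 * np.log10(close_pow / array_pow + 1e-12) > ratio_db
```

A fuller implementation would smooth this decision over time and feed it to the null-steering stage described above.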
Both the beamforming and null steering are of course responsive to the aimer, which comprises the VideoOrbits head tracker applied to the signal from camera 910.
FIG. 12b is a block diagram of the camera based aimer of Fig. 12a. Head tracker 1220 is responsive to an input from camera 910. Camera 910 and microphones 120 are fixed with respect to one another. Therefore, when camera 910 turns (with the wearer's head) so do microphones 120. Moreover, microphones 120 maintain the same spatial relationship with one another. Initially, room modeler 1230 has no effect, and models the room as an anechoic room, so that microphone processor 1240 accepts input from a plurality of microphones 120 and processes these inputs by delay, etc., applying what would be optimal processing in an anechoic chamber, to obtain signals 1241 for the wearer's speech and 1242 for the sound source 1200. These signals are initially processed assuming that sound source 1200 is visually sighted and aligned directly in the center of wearer 1290's aimer. After an initial "lock on" of the source 1200, the room modeler 1230 optimizes for this locked on source 1200.
Then when head tracker 1220 detects a rotation of the head, a direction of arrival update signal, DOA, is sent immediately to microphone processor 1240 to track the target sound source 1200. At the same time, head tracker 1220 also updates the room model of room modeler 1230. In one preferred embodiment, head tracker 1220 does not update the room model in the room modeler 1230, but instead simply informs the room modeler 1230 that the wearer's head has rotated substantially, such that the model should be considered stale and therefore should be rebuilt.
This preferred embodiment is suitable for investigative documentary of multiple subjects, such as capturing a conversation between two corrupt security guards or the like. Such discussions often take place standing still. The wearer stands at one place in a room, and two or more guards stand apart, facing the wearer. As the wearer turns to a first guard, the wearer centers the first guard in the aimer, and presses a small "lock on" button concealed in his or her jacket. This target sound source remains locked for a portion of the conversation until a second guard begins to speak. Then the wearer turns his or her head to the other guard, and the head tracker 1220 senses this abrupt turning of the head, and informs the room modeler and microphone processor to update. Since it is customary for multiple speakers to let each other speak (e.g. it is unusual for two or more people to speak at once), the system can adapt to (learn) the two sides of the conversation, as well as a third side, namely that of the wearer by way of microphone 1212. Thus the system can output three signals, 1241, 1242, and 1243, where signal 1241 is the wearer's voice, 1242 is the first guard's and 1243 is the second guard's.
Since the first and second guard are not normally speaking at the same time, the system can simply switch an output between 1242 and 1243 as a crude but functional embodiment of the conversation separation aspect of the invention. This allows a three track recording system to capture separately the three sides of the conversation.
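The crude switching embodiment described here can be sketched as a per-frame energy comparison between the two steered outputs 1242 and 1243, with a short hold-over so that brief pauses do not toggle the output (the frame representation and hold count are illustrative assumptions):

```python
import numpy as np


def switch_conversation_tracks(frames_a, frames_b, hold=3):
    """Crude conversation separator: per frame, route whichever of two
    steered outputs (signals 1242 and 1243) currently carries more
    energy, holding the current choice for `hold` frames to avoid
    chattering during brief pauses.

    frames_a, frames_b: sequences of equal-length numpy frames
    Returns a list of labels, one per frame: 0 selects frames_a,
    1 selects frames_b.
    """
    labels, current, dissent = [], 0, 0
    for a, b in zip(frames_a, frames_b):
        energy_a = float(np.sum(a * a))
        energy_b = float(np.sum(b * b))
        winner = 0 if energy_a >= energy_b else 1
        if winner == current:
            dissent = 0
        else:
            dissent += 1
            if dissent >= hold:  # only switch after `hold` contrary frames
                current, dissent = winner, 0
        labels.append(current)
    return labels
```

A three-track recorder would then write signal 1241 continuously and route each frame of the steered output to track 1242 or 1243 according to the label, reflecting the observation that the two guards rarely speak at once.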
A speech recognizer running on WearComp 999, or offline, can then produce an annotated transcript of the conversation and synchronize this with notes typed by wearer 1290 on a one handed keyboard, such as that manufactured by Handykey Corporation (www.handykey.com) concealed in a pocket. Thus a documentary video can be captured in which conversational elements are separated for easier indexing and annotation.
Moreover, such a system is of great benefit to one with memory disability, with regard to the photographic memory recall possible.
From the foregoing description, it will thus be evident that the present invention provides a design for a wearable camera with a viewfinder. As various changes can be made in the above embodiments and operating methods without departing from the spirit or scope of the invention, it is intended that all matter contained in the above description or shown in the accompanying drawings should be interpreted as illustrative and not in a limiting sense.
Variations or modifications to the design and construction of this invention, within the scope of the invention, may occur to those skilled in the art upon reviewing the disclosure herein. Such variations or modifications, if within the spirit of this invention, are intended to be encompassed within the scope of any claims to patent protection issuing upon this invention.
Claims (20)
1. A look direction microphone system for use as a hearing aid, memory aid, or the like, said look direction microphone system comprising:
~ a plurality of transducers for receiving waves from at least one source, including, if present, echoes and reverberations thereof, and for generating at least one signal in response thereto; wherein each of said plurality of transducers receives waves from said at least one source, including, if present, echoes and reverberations thereof, and generates one of said at least one signal;
~ a wearable mount for holding said plurality of transducers in a fixed spatial relationship with respect to one another;
~ one or more delayers, each for receiving one of said plurality of signals and for generating a delayed signal in response thereto;
~ an aimer for being operably fixed relative to said wearable mount;
~ a signal processor for enhancing waves coming from a look direction of said aimer relative to waves coming from other directions, said signal processor responsive to inputs from said plurality of transducers.
2. The look direction microphone system of claim 1, the processing performed by said signal processor comprising two steps, a first step of ignoring echoes and processing as if under anechoic conditions, and a second step of echo cancellation to compensate for room acoustics, said first step being directed at enhancing sound waves coming from a look direction of said aimer when said look direction microphone system is worn by a user of said look direction microphone system.
3. The look direction microphone system of claim 1, said aimer for aiming a beam of ultrasonic waves at said source, said microphones responsive to audible sound as well as ultrasonic waves, processing performed by said signal processor comprising a first step of modeling room acoustics at ultrasonic frequencies, to build an ultrasonic model, and a second step of using said ultrasonic model as a starting value for an adaptive room acoustics model for capturing audible sound from said source.
4. The look direction microphone system of claim 1, where said aimer is a vitrionic display, said vitrionic display including a transparent material through which a user of said vitrionic display may look at objects within said user's field of view, and at least one light source embedded in said transparent material, said transparent material for fixing with respect to said wearable mount.
5. The look direction microphone system of claim 1, where said aimer comprises an eye tracker, said eye tracker for fixing with respect to said wearable mount.
6. A conversation capture system using the look direction microphone system of claim 5, said conversation capture system for producing sounds from a plurality of sound sources, and for identifying among said plurality of sound sources, said identifying responsive to an output from said eye tracker.
7. The look direction microphone system of claim 1, where said aimer comprises a head tracker, said head tracker comprising a camera for fixing with respect to said wearable mount, said signal processor responsive to an input from said camera.
8. A conversation capture system using the look direction microphone system of claim 7, said conversation capture system for producing sounds from a plurality of sound sources, and for identifying among said plurality of sound sources, said identifying responsive to an output from said head tracker.
9. The look direction microphone system of Claim 4, where said light source is a diode, said diode emitting light.
10. The look direction microphone system of Claim 9, where said diode is a solid state laser diode operating at a wavelength between 600 nm and 700 nm.
11. The look direction microphone system of Claim 4, including a plurality of light sources embedded in said transparent material.
12. The look direction microphone system of Claim 7, said wearable mount comprising eyeglass frames, said frames bearing a display, said display responsive to an input from said camera.
13. The look direction microphone system of Claim 4, where said transparent material is the lens material of an eyeglass lens.
14. The look direction microphone system of Claim 4 having a listen direction aligned with a circle of confusion of said at least one light source, said circle of confusion being visible to a wearer of said look direction microphone system.
15. The look direction microphone system of Claim 13 where said eyeglass lens contains at least one electric light source embedded in lens material of said eyeglass lens, and where said eyeglass lens also contains at least two pieces of wire, where said pieces of wire are also embedded at least partially inside lens material of said eyeglass lens, and where said pieces of wire are connected to at least two separate terminals of said light source.
16. The look direction microphone system of claim 15, where said pieces of wire are collinear.
17. The look direction microphone system of claim 15, where said eyeglass lens is mounted in one side of an eyeglass frame made for accepting two eyeglass lenses, and in which an eyeglass lens of essentially identical appearance, and containing at least dummy copies of said pieces of wire, is mounted in the other side of said eyeglass frame.
18. A look direction microphone system for use as a hearing aid, memory aid, or the like, said look direction microphone system comprising:
~ a plurality of transducers for receiving waves from at least one source, including, if present, echoes and reverberations thereof, and for generating at least one signal in response thereto, wherein each of said plurality of transducers receives waves from said at least one source, including, if present, echoes and reverberations thereof, and generates one of said at least one signal;
~ a wearable mount for holding said plurality of transducers in a fixed spatial relationship with respect to one another;
~ a first orienter fixed with respect to said wearable mount;
~ one or more delayers, each for receiving one of said plurality of signals and for generating a delayed signal in response thereto;
~ an aimer;
~ a second orienter fixed with respect to said aimer;
~ a signal processor for enhancing waves coming from a look direction of said aimer relative to waves coming from other directions, said signal processor responsive to inputs from said plurality of transducers, said signal processor responsive to the relative orientation of said first orienter and second orienter.
19. A look direction microphone system for use as a hearing aid, memory aid, collaboration aid, or the like, said look direction microphone system comprising:
~ a first set of transducers, said first set comprising one or more transducers for receiving waves from at least one source, including, if present, echoes and reverberations thereof, wherein each of said first set of transducers receives waves from said at least one source, including, if present, echoes and reverberations thereof;
~ a first wearable mount for holding said first set of transducers in a fixed spatial relationship with respect to one another;
~ a second set of transducers, said second set comprising one or more transducers for receiving waves from said at least one source, including, if present, echoes and reverberations thereof, wherein each of said second set of transducers receives waves from said at least one source, including, if present, echoes and reverberations thereof;
~ a second wearable mount for holding said second set of transducers in a fixed spatial relationship with respect to one another;
~ one or more delayers, each for receiving one of a plurality of signals from said first and second sets of transducers, and for generating a delayed signal in response thereto;
~ an aimer for being of known orientation with respect to at least one of said wearable mounts;
~ a signal processor for enhancing waves coming from a look direction of said aimer relative to waves coming from other directions, said signal processor responsive to inputs from said first set of transducers and said second set of transducers.
20. A method of sound capture comprising the steps of:
~ visually sighting a sound source target in an aimer;
~ delaying outputs from a plurality of microphones, such that the relative delays correspond to a direction of arrival as indicated by said aimer;
~ supplying these delayed outputs to a processor for further processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002297344A CA2297344A1 (en) | 1999-02-01 | 2000-01-31 | Look direction microphone system with visual aiming aid |
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002261376A CA2261376A1 (en) | 1998-02-02 | 1999-02-01 | Means and apparatus for acquiring, processing, and combining multiple exposures of the same scene or objects to different illuminations |
CA2,261,376 | 1999-02-01 | ||
CA2,264,973 | 1999-03-15 | ||
CA002264973A CA2264973A1 (en) | 1998-03-15 | 1999-03-15 | Eye-tap for electronic newsgathering, documentary video, photojournalism, and personal safety |
CA002267877A CA2267877A1 (en) | 1998-04-14 | 1999-04-01 | Means and apparatus for collegial identification of persons such as officials asking for identification |
CA2,267,877 | 1999-04-01 | ||
CA2,280,425 | 1999-08-16 | ||
CA002280425A CA2280425C (en) | 1998-10-13 | 1999-08-16 | Aremac incorporating a focus liberator so that displayed information is in focus regardless of where the lens of an eye of a user is focused |
CA002297344A CA2297344A1 (en) | 1999-02-01 | 2000-01-31 | Look direction microphone system with visual aiming aid |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2297344A1 true CA2297344A1 (en) | 2000-08-01 |
Family
ID=31892275
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002297344A Abandoned CA2297344A1 (en) | 1999-02-01 | 2000-01-31 | Look direction microphone system with visual aiming aid |
Country Status (1)
Country | Link |
---|---|
CA (1) | CA2297344A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2335425A4 (en) * | 2008-09-25 | 2012-05-23 | Alcatel Lucent Usa Inc | Self-steering directional hearing aid and method of operation thereof |
EP2335425A2 (en) * | 2008-09-25 | 2011-06-22 | Alcatel-Lucent USA Inc. | Self-steering directional hearing aid and method of operation thereof |
US10743121B2 (en) | 2013-06-14 | 2020-08-11 | Oticon A/S | Hearing assistance device with brain computer interface |
CN104244157A (en) * | 2013-06-14 | 2014-12-24 | 奥迪康有限公司 | A hearing assistance device with brain-computer interface |
EP2813175A3 (en) * | 2013-06-14 | 2015-04-01 | Oticon A/s | A hearing assistance device with brain-computer interface |
US9210517B2 (en) | 2013-06-14 | 2015-12-08 | Oticon A/S | Hearing assistance device with brain computer interface |
CN104244157B (en) * | 2013-06-14 | 2020-03-17 | 奥迪康有限公司 | Hearing aid device with brain-computer interface |
US11185257B2 (en) | 2013-06-14 | 2021-11-30 | Oticon A/S | Hearing assistance device with brain computer interface |
WO2015047593A1 (en) * | 2013-09-24 | 2015-04-02 | Nuance Communications, Inc. | Wearable communication enhancement device |
US9848260B2 (en) | 2013-09-24 | 2017-12-19 | Nuance Communications, Inc. | Wearable communication enhancement device |
CN106797519A (en) * | 2014-10-02 | 2017-05-31 | 索诺瓦公司 | The method that hearing auxiliary is provided between users in self-organizing network and correspondence system |
US10698109B2 (en) | 2018-01-19 | 2020-06-30 | Sony Corporation | Using direction of arrival with unique audio signature for object location detection |
EP4085655A1 (en) * | 2020-01-03 | 2022-11-09 | Orcam Technologies Ltd. | Hearing aid systems and methods |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10447966B2 (en) | Non-interference field-of-view support apparatus for a panoramic sensor | |
JP6094190B2 (en) | Information processing apparatus and recording medium | |
US10595149B1 (en) | Audio augmentation using environmental data | |
EP1064783B1 (en) | Wearable camera system with viewfinder means | |
CN101165538B (en) | Imaging display apparatus and method | |
US9927948B2 (en) | Image display apparatus and image display method | |
US6307526B1 (en) | Wearable camera system with viewfinder means | |
US9182598B2 (en) | Display method and display apparatus in which a part of a screen area is in a through-state | |
EP1625745B1 (en) | Mirror assembly with integrated display device | |
US20020085843A1 (en) | Wearable camera system with viewfinder means | |
US20100074460A1 (en) | Self-steering directional hearing aid and method of operation thereof | |
CA2388766A1 (en) | Eyeglass frames based computer display or eyeglasses with operationally, actually, or computationally, transparent frames | |
JP2012029209A (en) | Audio processing system | |
JP2001522063A (en) | Eyeglass interface system | |
US11758347B1 (en) | Dynamic speech directivity reproduction | |
US20020067271A1 (en) | Portable orientation system | |
US12028419B1 (en) | Systems and methods for predictively downloading volumetric data | |
US20240098409A1 (en) | Head-worn computing device with microphone beam steering | |
US10674259B2 (en) | Virtual microphone | |
CA2297344A1 (en) | Look direction microphone system with visual aiming aid | |
TWI768175B (en) | Hearing aid system with radio scene switching function | |
CA2247649C (en) | Covert camera viewfinder or display having appearance of ordinary eyeglasses | |
CA2290765C (en) | Covert camera viewfinder or display having appearance of ordinary eyeglasses | |
EP4432053A1 (en) | Modifying a sound in a user environment in response to determining a shift in user attention | |
US11734905B1 (en) | Systems and methods for lighting subjects for artificial reality scenes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued |