
WO2024125478A1 - Audio presentation method and device - Google Patents

Audio presentation method and device

Info

Publication number
WO2024125478A1
WO2024125478A1 (PCT/CN2023/138019)
Authority
WO
WIPO (PCT)
Prior art keywords
audio content
user
audio
information
posture information
Prior art date
Application number
PCT/CN2023/138019
Other languages
English (en)
French (fr)
Inventor
高玮隆
徐艺晨
金烨鑫
韩佳
Original Assignee
索尼(中国)有限公司
索尼集团公司
Priority date
Filing date
Publication date
Application filed by 索尼(中国)有限公司 and 索尼集团公司
Priority to CN202380083647.2A (CN120322747A)
Publication of WO2024125478A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/016: Input arrangements with force or tactile feedback as computer generated output to the user
    • G06F 3/16: Sound input; Sound output
    • G06F 3/162: Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs

Definitions

  • the present disclosure relates to audio signal processing, and in particular to audio signal rendering.
  • the present disclosure provides for optimizing audio signal presentation, in particular optimizing audio signal presentation for a specific user.
  • the present disclosure also provides optimized interactive audio signal presentation.
  • a receiving-side device for interactive audio presentation comprising a processing circuit configured to: receive relevant information about audio content to be presented from a control-side device for interactive audio presentation, wherein the audio content to be presented comprises audio content set based on user posture information, and present the audio content, wherein presenting the audio content comprises presenting the audio content in a tactile manner.
  • a control-side device for interactive audio presentation comprising a processing circuit, configured to: obtain audio content presentation indication information, the audio content presentation indication information comprising indication information based on posture information of a user to whom the audio is to be presented, and send relevant information of the audio content to be presented to a receiving-side device for audio interactive presentation, wherein the audio content to be presented comprises audio content set based on the posture information of the user.
  • a method for a receiving side for interactive audio presentation comprising: receiving relevant information of audio content to be presented from a control side device for interactive audio presentation, wherein the audio content to be presented comprises audio content set based on user posture information, and presenting the audio content, wherein presenting the audio content comprises presenting the audio content in a tactile manner.
  • a control side method for interactive audio presentation comprising: obtaining audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and sending relevant information of the audio content to be presented to a receiving side device for audio interactive presentation, wherein the audio content to be presented includes audio content set based on the posture information of the user.
  • a device comprising at least one processor and at least one storage device, wherein the at least one storage device has program codes and/or instructions stored thereon, which when executed by the at least one processor may enable the at least one processor to perform the method described herein.
  • a storage medium storing program codes and/or instructions.
  • when the program codes and/or instructions are executed by a processor, the method described herein may be performed.
  • a program product includes program codes and/or instructions, which when executed by a processor may cause the processor to perform the method as described herein.
  • a computer program comprising program codes and/or instructions, which when executed by a processor may cause the processor to perform the method as described herein.
  • FIG. 1 shows a conceptual diagram of audio presentation according to an embodiment of the present disclosure.
  • FIG. 2A shows a conceptual diagram of an interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 2B shows a flowchart of an interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 3A shows a block diagram of a receiving-side device for interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 3B shows a flowchart of a receiving-side method for interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 4A shows a block diagram of a control-side device for interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 4B shows a flowchart of a control-side method for interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 5 shows a conceptual flow chart of setting audio content to be presented according to an embodiment of the present disclosure.
  • FIGS. 6A to 6C show schematic diagrams of exemplary gesture detection.
  • FIG. 7A is a schematic diagram showing an exemplary gesture of a first user (a performer or player) according to the present disclosure.
  • FIG. 7B is a schematic diagram showing an exemplary gesture of a second user (listener) according to the present disclosure.
  • FIG. 8 illustrates a graph of an exemplary audio conversion according to an embodiment of the present disclosure.
  • FIG. 9 shows an exemplary implementation of a receiving-side device according to an embodiment of the present disclosure.
  • FIG. 10 shows an exemplary implementation scenario according to an embodiment of the present disclosure.
  • FIG. 11 illustrates a block diagram showing an exemplary hardware configuration of a computer system capable of implementing an embodiment of the present disclosure.
  • the terms “first”, “second”, etc. are used merely to distinguish elements or steps, and are not intended to indicate time sequence, priority or importance.
  • the system provides an improved interactive audio presentation solution, especially for the hearing impaired.
  • the present disclosure provides an improved audio presentation solution, in particular, capable of presenting audio content to hearing-impaired people in a tactile manner. More particularly, the audio content to be presented can be provided to the listener as corresponding vibrations via a tactile providing device worn by the listener.
  • the present disclosure proposes an improved interactive audio presentation solution; in particular, the audio content, such as the audio content being played, can be affected by detecting specific inputs of the listener, so that the audio content can be presented to the listener in a manner, rhythm, etc. closer to what the user desires. More particularly, the present disclosure proposes to affect the audio content according to gesture information, so as to achieve more convenient interaction.
  • the audio content mentioned in the context of the present disclosure can be in various suitable forms, and as an example can be related to audio, which can cover any suitable type of music signal, such as music melody, track, phoneme, sequence, sound effect, etc.
  • the audio content can correspond to a complete piece of music or a part thereof, or even to a music clip corresponding to a specific user input, such as a specific user gesture.
  • FIG. 1 shows a conceptual diagram of audio presentation according to an embodiment of the present disclosure.
  • the audio presentation according to an embodiment of the present disclosure is particularly suitable for hearing-impaired people, and the audio presentation can be implemented based on gesture information.
  • data/information related to audio presentation is collected.
  • gesture information of members participating in the audio presentation is collected.
  • the personnel participating in the audio presentation may include listeners, especially hearing-impaired people.
  • the personnel participating in the audio presentation may also include specific members responsible for the audio presentation, such as a host, a DJ, a performer, etc.
  • the collected data/information related to the audio presentation may also include other data/information, such as parameter information of members (including such as identity ID, etc.), instructions for starting and/or stopping the audio presentation, other data/information related to the audio presentation control, etc.
  • audio processing is performed based on the collected data/information.
  • the audio content to be presented can be set based on the collected data/information, especially the gesture information, which will be described in detail later.
  • the set audio content to be presented is presented to the audience.
  • the audio content presentation can be achieved through tactile means.
  • audio presentation can also be achieved through other means. For example, it can be presented to the audience through video, visual effects, lighting effects, etc., so as to enrich the audio presentation effect.
  • FIG. 2A shows a conceptual diagram of the interactive audio presentation according to an embodiment of the present disclosure.
  • FIG. 2B shows a flowchart of the interactive audio presentation according to an embodiment of the present disclosure.
  • the interactive audio presentation according to some embodiments of the present disclosure can be applicable to various application scenarios, such as live music scenes in music bars and other gathering places where users participate in musical interaction.
  • a scene includes a first user and a second user.
  • the first user can be a person who is responsible for or leads the audio presentation in the scene, such as a host, DJ, etc., who can start, pause, end, set, and adjust the audio content to be presented to the user.
  • the second user can be the audio presentation object of the scene, such as a customer, participant, listener, etc. in a music bar or gathering place.
  • at least one of the first and second users can especially be a hearing-impaired person.
  • the first and second users may not be physically present in the scene; for example, they may be people who participate in the music through the network, cloud, etc.
  • the audio content is set by acquiring the posture information of the first user, and then the set audio content is presented to at least one of the first user and the second user.
  • audio content can be generated or created based on the posture information of the first user, such as audio content corresponding to the posture information of the first user, such as audio content combined from audio units corresponding to the posture information of the first user.
  • the posture information of the first user may only indicate the start, pause, stop, etc. of the audio presentation, so that when the user posture information indicates the start, specific audio content can be presented, such as playing specific music, such as pre-set audio/music.
  • the audio content can be presented to the user in various appropriate ways, such as tactile, video, visual effects, special effects, lighting, etc., which will not be described in detail here.
  • the audio content is adjusted by acquiring the gesture information of the second user, for example, the audio/music being played is adjusted, and then the adjusted audio content is presented to at least one of the first user and the second user.
  • the adjustment of the audio content can be implemented in various appropriate ways, such as adjusting the volume, melody, etc. of the audio playback, and such adjustment can be reflected in the tactile implementation accordingly.
  • audio data processing can be implemented in an appropriate manner, for example, it can be implemented by software, hardware, firmware, etc. It can be located on the control side of a system for audio presentation, and implemented by a control side device for audio presentation, such as a server, a control device, etc. in a network.
  • the user to whom the audio is presented can correspond to the receiving side of the system, which can be configured with a receiving side device to receive audio content so as to present it to the user in an appropriate manner.
  • the receiving side device can cooperate with various presentation devices, such as tactile, visual effects, lighting and other presentation devices to present it to the user.
  • the presentation device can also be included in the receiving side device.
  • the receiving side devices for the first user and the second user may be different.
  • different receiving side devices and/or audio content presentation devices may be used for different users according to the needs of the users.
  • the receiving side devices for the first user and the second user may also be the same.
  • such a receiving side device and/or an audio content presentation device may be able to set functions separately, so that different function configurations can be set for different users according to the needs of the users, for example, certain functions may be turned on or off accordingly for different users.
  • the receiving-side device 300 includes a processing circuit 302, which is configured to receive relevant information about audio content to be presented, wherein the audio content to be presented may include audio content set based on user gesture information, and to cause the audio content to be presented, wherein causing the audio content to be presented may include causing the audio content to be presented in a tactile manner.
  • audio content may be provided to a receiving-side device in an appropriate manner, so that the audio content-related information may be in various appropriate forms accordingly.
  • the audio content can be in various suitable formats and directly provided to the receiving side as audio content related information.
  • the audio content is music to be played, which can be in various suitable music formats, such as MP3, MIDI, other suitable formats, etc., and is directly sent to the receiving side.
  • the audio content related information may be information indicating the audio content, such as an index of the audio content.
  • the audio content and the audio index may be pre-associated and stored, and the corresponding audio content may be called according to the audio index during the application process.
  • the audio content related information may be information/data obtained by converting the audio content.
  • the audio content can be converted into information/data suitable for the presentation method on the control side and then transmitted to the receiving side as relevant information.
  • the audio content to be presented can be set in various appropriate ways, including but not limited to generation, creation, adjustment, etc. In particular, it can be set based on the user's posture information.
  • the user's posture information includes at least one of the user's posture (including, for example, the posture of a specific part, the spatial position, etc.), and the posture motion information, wherein the posture motion information includes at least one of the posture motion trajectory and the motion acceleration.
  • the posture motion information may include the moving direction, moving speed, moving acceleration, moving frequency, etc. of a specific posture.
  • the user posture may include a specific gesture, a spatial position, etc.
  • the gesture motion may refer to how a specific gesture moves, such as the manner, speed, direction, and frequency of its swing.
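  • As a non-limiting illustration, the posture information described above could be structured as in the following Python sketch; every field name and type here is an assumption for illustration, not part of the disclosure.

```python
# A minimal sketch of one way to hold the posture information described above;
# all field names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class PostureInfo:
    user_id: str                 # user parameter information, e.g. identity ID
    gesture: str                 # e.g. "fist", "open_hand", "swipe_left"
    position: tuple = (0.0, 0.0, 0.0)               # spatial position
    trajectory: list = field(default_factory=list)  # posture motion trajectory
    speed: float = 0.0           # moving speed
    acceleration: float = 0.0    # moving acceleration
    frequency: float = 0.0       # moving frequency, e.g. swing frequency in Hz
```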
  • the application scenario of the receiving side device may include various types of users, especially including the first user and the second user, as described above.
  • the audio content may include audio content set based on the posture information of at least one of the first user and the second user.
  • FIG. 5 shows a conceptual flow chart of the setting of the audio content to be presented according to an embodiment of the present disclosure.
  • the user's posture is acquired or detected, and information and/or data related to the user's posture is generated, thereby setting the audio content to be presented so as to be presented to the user.
  • the setting of the audio content to be presented can usually be implemented on the control side of the system, especially by the control side device.
  • the acquisition or detection of the user gesture may be performed in various appropriate ways.
  • the user's gesture may be acquired by video acquisition, image capture, and the like.
  • the user's movements may be acquired by a camera, and then gesture analysis may be performed on the acquired image or video of the user's movements to acquire information/data related to the user's gesture.
  • this may be achieved by camera motion capture, camera color capture, and the like.
  • a specific color or a specific label may be set on a specific part of the user, and then the movement of the corresponding part may be acquired by camera color recognition.
  • a patch of a specific color may be attached to at least one finger of the user, and then the information/data related to the user's finger gesture/movement may be captured by camera color recognition, as shown in FIG. 6A.
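  • As one hedged sketch of such camera color capture, the following Python/OpenCV code segments a colored finger patch in HSV space and tracks its centroid frame by frame; the HSV bounds and frame limit are illustrative assumptions.

```python
# Illustrative color-capture sketch with OpenCV: a colored patch on a finger
# is isolated by HSV thresholding and its centroid recorded per frame.
import cv2
import numpy as np

LOWER = np.array([0, 120, 120])    # assumed lower HSV bound for a red patch
UPPER = np.array([10, 255, 255])   # assumed upper HSV bound

def track_patch(video_source=0, max_frames=300):
    """Return the patch centroid (x, y) for up to max_frames frames."""
    cap = cv2.VideoCapture(video_source)
    path = []
    for _ in range(max_frames):
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, LOWER, UPPER)      # keep patch-colored pixels
        m = cv2.moments(mask)
        if m["m00"] > 0:                           # patch visible in this frame
            path.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    cap.release()
    return path
```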
  • in an on-site scenario, the pre-installed cameras in the party venue can be used to capture the postures of the users in the party scene; in remote scenarios, the posture of each user can be captured using the user's own camera and then uploaded to the network, so that gesture capture can be completed on the server side or in the cloud.
  • the user's gesture may be acquired through camera skeleton capture.
  • for example, the movement of a specific part of the user's hand, such as the overall outline of the fingers, the skeleton, etc., may be tracked.
  • the movement state of the finger skeleton may be detected by a device such as a projector through a specific algorithm to acquire the finger gesture.
  • the user may wear a specific gesture capture device, such as a motion capture sensor, a gyroscope, etc., and then obtain information/data related to the user's gesture based on the data of the gesture capture device, as shown in FIG. 6C.
  • in some embodiments, the user gesture information to be obtained is the user's hand gesture information, and the gesture capture device may include a motion capture device that can be worn on at least one finger of the user, the gesture information being based on the gesture information of each finger wearing the motion capture device and/or a combination thereof.
  • the information/data related to the user's posture can be regarded as being obtained on the receiving side and provided to the control side for audio content setting.
  • the receiving side device is further configured to obtain the user's posture information determined via the posture capture device, and send the obtained user posture information to the control side device.
  • the receiving side device can also provide other appropriate information, such as parameter information of the user, such as user ID, etc.
  • gesture capture and conversion can be an exemplary implementation in a network scenario.
  • the user waves his hand in front of the camera, so that the movement of the user's fingers can be captured by the computer camera.
  • the movement state and trajectory of the user's fingers can be determined by comparing the pixel differences between adjacent pictures, and finger motion data is generated accordingly.
  • Such data can be represented and stored in various appropriate ways, for example, including the data number of each finger, and corresponding data, including but not limited to swing speed, swing position, time point, etc.
  • the user's finger posture, etc. can be determined in this way. This can be achieved in various ways known in the art, which will not be described in detail here.
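  • A minimal sketch of the adjacent-frame comparison just described is given below: motion energy between consecutive grayscale frames yields a per-frame movement measure from which swing speed and time points can be derived. The difference threshold is an assumption.

```python
# Sketch of pixel differencing between adjacent frames to obtain motion data.
import cv2
import numpy as np

def motion_series(frames, threshold=25):
    """frames: iterable of BGR images; returns (frame_index, energy) pairs."""
    prev = None
    series = []
    for i, frame in enumerate(frames):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            diff = cv2.absdiff(gray, prev)             # pixel differences
            energy = float(np.mean(diff > threshold))  # fraction of moving pixels
            series.append((i, energy))
        prev = gray
    return series
```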
  • the corresponding audio content is determined based on the determined motion data of the user's fingers, for example by converting it into MIDI and performing musical processing.
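  • The conversion into MIDI could, for example, look like the following sketch using the mido library; the finger-to-pitch table and velocity scaling are illustrative assumptions, not the disclosure's mapping.

```python
# Hedged sketch: turn (finger, swing_speed) events into a MIDI file with mido.
import mido

FINGER_NOTE = {0: 60, 1: 62, 2: 64, 3: 65, 4: 67}  # assumed finger->pitch map

def motion_to_midi(events, path="gesture.mid", ticks_per_swing=240):
    """events: list of (finger_id, swing_speed in 0..1) in time order."""
    mid = mido.MidiFile()
    track = mido.MidiTrack()
    mid.tracks.append(track)
    for finger, speed in events:
        note = FINGER_NOTE[finger]
        velocity = max(1, min(127, int(speed * 127)))  # faster swing -> louder
        track.append(mido.Message("note_on", note=note, velocity=velocity, time=0))
        track.append(mido.Message("note_off", note=note, velocity=0,
                                  time=ticks_per_swing))
    mid.save(path)
```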
  • corresponding audio content may be set based on the acquired relevant information/data of the user's gesture to be presented to the user.
  • the audio content to be presented may include audio content constructed based on audio units or a specific combination corresponding to the posture information of the first user.
  • the audio content may be set based on the association or correspondence between the posture data and the audio unit.
  • the audio unit may be a component unit of the audio content, for example, corresponding to at least one of a phoneme, a sequence of sounds, an audio segment, etc.
  • at least one piece of the user's gesture information may be obtained.
  • the audio units corresponding to the user's various gestures can be combined to generate the audio content to be presented.
  • the combined audio content can also be appropriately processed, such as filtering, smoothing, etc.
  • the correlation/correspondence between user gestures and audio units can be pre-constructed, for example, various gestures can be trained and corresponding audio units can be set for each gesture.
  • the first user can provide relatively fine gesture information, such as gesture information of multiple fingers, and perform corresponding control for audio content, such as controlling multiple tracks of audio, generating more accurate audio content, thereby more accurately presenting audio.
  • user gestures can be stored in a database in association with corresponding audio units.
  • User gestures and audio units can be stored in various appropriate ways. For example, each user gesture and the audio unit corresponding thereto can be stored in a list manner, in a mapping manner, etc.
  • user gestures, corresponding audio units, user gesture change modes, corresponding audio unit change modes, etc. can be included in the database, but are not limited thereto, as long as audio content can be generated and/or changed based on the acquired user gestures using the data stored in the database.
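  • For illustration only, such a stored association could be as simple as the mapping below; the gesture names and audio-unit files are hypothetical.

```python
# Sketch of a gesture-to-audio-unit association stored as a mapping.
GESTURE_AUDIO_UNITS = {
    "fist":        "units/drum_hit.wav",
    "open_hand":   "units/pad_chord.wav",
    "swipe_left":  "units/bass_phrase.wav",
    "swipe_right": "units/lead_phrase.wav",
}

def units_for_gestures(gestures):
    """Look up the audio unit for each recognized gesture, skipping unknowns."""
    return [GESTURE_AUDIO_UNITS[g] for g in gestures if g in GESTURE_AUDIO_UNITS]
```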
  • the audio content setting can be performed in an appropriate manner.
  • a machine learning or deep learning algorithm can be used to set the audio content based on the posture data, so that the audio MIDI signal converted according to the posture data is better filtered and smoothed to enhance its musicality.
  • the machine learning or deep learning algorithm may include various algorithms known in the art, which will not be described in detail here.
  • the machine learning or deep learning algorithm may also be pre-trained based on the training data, and the training can be performed in various appropriate ways, which will not be described in detail here.
  • the trained AI model takes the posture data as input and outputs a MIDI signal for presenting audio.
  • the trained AI model takes as input the posture data of multiple users and the initial audio content on the performance side, and outputs an adjusted MIDI signal for presenting audio, enabling the audience to co-create music.
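  • The disclosure does not fix a model architecture; purely as a sketch of the "posture data in, MIDI values out" idea, a small PyTorch model could look as follows, with all dimensions and output mappings being assumptions.

```python
# Illustrative PyTorch sketch mapping a posture feature vector to MIDI values.
import torch
import torch.nn as nn

class PostureToMidi(nn.Module):
    def __init__(self, posture_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(posture_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),        # two outputs: pitch and velocity scales
        )

    def forward(self, posture):
        out = torch.sigmoid(self.net(posture))
        pitch = 36 + out[..., 0] * 48     # map into MIDI pitch range 36..84
        velocity = 1 + out[..., 1] * 126  # map into MIDI velocity range 1..127
        return pitch, velocity
```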
  • the audio content to be presented may include specific audio content specified by the user gesture information.
  • the specific user gesture may correspond to the specific audio content, so that when the specific user gesture is detected, the complete audio content can be directly sent to the receiving side device for presentation.
  • the user's gesture information may also correspond to audio content presentation indication information, which may, for example, indicate a specific operation of the audio presentation, such as start, pause, stop, etc., so that when the gesture information is detected, a corresponding operation may be performed on the audio content presentation.
  • audio content here may be pre-set or associated with the user's gesture.
  • FIG. 7A shows a schematic diagram of an exemplary posture of the first user (performer) according to the present disclosure, wherein different postures may correspond to different music presentation operations, such as continuous movement may correspond to performance, clenching a fist may correspond to drumming, and music recording start, pause, end, and other operations may also correspond to other postures.
  • it may correspond to the generation and/or creation of audio content dominated by the first user, such as a performer, player, host, etc., in the audio content presentation scenario.
  • the created or generated audio content can be presented to the user in various appropriate ways.
  • the relevant information of the audio content is converted into data suitable for an audio presentation device; and the converted data is provided to the audio presentation device.
  • the converted data can be driving data or input data of the audio presentation device, so that the audio presentation device can present the audio content to the user in a specific manner.
  • Data conversion can be achieved in various appropriate ways.
  • audio data can be changed into tactile data in various appropriate ways, such as using an analog signal method, an FFT (Fast Fourier Transform) filtering method, and the like.
  • FIG. 8 shows a schematic diagram of data conversion, wherein different types of music data are converted to obtain respective waveform data used to drive the audio presentation device. Since the obtained waveform data can often reflect the characteristics of different types of music data, the audio presentation device can accurately present the characteristics, melody, etc. of the music to the user.
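  • As a hedged sketch of the FFT filtering route from audio samples to a haptic drive signal: keep one frequency band per window, take its magnitude, and use the normalized result as vibration intensity. The band edges and window size below are assumptions.

```python
# Sketch: windowed FFT band energy of an audio signal as a vibration envelope.
import numpy as np

def haptic_envelope(samples, sr, band=(60.0, 250.0), win=1024):
    """samples: mono float array; returns one drive value in 0..1 per window."""
    drive = []
    for start in range(0, len(samples) - win, win):
        frame = samples[start:start + win]
        spectrum = np.fft.rfft(frame)
        freqs = np.fft.rfftfreq(win, d=1.0 / sr)
        mask = (freqs >= band[0]) & (freqs <= band[1])  # FFT band filter
        drive.append(float(np.sum(np.abs(spectrum[mask]))))
    drive = np.asarray(drive)
    if drive.size == 0 or drive.max() == 0:
        return drive
    return drive / drive.max()                          # normalize to 0..1
```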
  • the receiving side device may be separated from the audio presentation device, and in other embodiments, the receiving side device may be integrated with the audio presentation device. In particular, the receiving side device may include the audio presentation device.
  • the audio content may be provided to the receiving side in various appropriate timing modes. In some embodiments, once a playable/presentable audio unit/segment is available based on the user's gesture information, it is sent to the receiving side. In other embodiments, a predetermined number of audio units/segments, or even the entire audio content, may be sent to the receiving side each time.
  • the audio content may also be audio content to be presented that is set in other ways, such as audio content that starts playing/presenting upon receiving a specific play/presentation instruction, or audio content that starts playing/presenting according to a preset order/instruction, such as audio content predetermined in a concert hall, on-site, etc., which will not be described in detail here.
  • the audio presentation device is a tactile sensation providing device, so that the audio presentation device provides the audio content to the user in a tactile manner.
  • the tactile providing device may include a tactile feedback device worn on at least one of the user's hand, wrist, arm, etc., such as a glove, wristband, armband, etc., which can provide tactile feedback to at least one finger, back of hand, wrist, arm, etc. of the user.
  • the received information/data can be directly forwarded to the tactile device.
  • the receiving side device can convert the audio content into information/data suitable for a tactile presentation mode, and then forward it to the tactile device.
  • the receiving side device can include a conversion unit, which is configured to convert the information/data obtained for the audio presentation mode for audio presentation.
  • the tactile sensation providing device may be implemented in various appropriate ways.
  • the tactile sensation providing device may include a vibrator that can provide vibrations corresponding to the characteristics of the audio content, such as melody, to the user, so that the hearing-impaired person can feel the melody of the music.
  • the tactile sensation providing device can be implemented by an inertial actuator, a piezoelectric semiconductor transducer, an electro-active polymer actuator (EAP), etc., which will not be described in detail here.
  • each user finger may correspond to a specific audio track to set (eg, generate or influence) a different timbre of the audio content.
  • the tactile sensation providing device includes at least one tactile unit, wherein each tactile unit may correspond to a specific audio track in the audio content to be presented.
  • the tactile sensation providing device may include a tactile feedback device in the form of a glove or a finger sleeve, and a vibration motor may be provided for at least one finger component to provide vibration feedback.
  • the vibration motor of each finger component of the glove or finger sleeve style tactile feedback device may provide vibration feedback according to the sound intensity and rhythm of the corresponding audio track.
  • the tactile feedback providing device can be set to provide tactile feedback only for audio tracks that are difficult for the hearing-impaired person to hear, such as tactile feedback for audio tracks of a specific frequency (such as a high-frequency audio track).
  • the correspondence between multiple fingers and multiple audio tracks can be predefined, so that the gesture of each collected finger can control the corresponding audio track, but the tactile feedback unit is only provided on the finger corresponding to the specific audio track, such as the high-frequency audio track, so that only the corresponding audio content of the specific audio track is tactilely fed back to the user.
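  • An illustrative finger-to-track table in the spirit of this embodiment is sketched below: every finger controls a track, but only fingers mapped to haptic units (here, the one on the high-frequency track) receive vibration. The table contents are assumptions.

```python
# Sketch: per-finger track control with haptic feedback on selected tracks only.
FINGER_TRACK = {"thumb": "drums", "index": "bass", "middle": "lead",
                "ring": "pad", "pinky": "high_freq"}
HAPTIC_FINGERS = {"pinky"}   # only the high-frequency track is fed back

def feedback_targets(active_tracks):
    """Return fingers whose vibration motor should fire for the given tracks."""
    return [f for f, t in FINGER_TRACK.items()
            if f in HAPTIC_FINGERS and t in active_tracks]
```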
  • the feedback can be mainly provided for audio frequencies and rhythms that the user is sensitive to.
  • the feedback device can be set to provide tactile feedback only for the audio track where the drum beats are located, which can enhance the user's experience of feeling the music: for example, while listening, the user feels feedback at specific rhythmic points of the melody, further improving the user experience.
  • where the tactile sensation providing device includes a glove or finger sleeve style tactile feedback device, it can also be appropriately configured to facilitate identification, control, and ease of user operation.
  • it can be pre-specified that a specific one or more fingers are dedicated to gesture control, while another one or more fingers are dedicated to tactile feedback.
  • the presentation device worn on the performer's fingers may correspond to that worn by the audience.
  • the presentation device worn by the performer's finger may be the same as that worn by the audience, where the performer's finger corresponds to the audience's finger, for example, the same finger corresponds to the same track.
  • the performer's finger-worn device may be different from that worn by the audience, but the correspondence is pre-set.
  • the receiving side device can also make the audio content be presented to the user in other appropriate ways.
  • it can be presented to the user in the form of sound, video display, lighting, visual effects, etc.
  • the receiving side device can convert the audio content into information/data suitable for other presentation modes, and then provide it to be forwarded to the corresponding presentation device.
  • the audio content can be converted and used in various presentation devices in common.
  • audio content can be provided to the user in an audio manner.
  • the audio content can be played to the user through a speaker, etc.
  • the audio content can be further processed before playing, such as converting the audio content into low-frequency content suitable for hearing-impaired people.
  • necessary audio playback software, audio playback equipment, etc. may also be included, which will not be described in detail here.
  • a speaker can be a speaker of a portable device, a speaker in a theater scene, a speaker set up in a KTV, a bar, a gathering place, etc., or a speaker of other appropriate types, etc.
  • the audio content may be provided to the user in a video format.
  • it may be provided to the user via a video presentation device.
  • it may be presented to the user via various types of screens, such as a projector, a computer screen, a screen of a portable device, etc., in various appropriate video forms.
  • Such videos may be video tracks, special effects, pictures, short videos, etc. corresponding to the audio content, and may be pre-set and stored.
  • audio can be presented through lighting effects, in particular, the lights of the presentation device can flash accordingly according to the rhythm of the audio content.
  • a presentation device can be fixedly arranged, such as a fixed screen, a flashing device, etc., or it can be portable, such as a screen of a portable device, a flashing device, such as a wristband, an ornament, etc.
  • the lighting effect can be achieved by an LED on an electronic wristband.
  • the present disclosure further proposes optimized interactive audio presentation.
  • audio interaction can be achieved based on user gestures.
  • the user's gestures can be obtained to adjust the audio content.
  • the performer can present the audio content to the audience as described above, and after the audience obtains the audio content, they can give feedback through their actions, such as expressing the user's emotions through actions, adaptively adjusting the audio content according to the user's actions, etc. In this way, user interaction can be achieved.
  • the audio content to be presented includes audio content obtained by adjusting the audio content based on the user's gesture information.
  • the adjustment includes at least one of the following: increasing or decreasing the volume of the audio content; adjusting the rhythm of the audio content; enhancing the effect of the audio content; adding additional effects to the audio content.
  • a user gesture may correspond to a specific audio unit, an audio clip, etc.
  • a specific user gesture may correspond to a modification to the specific audio unit, such as increasing or decreasing the intensity of the specific audio unit, changing the rhythm of the audio unit, etc.
  • the presentation effect of the audio clip may be adjusted accordingly.
  • a specific action may indicate that the presentation effect of the audio clip is to be increased, such as increasing the volume, increasing the tactile effect, etc.; or that the presentation effect of the audio clip is to be decreased, such as decreasing the volume, decreasing the tactile effect, etc.
  • the adjustment here may be performed as described above for the audio content modification.
  • for example, moving the entire hand to the left may indicate bass, to the right treble, upward a higher octave of a specific note, and downward a lower octave of that note.
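  • A minimal sketch of this positional mapping follows; the zone thresholds in normalized image coordinates are illustrative assumptions.

```python
# Sketch: map a hand position to the pitch adjustments described above.
def hand_position_to_adjustment(x, y):
    """x, y in 0..1 (left-to-right, top-to-bottom); returns an adjustment tag."""
    if y < 0.33:
        return "octave_up"      # hand raised: higher octave of the note
    if y > 0.66:
        return "octave_down"    # hand lowered: lower octave of the note
    return "bass" if x < 0.5 else "treble"
```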
  • the user can express his/her liking of the specific audio content through actions, such as through specific waving actions, etc. In this way, the emotion can be presented through video in the party scene.
  • the application scenario of interactive audio presentation is particularly suitable for achieving an influence on the presented audio content based on the gesture of the second user (e.g., the audience).
  • the second user can provide relatively rough gesture information, such as gesture information of only one finger, to perform corresponding control on the audio content, such as controlling the drum beat and volume, thereby simplifying the user's operation.
  • FIG. 7B shows a schematic diagram of an exemplary gesture of the second user (audience) according to the present disclosure, for example, audio content control can be performed by continuous movement.
  • the first user may also participate in the scene of interactive audio presentation, for example, the music content may also be controlled or adjusted based on the gesture of the first user.
  • the first user may also be treated as a specific second user, and the audio content may be adjusted based on the gestures of both users.
  • adjusting the audio content based on the user's gesture can be performed in accordance with various appropriate criteria.
  • the presentation of the audio content can be adjusted based on the statistical value of the gesture information of at least one second user. In this way, the needs of the second user can be more comprehensively considered to achieve the impact on the audio content.
  • the statistical value of the posture information of at least one second user includes a statistical value about the priority of the user posture information, and the presentation of the audio content is adjusted according to the highest priority posture information in the posture information of at least one second user.
  • the statistical value about the priority of the user posture information is determined as follows: the posture information of at least one second user is weighted, wherein the weighting is performed based on at least one of the number of occurrences of each piece of posture information, the priority of each piece of posture information, and the priority of the user corresponding to each piece of posture information.
  • the presentation of the audio content is adjusted based on the statistical values of the posture information of the multiple second users. This allows the audio setting to take into account the group feelings of the multiple second users, and realizes audio presentation feedback jointly created by the group, thereby enhancing the audience's sense of presence.
  • priorities may be set for user gestures, and feedback may be provided based on the priorities of the user gestures. For example, the gestures of each user may be aggregated and sorted based on priorities, and then the audio content may be adjusted based on the gesture with the highest priority.
  • feedback may be provided based on the number of user actions. For example, the actions of each user may be aggregated, the number of identical or similar actions may be counted, and the audio content may be adjusted accordingly based on the action with the largest number.
  • the priority of the user may be further considered.
  • the priority may be set for the user, and the corresponding audio content adjustment may be performed according to the action of the user with the highest priority.
  • feedback may be further provided based on at least two of the priority of user actions, the priority of the user, the number of user actions, etc.
  • the actions of each user may be summarized and mathematically counted to obtain feedback results, so as to make corresponding audio content adjustments according to the feedback results.
  • a priority value for each user may be set, a priority value for each user's action may be set, and then the acquired user actions may be counted to obtain a statistical value for each user action, for example, by multiplying the user priority value or the action priority value by the number of the actions, thereby obtaining the statistical value of the user action. Then, the corresponding audio content adjustment may be performed according to the action with the highest statistical value.
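  • As a sketch of that weighted tally, with hypothetical priority tables: summing a weight per observation is equivalent to multiplying the priority values by the number of identical gestures, and the gesture with the highest total drives the adjustment.

```python
# Sketch: weighted statistics over collected audience gestures.
from collections import Counter

def pick_gesture(observations, user_priority, gesture_priority):
    """observations: list of (user_id, gesture) pairs collected in one window."""
    scores = Counter()
    for user, gesture in observations:
        scores[gesture] += (user_priority.get(user, 1.0)
                            * gesture_priority.get(gesture, 1.0))
    return max(scores, key=scores.get) if scores else None
```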
  • user feedback can be easily obtained to achieve interaction, in both online and offline scenarios.
  • the audio content can be adjusted in a timely manner according to the user's feedback to satisfy the user's desired modifications.
  • the audio content can also be adjusted in an appropriate manner, so that the audio content can be adjusted more appropriately to obtain a better presentation effect.
  • the processing circuit 302 can be in the form of a general-purpose processor or a dedicated processing circuit, such as an ASIC.
  • the processing circuit 302 can be constructed by a circuit (hardware) or a central processing device (such as a central processing unit (CPU)).
  • the processing circuit 302 can carry a program (software) for making the circuit (hardware) or the central processing device work.
  • the program can be stored in a memory (for example, arranged in the device) or in an externally connected storage medium, and can be downloaded via a network (such as the Internet).
  • the processing circuit 302 may include various units for implementing the above functions, such as a receiving unit 304, which is configured to receive relevant information of the audio content to be presented from a control side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information, and a control unit 306, which is configured to present the audio content, wherein presenting the audio content includes presenting the audio content in a tactile manner.
  • the control unit 306 may control the sending unit 308 to provide the audio content or its relevant information to the audio presentation device for audio content presentation.
  • the audio presentation device may be included in the receiving side device, in particular, included in the control unit, so that the audio presentation device can be directly controlled by the control unit to present the audio content.
  • the processing circuit 302 may further include an acquisition unit 310, which is configured to acquire user gesture information determined via a gesture capture device, and send the acquired user gesture information to the control side device via the sending unit 308.
  • the acquisition unit 310 may be separated from the gesture capture device, and acquire the user gesture information from the gesture capture device.
  • the acquisition unit 310 may include a gesture capture device.
  • the processing circuit 302 may further include a conversion unit 312 configured to convert relevant information of the audio content into data suitable for the audio presentation device, and provide the converted data to the audio presentation device via the sending unit 308.
  • although each unit is shown as a discrete unit in FIG. 3A, one or more of these units may be combined into one unit, or split into multiple units.
  • some units may not be included in the processing circuit or even the receiving side device, and thus may be shown with dotted lines.
  • the acquisition unit 310 and the conversion unit 312 may even be outside the processing circuit 302, and thus may also be shown with dotted lines.
  • the above-mentioned units are only logical modules divided according to the specific functions they implement, and are not intended to limit the specific implementation; for example, they can be implemented in software, hardware, or a combination of software and hardware.
  • the above-mentioned various units can be implemented as independent physical entities, or can also be implemented by a single entity (for example, a processor (CPU or DSP, etc.), an integrated circuit, etc.).
  • the above-mentioned various units are shown with dotted lines in the drawings to indicate that these units may not actually exist, and the operations/functions implemented by them can be implemented by the processing circuit itself.
  • FIG. 3A is only a schematic structural configuration of the receiving side device for audio presentation, and the device 300 may also include other possible components, such as a memory, a network interface, a controller, a communication unit, etc., which are not shown for clarity.
  • the processing circuit can be associated with the memory.
  • the processing circuit can be directly or indirectly (for example, with other components connected in between) connected to the memory to access processing-related data.
  • the memory can store various data and/or information generated by the processing circuit 302.
  • the memory can also be located in the device but outside the processing circuit, or even outside the device.
  • the memory can be a volatile memory and/or a non-volatile memory.
  • the memory can include but is not limited to a random access memory (RAM), a dynamic random access memory (DRAM), a static random access memory (SRAM), a read-only memory (ROM), and a flash memory.
  • in step S311 (a receiving step), relevant information of audio content to be presented is received from a control side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
  • in step S313 (a controlling step), the audio content is presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
  • an optional step S312 (a conversion step) may also be included, in which the relevant information of the audio content is converted into data suitable for the audio presentation device.
  • these steps can be performed by any appropriate device or device element, such as the aforementioned receiving side device, the processing circuit in the receiving side device, the corresponding element in the processing circuit, etc. It should be noted that the audio presentation method according to the embodiment of the present disclosure may also include other steps, such as the various further processing described above. Moreover, these further processing can also be performed by appropriate devices or device elements, which will not be described in detail here.
  • the control side device 400 includes a processing circuit 402, which is configured to: obtain audio content presentation indication information, the audio content presentation indication information includes indication information based on the posture information of the user to whom the audio is to be presented, and send relevant information of the audio content to be presented to a receiving side device for audio interactive presentation, wherein the audio content to be presented includes audio content set based on the posture information of the user.
  • the processing circuit 402 may be further configured to: obtain user posture information, and set the audio content to be presented based on the obtained user posture information.
  • the processing circuit 402 can be further configured to: determine statistical values of posture information of at least one second user, the statistical values including statistical values about the priority of user posture information, and set the audio content to be presented according to the highest priority posture information in the posture information of at least one second user.
  • the processing circuit 402 can be implemented in various appropriate ways, such as the processing circuit 302 described above, and will not be described in detail here.
  • the processing circuit 402 may include various units for implementing the above functions, such as an acquisition unit 404, which is configured to acquire audio content presentation indication information, wherein the audio content presentation indication information includes indication information based on the posture information of the user to whom the audio is to be presented, and a sending unit 406, which is configured to send relevant information of the audio content to be presented to a receiving side device for audio interactive presentation, wherein the audio content to be presented includes audio content set based on the posture information of the user.
  • the processing circuit 402 may include a setting unit 408 configured to set the audio content to be presented based on the acquired user's posture information.
  • the processing circuit may include a determination unit 410 configured to determine statistical values of posture information of at least one second user, wherein the statistical values include statistical values regarding the priority of the user posture information, whereby the setting unit 408 may set the audio content to be presented based on the highest priority posture information in the posture information of at least one second user.
  • the processing circuit 402 may further include a conversion unit 412 configured to convert the audio content into information suitable for reception by a receiving side device, or even into data suitable for an audio presentation device.
  • in step S411 (an acquisition step), audio content presentation indication information is obtained, the audio content presentation indication information including indication information based on the posture information of the user to whom the audio is to be presented.
  • in step S413 (a sending step), the relevant information of the audio content to be presented is sent to the receiving side device for audio interactive presentation, wherein the audio content to be presented includes audio content set based on the posture information of the user.
  • in step S412 (a setting step), the audio content to be presented is set based on the acquired posture information of the user.
  • a system for interactive audio presentation may include a control side device and a receiving side device as described above, wherein the receiving side device may be associated with at least one user, for example, associated with multiple users including a first user and a second user, wherein each user wears a corresponding or associated receiving side device.
  • the control side device receives the posture information of the user to whom audio is to be presented, and sets the audio content to be presented based on that posture information.
  • the receiving side device receives the relevant information of the audio content to be presented, and makes the audio content presented, wherein in particular, presenting the audio content includes presenting the audio content to the user in a tactile manner.
  • a method for interactive audio presentation is also provided, which is based on the control side method and the receiving side method as described above.
  • according to the above embodiments, the gesture (e.g., a finger gesture, etc.) of the first user (e.g., the host in a party, bar, etc., the performer in a concert or other activities, etc.) can be used to set the audio content, which is then presented via the receiving device of the second user (e.g., the listener, the audience, etc.); the music can further be adjusted according to the gesture (e.g., a finger gesture, etc.) of the second user to be more suitable for that user, thereby further improving the user's experience.
  • specific devices can be called interactive devices, and the posture of such a device can also be used to generate/adjust audio content.
  • it can be a specific device, such as an anthropomorphic doll, a handheld device, various devices worn on the body, etc., and the specific postures of these devices can be collected to achieve audio content presentation and/or feedback.
  • the audience can be given specific handheld devices, such as glow sticks, etc., so that the posture/movement of the audience's handheld devices can be used as audience feedback to adjust the audio content accordingly.
  • FIG. 9 shows an implementation of a receiving side device according to the present disclosure, which can be implemented as a device worn by the user, in the form of finger cots/gloves.
  • 901 indicates a data receiving and sending unit of a receiving side device, which may, for example, receive information related to audio content and provide specific data to drive a tactile providing device 903 and a light special effect presenting device 904. Additionally, 901 may also implement data conversion of audio content data. Optionally, the tactile providing device 903 and the light special effect presenting device 904 may also be included in the receiving side device.
  • the receiving side device may further include a posture acquisition device 902, which may acquire finger motion data and provide the finger motion data to the control side device via 901.
  • although FIG. 9 shows that the receiving side device includes only a single posture acquisition device 902 worn on one finger, this is merely exemplary, and the device may be worn on other fingers, or on more fingers.
  • the gloves worn by the user may be the same or different.
  • two or more fingers may be configured with gesture capture devices, so that the performer's gestures can be detected more accurately, so as to more accurately set, create or synthesize audio content.
  • the gesture capture device may be worn on only one finger, and the tactile providing device may also be worn on another finger, which can simplify the listener's operation and facilitate the listener's use.
  • the receiving-side device may also include an integrated antenna.
  • the receiving-side device may also include a battery, data transmission/reception components such as an antenna, and optional data processing units such as phase shifters and filters, which will not be described in detail here.
  • FIG. 10 is a schematic diagram showing an exemplary implementation of interactive audio presentation between performers and listeners according to an embodiment of the present disclosure.
  • the performer performs relatively fine posture operations, and the music is then set based on the acquired posture information, for example by converting the postures into music, so as to present it to the performer in an appropriate manner. Such music can also be presented to the audience, for example in a live scenario.
  • the audience can also perform posture operations, in particular relatively simple ones, and the music is then influenced based on the acquired posture information, for example by adjusting its rhythm, melody, etc., so as to be presented to the audience in an appropriate manner, such as through sound, haptics, or visual feedback.
  • the music adjusted in this way can also be presented to the performer. Interactive audio presentation is achieved in this way.
  • Such implementation can be embodied in various appropriate application scenarios.
  • one example is a music bar, especially a music bar suitable for or able to accommodate hearing-impaired people.
  • listeners who wish to participate can receive a suitable receiving device at the entrance, such as the glove-style device described above, and then, during activities in the music bar, set the music to be listened to through appropriate body movements, in particular swings of the fingers wearing the glove-style device, based on the postures corresponding to those finger swings.
  • the technical solution of the present disclosure can be applied to various appropriate tasks, including but not limited to hearing-impaired environments.
  • the task includes generating music and providing the music to users in video form, audio form, or the like.
  • the technical solution of the present disclosure can also be applied to providing other content in other ways, for example providing audio content, or providing the audio content of a video, such as the dialogue in a movie or TV series, to hearing-impaired users. In other embodiments, the technical solution of the present disclosure can also be used for users with normal hearing.
  • the technology disclosed herein can be used in many applications.
  • the technology disclosed herein can be used in live audio presentation applications. It can also be used in remote concerts, recitals, and the like, where the users' postures can likewise be captured, the audio content set or adjusted in the cloud, and the audio content then provided to users over the network.
  • the disclosed solution can be implemented by software algorithms, so that it can easily be integrated into various types of devices that include presentation apparatuses, such as finger cots.
  • the disclosed method can be executed as a computer program, instruction, etc. by a processor of a portable device to perform audio presentation enhancement processing.
  • FIG. 11 is a block diagram showing an example structure of a personal computer employable as a device in an embodiment of the present disclosure.
  • the personal computer can correspond to the above exemplary device according to the present disclosure.
  • a central processing unit (CPU) 1101 performs various processes according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 to a random access memory (RAM) 1103.
  • the CPU 1101, the ROM 1102, and the RAM 1103 are connected to each other via a bus 1104.
  • An input/output interface 1105 is also connected to the bus 1104.
  • the following components are connected to the input/output interface 1105: an input section 1106 including a keyboard, a mouse, etc.; an output section 1107 including a display such as a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 1108 including a hard disk, etc.; and a communication section 1109 including a network interface card such as a LAN card, modem, etc.
  • the communication section 1109 performs communication processing via a network such as the Internet.
  • a drive 1110 is also connected to the input/output interface 1105 as needed.
  • a removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc. is mounted on the drive 1110 as needed so that a computer program read therefrom is installed into the storage section 1108 as needed.
  • a program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1111.
  • such a storage medium is not limited to the removable medium 1111 shown in FIG. 11, which stores the program and is distributed separately from the device in order to provide the program to the user.
  • examples of the removable medium 1111 include magnetic disks (including floppy disks (registered trademark)), optical disks (including compact disc read-only memory (CD-ROM) and digital versatile discs (DVD)), magneto-optical disks (including MiniDiscs (MD) (registered trademark)), and semiconductor memories.
  • alternatively, the storage medium may be the ROM 1102, a hard disk included in the storage section 1108, or the like, in which the program is stored and which is distributed to the user together with the device containing it.
  • the method and system of the present disclosure may be implemented in a variety of ways.
  • the method and system of the present disclosure may be implemented by software, hardware, firmware, or any combination thereof.
  • the order of the steps of the method described above is illustrative only, and unless otherwise specifically stated, the steps of the method of the present disclosure are not limited to the order specifically described above.
  • the present disclosure may also be embodied as a program recorded in a recording medium, including machine-readable instructions for implementing the method according to the present disclosure. Therefore, the present disclosure also encompasses a recording medium storing a program for implementing the method according to the present disclosure.
  • Such storage media may include, but are not limited to, a floppy disk, an optical disk, a magneto-optical disk, a memory card, a memory stick, and the like.
  • embodiments of the present disclosure may also include the following exemplary embodiments (EEEs).
  • EEE 1 A receiving-side device for interactive audio presentation, comprising a processing circuit configured to: receive information related to audio content to be presented from a control-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information, and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
  • EEE 2 The receiving side device according to EEE 1, wherein the user's posture information includes at least one of the user's posture and posture motion information, wherein the posture motion information includes at least one of the posture motion direction, trajectory, and motion acceleration.
  • EEE 3 The receiving side device according to EEE 1, wherein the processing circuit is further configured to: obtain the user's posture information determined by a posture capture device, and send the obtained user posture information to the control side device.
  • EEE 4 The receiving-side device according to EEE 3, wherein the posture capture device includes a motion capture device that can be worn on at least one finger of a user, and the posture information is based on the posture information of each finger wearing a motion capture device and/or a combination thereof.
  • EEE 5 The receiving-side device according to EEE 1, wherein the user's posture information includes posture information of a first user, and wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed based on audio units, or a specific combination thereof, corresponding to the posture information of the first user.
  • EEE 6 The receiving side device according to EEE 1, wherein the user's posture information includes posture information of a second user, and wherein the audio content to be presented includes audio content obtained by adjusting the audio content based on the posture information of the second user.
  • EEE 7 The receiving-side device according to EEE 6, wherein adjusting the audio content based on the posture information of the second user comprises at least one of the following: increasing or decreasing the volume of the audio content; adjusting the rhythm of the audio content; enhancing effects of the audio content; adding additional effects to the audio content.
  • EEE 8 The receiving-side device according to EEE 7, wherein adjusting the audio content based on the posture information of the second user comprises: adjusting the presentation of the audio content based on statistical values of the posture information of a plurality of second users.
  • EEE 9 The receiving-side device according to EEE 8, wherein the statistical value of the posture information of the plurality of second users includes a statistical value about the priority of the user posture information, and the presentation of the audio content is adjusted according to the highest priority posture information among the posture information of the plurality of second users.
  • EEE 10 The receiving-side device according to EEE 9, wherein the statistical values regarding the priority of the user posture information are determined as follows: the posture information of multiple audio users is weighted, wherein the weighting is performed based on at least one of the quantity of each piece of posture information, the priority of each piece of posture information, and the priority of the user corresponding to each piece of posture information.
  • EEE 11 The receiving side device according to EEE 1, wherein presenting the audio content further includes: converting relevant information of the audio content into data suitable for an audio presentation device; and providing the converted data to the audio presentation device.
  • EEE 12 A receiving side device according to any one of EEE 1-11, wherein the audio presentation device is a tactile providing device, so that the audio content is provided to the user in a tactile manner via the tactile providing device.
  • EEE 13 The receiving side device according to EEE 12, wherein the tactile sensation providing device comprises at least one tactile unit, wherein each tactile unit corresponds to a specific audio track in the audio content to be presented.
  • EEE 14 A control-side device for interactive audio presentation, comprising a processing circuit configured to: obtain audio content presentation indication information, the audio content presentation indication information comprising indication information based on posture information of a user to whom the audio is to be presented, and send relevant information of the audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented comprises audio content set based on the posture information of the user.
  • EEE 15 The control side device according to EEE 14, wherein the processing circuit is further configured to: obtain the user's posture information, and set the audio content to be presented based on the obtained user's posture information.
  • EEE 16 The control-side device according to EEE 15, wherein the user's posture information includes posture information of a first user, and wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed based on audio units, or a specific combination thereof, corresponding to the posture information of the first user, and/or wherein the user's posture information includes posture information of a second user, and wherein the audio content to be presented includes audio content obtained by adjusting the audio content based on the posture information of the second user.
  • EEE 17 The control-side device according to EEE 16, wherein the processing circuit is further configured to: determine statistical values of the posture information of a plurality of second users, the statistical values comprising statistical values regarding the priority of the user posture information, and set the audio content to be presented according to the highest-priority posture information among the posture information of the plurality of second users.
  • EEE 18 A receiving-side method for interactive audio presentation, comprising: receiving information related to audio content to be presented from a control-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information, and causing the audio content to be presented, wherein presenting the audio content comprises presenting the audio content in a tactile manner.
  • EEE 19 The method according to EEE 18 also includes: acquiring posture information of the user determined by a posture capture device, and sending the acquired user posture information to the control side device.
  • EEE 20 The method according to EEE 18, wherein presenting the audio content further includes: converting relevant information of the audio content into data suitable for an audio presentation device; and providing the converted data to the audio presentation device.
  • EEE 21 A control-side method for interactive audio presentation, comprising: obtaining audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and sending relevant information of the audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the posture information of the user.
  • EEE 22 The method according to EEE 21 also includes: obtaining user posture information, and setting the audio content to be presented based on the obtained user posture information.
  • EEE 23 The method according to EEE 21, further comprising: determining statistical values of the posture information of multiple second users, the statistical values including statistical values regarding the priority of user posture information, and setting the audio content to be presented according to the highest-priority posture information among the posture information of the multiple second users.
  • EEE 24 An interactive audio presentation system comprising: a control-side device for interactive audio presentation, configured to receive posture information of audio presentation users and set the audio content to be presented based on that posture information; and a receiving-side device for interactive audio presentation, configured to receive relevant information of the audio content to be presented and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
  • EEE 25 The system according to EEE 24, wherein the receiving side device is further configured to: obtain the user's posture information and send the posture information to the controlling side device.
  • EEE 26 A system according to EEE 24, wherein the control side device is further configured to: obtain posture information of multiple users, and set the audio content to be presented based on the statistical values of the posture information of the multiple users.
  • EEE 27 An interactive audio presentation method comprising: receiving posture information of audio presentation users, setting audio content to be presented based on the posture information of the audio presentation users, and causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
  • EEE 28 A device comprising at least one processor; and at least one storage device storing instructions thereon that, when executed by the at least one processor, cause the at least one processor to execute a method according to any one of EEEs 18-23 and 27.
  • EEE 29 A storage medium storing instructions that, when executed by a processor, enable the processor to execute a method according to any one of EEE 18-23 and 27.
  • EEE 30 A computer program product comprising instructions which, when executed by a processor, enable the processor to perform a method according to any one of EEEs 18-23 and 27.
  • EEE 31 A computer program comprising instructions which, when executed by a processor, enable the processor to perform a method according to any one of EEE 18-23 and 27.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure relates to an audio presentation method and device. A receiving-side device for interactive audio presentation is provided, comprising a processing circuit configured to receive, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and to cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.

Description

Audio presentation method and device
Cross-reference to related applications
This application is based on, and claims priority to, Chinese application No. 202211599599.6 filed on December 12, 2022, the disclosure of which is incorporated herein by reference in its entirety.
Technical field
The present disclosure relates to audio signal processing, and in particular to audio signal presentation.
Background
Music often brings joy and beauty to listeners in human life. Music exists in a wide variety of forms and can be presented to listeners on many occasions. However, listening to music is difficult for hearing-impaired people, and as living standards improve, more and more hearing-impaired people also wish to experience the charm of music and enjoy its pleasures.
Therefore, improved music presentation solutions are needed.
Summary
This summary is provided to introduce concepts of the present disclosure in a brief form; these concepts are described in detail in the detailed description below.
The present disclosure provides for optimizing audio signal presentation, in particular optimizing audio signal presentation for specific users.
The present disclosure also provides optimized interactive audio signal presentation.
In one aspect of the present disclosure, a receiving-side device for interactive audio presentation is provided, the device comprising a processing circuit configured to: receive, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
In another aspect of the present disclosure, a control-side device for interactive audio presentation is provided, the device comprising a processing circuit configured to: acquire audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and send information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
In another aspect of the present disclosure, a receiving-side method for interactive audio presentation is provided, comprising: receiving, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
In another aspect of the present disclosure, a control-side method for interactive audio presentation is provided, comprising: acquiring audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and sending information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
In yet another aspect of the present disclosure, a device is provided, comprising at least one processor and at least one storage device having program code and/or instructions stored thereon that, when executed by the at least one processor, can cause the at least one processor to perform the methods described herein.
In still another aspect of the present disclosure, a storage medium is provided, storing program code and/or instructions that, when executed by a processor, can cause the methods described herein to be performed.
In still another aspect of the present disclosure, a program product is provided, the program product containing program code and/or instructions that, when executed by a processor, can cause the processor to perform the methods described herein.
In still another aspect of the present disclosure, a computer program is provided, the computer program containing program code and/or instructions that, when executed by a processor, can cause the processor to perform the methods described herein.
Other features of the present disclosure will become clear from the following description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
Preferred embodiments of the present disclosure are described below with reference to the accompanying drawings. The drawings described here are provided for further understanding of the present disclosure; together with the following detailed description, they are included in and form part of this specification and serve to explain the present disclosure. It should be understood that the drawings in the following description relate only to some embodiments of the present disclosure and do not limit it.
FIG. 1 shows a conceptual diagram of audio presentation according to an embodiment of the present disclosure.
FIG. 2A shows a conceptual diagram of interactive audio presentation according to an embodiment of the present disclosure.
FIG. 2B shows a flowchart of interactive audio presentation according to an embodiment of the present disclosure.
FIG. 3A shows a block diagram of a receiving-side device for interactive audio presentation according to an embodiment of the present disclosure.
FIG. 3B shows a flowchart of a receiving-side method for interactive audio presentation according to an embodiment of the present disclosure.
FIG. 4A shows a block diagram of a control-side device for interactive audio presentation according to an embodiment of the present disclosure.
FIG. 4B shows a flowchart of a control-side method for interactive audio presentation according to an embodiment of the present disclosure.
FIG. 5 shows a conceptual flowchart of setting audio content to be presented according to an embodiment of the present disclosure.
FIGS. 6A to 6C show schematic diagrams of exemplary posture detection.
FIG. 7A shows a schematic diagram of exemplary postures of a first user (player or performer) according to the present disclosure.
FIG. 7B shows a schematic diagram of exemplary postures of a second user (listener) according to the present disclosure.
FIG. 8 shows a graph of exemplary audio conversion according to an embodiment of the present disclosure.
FIG. 9 shows an exemplary implementation of a receiving-side device according to an embodiment of the present disclosure.
FIG. 10 shows an exemplary implementation scenario according to an embodiment of the present disclosure.
FIG. 11 shows a block diagram of an exemplary hardware configuration of a computer system capable of implementing embodiments of the present disclosure.
It should be understood that, for ease of description, the dimensions of the various parts shown in the drawings are not necessarily drawn to actual scale. The same or similar reference numerals are used in the drawings to denote the same or similar components; therefore, once an item is defined in one drawing, it may not be discussed further in subsequent drawings.
Detailed description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. It should be understood, however, that the described embodiments are only some, not all, of the embodiments of the present disclosure. The drawings and the following description of the embodiments are merely illustrative and are not intended to limit the present disclosure or its applications or uses. It should be understood that the present disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein.
Furthermore, when exemplary embodiments of the present disclosure are described below with reference to the drawings, not all features of the embodiments are described in the specification, for the sake of clarity and conciseness. It should be noted that, to avoid obscuring the present disclosure with unnecessary detail, only the processing steps and/or device structures closely related to the solutions of the present disclosure are shown in the drawings, while other details of little relevance are omitted.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Moreover, the method embodiments may include additional steps and/or omit the steps shown. The scope of the present disclosure is not limited in this respect. Unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments should be construed as merely exemplary and not limiting the scope of the present disclosure.
In the present disclosure, the terms "first", "second", and the like are used merely to distinguish elements or steps, and are not intended to indicate temporal order, preference, or importance.
Music is spiritual nourishment shared by all humanity. People love music not only because it is pleasant to hear and brings us different emotional experiences, but also because particular music has positive effects on physical and mental health. As living standards rise, more and more people have the opportunity to listen to music and increasingly enjoy doing so.
However, it should be recognized that there are still a large number of hearing-impaired people in today's world; in China alone there are roughly 27 million hearing-impaired people, most of them with mild to severe hearing loss, and even among young people there are about 11.5 million hearing-impaired people. These hearing-impaired people especially hope to experience the beauty of music. However, music playback at typical social events is mainly aimed at people with unimpaired hearing, without particular consideration of the listening situation of hearing-impaired people. For example, young people like to go to music bars, yet in an ordinary music bar hearing-impaired people often cannot listen to music the way people with unimpaired hearing can.
It is therefore desirable to provide improved audio presentation solutions, in particular improved audio presentation for hearing-impaired people.
In addition, considering that in some music playback scenarios, such as music bars, KTV, concerts, and the like, listeners often wish to participate in the music and achieve the desired musical interaction, it is desirable to provide improved interactive audio presentation solutions, in particular improved interactive audio presentation for hearing-impaired people.
In one aspect, the present disclosure proposes an improved audio presentation solution; in particular, audio content can be presented to hearing-impaired people in a tactile manner. More particularly, the audio content to be presented can be provided to a listener as corresponding vibrations via a tactile providing apparatus worn on the listener.
In another aspect, the present disclosure proposes an improved interactive audio presentation solution; in particular, audio content, such as audio content currently being played, can be influenced by detecting specific input from listeners, so that the audio content can be presented to listeners in a manner, rhythm, and so on that users prefer. More particularly, the present disclosure proposes influencing the audio content according to posture information, thereby enabling more convenient interaction.
Embodiments according to the present disclosure are described in detail below with reference to illustrative examples. It should be noted that the audio content mentioned in the context of the present disclosure may take various appropriate forms; as an example it may relate to audio and may cover any appropriate type of music signal, such as a music melody, audio track, sound element, sound sequence, sound effect, and so on. For example, the audio content may correspond to a complete piece of music or a part thereof, or even to specific user input, such as a music segment corresponding to a specific user posture.
FIG. 1 shows a conceptual diagram of audio presentation according to an embodiment of the present disclosure. Audio presentation according to embodiments of the present disclosure is particularly suitable for hearing-impaired people and can be realized based on posture information.
First, data/information related to the audio presentation is collected. In particular, in a scenario where audio presentation is realized based on posture information, posture information of members participating in the audio presentation is collected. As an example, the participants may include listeners, especially hearing-impaired people. They may also include specific members responsible for the audio presentation, such as a host, a DJ, a player, a performer, and so on. Of course, the collected data/information related to the audio presentation may also include other data/information, such as the members' parameter information (including, e.g., identity IDs), instructions to start and/or stop the audio presentation, and other data/information related to audio presentation control.
Next, audio processing is performed based on the collected data/information. In particular, the audio content to be presented can be set based on the collected data/information, especially the posture information, which is described in detail later.
Then, the set audio content to be presented is presented to the listeners. In particular, for hearing-impaired people, the audio content can be presented in a tactile manner. Of course, the audio presentation can also be realized in other ways, for example through video, visual effects, lighting effects, and the like, so as to enrich the audio presentation.
Interactive audio presentation according to embodiments of the present disclosure is described below with reference to FIGS. 2A-2B, where FIG. 2A shows a conceptual diagram of interactive audio presentation according to an embodiment of the present disclosure and FIG. 2B shows a flowchart of interactive audio presentation according to an embodiment of the present disclosure.
Interactive audio presentation according to some embodiments of the present disclosure is applicable to various application scenarios, for example live music scenarios in music bars or at party venues where people participate in musical interaction. Such a scenario involves a first user and a second user. The first user may be a person responsible for or leading the audio presentation in the scenario, such as a host, a DJ, etc., who can start, pause, end, set, and adjust the audio content to be presented to users. The second user may be the target of the audio presentation in the scenario, such as a customer, participant, or listener at the music bar or party venue. Here, at least one of the first and second users may in particular be a hearing-impaired person. It should be understood, however, that the first and second users may also be people who are not on site, for example people participating in the music via a network, the cloud, etc.
On the one hand, the audio content is set by acquiring the first user's posture information, and the set audio content is then presented to at least one of the first user and the second user. In some embodiments, audio content can be generated or created based on the first user's posture information, for example audio content corresponding to the first user's posture information, such as audio content obtained by combining the audio units corresponding to respective pieces of the first user's posture information. In other embodiments, the first user's posture information may merely indicate starting, pausing, stopping, etc. of the audio presentation, so that when the user's posture information indicates starting, presentation of specific audio content can begin, for example playing specific music such as preset audio/music. The audio content can be presented to the user in various appropriate ways, such as haptics, video, visual effects, lighting, etc., which will not be described in detail here.
On the other hand, the audio content is adjusted by acquiring the second user's posture information, for example adjusting the audio/music currently being played, and the adjusted audio content is then presented to at least one of the first user and the second user. The adjustment of the audio content can be realized in various appropriate ways, such as adjusting the playback volume, the melody, etc., and such adjustments can be reflected accordingly in the haptic implementation.
According to embodiments of the present disclosure, the audio data processing can be implemented in an appropriate manner, for example by software, hardware, firmware, etc. It can be located on the control side of the system for audio presentation and implemented by a control-side device, such as a server or control device in a network. The users to whom the audio is presented correspond to the receiving side of the system, which can be equipped with receiving-side devices that receive the audio content so as to present it to the users in an appropriate manner. For example, a receiving-side device can cooperate with various presentation apparatuses, such as haptic, visual-effect, and lighting presentation apparatuses, to present the content to the user. Of course, the presentation apparatuses may also be included in the receiving-side device.
The receiving-side devices for the first user and the second user may be different; for example, different receiving-side devices and/or audio content presentation apparatuses may be used for different users according to their needs. The receiving-side devices for the first user and the second user may also be the same. For example, such receiving-side devices and/or audio content presentation apparatuses may allow functions to be configured individually, so that different functional configurations can be set for different users according to their needs, for example enabling or disabling certain functions for different users.
Implementations according to embodiments of the present disclosure are described below with reference to the drawings.
FIG. 3A shows a block diagram of a receiving-side device for interactive audio presentation according to an embodiment of the present disclosure. The receiving-side device 300 includes a processing circuit 302 configured to receive information related to audio content to be presented, where the audio content to be presented may include audio content set based on a user's posture information, and to cause the audio content to be presented, where causing the audio content to be presented may include causing the audio content to be presented in a tactile manner.
According to some embodiments of the present disclosure, the audio content can be provided to the receiving-side device in an appropriate manner, so that the information related to the audio content can correspondingly take various appropriate forms.
In some embodiments, the audio content itself, in any of various appropriate formats, is provided directly to the receiving side as the audio-content-related information. For example, the audio content may be music to be played, in any appropriate music format such as mp3, midi, or other suitable formats, and may be sent directly to the receiving side.
In other embodiments, the audio-content-related information may be information indicating the audio content, for example an index of the audio content. As an example, audio content and audio indexes can be associated, set, and stored in advance, and during use the corresponding audio content can be retrieved according to the audio index.
In still other embodiments, the audio-content-related information may be information/data obtained by converting the audio content. In particular, when it is known in advance in what manner the audio content is to be presented to the user, the audio content can be converted on the control side into information/data suited to that presentation manner and then transferred to the receiving side as the related information.
According to embodiments of the present disclosure, the audio content to be presented can be set in various appropriate ways, including but not limited to generating, creating, adjusting, etc. In particular, it can be set based on the user's posture information. According to embodiments of the present disclosure, the user's posture information includes at least one of the user's posture (including, e.g., the pose and spatial position of a specific body part) and posture motion information, where the posture motion information includes at least one of a posture motion trajectory and a motion acceleration. As an example, the posture motion information may include the movement direction, movement speed, movement acceleration, movement frequency, etc. of a specific posture. As an example, where the user posture corresponds to the user's finger posture, the user posture may include a specific hand gesture, spatial position, etc., and the motion of the posture may refer to the motion of the gesture, for example how a specific gesture swings, and the speed, direction, and frequency of the swing.
According to some embodiments of the present disclosure, the application scenario of the receiving-side device may include various types of users, in particular a first user and a second user, as described above. Here, the audio content may include audio content set based on the posture information of at least one of the first user and the second user.
FIG. 5 shows a conceptual flowchart of setting audio content to be presented according to an embodiment of the present disclosure. In an audio content presentation scenario, when a user changes posture, the user's posture is acquired or detected and information and/or data related to the user's posture is generated, whereby the audio content to be presented is set so as to be presented to the user. It should be noted that setting the audio content to be presented can generally be implemented on the control side of the system, in particular by the control-side device.
Specific implementations of setting and presenting audio content are further described below with reference to the drawings.
Acquisition or detection of user posture
According to embodiments of the present disclosure, acquisition or detection of the user's posture can be performed in various appropriate ways.
In some embodiments, the user's posture can be acquired by means of video capture, image capture, and the like. For example, the user's movements can be captured by a camera, and user posture analysis can then be performed on the captured images or video of the user's movements to obtain information/data related to the user's posture. In one exemplary implementation, this can be realized through camera motion capture, camera color capture, etc. A specific color or a specific tag can be placed on a specific part of the user's body, and the movement of the corresponding part can then be acquired through camera color recognition. For example, patches of a specific color can be attached to at least one of the user's fingers, and information/data related to the user's finger postures/movements can then be captured through camera color recognition, as shown in FIG. 6A.
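By way of illustration only, the following is a minimal sketch of the camera color-capture idea just described, written with OpenCV; the HSV color range, camera index, and function names are illustrative assumptions rather than part of the disclosed embodiments:

```python
import cv2
import numpy as np

# Hypothetical HSV range for a colored patch worn on the finger; in
# practice this would be calibrated to the patch and the venue lighting.
LOWER = np.array([100, 120, 70])   # lower HSV bound (a blue-ish patch)
UPPER = np.array([130, 255, 255])  # upper HSV bound

def track_patch(frame):
    """Return the (x, y) centroid of the colored patch, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)   # keep only patch-colored pixels
    m = cv2.moments(mask)
    if m["m00"] < 1e-3:                     # patch not visible in this frame
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])

cap = cv2.VideoCapture(0)
trajectory = []                             # finger positions over time
while True:
    ok, frame = cap.read()
    if not ok:
        break
    pos = track_patch(frame)
    if pos is not None:
        trajectory.append(pos)              # later converted to posture info
```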
This is applicable to a wide range of application scenarios. For example, in various party scenarios, cameras pre-installed at the party venue can capture the postures of the users of interest in the live scene; in a remote scenario, each user's own dedicated camera captures that user's posture, which is then uploaded to the network, so that posture capture can be performed at the server or in the cloud.
In other embodiments, the user's posture can be acquired through camera skeleton capture. For example, the camera can capture the movement of specific parts of the user's hand, such as the overall finger contour and bones, to obtain the posture of the corresponding part, as shown in FIG. 6B. As an example, the motion state of the finger bones can be detected by a device such as a projector using a specific algorithm, to obtain the finger posture.
In still other embodiments, the user can wear a specific posture capture apparatus, such as a motion capture sensor or gyroscope, and information/data related to the user's posture can then be acquired from the data of the posture capture apparatus, as shown in FIG. 6C. In some embodiments, the user posture information to be acquired is the user's hand posture information, the posture capture apparatus includes motion capture devices that can be worn on at least one finger of the user, and the posture information is based on the posture information of each finger wearing a motion capture device and/or a combination thereof.
It should be noted that in this case the information/data related to the user's posture can be regarded as being acquired on the receiving side and provided to the control side for setting the audio content. In particular, the receiving-side device is further configured to acquire the user's posture information determined via the posture capture apparatus and to send the acquired user posture information to the control-side device. It should be noted that the receiving-side device can also provide other appropriate information, such as the user's parameter information, e.g., a user identity ID.
An exemplary implementation of posture capture and conversion according to an embodiment of the present disclosure is described below. This implementation may be an exemplary implementation in a network scenario. The user waves a hand in front of the camera, so that the movement of the user's fingers can be captured by the computer camera; for example, the motion state and trajectory of the user's fingers can be determined by comparing the pixel differences between adjacent images, and finger motion data is generated accordingly. Such data can be represented and stored in various appropriate ways, for example including a data number for each finger and the corresponding data, including but not limited to swing speed, swing position, time point, etc. From this, the user's finger postures and so on can be determined. This can be achieved in various ways known in the art, which will not be described in detail here. The corresponding audio content is then determined based on the determined finger motion data, for example by converting it into MIDI and musicalizing it.
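As a rough sketch of the adjacent-frame pixel-difference approach just described (the threshold value and the motion-record layout are illustrative assumptions, not details from the disclosure):

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
_, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

motion_log = []  # per-frame motion records: (timestamp, centroid, magnitude)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev_gray)      # pixel difference between adjacent frames
    _, moving = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    m = cv2.moments(moving)
    if m["m00"] > 0:
        cx = m["m10"] / m["m00"]             # centroid of the moving region
        cy = m["m01"] / m["m00"]
        magnitude = float(moving.mean())     # rough amount of motion in the frame
        motion_log.append((cv2.getTickCount(), (cx, cy), magnitude))
    prev_gray = gray
```

The resulting motion records (position, speed, time point) stand in for the per-finger data described above and would feed the MIDI conversion stage.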
Audio conversion
According to embodiments of the present disclosure, corresponding audio content can be set based on the acquired information/data related to the user's posture, for presentation to the user.
In some embodiments, the audio content to be presented may include audio content constructed from audio units, or a specific combination of audio units, corresponding to the first user's posture information. In particular, the audio content can be set based on the association or correspondence between posture data and audio units. An audio unit may be a building block of audio content, for example corresponding to at least one of a sound element, a sound sequence, an audio segment, and so on. Thus, at least one posture of the user can be acquired, and the audio units corresponding to the at least one posture can then be used to generate the audio content. In some embodiments, audio units can be combined to generate the audio content. In particular, when the user performs movements continuously, the audio units corresponding to the user's successive postures can be combined to obtain the audio content to be presented. In other examples, the combined audio content can be further processed appropriately, for example filtered, smoothed, etc.
In some embodiments, the associations/correspondences between user postures and audio units can be constructed in advance; for example, various gestures can be trained and a corresponding audio unit can be set for each gesture. In some embodiments, the first user can provide relatively fine posture information, such as posture information for multiple fingers, for corresponding control of the audio content, for example controlling multiple audio tracks, to generate more precise audio content and thus present the audio more accurately. As an example, user postures can be stored in a database in association with the corresponding audio units. The user postures and audio units can be stored in various appropriate ways; for example, each user posture and its corresponding audio unit can be stored as a list, as a mapping, and so on. As an example, the database may include, but is not limited to, user postures, the corresponding audio units, ways in which user postures change, and ways in which the corresponding audio units change. It suffices that audio content can be generated and/or changed from the data stored in the database based on the acquired user postures.
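For illustration, a minimal sketch of such a posture-to-audio-unit table and of combining units from successive postures into one note sequence; the gesture names, MIDI note numbers, and event format are illustrative assumptions:

```python
# Hypothetical posture-to-audio-unit table; in practice this would be
# configured/trained in advance and stored in a database.
POSTURE_TO_UNIT = {
    "fist":        [(36, 0.5)],               # (MIDI note, duration s): drum hit
    "swipe_left":  [(60, 0.25), (62, 0.25)],  # short ascending motif
    "swipe_right": [(64, 0.25), (67, 0.25)],
    "open_palm":   [(72, 1.0)],               # sustained high note
}

def build_audio_content(posture_sequence):
    """Combine the audio units corresponding to successive postures
    into one note sequence (a simple MIDI-like event list)."""
    events, t = [], 0.0
    for posture in posture_sequence:
        for note, dur in POSTURE_TO_UNIT.get(posture, []):
            events.append({"note": note, "start": t, "duration": dur})
            t += dur
    return events

# e.g. a performer's continuous movement:
content = build_audio_content(["fist", "swipe_left", "open_palm"])
```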
In some embodiments, the setting of the audio content can be performed in an appropriate manner. As an example, machine learning or deep learning algorithms can be used to set the audio content based on the posture data, so that the audio MIDI signal converted from the posture data is better filtered and smoothed, enhancing its musicality. The machine learning or deep learning algorithms may include various algorithms known in the art, which will not be described in detail here. In some embodiments, the machine learning or deep learning algorithm can also be trained in advance on training data; the training can be performed in various appropriate ways, which will not be described in detail here. The input of the trained AI model is posture data, and its output is a MIDI signal for presenting the audio. Further, the input of the trained AI model can be the posture data of multiple users together with the initial audio content from the performance side, and its output an adjusted MIDI signal for presenting the audio, so as to realize joint creation of the music by the audience.
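A trained model is one option for such filtering and smoothing. As a simple non-learned stand-in, the sketch below smooths a gesture-derived pitch stream with a moving average before quantizing to MIDI note numbers; the pitch range and window size are illustrative assumptions:

```python
import numpy as np

def gesture_to_midi_pitches(y_positions, low=48, high=84):
    """Map normalized vertical hand positions in [0, 1] to raw MIDI pitches."""
    y = np.clip(np.asarray(y_positions, dtype=float), 0.0, 1.0)
    return low + y * (high - low)

def smooth_pitches(pitches, window=5):
    """Moving-average smoothing of the raw pitch stream, then quantization
    to whole semitones, so the resulting melody is less jittery."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(pitches, kernel, mode="same")
    return np.rint(smoothed).astype(int)    # integer MIDI note numbers

raw = gesture_to_midi_pitches([0.2, 0.25, 0.9, 0.3, 0.35, 0.4])  # 0.9 is jitter
notes = smooth_pitches(raw)
```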
According to some embodiments of the present disclosure, the audio content to be presented may include specific audio content designated by the user's posture information. In particular, a specific user posture may correspond to specific audio content, so that when that user posture is detected, the complete audio content can be sent directly to the receiving-side device for presentation.
In some embodiments, the user's posture information may also correspond to audio content presentation indication information, which may, for example, indicate a specific operation of the audio presentation, such as start, pause, stop, etc., so that when that posture information is detected, the corresponding operation can be performed on the audio content presentation. The audio content here may be preset, or associated with the user posture.
It should be noted that the audio content creation and/or generation described above may correspond in particular to the case of setting audio content based on the posture information of the first user of the present disclosure. FIG. 7A shows a schematic diagram of exemplary postures of the first user (player) according to the present disclosure, where different postures can correspond to different music presentation operations; for example, continuous movement can correspond to performing, a clenched fist can correspond to drumming, and starting, pausing, and ending recording of the music as well as other operations can each correspond to other postures. For example, this can correspond to an audio content presentation scenario in which a first user such as a player, performer, or host leads the generation and/or creation of the audio content.
Audio presentation
According to embodiments of the present disclosure, the created or generated audio content can be presented to the user in various appropriate ways. In particular, the information related to the audio content is converted into data suitable for an audio presentation apparatus, and the converted data is provided to the audio presentation apparatus. As an example, the converted data may be drive data or input data for the audio presentation apparatus, so that the apparatus can present the audio content to the user in a particular manner. The data conversion can be realized in various appropriate ways. For example, according to embodiments of the present disclosure, the audio data can be changed into haptic data in various appropriate ways, such as an analog-signal approach or FFT (Fast Fourier Transform) filtering, among others. Of course, other methods known in the art can also be used, which will not be described in detail here.
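By way of illustration, a minimal sketch of FFT-based filtering of an audio signal into a vibration-drive envelope; the band limits and frame size are illustrative assumptions, not parameters from the disclosure:

```python
import numpy as np

def audio_to_haptic(samples, rate, band=(40.0, 250.0), frame=512):
    """Convert an audio signal into a vibration-amplitude envelope by
    keeping only one frequency band via FFT filtering, then taking the
    per-frame energy as the haptic drive level in [0, 1]."""
    samples = np.asarray(samples, dtype=float)
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    spectrum[(freqs < band[0]) | (freqs > band[1])] = 0   # FFT band-pass
    filtered = np.fft.irfft(spectrum, n=len(samples))

    n_frames = len(filtered) // frame
    env = np.abs(filtered[: n_frames * frame]).reshape(n_frames, frame).mean(axis=1)
    return env / (env.max() + 1e-9)          # normalized motor drive levels

# e.g. levels = audio_to_haptic(track_samples, 44100) would yield roughly
# rate/frame drive updates per second for a vibration motor.
```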
FIG. 8 shows a schematic diagram of the data conversion, in which different types of music data are converted into their respective waveform data used to drive the audio presentation. Since the resulting waveform data can often reflect the characteristics of the different types of music data, the audio presentation apparatus can accurately convey the characteristics, melody, etc. of the music to the user.
In some embodiments of the present disclosure, the receiving-side device may be separate from the audio presentation apparatus; in other embodiments, the receiving-side device may be integrated with the audio presentation apparatus, and in particular the receiving-side device may include the audio presentation apparatus.
In some embodiments of the present disclosure, the audio content can be provided to the receiving side with various appropriate timings. In some embodiments, as soon as a playable/presentable audio unit/segment is obtained based on the user's posture information, it is sent to the receiving side. In other embodiments, a predetermined number of audio units/segments, or even the entire audio content, may be sent to the receiving side at a time.
It should be noted that the audio content may also be audio content to be presented that is set in other ways, for example audio content whose playback/presentation starts upon receipt of a specific playback/presentation instruction, or audio content whose playback/presentation starts according to a preset order/instruction, such as audio content scheduled in a concert hall, at a live event, etc., which will not be described in detail here.
According to embodiments of the present disclosure, the audio presentation apparatus is a tactile providing apparatus, such that the audio content is provided to the user in a tactile manner via the tactile providing apparatus. In this way, particularly for hearing-impaired people, the audio content can be presented to the user in a tactile manner. In some embodiments of the present disclosure, the tactile providing apparatus may include a haptic feedback device worn on at least one of the user's hand, wrist, arm, etc., for example in the form of a glove, wristband, or armband, and can provide haptic feedback to at least one of the user's fingers, the back of the hand, the wrist, the arm, etc.
In some embodiments, when the audio-content-related information received on the receiving side is information/data already converted from the audio content for tactile presentation, the received information/data can be forwarded directly to the haptic apparatus. In other examples, when the received audio-content-related information is the audio content itself, the receiving-side device can convert the audio content into information/data suited to tactile presentation and then provide/forward it to the haptic apparatus. Thus, the receiving-side device can include a conversion unit configured to obtain the information/data converted for the audio presentation manner, so that the audio can be presented.
According to embodiments of the present disclosure, the tactile providing apparatus can be implemented in various appropriate ways. As an example, it may include a vibrator capable of providing the user with vibrations corresponding to the characteristics of the audio content, such as the melody, so that a hearing-impaired person can feel the melody of the music. For example, the tactile providing apparatus can be implemented with an inertial actuator, a piezoelectric actuator, an electro-active polymer (EAP) actuator, and so on, which will not be described in detail here.
In some embodiments, when the collected user postures are the user's finger postures/movements, each of the user's fingers can correspond to a specific audio track, to set (e.g., generate or influence) a different timbre of the audio content.
According to embodiments of the present disclosure, the tactile providing apparatus includes at least one haptic unit, where each haptic unit can correspond to a specific audio track in the audio content to be presented. In particular, in some embodiments, the tactile providing apparatus may include a glove- or finger-cot-style haptic feedback device in which at least one finger part is provided with a vibration motor to provide vibration feedback. In particular, where each finger corresponds to a different audio track in the posture collection stage, in the haptic feedback stage the vibration motor of each finger part of the glove- or finger-cot-style haptic feedback device can vibrate according to the sound intensity and rhythm of the corresponding track.
In some embodiments, further, to improve the user experience of hearing-impaired people while balancing application cost, the tactile providing apparatus can be configured to provide haptic feedback only for audio tracks that are difficult for hearing-impaired people to hear, for example audio at specific frequencies (such as high-frequency tracks). As an example, in the case of a glove- or finger-cot-style haptic feedback device, the correspondence between multiple fingers and multiple audio tracks can be specified in advance, so that the collected posture of each finger can control the corresponding track, but a haptic feedback unit is provided only on the finger corresponding to a specific track, such as a high-frequency track, so that only the audio content of that specific track is fed back haptically to the user.
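As an illustrative sketch of such a pre-specified finger/track correspondence with haptic feedback on only one track (the finger names, track names, and which finger carries a motor are all assumptions made for the example):

```python
# Hypothetical finger/track layout specified in advance.
FINGER_TO_TRACK = {"thumb": "bass", "index": "drums", "middle": "vocals",
                   "ring": "high_synth", "pinky": "fx"}
# Only fingers whose tracks are hard to hear carry a vibration motor.
FINGERS_WITH_MOTORS = {"ring"}           # e.g. the high-frequency track only

def route_haptics(track_envelopes, t):
    """Return per-finger motor drive levels at frame index t.
    track_envelopes: dict of track name -> normalized energy envelope."""
    drive = {}
    for finger, track in FINGER_TO_TRACK.items():
        if finger in FINGERS_WITH_MOTORS:
            drive[finger] = float(track_envelopes[track][t])  # vibrate with the track
        else:
            drive[finger] = 0.0          # finger used for posture control only
    return drive
```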
In some embodiments, feedback can also be provided mainly for the audio frequencies, rhythms, etc. to which users are sensitive. As an example, considering that ordinary listeners are relatively sensitive to rhythmic elements such as drum beats, the feedback device can be configured to provide haptic feedback only for the track containing the drum beats. This can enhance the user's experience of the music; for example, the user feels feedback at particular rhythmic points while listening, further improving the user experience.
In some embodiments, where the tactile providing apparatus includes a glove- or finger-cot-style haptic feedback device, it can also be configured appropriately to facilitate recognition, control, and simplicity of user operation. As an example, it can be specified in advance that one or more specific fingers are dedicated to posture control while one or more other fingers are dedicated to haptic feedback.
Where there are a performer and listeners, the presentation devices worn on the performer's fingers can correspond to those of the listeners. For example, the presentation device worn on the performer's fingers can be the same as the listeners', with the performer's fingers corresponding to the listeners' fingers, e.g., the same finger corresponding to the same track. As another example, the device worn on the performer's fingers can differ from the listeners', with the correspondence specified in advance.
According to embodiments of the present disclosure, the receiving-side device can also cause the audio content to be presented to the user in other appropriate ways. As an example, sound, video display, lighting, visual effects, etc. can be used. In this case, the receiving-side device can convert the audio content into information/data suited to those other presentation manners and then provide/forward it to the corresponding presentation apparatuses. Of course, it should be noted that where multiple presentation apparatuses can be driven by data of the same format, the converted audio content can be used in common for the various presentation apparatuses.
As some examples, the audio content can be provided to the user in audio form. As an example, the audio content can be played to the user through a speaker or the like. In particular, the audio content can be further processed before playback, for example converted into low-frequency content suitable for hearing-impaired people. Necessary audio playback software, audio playback equipment, etc. may also be involved here, which will not be described in detail. Such a speaker may be the speaker of a portable device, a speaker installed in a cinema, KTV, bar, party venue, etc., or another appropriate type of speaker.
As other examples, the audio content can be provided to the user in video form, for example through a video presentation device. As an example, various appropriate videos can be presented to the user through various types of screens, such as a projector, a computer screen, or the screen of a portable device. Such videos may be video trajectories, special effects, pictures, short videos, etc. corresponding to the audio content, and can be set and stored in advance.
As another example, the audio can be presented through lighting effects; in particular, the lights of the presentation apparatus can flash in accordance with the rhythm of the audio content. Such a presentation apparatus can be fixed, such as a fixed screen or flashing device, or portable, such as the screen or flashing device of a portable device, e.g., a wristband or ornament. As an example, the lighting effects can be realized with LEDs on an electronic wristband.
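For illustration, a minimal sketch that derives LED brightness levels from audio energy onsets so that the lighting flashes with the rhythm; the frame size and scaling are illustrative assumptions:

```python
import numpy as np

def led_levels_from_beats(samples, rate, frame=1024, boost=4.0):
    """Derive LED brightness levels (0-255), one per frame, that flash
    on energy onsets, as a simple stand-in for rhythm-following lighting."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples) // frame
    energy = (samples[: n * frame].reshape(n, frame) ** 2).mean(axis=1)
    onset = np.maximum(np.diff(energy, prepend=energy[0]), 0)  # rises only
    levels = np.clip(boost * onset / (onset.max() + 1e-9), 0, 1)
    return (levels * 255).astype(np.uint8)
```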
Interactive audio presentation
The present disclosure further proposes optimized interactive audio presentation.
According to embodiments of the present disclosure, audio interaction can be realized based on user postures. In particular, the user's postures can be acquired to adjust the audio content. Specifically, in a scenario including a player and listeners, the player can present audio content to the listeners as described above; after receiving the audio content, the listeners can give feedback through their movements, for example expressing the user's emotions through movements, with the audio content being adaptively adjusted according to the users' movements, and so on. User interaction can thus be realized.
According to embodiments of the present disclosure, the audio content to be presented includes audio content obtained by adjusting audio content based on the user's posture information. In some embodiments, the adjustment includes at least one of: increasing or decreasing the volume of the audio content; adjusting the rhythm of the audio content; enhancing effects of the audio content; adding additional effects to the audio content.
In particular, as an example, a user posture can correspond to a specific audio unit, audio segment, etc., and a specific user posture can correspond to a modification of that specific audio unit, for example strengthening or weakening its intensity, changing its rhythm, etc. In some embodiments, when the user makes a specific movement with respect to a specific audio segment, the presentation effect of that audio segment can be adjusted accordingly. For example, a specific movement can indicate that the presentation effect of the segment should be increased, e.g., increasing the volume or the haptic effect, or decreased, e.g., decreasing the volume or the haptic effect. The adjustment here can be performed like the audio content modification described above. As an example, taking hand gestures made up/down/left/right relative to the body: the whole hand on the left indicates a low pitch, on the right a high pitch, upward an octave above a specific note, and downward an octave below that note.
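As a minimal sketch of the hand-position-to-pitch mapping just described (the octave thresholds and pitch range are illustrative assumptions):

```python
def position_to_pitch(x, y, base_note=60):
    """Map a normalized hand position (x, y in [0, 1], origin at the
    body's lower left) to a MIDI pitch: left = low, right = high, and
    the upper/lower regions shift the note an octave up/down."""
    note = int(base_note - 12 + round(x * 24))  # left-to-right sweep: +/- one octave
    if y > 0.66:
        note += 12                              # hand raised: octave up
    elif y < 0.33:
        note -= 12                              # hand lowered: octave down
    return max(0, min(127, note))

assert position_to_pitch(0.0, 0.5) < position_to_pitch(1.0, 0.5)
```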
In some examples, the user can express through movements the emotion of liking the specific audio content, for example through a specific hand-waving movement. In this way, that emotion can be presented via video in a party scenario.
It should be noted that interactive audio presentation application scenarios are particularly suited to influencing the presented audio content based on the postures of second users (e.g., listeners). In some embodiments, a second user can provide relatively coarse posture information, such as the posture information of only one finger, for corresponding control of the audio content, for example controlling the drum beat or the volume, thus simplifying the user's operation. FIG. 7B shows a schematic diagram of exemplary postures of the second user (listener) according to the present disclosure; for example, audio content control can be performed through continuous movement.
It should be noted that the first user can also participate in the interactive audio presentation scenario; for example, the music content can also be controlled or adjusted based on the first user's postures. In this case, the first user can also be treated as a particular second user, with the audio content adjusted based on the postures of both.
In some embodiments, adjusting the audio content based on user postures can follow various appropriate criteria. In particular, where there is at least one second user, or where the posture of at least one second user has been acquired, the presentation of the audio content can be adjusted based on statistical values of the posture information of the at least one second user. In this way, the second users' needs can be taken into account more comprehensively in influencing the audio content.
According to embodiments of the present disclosure, the statistical values of the posture information of the at least one second user include statistical values regarding the priority of the user posture information, and the presentation of the audio content is adjusted according to the highest-priority posture information among the posture information of the at least one second user. In some embodiments, the statistical values regarding the priority of the user posture information are determined as follows: the posture information of at least one audio user is weighted, where the weighting is based on at least one of the quantity of each piece of posture information, the priority of each piece of posture information, and the priority of the user corresponding to each piece of posture information.
In particular, where there are multiple second users, the presentation of the audio content is adjusted based in particular on statistical values of the posture information of the multiple second users. In this way, the collective feelings of the multiple second users can be taken into account in setting the audio, realizing audio presentation feedback jointly created by the group and thereby enhancing the listeners' sense of presence.
In some examples, priorities can be set for user movement postures, and feedback can then be provided according to the priority of the user postures. For example, the movements of the individual users are aggregated and sorted by priority, and the corresponding audio content adjustment is then made according to the highest-priority movement.
In some examples, feedback can be provided according to the number of user movements. For example, the movements of the individual users are aggregated, the numbers of identical or similar movements are counted, and the corresponding audio content adjustment is made according to the most numerous movement.
In some examples, the users' priorities can further be taken into account. In particular, priorities can be set for users, and the corresponding audio content adjustment can be made according to the movement of the highest-priority user.
In some embodiments, feedback can further be provided based on at least two of the priority of user movements, the priority of users, the number of user movements, etc. For example, the movements of the individual users can be aggregated and statistically processed to obtain a feedback result, and the corresponding audio content adjustment is made according to the feedback result.
In some examples, a priority value can be set for each user and a priority value for each user movement; the acquired user movements are then tallied to obtain a statistical value for each kind of user movement, for example by multiplying the user priority value or the movement priority value by the number of occurrences of that movement, thereby obtaining the statistical value for that user movement. The corresponding audio content adjustment is then made according to the movement with the highest statistical value.
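By way of illustration, a minimal sketch of the priority-weighted tally described above; the priority tables and gesture names are illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical priority tables; in practice these would be configured
# per venue/event.
GESTURE_PRIORITY = {"raise_hand": 3.0, "wave": 2.0, "tap": 1.0}
USER_PRIORITY = defaultdict(lambda: 1.0, {"vip_42": 2.5})

def pick_feedback_gesture(observations):
    """observations: list of (user_id, gesture).
    Weight each gesture by its count, its own priority, and the priority
    of the users who made it, then return the highest-scoring gesture."""
    scores = defaultdict(float)
    for user_id, gesture in observations:
        scores[gesture] += GESTURE_PRIORITY.get(gesture, 1.0) * USER_PRIORITY[user_id]
    return max(scores, key=scores.get) if scores else None

winner = pick_feedback_gesture([("u1", "wave"), ("u2", "wave"), ("vip_42", "tap")])
# the audio adjustment mapped to `winner` would then be applied
```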
Through the present disclosure, convenient interaction can be achieved. For example, user feedback can be conveniently obtained in any online or offline scenario to realize interaction; user feedback can be obtained even when both online and offline participants are present. Moreover, the audio content can be adjusted promptly according to user feedback to satisfy the users' modifications, and the adjustment can be carried out in appropriate ways, so that the audio content is adjusted more appropriately and a better presentation effect is obtained.
In the structural example of the above device, the processing circuit 302 may take the form of a general-purpose processor or may be a dedicated processing circuit, such as an ASIC. For example, the processing circuit 302 can be constructed from circuitry (hardware) or a central processing device (such as a central processing unit (CPU)). Furthermore, the processing circuit 302 may carry a program (software) for operating the circuitry (hardware) or central processing device. The program can be stored in a memory (such as one arranged in the device) or in an external storage medium connected from outside, and downloaded via a network (such as the Internet).
According to embodiments of the present disclosure, the processing circuit 302 may include units for realizing the functions described above, for example a receiving unit 304 configured to receive, from the control-side device for interactive audio presentation, information related to the audio content to be presented, where the audio content to be presented includes audio content set based on a user's posture information, and a control unit 306 configured to cause the audio content to be presented, where presenting the audio content includes presenting the audio content in a tactile manner. The control unit 306 can control a sending unit 308 to provide the audio content or its related information to the audio presentation apparatus for presentation of the audio content. In some embodiments, the audio presentation apparatus can be included in the receiving-side device, in particular in the control unit, so that the control unit can directly control the audio presentation apparatus to present the audio content.
In some embodiments, the processing circuit 302 may further include an acquisition unit 310 configured to acquire the user's posture information determined via the posture capture apparatus and to send the acquired user posture information to the control-side device via the sending unit 308. In particular, the acquisition unit 310 may be separate from the posture capture apparatus and acquire the user posture information from the posture capture apparatus. In another implementation, the acquisition unit 310 may include the posture capture apparatus.
According to embodiments of the present disclosure, the processing circuit 302 may further include a conversion unit 312 configured to convert the information related to the audio content into data suitable for the audio presentation apparatus, and to provide the converted data to the audio presentation apparatus via the sending unit 308.
It should be noted that although the units are shown as discrete units in FIG. 3, one or more of them may be combined into one unit or split into multiple units. Furthermore, some units may not be included in the processing circuit or even in the receiving-side device, and may therefore be shown with dashed lines. As an example, the acquisition unit 310 and the conversion unit 312 may even be outside the processing circuit 302, and may therefore also be shown with dashed lines.
It should be noted that the units described above are merely logical modules divided according to the specific functions they realize, and are not intended to limit specific implementations; they may be implemented, for example, in software, hardware, or a combination of software and hardware. In actual implementation, the units may be realized as independent physical entities, or by a single entity (e.g., a processor (CPU, DSP, etc.), an integrated circuit, etc.). Furthermore, where the units are shown with dashed lines in the drawings, this indicates that the units may not actually exist, with the operations/functions they realize being realized by the processing circuit itself.
It should be understood that FIG. 3A is merely a schematic structural configuration of the receiving-side device for audio presentation; the device 300 may also include other possible components, such as a memory, a network interface, a controller, a communication unit, etc., which are not shown for clarity. In particular, the processing circuit may be associated with the memory. For example, the processing circuit may be connected to the memory directly or indirectly (e.g., with other components in between) for access to relevant data. The memory can store various data and/or information generated by the processing circuit 302. The memory may also be located inside the device but outside the processing circuit, or even outside the device. The memory may be a volatile memory and/or a non-volatile memory; for example, it may include but is not limited to random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), read-only memory (ROM), and flash memory.
A flowchart of a receiving-side method for interactive audio presentation according to an embodiment of the present disclosure is described below with reference to FIG. 3B. In method 310, at step S311 (receiving step), information related to audio content to be presented is received from a control-side device for interactive audio presentation, where the audio content to be presented includes audio content set based on a user's posture information, and at step S313 (control step), the audio content is caused to be presented, where presenting the audio content includes presenting the audio content in a tactile manner. In some embodiments, the method 310 may optionally further include step S312 (conversion step) of converting the information related to the audio content into data suitable for an audio presentation apparatus, so that the converted data can be provided to the audio presentation apparatus.
It should be noted that these steps can be performed by any appropriate device or device element, such as the aforementioned receiving-side device, the processing circuit in the receiving-side device, or corresponding elements in the processing circuit. It should be noted that the audio presentation method according to embodiments of the present disclosure may also include other steps, such as the various further processing described above; such further processing can likewise be performed by appropriate devices or device elements, which will not be described in detail here.
A block diagram of a control-side device for audio presentation according to an embodiment of the present disclosure is described below with reference to FIG. 4A. The control-side device 400 includes a processing circuit 402 configured to: acquire audio content presentation indication information, the audio content presentation indication information including indication information based on the posture information of the user to whom the audio is to be presented, and send information related to the audio content to be presented to a receiving-side device for interactive audio presentation, where the audio content to be presented includes audio content set based on the user's posture information.
According to embodiments of the present disclosure, the processing circuit 402 may be further configured to acquire the user's posture information and set the audio content to be presented based on the acquired user posture information.
According to embodiments of the present disclosure, the processing circuit 402 may be further configured to determine statistical values of the posture information of at least one second user, the statistical values including statistical values regarding the priority of the user posture information, and to set the audio content to be presented according to the highest-priority posture information among the posture information of the at least one second user.
According to embodiments of the present disclosure, the processing circuit 402 can be implemented in various appropriate ways, like the processing circuit 302 described above, which will not be described in detail here. In particular, according to embodiments of the present disclosure, the processing circuit 402 may include units for realizing the functions described above, for example an acquisition unit 404 configured to acquire the audio content presentation indication information, the audio content presentation indication information including indication information based on the posture information of the user to whom the audio is to be presented, and a sending unit 406 configured to send the information related to the audio content to be presented to the receiving-side device for interactive audio presentation, where the audio content to be presented includes audio content set based on the user's posture information.
According to embodiments of the present disclosure, the processing circuit 402 may include a setting unit 408 configured to set the audio content to be presented based on the acquired user posture information.
According to embodiments of the present disclosure, the processing circuit may include a determination unit 410 configured to determine statistical values of the posture information of at least one second user, the statistical values including statistical values regarding the priority of the user posture information, whereby the setting unit 408 can set the audio content to be presented according to the highest-priority posture information among the posture information of the at least one second user.
According to embodiments of the present disclosure, the processing circuit 402 may further include a conversion unit 412 configured to convert the audio content into information suitable for reception by the receiving-side device, or even into data suitable for the audio presentation apparatus.
A flowchart of a control-side method for audio presentation according to an embodiment of the present disclosure is described below with reference to FIG. 4B. In method 410, at step S411 (acquisition step), audio content presentation indication information is acquired, the audio content presentation indication information including indication information based on the posture information of the user to whom the audio is to be presented, and at step S413 (sending step), information related to the audio content to be presented is sent to the receiving-side device for interactive audio presentation, where the audio content to be presented includes audio content set based on the user's posture information. In some embodiments, the method optionally includes step S412 (setting step) of setting the audio content to be presented based on the acquired user posture information.
It should be noted that these steps can be performed by any appropriate device or device element, such as the aforementioned control-side device, the processing circuit in the control-side device, or corresponding elements in the processing circuit. It should be noted that the audio presentation method according to embodiments of the present disclosure may also include other steps, such as the various further processing described above; such further processing can likewise be performed by appropriate devices or device elements, which will not be described in detail here.
According to embodiments of the present disclosure, a system for interactive audio presentation is also provided; the system may include the control-side device and receiving-side device described above, where a receiving-side device may be associated with at least one user, for example with multiple users including a first user and second users, with each user wearing a corresponding or associated receiving-side device.
In some embodiments, the control-side device receives the posture information of audio presentation users and sets the audio content to be presented based on that posture information, while the receiving-side device receives the information related to the audio content to be presented and causes the audio content to be presented, where in particular presenting the audio content includes presenting the audio content to the user in a tactile manner.
According to embodiments of the present disclosure, a method for interactive audio presentation is also provided, based on the control-side method and the receiving-side method described above.
As an example, in operation, the postures (e.g., finger postures) of a first user (e.g., the host at a party or in a bar, the performer at a concert, performers in various activities, etc.) can be acquired to generate, create, or start music, which is then presented haptically to second users (e.g., listeners, audience members, etc.) through their receiving devices, such as haptic feedback devices worn on the hand. On the other hand, while the music is being presented to the second users, their postures (e.g., finger postures) can also be acquired, reflecting the users' experience, feedback, needs, etc., and the music can be adjusted according to the second users' postures to better suit the users, further improving the user experience.
[Implementation example]
Implementations of embodiments of the present disclosure are described below, taking the presentation and/or feedback of audio content realized through finger or hand postures as an example.
It should be noted that the same applies to toe or foot postures, further to the postures of other body parts, and also to the postures of specific devices. Such a specific device can be called an interactive device, and the posture of the device can likewise be used to generate/adjust audio content. As an example, the specific device may be an anthropomorphic doll, a handheld device, or any of various devices worn on the body, and the specific postures of these devices can be collected to realize audio content presentation and/or feedback. As an example, in stage, party, and similar scenarios, specific handheld devices, such as glow sticks, can be issued to the entering audience, so that the postures/movements of the audience's handheld devices can serve as audience feedback for adjusting the audio content accordingly.
FIG. 9 shows an implementation of a receiving-side device according to the present disclosure. It can be implemented in the form of a finger cot/glove that can be worn on the user's fingers.
As an example, it may contain various appropriate elements/apparatuses. Here, 901 denotes the data receiving and sending unit of the receiving-side device, which can, for example, receive information related to the audio content and provide specific data to drive a tactile providing apparatus 903 and a lighting-effect presentation apparatus 904. Additionally, 901 can also perform data conversion on the audio content data. Optionally, the tactile providing apparatus 903 and the lighting-effect presentation apparatus 904 can also be included in the receiving-side device.
Optionally, the receiving-side device can also include a posture acquisition apparatus 902, which can acquire finger motion data and provide it to the control-side device via 901. Although FIG. 9 shows the receiving-side device as including only a single posture acquisition apparatus 902 worn on only one finger, this is merely exemplary; the apparatus can be worn on other fingers, or on more fingers.
It should be noted that the gloves worn by different users may be the same or different.
As an example, in a receiving-side device worn on a player's hand, two or more fingers can each be configured with a posture capture apparatus, so that the player's gestures can be detected more accurately, allowing the audio content to be set, created, or synthesized more accurately. By contrast, in a receiving-side device worn on a listener's hand, the posture capture apparatus can be worn on only one finger, with the tactile providing apparatus worn on another finger, which simplifies the listener's operation and makes the device easier for the listener to use.
It should be noted that, in embodiments of the present disclosure, the receiving-side device can also include an integrated antenna.
Additionally, the receiving-side device can also include a battery, data transmission/reception components such as an antenna, and optional data processing units such as phase shifters and filters, which will not be described in detail here.
FIG. 10 shows a schematic diagram of an exemplary implementation of interactive audio presentation between a player and listeners according to an embodiment of the present disclosure.
As shown in FIG. 10, the player performs relatively fine posture operations, and the music is then set based on the acquired posture information, for example by converting the postures into music, so as to present it to the player in an appropriate manner; and such music can be presented to the listeners, for example in a live scenario. On the other hand, the listeners can also perform posture operations, in particular relatively simple ones, and the music is then influenced based on the acquired posture information, for example by adjusting its rhythm, melody, etc., so as to be presented to the listeners in an appropriate manner, for example through sound, haptics, visual feedback, etc. Of course, the music adjusted in this way can also be presented to the player. Interactive audio presentation is realized in this way.
Such an implementation can be embodied in various appropriate application scenarios, for example in a music bar, especially one suitable for or able to accommodate hearing-impaired people. While the music bar is operating, listeners who wish to participate can pick up an appropriate receiving device at the entrance, such as the glove-style device described above, and then, during activities in the music bar, set the music the audience listens to, based on the postures corresponding to finger swings, through appropriate body movements, in particular swings of the fingers wearing the glove-style device. The specifics are as described above and will not be repeated here.
It should be noted that the technical solutions of the present disclosure can be applied to various appropriate tasks, including but not limited to hearing-impaired environments. In some embodiments, the task includes generating music and providing the music to users in video form, audio form, and the like.
In other embodiments, the technical solutions of the present disclosure can also be applied to providing other content, in other ways, for example providing audio content, or providing the audio content of a video, such as the dialogue in a movie or TV series, to hearing-impaired users. In still other embodiments, the technical solutions of the present disclosure can likewise be used for users with normal hearing.
The technology of the present disclosure can be used in many applications. For example, it can be used in live audio presentation settings. It can also be used in remote concerts, recitals, and the like, where the users' postures can likewise be captured, the audio content set or adjusted in the cloud, and the audio content then provided to users over the network.
In some embodiments, the solutions of the present disclosure can be implemented by software algorithms, so that they can be conveniently integrated into various types of devices that include presentation apparatuses, such as devices with various presentation apparatuses, e.g., finger cots. In particular, the methods of the present disclosure can be executed as computer programs, instructions, etc. by the processor of a portable device, so as to perform enhancement processing for audio presentation.
In addition, it should be understood that the series of processes and devices described above can also be implemented by software and/or firmware. Where they are implemented by software and/or firmware, a program constituting the software is installed from a storage medium or a network onto a computer with a dedicated hardware structure, for example the general-purpose personal computer 1100 shown in FIG. 11, which, when various programs are installed, can perform various functions and so on. FIG. 11 is a block diagram showing an example structure of a personal computer employable as a device in an embodiment of the present disclosure. In one example, the personal computer can correspond to the above exemplary device according to the present disclosure.
In FIG. 11, a central processing unit (CPU) 1101 performs various processes according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 into a random access memory (RAM) 1103. Data required when the CPU 1101 performs the various processes and the like are also stored in the RAM 1103 as needed.
The CPU 1101, the ROM 1102, and the RAM 1103 are connected to one another via a bus 1104. An input/output interface 1105 is also connected to the bus 1104.
The following components are connected to the input/output interface 1105: an input section 1106 including a keyboard, a mouse, etc.; an output section 1107 including a display, such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker, etc.; a storage section 1108 including a hard disk, etc.; and a communication section 1109 including a network interface card such as a LAN card, a modem, etc. The communication section 1109 performs communication processing via a network such as the Internet.
A drive 1110 is also connected to the input/output interface 1105 as needed. A removable medium 1111 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 1110 as needed, so that a computer program read therefrom is installed into the storage section 1108 as needed.
Where the above series of processes is implemented by software, a program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 1111.
Those skilled in the art should understand that such a storage medium is not limited to the removable medium 1111 shown in FIG. 11, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 1111 include magnetic disks (including floppy disks (registered trademark)), optical disks (including compact disc read-only memory (CD-ROM) and digital versatile discs (DVD)), magneto-optical disks (including MiniDiscs (MD) (registered trademark)), and semiconductor memories. Alternatively, the storage medium may be the ROM 1102, a hard disk contained in the storage section 1108, or the like, in which the program is stored and which is distributed to the user together with the device containing it.
It should be noted that the methods and devices described herein can be implemented as software, firmware, hardware, or any combination thereof. Some components may, for example, be implemented as software running on a digital signal processor or microprocessor; other components may, for example, be implemented as hardware and/or application-specific integrated circuits.
In addition, the methods and systems of the present disclosure can be carried out in a variety of ways, for example by software, hardware, firmware, or any combination thereof. The order of the steps of the methods described above is merely illustrative, and unless otherwise specifically stated, the steps of the methods of the present disclosure are not limited to the order specifically described above. Furthermore, in some embodiments, the present disclosure may also be embodied as a program recorded in a recording medium, including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for implementing the methods according to the present disclosure. Such storage media may include, but are not limited to, floppy disks, optical disks, magneto-optical disks, memory cards, memory sticks, and so on.
Those skilled in the art should appreciate that the boundaries between the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be performed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments. However, other modifications, variations, and substitutions are likewise possible. The specification and drawings are therefore to be regarded as illustrative rather than restrictive.
In addition, implementations of the present disclosure may also include the following exemplary embodiments (EEEs).
EEE 1. A receiving-side device for interactive audio presentation, the device comprising a processing circuit configured to: receive, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
EEE 2. The receiving-side device according to EEE 1, wherein the user's posture information includes at least one of the user's posture and posture motion information, wherein the posture motion information includes at least one of a posture motion direction, a trajectory, and a motion acceleration.
EEE 3. The receiving-side device according to EEE 1, wherein the processing circuit is further configured to: acquire the user's posture information determined via a posture capture apparatus, and send the acquired user posture information to the control-side device.
EEE 4. The receiving-side device according to EEE 3, wherein the posture capture apparatus includes a motion capture device wearable on at least one finger of the user, and the posture information is based on the posture information of each finger wearing a motion capture device and/or a combination thereof.
EEE 5. The receiving-side device according to EEE 1, wherein the user's posture information includes posture information of a first user, and wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed from audio units, or a specific combination thereof, corresponding to the first user's posture information.
EEE 6. The receiving-side device according to EEE 1, wherein the user's posture information includes posture information of a second user, and wherein the audio content to be presented includes audio content obtained by adjusting audio content based on the second user's posture information.
EEE 7. The receiving-side device according to EEE 6, wherein adjusting the audio content based on the second user's posture information includes at least one of the following:
increasing or decreasing the volume of the audio content;
adjusting the rhythm of the audio content;
enhancing effects of the audio content;
adding additional effects to the audio content.
EEE 8. The receiving-side device according to EEE 7, wherein adjusting the audio content based on the second user's posture information includes:
adjusting the presentation of the audio content based on statistical values of posture information of a plurality of second users.
EEE 9. The receiving-side device according to EEE 8, wherein the statistical values of the posture information of the plurality of second users include statistical values regarding the priority of user posture information, and the presentation of the audio content is adjusted according to the highest-priority posture information among the posture information of the plurality of second users.
EEE 10. The receiving-side device according to EEE 9, wherein the statistical values regarding the priority of user posture information are determined as follows:
weighting the posture information of a plurality of audio users, wherein the weighting is performed based on at least one of the quantity of each piece of posture information, the priority of each piece of posture information, and the priority of the user corresponding to each piece of posture information.
EEE 11. The receiving-side device according to EEE 1, wherein causing the audio content to be presented further includes: converting the information related to the audio content into data suitable for an audio presentation apparatus; and providing the converted data to the audio presentation apparatus.
EEE 12. The receiving-side device according to any one of EEEs 1-11, wherein the audio presentation apparatus is a tactile providing apparatus, such that the audio content is provided to the user in a tactile manner via the tactile providing apparatus.
EEE 13. The receiving-side device according to EEE 12, wherein the tactile providing apparatus includes at least one haptic unit, wherein each haptic unit corresponds to a specific audio track in the audio content to be presented.
EEE 14. A control-side device for interactive audio presentation, the device comprising a processing circuit configured to: acquire audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and send information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
EEE 15. The control-side device according to EEE 14, wherein the processing circuit is further configured to: acquire the user's posture information, and set the audio content to be presented based on the acquired user posture information.
EEE 16. The control-side device according to EEE 15, wherein the user's posture information includes posture information of a first user, and wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed from audio units, or a specific combination thereof, corresponding to the first user's posture information, and/or
wherein the user's posture information includes posture information of a second user, and wherein the audio content to be presented includes audio content obtained by adjusting audio content based on the second user's posture information.
EEE 17. The control-side device according to EEE 16, wherein the processing circuit is further configured to:
determine statistical values of posture information of a plurality of second users, the statistical values including statistical values regarding the priority of user posture information, and
set the audio content to be presented according to the highest-priority posture information among the posture information of the plurality of second users.
EEE 18. A receiving-side method for interactive audio presentation, comprising: receiving, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
EEE 19. The method according to EEE 18, further comprising: acquiring the user's posture information determined via a posture capture apparatus, and sending the acquired user posture information to the control-side device.
EEE 20. The method according to EEE 18, wherein causing the audio content to be presented further includes: converting the information related to the audio content into data suitable for an audio presentation apparatus; and providing the converted data to the audio presentation apparatus.
EEE 21. A control-side method for interactive audio presentation, comprising: acquiring audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and sending information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
EEE 22. The method according to EEE 21, further comprising: acquiring the user's posture information, and setting the audio content to be presented based on the acquired user posture information.
EEE 23. The method according to EEE 21, further comprising: determining statistical values of posture information of a plurality of second users, the statistical values including statistical values regarding the priority of user posture information, and setting the audio content to be presented according to the highest-priority posture information among the posture information of the plurality of second users.
EEE 24. An interactive audio presentation system, comprising: a control-side device for interactive audio presentation, configured to receive posture information of audio presentation users and set audio content to be presented based on the posture information of the audio presentation users; and a receiving-side device for interactive audio presentation, configured to receive information related to the audio content to be presented and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
EEE 25. The system according to EEE 24, wherein the receiving-side device is further configured to: acquire the user's posture information, and send the posture information to the control-side device.
EEE 26. The system according to EEE 24, wherein the control-side device is further configured to: acquire posture information of multiple users, and set the audio content to be presented based on statistical values of the posture information of the multiple users.
EEE 27. An interactive audio presentation method, comprising: receiving posture information of audio presentation users, setting audio content to be presented based on the posture information of the audio presentation users, and causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
EEE 28. A device, comprising at least one processor; and at least one storage device having instructions stored thereon that, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of EEEs 18-23 and 27.
EEE 29. A storage medium storing instructions that, when executed by a processor, enable the processor to perform the method according to any one of EEEs 18-23 and 27.
EEE 30. A computer program product comprising instructions that, when executed by a processor, enable the processor to perform the method according to any one of EEEs 18-23 and 27.
EEE 31. A computer program comprising instructions that, when executed by a processor, enable the processor to perform the method according to any one of EEEs 18-23 and 27.
Although the present disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made without departing from the spirit and scope of the present disclosure as defined by the appended claims. Moreover, the terms "include", "comprise", and any other variants thereof in the embodiments of the present disclosure are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Although some specific embodiments of the present disclosure have been described in detail, those skilled in the art should understand that the above embodiments are merely illustrative and do not limit the scope of the present disclosure. Those skilled in the art should understand that the above embodiments can be combined, modified, or substituted without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (31)

  1. A receiving-side device for interactive audio presentation, the device comprising a processing circuit configured to:
    receive, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and
    cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
  2. The receiving-side device according to claim 1, wherein the user's posture information includes at least one of the user's posture and posture motion information,
    wherein the posture motion information includes at least one of a posture motion direction, a trajectory, and a motion acceleration.
  3. The receiving-side device according to claim 1 or 2, wherein the processing circuit is further configured to:
    acquire the user's posture information determined via a posture capture apparatus, and
    send the acquired user posture information to the control-side device.
  4. The receiving-side device according to claim 3, wherein the posture capture apparatus includes a motion capture device wearable on at least one finger of the user, and the posture information is based on the posture information of each finger wearing a motion capture device and/or a combination thereof.
  5. The receiving-side device according to any one of claims 1-4, wherein the user's posture information includes posture information of a first user, and
    wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed from audio units, or a specific combination thereof, corresponding to the first user's posture information.
  6. The receiving-side device according to any one of claims 1-5, wherein the user's posture information includes posture information of a second user, and
    wherein the audio content to be presented includes audio content obtained by adjusting audio content based on the second user's posture information.
  7. The receiving-side device according to claim 6, wherein adjusting the audio content based on the second user's posture information includes at least one of the following:
    increasing or decreasing the volume of the audio content;
    adjusting the rhythm of the audio content;
    enhancing effects of the audio content;
    adding additional effects to the audio content.
  8. The receiving-side device according to claim 6 or 7, wherein adjusting the audio content based on the second user's posture information includes:
    adjusting the presentation of the audio content based on statistical values of posture information of a plurality of second users.
  9. The receiving-side device according to claim 8, wherein the statistical values of the posture information of the plurality of second users include statistical values regarding the priority of user posture information, and the presentation of the audio content is adjusted according to the highest-priority posture information among the posture information of the plurality of second users.
  10. The receiving-side device according to claim 9, wherein the statistical values regarding the priority of user posture information are determined as follows:
    weighting the posture information of a plurality of audio users, wherein the weighting is performed based on at least one of the quantity of each piece of posture information, the priority of each piece of posture information, and the priority of the user corresponding to each piece of posture information.
  11. The receiving-side device according to any one of claims 1-10, wherein causing the audio content to be presented further includes:
    converting the information related to the audio content into data suitable for an audio presentation apparatus; and
    providing the converted data to the audio presentation apparatus.
  12. The receiving-side device according to claim 11, wherein the audio presentation apparatus is a tactile providing apparatus, such that the audio content is provided to the user in a tactile manner via the tactile providing apparatus.
  13. The receiving-side device according to claim 12, wherein the tactile providing apparatus includes at least one haptic unit, wherein each haptic unit corresponds to a specific audio track in the audio content to be presented.
  14. A control-side device for interactive audio presentation, the device comprising a processing circuit configured to:
    acquire audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and
    send information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
  15. The control-side device according to claim 14, wherein the processing circuit is further configured to:
    acquire the user's posture information, and
    set the audio content to be presented based on the acquired user posture information.
  16. The control-side device according to claim 14 or 15,
    wherein the user's posture information includes posture information of a first user, and wherein the audio content to be presented includes at least one of specific audio content designated by the first user's posture information and audio content constructed from audio units, or a specific combination thereof, corresponding to the first user's posture information, and/or
    wherein the user's posture information includes posture information of a second user, and wherein the audio content to be presented includes audio content obtained by adjusting audio content based on the second user's posture information.
  17. The control-side device according to claim 16, wherein the processing circuit is further configured to:
    determine statistical values of posture information of a plurality of second users, the statistical values including statistical values regarding the priority of user posture information, and
    set the audio content to be presented according to the highest-priority posture information among the posture information of the plurality of second users.
  18. A receiving-side method for interactive audio presentation, comprising:
    receiving, from a control-side device for interactive audio presentation, information related to audio content to be presented, wherein the audio content to be presented includes audio content set based on a user's posture information, and
    causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content in a tactile manner.
  19. The method according to claim 18, further comprising:
    acquiring the user's posture information determined via a posture capture apparatus, and
    sending the acquired user posture information to the control-side device.
  20. The method according to claim 18 or 19, wherein causing the audio content to be presented further includes:
    converting the information related to the audio content into data suitable for an audio presentation apparatus; and
    providing the converted data to the audio presentation apparatus.
  21. A control-side method for interactive audio presentation, comprising:
    acquiring audio content presentation indication information, the audio content presentation indication information including indication information based on posture information of a user to whom the audio is to be presented, and
    sending information related to audio content to be presented to a receiving-side device for interactive audio presentation, wherein the audio content to be presented includes audio content set based on the user's posture information.
  22. The method according to claim 21, further comprising:
    acquiring the user's posture information, and
    setting the audio content to be presented based on the acquired user posture information.
  23. The method according to claim 21 or 22, further comprising:
    determining statistical values of posture information of a plurality of second users, the statistical values including statistical values regarding the priority of user posture information, and
    setting the audio content to be presented according to the highest-priority posture information among the posture information of the plurality of second users.
  24. An interactive audio presentation system, comprising:
    a control-side device for interactive audio presentation, configured to receive posture information of audio presentation users and set audio content to be presented based on the posture information of the audio presentation users; and
    a receiving-side device for interactive audio presentation, configured to receive information related to the audio content to be presented and cause the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
  25. The system according to claim 24, wherein the receiving-side device is further configured to: acquire the user's posture information, and send the posture information to the control-side device.
  26. The system according to claim 24 or 25, wherein the control-side device is further configured to: acquire posture information of multiple users, and set the audio content to be presented based on statistical values of the posture information of the multiple users.
  27. An interactive audio presentation method, comprising: receiving posture information of audio presentation users, setting audio content to be presented based on the posture information of the audio presentation users; and causing the audio content to be presented, wherein presenting the audio content includes presenting the audio content to the user in a tactile manner.
  28. A device, comprising:
    at least one processor; and
    at least one storage device having instructions stored thereon that, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 18-23 and 27.
  29. A storage medium storing instructions that, when executed by a processor, enable the processor to perform the method according to any one of claims 18-23 and 27.
  30. A computer program product comprising instructions that, when executed by a processor, enable the processor to perform the method according to any one of claims 18-23 and 27.
  31. A computer program comprising instructions that, when executed by a processor, enable the processor to perform the method according to any one of claims 18-23 and 27.
PCT/CN2023/138019 2022-12-12 2023-12-12 Audio presentation method and device WO2024125478A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202380083647.2A CN120322747A (zh) 2022-12-12 2023-12-12 Audio presentation method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211599599.6 2022-12-12
CN202211599599.6A CN118226946A (zh) 2022-12-12 2022-12-12 Audio presentation method and device

Publications (1)

Publication Number Publication Date
WO2024125478A1 (zh)

Family

ID=91484384

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/138019 WO2024125478A1 (zh) 2022-12-12 2023-12-12 音频呈现方法和设备

Country Status (2)

Country Link
CN (2) CN118226946A (zh)
WO (1) WO2024125478A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110228962A1 (en) * 2008-09-19 2011-09-22 National University Of Singapore Haptic Chair Sound Enhancing System With Audiovisual Display
US20150103154A1 (en) * 2013-10-10 2015-04-16 Sony Corporation Dual audio video output devices with one device configured for the sensory impaired
US20190373355A1 (en) * 2018-05-30 2019-12-05 Bose Corporation Audio eyeglasses with gesture control
CN112817557A (zh) * 2021-02-08 2021-05-18 海信视像科技股份有限公司 一种基于多人手势识别的音量调节方法及显示设备

Also Published As

Publication number Publication date
CN120322747A (zh) 2025-07-15
CN118226946A (zh) 2024-06-21

Similar Documents

Publication Publication Date Title
US11625994B2 (en) Vibrotactile control systems and methods
  • JP4555072B2 Localized audio network and associated digital accessories
US10290291B2 (en) Information processing apparatus, method, and program for controlling output of a processing pattern in association with reproduced content
US20150161908A1 (en) Method and apparatus for providing sensory information related to music
US20200243055A1 (en) Method and System for Musical Communication
Frid Accessible digital musical instruments-a survey of inclusive instruments
  • CN104700860B Rhythm visualization method and system
Hunt et al. Multiple media interfaces for music therapy
CN109119057A (zh) 音乐创作方法、装置及存储介质和穿戴式设备
Clarke Rhythm/body/motion: Tricky's contradictory dance music
  • JP2023025013A Singing assistance device for music therapy
  • JP2014123085A Apparatus, method, and program for more effectively staging and presenting body movements performed by viewers along with singing in karaoke
WO2022163137A1 (ja) 情報処理装置、情報処理方法、およびプログラム
WO2024125478A1 (zh) 音频呈现方法和设备
US12008892B2 (en) Vibrotactile control systems and methods
  • CN114639394B Method and apparatus for implementing a virtual performance partner
WO2025121219A1 (en) Information processing apparatus, method, and program
US20230237981A1 (en) Method and apparatus for implementing virtual performance partner
Hólmgeirsson Enhancing the Performer-Spectator Communication at Electronic Concerts
Civit et al. A Framework for AI assisted Musical Devices
Martin Mobile computer music for percussionists
  • JP2023174364A Karaoke apparatus
  • WO2023084933A1 Information processing device, information processing method, and program
WO2025122287A1 (en) Methods and systems for processing audio signals to identify sentiments for use in controlling game assets
  • KR20250030269A System for synchronizing and playing tactile signal content and video content

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23902674

Country of ref document: EP

Kind code of ref document: A1