SG11202003722SA - Speaker separation model training method, two-speaker separation method and computing device
- Publication number: SG11202003722SA
- Authority: SG (Singapore)
- Prior art keywords: speaker separation, computing device, model training, speaker, training method
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
Landscapes
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- Business, Economics & Management (AREA)
- Circuit For Audible Band Transducer (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810519521.6A CN108766440B (en) | 2018-05-28 | 2018-05-28 | Speaker separation model training method, two-speaker separation method and related equipment |
PCT/CN2018/100174 WO2019227672A1 (en) | 2018-05-28 | 2018-08-13 | Voice separation model training method, two-speaker separation method and associated apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
SG11202003722SA (en) | 2020-12-30 |
Family
ID=64006219
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202003722SA (en) | 2018-05-28 | 2018-08-13 | Speaker separation model training method, two-speaker separation method and computing device |
Country Status (5)
Country | Link |
---|---|
US (1) | US11158324B2 (en) |
JP (1) | JP2020527248A (en) |
CN (1) | CN108766440B (en) |
SG (1) | SG11202003722SA (en) |
WO (1) | WO2019227672A1 (en) |
Families Citing this family (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109545186B (en) * | 2018-12-16 | 2022-05-27 | 魔门塔(苏州)科技有限公司 | Speech recognition training system and method |
CN109686382A (en) * | 2018-12-29 | 2019-04-26 | 平安科技(深圳)有限公司 | A kind of speaker clustering method and device |
CN110197665B (en) * | 2019-06-25 | 2021-07-09 | 广东工业大学 | A voice separation and tracking method for public security criminal investigation monitoring |
CN110444223B (en) * | 2019-06-26 | 2023-05-23 | 平安科技(深圳)有限公司 | Speaker separation method and device based on cyclic neural network and acoustic characteristics |
CN110289002B (en) * | 2019-06-28 | 2021-04-27 | 四川长虹电器股份有限公司 | End-to-end speaker clustering method and system |
CN110390946A (en) * | 2019-07-26 | 2019-10-29 | 龙马智芯(珠海横琴)科技有限公司 | A kind of audio signal processing method, device, electronic equipment and storage medium |
CN110718228B (en) * | 2019-10-22 | 2022-04-12 | 中信银行股份有限公司 | Voice separation method and device, electronic equipment and computer readable storage medium |
CN111312256B (en) * | 2019-10-31 | 2024-05-10 | 平安科技(深圳)有限公司 | Voice identification method and device and computer equipment |
CN110853618B (en) * | 2019-11-19 | 2022-08-19 | 腾讯科技(深圳)有限公司 | Language identification method, model training method, device and equipment |
CN110992940B (en) | 2019-11-25 | 2021-06-15 | 百度在线网络技术(北京)有限公司 | Voice interaction method, device, equipment and computer-readable storage medium |
CN110992967A (en) * | 2019-12-27 | 2020-04-10 | 苏州思必驰信息科技有限公司 | Voice signal processing method and device, hearing aid and storage medium |
CN111145761B (en) * | 2019-12-27 | 2022-05-24 | 携程计算机技术(上海)有限公司 | Model training method, voiceprint confirmation method, system, device and medium |
CN111191787B (en) * | 2019-12-30 | 2022-07-15 | 思必驰科技股份有限公司 | Training method and device of neural network for extracting speaker embedded features |
CN111370032B (en) * | 2020-02-20 | 2023-02-14 | 厦门快商通科技股份有限公司 | Voice separation method, system, mobile terminal and storage medium |
JP7359028B2 (en) * | 2020-02-21 | 2023-10-11 | 日本電信電話株式会社 | Learning devices, learning methods, and learning programs |
CN111370019B (en) * | 2020-03-02 | 2023-08-29 | 字节跳动有限公司 | Sound source separation method and device, and neural network model training method and device |
CN111009258A (en) * | 2020-03-11 | 2020-04-14 | 浙江百应科技有限公司 | Single sound channel speaker separation model, training method and separation method |
US11392639B2 (en) * | 2020-03-31 | 2022-07-19 | Uniphore Software Systems, Inc. | Method and apparatus for automatic speaker diarization |
CN111477240B (en) * | 2020-04-07 | 2023-04-07 | 浙江同花顺智能科技有限公司 | Audio processing method, device, equipment and storage medium |
CN111524521B (en) | 2020-04-22 | 2023-08-08 | 北京小米松果电子有限公司 | Voiceprint extraction model training method, voiceprint recognition method, voiceprint extraction model training device and voiceprint recognition device |
CN111524527B (en) * | 2020-04-30 | 2023-08-22 | 合肥讯飞数码科技有限公司 | Speaker separation method, speaker separation device, electronic device and storage medium |
CN111613249A (en) * | 2020-05-22 | 2020-09-01 | 云知声智能科技股份有限公司 | Voice analysis method and equipment |
CN111640438B (en) * | 2020-05-26 | 2023-09-05 | 同盾控股有限公司 | Audio data processing method and device, storage medium and electronic equipment |
CN111680631B (en) * | 2020-06-09 | 2023-12-22 | 广州视源电子科技股份有限公司 | Model training method and device |
CN111785291B (en) * | 2020-07-02 | 2024-07-02 | 北京捷通华声科技股份有限公司 | Voice separation method and voice separation device |
CN111933153B (en) * | 2020-07-07 | 2024-03-08 | 北京捷通华声科技股份有限公司 | Voice segmentation point determining method and device |
CN111985934B (en) * | 2020-07-30 | 2024-07-12 | 浙江百世技术有限公司 | Intelligent customer service dialogue model construction method and application |
CN111899755A (en) * | 2020-08-11 | 2020-11-06 | 华院数据技术(上海)有限公司 | Speaker voice separation method and related equipment |
CN112071329B (en) * | 2020-09-16 | 2022-09-16 | 腾讯科技(深圳)有限公司 | Multi-person voice separation method and device, electronic equipment and storage medium |
CN112071330B (en) * | 2020-09-16 | 2022-09-20 | 腾讯科技(深圳)有限公司 | Audio data processing method and device and computer readable storage medium |
CN112489682B (en) * | 2020-11-25 | 2023-05-23 | 平安科技(深圳)有限公司 | Audio processing method, device, electronic equipment and storage medium |
CN112700766B (en) * | 2020-12-23 | 2024-03-19 | 北京猿力未来科技有限公司 | Training method and device of voice recognition model, and voice recognition method and device |
CN114676618A (en) * | 2020-12-25 | 2022-06-28 | 北京搜狗科技发展有限公司 | Optimization method of speaker segmentation model, speaker segmentation method and device |
CN112820292B (en) * | 2020-12-29 | 2023-07-18 | 平安银行股份有限公司 | Method, device, electronic device and storage medium for generating meeting summary |
CN112289323B (en) * | 2020-12-29 | 2021-05-28 | 深圳追一科技有限公司 | Voice data processing method and device, computer equipment and storage medium |
KR20220098314A (en) * | 2020-12-31 | 2022-07-12 | 센스타임 인터내셔널 피티이. 리미티드. | Training method and apparatus for neural network and related object detection method and apparatus |
KR20220099003A (en) | 2021-01-05 | 2022-07-12 | 삼성전자주식회사 | Electronic device and Method for controlling the electronic device thereof |
US12125498B2 (en) | 2021-02-10 | 2024-10-22 | Samsung Electronics Co., Ltd. | Electronic device supporting improved voice activity detection |
KR20220115453A (en) * | 2021-02-10 | 2022-08-17 | 삼성전자주식회사 | Electronic device supporting improved voice activity detection |
KR20220136750A (en) | 2021-04-01 | 2022-10-11 | 삼성전자주식회사 | Electronic apparatus for processing user utterance and controlling method thereof |
CN113178205B (en) * | 2021-04-30 | 2024-07-05 | 平安科技(深圳)有限公司 | Voice separation method, device, computer equipment and storage medium |
KR20220169242A (en) * | 2021-06-18 | 2022-12-27 | 삼성전자주식회사 | Electronic device and method for personalized audio processing of the electronic device |
US20220406324A1 (en) * | 2021-06-18 | 2022-12-22 | Samsung Electronics Co., Ltd. | Electronic device and personalized audio processing method of the electronic device |
WO2023281717A1 (en) * | 2021-07-08 | 2023-01-12 | 日本電信電話株式会社 | Speaker diarization method, speaker diarization device, and speaker diarization program |
CN113362831A (en) * | 2021-07-12 | 2021-09-07 | 科大讯飞股份有限公司 | Speaker separation method and related equipment thereof |
CN113571085B (en) * | 2021-07-24 | 2023-09-22 | 平安科技(深圳)有限公司 | Voice separation method, system, device and storage medium |
CN113657289B (en) * | 2021-08-19 | 2023-08-08 | 北京百度网讯科技有限公司 | Training method and device of threshold estimation model and electronic equipment |
JP7643572B2 (en) | 2021-09-21 | 2025-03-11 | 日本電信電話株式会社 | Estimation device, estimation method, and estimation program |
KR102823706B1 (en) * | 2021-09-23 | 2025-06-24 | 한국전자통신연구원 | Apparatus and method for separating voice section |
CN113870893B (en) * | 2021-09-27 | 2024-09-03 | 中国科学院声学研究所 | Multichannel double-speaker separation method and system |
CN115881092A (en) * | 2021-09-29 | 2023-03-31 | 中移动信息技术有限公司 | Method and device for voice subject recognition |
CN114220452A (en) * | 2021-11-30 | 2022-03-22 | 北京百度网讯科技有限公司 | A speaker separation method, device, electronic device and storage medium |
CN116416999A (en) * | 2021-12-30 | 2023-07-11 | 马上消费金融股份有限公司 | Training method of speaker segmentation model, speaker segmentation method and device |
CN114363531B (en) * | 2022-01-14 | 2023-08-01 | 中国平安人寿保险股份有限公司 | H5-based text description video generation method, device, equipment and medium |
CN114708850A (en) * | 2022-02-24 | 2022-07-05 | 厦门快商通科技股份有限公司 | Interactive voice segmentation and clustering method, device and equipment |
CN114664323B (en) * | 2022-03-28 | 2025-03-25 | 广东电网有限责任公司 | A partial discharge audio recognition method and system based on power frequency sine similarity |
CN114707668B (en) * | 2022-04-29 | 2025-07-11 | 思必驰科技股份有限公司 | Self-supervised speaker model training method, electronic device and storage medium |
CN115206326A (en) * | 2022-05-27 | 2022-10-18 | 厦门中创环保科技股份有限公司 | Voiceprint recognition method, system, storage medium and program product |
CN115223569B (en) * | 2022-06-02 | 2025-02-28 | 康佳集团股份有限公司 | Speaker verification method, terminal and storage medium based on deep neural network |
CN115171716B (en) * | 2022-06-14 | 2024-04-19 | 武汉大学 | A method, system and electronic device for continuous speech separation based on spatial feature clustering |
US12293771B2 (en) * | 2022-09-06 | 2025-05-06 | Dell Products, L.P. | Equalization of audio during a collaboration session in a heterogenous computing platform |
CN115659162B (en) * | 2022-09-15 | 2023-10-03 | 云南财经大学 | Method, system and equipment for extracting intra-pulse characteristics of radar radiation source signals |
CN116092512A (en) * | 2022-12-30 | 2023-05-09 | 重庆邮电大学 | Small sample voice separation method based on data generation |
CN116246636B (en) * | 2023-02-20 | 2025-07-11 | 阿里巴巴达摩院(杭州)科技有限公司 | Voiceprint feature extraction method, speaker recognition method, model training method and device |
WO2025022593A1 (en) * | 2023-07-25 | 2025-01-30 | 日本電信電話株式会社 | Learning device, inference device, inference method, and inference program |
CN117037255B (en) * | 2023-08-22 | 2024-06-21 | 北京中科深智科技有限公司 | 3D Expression Synthesis Method Based on Directed Graph |
CN118824276B (en) * | 2024-09-14 | 2025-02-25 | 北京云行在线软件开发有限责任公司 | A method and device for audio role recognition of online car-hailing based on voiceprint clustering |
CN119296560B (en) * | 2024-12-11 | 2025-03-14 | 杭州华亭科技有限公司 | A speech noise reduction system in a multi-noise environment |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0272398A (en) | 1988-09-07 | 1990-03-12 | Hitachi Ltd | Audio signal preprocessing device |
KR100612840B1 (en) | 2004-02-18 | 2006-08-18 | 삼성전자주식회사 | Model Variation Based Speaker Clustering Method, Speaker Adaptation Method, and Speech Recognition Apparatus Using Them |
JP2008051907A (en) | 2006-08-22 | 2008-03-06 | Toshiba Corp | Utterance section identification apparatus and method |
WO2016095218A1 (en) * | 2014-12-19 | 2016-06-23 | Dolby Laboratories Licensing Corporation | Speaker identification using spatial information |
JP6430318B2 (en) | 2015-04-06 | 2018-11-28 | 日本電信電話株式会社 | Unauthorized voice input determination device, method and program |
CN106683661B (en) * | 2015-11-05 | 2021-02-05 | 阿里巴巴集团控股有限公司 | Role separation method and device based on voice |
JP2017120595A (en) | 2015-12-29 | 2017-07-06 | 花王株式会社 | Evaluation method of cosmetic application |
KR102450441B1 (en) * | 2016-07-14 | 2022-09-30 | 매직 립, 인코포레이티드 | Deep Neural Networks for Iris Identification |
US9824692B1 (en) * | 2016-09-12 | 2017-11-21 | Pindrop Security, Inc. | End-to-end speaker recognition using deep neural network |
JP6365859B1 (en) | 2016-10-11 | 2018-08-01 | SZ DJI Technology Co., Ltd. | IMAGING DEVICE, IMAGING SYSTEM, MOBILE BODY, METHOD, AND PROGRAM |
US10497382B2 (en) * | 2016-12-16 | 2019-12-03 | Google Llc | Associating faces with voices for speaker diarization within videos |
CN107180628A (en) * | 2017-05-19 | 2017-09-19 | 百度在线网络技术(北京)有限公司 | Set up the method, the method for extracting acoustic feature, device of acoustic feature extraction model |
CN107221320A (en) * | 2017-05-19 | 2017-09-29 | 百度在线网络技术(北京)有限公司 | Train method, device, equipment and the computer-readable storage medium of acoustic feature extraction model |
CN107342077A (en) | 2017-05-27 | 2017-11-10 | 国家计算机网络与信息安全管理中心 | A kind of speaker segmentation clustering method and system based on factorial analysis |
CN107680611B (en) | 2017-09-13 | 2020-06-16 | 电子科技大学 | Single-channel sound separation method based on convolutional neural network |
US10529349B2 (en) * | 2018-04-16 | 2020-01-07 | Mitsubishi Electric Research Laboratories, Inc. | Methods and systems for end-to-end speech separation with unfolded iterative phase reconstruction |
US11010179B2 (en) * | 2018-04-20 | 2021-05-18 | Facebook, Inc. | Aggregating semantic information for improved understanding of users |
2018
- 2018-05-28: CN, application CN201810519521.6A, published as CN108766440B, status: Active
- 2018-08-13: US, application US16/652,452, published as US11158324B2, status: Active
- 2018-08-13: JP, application JP2019572830A, published as JP2020527248A, status: Pending
- 2018-08-13: SG, application SG11202003722SA, published as SG11202003722SA, status: unknown
- 2018-08-13: WO, application PCT/CN2018/100174, published as WO2019227672A1, status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
US11158324B2 (en) | 2021-10-26 |
CN108766440B (en) | 2020-01-14 |
CN108766440A (en) | 2018-11-06 |
WO2019227672A1 (en) | 2019-12-05 |
US20200234717A1 (en) | 2020-07-23 |
JP2020527248A (en) | 2020-09-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
SG11202003722SA (en) | Speaker separation model training method, two-speaker separation method and computing device | |
EP3690763A4 (en) | Machine learning model training method and device, and electronic device | |
EP3683725A4 (en) | Abstract description generation method, abstract description model training method and computer device | |
EP3951654A4 (en) | Image classification model training method, and image processing method and device | |
EP3582118A4 (en) | Method and apparatus for training classification model | |
EP3872705A4 (en) | Detection model training method and apparatus and terminal device | |
EP3633610A4 (en) | Learning device, learning method, learning model, estimation device, and grip system | |
EP3537349A4 (en) | Machine learning model training method and device | |
SG11202000749RA (en) | Model training method and apparatus | |
EP3862893A4 (en) | Recommendation model training method, recommendation method, device, and computer-readable medium | |
EP3648044A4 (en) | Method, apparatus, and device for training risk control model and risk control | |
EP3690768A4 (en) | User behavior prediction method and apparatus, and behavior prediction model training method and apparatus | |
EP3503980A4 (en) | Exercise system and method | |
EP3179473A4 (en) | Training method and apparatus for language model, and device | |
EP3579169A4 (en) | Learned model provision method, and learned model provision device | |
EP4080407A4 (en) | Inference computing apparatus, model training apparatus, and inference computing system | |
EP3678072A4 (en) | Model integration method and device | |
SG11202104492QA (en) | Model training methods, apparatuses, and systems | |
EP3605405A4 (en) | Server device, trained model providing program, trained model providing method, and trained model providing system | |
EP3565371A4 (en) | Session processing method and device | |
EP3425527A4 (en) | Method of training machine learning system, and training system | |
EP3193328A4 (en) | Method and device for performing voice recognition using grammar model | |
EP3136677A4 (en) | Voice verification method, device and system | |
EP3349171A4 (en) | Credit-score model training method, and credit-score calculation method, device, and server | |
SG11202107218TA (en) | Separation device and separation method |