CN111402925A

CN111402925A - Voice adjusting method and device, electronic equipment, vehicle-mounted system and readable medium

Info

Publication number: CN111402925A
Application number: CN202010172637.4A
Authority: CN
Inventors: 李黎萍
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Apollo Intelligent Connectivity Beijing Technology Co Ltd
Priority date: 2020-03-12
Filing date: 2020-03-12
Publication date: 2020-07-10
Anticipated expiration: 2040-03-12
Also published as: CN111402925B

Abstract

Embodiments of the present disclosure disclose a voice adjustment method and apparatus. One embodiment of the method includes: acquiring environmental information of a vehicle; acquiring state information of at least one person on the vehicle; determining a voice adjustment strategy based on the environmental information and the state information; and adjusting, according to the voice adjustment strategy, parameters of a voice to be played on the vehicle. This embodiment uses in-vehicle and out-of-vehicle conditions and road conditions to judge the urgency of the situation, combines that with the emotional state of the driver or passengers to determine whether the occupants need to be soothed, and applies pitch-shifting and speed-changing processing to the voice. While preserving a single, consistent voice persona, the voice is given appropriate emotional feedback, which makes in-vehicle voice interaction safer. The method can be applied to assisted driving and unmanned driving scenarios.

Description

Method, apparatus, electronic device, in-vehicle system and readable medium for voice adjustment

Technical Field

Embodiments of the present disclosure relate to the field of computer technology, and in particular to a voice adjustment method and apparatus.

Background

With the rapid development of computer technology and artificial intelligence, in-vehicle navigation and assisted driving are increasingly widely used in automobile driving. Early in-vehicle voice systems generally used a single, mechanical voice prompt, that is, one fixed voice (speech rate, pitch, intonation), which can make prompts less effective and is disliked by users. For example, general surveys show that some users prefer a sweet, gentle female voice (slow speech rate, high pitch, varied intonation), while others prefer announcements in the voice of a particular celebrity. To meet users' individual needs, speech synthesis technology can be used to imitate the voice of a specific person (a limited amount of speech is first collected, then artificial intelligence techniques are used for speech synthesis to obtain target speech with that person's timbre).

Research has found that warning reminders during driving need a more assured, urgent tone (fast speech rate, low pitch, flat intonation) so that users react more quickly. The following solutions can be envisaged:

(1) Different functions call different voice packs. For example, voice pack 1 is called for the function of waking up the voice dialogue system, and voice pack 2 is called for the function of broadcasting navigation information. In this solution, however, the function does not fully represent the urgency of the scene, and the timbre differs considerably between voice packs, so the voice persona feels fragmented.

(2) A real person records a specific corpus. Some in-vehicle voice systems record a larger corpus so that the pitch and speech rate vary naturally across scenarios. In this solution, however, every situation must be anticipated when the corpus is recorded, so the recording workload is large.

It should be noted that the approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any approach described in this section qualifies as prior art merely by virtue of its inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be assumed to have been recognized in any prior art.

Summary of the Invention

Embodiments of the present disclosure propose a voice adjustment method and apparatus.

According to a first aspect of the present disclosure, an embodiment provides a voice adjustment method, the method including: acquiring environmental information of a vehicle; acquiring state information of at least one person on the vehicle; determining a voice adjustment strategy based on the environmental information and the state information; and adjusting, according to the voice adjustment strategy, parameters of a voice to be played on the vehicle.

In some embodiments, the environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, wherein the out-of-vehicle environmental information includes road condition information and/or advanced driver assistance system (ADAS) information.

In some embodiments, the parameters include at least one of pitch, speech rate, and intonation.

In some embodiments, determining the voice adjustment strategy based on the environmental information and the state information includes: judging, according to the environmental information, the urgency of the prompt event corresponding to the voice to be played; and judging, according to the state information, the emotional state of the at least one person on the vehicle.

In some embodiments, where the environmental information includes in-vehicle environmental information, the in-vehicle environmental information is acquired by collecting vehicle component state information.

In some embodiments, where the environmental information includes road condition information, the road condition information is acquired in one of the following ways: acquiring real-time high-precision road condition information from the cloud; or sensing conditions near the vehicle through a sensing camera and/or radar.

In some embodiments, the at least one person on the vehicle includes the driver of the vehicle, and the state information is acquired in at least one of the following ways: collecting the facial expression of the at least one person on the vehicle through a camera; collecting the speech of the at least one person on the vehicle through a voice receiver; collecting the driving actions of the driver through a driving action collector; or collecting the duration of the driver's current drive from a clock record.

In some embodiments, the method further includes: establishing a voice adjustment strategy model in advance, the voice adjustment strategy model including a correspondence among the urgency, the emotional state, and the voice adjustment strategy, and the voice adjustment strategy including a combination of frequency, speed, and an intonation model curve.

In some embodiments, the intonation model curve includes a serious intonation model curve with a warning effect, a calm intonation model curve with a soothing effect, and a lively intonation model curve with an uplifting effect.

In some embodiments, adjusting the parameters of the voice to be played on the vehicle according to the voice adjustment strategy includes: adjusting the frequency and speech rate of the voice to be played, and applying the corresponding intonation model adjustment to the voice to be played, according to the combination of frequency, speed, and intonation model curve included in the determined voice adjustment strategy.

According to a second aspect of the present disclosure, an embodiment provides a voice adjustment apparatus, including: a first acquisition unit configured to acquire environmental information of a vehicle; a second acquisition unit configured to acquire state information of at least one person on the vehicle; a determination unit configured to determine a voice adjustment strategy based on the environmental information and the state information; and an adjustment unit configured to adjust, according to the voice adjustment strategy, parameters of a voice to be played on the vehicle.

In some embodiments, the environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, wherein the out-of-vehicle environmental information includes road condition information and/or ADAS information.

In some embodiments, the determination unit is configured to judge, according to the environmental information, the urgency of the prompt event corresponding to the prompt voice, and to judge, according to the state information, the emotional state of the at least one person on the vehicle.

In some embodiments, where the environmental information includes in-vehicle environmental information, the first acquisition unit is configured to acquire the in-vehicle environmental information by collecting vehicle component state information; and where the environmental information includes road condition information, the first acquisition unit is configured to acquire the road condition information in at least one of the following ways: acquiring real-time high-precision road condition information from the cloud; or sensing conditions near the vehicle through a sensing camera and/or radar.

In some embodiments, where the at least one person on the vehicle includes the driver of the vehicle, the second acquisition unit is configured to acquire the state information in at least one of the following ways: collecting the facial expression of the at least one person on the vehicle through a camera; collecting the speech of the at least one person on the vehicle through a voice receiver; collecting the driving actions of the driver through a driving action collector; or collecting the duration of the driver's current drive from a clock record.

In some embodiments, the determination unit is further configured to determine the voice adjustment strategy according to a pre-established voice adjustment strategy model, the voice adjustment strategy model including a correspondence among the urgency, the emotional state, and the voice adjustment strategy, and the voice adjustment strategy including a combination of frequency, speed, and an intonation model curve.

According to a third aspect of the present disclosure, an embodiment provides an electronic device including: one or more processors; and a storage device storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.

According to a fourth aspect of the present disclosure, an embodiment provides an in-vehicle system including the electronic device described in the third aspect.

According to a fifth aspect of the present disclosure, an embodiment provides a computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method described in any implementation of the first aspect.

The voice adjustment method and apparatus provided by the embodiments of the present disclosure use in-vehicle and out-of-vehicle conditions and road conditions to judge the urgency of the situation, combine that with the emotional state of the driver or passengers to determine whether the occupants need to be soothed, and apply pitch-shifting and speed-changing processing to the voice. While preserving a single, consistent voice persona, the voice is given appropriate emotional feedback, which makes in-vehicle voice interaction safer. The present disclosure can be applied to assisted driving and unmanned driving scenarios.

Brief Description of the Drawings

Other features, objects, and advantages of the present disclosure will become more apparent upon reading the detailed description of non-limiting embodiments given with reference to the following drawings.

FIG. 1 is a flowchart of one embodiment of a voice adjustment method according to the present disclosure.

FIG. 2 is a schematic structural diagram of one embodiment of a voice adjustment apparatus according to the present disclosure.

FIG. 3 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the relevant invention.

It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features of those embodiments may be combined with one another. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

FIG. 1 shows a flow 100 of one embodiment of a voice adjustment method according to the present disclosure. The voice adjustment method includes the following steps:

In step 101, environmental information of the vehicle is acquired.

In this step, the execution body of the voice adjustment method may acquire the environmental information through a wired or wireless connection. In some embodiments, the environmental information may include in-vehicle environmental information or out-of-vehicle environmental information. The in-vehicle environmental information may include, for example, vehicle condition information, which may include various data related to the operating condition of the vehicle, such as data on the vehicle's own components: tire pressure, water temperature, fuel level, battery level, vehicle speed, and the like. As an example, the corresponding component state data may be collected through a tire pressure sensor, a temperature sensor, a fuel level sensor, a battery level sensor, and a vehicle speed sensor, respectively. The out-of-vehicle environmental information may include road condition information and may also include advanced driver assistance system (ADAS) information. The road condition information may be, for example, traffic congestion. As an example, real-time high-precision road condition information may be obtained from the cloud. This information can be used in subsequent steps to determine whether there is a traffic emergency (severe weather, a road collapse, yielding to an ambulance, etc.) or a route problem (a wrong turn, speeding, etc.). The ADAS information may include conditions around the vehicle, such as the distance from the vehicle to surrounding obstacles. As an example, conditions around the vehicle may be sensed through sensing cameras, radar, and the like; for instance, the distance to surrounding obstacles may be obtained through ultrasonic radar. In some embodiments, the present disclosure is applied in a smart cockpit system, where sensors configured in the smart cockpit, such as sensing cameras and radar, can determine whether the vehicle is close to an accident (colliding with another vehicle, a person, an object, etc.) or has a driving problem (running a red light, driving on a lane line, lane departure), so as to warn the user or request manual takeover. Although the specific content of the environmental information and the ways of acquiring it have been described above, these are only examples and the present disclosure is not limited thereto. Those skilled in the art will understand that the content of this information and the ways of acquiring it can be extended according to specific needs.
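The environmental information of step 101 could be held in a small record type. The following is only a minimal sketch: every class name, field name, and unit is an assumption made for illustration and does not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InVehicleInfo:
    """Vehicle-component readings (field names and units are assumptions)."""
    tire_pressure_kpa: Optional[float] = None
    water_temp_c: Optional[float] = None
    fuel_fraction: Optional[float] = None      # 0.0 (empty) to 1.0 (full)
    battery_fraction: Optional[float] = None   # 0.0 (empty) to 1.0 (full)
    speed_kmh: Optional[float] = None


@dataclass
class OutOfVehicleInfo:
    """Road-condition and ADAS readings (hypothetical fields)."""
    congested: Optional[bool] = None              # e.g. from a cloud road-condition feed
    obstacle_distance_m: Optional[float] = None   # e.g. from ultrasonic radar


@dataclass
class EnvironmentInfo:
    """Output of step 101: in-vehicle and/or out-of-vehicle information."""
    in_vehicle: Optional[InVehicleInfo] = None
    out_of_vehicle: Optional[OutOfVehicleInfo] = None

    def is_valid(self) -> bool:
        # At least one of the two parts must be present.
        return self.in_vehicle is not None or self.out_of_vehicle is not None


env = EnvironmentInfo(in_vehicle=InVehicleInfo(fuel_fraction=0.08, speed_kmh=95.0))
```

Leaving every field optional reflects the statement that the content of the information can be extended or reduced according to specific needs.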

In step 102, state information of at least one person on the vehicle is acquired.

In this step, the execution body of the voice adjustment method may acquire the state information through a wired or wireless connection, for example the state information of persons on the vehicle such as the current driver or passengers. One or more persons on the vehicle may be regarded as users of the voice adjustment function. For brevity, "user" is used below to denote the persons on the vehicle, including the driver and the passengers. It should be understood that the at least one person on the vehicle may include at least the driver of the vehicle.

In some embodiments, the user's state information may be any useful information that helps to analyze the user's emotional state, with which it is preferably possible to determine fairly effectively whether the user is in a particular emotional state, such as panic, fatigue, or sadness (low mood). As an example, the user's emotional state may be analyzed from the user's facial expression, the words the user speaks, the user's driving actions, the duration of the current drive, and so on. As an example, the user's facial expression may be collected through a camera, the words spoken by the user through a voice receiver, the driver's driving actions through a driving action collector, and the duration of the driver's current drive from a clock record. In some embodiments, the smart cockpit system to which the present disclosure is applied may be equipped with a camera, a voice receiver, a driving action collector, and the like. Although the specific content of the user state information and the ways of acquiring it have been described above, these are only examples and the present disclosure is not limited thereto. Those skilled in the art will understand that the content of this information and the ways of acquiring it can be extended according to specific needs.

Although step 101 is described before step 102, this is not intended to limit the order of the two steps; they may be performed simultaneously, or the step described later may be performed first. Those skilled in the art will understand that the present disclosure is not limited in this respect.
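Because the two acquisition steps do not depend on each other, one possible arrangement runs them concurrently. The sketch below uses Python's standard `concurrent.futures` module; the two acquisition functions and their return values are hypothetical placeholders, not part of the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Tuple


def acquire_environment_info() -> Dict[str, float]:
    # Placeholder for step 101; a real system would read sensors or a cloud feed.
    return {"fuel_fraction": 0.08}


def acquire_occupant_state() -> Dict[str, float]:
    # Placeholder for step 102; a real system would read a camera, microphone, etc.
    return {"driving_minutes": 250.0}


def acquire_inputs() -> Tuple[Dict[str, float], Dict[str, float]]:
    """Run steps 101 and 102 concurrently and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        env_future = pool.submit(acquire_environment_info)
        state_future = pool.submit(acquire_occupant_state)
        return env_future.result(), state_future.result()


env_info, occupant_state = acquire_inputs()
```

Calling the two functions one after the other, in either order, would be equally consistent with the text.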

In step 103, a voice adjustment strategy is determined based on the environmental information and the state information.

In this step, the information acquired in the preceding steps is analyzed: the urgency of the situation is analyzed from the environmental information, and the user's emotional state is analyzed from the user state information. A back-end server may be used to support more powerful analysis.

In some embodiments, urgency is divided into multiple levels in advance. The number of levels depends on requirements. As an example, urgency may simply be divided into urgent and not urgent. This is only an example, and the present disclosure is not limited thereto.

In some embodiments, the urgency level that corresponds to each specific situation is set in advance. For example, the user needs to be prompted to refuel when the fuel level falls below a first threshold or a second threshold; the prompt event may be assigned a lower urgency level when the fuel level is below the larger first threshold, and a higher urgency level when it is below the smaller second threshold. As another example, a fault prompt is needed when a vehicle component fails, and if failure of an important component could lead to a traffic accident, the prompt event may be assigned a higher urgency level. As another example, the prompt event for high vehicle speed without a safe following distance from the vehicle ahead needs to be assigned a high urgency level. As yet another example, a road condition prompt event for congestion may be assigned a low urgency level. In some embodiments, a database is established in advance in which the various prompt events are classified and assigned corresponding urgency levels. As an example, when urgency is analyzed from the vehicle's environmental information, the urgency of the current situation, that is, the urgency of the prompt event corresponding to the voice to be played, may be determined by looking up the pre-established database based on the acquired environmental information.
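The two-threshold fuel example and the database lookup can be sketched as follows. The threshold values, event names, and the default level for unknown events are all assumptions chosen for illustration; the disclosure does not specify them.

```python
from enum import IntEnum
from typing import Dict, Optional


class Urgency(IntEnum):
    LOW = 1
    HIGH = 2


# Hypothetical thresholds, expressed as a fraction of a full tank.
FUEL_FIRST_THRESHOLD = 0.20   # larger threshold: lower urgency
FUEL_SECOND_THRESHOLD = 0.05  # smaller threshold: higher urgency

# Stand-in for the pre-established database of classified prompt events.
EVENT_URGENCY: Dict[str, Urgency] = {
    "congestion_ahead": Urgency.LOW,
    "critical_component_fault": Urgency.HIGH,
    "unsafe_following_distance": Urgency.HIGH,
}


def fuel_prompt_urgency(fuel_fraction: float) -> Optional[Urgency]:
    """Return the urgency of a refuel prompt, or None if no prompt is needed."""
    if fuel_fraction < FUEL_SECOND_THRESHOLD:
        return Urgency.HIGH
    if fuel_fraction < FUEL_FIRST_THRESHOLD:
        return Urgency.LOW
    return None


def event_urgency(event: str) -> Urgency:
    """Look up a prompt event; unknown events default to LOW (an assumption)."""
    return EVENT_URGENCY.get(event, Urgency.LOW)
```

Checking the smaller threshold first ensures that a nearly empty tank is reported at the higher level rather than being caught by the larger threshold.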

In this step, the user's emotional state is also analyzed. It may be analyzed from the collected facial expression, speech, driving actions, driving duration, and so on. The judgment may rely on only one of these; for example, driving duration alone may indicate that the driver is driving while fatigued. It may also rely on several of them, or even combine other information; for example, the user's current facial expression, particular words spoken, and road condition information indicating congestion may together indicate that the user is currently anxious. In some embodiments, in addition to the collected facial expression, speech, driving actions, and/or driving duration, a user feature database may be used for comprehensive analysis so as to determine the user's emotional state more effectively and accurately. Based on the user state information acquired in the preceding steps, whether the user is in a state such as panic, fatigue, normal, happy, or sad may be determined according to a predetermined pattern. In some embodiments, this processing and judgment may be performed in the smart cockpit system or with the assistance of a cloud system.
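A "predetermined pattern" for judging emotional state could be as simple as a few ordered rules. The sketch below is a toy rule set, not the method of the disclosure: the expression labels, the fatigue limit, and the rule order are assumptions, and a real system would more likely rely on trained models and a user feature database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OccupantState:
    facial_expression: Optional[str] = None  # label from a hypothetical vision model
    speech_text: Optional[str] = None        # transcript from a hypothetical recognizer
    abrupt_steering: bool = False            # from a driving action collector
    driving_minutes: float = 0.0             # from a clock record


FATIGUE_MINUTES = 240.0  # assumed limit; the source does not specify one


def classify_emotion(state: OccupantState, congested: bool = False) -> str:
    """Return 'panic', 'fatigue', 'anxious', or 'normal' using simple ordered rules."""
    if state.facial_expression == "fear" or state.abrupt_steering:
        return "panic"
    # A single signal (driving duration) is enough to indicate fatigue.
    if state.driving_minutes >= FATIGUE_MINUTES or state.facial_expression == "drowsy":
        return "fatigue"
    # Several signals combined with road conditions indicate anxiety.
    if congested and state.facial_expression == "frown":
        return "anxious"
    return "normal"
```

For instance, `classify_emotion(OccupantState(driving_minutes=300))` yields `"fatigue"` from the driving duration alone, mirroring the single-signal case described above.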

In this step, the voice adjustment strategy is determined according to the urgency and the emotional state. The voice adjustment strategy includes a combination of frequency adjustment, speed adjustment, and an intonation model curve, where frequency adjustment corresponds to the pitch of the voice, speed adjustment corresponds to its speech rate, and the intonation model curve corresponds to its intonation. Intonation is a combination of pitch and rhythm: in a happy intonation, for example, the pitch rises at the end of the sentence, while in an angry intonation it falls. A sentence usually has two pitch peaks and three low points. Adjusting the voice with different intonation model curves therefore produces different intonations. As examples, the intonation model curves may include a serious intonation model curve with a warning effect, a calm intonation model curve with a soothing effect, a lively intonation model curve with an uplifting effect, and so on.
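One way to picture an intonation model curve is as a piecewise-linear pitch multiplier over normalized sentence time, with three low points and two peaks. The sketch below makes that assumption explicit; the representation, the control-point values, and the function names are illustrative only and are not taken from the disclosure.

```python
from bisect import bisect_right
from typing import List, Tuple

# A curve is a list of (normalized time in 0..1, pitch multiplier) control points.
Curve = List[Tuple[float, float]]

# Each curve goes low, peak, low, peak, low; the numbers are illustrative only.
SERIOUS: Curve = [(0.0, 1.00), (0.25, 1.05), (0.5, 0.95), (0.75, 1.00), (1.0, 0.85)]
CALM: Curve = [(0.0, 0.98), (0.25, 1.02), (0.5, 0.98), (0.75, 1.02), (1.0, 0.97)]
LIVELY: Curve = [(0.0, 0.95), (0.25, 1.15), (0.5, 1.00), (0.75, 1.20), (1.0, 1.10)]


def pitch_multiplier(curve: Curve, t: float) -> float:
    """Linearly interpolate the curve at normalized time t (clamped to 0..1)."""
    t = min(max(t, 0.0), 1.0)
    times = [point[0] for point in curve]
    i = bisect_right(times, t)
    if i >= len(curve):
        return curve[-1][1]
    (t0, v0), (t1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def apply_curve(f0_hz: List[float], curve: Curve) -> List[float]:
    """Scale a per-frame fundamental-frequency contour by the intonation curve."""
    n = len(f0_hz)
    return [
        f0 * pitch_multiplier(curve, i / (n - 1) if n > 1 else 0.0)
        for i, f0 in enumerate(f0_hz)
    ]
```

In this sketch the serious curve ends below its starting value and the lively curve ends above it, which loosely follows the falling and rising sentence endings described above.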

In some embodiments, a voice adjustment strategy model is established in advance; the model may be linear, nonlinear, or hierarchical. As an example, the voice adjustment strategy model includes a voice adjustment strategy table that records the correspondence among urgency, emotional state, and voice adjustment strategy. Table 1 below is an example of such a table. As an example, the urgency levels include urgent and not urgent, and the emotional states include normal, panic, fatigue, and so on. For example, when the situation is not urgent but the user is panicking, the voice needs to be processed with low pitch, slow speed, and the calm intonation model curve; when the situation is urgent but the user is fatigued, the voice needs to be processed with high pitch, fast speed, and the serious intonation model curve.

Table 1

| Urgency level | Emotional state | Voice adjustment strategy |
|---|---|---|
| Urgent | Normal | Low pitch, fast, serious intonation |
| Urgent | Fatigue | High pitch, fast, serious intonation |
| Urgent | Panic | Low pitch, slow, calm intonation |
| Not urgent | Normal | No processing |
| Not urgent | Fatigue | High pitch, fast, serious intonation |
| Not urgent | Panic | Low pitch, slow, calm intonation |
| ... | ... | ... |
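The rows of Table 1 map directly onto a lookup keyed by urgency and emotional state. The following is a minimal sketch; the type name, the string labels, and the choice to treat combinations missing from the table as "no processing" are assumptions.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Strategy(NamedTuple):
    pitch: str       # "low" or "high"
    rate: str        # "slow" or "fast"
    intonation: str  # "serious" or "calm"


# Rows of Table 1; None means the voice is played without processing.
STRATEGY_TABLE: Dict[Tuple[str, str], Optional[Strategy]] = {
    ("urgent", "normal"): Strategy("low", "fast", "serious"),
    ("urgent", "fatigue"): Strategy("high", "fast", "serious"),
    ("urgent", "panic"): Strategy("low", "slow", "calm"),
    ("not urgent", "normal"): None,
    ("not urgent", "fatigue"): Strategy("high", "fast", "serious"),
    ("not urgent", "panic"): Strategy("low", "slow", "calm"),
}


def select_strategy(urgency: str, emotion: str) -> Optional[Strategy]:
    """Return the strategy for the pair, or None for no processing.

    Pairs missing from the table also return None (an assumption).
    """
    return STRATEGY_TABLE.get((urgency, emotion))
```

A table is the simplest form of the strategy model; the text also allows linear, nonlinear, or hierarchical models, which this sketch does not cover.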

In some embodiments, voice adjustment strategies with suitable prompting effects may also be adopted for further emotional states (for example, anxiety, sadness, happiness, and so on). More intonation model curves may be developed, such as a gentle intonation with a soothing effect. The above are only examples, and the present disclosure is not limited thereto.

In step 104, parameters of the voice to be played on the vehicle are adjusted according to the voice adjustment strategy.

In this step, one or more parameters of the voice to be played, such as pitch, intonation, and speech rate, are adjusted according to the determined voice adjustment strategy. In some embodiments, the frequency and speech rate of the voice to be played are adjusted, and the corresponding intonation model processing is applied, according to the combination of frequency, speed, and intonation model curve included in the determined voice adjustment strategy.
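Step 104 can be illustrated at the level of synthesis parameters rather than raw audio. The sketch below scales a mean fundamental frequency and an utterance duration by factors tied to the strategy labels; the factor values, the `VoiceParams` type, and the idea of operating on parameters instead of a waveform are assumptions. Changing pitch and speed independently on recorded audio would require a dedicated signal-processing technique, which is not shown here.

```python
from dataclasses import dataclass

# Hypothetical scaling factors for the labels used in the strategy table.
PITCH_FACTOR = {"low": 0.9, "high": 1.1}
RATE_FACTOR = {"slow": 0.85, "fast": 1.2}


@dataclass
class VoiceParams:
    mean_f0_hz: float            # average fundamental frequency of the utterance
    duration_s: float            # utterance duration at the current speech rate
    intonation: str = "neutral"  # name of the intonation model curve to apply


def adjust_voice(params: VoiceParams, pitch: str, rate: str, intonation: str) -> VoiceParams:
    """Return new parameters with pitch, speech rate, and intonation adjusted."""
    return VoiceParams(
        mean_f0_hz=params.mean_f0_hz * PITCH_FACTOR[pitch],
        duration_s=params.duration_s / RATE_FACTOR[rate],  # faster speech is shorter
        intonation=intonation,
    )


adjusted = adjust_voice(VoiceParams(mean_f0_hz=200.0, duration_s=2.4), "low", "fast", "serious")
```

Returning a new object leaves the original parameters untouched, so the same source voice can be re-rendered under a different strategy.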

在一些实施例中，所述待播放语音可以为从预制的语音包中调用的与提示事件相对应的待播放语音。例如油量低于特定阈值时生成提醒用户加油的提示事件。基于该提示事件从预制的语音包中调用内容匹配的语音数据作为待播放语音。在其他实施例中，所述待播放的语音也可以为根据用户偏好等从后台服务器获取的语音数据。在一些实施例中，所述待播放的语音也可以是已有播放程序中预定的语音，其可能与实时获取的环境信息或用户状态并无关联。以上关于待播放语音的内容和获得方式，仅为示例，本公开不限于此。In some embodiments, the to-be-played voice may be a to-be-played voice corresponding to a prompt event called from a pre-made voice package. For example, when the amount of oil is lower than a certain threshold, a prompt event to remind the user to refuel is generated. Based on the prompt event, the voice data with matching content is called from the prefabricated voice package as the voice to be played. In other embodiments, the voice to be played may also be voice data obtained from a background server according to user preferences. In some embodiments, the to-be-played voice may also be a predetermined voice in an existing playing program, which may not be related to the real-time acquired environmental information or user status. The content and obtaining method of the voice to be played above are only examples, and the present disclosure is not limited thereto.

When played, the voice processed with the determined voice adjustment strategy sounds as though it carries the correct emotional feedback, which can achieve a better prompting effect.

With further reference to FIG. 2, as an implementation of the method shown in FIG. 1, the present disclosure provides an embodiment of a voice adjustment apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 1, and the apparatus can be applied in various electronic devices.

As shown in FIG. 2, the voice adjustment apparatus 200 provided in this embodiment includes: a first acquisition unit 201 configured to acquire environmental information of a vehicle; a second acquisition unit 202 configured to acquire state information of at least one person on the vehicle; a determination unit 203 configured to determine a voice adjustment strategy based on the environmental information and the state information; and an adjustment unit 204 configured to adjust parameters of the voice to be played on the vehicle according to the voice adjustment strategy.

In this embodiment, for the specific processing of the first acquisition unit 201, the second acquisition unit 202, the determination unit 203, and the adjustment unit 204 of the voice adjustment apparatus 200, and the technical effects they bring, reference may be made to the descriptions of step 101, step 102, step 103, and step 104, respectively, in the embodiment corresponding to FIG. 1; details are not repeated here.
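The correspondence between units 201 to 204 and steps 101 to 104 can be sketched as a small pipeline. The class name, the use of injected callables, and the dictionary-based data types below are assumptions made for illustration; the disclosure does not prescribe any particular software structure.

```python
from typing import Any, Callable, Dict

Info = Dict[str, Any]      # environmental or state information (assumed shape)
Params = Dict[str, float]  # voice parameters or strategy values (assumed shape)


class VoiceAdjustmentApparatus:
    """Hypothetical wiring of units 201-204 as injected callables."""

    def __init__(self,
                 acquire_environment: Callable[[], Info],
                 acquire_state: Callable[[], Info],
                 determine_strategy: Callable[[Info, Info], Params],
                 adjust: Callable[[Params, Params], Params]) -> None:
        self.acquire_environment = acquire_environment  # first acquisition unit 201
        self.acquire_state = acquire_state              # second acquisition unit 202
        self.determine_strategy = determine_strategy    # determination unit 203
        self.adjust = adjust                            # adjustment unit 204

    def process(self, voice: Params) -> Params:
        env = self.acquire_environment()                # step 101
        state = self.acquire_state()                    # step 102
        strategy = self.determine_strategy(env, state)  # step 103
        return self.adjust(voice, strategy)             # step 104
```

Injecting the four units as callables keeps each one replaceable, for example swapping a camera-based state source for a clock-based one, without changing the pipeline itself.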

In some optional implementations of this embodiment, the environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, wherein the out-of-vehicle environmental information includes road condition information and/or ADAS information.

In some optional implementations of this embodiment, the parameters include at least one of pitch, speech rate, and intonation.

In some optional implementations of this embodiment, the determination unit 203 may be configured to judge, according to the environmental information, the degree of urgency of the prompt event corresponding to the prompt voice, and to judge, according to the state information, the emotional state of the at least one person on the vehicle.

In some optional implementations of this embodiment, in a case where the environmental information includes in-vehicle environmental information, the first acquisition unit 201 may be configured to acquire the in-vehicle environmental information by collecting vehicle component state information; and, in a case where the environmental information includes road condition information, the first acquisition unit 201 may be configured to acquire the road condition information in at least one of the following manners: acquiring real-time high-precision road condition information through the cloud; or sensing the situation near the vehicle through a sensing camera and/or radar.

In some optional implementations of this embodiment, the at least one person on the vehicle includes the driver of the vehicle, and the second acquisition unit 202 may be configured to acquire the state information in at least one of the following manners: collecting the facial expression of at least one person on the vehicle through a camera; collecting the speech of at least one person on the vehicle through a voice receiver; collecting the driving actions of the driver through a driving action collector; or collecting the duration of the driver's current drive through clock records.

In some optional implementations of this embodiment, the determination unit 203 may be further configured to determine the voice adjustment strategy according to a pre-established voice adjustment strategy model. The voice adjustment strategy model includes the correspondence among the degree of urgency, the emotional state, and the voice adjustment strategy, and the voice adjustment strategy includes a combination of frequency, speed, and intonation model curve. The adjustment unit 204 may be configured to adjust the frequency and speech rate of the voice to be played, and to apply the corresponding intonation model adjustment to the voice to be played, according to the combination of frequency, speed, and intonation model curve included in the determined voice adjustment strategy.
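One simple way to picture such a strategy model is a lookup table keyed by degree of urgency and emotional state. The sketch below is illustrative only: the category labels, the scale factors, the default entry, and the names `Strategy`, `STRATEGY_MODEL`, and `determine_strategy` are assumptions, and the disclosure does not state that the model is implemented as a table.

```python
from typing import Dict, NamedTuple, Tuple


class Strategy(NamedTuple):
    frequency_scale: float  # relative change of voice frequency
    speed_scale: float      # relative change of speech rate
    intonation_curve: str   # name of the intonation model curve


# Illustrative correspondence; all keys and values are assumed, not from the source.
STRATEGY_MODEL: Dict[Tuple[str, str], Strategy] = {
    ("urgent", "calm"): Strategy(1.2, 1.2, "serious"),
    ("urgent", "agitated"): Strategy(1.0, 1.1, "serious"),
    ("normal", "agitated"): Strategy(0.9, 0.9, "peaceful"),
    ("normal", "tired"): Strategy(1.1, 1.1, "lively"),
}

# Assumed fallback that leaves frequency and speed unchanged.
DEFAULT_STRATEGY = Strategy(1.0, 1.0, "peaceful")


def determine_strategy(urgency: str, emotion: str) -> Strategy:
    """Return the strategy for the given urgency and emotional state."""
    return STRATEGY_MODEL.get((urgency, emotion), DEFAULT_STRATEGY)
```

A table makes the correspondence easy to inspect and extend with further emotional states; a learned model could serve the same role.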

In some optional implementations of this embodiment, the intonation model curves may include a serious intonation model curve with a warning effect, a peaceful intonation model curve with a soothing effect, and a lively intonation model curve with an uplifting effect.

The voice adjustment method and apparatus provided by the embodiments of the present disclosure can process the voice to be played based on the in-vehicle scenario: the internal and external vehicle conditions and the road conditions are used to judge the urgency of the situation, the emotional state of the driver and passengers is then used to determine whether the occupants need to be soothed, and the pitch and speed of the voice are changed accordingly. While preserving the integrity of a single voice persona, the voice is given correct emotional feedback, which makes voice interaction while driving safer. The method and apparatus can be applied to assisted driving and unmanned driving scenarios.

Reference is now made to FIG. 3, which shows a schematic structural diagram of an electronic device 300 suitable for implementing embodiments of the present disclosure. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as in-vehicle terminals (e.g., in-vehicle navigation terminals), mobile phones, notebook computers, and PADs (tablet computers). The electronic device shown in FIG. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in FIG. 3, the electronic device 300 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data required for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to one another through a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.

In general, the following devices may be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, a touchpad, a camera, an accelerometer, and a gyroscope; an output device 307 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 308 including, for example, a flash memory (Flash Card); and a communication device 309. The communication device 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 3 shows the electronic device 300 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 3 may represent one device or, as needed, multiple devices.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 309, installed from the storage device 308, or installed from the ROM 302. When the computer program is executed by the processing device 301, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.

It should be noted that the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may in embodiments of the present disclosure include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire environmental information of a vehicle; acquire state information of at least one person on the vehicle; determine a voice adjustment strategy based on the environmental information and the state information; and adjust parameters of the voice to be played on the vehicle according to the voice adjustment strategy.

The above voice adjustment apparatus may form part of an in-vehicle system or an assisted driving system, for example part of an advanced driver assistance system (ADAS), and be implemented as one function of that system.

Computer program code for carrying out the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a first acquisition unit, a second acquisition unit, a determination unit, and an adjustment unit. The names of these units do not, in some cases, limit the units themselves; for example, the first acquisition unit may also be described as "a unit for acquiring environmental information of a vehicle".

The above description is merely of preferred embodiments of the present disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.

Claims

1. A voice adjustment method, the method comprising:

acquiring environmental information of a vehicle;

acquiring state information of at least one person on the vehicle;

determining a voice adjustment strategy based on the environmental information and the state information;

and adjusting parameters of the voice to be played on the vehicle according to the voice adjustment strategy.

2. The voice adjustment method according to claim 1, wherein the environmental information includes at least one of in-vehicle environmental information and out-vehicle environmental information, and wherein the out-vehicle environmental information includes road condition information and/or Advanced Driver Assistance System (ADAS) information.

3. The voice adjustment method according to claim 1 or 2, wherein the parameters comprise at least one of pitch, speech rate, and intonation.

4. The voice adjustment method according to claim 1 or 2, wherein determining the voice adjustment strategy based on the environmental information and the state information comprises:

judging the degree of urgency of a prompt event corresponding to the voice to be played according to the environmental information, and judging the emotional state of the at least one person on the vehicle according to the state information.

5. The voice adjustment method according to claim 2, wherein the in-vehicle environment information is acquired by collecting vehicle component state information in a case where the environment information includes in-vehicle environment information.

6. The voice adjustment method according to claim 2, wherein, in a case where the environmental information includes road condition information, the road condition information is acquired in at least one of the following manners:

acquiring real-time high-precision road condition information through the cloud; or

sensing the situation near the vehicle through a sensing camera and/or a radar.

7. The voice adjustment method according to claim 1, wherein the at least one person on the vehicle comprises a driver of the vehicle, and wherein the state information is acquired in at least one of the following manners:

collecting the facial expression of at least one person on the vehicle through a camera;

collecting the speech of at least one person on the vehicle through a voice receiver;

collecting the driving actions of the driver through a driving action collector; or

collecting the duration of the driver's current drive through clock records.

8. The voice adjustment method according to claim 4, wherein the method further comprises: pre-establishing a voice adjustment strategy model, wherein the voice adjustment strategy model comprises the correspondence among the degree of urgency, the emotional state, and the voice adjustment strategy, and the voice adjustment strategy comprises a combination of frequency, speed, and intonation model curve.

9. The voice adjustment method according to claim 8, wherein the intonation model curves include a serious intonation model curve with a warning effect, a peaceful intonation model curve with a soothing effect, and a lively intonation model curve with an uplifting effect.

10. The voice adjustment method according to claim 8 or 9, wherein the adjusting the parameter of the voice to be played on the vehicle according to the voice adjustment strategy comprises:

adjusting the frequency and the speech rate of the voice to be played, and applying the corresponding intonation model adjustment to the voice to be played, according to the combination of frequency, speed, and intonation model curve included in the determined voice adjustment strategy.

11. A voice adjustment apparatus comprising:

a first acquisition unit configured to acquire environmental information of a vehicle;

a second acquisition unit configured to acquire state information of at least one person on the vehicle;

a determination unit configured to determine a voice adjustment strategy based on the environmental information and the state information;

and an adjustment unit configured to adjust parameters of the voice to be played on the vehicle according to the voice adjustment strategy.

12. The voice adjustment apparatus according to claim 11, wherein the environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, and wherein the out-of-vehicle environmental information includes road condition information and/or ADAS information.

13. The voice adjustment apparatus according to claim 11 or 12, wherein the parameters comprise at least one of pitch, speech rate, and intonation.

14. The voice adjustment apparatus according to claim 11 or 12, wherein the determination unit is configured to:

judge the degree of urgency of a prompt event corresponding to the prompt voice according to the environmental information, and judge the emotional state of the at least one person on the vehicle according to the state information.

15. The voice adjustment apparatus according to claim 12, wherein, in a case where the environmental information includes in-vehicle environmental information, the first acquisition unit is configured to acquire the in-vehicle environmental information by collecting vehicle component state information; and wherein, in a case where the environmental information includes road condition information, the first acquisition unit is configured to acquire the road condition information in at least one of the following manners:

16. The voice adjustment apparatus according to claim 11, wherein the at least one person on the vehicle comprises a driver of the vehicle, and wherein the second acquisition unit is configured to acquire the state information in at least one of the following manners:

17. The voice adjustment apparatus according to claim 14, wherein the determination unit is configured to determine the voice adjustment strategy according to a pre-established voice adjustment strategy model comprising the correspondence among the degree of urgency, the emotional state, and the voice adjustment strategy, the voice adjustment strategy comprising a combination of frequency, speed, and intonation model curve.

18. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-10.

19. An in-vehicle system comprising the electronic device of claim 18.

20. A computer-readable medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1-10.