
CN112434872A - Hotel yield prediction method, system, equipment and storage medium - Google Patents

Hotel yield prediction method, system, equipment and storage medium

Info

Publication number
CN112434872A
Authority
CN
China
Prior art keywords
hotel
feature
data
output
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011394090.9A
Other languages
Chinese (zh)
Inventor
林晨
褚煜佳
李鹤
孙刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Computer Technology Shanghai Co Ltd
Original Assignee
Ctrip Computer Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Computer Technology Shanghai Co Ltd
Priority to CN202011394090.9A
Publication of CN112434872A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/12 Hotels or restaurants

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a hotel yield prediction method, system, device and storage medium. The method includes the steps of: acquiring yield data of multiple hotels, the yield data including historical yield data, hotel attribute feature data and current reservation progress data; constructing a quantile loss function based on a quantile regression model; constructing an initial deep learning network model; training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel; and predicting, based on the target network models, the yield of each of the multiple hotels within a preset future time period, and outputting the yield value of each hotel within that period. The application predicts the yield of multiple hotels simultaneously and improves the accuracy of hotel yield prediction.

Figure 202011394090

Description

Hotel yield prediction method, system, device and storage medium

Technical field

The invention relates to the technical field of artificial intelligence, and in particular to a hotel yield prediction method, system, device and storage medium.

Background

In the prior art, the yield of a hotel (that is, the sales volume of resources such as hotel rooms) is generally predicted with a traditional time series model or with the LSTM (Long Short-Term Memory) neural network model. A traditional time series model, however, has limited expressive power and struggles to account for the combined influence of multiple exogenous variables on the final yield. Hotel yield prediction involves, in addition to time series features, exogenous variables such as reservation progress features, features of the hotel's competitive circle, hotel attribute features and date features, so the prediction accuracy of a traditional time series model is noticeably poor in this setting. The LSTM model, for its part, has limited expressive power when predicting for multiple hotels and cannot be used to predict for multiple hotels at the same time.

Summary of the invention

In view of the problems in the prior art, the purpose of the present invention is to provide a hotel yield prediction method, system, device and storage medium that can both predict the yield of multiple hotels simultaneously and improve the accuracy of hotel yield prediction.

To achieve the above purpose, the present invention provides a hotel yield prediction method comprising the following steps:

acquiring yield data of multiple hotels;

constructing a quantile loss function based on a quantile regression model;

constructing an initial deep learning network model;

training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel;

predicting, based on the target network models, the yield of each of the multiple hotels within a preset future time period, and outputting the yield value of each hotel within the preset time period.

Optionally, the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel, includes:

performing a vectorization operation on the yield data of each hotel to obtain multiple feature vectors belonging to that hotel;

performing low-order feature crossing and high-order feature crossing on the feature vectors to obtain a first predicted value and a second predicted value, respectively;

computing a weighted sum of the first predicted value and the second predicted value to obtain an initial prediction result, and normalizing the initial prediction result with a preset activation function to obtain a predicted probability value;

using the predicted probability value and a preset true result value as inputs to the quantile loss function to obtain the target network model.

Optionally, the initial deep learning network model includes a DNN model, and the step of performing low-order feature crossing and high-order feature crossing on the feature vectors to obtain the first predicted value and the second predicted value, respectively, includes:

selecting any two feature vectors from all of the feature vectors and combining them to form multiple feature vector sets, each feature vector set containing two feature vectors;

performing an inner product operation on each of the feature vector sets to obtain the first predicted value;

using all of the feature vectors as the input of the DNN model to obtain the second predicted value.

Optionally, the target network model includes a quantile, and the quantiles in the target network models corresponding to different hotels are different.

Optionally, the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel, includes:

for each hotel, using each of multiple preset quantiles in turn as an input to the quantile loss function, and training to generate multiple different second network models belonging to the same hotel, each second network model corresponding to one of the preset quantiles;

using the yield data of the past N days as a test set, testing each of the multiple different second network models, and taking the second network model whose predicted value differs least from the true yield value as the target network model corresponding to the hotel; the yield data of the past N days contains the true yield values.

Optionally, the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel, includes:

generating a feature ID for each feature contained in the yield data of each hotel;

generating a feature dictionary file from the features and the feature IDs;

storing the feature dictionary file in the target network model.
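The three steps above can be illustrated with a minimal Python sketch. The feature names, the choice of consecutive integer IDs and the use of JSON as the dictionary file format are all assumptions made for illustration; the source does not specify them.

```python
import json


def build_feature_dict(records):
    """Assign a stable integer ID to every distinct feature name.

    `records` is a list of dicts mapping feature name -> feature value.
    The field names used below are hypothetical.
    """
    feature_ids = {}
    for record in records:
        for feature in record:
            if feature not in feature_ids:
                # IDs are handed out in order of first appearance.
                feature_ids[feature] = len(feature_ids)
    return feature_ids


records = [
    {"hotel_star": 5, "room_grade": 2, "city": "Shanghai"},
    {"hotel_star": 4, "city": "Beijing", "booking_progress": 0.35},
]
feature_dict = build_feature_dict(records)

# Serialize the mapping so it can be stored alongside the trained model
# (JSON is an assumed format).
feature_dict_json = json.dumps(feature_dict, sort_keys=True)
```

Keeping the dictionary with the model means that the same feature-to-ID mapping used in training is available at prediction time.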

Optionally, each yield data record includes features and the feature values corresponding to those features; the initial deep learning network model has multiple parameters; each feature matches one of the parameters; and each feature matches one of the feature vectors;

the step of performing an inner product operation on each of the feature vector sets to obtain the first predicted value includes: performing the inner product operation on each feature vector set according to a low-order cross function;

the low-order cross function is:

$$y_1 = w_0 + \sum_{i=1}^{m} w_i x_i + \sum_{i=1}^{m} \sum_{j=i+1}^{m} \langle v_i, v_j \rangle \, x_i x_j$$

where

$$\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}$$

y_1 denotes the first predicted value, w_0 denotes the first parameter of the initial deep learning network model, w_i denotes the i-th parameter of the initial deep learning network model, m denotes the total number of features in each yield data record, x_i denotes the feature value of the i-th feature in each yield data record, ⟨v_i, v_j⟩ denotes the inner product of the feature vectors v_i and v_j, v_i denotes the feature vector of the i-th feature in each yield data record, v_j denotes the feature vector of the j-th feature in each yield data record, x_j denotes the feature value of the j-th feature in each yield data record, v_{i,f} denotes the f-th element of the feature vector v_i, v_{j,f} denotes the f-th element of the feature vector v_j, and k denotes the total number of elements in the feature vector v_i.

Optionally, the quantile loss function is:

Figure BDA0002813965160000033

where n denotes the total number of yield data records, q denotes the preset quantile, y′_p denotes the predicted value for the p-th yield data record, y_p denotes the true yield value of the p-th yield data record, and L denotes the total loss value.
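The formula image is not reproduced in this text. For orientation only, a commonly used form of the quantile (pinball) loss, written in the symbols defined above, is sketched below. Which of the weights q and 1 − q the source attaches to each case, and whether it averages over n, cannot be confirmed from the text, so this is an assumption and not necessarily the exact formula of the application.

```latex
L = \frac{1}{n}\left[\;\sum_{p:\,y'_p \ge y_p} (1-q)\,\bigl(y'_p - y_p\bigr)
    \;+\; \sum_{p:\,y'_p < y_p} q\,\bigl(y_p - y'_p\bigr)\right]
```

Under this convention, a quantile q above 0.5 penalizes under-prediction more heavily than over-prediction, which pushes the trained model toward higher predicted values; q = 0.5 weights both directions equally.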

Optionally, the method further includes the steps of:

updating the yield data of the multiple hotels based on the hotel yield data of the current period;

retraining the initial deep learning network model based on the updated yield data, to obtain an updated target network model and an updated quantile.

Optionally, the yield data includes historical yield data, hotel attribute feature data and current reservation progress data; the hotel attribute feature data includes hotel star rating, room type grade and data on the city where the hotel is located; and the yield data further includes historical reservation progress data, current cancellation progress, features of the hotel's competitive circle and date features.

The present invention also provides a hotel yield prediction system for implementing the above hotel yield prediction method, the system comprising:

a yield data acquisition module, configured to acquire yield data of multiple hotels, the yield data including historical yield data, hotel attribute feature data and current reservation progress data;

a loss function construction module, configured to construct a quantile loss function based on a quantile regression model;

a model construction module, configured to construct an initial deep learning network model;

a model training module, configured to train the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel;

a yield prediction module, configured to predict, based on the target network models, the yield of each of the multiple hotels within a preset future time period, and to output the yield value of each hotel within the preset time period.

The present invention also provides a hotel yield prediction device, comprising:

a processor;

a memory storing instructions executable by the processor;

wherein the processor is configured to perform the steps of any one of the above hotel yield prediction methods by executing the executable instructions.

The present invention also provides a computer-readable storage medium for storing a program which, when executed by a processor, implements the steps of any one of the above hotel yield prediction methods.

Compared with the prior art, the present invention has the following advantages and notable effects:

The hotel yield prediction method, system, device and storage medium provided by the present invention take into account the combined influence of multiple exogenous variables on the final yield, which improves the accuracy of hotel yield prediction. They also support parallel prediction for multiple hotels: because a different quantile can be used to generate the model for each hotel, the prediction accuracy for each individual hotel is maintained while multiple hotels are predicted simultaneously.

Description of drawings

Other features, objects and advantages of the present invention will become more apparent on reading the detailed description of non-limiting embodiments given with reference to the following drawings.

FIG. 1 is a schematic diagram of a hotel yield prediction method disclosed in an embodiment of the present invention;

FIG. 2 is a schematic flowchart of step S40 in an embodiment of the present invention;

FIG. 3 is a schematic flowchart of step S402 in an embodiment of the present invention;

FIG. 4 is a schematic flowchart of step S404 in an embodiment of the present invention;

FIG. 5 is a schematic flowchart of step S40 in another embodiment of the present invention;

FIG. 6 is a schematic diagram of a hotel yield prediction method disclosed in another embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a hotel yield prediction system disclosed in an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a hotel yield prediction device disclosed in an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of a computer-readable storage medium disclosed in an embodiment of the present invention.

Detailed description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, so their repeated description is omitted.

As shown in FIG. 1, an embodiment of the present invention discloses a hotel yield prediction method, which includes the following steps:

S10, acquiring yield data of multiple hotels. The yield data may include historical yield data, hotel attribute feature data and current reservation progress data. The hotel attribute feature data may include hotel star rating, room type grade and data on the city where the hotel is located, and the yield data may further include historical reservation progress data, current cancellation progress, features of the hotel's competitive circle and date features. This application places no limitation on the yield data or the hotel attribute feature data; for example, the yield data may also include hotel review features.

S20, constructing a quantile loss function based on a quantile regression model. Each quantile loss function contains a corresponding preset quantile, and each hotel corresponds to one preset quantile. The preset quantiles in the quantile loss functions of different hotels may be different or the same, and those skilled in the art can set them as required.

In this embodiment, the quantile loss function is:

Figure BDA0002813965160000061

where n denotes the total number of yield data records. q denotes the preset quantile, for example 0.45, 0.5 or 0.6. y′_p denotes the predicted value for the p-th yield data record, and y_p denotes the true yield value of the p-th yield data record; each yield data record contains its corresponding true yield value. L denotes the total loss.

The term ∑_{p: y′_p ≥ y_p} (y′_p − y_p) means that, for the p-th yield data record, if the predicted value is greater than or equal to the true yield value, that is, y′_p ≥ y_p, the expression

Figure BDA0002813965160000062

is used as the loss value of the p-th yield data record when computing the total loss value. If the predicted value is less than the true yield value, that is, y′_p < y_p, the expression

Figure BDA0002813965160000063

is used as the loss value of the p-th yield data record when computing the total loss value.
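The case split described above can be sketched in Python as follows. Because the formula images are not reproduced, the weights used here (1 − q for over-prediction, q for under-prediction) and the averaging over n follow the conventional pinball loss and are assumptions; `quantile_loss` and the sample values are hypothetical.

```python
def quantile_loss(y_true, y_pred, q):
    """Mean pinball loss for a preset quantile q in (0, 1).

    Assumed convention: over-prediction is weighted by (1 - q),
    under-prediction by q.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        if y_hat >= y:
            total += (1.0 - q) * (y_hat - y)  # predicted >= true
        else:
            total += q * (y - y_hat)          # predicted < true
    return total / len(y_true)


actual = [100.0, 120.0, 90.0]
predicted = [110.0, 100.0, 90.0]
loss_q50 = quantile_loss(actual, predicted, 0.5)
loss_q60 = quantile_loss(actual, predicted, 0.6)
```

With these sample values the under-predicted record dominates, so raising q from 0.5 to 0.6 increases the loss; that is the mechanism by which the choice of quantile biases a trained model upward or downward.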

S30, constructing an initial deep learning network model. In this embodiment, the initial deep learning network model includes a DNN (Deep Neural Network) model. The initial deep learning network model may be a DeepFM deep learning model, although this application is not limited to it. The DeepFM model consists of two parts: an FM (Factorization Machine) and a DNN model.

S40, training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, to obtain a target network model corresponding to each hotel. Specifically, as shown in FIG. 2, step S40 includes:

S401, performing a vectorization operation on the yield data of each hotel to obtain multiple feature vectors belonging to that hotel. In this embodiment, each yield data record includes features and the feature values corresponding to those features. For example, if the yield data includes hotel star rating data, each hotel star rating record contains the star rating feature and the feature value corresponding to that feature.

The initial deep learning network model has multiple parameters. Each feature matches one of these parameters, and each feature matches one of the feature vectors. Performing the vectorization operation on the yield data means performing it on the features in the yield data to obtain the feature vector corresponding to each feature. Embedding techniques can be used to implement the vectorization operation.
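A minimal sketch of embedding-style vectorization is shown below, assuming each feature has an integer ID that indexes a table of vectors. The feature names, the vector length and the random initialisation are illustrative assumptions; in a trained model the vectors would be learned parameters.

```python
import random


def make_embedding_table(num_features, dim, seed=0):
    """Create one randomly initialised vector of length `dim` per feature ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_features)]


def vectorize(record, feature_to_id, table):
    """Map one record to a list of (feature value x_i, feature vector v_i) pairs."""
    return [(float(value), table[feature_to_id[name]])
            for name, value in record.items()]


# Hypothetical feature names and IDs.
feature_to_id = {"hotel_star": 0, "room_grade": 1, "booking_progress": 2}
table = make_embedding_table(num_features=len(feature_to_id), dim=4)

record = {"hotel_star": 5, "booking_progress": 0.35}
pairs = vectorize(record, feature_to_id, table)
```

Each pair couples a feature value with its vector, which is the form both the low-order crossing and the DNN input can consume.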

S402, performing low-order feature crossing and high-order feature crossing on all of the feature vectors to obtain a first predicted value and a second predicted value, respectively. Specifically, as shown in FIG. 3, step S402 includes:

S4021, selecting any two feature vectors from all of the feature vectors and combining them to form multiple feature vector sets, each feature vector set containing two feature vectors.
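Forming every two-element set of feature vectors can be sketched with `itertools.combinations` from the Python standard library. The feature names and vector values are made up for illustration.

```python
from itertools import combinations

# Hypothetical feature vectors keyed by feature name.
feature_vectors = {
    "hotel_star": [0.1, 0.2],
    "room_grade": [0.0, -0.3],
    "booking_progress": [0.5, 0.4],
}

# Every unordered pair of distinct features; m features give m*(m-1)/2 pairs.
pair_sets = list(combinations(feature_vectors.items(), 2))
pair_names = [(a[0], b[0]) for a, b in pair_sets]
```

For the three features above this yields three pairs, each of which is later reduced to a single number by an inner product.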

S4022, performing an inner product operation on the feature vectors in each feature vector set to obtain the first predicted value. Specifically, in this step the inner product operation is performed on the feature vectors in each feature vector set according to a low-order cross function.

The low-order cross function is:

$$y_1 = w_0 + \sum_{i=1}^{m} w_i x_i + \sum_{i=1}^{m} \sum_{j=i+1}^{m} \langle v_i, v_j \rangle \, x_i x_j$$

where

$$\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}$$

y_1 denotes the first predicted value. w_0 denotes the first parameter of the initial deep learning network model. w_i denotes the i-th parameter of the initial deep learning network model. m denotes the total number of features in each yield data record. x_i denotes the feature value of the i-th feature in each yield data record. ⟨v_i, v_j⟩ denotes the inner product of the feature vectors v_i and v_j, where v_i and v_j are the two feature vectors in one feature vector set.

v_i denotes the feature vector of the i-th feature in each yield data record. v_j denotes the feature vector of the j-th feature in each yield data record. x_j denotes the feature value of the j-th feature in each yield data record, v_{i,f} denotes the f-th element of the feature vector v_i, v_{j,f} denotes the f-th element of the feature vector v_j, and k denotes the total number of elements in the feature vector v_i.
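As an illustration of the symbols just defined, the sketch below computes a first predicted value, assuming the low-order cross function takes the standard factorization machine form (bias, linear term, and pairwise inner-product term over all i < j). The function names and numeric values are hypothetical.

```python
def inner(u, v):
    """Inner product <u, v> = sum over f of u[f] * v[f]."""
    return sum(a * b for a, b in zip(u, v))


def fm_first_prediction(w0, w, x, v):
    """Low-order cross: w0 + sum_i w[i]*x[i] + sum_{i<j} <v[i], v[j]> * x[i]*x[j]."""
    m = len(x)
    linear = sum(w[i] * x[i] for i in range(m))
    pairwise = sum(
        inner(v[i], v[j]) * x[i] * x[j]
        for i in range(m)
        for j in range(i + 1, m)
    )
    return w0 + linear + pairwise


w0 = 0.5
w = [1.0, 2.0, 0.0]                        # one weight per feature
x = [1.0, 0.5, 2.0]                        # feature values
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # feature vectors, k = 2

y1 = fm_first_prediction(w0, w, x, v)
```

Here the linear term contributes 2.0 and the pairwise term contributes 3.0, so `y1` equals 5.5. The double loop costs O(m²k) per record, which is adequate for a sketch.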

And S4023, using all of the feature vectors as the input of the DNN model to obtain the second predicted value. The computation performed by the DNN model follows the prior art and is not described further here.

S403, computing a weighted sum of the first predicted value and the second predicted value to obtain an initial prediction result, and normalizing the initial prediction result with a preset activation function to obtain a predicted probability value. The weights used in the weighted sum can be set as required and are not limited by this application. The preset activation function may be the softmax activation function.

And S404, using the predicted probability value and a preset true result value as inputs to the quantile loss function to obtain the target network model. Specifically, the quantile loss function is used to compute the loss, and the loss is minimized. The parameter values of the initial deep learning network model at that minimum are then taken, and setting the model's parameters to those values yields the target network model.

In this embodiment, the target network model includes a quantile, which is one of the preset quantiles. The quantiles in the target network models corresponding to different hotels are different. A reasonable quantile can thus be set for each hotel according to data such as its attribute features, which improves the prediction accuracy for each hotel.

And S50, predicting, based on the target network models, the yield of each of the multiple hotels within a preset future time period, and outputting the yield value of each hotel within the preset time period. For example, the preset time period may be 6 days, in which case the yield of each hotel over the next 6 days is predicted. This provides parallel prediction for multiple hotels, which reduces the time needed to predict the yield of multiple hotels and makes the prediction system more efficient in multi-hotel scenarios.

In another embodiment of the present application, building on the above embodiment and as shown in FIG. 4, step S404 includes:

S4041: For each hotel, use multiple preset quantiles separately as inputs to the quantile loss function, and train multiple different second network models belonging to that hotel. Each second network model corresponds to one preset quantile.

S4042: Using the yield data of the past N days as a test set, test each of the second network models, and take the second network model whose predicted value differs least from the true yield value as the target network model for that hotel. The yield data of the past N days includes the true yield values.

In this way, multiple candidate second network models are obtained for each hotel from multiple preset quantiles. This prevents the one-sided convergence that could result from generating a model with only a single preset quantile, keeps abnormal data values from having a marked effect on the model, improves the model's fault tolerance, and makes the generated model more accurate. In addition, testing on the yield data closest to the present helps ensure the accuracy of the resulting model parameters. All of this makes the final prediction more accurate.
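The selection logic of S4041 and S4042 can be sketched in Python under stated assumptions. In place of a deep network, the stand-in "model" here is a constant predictor equal to the empirical q-quantile of the training yields, which is the constant that minimizes the pinball loss. The candidate quantiles and the use of mean absolute difference as the test metric are also assumptions; the source only says the model with the smallest difference is kept.

```python
import numpy as np

def fit_constant_quantile_model(train_yields, q):
    """Stand-in for training one second network model at quantile q."""
    return float(np.quantile(train_yields, q))

def select_quantile(yields, n_test, quantiles=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Train one candidate per quantile, test on the last n_test days,
    and return (best_q, test_error, prediction)."""
    yields = np.asarray(yields, dtype=float)
    if not 0 < n_test < len(yields):
        raise ValueError("n_test must leave at least one training point")
    train, test = yields[:-n_test], yields[-n_test:]
    best = None
    for q in quantiles:
        pred = fit_constant_quantile_model(train, q)
        err = float(np.mean(np.abs(test - pred)))  # assumed difference metric
        if best is None or err < best[1]:
            best = (q, err, pred)
    return best

best_q, err, pred = select_quantile([10, 20, 30, 40, 50, 50, 50], n_test=2)
print(best_q, err, pred)
```

In this toy series the most recent days sit at the top of the historical range, so the highest candidate quantile wins; a hotel with a different recent pattern would select a different quantile.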

In another embodiment of the present application, building on the above embodiment and as shown in FIG. 5, within step S40, step S402 further includes: generating a feature ID for each feature according to the features contained in each hotel's yield data.

Step S403 further includes: generating a feature dictionary file according to the features and the feature IDs.

Step S404 further includes: storing the feature dictionary file in the target network model.

This avoids a problem in the prior art, where a dictionary file must be generated in advance from all features during development and cannot be updated in time when new features appear later, which increases development cost. The present application does not require the feature mapping to be developed separately, which reduces engineering development cost.
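A minimal sketch of the feature dictionary idea follows. It assumes each yield record is a mapping from feature name to value, assigns IDs in first-seen order, and stores the dictionary next to the model parameters in one JSON bundle. The record layout, the feature names, and the JSON format are all hypothetical; the source does not specify how the dictionary file is serialized.

```python
import json

def build_feature_dict(records):
    """Assign each distinct feature name an integer ID in first-seen order."""
    feature_ids = {}
    for record in records:
        for name in record:
            if name not in feature_ids:
                feature_ids[name] = len(feature_ids)
    return feature_ids

def bundle_model(params, feature_ids):
    """Serialize model parameters together with the feature dictionary."""
    return json.dumps({"params": params, "feature_dict": feature_ids}, ensure_ascii=False)

records = [
    {"star_rating": 5, "city": "Shanghai"},
    {"star_rating": 4, "room_grade": 2},  # a feature first seen in a later record
]
feature_ids = build_feature_dict(records)
bundle = bundle_model({"w0": 0.1}, feature_ids)
print(feature_ids)
```

Because the dictionary is derived from the data at training time and travels with the model, a feature that appears later simply receives the next free ID at the next training run.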

In another embodiment of the present application, building on the above embodiment and as shown in FIG. 6, the hotel yield prediction method may further include the following steps:

S60: Update the yield data of the multiple hotels based on the hotel yield data of the current period.

S70: Based on the updated yield data, retrain the initial deep learning network model to obtain an updated target network model and an updated quantile.

In this way, the most suitable quantile can be found again and updated according to recent prediction performance. The yield database is also enriched, so it better characterizes the data distribution, which helps keep the historical yield data current and improves the accuracy of the next yield prediction.
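The update cycle of S60 and S70 can be sketched as a small helper. It assumes the yield history is kept as one list per hotel and that retraining is supplied as a callable returning a (model, quantile) pair; both are hypothetical simplifications of the training procedure described above.

```python
def update_and_retrain(history, new_data, retrain):
    """S60: append the current period's yields to each hotel's history.
    S70: retrain per hotel on the updated data.

    history, new_data: hotel_id -> list of yields
    retrain: callable(list of yields) -> (model, quantile), assumed interface
    Returns (updated_history, hotel_id -> (model, quantile)).
    """
    hotel_ids = sorted(set(history) | set(new_data))
    merged = {
        h: list(history.get(h, [])) + list(new_data.get(h, []))
        for h in hotel_ids
    }
    models = {h: retrain(series) for h, series in merged.items()}
    return merged, models

# Toy retraining: the "model" is the mean yield and the quantile is fixed.
toy_retrain = lambda series: (sum(series) / len(series), 0.5)
merged, models = update_and_retrain({"hotel_a": [1, 2]}, {"hotel_a": [3], "hotel_b": [4]}, toy_retrain)
print(merged, models)
```

A hotel that appears only in the new data is picked up as well, so newly listed hotels enter the next training run without separate handling.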

It should be noted that all of the above embodiments of the present application may be freely combined, and the technical solutions obtained by combining them also fall within the protection scope of the present application.

As shown in FIG. 7, an embodiment of the present invention also discloses a hotel yield prediction system 7, which includes:

a yield data acquisition module 71, configured to acquire yield data of multiple hotels, the yield data including historical yield data, hotel attribute feature data, and current booking progress data;

a loss function construction module 72, configured to construct a quantile loss function based on a quantile regression model;

a model construction module 73, configured to construct an initial deep learning network model;

a model training module 74, configured to train the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, obtaining a target network model for each hotel;

a yield prediction module 75, configured to predict, based on the target network models, the yield of each of the multiple hotels over a preset future period, and to output each hotel's yield value for the preset period.

It will be understood that the hotel yield prediction system of the present invention also includes other existing functional modules that support its operation. The hotel yield prediction system shown in FIG. 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.

The hotel yield prediction system in this embodiment is used to implement the hotel yield prediction method described above. For the specific implementation steps of the system, refer to the description of the method above; they are not repeated here.

An embodiment of the present invention also discloses a hotel yield prediction device comprising a processor and a memory, where the memory stores instructions executable by the processor, and the processor is configured to perform the steps of the above hotel yield prediction method by executing the executable instructions. FIG. 8 is a schematic structural diagram of the hotel yield prediction device disclosed by the present invention. An electronic device 600 according to this embodiment is described below with reference to FIG. 8. The electronic device 600 shown in FIG. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.

As shown in FIG. 8, the electronic device 600 takes the form of a general-purpose computing device. Its components may include, but are not limited to, at least one processing unit 610, at least one storage unit 620, a bus 630 connecting the different platform components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.

The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the hotel yield prediction method section of this specification. For example, the processing unit 610 may perform the steps shown in FIG. 1.

The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 6201 and/or a cache unit 6202, and may further include a read-only memory (ROM) unit 6203.

The storage unit 620 may also include a program/utility 6204 having a set of (at least one) program modules 6205. Such program modules 6205 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.

The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

The electronic device 600 may also communicate with one or more external devices 700 (for example, a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (for example, a router or a modem) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. The network adapter 660 may communicate with the other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.

The present invention also discloses a computer-readable storage medium for storing a program which, when executed, implements the steps of the above hotel yield prediction method. In some possible implementations, aspects of the present invention may also be implemented as a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to various exemplary embodiments of the present invention described in the hotel yield prediction method section of this specification.

As described above, when executed, the program on the computer-readable storage medium of this embodiment takes into account the combined influence of multiple exogenous variables on the final yield, which improves the accuracy of hotel yield prediction. It also supports parallel prediction for multiple hotels and can generate a model with a different quantile for each hotel, so that the accuracy of each hotel's yield prediction is ensured while multiple hotels are predicted simultaneously.

FIG. 9 is a schematic structural diagram of the computer-readable storage medium of the present invention. Referring to FIG. 9, a program product 800 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in combination with an instruction execution system, apparatus, or device.

The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.

Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the C language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).

The hotel yield prediction method, system, device, and storage medium provided by the embodiments of the present invention can take into account the combined influence of multiple exogenous variables on the final yield, which improves the accuracy of hotel yield prediction. They no longer rely excessively on historical data for prediction and can therefore better handle cold-start scenarios.

In addition, the present application supports parallel prediction for multiple hotels and can generate a model with a different quantile for each hotel, so that the accuracy of each hotel's yield prediction is ensured while multiple hotels are predicted simultaneously.

The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. A person of ordinary skill in the technical field of the present invention may make a number of simple deductions or substitutions without departing from the concept of the present invention, and all of these should be regarded as falling within the protection scope of the present invention.

Claims (13)

1. A hotel yield prediction method, characterized by comprising the following steps:
acquiring yield data of multiple hotels;
constructing a quantile loss function based on a quantile regression model;
constructing an initial deep learning network model;
training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels to obtain a target network model for each hotel;
predicting, based on the target network models, the yield of each of the multiple hotels over a preset future period, and outputting each hotel's yield value for the preset period.

2. The hotel yield prediction method of claim 1, characterized in that the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels to obtain a target network model for each hotel comprises:
vectorizing the yield data of each hotel to obtain multiple feature vectors belonging to that hotel;
performing low-order feature crossing and high-order feature crossing on the feature vectors to obtain a first predicted value and a second predicted value, respectively;
performing a weighted summation of the first predicted value and the second predicted value to obtain an initial prediction result, and normalizing the initial prediction result with a preset activation function to obtain a predicted probability value;
using the predicted probability value and a preset true result value as inputs to the quantile loss function to obtain the target network model.

3. The hotel yield prediction method of claim 2, characterized in that the initial deep learning network model includes a DNN model, and the step of performing low-order feature crossing and high-order feature crossing on the feature vectors to obtain a first predicted value and a second predicted value, respectively, comprises:
selecting any two feature vectors from all the feature vectors and combining them to form multiple feature vector sets, each feature vector set containing two feature vectors;
performing an inner product operation on each of the feature vector sets to obtain the first predicted value;
using all the feature vectors as input to the DNN model to obtain the second predicted value.

4. The hotel yield prediction method of claim 1, characterized in that the target network model includes a quantile, and the quantiles in the target network models corresponding to different hotels are different.

5. The hotel yield prediction method of claim 1, characterized in that the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels to obtain a target network model for each hotel comprises:
for each hotel, using multiple preset quantiles separately as inputs to the quantile loss function, and training multiple different second network models belonging to that hotel, each second network model corresponding to one preset quantile;
using the yield data of the past N days as a test set, testing each of the second network models, and taking the second network model whose predicted value differs least from the true yield value as the target network model for that hotel, the yield data of the past N days including the true yield values.

6. The hotel yield prediction method of claim 1, characterized in that the step of training the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels to obtain a target network model for each hotel comprises:
generating a feature ID for each feature according to the features contained in each hotel's yield data;
generating a feature dictionary file according to the features and the feature IDs;
storing the feature dictionary file in the target network model.

7. The hotel yield prediction method of claim 3, characterized in that each piece of the yield data includes features and feature values corresponding to the features; the initial deep learning network model has multiple parameters; each feature matches one of the parameters; and each feature matches one of the feature vectors;
the step of performing an inner product operation on each of the feature vector sets to obtain the first predicted value comprises: performing the inner product operation on each feature vector set according to a low-order cross function;
the low-order cross function is:
[Formula image FDA0002813965150000021]
where
[Formula image FDA0002813965150000022]
y_1 denotes the first predicted value; w_0 denotes the first parameter of the initial deep learning network model; w_i denotes the i-th parameter of the initial deep learning network model; m denotes the total number of features in each piece of the yield data; x_i denotes the feature value of the i-th feature in each piece of the yield data; <v_i, v_j> denotes the inner product of the feature vectors v_i and v_j; v_i denotes the feature vector of the i-th feature in each piece of the yield data; v_j denotes the feature vector of the j-th feature in each piece of the yield data; x_j denotes the feature value of the j-th feature in each piece of the yield data; v_{i,f} denotes the f-th element of the feature vector v_i; v_{j,f} denotes the f-th element of the feature vector v_j; and k denotes the total number of elements in the feature vector v_i.
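The two formula images of claim 7 are not reproduced in this text. A form consistent with the symbol definitions above is the standard second-order factorization machine, given here as an assumed reconstruction and not as the verified content of the images; in particular, the summation limits follow the usual convention.

```latex
y_1 = w_0 + \sum_{i=1}^{m} w_i x_i
      + \sum_{i=1}^{m} \sum_{j=i+1}^{m} \langle v_i, v_j \rangle \, x_i x_j,
\qquad
\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}
```

The first two terms form a linear model over the feature values, and the double sum adds one pairwise interaction per feature vector set, weighted by the inner product of the two feature vectors.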
8. The hotel yield prediction method of claim 1, characterized in that the quantile loss function is:
[Formula image FDA0002813965150000031]
where n denotes the total number of pieces of the yield data, q denotes the preset quantile, ŷ_p denotes the predicted value for the p-th piece of the yield data, y_p denotes the true yield value of the p-th piece of the yield data, and L denotes the total loss value.
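The formula image of claim 8 is not reproduced in this text. One common form of the quantile (pinball) loss that is consistent with the symbols defined above is shown below as an assumption; whether the original normalizes by n or uses a plain sum cannot be recovered from the text.

```latex
L = \frac{1}{n} \sum_{p=1}^{n}
    \max\bigl( q \, (y_p - \hat{y}_p), \; (q - 1) \, (y_p - \hat{y}_p) \bigr)
```

For q = 0.5 each term reduces to half the absolute error, while a larger q penalizes under-prediction more heavily than over-prediction.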
9. The hotel yield prediction method of claim 4, characterized in that the method further comprises the steps of:
updating the yield data of the multiple hotels based on the hotel yield data of the current period;
retraining the initial deep learning network model based on the updated yield data to obtain an updated target network model and an updated quantile.

10. The hotel yield prediction method of claim 1, characterized in that the yield data includes historical yield data, hotel attribute feature data, and current booking progress data; the hotel attribute feature data includes hotel star rating, room type grade, and data on the city where the hotel is located; and the yield data further includes historical booking progress data, current cancellation progress, features of the competitive circle in which the hotel is located, and date features.

11. A hotel yield prediction system for implementing the hotel yield prediction method of claim 1, characterized in that the system comprises:
a yield data acquisition module, configured to acquire yield data of multiple hotels, the yield data including historical yield data, hotel attribute feature data, and current booking progress data;
a loss function construction module, configured to construct a quantile loss function based on a quantile regression model;
a model construction module, configured to construct an initial deep learning network model;
a model training module, configured to train the initial deep learning network model based on the quantile loss function and the yield data of the multiple hotels, obtaining a target network model for each hotel;
a yield prediction module, configured to predict, based on the target network models, the yield of each of the multiple hotels over a preset future period, and to output each hotel's yield value for the preset period.

12. A hotel yield prediction device, characterized by comprising:
a processor;
a memory storing instructions executable by the processor;
wherein the processor is configured to perform the steps of the hotel yield prediction method of any one of claims 1 to 10 by executing the executable instructions.

13. A computer-readable storage medium for storing a program, characterized in that the program, when executed by a processor, implements the steps of the hotel yield prediction method of any one of claims 1 to 10.
CN202011394090.9A 2020-12-02 2020-12-02 Hotel yield prediction method, system, equipment and storage medium Pending CN112434872A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011394090.9A CN112434872A (en) 2020-12-02 2020-12-02 Hotel yield prediction method, system, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011394090.9A CN112434872A (en) 2020-12-02 2020-12-02 Hotel yield prediction method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112434872A true CN112434872A (en) 2021-03-02

Family

ID=74690810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011394090.9A Pending CN112434872A (en) 2020-12-02 2020-12-02 Hotel yield prediction method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112434872A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361920A (en) * 2021-06-04 2021-09-07 上海华客信息科技有限公司 Hotel service optimization index recommendation method, system, equipment and storage medium
CN113496005A (en) * 2021-05-26 2021-10-12 北京房多多信息技术有限公司 Information management method and device, electronic equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496005A (en) * 2021-05-26 2021-10-12 北京房多多信息技术有限公司 Information management method and device, electronic equipment and storage medium
CN113496005B (en) * 2021-05-26 2022-04-08 北京房多多信息技术有限公司 Information management method and device, electronic equipment and storage medium
CN113361920A (en) * 2021-06-04 2021-09-07 上海华客信息科技有限公司 Hotel service optimization index recommendation method, system, equipment and storage medium

Similar Documents

Publication Publication Date Title
US12223275B2 (en) Method of training model, device, and storage medium
US20220269835A1 (en) Resource prediction system for executing machine learning models
Jiang et al. Day‐ahead renewable scenario forecasts based on generative adversarial networks
CN117039895A (en) Wind power prediction method and system for energy storage auxiliary black start
CN110795939A (en) Text processing method and device
US20230145452A1 (en) Method and apparatus for training a model
CN112434872A (en) Hotel yield prediction method, system, equipment and storage medium
CN114358257A (en) Neural network pruning method and device, readable medium and electronic equipment
WO2024056051A1 (en) Non-intrusive flexible load aggregation characteristic identification and optimization method, apparatus, and device
CN115310590A (en) Graph structure learning method and device
CN113780662A (en) Flow prediction method, device, equipment and medium
CN111695967A (en) Method, device, equipment and storage medium for determining quotation
Green II et al. Intelligent state space pruning for Monte Carlo simulation with applications in composite power system reliability
CN117807888B (en) Method, system and equipment for calculating tower icing load by considering corrosion influence
CN118735360A (en) Model training method, behavior quality evaluation method and device, equipment and product
CN112580885A (en) Method, device and equipment for predicting accessory qualification rate and storage medium
CN118763647A (en) A photovoltaic short-term power prediction method and system
TWI851438B (en) Optimizing algorithms for hardware devices
WO2024155788A1 (en) Machine learning models for electrical power simulations
CN117312791A (en) Current transformer error state trend prediction method, device, equipment and medium
EP4198831A1 (en) Automated feature engineering for predictive modeling using deep reinforcement learning
CN116957133A (en) Wind power generation power and photovoltaic power generation power prediction method and device
CN117473384A (en) A power grid line safety constraint identification method, device, equipment and storage medium
CN115759373A (en) Gas daily load prediction method, device and equipment
US20250124082A1 (en) Method, device, and computer program product for processing workflow chart

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210302
