CN104156594B

CN104156594B - Dynamic flight station-crossing time estimation method based on Bayes network

Info

Publication number: CN104156594B
Application number: CN201410391944.6A
Authority: CN
Inventors: 丁建立; 赵键涛; 曹卫东; 胡海生; 黄威
Original assignee: Civil Aviation University of China
Current assignee: Civil Aviation University of China
Priority date: 2014-08-11
Filing date: 2014-08-11
Publication date: 2017-01-25
Anticipated expiration: 2034-08-11
Also published as: CN104156594A

Abstract

A method for dynamically estimating flight transit time based on a Bayesian network, which applies data mining to transit time estimation. First, several factors that significantly affect flight transit time are extracted, and a Bayesian network is used to build a transit time estimation model, from which transit time estimates under different conditions are obtained. When estimating the departure time of a flight, only the flight's arrival information is needed: given that information, the transit time estimation model yields the probability distribution of the transit time, and taking the expected value of that distribution gives the transit time estimate. In addition, because flight data keep growing, the method can continuously learn from newly added data while ensuring that the result is consistent with relearning from all the data, so that the transit time estimation model is adjusted dynamically and the transit time estimates are updated regularly to adapt to changing external conditions.

Description

A Dynamic Estimation Method of Flight Transit Time Based on Bayesian Network

Technical Field

The invention belongs to the technical field of civil aviation, and in particular relates to a method for dynamically estimating flight transit time based on a Bayesian network.

Background

Flight delays are a focal point of air transport service disputes and have drawn increasing attention in recent years as China's civil aviation traffic has continued to grow. Delays cause direct economic losses to airports and airlines, seriously inconvenience passengers, and can severely disrupt normal airport operations. Delays have many causes; common ones include weather, air traffic control, mechanical failure, aircraft allocation and flight planning. Predicting delays effectively before they occur is a long-standing goal of the civil aviation industry. Each aircraft flies several flights in a day, so for an airline, effectively predicting the delay of downstream flights when an upstream flight is delayed is of real practical value for improving service quality and competitiveness.

Flyontime.us is a flight delay analysis system in the United States that is open to the public free of charge. Anyone can use it to query and analyze the delay rates, airport waiting times, flight schedules and weather information of US flights. Its data come mainly from the U.S. Department of Transportation, while security-check queueing times are submitted to the system by ordinary travelers. The system mainly uses statistical methods to study the distribution of all flight delays at each airport over a given period.

The US company FlightCaster has developed a flight delay information service system. It uses an advanced algorithm to collect several years of historical data for every domestic flight and matches these data against real-time conditions to determine whether a flight will be delayed, which allows flight status to be predicted several hours in advance. It also provides client applications for iPhone and Blackberry that deliver flight delay forecasts.

Europe attaches great importance to flight information services. The European civil aviation authority EUROCONTROL has carried out a programme to develop the European AIS Database (EAD). EAD integrates the information held in the aeronautical information system databases of the member states and is currently the world's largest centralized aeronautical information service system; its online aeronautical information services cover most European countries.

Singapore Changi Airport is one of the busiest large hub airports in Asia. Its information release system provides real-time, accurate flight status information, and its arrival and departure status messages (boarding announcements, boarding, door closing, take-off, landing and so on) are updated and published in real time.

In recent years, some airports in China, such as Capital Airport and New Baiyun Airport, have offered passengers flight status query services through websites and telephone call centers. Some have also set up dedicated pages on social networking sites such as Weibo, where they explain delays to passengers through text and pictures when they occur.

Feiyou Technology (Civil Aviation Resource Network) has developed related products such as "非常准" ("Very Accurate") and "VariFlight", which provide passengers with free customized flight status services and also provide flight information services to some small airports.

However, the systems above share the following problems: in general, their information is incomplete and the influencing factors vary with considerable randomness. What is currently lacking is a method that uses data mining to learn from historical flight data and estimate transit time under the influence of many factors, so that the transit time can be estimated effectively when the same situation is encountered again.

Summary of the Invention

To solve the above problems, the object of the present invention is to provide a method for dynamically estimating flight transit time based on a Bayesian network.

To achieve this object, the method for dynamically estimating flight transit time based on a Bayesian network provided by the invention comprises the following steps carried out in order:

Step 1: Preprocess the historical flight data and extract from it, as influencing factors, data including the arrival delay of the preceding flight, the arrival time period of the preceding flight, the planned transit time, the departure airport, the aircraft type and the actual transit time;

Step 2: Assume that each of the above factors affects the actual transit time and that these effects are mutually independent, and thereby determine the Bayesian network topology; then, using the data obtained in the previous step, obtain the Bayesian network parameters by maximum likelihood estimation, which gives the transit time estimation model;

Step 3: Perform inference on the transit time estimation model obtained above to obtain transit time estimates under different conditions;

Step 4: Predict the departure time of the arriving flight;

Step 5: After a period of time, preprocess the new historical flight data using the method of Step 1 to obtain new training samples; then, taking the previously obtained model as prior knowledge and combining it with the new training samples, revise the Bayesian network parameters by Bayesian estimation; after the parameters have been revised, perform inference on the new transit time estimation model and update the transit time rule base;

Step 6: Repeat Step 5 periodically to update the transit time rule base dynamically.

In Step 1, the historical flight data include the flight number, aircraft number, planned departure time, planned landing time, actual departure time, actual landing time, departure airport, destination airport and number of seats. The arrival delay of the preceding flight is its actual arrival time minus its planned arrival time. The planned transit time is the planned departure time of the next flight in the flight schedule minus the planned arrival time of the preceding flight.
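The two derived quantities defined in Step 1 can be sketched in a few lines of Python. This is only an illustration: the record layout (`planned_arrival`, `actual_arrival`, and so on) and the function names are hypothetical, the two legs are assumed to have already been paired by aircraft number, and the actual transit time is assumed to be the next flight's actual departure minus the preceding flight's actual arrival, which the text does not spell out.

```python
from datetime import datetime


def minutes_between(later: datetime, earlier: datetime) -> float:
    """Signed difference later - earlier, in minutes."""
    return (later - earlier).total_seconds() / 60.0


def transit_features(prev_leg: dict, next_leg: dict) -> dict:
    """Derive the Step 1 quantities for one turnaround of a single aircraft."""
    return {
        # actual arrival minus planned arrival of the preceding flight
        "arrival_delay_min": minutes_between(
            prev_leg["actual_arrival"], prev_leg["planned_arrival"]),
        # planned departure of the next flight minus planned arrival of the preceding one
        "planned_transit_min": minutes_between(
            next_leg["planned_departure"], prev_leg["planned_arrival"]),
        # assumption: actual departure of the next flight minus actual arrival
        "actual_transit_min": minutes_between(
            next_leg["actual_departure"], prev_leg["actual_arrival"]),
    }


# Hypothetical records for two consecutive legs flown by the same aircraft.
prev_leg = {
    "planned_arrival": datetime(2014, 8, 11, 13, 0),
    "actual_arrival": datetime(2014, 8, 11, 13, 25),
}
next_leg = {
    "planned_departure": datetime(2014, 8, 11, 14, 0),
    "actual_departure": datetime(2014, 8, 11, 14, 15),
}
features = transit_features(prev_leg, next_leg)
print(features)
```

Here the preceding flight arrives 25 minutes late, the schedule allows 60 minutes on the ground, and the aircraft actually spends 50 minutes there.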

The maximum likelihood estimation used to obtain the Bayesian network parameters, and hence the transit time estimation model, proceeds as follows:

Let $\vec{\theta}$ be the vector of all parameters, n the number of nodes, q_i the number of value combinations of the parent set π(X_i), and r_i the number of values of node X_i. Let θ_ijk = P(X_i = k | π(X_i) = j) be the probability that X_i takes its k-th value when its parent nodes take their j-th value combination, and let D_l, l = 1, 2, ..., m, be the sample data. The log-likelihood function of $\vec{\theta}$ is then:

$l(\vec{\theta} \mid D) = \log \prod_{l=1}^{m} P(D_l \mid \vec{\theta}) = \sum_{l=1}^{m} \log P(D_l \mid \vec{\theta}) \qquad (1)$

The log-likelihood function reaches its maximum when θ_ijk takes the following value:

$\theta_{ijk}^{*} = \begin{cases} \dfrac{m_{ijk}}{\sum_{k'=1}^{r_i} m_{ijk'}}, & \sum_{k'=1}^{r_i} m_{ijk'} > 0 \\ \dfrac{1}{r_i}, & \text{otherwise} \end{cases} \qquad (2)$

where m_ijk is the number of samples in the flight data satisfying X_i = k and π(X_i) = j, and r_i is the number of values of node X_i. If the historical flight data contain no sample with π(X_i) = j, that is $\sum_{k'=1}^{r_i} m_{ijk'} = 0$, the parameters are set to a uniform distribution. The Bayesian network parameters are determined by formula (2), which gives the transit time estimation model.
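The counting estimate of formula (2), including the uniform fallback for parent configurations that never occur in the data, might look as follows. This is a sketch rather than the patent's implementation: `mle_cpt` and the sample encoding are hypothetical, and only the conditional table of the transit-time node given its parents is shown.

```python
from collections import Counter, defaultdict


def mle_cpt(samples, n_child_states):
    """Return theta(j, k) = P(child = k | parents = j) estimated by counting.

    samples: iterable of (parent_config, child_state) pairs, where
    parent_config is any hashable value and child_state is an int
    in range(n_child_states).
    """
    counts = defaultdict(Counter)  # counts[j][k] plays the role of m_ijk
    for parent_config, child_state in samples:
        counts[parent_config][child_state] += 1

    def theta(parent_config, child_state):
        row = counts.get(parent_config)
        total = sum(row.values()) if row else 0
        if total == 0:
            # no data for this parent configuration: uniform distribution
            return 1.0 / n_child_states
        return row[child_state] / total

    return theta


# Hypothetical data: parents = (aircraft class, arrival hour state),
# child = index of the 10-minute actual-transit-time bin.
samples = [(("B", "H13"), 4), (("B", "H13"), 4), (("B", "H13"), 5)]
theta = mle_cpt(samples, n_child_states=12)
print(theta(("B", "H13"), 4))  # two of the three observed samples fall in bin 4
print(theta(("A", "H07"), 0))  # unseen configuration, falls back to 1/12
```

Keeping the raw counts rather than the normalized probabilities is a deliberate choice: the same counts can later serve as the prior when new data arrive.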

Inference on the transit time estimation model, which yields the transit time estimates under different conditions, proceeds as follows:

3.1 Perform inference on the transit time estimation model to obtain the probability distribution of the transit time under each condition;

3.2 Take the expected value of the transit time:

$E(t) = \sum_{i} P(i) \cdot T_i \qquad (3)$

where E(t) is the expected transit time when the other conditions are fixed, T_i is the midpoint of the i-th transit time interval, and P(i) is the probability that the transit time falls in the i-th interval. The expected value E(t) is taken as the transit time estimate for that condition;

3.3 Insert the transit time estimates for the different conditions into the transit time rule base, which supplies the inputs for flight departure time prediction.
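Formula (3) and the rule-base insertion of step 3.3 can be illustrated with a small sketch. It assumes 10-minute intervals indexed from zero, so that interval i covers [10i, 10i + 10) minutes and has midpoint 10i + 5; the function name, the dictionary used as the rule base and the condition labels are all hypothetical.

```python
def expected_transit_minutes(distribution, bin_width=10):
    """Expected value of formula (3): sum over i of P(i) * T_i.

    distribution maps an interval index i to P(i); T_i is taken as the
    midpoint of interval i, i.e. i * bin_width + bin_width / 2.
    """
    return sum(p * (i * bin_width + bin_width / 2)
               for i, p in distribution.items())


# Hypothetical posterior over transit-time intervals for one condition.
distribution = {4: 0.2, 5: 0.5, 6: 0.3}

# The rule base is sketched as a dict keyed by the five conditioning factors:
# (arrival delay bin, arrival hour, planned transit bin, airport, aircraft class).
rule_base = {}
condition = ("D20-30", "H13", "P60-70", "AIRPORT_X", "B")
rule_base[condition] = expected_transit_minutes(distribution)

# 0.2 * 45 + 0.5 * 55 + 0.3 * 65 = 56 minutes
print(rule_base[condition])
```

Storing only the expected value keeps the lookup in Step 4 to a single dictionary access; the full distribution could be stored instead if interval estimates were needed.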

In Step 4, the departure time of the arriving flight is predicted as follows. If the preceding flight arrives without delay, or its delay is less than 10 minutes, the estimated departure time of the next flight is its planned departure time. If the delay is greater than 10 minutes, the estimated transit time for the corresponding condition is looked up in the transit time rule base. If the actual arrival time plus the estimated transit time is earlier than the planned departure time of the next flight, the estimated departure time of the next flight is its planned departure time. If the actual arrival time of the preceding flight plus the estimated transit time is later than the planned departure time of the next flight, the estimated departure time of the next flight is the actual arrival time plus the estimated transit time. If the delay is greater than 10 minutes and the rule base contains no transit time for the corresponding condition, the estimated transit time is the planned transit time, and the estimated departure time of the next flight is the actual arrival time plus the planned transit time.
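The decision rules of Step 4 reduce to a short function. The sketch below is an illustration under stated assumptions: `estimate_departure` and its arguments are hypothetical names, the rule base is assumed to be a dict mapping a condition key to an estimated transit time in minutes, and a delay of exactly 10 minutes, which the text leaves unspecified, is treated like a longer delay.

```python
from datetime import datetime, timedelta

DELAY_THRESHOLD = timedelta(minutes=10)


def estimate_departure(planned_arrival, actual_arrival, planned_departure,
                       rule_base, condition):
    """Estimate the next flight's departure time from the arrival of the preceding one."""
    # No delay, or delay under 10 minutes: keep the planned departure time.
    if actual_arrival - planned_arrival < DELAY_THRESHOLD:
        return planned_departure

    estimated_minutes = rule_base.get(condition)
    if estimated_minutes is None:
        # No rule for this condition: fall back to the planned transit time.
        planned_transit = planned_departure - planned_arrival
        return actual_arrival + planned_transit

    # The aircraft cannot leave before it is ready, nor before its planned time.
    ready_time = actual_arrival + timedelta(minutes=estimated_minutes)
    return max(ready_time, planned_departure)


# Hypothetical example: 40-minute arrival delay, rule base says 45 minutes on the ground.
key = ("D40-50", "H13", "P60-70", "AIRPORT_X", "B")
rules = {key: 45.0}
planned_arr = datetime(2014, 8, 11, 13, 0)
planned_dep = datetime(2014, 8, 11, 14, 0)

late = estimate_departure(planned_arr, datetime(2014, 8, 11, 13, 40),
                          planned_dep, rules, key)
no_rule = estimate_departure(planned_arr, datetime(2014, 8, 11, 13, 40),
                             planned_dep, {}, key)
on_time = estimate_departure(planned_arr, datetime(2014, 8, 11, 13, 5),
                             planned_dep, rules, key)
print(late, no_rule, on_time)
```

Using `max` covers both comparison branches of the text at once, since the estimate is whichever of the two candidate times is later.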

In Step 5, the Bayesian network parameters are revised by Bayesian estimation as follows:

Let $\vec{\theta}_{ij}$ be the sub-vector composed of $\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ijr_i}$, let α_ijk be the number of samples in the prior knowledge satisfying X_i = k and π(X_i) = j, and let the prior $P(\vec{\theta}_{ij})$ be the Dirichlet distribution $D[\alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i}]$. Bayes' formula gives:

$P(\vec{\theta} \mid D) = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{\int P(D, \vec{\theta}) \, d\vec{\theta}} = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{P(D)} \qquad (4)$

where $P(\vec{\theta})$ is the prior probability distribution of the vector $\vec{\theta}$ and $P(\vec{\theta} \mid D)$ is its posterior probability distribution. Since $P(\vec{\theta}_{ij} \mid D)$ follows the Dirichlet distribution $D[m_{ij1} + \alpha_{ij1}, m_{ij2} + \alpha_{ij2}, \ldots, m_{ijr_i} + \alpha_{ijr_i}]$, it follows that:

$E(\theta_{ijk} \mid D) = \frac{m_{ijk} + \alpha_{ijk}}{\sum_{k'=1}^{r_i} (m_{ijk'} + \alpha_{ijk'})} \qquad (5)$

The Bayesian network parameters are revised using formula (5), where m_ijk is the number of new samples satisfying X_i = k and π(X_i) = j.
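Formula (5) amounts to adding the new counts to the counts already accumulated and renormalizing. The sketch below, with hypothetical names and made-up counts for a single parent configuration, also checks the consistency property claimed for the method: updating incrementally gives the same parameters as relearning from all the data at once.

```python
def posterior_mean(prior_counts, new_counts):
    """Formula (5): E(theta_ijk | D) = (m_ijk + a_ijk) / sum over k' of (m_ijk' + a_ijk').

    prior_counts holds the alpha_ijk (counts from the earlier model) and
    new_counts the m_ijk (counts from the new samples) for one fixed (i, j).
    Returns the revised probabilities and the combined counts, which serve
    as the prior for the next update.
    """
    combined = [a + m for a, m in zip(prior_counts, new_counts)]
    total = sum(combined)
    if total == 0:
        # assumption: with no data at all, keep the uniform distribution
        return [1.0 / len(combined)] * len(combined), combined
    return [c / total for c in combined], combined


# Hypothetical counts over three transit-time intervals.
first_batch = [2, 6, 2]   # counts behind the earlier model (alpha_ijk)
second_batch = [1, 3, 6]  # counts from the newly collected data (m_ijk)

incremental, carried_counts = posterior_mean(first_batch, second_batch)

# Relearning from scratch on all data: pool the counts, then normalize.
pooled = [a + b for a, b in zip(first_batch, second_batch)]
relearned = [c / sum(pooled) for c in pooled]

print(incremental)
print(incremental == relearned)
```

Because only the counts need to be carried forward, the periodic update of Step 6 never has to revisit the older flight records.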

The method for dynamically estimating flight transit time based on a Bayesian network provided by the invention applies data mining to transit time estimation. It first extracts several factors that significantly affect flight transit time and uses a Bayesian network to build a transit time estimation model, from which transit time estimates under different conditions are obtained. When estimating a flight's departure time, only the flight's arrival information is needed: given that information, the transit time estimation model yields the probability distribution of the transit time, and taking the expected value of that distribution gives the transit time estimate. In addition, because flight data keep growing, the method can continuously learn from newly added data while ensuring that the result is consistent with relearning from all the data, so that the transit time estimation model is adjusted dynamically and the transit time estimates are updated regularly to adapt to changing external conditions.

Description of the Drawings

Fig. 1 is a flow chart of the method for dynamically estimating flight transit time based on a Bayesian network provided by the invention.

Fig. 2 is a flow chart of flight departure time estimation in the method for dynamically estimating flight transit time based on a Bayesian network provided by the invention.

Detailed Description

The method for dynamically estimating flight transit time based on a Bayesian network provided by the invention is described in detail below with reference to the drawings and a specific embodiment.

As shown in Fig. 1, the method for dynamically estimating flight transit time based on a Bayesian network provided by the invention comprises the following steps carried out in order:

Step 1: Preprocess the historical flight data and extract from it, as influencing factors, data including the arrival delay of the preceding flight, the arrival time period of the preceding flight, the planned transit time, the departure airport, the aircraft type and the actual transit time. The historical flight data include the flight number, aircraft number, planned departure time, planned landing time, actual departure time, actual landing time, departure airport, destination airport and number of seats. The arrival delay of the preceding flight is its actual arrival time minus its planned arrival time, and the planned transit time is the planned departure time of the next flight in the flight schedule minus the planned arrival time of the preceding flight;

Step 2: Assume that each of the above factors affects the actual transit time and that these effects are mutually independent, and thereby determine the Bayesian network topology. When the topology is determined, the node representing the aircraft type has four states according to the number of seats: fewer than 90 seats is type A, 90 to 160 seats is type B, 160 to 230 seats is type C, and more than 230 seats is type D. The node representing the arrival time period of the preceding flight has 24 states, each spanning one hour; for example, H13 means that the preceding flight arrived between 13:00 and 13:59. The arrival delay of the preceding flight, the planned transit time and the actual transit time are each divided into states of 10 minutes per interval. Each state of the node representing the airport stands for one airport. After the topology has been determined, the data obtained in the previous step are used with maximum likelihood estimation to obtain the Bayesian network parameters, which gives the transit time estimation model:

Let $\vec{\theta}$ be the vector of all parameters, n the number of nodes, q_i the number of value combinations of the parent set π(X_i), and r_i the number of values of node X_i. Let θ_ijk = P(X_i = k | π(X_i) = j) be the probability that X_i takes its k-th value when its parent nodes take their j-th value combination, and let D_l, l = 1, 2, ..., m, be the sample data. The log-likelihood function of the parameter vector $\vec{\theta}$ is then:

$l(\vec{\theta} \mid D) = \log \prod_{l=1}^{m} P(D_l \mid \vec{\theta}) = \sum_{l=1}^{m} \log P(D_l \mid \vec{\theta}) \qquad (1)$

The log-likelihood function reaches its maximum when θ_ijk takes the following value:

$\theta_{ijk}^{*} = \begin{cases} \dfrac{m_{ijk}}{\sum_{k'=1}^{r_i} m_{ijk'}}, & \sum_{k'=1}^{r_i} m_{ijk'} > 0 \\ \dfrac{1}{r_i}, & \text{otherwise} \end{cases} \qquad (2)$

where m_ijk is the number of samples in the flight data satisfying X_i = k and π(X_i) = j, and r_i is the number of values of node X_i. If the historical flight data contain no sample with π(X_i) = j, that is $\sum_{k'=1}^{r_i} m_{ijk'} = 0$, the parameters are set to a uniform distribution. The Bayesian network parameters are determined by formula (2), which gives the transit time estimation model.
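The node states described in Step 2 can be produced by a few small mapping functions. This is a sketch with hypothetical function names. The seat ranges in the text share their endpoints (160 and 230 each appear in two ranges), so the assignment of exactly 160 seats to type B and exactly 230 seats to type C is an assumption, as are the zero-padded hour labels and the zero-based interval index.

```python
def aircraft_class(seats: int) -> str:
    """Map a seat count to one of the four aircraft-type states."""
    if seats < 90:
        return "A"
    if seats <= 160:   # assumption: exactly 160 seats counts as type B
        return "B"
    if seats <= 230:   # assumption: exactly 230 seats counts as type C
        return "C"
    return "D"


def arrival_hour_state(hour: int) -> str:
    """One of 24 states; H13 covers arrivals from 13:00 to 13:59."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return f"H{hour:02d}"


def ten_minute_bin(minutes: float) -> int:
    """Index of the 10-minute interval containing the given duration.

    Interval i covers [10 * i, 10 * i + 10); early arrivals (negative
    delays) map to negative indices because floor division is used.
    """
    return int(minutes // 10)


print(aircraft_class(150), arrival_hour_state(13), ten_minute_bin(25))
```

A 150-seat aircraft arriving at 13:25 with a 25-minute delay would therefore be encoded as type B, state H13, delay interval 2.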

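Step 3: Perform inference on the transit time estimation model obtained above to obtain transit time estimates under different conditions. The specific steps are as follows:

3.1 Perform inference on the transit time estimation model to obtain the probability distribution of the transit time under each condition;

3.2 Take the expected value of the transit time:

$E(t) = \sum_{i} P(i) \cdot T_i \qquad (3)$

where E(t) is the expected transit time when the other conditions are fixed, T_i is the midpoint of the i-th transit time interval, and P(i) is the probability that the transit time falls in the i-th interval. The expected value E(t) is taken as the transit time estimate for that condition;

3.3 Insert the transit time estimates for the different conditions into the transit time rule base, which supplies the inputs for flight departure time prediction.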

Step 4: Predict the departure time of the arriving flight. If the preceding flight arrives without delay, or its delay is less than 10 minutes, the estimated departure time of the next flight is its planned departure time. If the delay is greater than 10 minutes, the estimated transit time for the corresponding condition is looked up in the transit time rule base. If the actual arrival time plus the estimated transit time is earlier than the planned departure time of the next flight, the estimated departure time of the next flight is its planned departure time. If the actual arrival time of the preceding flight plus the estimated transit time is later than the planned departure time of the next flight, the estimated departure time of the next flight is the actual arrival time plus the estimated transit time. If the delay is greater than 10 minutes and the rule base contains no transit time for the corresponding condition, the estimated transit time is the planned transit time, and the estimated departure time of the next flight is the actual arrival time plus the planned transit time.

Step 5: After a period of time, preprocess the new historical flight data using the method of Step 1 to obtain new training samples; then, taking the previously obtained model as prior knowledge and combining it with the new training samples, revise the Bayesian network parameters by Bayesian estimation;

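Let $\vec{\theta}_{ij}$ be the sub-vector composed of $\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ijr_i}$, let α_ijk be the number of samples in the prior knowledge satisfying X_i = k and π(X_i) = j, and let the prior $P(\vec{\theta}_{ij})$ be the Dirichlet distribution $D[\alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i}]$. Bayes' formula gives:

$P(\vec{\theta} \mid D) = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{\int P(D, \vec{\theta}) \, d\vec{\theta}} = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{P(D)} \qquad (4)$

where $P(\vec{\theta})$ is the prior probability distribution of the vector $\vec{\theta}$ and $P(\vec{\theta} \mid D)$ is its posterior probability distribution. Since $P(\vec{\theta}_{ij} \mid D)$ follows the Dirichlet distribution $D[m_{ij1} + \alpha_{ij1}, m_{ij2} + \alpha_{ij2}, \ldots, m_{ijr_i} + \alpha_{ijr_i}]$, it follows that:

$E(\theta_{ijk} \mid D) = \frac{m_{ijk} + \alpha_{ijk}}{\sum_{k'=1}^{r_i} (m_{ijk'} + \alpha_{ijk'})} \qquad (5)$

The Bayesian network parameters are revised using formula (5), where m_ijk is the number of new samples satisfying X_i = k and π(X_i) = j. After the parameters have been revised, inference is performed on the new transit time estimation model and the transit time rule base is updated.

Step 6: Repeat Step 5 periodically to update the transit time rule base dynamically.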

Claims

1. A method for dynamically estimating flight transit time based on a Bayesian network, characterized in that the method comprises the following steps carried out in order:

Step 1: preprocess the historical flight data and extract from it, as influencing factors, data including the arrival delay of the preceding flight, the arrival time period of the preceding flight, the planned transit time, the departure airport, the aircraft type and the actual transit time;

Step 2: assume that each of the above factors affects the actual transit time and that these effects are mutually independent, and thereby determine the Bayesian network topology; then, using the data obtained in the previous step, obtain the Bayesian network parameters by maximum likelihood estimation, thereby obtaining the transit time estimation model;

Step 3: perform inference on the transit time estimation model obtained above to obtain the transit time estimates when the five variables (arrival delay of the preceding flight, arrival time period of the preceding flight, planned transit time, departure airport and aircraft type) take different values;

Step 4: predict the departure time of the arriving flight;

Step 5: after a period of time, preprocess the new historical flight data using the method of Step 1 to obtain new training samples; then, taking the previously obtained model as prior knowledge and combining it with the new training samples, revise the Bayesian network parameters by Bayesian estimation; after the parameters have been revised, perform inference on the new transit time estimation model and update the transit time rule base;

Step 6: repeat Step 5 periodically to update the transit time rule base dynamically.

2. The method for dynamically estimating flight transit time based on a Bayesian network according to claim 1, characterized in that: in Step 1, the historical flight data include the flight number, aircraft number, planned departure time, planned landing time, actual departure time, actual landing time, departure airport, destination airport and number of seats; the arrival delay of the preceding flight is its actual arrival time minus its planned arrival time, and the planned transit time is the planned departure time of the next flight in the flight schedule minus the planned arrival time of the preceding flight.

3. The method for dynamically estimating flight transit time based on a Bayesian network according to claim 1, characterized in that: in Step 2, the maximum likelihood estimation used to obtain the Bayesian network parameters, and thereby the transit time estimation model, is:

Let $\vec{\theta}$ be the vector of all parameters, n the number of nodes, q_i the number of value combinations of the parent set π(X_i), and r_i the number of values of node X_i; let θ_ijk = P(X_i = k | π(X_i) = j) be the probability that X_i takes its k-th value when its parent nodes take their j-th value combination, and let D_l, l = 1, 2, ..., m, be the sample data; the log-likelihood function of the vector $\vec{\theta}$ is then:

$l(\vec{\theta} \mid D) = \log \prod_{l=1}^{m} P(D_l \mid \vec{\theta}) = \sum_{l=1}^{m} \log P(D_l \mid \vec{\theta}) \qquad (1)$

The log-likelihood function reaches its maximum when θ_ijk takes the following value:

$\theta_{ijk}^{*} = \begin{cases} \dfrac{m_{ijk}}{\sum_{k'=1}^{r_i} m_{ijk'}}, & \sum_{k'=1}^{r_i} m_{ijk'} > 0 \\ \dfrac{1}{r_i}, & \text{otherwise} \end{cases} \qquad (2)$

where m_ijk is the number of samples in the flight data satisfying X_i = k and π(X_i) = j; if the historical flight data contain no sample with π(X_i) = j, that is $\sum_{k'=1}^{r_i} m_{ijk'} = 0$, the parameters are set to a uniform distribution; the Bayesian network parameters are determined by formula (2), thereby obtaining the transit time estimation model.

4. The method for dynamically estimating flight transit time based on a Bayesian network according to claim 1, characterized in that: in Step 3, the inference on the transit time estimation model that yields the transit time estimates when the five variables (arrival delay of the preceding flight, arrival time period of the preceding flight, planned transit time, departure airport and aircraft type) take different values is:

3.1 perform inference on the transit time estimation model to obtain the probability distribution of the transit time when the five variables (arrival delay of the preceding flight, arrival time period of the preceding flight, planned transit time, departure airport and aircraft type) take given values;

3.2 take the expected value of the transit time:

$E(t) = \sum_{i} P(i) \cdot T_i \qquad (3)$

where E(t) is the expected transit time when the five variables (arrival delay of the preceding flight, arrival time period of the preceding flight, planned transit time, departure airport and aircraft type) take given values, T_i is the midpoint of the i-th transit time interval, and P(i) is the probability that the transit time falls in the i-th interval; the expected value E(t) is taken as the transit time estimate when those five variables are fixed;

3.3 insert the transit time estimates for the different conditions into the transit time rule base, which supplies the inputs for flight departure time prediction.

5. The method for dynamically estimating flight transit time based on a Bayesian network according to claim 1, characterized in that: in Step 4, the departure time of the arriving flight is predicted as follows: if the preceding flight arrives without delay, or its delay is less than 10 minutes, the estimated departure time of the next flight is its planned departure time; if the delay is greater than 10 minutes, the estimated transit time for the corresponding condition is looked up in the transit time rule base; if the actual arrival time plus the estimated transit time is earlier than the planned departure time of the next flight, the estimated departure time of the next flight is its planned departure time; if the actual arrival time of the preceding flight plus the estimated transit time is later than the planned departure time of the next flight, the estimated departure time of the next flight is the actual arrival time plus the estimated transit time; if the delay is greater than 10 minutes and the rule base contains no transit time for the corresponding condition, the estimated transit time is the planned transit time, and the estimated departure time of the next flight is the actual arrival time plus the planned transit time.

6. The method for dynamically estimating flight transit time based on a Bayesian network according to claim 1, characterized in that: in Step 5, the Bayesian estimation used to revise the Bayesian network parameters is:

Let $\vec{\theta}_{ij}$ be the sub-vector composed of $\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ijr_i}$, let α_ijk be the number of samples in the prior knowledge satisfying X_i = k and π(X_i) = j, and let the prior $P(\vec{\theta}_{ij})$ be the Dirichlet distribution $D[\alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i}]$; Bayes' formula gives:

$P(\vec{\theta} \mid D) = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{\int P(D, \vec{\theta}) \, d\vec{\theta}} = \frac{P(\vec{\theta}) P(D \mid \vec{\theta})}{P(D)} \qquad (4)$

where $P(\vec{\theta})$ is the prior probability distribution of the vector $\vec{\theta}$ and $P(\vec{\theta} \mid D)$ is its posterior probability distribution; since $P(\vec{\theta}_{ij} \mid D)$ follows the Dirichlet distribution $D[m_{ij1} + \alpha_{ij1}, m_{ij2} + \alpha_{ij2}, \ldots, m_{ijr_i} + \alpha_{ijr_i}]$, it follows that:

$E(\theta_{ijk} \mid D) = \frac{m_{ijk} + \alpha_{ijk}}{\sum_{k'=1}^{r_i} (m_{ijk'} + \alpha_{ijk'})} \qquad (5)$

The Bayesian network parameters are revised using formula (5), where m_ijk is the number of new samples satisfying X_i = k and π(X_i) = j.