CN112182427A - Data processing method, device, electronic device and storage medium - Google Patents
- Publication number
- CN112182427A
- Authority
- CN
- China
- Prior art keywords
- poi
- candidate
- wifi
- feature
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/021—Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Remote Sensing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention discloses a data processing method and device. The method comprises the following steps: for a candidate POI, identifying candidate WiFi with positioning positions around the coordinates of the candidate POI; extracting WiFi characteristics used for representing the candidate WiFi popularity from the candidate WiFi; and identifying the state of the candidate POI corresponding to the candidate WiFi on the basis of the WiFi characteristics. The invention can quickly and timely make offline judgment on the POI which is actually offline, thereby quickly and timely finding out the POI which is in an offline state.
Description
Technical Field
Embodiments of the present invention relate to the field of data processing technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
As internet maps are used more widely, timely updating of POI data becomes increasingly important. A POI (Point of Interest) reflects the demand for personalized services that emerges once geographic information systems have developed to a certain stage. POI information mainly includes the name, category, coordinates, classification, and similar attributes. Comprehensive POI information is a precondition for an information-rich navigation map: timely POI data can alert a user to road branches and to details of surrounding buildings, and makes it easier to search for the places the user needs during navigation, so that the most convenient and unobstructed route can be selected in path planning. Updated POI data generally falls into three types: newly added data, error-correction data, and invalid (failure) data. Of these, failure data is usually the hardest to update, because it is difficult to directly collect evidence showing that a POI has become invalid. If failure data is not updated in time, map users are greatly inconvenienced. Mining POI failure data in large batches and with high quality, without direct evidence, is therefore a task that map data updating must address.
In the related art, when the invalid POI data is mined, the offline POI is judged mainly by means of collected POI information, but the offline state of the POI is difficult to judge in time if the POI information is not collected and reported in time.
Therefore, the scheme for mining the offline POI (i.e., failed POI) in the related art generally has the problem that it is difficult to make offline judgment on the actual offline POI in time and quickly.
Disclosure of Invention
The embodiment of the invention provides a data processing method, which aims to solve the problem that offline judgment on an actual offline POI is difficult to timely and quickly make when the offline POI is mined in the related art.
In order to solve the above problem, in a first aspect, an embodiment of the present invention provides a data processing method, including:
for a candidate POI, identifying candidate WiFi with positioning positions around the coordinates of the candidate POI;
extracting WiFi characteristics used for representing the candidate WiFi popularity from the candidate WiFi;
and identifying the state of the candidate POI corresponding to the candidate WiFi on the basis of the WiFi characteristics.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
a first identification module for identifying, for a candidate POI, candidate WiFi whose location positions are located around coordinates of the candidate POI;
a first extraction module, configured to extract, for the candidate WiFi, a WiFi feature that is used to characterize the candidate WiFi hotness;
and the second identification module is used for identifying the state of a candidate POI corresponding to the candidate WiFi on the basis of the WiFi characteristics.
In a third aspect, an embodiment of the present invention further discloses an electronic device, which includes a memory, a processor, and a computer program that is stored in the memory and can be run on the processor, and when the processor executes the computer program, the data processing method according to the embodiment of the present invention is implemented.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the data processing method disclosed in the present invention.
In the embodiment of the invention, by identifying the candidate WiFi around the candidate POI, the information of the candidate WiFi can be timely and quickly acquired, so that the WiFi characteristics representing the WiFi popularity of the candidate WiFi around the candidate POI can be timely and quickly acquired, and the state of the corresponding candidate POI is reversely deduced by utilizing the WiFi characteristics, thereby realizing the timely judgment and acquisition of the state of the candidate POI; moreover, the state of the candidate POI corresponding to the candidate WiFi can be timely and accurately reflected by the change condition of the WiFi characteristic representing the WiFi popularity, so that the state of the corresponding candidate POI is judged by utilizing the WiFi characteristic, offline judgment can be rapidly and timely carried out on the POI which is actually offline, and the POI which is in the offline state can be timely and rapidly found.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a flow chart of the steps of a data processing method of one embodiment of the present invention;
FIG. 2 is a flow chart of the steps of a data processing method of another embodiment of the present invention;
FIG. 3 is a graphical illustration of a time decay model of one embodiment of the present invention;
FIG. 4 is a block diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 5 schematically shows a block diagram of a computing processing device for performing a method according to the present disclosure; and
FIG. 6 schematically shows a storage unit for holding or carrying program code implementing a method according to the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In practical applications, POI changes are very common; for example, a shop going out of business takes the shop's POI offline. Many business lines depend heavily on POI data, so grasping the offline state of a POI in time is very important. WiFi equipment, meanwhile, is an indispensable part of daily life.
To this end, an embodiment of the present invention provides a data processing method. As shown in FIG. 1, the method may include step 101, step 102, and step 104:
Step 101, for a candidate POI, identifying candidate WiFi whose positioning positions are around the coordinates of the candidate POI;
in an embodiment, the candidate POI may be any POI in a data source whose state (online or offline) needs to be determined by the method of the embodiment of the present invention;
in another embodiment, the candidate POI may also be a POI suspected to be in an offline state, obtained by preliminarily screening the full set of POIs through a preset scoring policy.
The number of candidate POIs may be one or more, preferably more than one;
in this step, POI information (including but not limited to name, category, coordinates, etc.) of each candidate POI may be obtained; in addition, WiFi information can also be acquired, and in the step, candidate WiFi is mined for the candidate POI mainly by using POI information and WiFi information of the candidate POI, so that a corresponding relation (or a hooking relation) between the candidate POI and the candidate WiFi is formed; in the correspondence, one POI may be hooked with one or more WiFi (for example, each room of a hotel has WiFi equipment, and then the POI of the hotel may be hooked with multiple WiFi);
in one example, the advantage of big data carrying positioning information can be fully exploited: candidate WiFi that can be hooked with the candidate POI is mined from multiple data sources, producing a high-quality hitching relationship between WiFi and POI. Once the hitching relationship has been generated, the corresponding POI can be found from the WiFi information carried in a user request, so that WiFi landmarks can be combined with POI landmarks to support upper-layer services such as pan-indoor positioning.
Since the candidate POI has coordinate information and each WiFi has a positioning position, candidate WiFi close to the coordinates of the candidate POI can be mined from the WiFi by comparing the two kinds of coordinates, thereby forming a hitching relationship between the candidate POI and the candidate WiFi. Here, "close" may mean that the positioning position of the WiFi is around the coordinates of the candidate POI; for example, WiFi within a circle centered on the coordinates of the candidate POI, with a preset distance as its radius, is taken as candidate WiFi hooked to the candidate POI. This step is thus equivalent to mining a list of WiFi around the coordinates of the candidate POI. One candidate POI may be hooked with one or more candidate WiFi.
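The radius-based search described above can be illustrated with a minimal Python sketch. The record layout (`lat`, `lon`, `mac`, `ssid` keys), the function names, the sample data, and the 100-meter default radius are all assumptions made for illustration; the haversine great-circle distance is one common way to compare two coordinates and is not stated in the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidate_wifi_for_poi(poi, wifi_list, radius_m=100.0):
    """Return the WiFi records whose positioning position lies within radius_m of the POI."""
    return [
        w for w in wifi_list
        if haversine_m(poi["lat"], poi["lon"], w["lat"], w["lon"]) <= radius_m
    ]


# Hypothetical sample data: one WiFi a few tens of meters away, one about a kilometer away.
poi = {"name": "Sunrise Cafe", "lat": 39.9000, "lon": 116.4000}
wifi_list = [
    {"mac": "aa:bb:cc:00:00:01", "ssid": "SunriseCafe-Guest", "lat": 39.9003, "lon": 116.4002},
    {"mac": "aa:bb:cc:00:00:02", "ssid": "FarAwayNet", "lat": 39.9100, "lon": 116.4000},
]
hooked = candidate_wifi_for_poi(poi, wifi_list)
```

A production system would more likely use a spatial index than a linear scan; the linear scan is kept here only to make the distance test explicit.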
Optionally, in executing step 101, for a candidate POI, a candidate WiFi whose positioning location is around the coordinates of the candidate POI and whose WiFi name is similar to the POI name of the candidate POI may be identified.
When mining candidate WiFi hooked with a candidate POI, not only the distance between the POI and WiFi may be referred to, but also the similarity between the name of the POI and the name of WiFi (e.g., SSID (Service Set Identifier)) may be combined. For example, candidate WiFi whose locating position is around the coordinates of the candidate POI and whose SSID is similar to the name of the candidate POI (the similarity criterion may be that text similarity is greater than threshold 1, or semantic similarity is greater than threshold 2) may be mined from the business data to attach to the candidate POI.
In the embodiment of the present invention, for a candidate POI, candidate WiFi whose positioning location is around the coordinates of the candidate POI and whose name is similar to the POI name may be mined and hooked to the candidate POI. The candidate WiFi hooked to a candidate POI is then not only located around the POI's coordinates but also similarly named, so the candidate WiFi identified for any candidate POI is, with high probability, WiFi provided by the physical premises (e.g., a store) of that POI. When the store closes, the WiFi it provided also disappears. The WiFi characteristics of the candidate WiFi therefore reflect the real state of the corresponding candidate POI (online or offline) more accurately, and candidate POIs in the offline state can be identified timely and accurately using those characteristics.
Optionally, in an application scenario, when generating the hitching relationship between a candidate POI and candidate WiFi, a pre-trained WiFi feature extraction model may be used to extract the WiFi features of each WiFi to be detected from an underlying data source, for example from at least one kind of service data among taxi-hailing data, inertial navigation data, collected data, and positioning data. The WiFi features may include, but are not limited to, at least one of the following: SSID, number of active connections, positioning position, MAC (Media Access Control) address segment, whether the WiFi is indoors, whether the WiFi is mobile WiFi, encryption mode, time characteristics, and the like;
when the WiFi feature extraction model is generated, WiFi profiles of various service types (each WiFi profile comprising various WiFi features) may be mined from the underlying data source and used as a training set to train a preset neural network model, and the WiFi feature extraction model is obtained once the model converges. The types of WiFi profile may include, but are not limited to, at least one of: a taxi-hailing class, a goods-receiving class, a ticket-checking class, a check-in class, and the like.
Each candidate POI also has its own POI characteristics, such as name, category, and coordinates. The POI characteristics of the candidate POI and the WiFi features of the WiFi to be detected may be input into a pre-trained prediction model, which uses an algorithm strategy (including but not limited to at least one of frequent item sets, marginal distance, clustering, and sorting) to predict, from the WiFi to be detected, the candidate WiFi that can be hooked with the candidate POI, and outputs the hitching relationship between the candidate POI and the candidate WiFi. In this hitching relationship, the candidate WiFi is WiFi that with high probability belongs to the address of the candidate POI.
In the embodiment of the invention, big data can be fully utilized and a high-quality hitching relationship between WiFi and POI can be mined by combining various data sources, so that the state of a POI can be judged from the WiFi characteristics of the WiFi located around the POI and hooked with it, and POIs in the offline state can be found timely and quickly.
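The name-similarity condition can be sketched as below. This is only an illustration of the text-similarity variant: `difflib.SequenceMatcher` from the Python standard library is a stand-in for whatever similarity measure is actually used, and the `normalize` helper, the 0.6 threshold, and the sample SSIDs are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and keep only letters and digits, so 'Sunrise Cafe' compares as 'sunrisecafe'."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def name_similarity(poi_name, ssid):
    """Text similarity in [0, 1] between a POI name and a WiFi SSID."""
    return SequenceMatcher(None, normalize(poi_name), normalize(ssid)).ratio()


def filter_by_name(poi_name, nearby_wifi, threshold=0.6):
    """Keep only nearby WiFi whose SSID similarity to the POI name exceeds the threshold."""
    return [w for w in nearby_wifi if name_similarity(poi_name, w["ssid"]) > threshold]


# Hypothetical WiFi already known to be near the POI.
nearby = [{"ssid": "SunriseCafe-Guest"}, {"ssid": "TP-LINK_5F3A"}]
similar = filter_by_name("Sunrise Cafe", nearby)
```

A semantic-similarity check (the second criterion mentioned in the text) would replace `name_similarity` with an embedding-based score; that variant is not shown.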
Step 102, extracting, from the candidate WiFi, WiFi characteristics used for characterizing the popularity (heat) of the candidate WiFi;
the WiFi characteristics used for characterizing the candidate WiFi popularity may include, but are not limited to, characteristics of at least one of the following dimensions: the number of times the WiFi is actively connected by clients over a period of time (e.g., daily), the number of times the WiFi is scanned by clients over a period of time (e.g., daily), and the positioning heat over a period of time (e.g., daily);
the positioning heat may be the number of clients (i.e., the number of users) located within a preset geographic range of the positioning location of the WiFi (e.g., within 100 meters of the positioning location); it may also be the number of clients located within a preset geographic range of the coordinates of the candidate POI hooked with the WiFi (e.g., the number of users located within a hundred meters of the candidate POI hooked with the candidate WiFi).
Step 104, identifying the state of the candidate POI corresponding to the candidate WiFi on the basis of the WiFi characteristics.
The WiFi feature in this step represents the heat of the candidate WiFi, and once a store goes offline, the heat of the WiFi provided by that store drops sharply compared with before. The state of the candidate POI hooked with the candidate WiFi can therefore be determined to be online or offline according to the variation trend of the WiFi feature of the candidate WiFi.
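As a rough illustration of the three heat dimensions, the sketch below aggregates raw events for one WiFi into daily features. The event format `(day, kind, client_id)`, the event kinds, and the sample data are hypothetical; positioning heat is counted here as the number of distinct clients located in range on a given day, which is one reading of "the number of clients".

```python
from collections import defaultdict


def daily_heat_features(events):
    """Aggregate raw events into per-day heat features for one WiFi.

    Each event is a (day, kind, client_id) tuple, with kind one of
    "connect" (active connection), "scan", or "locate" (client positioned in range).
    Returns {day: (active_connects, scans, positioning_heat)}.
    """
    rows = defaultdict(lambda: {"connects": 0, "scans": 0, "clients": set()})
    for day, kind, client_id in events:
        row = rows[day]
        if kind == "connect":
            row["connects"] += 1
        elif kind == "scan":
            row["scans"] += 1
        elif kind == "locate":
            row["clients"].add(client_id)  # distinct clients only
    return {
        day: (row["connects"], row["scans"], len(row["clients"]))
        for day, row in sorted(rows.items())
    }


# Hypothetical events for a single WiFi over two days.
events = [
    ("2020-08-01", "connect", "u1"),
    ("2020-08-01", "scan", "u2"),
    ("2020-08-01", "scan", "u3"),
    ("2020-08-01", "locate", "u1"),
    ("2020-08-01", "locate", "u1"),
    ("2020-08-02", "scan", "u2"),
]
features = daily_heat_features(events)
```

A falling trend in these daily tuples is the kind of signal the state identification step relies on.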
In the embodiment of the invention, by identifying the candidate WiFi around the candidate POI, the information of the candidate WiFi can be timely and quickly acquired, so that the WiFi characteristics representing the WiFi popularity of the candidate WiFi around the candidate POI can be timely and quickly acquired, and the state of the corresponding candidate POI is reversely deduced by utilizing the WiFi characteristics, thereby realizing the timely judgment and acquisition of the state of the candidate POI; moreover, the state of the candidate POI corresponding to the candidate WiFi can be timely and accurately reflected by the change condition of the WiFi characteristic representing the WiFi popularity, so that the state of the corresponding candidate POI is judged by utilizing the WiFi characteristic, offline judgment can be rapidly and timely carried out on the POI which is actually offline, and the POI which is in the offline state can be timely and rapidly found.
In addition, compared with the problems of large acquisition difficulty and high acquisition cost in POI information acquisition in the related art, the method provided by the embodiment of the invention can be completed on a data level, can avoid performance and data consumption in actual acquisition, and has the advantages of simplicity and easiness.
Optionally, in the step 104, a pre-trained Logistic Regression (LR) model 1 may be adopted to identify the POI status of the candidate POI hooked with the candidate WiFi by using the WiFi characteristics of the candidate WiFi;
the logistic regression model 1 is used for scoring the input WiFi features, identifying the POI states corresponding to the WiFi features of which the scoring results are greater than or equal to a first preset threshold as offline states, and in addition, identifying the POI states corresponding to the WiFi features of which the scoring results are less than the first preset threshold as online states.
In the process of implementing the invention, the inventor found that the prediction result of an LR model is a probability between 0 and 1; that an LR model can handle both continuous and categorical independent variables; and that an LR model is easy to use and interpret. The method of an embodiment of the present invention therefore uses an LR model for modeling, generating LR model 1.
Wherein, the objective function of the LR model 1 is the sigmoid function shown in formula 1:
$$g(x) = \frac{1}{1 + e^{-x}} \qquad \text{(formula 1)}$$
The sigmoid function is an S-shaped curve whose value g(x) lies between 0 and 1: g(x) = 0.5 when x = 0, and g(x) quickly approaches 0 or 1 as x moves away from 0. The inventors therefore used the sigmoid function as the objective function for modeling, so that the classification result can be interpreted as a probability.
In this embodiment, x in formula 1 is the WiFi feature of a candidate WiFi that is input to LR model 1. As exemplified above, the WiFi feature characterizing the heat of the candidate WiFi may have one or more dimensions, so x is a multi-dimensional feature vector made up of the WiFi features of the candidate WiFi. For example, if the WiFi feature of a candidate WiFi includes three dimensions, namely the number of times the WiFi is actively connected by clients over a period of time (e.g., daily), the number of times the WiFi is scanned by clients over a period of time (e.g., daily), and the positioning heat over a period of time (e.g., daily), then x for that candidate WiFi is the feature vector of these three dimensions. In other words, x represents all the WiFi features of a candidate WiFi. The WiFi features of the candidate WiFi hooked with each candidate POI are input into LR model 1 in the form of a multi-dimensional feature vector, and LR model 1 scores the vector using formula 1 to obtain the scoring result g(x). If g(x) is greater than or equal to 0.5, the POI state of the candidate POI corresponding to the candidate WiFi is the offline state; otherwise it is the online state.
Optionally, when the LR model is used to generate LR model 1, LR model 1 may be trained with formula 1 as the objective function and with the WiFi features of WiFi hooked to POIs that have gone offline as the training set, yielding the converged LR model 1. LR model 1 is thus built on changes in WiFi heat (the WiFi features representing WiFi heat), so that it can accurately determine the state of the candidate POI reflected by the input WiFi features.
In the embodiment of the present invention, a trained logistic regression model may be used to score the WiFi characteristics of the candidate WiFi, so that the state of the corresponding candidate POI can be reasonably determined by comparing the scoring result with the first preset threshold: when the scoring result is greater than or equal to the first preset threshold, the state of the candidate POI is determined to be the offline state. Because the LR model is built on changes in WiFi heat (the WiFi characteristics representing WiFi heat), it can accurately determine the state of the candidate POI reflected by the input WiFi characteristics.
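The scoring and thresholding just described can be sketched as follows. The sketch assumes, as is usual for logistic regression, that the feature vector is first reduced to a scalar by a learned weighted sum plus bias before the sigmoid is applied; the text does not spell this step out. The weight values, bias, and sample feature vectors are hypothetical and chosen only so that low WiFi heat yields a high offline score.

```python
import math


def sigmoid(z):
    """Sigmoid function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def score_offline(features, weights, bias):
    """Score a WiFi feature vector; higher means more likely offline."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def poi_state(features, weights, bias, threshold=0.5):
    """Map the score to a state using the first preset threshold (0.5 in the text)."""
    return "offline" if score_offline(features, weights, bias) >= threshold else "online"


# Hypothetical model parameters: more connections, scans, and positioning heat
# all push the score toward "online".
WEIGHTS = (-0.05, -0.01, -0.02)
BIAS = 2.0

busy = (60, 300, 120)  # daily active connects, daily scans, positioning heat
quiet = (0, 5, 3)

busy_state = poi_state(busy, WEIGHTS, BIAS)
quiet_state = poi_state(quiet, WEIGHTS, BIAS)
```

With these made-up parameters the busy WiFi maps to the online state and the quiet one to the offline state.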
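To make the training step concrete, here is a minimal sketch of fitting a logistic regression by batch gradient descent on the log-loss. Several things are assumptions: the text names only offline POIs as the training set, whereas a binary classifier needs both classes, so hypothetical online samples are added; the features are assumed to be pre-scaled to roughly [0, 1]; and the learning rate, epoch count, and data are invented for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(x, w, b):
    """Probability that the POI behind feature vector x is offline."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train_lr(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, m = len(samples[0]), len(samples)
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = predict(x, w, b) - y  # derivative of log-loss w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b


# Hypothetical scaled features: (active connects, scans, positioning heat).
OFFLINE = [(0.0, 0.1, 0.05), (0.05, 0.0, 0.1)]  # label 1: POI has gone offline
ONLINE = [(0.9, 0.8, 0.7), (0.7, 0.9, 0.8)]     # label 0: POI still online
w, b = train_lr(OFFLINE + ONLINE, [1, 1, 0, 0])
```

In practice an established library implementation would normally be preferred over a hand-written loop; the loop is shown only to expose what "training until convergence" involves.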
Optionally, the method according to the embodiment of the present invention may further include:
Step 103, in the case that a first POI associated with a business order exists in the candidate POIs, extracting order features of the business order corresponding to the first POI;
for example, if a certain fast food restaurant has a takeout order, the POI of the fast food restaurant will be associated with the takeout order, and for a first POI associated with a service order in the candidate POIs, not only the WiFi characteristics of the candidate WiFi associated with the first POI can be extracted through the above step 102, but also the order characteristics of the service order associated with the first POI can be extracted through this step.
Wherein the initial order characteristics include, but are not limited to, at least one of: the amount of orders over a period of time (e.g., daily), the date of the order of the last business order, etc.
When the initial order features are extracted, the initial order features can be obtained by inquiring a log library.
Optionally, the initial order characteristics may be used directly as the order characteristics of step 103, or the initial order characteristics may be scored and the scoring result used as the order characteristics of step 103.
Optionally, in one embodiment, when the initial order features are scored, the initial order features (here, the order date of the last business order of the first POI associated with a business order is taken as an example) may be scored using a pre-trained time decay model.
The attenuation function adopted by the time attenuation model is formula 2:
$$T = T_0 \cdot \left(\tfrac{1}{2}\right)^{(t - t_0)/\alpha} \qquad \text{(formula 2)}$$
wherein T0 is the initial score of the first POI (e.g., a store corresponding to the POI); t0 is the order date (i.e., order time) of the latest business order of the first POI; t is the current time, so t − t0 is the time interval between the order time of the latest business order of the first POI and the current time; T is the score of the first POI (i.e., the score for the date of its last business order); and α is the half-life of the time attenuation model;
the curve expressed by formula 2 is shown in FIG. 3, where the y-axis represents T and the x-axis represents (t − t0). As can be seen from FIG. 3, the longer the time interval between the last business order of the POI and the current time, the lower the score of the POI on the business order.
The half-life α in formula 2 is the value of x at which y equals 0.5 in FIG. 3, which is 52 here; that is, α = 52.
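The order-date scoring can be sketched as below, under two stated assumptions: that the attenuation function is an exponential decay with half-life α, so that the score halves every α time units, and that time is measured in days (the unit of the value 52 is not stated in the text). The function name, the default initial score of 1.0, and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def order_recency_score(last_order, today, half_life_days=52.0, initial_score=1.0):
    """Score the date of the last business order with a half-life decay.

    Assumed form: score = initial_score * 0.5 ** (elapsed_days / half_life_days).
    """
    elapsed_days = (today - last_order).days
    return initial_score * 0.5 ** (elapsed_days / half_life_days)


last_order = date(2020, 1, 1)
score_same_day = order_recency_score(last_order, last_order)
score_one_half_life = order_recency_score(last_order, last_order + timedelta(days=52))
score_two_half_lives = order_recency_score(last_order, last_order + timedelta(days=104))
```

The three sample scores are 1.0, 0.5, and 0.25: the longer a POI has gone without an order, the lower its order score.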
When the time attenuation model is trained, the training set may be obtained by querying a log library for the daily order quantity of each POI that has business orders and has gone offline. A time attenuation model is then established on the order date of the latest business order of each such POI: the time interval between the latest business order of each POI and the current time is determined from the order date, the time interval of each POI is used as a sample, and an initial model is trained with formula 2 as the objective function; after convergence, the value of the half-life α is obtained, yielding the trained time attenuation model.
The training and use of the time attenuation model are both described above taking, as the initial order feature, the order date of the last business order of the first POI. In other embodiments, when the initial order feature includes, for example, the order quantity within a period of time, the model is trained and used on a principle similar to that described in detail above, so the details are not repeated here.
In the embodiment of the invention, for a first POI with a business order, the order date of the last business order of the first POI is scored by the time attenuation model, so that the order scoring result can assist in judging the state of the first POI: the lower the order scoring result, the higher the probability that the first POI is in the offline state. Taking the order scoring result as the order feature of the first POI and combining it with the WiFi feature of the first POI to judge the state comprehensively can further improve the accuracy of judging the state of a POI that has business orders.
Then, in the present embodiment, when the above step 104 is executed, it is realized through S11 and/or S12:
S11, for the first POI, identifying the state of the first POI based on the WiFi characteristics and the order characteristics corresponding to the first POI;
S12, when a second POI which is not associated with a business order exists in the candidate POIs, identifying the state of the second POI based on the WiFi characteristics corresponding to the second POI.
In the present invention, there is no limitation on the execution sequence between step 103 and step 101, and there is no limitation on the execution sequence between step 103 and step 102, and step 104 is executed after step 102 and step 103.
In one example, as shown in FIG. 2, if the candidate POIs are the full set of POIs, candidate WiFi may be hooked to each candidate POI, and a business order may additionally be hooked to each first POI that has business orders. WiFi features are then extracted for the candidate WiFi, such as an active-connection feature (the number of times the WiFi is actively connected by clients within a period of time, e.g., every day) and a scan feature (the number of times the WiFi is scanned by clients within a period of time, e.g., every day). Order features are extracted from the business orders (including POS (point of sale) receipts, store-to-store purchase orders, and take-out orders) hooked to the first POI. Since both candidate WiFi and business orders are hooked to a first POI, the multidimensional feature corresponding to the first POI comprises a WiFi feature and an order feature; since only candidate WiFi is hooked to a second POI, which has no business order, the multidimensional feature corresponding to the second POI comprises a WiFi feature only. The multidimensional features corresponding to each candidate POI (here, the first POI or the second POI) are then input into the scoring model for scoring, so that whether the first POI is online or offline can be determined from the scoring result of its multidimensional features, and likewise for the second POI.
In the embodiment of the present invention, when there is a first POI associated with a business order in the candidate POIs, an order feature of the business order corresponding to the first POI may also be extracted; for the first POI, the state of the first POI can be identified based on the WiFi feature and the order feature corresponding to the first POI, so that the state of the POI can be reflected according to WiFi popularity, the state of the first POI can be further assisted and judged by combining with order data of the POI, and the judgment accuracy of the state of the POI is improved; if a second POI not associated with a service order exists in the candidate POIs, identifying the state of the second POI based on the WiFi feature corresponding to the second POI, and reflecting the state of the POI according to WiFi popularity.
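The split between S11 and S12 amounts to choosing which features go into the scored vector. A minimal sketch follows, with a hypothetical helper name and hypothetical feature values; how the downstream model handles vectors of different lengths for first and second POIs is not specified here and is left out.

```python
def build_input_features(wifi_features, order_score=None):
    """Assemble the feature vector to be scored for one candidate POI.

    S11 (first POI, has business orders): WiFi features plus the order feature.
    S12 (second POI, no business orders): WiFi features only.
    """
    features = list(wifi_features)
    if order_score is not None:
        features.append(order_score)
    return features


# Hypothetical values: (active connects, scans, positioning heat) and an order score.
first_poi_input = build_input_features((3, 40, 12), order_score=0.25)
second_poi_input = build_input_features((3, 40, 12))
```

Appending the order score as one extra dimension matches the description of the order feature as a one-dimensional vector combined with the multi-dimensional WiFi feature vector.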
Optionally, in an embodiment, when executing S11, a pre-trained LR model 2 may be used to identify the POI state corresponding to an input feature of the first POI, where the input feature includes the WiFi feature and the order feature corresponding to the first POI;
optionally, in an embodiment, when executing S12, the pre-trained LR model 2 may be used to identify the POI state corresponding to an input feature of the second POI, where the input feature includes the WiFi feature corresponding to the second POI;
the LR model 2 is configured to score the input features, to identify as the offline state the POI state corresponding to an input feature whose scoring result is greater than or equal to a second preset threshold, and to identify as the online state the POI state corresponding to an input feature whose scoring result is less than the second preset threshold.
The training process of the LR model 2 is similar to that of the LR model 1 (the usage process is also similar; refer to the training and usage process of the LR model 1 for details). The only difference in training is that each training sample in the training set used by the LR model 2 includes not only the WiFi features of the WiFi attached to a POI that has gone offline, but also the order feature of the business order associated with that POI (i.e., the order feature described in step 103, such as the order scoring result). Consequently, x in formula 1 above is still a multidimensional feature vector, but it contains both the multidimensional WiFi feature vector and a one-dimensional order feature vector (i.e., the order scoring result). The trained LR model 2 can therefore score a POI based on both the WiFi features of its attached WiFi and the order feature of its attached business order, and judge from the scoring result whether the POI is in the offline state.
The second preset threshold may be the same as the first preset threshold, that is, 0.5.
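As a rough sketch of how such a model could be trained, the following fits a logistic regression by stochastic gradient descent on samples whose feature vector holds WiFi features followed by one order scoring result. It assumes the standard logistic (sigmoid) form, since formula 1 is not reproduced in this passage; the toy data, the label convention (1 = offline), the learning rate, and the function names are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Score in (0, 1); a higher score means more likely offline."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train_lr(samples: Sequence[Sequence[float]], labels: Sequence[int],
             lr: float = 0.5, epochs: int = 1000) -> Tuple[List[float], float]:
    """Fit weights and bias by per-sample gradient steps on the log loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(x, w, b) - y  # gradient of the log loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Toy rows: [scaled connection count, scaled scan count, order scoring result].
samples = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1],   # POIs known to be offline
           [0.9, 1.0, 0.8], [1.0, 0.8, 0.9]]   # POIs known to be online
labels = [1, 1, 0, 0]                          # assumed convention: 1 = offline
w, b = train_lr(samples, labels)
```

In practice a library implementation with regularization would normally be preferred; the hand-written loop is used here only to keep the sketch self-contained.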
The difference in the model usage process is as follows: when the LR model 2 is used to score a first POI, the input feature x of the LR model 2 is formed by combining (concatenating) the vector of the WiFi features of the candidate WiFi attached to the first POI with the vector of the order features (such as the order scoring result) of the business order associated with the first POI; the LR model 2 scores the input feature x to obtain a scoring result, and if the scoring result is greater than or equal to 0.5, the state of the first POI is the offline state; otherwise, the state of the first POI is the online state;
when the LR model 2 is used to score the second POI, since the second POI is not associated with a business order and has no order feature, the position in the input feature x reserved for the order feature vector may be set to 0, and the position reserved for the WiFi feature vector is filled with the vector of the WiFi features of the candidate WiFi attached to the second POI; the LR model 2 scores the input feature x to obtain a scoring result, and if the scoring result is greater than or equal to 0.5, the state of the second POI is the offline state; otherwise, the state of the second POI is the online state;
of course, when S12 is executed, the above-mentioned LR model 1 trained in advance may also be used to identify a POI status corresponding to an input feature of the second POI, where the input feature includes a WiFi feature corresponding to the second POI.
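The usage process described above can be illustrated with a short sketch: the input vector is the WiFi feature vector followed by one order-feature slot, the slot is set to 0 for a POI without business orders, and a score of at least 0.5 maps to the offline state. The sigmoid scoring form, the weights, and the function names are assumptions for illustration only.

```python
import math
from typing import List, Optional, Sequence

THRESHOLD = 0.5  # second preset threshold from the text


def build_input(wifi_vec: Sequence[float],
                order_score: Optional[float]) -> List[float]:
    """Concatenate WiFi features with the order slot (0 when no order exists)."""
    order_slot = 0.0 if order_score is None else float(order_score)
    return list(wifi_vec) + [order_slot]


def lr_score(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score in (0, 1), computed in a numerically stable way."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def poi_state(x: Sequence[float], weights: Sequence[float], bias: float,
              threshold: float = THRESHOLD) -> str:
    """Offline when the score reaches the threshold, online otherwise."""
    return "offline" if lr_score(x, weights, bias) >= threshold else "online"


# Hypothetical weights: high WiFi activity and high order score pull toward online.
weights, bias = [-2.0, -2.0, -3.0], 2.0
first_poi_x = build_input([0.9, 0.8], order_score=0.7)   # has business orders
second_poi_x = build_input([0.1, 0.0], order_score=None)  # no business orders
states = (poi_state(first_poi_x, weights, bias),
          poi_state(second_poi_x, weights, bias))
```

Because the order slot is always present, the same model and weight vector can score both kinds of POI, which is what allows LR model 2 to be used in S11 and S12 alike.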
In the embodiment of the invention, the pre-trained logistic regression model scores the WiFi features of the candidate WiFi attached to a POI together with the order features of the attached business order, so that the state of the POI can be reasonably judged by comparing the scoring result with the second preset threshold: if the scoring result is greater than or equal to the second preset threshold, the POI is judged to be in the offline state. Because the LR model is built on the change in WiFi popularity (represented by the WiFi features), assisted by the order features of the business orders attached to the POI, it can accurately judge the POI state reflected by the input WiFi features and order features.
The present embodiment discloses a data processing apparatus, as shown in fig. 4, the apparatus includes:
a first identifying module 41, configured to identify, for a candidate point of interest (POI), candidate WiFi whose positioning location is around the coordinates of the candidate POI;
a first extraction module 42, configured to extract, for the candidate WiFi, a WiFi feature that characterizes the popularity of the candidate WiFi;
a second identifying module 43, configured to identify, based on the WiFi feature, the state of the candidate POI corresponding to the candidate WiFi.
Optionally, the first identifying module 41 is further configured to identify, for a candidate POI, a candidate WiFi whose positioning location is around the coordinates of the candidate POI and whose WiFi name is similar to the POI name of the candidate POI.
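One way to picture this optional filter is the sketch below, which keeps a WiFi only if it lies within a radius of the POI coordinates and its name resembles the POI name. The 100 m radius, the 0.6 similarity cutoff, the haversine distance, and the use of `difflib.SequenceMatcher` are assumptions chosen for illustration; the text does not specify how "around" or "similar" is measured.

```python
import math
import re
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0

# (wifi_id, wifi_name, latitude, longitude); hypothetical record layout.
Wifi = Tuple[str, str, float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] after dropping case, spaces and punctuation."""
    def norm(s: str) -> str:
        return re.sub(r"[\W_]+", "", s.lower())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def candidate_wifi(poi_name: str, poi_lat: float, poi_lon: float,
                   wifis: Sequence[Wifi], radius_m: float = 100.0,
                   min_similarity: float = 0.6) -> List[str]:
    """Return ids of WiFi that are both near the POI and similarly named."""
    return [wid for wid, name, lat, lon in wifis
            if haversine_m(poi_lat, poi_lon, lat, lon) <= radius_m
            and name_similarity(poi_name, name) >= min_similarity]


wifis = [
    ("w1", "BlueCafe_5G", 39.9003, 116.4002),   # near, similar name
    ("w2", "TP-LINK_1234", 39.9001, 116.4001),  # near, unrelated name
    ("w3", "BlueCafe", 39.9500, 116.4000),      # similar name, far away
]
ids = candidate_wifi("Blue Cafe", 39.9000, 116.4000, wifis)
```

Requiring both conditions reduces the chance that a neighbouring shop's WiFi is attributed to the candidate POI.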
Optionally, the apparatus further comprises:
the second extraction module is used for extracting order features of the business orders corresponding to the first POI under the condition that the first POI associated with the business orders exists in the candidate POIs;
the second identification module 43 includes:
a first identification sub-module, configured to identify, for the first POI, a status of the first POI based on the WiFi characteristic and the order characteristic corresponding to the first POI;
and the second identification submodule is used for identifying the state of a second POI based on the WiFi characteristics corresponding to the second POI under the condition that the second POI which is not associated with a business order exists in the candidate POI.
Optionally, the second identifying module 43 is further configured to identify, by using a pre-trained logistic regression model, a POI state corresponding to the input feature of the candidate POI;
wherein, when the candidate POI comprises the first POI, the input features comprise the WiFi features and the order features corresponding to the first POI;
when the candidate POI comprises the second POI, the input feature comprises the WiFi feature corresponding to the second POI;
the logistic regression model is used for scoring the input features, and identifying POI states corresponding to the input features of which the scoring results are greater than or equal to a preset threshold value as offline states.
The data processing apparatus disclosed in the embodiments of the present invention is configured to implement each step of the data processing method described in each of the above embodiments of the present invention, and for specific implementation of each module of the apparatus, reference is made to the corresponding step, which is not described herein again.
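The division of the apparatus into modules can be sketched as a small pipeline in which each module is a pluggable callable. The class name, the callable signatures, and the stub implementations below are hypothetical; they only mirror the roles of modules 41, 42 and 43 described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class DataProcessingApparatus:
    # Role of the first identifying module 41: POI id -> candidate WiFi ids.
    identify_wifi: Callable[[str], List[str]]
    # Role of the first extraction module 42: WiFi ids -> WiFi feature vector.
    extract_features: Callable[[List[str]], List[float]]
    # Role of the second identifying module 43: feature vector -> POI state.
    identify_state: Callable[[Sequence[float]], str]

    def run(self, candidate_pois: Sequence[str]) -> Dict[str, str]:
        """Run the three modules in order for every candidate POI."""
        states: Dict[str, str] = {}
        for poi in candidate_pois:
            wifi_ids = self.identify_wifi(poi)
            features = self.extract_features(wifi_ids)
            states[poi] = self.identify_state(features)
        return states


# Stub modules for illustration only.
nearby = {"poi-1": ["w1", "w2"], "poi-2": []}
apparatus = DataProcessingApparatus(
    identify_wifi=lambda poi: nearby.get(poi, []),
    extract_features=lambda wifi_ids: [float(len(wifi_ids))],
    identify_state=lambda x: "online" if x[0] > 0 else "offline",
)
result = apparatus.run(["poi-1", "poi-2"])
```

Passing the modules in as callables keeps the sketch agnostic about whether they are realized in software, hardware, or a combination.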
In the embodiment of the invention, by identifying the candidate WiFi around a candidate POI, information about the candidate WiFi can be acquired quickly and in a timely manner, so that the WiFi features representing the popularity of the candidate WiFi around the candidate POI can also be acquired quickly, and the state of the corresponding candidate POI can be inferred from these WiFi features, thereby achieving timely determination of the state of the candidate POI. Moreover, changes in the WiFi features representing WiFi popularity reflect the state of the corresponding candidate POI in a timely and accurate manner, so that judging the state of the candidate POI from the WiFi features allows a POI that has actually gone offline to be judged offline promptly, and POIs in the offline state to be found quickly.
Correspondingly, the invention also discloses an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the data processing method according to any one of the above embodiments of the invention. The electronic device can be a PC, a mobile terminal, a personal digital assistant, a tablet computer and the like.
The invention also discloses a computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the data processing method according to any of the above-mentioned embodiments of the invention.
The embodiments in the present specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Since the device embodiment is basically similar to the method embodiment, its description is relatively brief; for the relevant points, refer to the corresponding description of the method embodiment.
The data processing method and apparatus provided by the present invention are introduced in detail, and a specific example is applied in the text to explain the principle and the implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Various component embodiments of the disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components in a computing processing device according to embodiments of the present disclosure. The present disclosure may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present disclosure may be stored on a computer-readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
For example, FIG. 5 illustrates a computing processing device that may implement methods in accordance with the present disclosure. The computing processing device conventionally includes a processor 1010 and a computer program product or computer-readable medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps of the above-described method. For example, the storage space 1030 for program code may include respective program code 1031 for implementing the various steps of the above method. The program code can be read from or written to one or more computer program products. These computer program products comprise a program code carrier such as a hard disk, a compact disc (CD), a memory card or a floppy disk. Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 6. The storage unit may have memory segments, memory spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 5. The program code may, for example, be compressed in a suitable form. Typically, the storage unit comprises computer-readable code 1031', i.e., code that can be read by a processor such as the processor 1010, which, when executed by a computing processing device, causes the computing processing device to perform the steps of the method described above.
Reference herein to "one embodiment," "an embodiment," or "one or more embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Moreover, it is noted that instances of the phrase "in one embodiment" do not necessarily all refer to the same embodiment.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the disclosure may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering; these words may be interpreted as names.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010859043.0A CN112182427A (en) | 2020-08-24 | 2020-08-24 | Data processing method, device, electronic device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112182427A (en) | 2021-01-05 |
Family
ID=73925483
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010859043.0A Pending CN112182427A (en) | 2020-08-24 | 2020-08-24 | Data processing method, device, electronic device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112182427A (en) |
Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050251331A1 (en) * | 2004-04-20 | 2005-11-10 | Keith Kreft | Information mapping approaches |
| CN106302664A (en) * | 2016-08-03 | 2017-01-04 | 百度在线网络技术(北京)有限公司 | A kind of method and apparatus of point of interest inefficacy verification |
| CN107357797A (en) * | 2016-05-10 | 2017-11-17 | 滴滴(中国)科技有限公司 | A kind of information-pushing method and device |
| CN107704589A (en) * | 2017-09-30 | 2018-02-16 | 百度在线网络技术(北京)有限公司 | Interest point failure method for digging, device, server and medium based on waybill |
| CN107729459A (en) * | 2017-09-30 | 2018-02-23 | 百度在线网络技术(北京)有限公司 | Map interest point failure method for digging, device, equipment and computer-readable recording medium |
| CN109582880A (en) * | 2018-12-04 | 2019-04-05 | 百度在线网络技术(北京)有限公司 | Interest point information processing method, device, terminal and storage medium |
| CN110276023A (en) * | 2019-06-20 | 2019-09-24 | 北京百度网讯科技有限公司 | POI change event discovery method, device, computing device and medium |
| CN110597943A (en) * | 2019-09-16 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Interest point processing method and device based on artificial intelligence and electronic equipment |
| CN110674232A (en) * | 2018-06-14 | 2020-01-10 | 百度在线网络技术(北京)有限公司 | Map interest point processing method, map interest point processing device, server and storage medium |
| CN111460056A (en) * | 2019-01-22 | 2020-07-28 | 阿里巴巴集团控股有限公司 | Outdated POI mining method and device |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113056102A (en) * | 2021-03-18 | 2021-06-29 | 广州市爱浦电子科技有限公司 | Method for manufacturing direct-insert micro-power module power supply |
| CN113056102B (en) * | 2021-03-18 | 2021-11-09 | 广州市爱浦电子科技有限公司 | Method for manufacturing direct-insert micro-power module power supply |
| CN116385743A (en) * | 2023-04-28 | 2023-07-04 | 抖音视界有限公司 | Positioning method, device, computer readable medium and electronic device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210105 |

