Assuring different responses to queries sent in a data network
FIELD OF THE INVENTION
The field of the present invention relates to sending queries in a data network, such as in peer-to-peer (P2P) networks, and more specifically to a method of ensuring that a user receives a predetermined number of responses to a query sent by the user. Furthermore, an entity for providing such information is disclosed.
BACKGROUND OF THE INVENTION
It has been more and more common to set up data networks comprising more computers or other entities in decentralized networks. The task of searching for content in such decentralized networks is complicated due to the fact that no central server guides the searching process and accumulates results from different computers or entities in the data network, sort the results for instance by relevance, and present the results to the user. Thus, there is no primary server that will handle all queries and requests from a user and provide a completed result of any query or request to the user.
To achieve the desired results in such networks, and for a user to have the possibility of selecting between as many results as possible, the query may be sent through the entire network so that each match on each computer or entity is found. However, this is not practically feasible when the size of the network increases.
Another possibility is to search the network only until a threshold number of responses has been received by the entity sending the query. Thus, when an entity has responses matching the query, the results are returned and the query propagated to further entities with an adjusted threshold value. Hereby, a limited number of results is obtained, however, all the results may be identical, especially where a popular piece of content matches the query.
One solution has been proposed by C. Lindemann and O. P. Waldhorst, University of Dortmund, Proc. 2nd IEEE Conf. on Peer-to-Peer Computing, Sept. 2002. They disclose a method of searching for content specifically adapted to mobile applications. A query Q is sent to devices in the network, for example to a device B, and device B forwards the query to a device C. Due to the query request, results, that is query responses, obtained by device B, for example ql, q2, are sent to all neighboring devices, that is to both device C and
device A. The results or query responses achieved at device C, for example q2, q3, are also sent to device B. At device B, the correspondence between the results obtained at device B, and the results received from device C is checked and only the responses different from the results found at device B are selected and then selectively forwarded back to device A. In this method, an intermediate device filters the responses sent from other devices on the way back to the device sending the query.
OBJECT AND SUMMARY OF THE INVENTION
It is an object of the present invention to minimize the information sent in the network.
It is a further object of the present invention to ensure that a predetermined number of different responses are received by the requesting entity.
According to a first aspect of the invention, the above-mentioned and other objects are fulfilled by a method of of enabling a user to obtain at least a predetermined number of different responses to a query sent in a data network comprising a number of further entities, the method comprises the steps of receiving, at an entity, a query from a requesting entity, the query comprising a query expression and a threshold value representing the predetermined number of different responses to be obtained; and performing the query at the entity to obtain query responses; determining an updated query comprising identifying said query responses and/or and determining an updated threshold value; and transmitting said updated query to one or more further entities if the updated threshold value is above zerowhereby the user will receive at least the predetermined number of different responses when the query has been performed in the data network.
The method of enabling a user to obtain, or the method of obtaining, at least a predetermined number of different responses to a query sent in a data network, may be stored on a computer readable medium, so that the computer readable medium comprises or have stored therein instructions for causing a central processing unit to execute said method. In a preferred embodiment of the invention, the query is a search query. However, it is envisaged that the method may be used for any request or query sent in a data network. The query comprises in addition to the query, being for example a search string, also the threshold value being the predetermined number of different responses that a user would like to obtain. When the query has been performed on entities in the data network some query responses are obtained, and an updated query is generated, the updated query comprising the query information, for instance a search string, and the threshold value, and
additionally, the query comprises content identifiers for the already found results.
The method uses a threshold value so that a user may chose a predetermined number of different responses, which the user would like to obtain. Alternatively, the threshold value may be generated by the entity. Thus, the query is distributed in the network only until this predetermined number of different responses is obtained. It is an advantage of using this method that the time for searching is limited and further that the user is not presented with a long list of identical responses. This provides for an effective use of the resources in the system and furthermore, the information sent in the network is minimized. The threshold value may be generated by the entity according to the specific query. If the query is based on a content identifier,for example a unique content identifier for the specific content, there will be only one possible response, and the threshold value should be set to one (1). If the query is based on a title, it is not expected that there will be many pieces of content having the same title, so a relative small threshold value, such as for example five (5) should work properly. If the query is based on a keyword search, there may be many pieces of content matching the query, and the threshold value could therefore be set to a higher value, such as ten (10) or twenty (20) or fifty (50).
Setting the threshold value according to the query in an intelligent way as for example set out above, may improve the scalability of the method, since the query will not be propagated unnecessarily. The query responses may be sent to the requesting entity immediately upon performing of the query, that is the query response may be sent back to the requesting entity immediately after the query has been performed at the entity. Hereby, the query response is sent back before the updated query is generated and thus minimizing the delay in the network. The response may be sent directly back to the requesting entity, that is the entity sending out the original query, or alternatively, the query responses may be sent back through the same entities through which the query was originally propagated to reach this entity, thus the query response may be sent back along the route at which the query was propagated. The query response submitted back to the requesting entity may comprise query responses similar to query responses identified in the updated query, so that the requesting entity has multiple sources from which a specific response may be downloaded from. However, such query responses being identical to query responses already mentioned or listed in the updated query will not be considered to be new query responses and the threshold value will not be adapted, even though this query response is identified in the query response sent back to the requesting entity. Thus, the threshold value is decreased only by the
number of new query responses.
The query is updated before the query is forwarded to further entities in the network. The number of new different responses found at the entity is determined for example by way of comparison. The threshold value is updated to reflect the number of new different query responses that still needs to be found. The updated query thus comprises both the updated threshold value and an updated list of responses found so far matching the query.
The list of query responses identified in the updated query do preferably comprise all the query responses, or at least identifiers, such as content identifiers, for the query responses, found so far in the specific hub. Thus, the updated list comprises identification of the query responses obtained at the present entity being the entity sending the updated query and of the query responses being identified in the query sent to the entity sending the updated query, if any.
It is preferred that a unique content identifier identifies the query responses so that for example searchable content in the data network have been assigned unique content identifier values. These content identifiers could for example be Content Reference
Identifiers (CRIDs), such as CRIDs in a TVanytime network.
The query response submitted back to the requesting entity then comprises both the identifiers for the content, and an identifier for the entity, to say an entity identifier, at which the content was found. In this way, the query response comprises pointers for the content and depending on which content the user wants to receive, this content may be received from any of the entities comprising this content as listed in the query response.
Typically, a user will receive more different responses than provided for by the threshold value. This happens because the query is sent to different hubs in the same network, and since the different hubs have no information of the responses obtained in other hubs, the total number of responses will exceed the threshold value. In each hub, a number of different responses corresponding to, or exceeding the threshold value will be achieved.
However, some of these responses will most likely be identical and a reasonable number of responses is achieve. Because of the demand for scalability and performance, current network systems are typically designed or constructed so that this overhead is minimal. Examples of such a network system comprises the so-called hybrid systems where the entities are locally centralized and globally decentralized, and may for example be a JXTA hybrid network. In a preferred embodiment the network is a network of hard disk recorders, such as a network of DVD hard disk recorders. The network may comprise any type of computers recorders, playback devices, etc.
The method may further comprise the option of terminating the query after a pre-set maximum time. Hereby, a response is received within a reasonable time, even if the predetermined number of different responses are not found in the data network.
According to a further aspect of the present invention, a computer readable media having stored therein instructions for causing a central processing unit to execute the method according to one of the claims 1 to 9 is provided.
According to a further aspect of the present invention, an entity for enabling a user to obtain at least a predetermined number of different responses to a query sent in a data network comprising a number of further entities, the entity comprising a receiver adapted for receiving a query from a requesting entity, the query comprising a query expression and a threshold value representing the predetermined number of different responses to be obtained; and query means for performing the query at the entity to obtain new query responses; and determining means for determining an updated query comprising identifying said query responses and determining an updated threshold value; and a transmitter for transmitting said updated query to one or more further entities if the updated threshold value is above zero whereby the user will receive at least the predetermined number of different responses when the query has been performed in the data network.
Thus, the entity needs to be able to receive a query sent in the network and the query means for performing the query may be any processor adapted to search at the entity, such as for searching a storage of the entity. The query means may preferably be a method that searches a database of the entity for example for keywords. The previously identified query responses are identified by identifiers which preferably are unique identifiers, and by comparing the new query responses found at the entity with the previously identified query responses, for example by use of the identifier values, the number of new query responses are determined. The threshold value generator thus generates a new threshold value by subtracting the number of new query responses from the received threshold value. Finally, the updated query is transmitted to new entities using any transmitter. Preferably, the same transmitter also provides for the transmitting of the query response to the requesting entity.
BRIEF DESCRIPTION OF THE DRAWINGS
It is envisaged that all features and functions described in connection with the first and second aspect of the invention applies equally to the other aspects of the invention and vice versa. The invention will be described in more detail hereinafter with reference to examples of embodiment but to which the invention is not limited.
Fig. 1 shows a number of entities in a data network.
Fig. 2 shows a blockdiagram of the structure of an entity according to the invention.
Fig. 3 shows a flowchart describing a method according to the invention.
DESCRIPTION OF EMBODIMENTS
In Fig. 1 a basic data network 100 comprising peers or entities A to E is shown. Peer A is the requesting entity and the network comprises two hubs, a first hub comprising the entities: Peer B, Peer D, and Peer E, and a second hub comprising the entity Peer C.
In Fig. 2, an entity 1 for use in a data network 100 is shown. The entity 1 enables a user to obtain at least a predetermined number of different responses to a query Q sent in a data network 100 comprising a number of entities A to E. The entity 1 comprises a receiver 2 for receiving a query Q from a requesting entity 9, the query Q comprises a query expression QE and a threshold value T representing the predetermined number of different responses to be obtained and optionally an identification of previously obtained responses PR.
The entity 1 further comprises query means 3 and a storage 4 storing audio files in a database in MP3 format in the present case. Other contents and formats are likely possible, for instance audio files according to MPEG, AAC, OGG as well as video files in MPEG format or DivX or other similar format. By means of the query means 3 the query Q is then performed on basis of the query expression QE, whereby the query means 3 interact with the storage 4, thereby obtaining query responses QR.
Further, entity 1 comprises determining means 15 for determining an updated query UQ comprising identifying said query responses QR and determining an updated threshold value UT. In the present case the determining means 15 comprises a threshold value generator 5 for generating the new or updated threshold value UT and comprises identification means 16 for identifying said query responses QR. The identification means 16 identifies a number of new query responses IQR based on query responses QR obtained from the query means 3 and on previously obtained responses (PR) submitted by the query Q. The threshold value generator 5 determines an updated threshold value UT in that the received threshold value T is decreased by the number or quantity of new different query responses IQR. The query expression QE and the updated threshold value UT and the new different query responses IQR are fed to composing means 6 of the determining means 15 for
composing the updated query UQ.
Furthermore, entity 1 comprises a transmitter 7 for transmitting said updated query UQ to one or more further entities 8 if the updated threshold value UT is above zero. The transmitter 7 in this case is also adapted for transmitting the query responses QR to the requesting entity 9. A separate transmitter or separate submission means may be provided for transmitting the query responses QR to the requesting entity 9.
In other words, an updated query UQ comprising identification of the new different query responses QR found at the entity 1 and identification of query responses obtained at prior entity/entities 9 is provided and the updated query UQ also comprises the updated threshold value UT. The updated query UQ is then transmitted by transmitter 7 to a further peer 8 in the data network. The transmitter 7 thus propagates the updated query UQ comprising the query expression QE, identified query responses IQR and updated threshold value UT.
Now, referring to Fig. 3, a flow chart 300 describing a method according to the present invention is explained.
The process starts according to a step 301 if a query Q is received by entity 1. The query Q comprises a query search information or query expression QE to be searched and a threshold value T representing the predetermined number of different responses to be obtained and an identification of previously obtained responses PR, which identification of previously obtained responses PR are CRIDs in a TV-anytime network used as unique content identifiers in this case. The query is performed at the entity 1, step 302, in order to obtain query responses QR. In a step 304, it is determined if there has been found new query responses IQR. If no new query responses IQR are found, then the process continues at a step 309. If new query responses IQR are found according to step 304, the new different query responses IQR are obtained in step 305 and the threshold value T is then updated according to a step 306 obtaining an updated threshold value UT. Then, an updated query UQ is determined or composed according to a step 307, the updated query UQ comprising an updated list of content identifiers of the the new different query responses IQR and the updated threshold value UT and the query expression QE. If the updated threshold value UT is above zero, the updated query UQ is sent to one or more further entities 8 according to a step 308. If the updated threshold value UT is zero or below, the process is ended at a step 310 and the query is not propagated any further.
The flow chart 300 is further illustrated in Table 1 :
Table 1
An example of the method according to the present invention is now provided.
Peer A being the requesting entity, is looking for five (5) different so called "mp3" audio files; so a search query Q is issued to Peer B, the query Q comprising a denotation of the mp3 files which are to be searched for, to say a query expression QE, and the threshold value T of the amount of five (5) in this case. The resulting response should then be a list of at least five (5) identifiers identifying five (5) pieces of content matching the query expression QE. An exemplary syntax could be:
search (*.mp3, 5)
Peer B has two (2) responses, that is the responses with the unique identifiers "songl.mp3" and "song2.mp3". Peer B sends these query responses QR back to Peer A. An exemplary syntax for the response to Peer A could be:
response (peer B, [songl.mp3, song2.mp3])
Peer B then generates an updated query UQ comprising the responses QR, or pointers to or identifiers for, the two (2) responses, and an updated threshold value UT as being the
threshold value five (5) decreased by the quantity of different query responses, which quantity of different query responses is two (2) in this case. The updated query response may thus be:
search (*.mp3, 3, [songl.mp3, song2.mp3])
wherein the single number three (3) denotes the updated threshold value UT. In other words, the updated search query UQ now requests for three (3) different mp3 files which are not among the songs or files already found, to say are not songl.mp3 or song2.mp3 in this case. This updated query UQ is then transmitted to a Peer D and a Peer E. Peer D has three (3) query responses, being "songl.mp3", "song3.mp3", and "song4.mp3". Peer D sends all these three (3) query responses, or at least pointers to or identifiers for the responses, including the already known response: "songl.mp3", back to peer A, knowing that two (2) new different query responses, that is "song3.mp3" and "song4.mp3", were found. The search transmitted propagated from Peed D to further peers (not shown) may thus be:
search (*.mp3, 1, [songl.mp3, song2.mp3, song3.mp3, song4.mp3])
When the last new different query response, as indicated with the single number one (1) in the expression, has been found, then no further queries are forwarded in the system, since the threshold value is then decreased to zero. Since two new different responses may be found at a subsequent peer, also six (6) different responses may be provided back to peer A. The threshold value will then decrease to minus one (-1) being interpreted by the system as a zero. In that the search query has also been sent to a Peer C, being another hub having no information about the query responses obtained in the Peer B hub, there may be up to five different responses from Peer C. Some of these response may be similar to responses found at the Peer B hub, so that ten (10) different responses will rarely be received by Peer A. in total