Abstract
When users search for information, they often have difficulty expressing what they need as an exact query, and as a result they spend considerable time finding useful webpages. Previous techniques have not solved this problem effectively. In this paper, we propose an algorithm, RCW (Ranking technique for finding Correlated Webpages), that improves on previous ranking techniques. Our method retrieves not only basic webpages but also correlated webpages, so RCW can help users find meaningful information without exact queries. To find correlated webpages, the algorithm applies a novel technique for computing correlations among webpages. In the performance evaluation, we measure the precision, recall, and NDCG of RCW and compare them with those of another popular system. The results show that RCW finds more correlated webpages than the other method and achieves high precision, recall, and NDCG.
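For readers unfamiliar with the three evaluation measures named above, the following is a minimal sketch of their standard textbook definitions. It is not the evaluation code used for RCW. The function names, the page identifiers, and the graded relevance scores are all hypothetical and chosen only for illustration, and the NDCG variant shown (linear gain with a log2 rank discount) is one common formulation that the paper may or may not use.

```python
import math


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved list against a relevant set.

    Both arguments hold hypothetical page identifiers.
    """
    relevant = set(relevant)
    hits = sum(1 for page in retrieved if page in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical result list and relevance judgments, for illustration only.
    p, r = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})
    print(p, r)
    print(ndcg([1, 3]))
```

Precision and recall treat relevance as binary, whereas NDCG also rewards placing the most relevant pages near the top of the ranking, which is why ranking studies commonly report all three.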
Acknowledgments
This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF No. 2012-0003740 and 2012-0000478).
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this paper
Cite this paper
Pyun, G., Yun, U. (2013). Ranking Techniques for Finding Correlated Webpages. In: Kim, K., Chung, KY. (eds) IT Convergence and Security 2012. Lecture Notes in Electrical Engineering, vol 215. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-5860-5_130
DOI: https://doi.org/10.1007/978-94-007-5860-5_130
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-5859-9
Online ISBN: 978-94-007-5860-5
eBook Packages: Engineering, Engineering (R0)