Dissimilarities for Web Usage Mining

  • Conference paper
Data Science and Classification

Abstract

Obtaining a set of homogeneous classes of pages from the browsing patterns identified in web server log files can be very useful for analysing the organization of a site and its adequacy to user needs. Such classes are often obtained from a dissimilarity measure between the visited pages, defined via the visits extracted from the logs. There are, however, many possible ways of defining such a measure. This paper presents an analysis of different dissimilarity measures, based on a comparison between the semantic structure of the site identified by experts and the clusterings constructed by standard algorithms applied to the dissimilarity matrices generated by the chosen measures.
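
To make the pipeline in the abstract concrete, here is a minimal Python sketch of its two ends: a page-to-page dissimilarity computed from visits, and an agreement score between a usage-based partition and an expert partition. The abstract does not say which measures the paper studies, so the Jaccard dissimilarity is an assumption chosen for illustration, and the plain Rand index is a simple stand-in for whatever comparison criterion the authors use. The function names, the toy visits, and the two labelings are all hypothetical.

```python
from itertools import combinations


def jaccard_dissimilarity(visits, pages):
    """Return {(p, q): d} with d = 1 - |V_p & V_q| / |V_p | V_q|.

    V_p is the set of visits containing page p. Jaccard is an assumed,
    illustrative choice, not necessarily a measure used in the paper.
    """
    visit_sets = {p: {i for i, v in enumerate(visits) if p in v} for p in pages}
    dissim = {}
    for p, q in combinations(pages, 2):
        union = visit_sets[p] | visit_sets[q]
        inter = visit_sets[p] & visit_sets[q]
        # Pages that are never visited are treated as maximally dissimilar.
        dissim[(p, q)] = 1.0 - len(inter) / len(union) if union else 1.0
    return dissim


def rand_index(labels_a, labels_b):
    """Fraction of page pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two pages
    in the same class, or both put them in different classes.
    """
    pairs = list(combinations(list(labels_a), 2))
    agree = sum(
        (labels_a[x] == labels_a[y]) == (labels_b[x] == labels_b[y])
        for x, y in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: each visit is the set of pages seen in one session.
visits = [{"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"d"}]
pages = ["a", "b", "c", "d"]

# Hypothetical partitions: one from experts, one from a clustering of the
# dissimilarity matrix (the clustering step itself is not shown here).
expert = {"a": 0, "b": 0, "c": 1, "d": 1}
usage = {"a": 0, "b": 0, "c": 0, "d": 1}

if __name__ == "__main__":
    for pair, d in jaccard_dissimilarity(visits, pages).items():
        print(pair, round(d, 3))
    print("Rand index:", rand_index(expert, usage))
```

In a fuller experiment, the dissimilarity matrix would be passed to a standard clustering algorithm, and the resulting partition scored against the expert one for each candidate measure.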

Copyright information

© 2006 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Rossi, F., De Carvalho, F., Lechevallier, Y., Da Silva, A. (2006). Dissimilarities for Web Usage Mining. In: Batagelj, V., Bock, HH., Ferligoj, A., Žiberna, A. (eds) Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-34416-0_5
