Abstract
The TOR Project allows content to be published anonymously, which has led to the proliferation of illegal material whose authorship is almost impossible to identify. In this paper, we present and make publicly available TOIC (TOr Image Categories), an image dataset comprising five illegal classes, built from crawled TOR addresses. To classify those images, we use Edge-SIFT features jointly with dense SIFT descriptors, where Edge-SIFT is computed on an “edge image” obtained with the Compass Operator. We demonstrate that a Bag of Visual Words model trained on the early fusion of dense SIFT and Edge-SIFT features yields an efficient model for detecting and categorising illegal content in the TOR network. We then estimate the Compass Operator radius for the complete dataset before computing Edge-SIFT, and we demonstrate that classification performance is higher when only the most salient edge information is extracted. We tested our proposal on both TOIC and the public Butterflies dataset to verify the consistency of the method, obtaining accuracy increases of 2.32 and 7.00 points respectively. With the Ideal Radius Selection, we obtained an accuracy of 92.49% on the TOIC dataset, which makes this approach an attractive tool for detecting and categorising illegal content in the TOR network.
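As a rough illustration of the pipeline described above, the following minimal sketch builds a Bag of Visual Words representation from an early fusion of two descriptor sets. It rests on several assumptions that are not taken from the paper: early fusion is interpreted here as pooling both descriptor sets of an image before building a single vocabulary, the descriptors are random 8-dimensional stand-ins rather than real dense SIFT or Edge-SIFT vectors, and the helper names (`early_fusion`, `fake_descriptors`, `bovw_histogram`) are hypothetical. The classifier that would consume the histograms is omitted.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, descriptor):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(centers[i], descriptor))


def kmeans(descriptors, k, iterations=10, seed=0):
    """Plain Lloyd-style k-means used to build the visual vocabulary."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(centers, d)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centre if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(descriptors, vocabulary):
    """L1-normalised histogram of visual-word assignments for one image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest(vocabulary, d)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def early_fusion(dense_sift, edge_sift):
    """Assumed form of early fusion: pool both descriptor sets of an image."""
    return list(dense_sift) + list(edge_sift)


def fake_descriptors(rng, n, dim=8):
    """Hypothetical stand-in for a real feature extractor (random vectors)."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
# Four toy "images", each with 30 dense-SIFT-like and 20 Edge-SIFT-like vectors.
images = [early_fusion(fake_descriptors(rng, 30), fake_descriptors(rng, 20))
          for _ in range(4)]
# One shared vocabulary learned from the fused descriptors of all images.
vocabulary = kmeans([d for image in images for d in image], k=5)
histograms = [bovw_histogram(image, vocabulary) for image in images]
```

Each entry of `histograms` is a fixed-length vector that a classifier could be trained on; in a real setting the random vectors would be replaced by descriptors extracted from the images and from their edge images.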
Notes
- 1.
Script by Franck Michel - https://www.flickr.com/photos/franckmichel/6855169886.
Acknowledgement
This research was funded by the framework agreement between the University of León and INCIBE (Spanish National Cybersecurity Institute) under addendum 22. We thank Francisco J. Rodríguez and Antonio Sepúlveda, from INCIBE, for their help and valuable comments.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Fidalgo, E., Alegre, E., González-Castro, V., Fernández-Robles, L. (2018). Illegal Activity Categorisation in DarkNet Based on Image Classification Using CREIC Method. In: Pérez García, H., Alfonso-Cendón, J., Sánchez González, L., Quintián, H., Corchado, E. (eds) International Joint Conference SOCO’17-CISIS’17-ICEUTE’17 León, Spain, September 6–8, 2017, Proceeding. SOCO 2017, ICEUTE 2017, CISIS 2017. Advances in Intelligent Systems and Computing, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-319-67180-2_58
DOI: https://doi.org/10.1007/978-3-319-67180-2_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67179-6
Online ISBN: 978-3-319-67180-2
eBook Packages: Engineering, Engineering (R0)