Abstract
Style transfer based on open-source programs still has room to evolve. This research performs style transfer using open-source code from GitHub and additionally supports the development of an online AI Attraction Page, a Windows version, an Android platform, and Intel NCS, strengthening computation and supporting multiple platforms. It implements style transfer not only on static images but also dynamically on video, and speeds up style-transfer inference performance on the web page. In addition, the literature review explores elements of aesthetic perception and applies them to parameter setting. The results of this study show that when the content image weight is 7.5 and the style image weight is 120, the inferred image retains the characteristics of the original image while producing a new blended style. Moreover, when the content-to-style weight ratio is fixed and the style image weight is increased beyond 10,000, a thin-film color effect may appear. With 32 filters, the extracted color and style show the most appropriate proportion and state. When the style image is resized to 410 × 256, close to the size of the content image, the original style features become more prominent. Finally, keeping approximately 25% free space in the style image can produce a stronger texture effect after training.
1 Introduction
In recent years, style transfer has flourished and has become a popular field of computer vision in both academia and industry. The literature reveals that previous studies mostly started at the technical level, while recent work has begun to explore aesthetic principles on top of enhancing technique. Setting calculation parameters has thus become a new research subject: to produce good visual effects, adjustments have so far been made based on past experience [14]. This study reviews aesthetic factors to find the most appropriate key style elements as a reference for parameter adjustment. The contributions of this research are: (1) based on aesthetic principles, exploring how adjusting different style and content parameters affects the images generated by style transfer; and (2) implementing dynamic style transfer on video and identifying key factors for noise reduction.
2 Literature Review
2.1 Convolutional Neural Network
A Convolutional Neural Network (CNN), which gradually evolved from the Neural Network (NN), is a mathematical model that simulates the neural system of the human or animal brain [17]. In 1989, Yann LeCun successfully applied a CNN to recognize handwritten digits, an important basis for machine vision [19]. In recent years, due to improved hardware computing performance and advances in deep-learning algorithms, neural networks have again become a hot topic. Their overall structure can be divided into three parts: the convolution layer, the pooling layer, and the fully connected layer [18]. The convolution layer extracts image features: the filter, also known as the convolution kernel, performs a weighted sum over the pixels in the area it covers, and the resulting feature maps vary with the filter size, the stride of the slide, and the number of outputs. The pooling layer takes over the features from the convolution layer and compresses them, reducing the number of calculation parameters and the risk of overfitting. During training, a loss function determines the difference between the calculated result and the expected answer and penalizes the prediction deviation. The fully connected layer integrates and connects the previously learned features.
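For illustration, the following is a minimal sketch of this three-part structure in Python (PyTorch); the layer sizes are arbitrary choices for a 32 × 32 input, not those of any model used in this study.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Minimal convolution -> pooling -> fully connected structure."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolution layer: the 3x3 kernel slides over the image with
        # stride 1 and produces 16 output feature maps.
        self.conv = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        # Pooling layer: halves each spatial dimension, compressing the
        # features and reducing parameters and overfitting.
        self.pool = nn.MaxPool2d(kernel_size=2)
        # Fully connected layer: integrates the learned features
        # (16 channels x 16 x 16 after pooling a 32x32 input).
        self.fc = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(x.flatten(start_dim=1))
```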
2.2 Style Transfer
Style transfer transfers the features of a selected image to another image or video through convolution calculations. Gatys, Ecker, and Bethge opened style transfer to new application areas [8]. Subsequent studies put forward feed-forward networks and defined output loss functions [16, 23], while others proposed faster calculation methods [10] that greatly reduce training time and cost. The image-analogies framework divides a set of images into two sides, one as training data and the other as a filter, which can produce a similar image-filter effect [12]. This method can transfer the texture of artistic images, or of ordinary non-artistic images, to another image [6]. These approaches, however, can each transfer only one specific style. To this end, Dumoulin, Shlens, and Kudlur proposed a neural network that can produce a variety of style changes and adapt to various artistic styles, allowing users to generate new styles through various combinations [5].
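To make these loss functions concrete, below is a minimal sketch in Python (PyTorch), assuming the Gram-matrix formulation of Gatys et al. [8]; the variable names are ours, not from any of the cited implementations.

```python
import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    # features: (channels, height, width) feature map from one CNN layer.
    c, h, w = features.shape
    f = features.view(c, h * w)
    return (f @ f.t()) / (c * h * w)  # normalized channel correlations

def content_loss(gen_feat: torch.Tensor, content_feat: torch.Tensor) -> torch.Tensor:
    # Content is compared feature-to-feature at a chosen layer.
    return torch.mean((gen_feat - content_feat) ** 2)

def style_loss(gen_feat: torch.Tensor, style_feat: torch.Tensor) -> torch.Tensor:
    # Style is compared through Gram matrices, which discard spatial
    # layout and keep only texture statistics.
    return torch.mean((gram_matrix(gen_feat) - gram_matrix(style_feat)) ** 2)
```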
2.3 Video Style Transfers
Compared with image style transfer, video style transfer involves one more key factor, time, so smoothness is very important: the connection between successive frames of the played video must be considered [14]. The original algorithm of Gatys et al. tends to cause noise, omissions, or flicker, and its effect is unstable [9]. For this reason, a temporal loss function was introduced to correct the deviation [22]. A deep-learning training framework was also proposed to perform style transfer on videos of any length; it is furthermore suitable for virtual reality, producing 360° equirectangular images and videos [3].
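A hedged sketch of the temporal-loss idea: the stylized frame at time t is penalized for deviating from the previous stylized frame warped along the optical flow, wherever the flow is reliable. The warped frame and occlusion mask are assumed inputs from an external flow estimator, not computed here.

```python
import torch

def temporal_loss(stylized_t: torch.Tensor,
                  warped_prev: torch.Tensor,
                  mask: torch.Tensor) -> torch.Tensor:
    # mask is 1 where the flow is valid (no occlusion), 0 elsewhere,
    # so only reliably tracked pixels are forced to stay consistent.
    diff = mask * (stylized_t - warped_prev) ** 2
    return diff.sum() / mask.sum().clamp(min=1.0)
```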
2.4 Aesthetic
The perception of beauty involves aesthetic experience, aesthetic memory, visual experience, and cultural influence; the concept of aesthetics thus involves psychology [13]. For images calculated by style transfer, visual evaluation is another important topic of current research [14]. Studies of the visual complexity of images have found three major factors affecting human perception: composition, color, and content [11]. A color analysis proposed the MECOCO method for the works of Van Gogh; measuring complementary colors and analyzing the colors of paintings revealed that, with the passage of time, colors also change according to style [15]. As to the specific arrangement of texture and brushwork, a painting style may generate a unique rhythm of its own. Based on the aesthetic forms displayed, five elements have been sorted out: color, proportion, texture, structure, and composition [4]. Summarizing the above, this study concludes with three facets of aesthetic perception: (1) proportion: structure, weight, and balance; (2) color: lightness, chroma, and hue; (3) texture: gloss, texture, and aging marks. Under these proportion, color, and texture facets, this study applies different parameters to images after style transfer and examines the aesthetic differences, so as to sustain the features and strengthen style-transfer weight settings in the future.
3 Research Method
3.1 Purpose of Methodology
Values of the various parameters are set according to the aesthetic key factors obtained from the literature review. The three key factors, proportion, color, and texture, are explored to assess the aesthetic disparity among images calculated under different parameters.
3.2 Experiment Facility
The hardware used in the study is an Acer Predator Helios 300 with a Genuine Intel(R) CPU 0000 @ 2.40 GHz. The operating system is Windows 10 64-bit. The OpenVINO Model Optimizer environment is set up as well.
3.3 Test Method and Framework
This research is divided into two parts: image style transfer and video style transfer. The image style transfer uses the neural style transfer network framework proposed by Johnson, Alahi, and Fei-Fei [16], with the COCO dataset and the pre-trained VGG19 weights obtained from the open-source COCO and MatConvNet projects. The overall model training calculates the total loss function: optimize.py sets up the training model program and the feature points of each picture and style image in the dataset, while transform.py trains the final model weights through convolutions over each image in the dataset. The entire program executes recursively until the total loss approaches zero, as shown in Fig. 1.
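To make this recursive loop concrete, the following is a minimal sketch of one training iteration in Python (PyTorch); transform_net, vgg_features, the layer name relu3_3, and the default weights are placeholders echoing the values tested in Sect. 4, not the exact code of optimize.py or transform.py.

```python
import torch

def gram(f: torch.Tensor) -> torch.Tensor:
    # Batched Gram matrix of a (batch, channels, h, w) feature map.
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def train_step(content_batch, style_grams, transform_net, vgg_features,
               optimizer, alpha=7.5, beta=120.0):
    stylized = transform_net(content_batch)   # image transform network
    gen = vgg_features(stylized)              # dict of named VGG19 feature maps
    tgt = vgg_features(content_batch)
    # Content loss at one mid-level layer; style loss against the
    # pre-computed Gram matrices of the style image at several layers.
    l_content = torch.mean((gen["relu3_3"] - tgt["relu3_3"]) ** 2)
    l_style = sum(torch.mean((gram(gen[k]) - style_grams[k]) ** 2)
                  for k in style_grams)
    loss = alpha * l_content + beta * l_style  # weighted total loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()  # driven toward zero over the recursive loop
```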
The video style transfer is based on the framework of Ruder, Dosovitskiy, and Brox [21]. It is modified in Python from lengstrom's open-source code to form the research model. Noise pixels are added at random to a single image so the model learns the before-and-after difference compensation for each frame, eliminating irregularities in the video after the transfer, as shown in Fig. 2.
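One plausible reading of this noise-compensation step, sketched in Python (PyTorch): the network is trained so that the stylized outputs of a clean frame and its noisy copy stay close, which suppresses flicker in the transferred video. The noise level and the exact loss form are assumptions, not the study's code.

```python
import torch

def stability_loss(transform_net, frame: torch.Tensor,
                   noise_std: float = 0.03) -> torch.Tensor:
    # Add random noise pixels to the frame, then penalize any
    # difference between the stylized clean and noisy versions.
    noisy = frame + noise_std * torch.randn_like(frame)
    return torch.mean((transform_net(frame) - transform_net(noisy)) ** 2)
```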
3.4 Test Procedure
The first step is to store the material requiring style transfer in the style-image folder and the content-image folder, and to set the transfer parameters: (1) content image weight, (2) style image weight, and (3) num-base-channels. The second step is to execute the program, which finds the corresponding folders and images and imports the pre-trained neural network model. Each calculation takes about a day and a half, and 170 style-transfer images are produced in one operation.
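A sketch of these two steps in Python; the folder names and the run_training entry point are hypothetical stand-ins for the study's actual scripts.

```python
from pathlib import Path

STYLE_DIR = Path("styles")      # folder of style-transfer images
CONTENT_DIR = Path("contents")  # folder of content images

params = {
    "content_weight": 7.5,     # (1) content image weight
    "style_weight": 120.0,     # (2) style image weight
    "num_base_channels": 32,   # (3) nbc, see Sect. 4.4
}

style_images = sorted(STYLE_DIR.glob("*.jpg"))
content_images = sorted(CONTENT_DIR.glob("*.jpg"))
# run_training(style_images, content_images, **params)  # placeholder entry point
```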
4 Results
This study uses the above network framework as the basis for style-transfer research and adjusts parameter values manually before each calculation. The overall results are as follows:
4.1 Image Style Transfer - Control Weight Proportion
Calculations with different ratios α/β are performed on the original content and style images. The content image weight is fixed at 7.5, and the style weight is set to 75, 120, and 200 respectively. The output is shown in Fig. 3. With the content image weight held constant, a style weight of 75 yields a result closer to the content image. A style weight of 120 gives the best effect: it creates a new style while retaining the characteristics of the original image. As the style weight increases further, the output moves closer to the style image.
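For reference, under the standard weighted-loss formulation (our notation; the paper does not print the equation), these settings correspond to the following ratios:

\[ \mathcal{L}_{total} = \alpha\,\mathcal{L}_{content} + \beta\,\mathcal{L}_{style}, \qquad \frac{\alpha}{\beta} = \frac{7.5}{75} = 0.1,\quad \frac{7.5}{120} \approx 0.063,\quad \frac{7.5}{200} \approx 0.038 \]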
4.2 Image Style Transfer - Control Weight Ratio
First, fix the weight ratio of content image to style image at 0.15, then enlarge the style-weight value step by step to 1,000, 10,000, and 100,000 (fixing the ratio at 0.15 means, for example, a content weight of 150 at a style weight of 1,000). The output is shown in Fig. 4. Once the style weight exceeds 10,000, a thin-film color effect appears; as the weight continues to increase, the effect becomes more prominent.
4.3 Image Style Transfer - Control Style Image Size
Fix the content image and vary the style image size, calculating at 1024 × 638, 512 × 319, and 410 × 256 respectively. The output is shown in Fig. 5. At 1024 × 638, the content image features are more prominent. At 512 × 319, neither the content nor the style features stand out. At 410 × 256, close to the size of the content image, the original style features become the most prominent.
4.4 Image Style Transfer - Control Color Produced by Filter
Fix the content image and style image weights, and adjust the num-base-channels (nbc) value from \( 2^{0} \) to \( 2^{7} \). The output is shown in Fig. 6. When nbc is 1, 2, or 4, the displayed colors are monotonous. When nbc is 8 or 16, the displayed colors are richer. When nbc reaches 32 or more, the colors no longer change as the value increases (Fig. 6).
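One plausible reading of the nbc parameter, sketched in Python (PyTorch), is that it scales the number of filters at every stage of the transform network, following the 1×, 2×, 4× channel widths of Johnson et al.'s architecture [16]; this is an assumption about the fork's code, not a confirmed detail.

```python
import torch.nn as nn

def make_encoder(nbc: int = 32) -> nn.Sequential:
    # Each stage uses a multiple of nbc filters, so nbc sets how many
    # color/texture channels the network can extract overall.
    return nn.Sequential(
        nn.Conv2d(3, nbc, kernel_size=9, stride=1, padding=4), nn.ReLU(),
        nn.Conv2d(nbc, nbc * 2, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(nbc * 2, nbc * 4, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    )
```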
4.5 Image Style Transfer - Control Degrees of Free Space in Style Image
Fix the content image, and leave different proportions of free space, 25%, 40%, and 50%, in the original style images. The output is shown in Fig. 7. At 50% free space, the outline of the output image is less clear. At 40%, the style features gradually become prominent and the image outline begins to stand out. At approximately 25% free space, the image outline becomes the most obvious and the model can be trained to a higher texture effect (Fig. 7).
Fig. 7. Images produced by various degrees of free space in style images [2].
4.6 Style Transfer Applied on Web Page Result
The style transfer in this study not only builds on the original GitHub code but also develops a supporting online webpage, the AI Attraction Page (Fig. 8), along with Windows, Android, and Intel NCS versions. On top of the original static style transfer, dynamic video style transfer is also implemented; the real-time dynamic image shows the outcome immediately after style transfer (Fig. 8).
Fig. 8. Style transfer applied on webpage image display [1].
5 Research Conclusion and Discussion
This study finds that with a content image weight of 7.5 and a style image weight of 120, the calculated image has the best transfer effect: it creates a new style while retaining the original image's characteristics. As to texture, fixing the content-to-style weight ratio and increasing the style weight beyond 10,000 produces a thin-film color effect with a texture of saturation and transparency. When the style image and the content image are close in size, the style-transfer effect becomes more obvious. For color, 32 filters display the extracted color and style in the most appropriate proportion and state. As to quality, keeping around 25% free space in the style image trains a better texture quality. These findings provide reference values for future style-transfer research.
References
Acer AI Attraction Page. https://acerwebai.github.io/VangoghCrazyWorld-Web/. Accessed 23 Jan 2019
Taipei Zoo-Flamingo. http://bitvoice.blogspot.com/2012/03/blog-post_2189.html. Accessed 23 Jan 2019
Chen, D., Liao, J., Yuan, L., Yu, N., Hua, G.: Coherent online video style transfer. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1105–1114 (2017)
Chen, J.C.-H.: Opening the gates of aesthetics from the visual form. Educ. Pulse (2), 1–21 (2015)
Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. arXiv preprint arXiv:1610.07629 (2016)
Elad, M., Milanfar, P.: Style transfer via texture synthesis. IEEE Trans. Image Process. 26(5), 2338–2351 (2017)
Wharf Walk. http://www.emmabiggsmosaic.net/03_work/01_public_art.html. Accessed 23 Jan 2019
Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Guo, X., Qian, Y., Li, L., Asano, A.: Assessment model for perceived visual complexity of painting images. Knowl.-Based Syst. 159, 110–119 (2018)
Hertzmann, A., Jacobs, C.E., Oliver, N., Curless, B., Salesin, D.H.: Image analogies. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 327–340. ACM (2001)
Jacobsen, T.: Bridging the arts and sciences: a framework for the psychology of aesthetics (2006)
Jing, Y., Yang, Y., Feng, Z., Ye, J., Yu, Y., Song, M.: Neural style transfer: a review. IEEE Trans. Vis. Comput. Graph. p. 1 (2019)
Johnson, C.R., et al.: Image processing for artist identification. IEEE Sig. Process. Mag. 25(4), 37–48 (2008)
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Lawrence, S., Giles, C.L., Tsoi, A.C., Back, A.D.: Face recognition: a convolutional neural-network approach. IEEE Trans. Neural Networks 8(1), 98–113 (1997)
LeCun, Y.: Generalization and network design strategies. In: Connectionism in Perspective. Citeseer (1989)
LeCun, Y., et al.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4), 541–551 (1989)
Ready Mades. http://www.maggyhowarth.co.uk/readymades.html. Accessed 16 Jan 2019
Ruder, M., Dosovitskiy, A., Brox, T.: Artistic style transfer for videos. In: Rosenhahn, B., Andres, B. (eds.) GCPR 2016. LNCS, vol. 9796, pp. 26–36. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45886-1_3
Ruder, M., Dosovitskiy, A., Brox, T.: Artistic style transfer for videos and spherical images. Int. J. Comput. Vis. 126(11), 1199–1219 (2018)
Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.S.: Texture networks: feed-forward synthesis of textures and stylized images. In: ICML, vol. 1, p. 4 (2016)