
2018 24th International Conference on Pattern Recognition (ICPR)
Beijing, China, August 20-24, 2018

How do Convolutional Neural Networks Learn Design?

Shailza Jolly∗, Brian Kenji Iwana†, Ryohei Kuroki†, Seiichi Uchida†
∗University of Kaiserslautern, Kaiserslautern, Germany
Email: sjolly@rhrk.uni-kl.de
†Department of Advanced Information Technology, Kyushu University, Fukuoka, Japan
Email: {brian, kuroki, uchida}@human.ait.kyushu-u.ac.jp

Abstract—In this paper, we aim to understand the design principles in book cover images, which are carefully crafted by experts. Book covers are designed in a unique way, specific to genres, which conveys important information to their readers. By using Convolutional Neural Networks (CNN) to predict book genres from cover images, visual cues which distinguish genres can be highlighted and analyzed. In order to understand these visual cues contributing towards the decision of a genre, we present the application of Layer-wise Relevance Propagation (LRP) to the book cover image classification results. We use LRP to explain the pixel-wise contributions of book cover design and highlight the design elements contributing towards particular genres. In addition, with the use of state-of-the-art object and text detection methods, insights about genre-specific book cover designs are discovered.

I. INTRODUCTION

Visual design renders specific impressions to transmit information, which enriches a product's value. However, these visual designs, despite their importance, are rarely analyzed objectively or statistically. Analyzing these visual designs enables us to understand the information they carry.

An interesting target of visual design analysis is book cover design, where the design of a book cover can indicate the genre. Each book cover is carefully designed by typographers, and their designs represent the book contents in an intuitive way for better sales. This association of books with specific genres is based on the differences in their underlying cover designs [1]. A slight change in book cover design can reflect a change in book genre, which makes design learning a challenging task for book covers.

In order to understand the design elements used for machine-aided book cover classification, we employ Convolutional Neural Networks (CNN) [2]. In recent years, CNNs have achieved state-of-the-art results in isolated character recognition [3], [4] and large-scale image recognition [5], [6]. Notably, Iwana et al. [1] demonstrated that CNNs can be used for genre classification based on book cover images, although with a high level of difficulty. However, that study was subjective, and not enough explanation was given as to why the CNN performed as it did.

To interpret the reasoning behind a CNN's prediction, we use a method called Layer-wise Relevance Propagation (LRP) [7]. LRP decomposes the output function onto its input variables and highlights the input pixels contributing towards the network decision. It produces a layer-wise relevance heatmap by recursively multiplying the relevance of higher layers by the normalized feature maps of the target layer. The heatmaps can help us discover the input image elements which have an effect on the classification result.

The main contributions of this paper are threefold. Firstly, we classify the book cover images using one-vs-others classification with CNNs. Secondly, the models built by the CNNs are analyzed using LRP. With LRP, we demonstrate the design elements specifically relevant to the classification of book cover images and show that certain objects have a strong relevance to particular genres. Finally, we use state-of-the-art object detection and text detection methods, namely the Single Shot Multibox Detector (SSD) [8] and the Efficient and Accurate Scene Text Detector (EAST) [9], to quantitatively reinforce the results found by LRP. This reveals the specific elements by which CNNs classify book cover images for genre classification.

The organization is as follows. Section II provides related work in design understanding and genre classification as well as feature visualization of CNNs. Section III reviews the data and tools used for understanding book cover design. Section IV presents an analysis of the CNN's understanding of book cover design. In Section V, we demonstrate the use of LRP combined with SSD and EAST for quantitative analysis. Finally, Section VI draws a conclusion.

II. RELATED WORK

A. Genre Classification

Artistic style understanding and subjective genre classification is a budding field in machine learning. For example, recent attempts have been made to identify the artistic style and quality of paintings and photographs [10], [11] with neural network models. In addition, there have been trials to classify music by genre [12], [13], book covers by genre [1], movie posters by genre [14], paintings by genre [15], and text by genre [16], [17]. Also, in a general sense, document classification can be considered genre classification, and deep CNNs are the state of the art in the document classification domain [18]–[20].

B. Visualization inside of CNNs

There is a desire to visualize features and determine pixel-wise attention and relevance within the hidden layers of CNNs. However, this is not a straightforward task [21]. Erhan et al. [21] proposed using gradient descent to maximize a node's activation in order to visualize the features it employs. Similar work has been done for large-scale image classification [22]. Zeiler and Fergus [23] used deconvolutional neural networks to visualize features learned by CNNs. In addition, they created heatmaps by monitoring class changes under systematic cover-up of portions of the images. Class Activation Maps (CAM) [24], GradCAM [25], and GradCAM++ [26] reveal the parts of images which are most important to a class using global average pooling (GAP).

Recently, LRP has been used in the field of text classification [27], where classification scores were projected back onto input features to extract the words relevant to a specific prediction. The method has also shown success in model understanding in the fields of sentiment analysis [28], action recognition [29], and age and gender classification [30]. As far as the authors are aware, this is the first time LRP has been used for the understanding of genre or design classification.
III. DATA AND TOOLS FOR UNDERSTANDING BOOK COVER DESIGN

A. Amazon Book Cover Dataset

We used the Book Cover Image to Genre dataset¹, Task 1. The dataset consists of 57,000 book cover images divided into 30 classes of equal size. In the experiments, we used the predefined training set and test set, modified for one-vs-others classification. In this way, genre-wise training sets were prepared with an equal distribution of positive and negative data samples.

¹ https://github.com/uchidalab/book-dataset
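As a concrete illustration of this setup, one such genre-wise binary split might be built as in the sketch below; the (image_path, genre) pair format and the function name are our own assumptions, not part of the dataset's published tooling.

```python
# Sketch of a one-vs-others split: positives are one genre, negatives an
# equally sized random sample drawn from all of the other genres.
import random

def one_vs_others(samples, target_genre, seed=0):
    """samples: list of (image_path, genre) pairs; returns (path, 0/1) pairs."""
    positives = [(path, 1) for path, genre in samples if genre == target_genre]
    others = [(path, 0) for path, genre in samples if genre != target_genre]
    random.seed(seed)
    negatives = random.sample(others, len(positives))  # equal distribution
    split = positives + negatives
    random.shuffle(split)
    return split
```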
B. Convolutional Neural Networks

CNNs are able to tackle image recognition by implementing convolutions of learned, filter-like shared weights which maintain the structural qualities of images while acting as feature extractors [2]. For the experiments, we implement CNNs to tackle book genre classification. To use the book cover images with a CNN, they were preprocessed by scaling them to 112×112 pixels by 3 color channels and by normalizing the values to be between -1 and 1. The CNN used for the experiments has six convolutional layers with Rectified Linear Unit (ReLU) activations and a softmax output layer. The convolutional layers consist of three layers of 10 nodes with 5×5 convolutional filters, one layer of 25 nodes with a 4×4 filter, one layer of 50 nodes with a 3×3 filter, and one layer of 100 nodes with a 1×1 filter. A 2×2 max pooling layer with stride 2 was used between each convolutional layer. Finally, the CNNs were trained using gradient descent with a batch size of 25 at a learning rate of 0.001 for 50,000 iterations.
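For concreteness, the architecture described above might be sketched in PyTorch as follows; the authors did not publish code, so anything beyond the stated layer sizes, such as the 'same'-style padding and the two-way output for the one-vs-others setup, is an assumption.

```python
# A minimal sketch of the genre classifier described in the text, assuming
# 'same'-style padding; the input is a (N, 3, 112, 112) batch scaled to [-1, 1].
import torch.nn as nn

class BookCoverCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # three layers of 10 nodes with 5x5 filters
            nn.Conv2d(3, 10, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(10, 10, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(10, 10, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2, 2),
            # one layer of 25 nodes with a 4x4 filter
            nn.Conv2d(10, 25, 4, padding=2), nn.ReLU(), nn.MaxPool2d(2, 2),
            # one layer of 50 nodes with a 3x3 filter
            nn.Conv2d(25, 50, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            # one layer of 100 nodes with a 1x1 filter
            nn.Conv2d(50, 100, 1), nn.ReLU(),
        )
        # two-way output for one-vs-others (the softmax lives in the loss)
        self.classifier = nn.Linear(100 * 3 * 3, 2)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```

Training, per the text, would then use plain gradient descent (e.g. torch.optim.SGD with lr=0.001) on batches of 25 covers.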
The accuracy results for each genre are summarized in Fig. 1. In particular, the CNNs had difficulties with the reference classes, such as "Engineering & Transportation," "Health, Fitness and Dieting," "History," "Medical Books," and "Reference." Conversely, "Children's Books," "Romance," and "Test Preparation" had high accuracies. However, beyond classification accuracy, the purpose of this paper is to understand why the CNNs performed as they did and to reveal the relevant parts of the images.

[Fig. 1. CNN accuracy by genre.]

C. Layer-wise Relevance Propagation

The LRP algorithm and the LRP toolbox [31] aim to explain the reasoning behind the decision made by a network model, which allows users to validate classifier results. LRP is mainly derived from Deep Taylor Decomposition [32], a method of decomposing a network's output predictions onto its input variables. The result of such a decomposition is visualized in the form of a heatmap highlighting each pixel's importance for the prediction.

LRP explains the output function, i.e., the classifier's decision, which helps us to derive all of the crucial pixels for a particular prediction. Figure 2 shows the technique: the output value given by the network is decomposed backwards layer by layer until it reaches the input. This backward decomposition of the network's prediction uses local redistribution rules for assigning a relevance value $R_i$ to each neuron contributing towards the output, namely

$$\sum_i R_i = \sum_j R_j = \cdots = \sum_k R_k = f(x), \qquad (1)$$

where $f(x)$ is the prediction function, $R_i$ is the relevance of node $i$ in the target layer, $R_j$ is the relevance of node $j$ of the previous layer, and $R_k$ is the relevance of node $k$ of the highest layer. The total amount of relevance is conserved in this equation.

[Fig. 2. Feed forward neural network with the (left) forward pass and the (right) backward relevance calculation. The function f(x) is the prediction outcome given input x. The variables a_i and a_j are the inputs for nodes i and j, respectively. R_i is the relevance of node i and R_j is the relevance of node j.]

For the experiment, we used the α-β decomposition formula defined by

$$R_i = \sum_j \left( \alpha \frac{(a_i w_{ij})^+}{\sum_i (a_i w_{ij})^+} + \beta \frac{(a_i w_{ij})^-}{\sum_i (a_i w_{ij})^-} \right) R_j, \qquad (2)$$

where α and β are hyperparameters weighting the positive values $(a_i w_{ij})^+ / \sum_i (a_i w_{ij})^+$ and the negative values $(a_i w_{ij})^- / \sum_i (a_i w_{ij})^-$, respectively. Furthermore, $w_{ij}$ is the weight between nodes $i$ and $j$, and $a_i$ is the input to node $i$. This decomposition allows for the separation of the positive connections from the negative connections: the positive term propagates activating input messages, while the negative term propagates deactivating input values.
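A minimal NumPy sketch of the rule in Eq. (2) for a single fully connected layer might look like the following; the epsilon stabilizer and the function interface are our additions, not something specified by the paper or the LRP toolbox.

```python
# Sketch of the alpha-beta relevance redistribution of Eq. (2) for one
# fully connected layer; relevance flows from the J output nodes back
# onto the I input nodes.
import numpy as np

def lrp_alpha_beta(a, W, R_j, alpha=2.0, beta=-1.0, eps=1e-9):
    """a: (I,) inputs to the layer; W: (I, J) weights; R_j: (J,) relevance."""
    z = a[:, None] * W                       # contributions a_i * w_ij
    z_pos = np.clip(z, 0, None)              # (a_i w_ij)^+
    z_neg = np.clip(z, None, 0)              # (a_i w_ij)^-
    s_pos = z_pos.sum(axis=0) + eps          # normalizers, summed over i
    s_neg = z_neg.sum(axis=0) - eps
    # Eq. (2): mix the normalized positive and negative parts and
    # redistribute each R_j onto the inputs
    return ((alpha * z_pos / s_pos + beta * z_neg / s_neg) * R_j).sum(axis=1)
```

With α = 2 and β = −1, as used in Section IV, α + β = 1, so the redistributed relevance is (up to the stabilizer) conserved, in agreement with Eq. (1).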

D. Single-Shot Multibox Detector

To develop a better understanding of the objects within book cover images, we employed SSD [8], a state-of-the-art deep neural network based object detection method. SSD is a feed-forward CNN which produces a multi-scale collection of fixed-size bounding boxes and scores for object detection within the boxes. A final non-maximal suppression step determines the final detections. The result of SSD is a set of bounding box regions with object classification labels. Using SSD, it is possible to accurately detect multiple objects of different classes within images.
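As a sketch of the final filtering step mentioned above, a generic greedy non-maximal suppression could look like the following; the IoU threshold and the (x1, y1, x2, y2) box format are assumptions rather than SSD's exact configuration.

```python
# Generic greedy non-maximal suppression: keep the highest-scoring boxes
# and drop any remaining box that overlaps a kept one too strongly.
import numpy as np

def nms(boxes, scores, iou_thresh=0.45):
    """boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidences."""
    order = np.argsort(scores)[::-1]         # best box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # intersection of box i with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]      # suppress heavy overlaps
    return keep
```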
E. Efficient and Accurate Scene Text Detector

For humans, text is an important component of book covers; it is where the title, authors, and additional information are conveyed. However, a CNN may place a different importance on text than humans do. Thus, to analyze the relevance of text in book covers, we use EAST [9] as a text detector. EAST uses a multi-channel Fully Convolutional Network (FCN) and non-maximal suppression on predicted geometric shapes to detect multi-oriented text lines and word boxes.
IV. HOW CNNS UNDERSTAND BOOK COVER DESIGN: QUALITATIVE ANALYSIS

In this section, we present LRP results for the main genres. The analysis helped us to deduce the book cover design elements contributing towards a prediction by the CNN. We used the α-β decomposition formula with values of α = 2 and β = −1, which is suggested for networks using ReLU activation functions because it emphasizes the positive elements and de-emphasizes the negative ones [7]. This is important because the ReLU activation function sets negative values to zero. In the heatmaps generated by LRP under this decomposition, pixels adding a positive contribution are shown in red, and pixels adding a negative contribution are shown in blue.
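A relevance map can be rendered in this red/blue convention with a symmetric diverging colormap, as in the sketch below; this is our own visualization recipe, not the LRP toolbox's exact rendering.

```python
# Render a pixel-wise relevance map with red for positive and blue for
# negative contributions, centered so that zero relevance maps to white.
import numpy as np
import matplotlib.pyplot as plt

def show_heatmap(relevance):
    """relevance: (H, W) LRP map, e.g. summed over the color channels."""
    bound = np.abs(relevance).max() + 1e-12   # symmetric range around zero
    plt.imshow(relevance, cmap="bwr", vmin=-bound, vmax=bound)
    plt.colorbar(label="relevance")
    plt.axis("off")
    plt.show()
```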

A. Sports & Outdoors

Under this genre, many book covers show pictures of players playing indoor and outdoor games. Figure 3 (a) shows LRP results on these covers, which present the significance of the player's picture on the cover. The first image in Fig. 3 (a) supports this fact, with the LRP relevance centered on players who are either playing a sport or making a player-like gesture, while the car in the background adds no contribution. The second image in Fig. 3 (a) emphasizes the animal's importance for this genre's prediction.

B. Engineering & Transportation

For this genre, almost all the covers with vehicle pictures were classified correctly by the network. With LRP, in Fig. 3 (b), the parts of the image containing cars or motorbikes add more relevance than the others. The last image in Fig. 3 (b) presents a case where the contribution of the person in the image was dominated by that of the vehicle.

C. Romance

As is obvious from the genre name, pictures of couples on the cover carry high relevance, and the LRP results showed this to be true. However, among the pictures presented in Fig. 3 (c), LRP depicted the women as adding more relevance than the men or other elements. The reason could reside in their physical appearance, hair, and choice of dress. The same is demonstrated in the last picture of Fig. 3 (c), in which the woman's hair adds the most relevance, with zero relevance coming from the animal on the book cover.

D. Children's Books

Almost all children's book covers contain pictures of cartoon characters, and LRP on covers from this genre showed these cartoon characters to have higher relevance. An interesting result is shown in the first picture of Fig. 3 (d), where the person acts as an adversarial identity and the importance of the cartoons on the cover is highlighted. Some covers showed more relevance for one object within a set of objects; for example, in the last picture of Fig. 3 (d), some cartoons have higher relevance than others. This may be due to object placement and orientation. With the help of this information, one can make smart choices about characters, cartoons, and color patterns.

E. Cookbooks, Food & Wine Books

Book covers in this genre most commonly contain pictures of different kinds of food. The results in Fig. 3 (e) show these food pictures to contain high relevance for this genre. However, on carefully analyzing the LRP results, we discovered that the shapes of dishes, such as bowls or spoons, add significant relevance to the genre's prediction. This marks the significance of dish-shape designs on covers from this genre.

[Fig. 3. Correctly recognized book covers with object classes by SSD and text by EAST highlighted: (a) "Sports & Outdoors," (b) "Engineering & Transportation," (c) "Romance," (d) "Children's Books," (e) "Cookbooks, Food & Wine," (f) "Test Preparation."]

F. Test Preparation

This genre contains covers with both textual and pictorial information, as shown in Fig. 3 (f), with most of the contribution coming from the large text content on the covers. The images in Fig. 3 (f) show large text adding more relevance than images of people. In the first image of Fig. 3 (f), despite the large face of a girl, the relevance is concentrated on the text area of the book cover.

Such analysis helped us to find design elements specific to the presented genres. To become more familiar with the designs, we also present some cases where the network was not able to correctly classify the genre. Figure 4 shows some of these misclassifications, mainly from the presented genres; the correct genre names are written below each image. From the analysis presented above, one can easily decode the reason behind these misclassifications: the designs of these book covers are not aligned with their genres, which makes it natural for the network to misclassify them. Here, the "Sports & Outdoors" cover contains birds, the "Romance" cover contains text, the "Cookbooks, Food & Wine Books" cover contains no food picture, and the "Test Preparation" cover lacks any other significant feature. LRP justifies the misclassification of all these covers by highlighting the mentioned objects as contributing towards the "other" class in the one-vs-others setup.

[Fig. 4. Misclassified book covers with the correct genre names written below each cover: "Sports & Outdoors," "Romance," "Cookbooks, Food & Wine," "Test Preparation."]

V. HOW CNNS UNDERSTAND BOOK COVER DESIGN: QUANTITATIVE ANALYSIS

A. Experiment Setup

In order to quantitatively analyze LRP, we propose using SSD to detect objects and EAST to detect text within the book cover images. We then use LRP to compare the relevance of the objects bounded by the detection methods. The SSD was trained on the 2012 PASCAL Visual Object Classes (VOC) Challenge dataset [33]. The VOC dataset contains 20 classes, including "person," six animal classes, eight vehicle classes, and seven indoor object classes. While an SSD trained with VOC is intended for natural scene images, it can be used with book cover images because book covers often contain many of the shared classes, such as "person" and "car." Similarly, EAST was trained on the 2015 ICDAR Robust Reading Competition dataset [34], meant for scene text detection. Despite being trained for scene text, as shown in Fig. 3, EAST performs remarkably well at detecting text on book covers.

To extract object and text bounding box information, the book covers were prepared by scaling the images to 512×512 pixels by 3 color channels. It is important to note that the images used for SSD and EAST were larger than the images used by the CNN for genre classification. This is because the detection methods are much more effective at the higher resolution. To accommodate this, the bounding boxes were scaled post-detection and projected onto the LRP heatmaps.

[Fig. 5. Average object-wise relevance for text detected by EAST and each object class detected by SSD for each book genre. Only object-genre combinations with five or more data points are shown.]

The relevance of an object $R_{obj}$ is calculated as the sum of the relevance within its bounding box, or

$$R_{obj} = \sum_{(n,m) \in B} R_{(n,m)}, \qquad (3)$$

where $R_{(n,m)}$ is the relevance at pixel coordinates $(n, m)$ within bounding box $B$.
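Equation (3), together with the post-detection box scaling described above, might be implemented as in the following sketch; the 512-to-112 scale factor follows from the two input sizes stated in the text, while the function interface is an assumption.

```python
# Sketch of Eq. (3): project a detection box from the 512x512 detector
# input onto the 112x112 LRP heatmap and sum the relevance inside it.
import numpy as np

def object_relevance(heatmap, box, det_size=512):
    """heatmap: (112, 112) LRP map; box: (x1, y1, x2, y2) at detector scale."""
    scale = heatmap.shape[0] / det_size           # e.g. 112 / 512
    x1, y1, x2, y2 = (int(round(c * scale)) for c in box)
    return heatmap[y1:y2, x1:x2].sum()            # R_obj = sum of R_(n,m) in B
```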

B. LRP with Object Detection

A macro view of the genres can be seen by examining the average relevance of the object classes. Figure 5 illustrates the average object-wise relevance of each object class as detected by SSD and EAST for each book genre using the test set book cover images. It should be noted that detected objects such as "bottle" and "tvmonitor" were overfit to certain book cover images because many books have plain covers which resemble bottle labels or televisions. However, this does not mean that the information is useless. For example, from Fig. 5, "bottle" is more relevant for reference and nonfiction genres, where plain covers are common.

In addition, by examining the distribution of $R_{obj}$ for specific object classes, such as "person," it is possible to create associations between genres and detected objects. For example, the relevance of "person," $R_{person}$, for each genre is shown in Fig. 6. The figure demonstrates that detected "person"s within certain genres are more relevant than in other genres. For instance, the genres of "Romance" and "Mystery, Thriller & Suspense" put a high average relevance on "person." This indicates that "person" is important for the CNNs of those categories. In addition, as mentioned in Section IV-F and shown in Fig. 3 (f), people are common in "Test Preparation" covers but are not necessarily relevant. This is supported by Fig. 6, which indicates that, on average, "person" has very little relevance there. Distributions for the other object classes are provided in the supplemental material.

[Fig. 6. Box plot of the relevance of "person," R_person, for each genre. The boxes represent the first through third quartile and the mean is in red. The whiskers mark the minimum and maximum datum.]

C. LRP with Text Detection

Figure 5 also reveals that the average relevance of text is low. The reasoning behind this phenomenon can be explained by Fig. 7. The figure shows that the majority of the detected text boxes have a very small relevance $R_{text}$, but some text boxes have a higher relevance. For most genres, the title text contains a significant amount of relevance as determined by LRP, but the small descriptive text carries very little relevance. Figure 3 (f) in particular demonstrates this, with the large title text having a high relevance and much of the smaller descriptive text having near-zero relevance.

[Fig. 7. Box plot of the relevance of "text," R_text, for each genre. The boxes represent the first through third quartile and the mean is in red. The whiskers mark the minimum and maximum datum.]

VI. CONCLUSION

In this paper, we presented the importance of design in book covers belonging to specific genres. The application of LRP to the book cover dataset revealed genre-specific book cover features. The method described the most relevant parts of an input book cover contributing towards a genre prediction by the CNN. We also presented a quantitative analysis of LRP using an object detection method, SSD, and a text detection method, EAST. The analysis further demonstrates that genre classification heavily relies on specific objects for each genre.

VII. ACKNOWLEDGEMENT

This research was partially supported by MEXT-Japan (Grant No. J17H06100).
REFERENCES

[1] B. K. Iwana, S. T. R. Rizvi, S. Ahmed, A. Dengel, and S. Uchida, "Judging a book by its cover," arXiv preprint arXiv:1610.09204, 2016.
[2] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[3] D. Ciregan, U. Meier, and J. Schmidhuber, "Multi-column deep neural networks for image classification," in Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 2012, pp. 3642–3649.
[4] S. Uchida, S. Ide, B. K. Iwana, and A. Zhu, "A further step to perfect accuracy by training CNN with larger data," in Int. Conf. Frontiers in Handwriting Recognition, 2016.
[5] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[6] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 2015, pp. 1–9.
[7] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek, "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation," PLoS ONE, vol. 10, no. 7, p. e0130140, 2015.
[8] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in European Conf. Comput. Vision, Springer, 2016, pp. 21–37.
[9] X. Zhou, C. Yao, H. Wen, Y. Wang, S. Zhou, W. He, and J. Liang, "EAST: An efficient and accurate scene text detector," in IEEE Conf. Comput. Vision and Pattern Recognition, 2017, pp. 2642–2651.
[10] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, p. 5, 2008.
[11] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, and H. Winnemoeller, "Recognizing image style," arXiv preprint arXiv:1311.3715, 2013.
[12] G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech and Audio Process., vol. 10, no. 5, pp. 293–302, 2002.
[13] C. McKay and I. Fujinaga, "Automatic genre classification using large high-level musical feature sets," in Int. Soc. of Music Inform. Retrieval, vol. 2004, 2004, pp. 525–530.
[14] W.-T. Chu and H.-J. Guo, "Movie genre classification based on poster images with deep neural networks," in Proc. Workshop Multimodal Understanding of Social, Affective and Subjective Attributes, 2017, pp. 39–45.
[15] J. Zujovic, L. Gandy, S. Friedman, B. Pardo, and T. N. Pappas, "Classifying paintings by artistic genre: An analysis of features & classifiers," in IEEE Int. Workshop Multimedia Sig. Process., 2009, pp. 1–5.
[16] A. Finn and N. Kushmerick, "Learning to classify documents according to genre," J. Amer. Soc. for Inform. Sci. and Technology, vol. 57, no. 11, pp. 1506–1518, 2006.
[17] P. Petrenz and B. Webber, "Stable classification of text genres," Computational Linguistics, vol. 37, no. 2, pp. 385–393, 2011.
[18] L. Kang, J. Kumar, P. Ye, Y. Li, and D. Doermann, "Convolutional neural networks for document image classification," in Int. Conf. Pattern Recognition, 2014, pp. 3168–3172.
[19] A. W. Harley, A. Ufkes, and K. G. Derpanis, "Evaluation of deep convolutional nets for document image classification and retrieval," in Int. Conf. Document Anal. and Recognition, 2015, pp. 991–995.
[20] M. Z. Afzal, S. Capobianco, M. I. Malik, S. Marinai, T. M. Breuel, A. Dengel, and M. Liwicki, "DeepDocClassifier: Document classification with deep convolutional neural network," in Int. Conf. Document Anal. and Recognition, 2015, pp. 1111–1115.
[21] D. Erhan, Y. Bengio, A. Courville, and P. Vincent, "Visualizing higher-layer features of a deep network," Tech. Rep., University of Montreal, vol. 1341, p. 3, 2009.
[22] C. Olah, A. Mordvintsev, and L. Schubert, "Feature visualization," Distill, 2017, https://distill.pub/2017/feature-visualization.
[23] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European Conf. Comput. Vision, Springer, 2014, pp. 818–833.
[24] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, "Learning deep features for discriminative localization," in Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 2016, pp. 2921–2929.
[25] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, and D. Batra, "Grad-CAM: Why did you say that? Visual explanations from deep networks via gradient-based localization," arXiv preprint arXiv:1610.02391, 2016.
[26] A. Chattopadhyay, A. Sarkar, P. Howlader, and V. N. Balasubramanian, "Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks," arXiv preprint arXiv:1710.11063, 2017.
[27] L. Arras, F. Horn, G. Montavon, K.-R. Müller, and W. Samek, "'What is relevant in a text document?': An interpretable machine learning approach," PLoS ONE, 2017.
[28] L. Arras, G. Montavon, K.-R. Müller, and W. Samek, "Explaining recurrent neural network predictions in sentiment analysis," in Proc. Workshop Computational Approaches to Subjectivity, Sentiment and Social Media Anal., Association for Computational Linguistics, 2017, pp. 159–168.
[29] V. Srinivasan, S. Lapuschkin, C. Hellge, K.-R. Müller, and W. Samek, "Interpretable human action recognition in compressed domain," in IEEE Int. Conf. Acoustics, Speech and Sig. Process., 2017.
[30] F. Arbabzadeh, G. Montavon, K.-R. Müller, and W. Samek, "Identifying individual facial expressions by deconstructing a neural network," in German Conf. Pattern Recognition, ser. Lecture Notes Comput. Science, vol. 9796, B. Rosenhahn and B. Andres, Eds., Springer International Publishing, 2016, pp. 344–354.
[31] S. Lapuschkin, A. Binder, G. Montavon, K.-R. Müller, and W. Samek, "The LRP toolbox for artificial neural networks," J. Mach. Learning Research, vol. 17, no. 1, pp. 3938–3942, 2016.
[32] G. Montavon, S. Lapuschkin, A. Binder, W. Samek, and K.-R. Müller, "Explaining nonlinear classification decisions with deep Taylor decomposition," Pattern Recognition, vol. 65, pp. 211–222, 2017.
[33] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The PASCAL visual object classes (VOC) challenge," Int. J. Comput. Vision, vol. 88, no. 2, pp. 303–338, Jun. 2010.
[34] D. Karatzas, L. Gomez-Bigorda, A. Nicolaou, S. Ghosh, A. Bagdanov, M. Iwamura, J. Matas, L. Neumann, V. R. Chandrasekhar, S. Lu et al., "ICDAR 2015 competition on robust reading," in Int. Conf. Document Anal. and Recognition, 2015, pp. 1156–1160.
