1 Introduction

With the explosive growth of social media on the Web (Social Systems, SS), e.g., reviews, forum discussions, blogs, micro-blogs, Twitter, comments, and postings on social network sites, individuals and organizations are increasingly using the content of these media for decision making [24]. Accordingly, several authors have collected such data in order to study people’s feelings, behaviors, collaboration, relationships etc. Wang et al. [41], for instance, investigated regrets expressed in the messages of Facebook users; Lim and Datta [23] studied communities that share common interests on Twitter; Mogadala and Varma [31] studied the mood transitions of Twitter users. Other studies have analyzed users’ postings in order to understand their views on health [1, 11, 16, 17, 32], politics [7] etc.

In the area of Human Computer Interaction (HCI), the opinions of users are important in the evaluation of a system. Asking users their opinion of a product, checking whether they enjoy it, whether they find it aesthetically appealing, how they accomplish what they want, and whether they face problems when using it are all possible forms of evaluating a system [37]. The main techniques for collecting user opinions about a system are field research, interviews and questionnaires [2, 9, 37]. However, such techniques do not capture the spontaneity of users at the moment they are using the system [28, 34, 37]. We believe that the spontaneous way in which users describe a problem with the system to their friends, while using it, may differ from the description they give to a specialist. Preece, Rogers and Sharp [37] raised the following thought: “What users say is not always what they do”. People sometimes give answers which are not true, or they may simply have forgotten what happened. Thus, can evaluators believe all the answers they receive? Are respondents telling “the truth”, or are they simply providing the answers they presume the evaluator wants to hear? Moreover, in SS the interaction consists mainly of written texts. Why not take advantage of this feature of SS communication to obtain relevant data on the use of the system?

In our previous work, we studied the postings of users in SS [26, 27]. In [28], we carried out a systematic review of studies in the fields of HCI and Natural Language Processing (NLP). In [26], we investigated whether SS users post messages about the system in use, and the conclusion was positive: users praise the system, criticize it, make comparisons, ask questions and provide suggestions about it. In [27], we conducted two experiments with postings of SS users in order to investigate how users express their feelings regarding the use of the system, and how to assess the User eXperience (UX) from the postings they write while interacting with the system. The results revealed some characteristics of postings related to the use which may be useful for Usability and UX (UUX) evaluation in SS. Therefore, our goal in this work is to investigate the UUX of a system from the postings users write while using it. Specifically, we aim to obtain UUX assessment results from the postings of two SS with distinct contexts.

We collected users’ postings from two SS: a popular entertainment SS (Twitter) and the discussion forums of a university’s academic SS. This investigation sought to answer the following research questions: (i) Is it possible to classify users’ postings in dimensions of UUX? (ii) Is it possible to find out, from users’ postings, what the main problems of the system are? (iii) Is it possible to perceive, from users’ postings, the context of use of a system?

In this study, 29 participants (students and specialists in HCI) classified 1,210 postings and discussed their impressions about this form of evaluation.

This paper is organized as follows: in the next section, we present related work. In Sect. 3, we describe concepts about SS, postings related to the use, and UUX. In Sects. 4 and 5, we describe the two investigations, followed by final considerations and future work.

2 Related Work

Studies that have focused on user narratives in order to study UUX or UX include [13, 14, 22, 35, 40]. In [13], the authors, focusing on positive user experiences, collected 500 texts written by users of interactive products (cell phones, computers etc.) and analyzed the positive experiences reported. In [22], the authors collected 116 reports of users’ experiences with their personal products (smartphones and MP3 players) in order to evaluate the UX of these products. Users had to report their personal feelings, values and interests related to the moment at which they used the products. In [35], the authors collected 90 written reports from novice users of mobile augmented reality applications. The focus was also on evaluating the UX of these products, and the analysis consisted of determining the subject of each text and classifying the texts, with attention to the most satisfactory and most unsatisfactory experiences. Along the same lines, in [40], the authors studied 691 narratives written by users with positive and negative experiences with technology in order to study UX.

In the four studies mentioned above, the information was manually extracted from texts generated by users. The users were specifically asked to write texts or answer a questionnaire, unlike the spontaneous gathering of what they post on the system.

In [14], the authors extracted product reviews from a reviews website in order to find relevant UUX information in texts classified by specialists. However, they did not investigate SS, but other products. In this case, the texts were written by product reviewers. We believe that the attitude of users on a product review website differs from their attitude when they are using a system, face a problem, and then decide to report it, whether just to vent or to suggest a solution.

In this work, we consider the opinions of users about the system in use, taken from their postings on the system being evaluated. We thereby intend to capture users’ spontaneous reactions at the moment they are using the system.

3 Background

3.1 Social Systems

Social Systems focus on enabling their users to communicate and interact with each other in different ways and for several purposes [1, 8, 19]. Complementing this concept, [6] specify SS as web-based services that allow individuals to (1) create a public or semi-public profile within a limited system, (2) articulate a list of other users with whom they share a connection, and (3) view and browse their list of connections and those made by others within the system.

For [36], SS are communication systems often used in the composition of collaborative systems such as: social networks, in which various types of communication systems are adapted to allow multiple forms of interaction between users; learning environments, in which multiple communication systems are available to be used and configured in each course; or virtual environments, which often contain a chat service and audio conferencing. This work adopts the latter definition and takes as the main form of interaction the text messages posted by users: their postings.

The SS used in this research were Twitter and SIGAA. Twitter is one of the largest microblogging services on the Internet, with over 600 million active users. Microblogs are short text messages that users produce to share all kinds of information with the world. On Twitter, these microblogs are called “Tweets” and may contain news, announcements, personal affairs, jokes, opinions etc.

SIGAA (Integrated System of Academic Activities Management) is the academic control system of the Federal Universities in Brazil. Through it, students have access to various features such as enrollment receipts, academic history and the enrollment procedure. The system allows the exchange of messages through a discussion forum. Its users are students and university staff.

3.2 Postings Related to the Use

The main form of interaction in social systems is the posted message, whether public or private. In their postings, users deal with various issues. This work focuses on public postings in natural language in which users refer to the SS they are currently using (Postings Related to the Use, PRUs). For example, if the user is using Facebook, we are interested in Facebook PRUs; if the SS under evaluation is Twitter, the PRUs should be about Twitter.

Close to this field of text analysis, there are studies about reviews of products or services on the web. In recent years, the use of websites to evaluate products and services has become increasingly common. Websites such as Booking.com, Decolar.com and Tripadvisor provide a space for clients to publish their reviews of products and services.

A review is a small text written by a user of the product or service who used it for some time, detailing its positive and negative points and possibly providing an evaluation of it and a recommendation to other potential buyers [14].

It is worth reflecting on these two concepts: reviews of products or services, and user comments in SS. Based on previous studies and on our own observations, we describe their differences in the following aspects:

  1. Form: reviews are structured texts, presenting a certain regularity in the format of the information, much like a completed form. There are fields for scores, text input for the evaluation and even a field for the aspect being evaluated. Postings in SS, by contrast, are unstructured texts with no regularity in their format. They may contain images, various types of text and characters, and even links to other pages;

  2. Motivation: a series of online articles [10, 20, 30, 38] have been written to explain why people write reviews. Among the main reasons given, the authors generally agree that people write reviews because they care about their fellow consumers and want to help others make a decision [10]. In [26], we studied the characteristics of PRUs in SS and noted that users praise, criticize, make comparisons, ask questions and provide suggestions about the system, which leads us to believe that such comments contain users’ reports about their experiences of use; and

  3. Context: at the time of writing a review, the reviewer is not using the system being evaluated. When users comment on the very SS they are using, the comment may be a way to request help with a problem at the time of use.

The postings used in this research were obtained from previous experiments. We extracted 295,797 Twitter postings using an extraction tool, in six samples taken from October to December 2012 [29]. For the SIGAA system, we extracted 24,743 postings written between the system’s installation (second half of 2010) and January 2014 [25]. From these, PRUs were selected for the evaluation of both systems.

3.3 Usability, User Experience and Their Goals

Usability, according to [37], is generally considered to be the factor which ensures that products are easy to use, efficient and pleasant from the user’s perspective. According to [15], usability is the extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.

UX, in turn, consists of people’s perceptions and responses resulting from the use and/or anticipated use of a product, service or system [15]. According to [15], it includes all emotions, beliefs, preferences, perceptions, physical and psychological responses, behaviors and user achievements that occur before, during and after use.

In [14], the authors investigated the main goals of UUX, based on the studies [3, 5, 12, 21, 39], and arrived at the goals listed in Table 1. We used these dimensions in this paper.

Table 1. Dimensions of UUX used for the studies in this paper

4 First Investigation

4.1 Participants

This investigation was conducted in March 2014 with 17 students of an HCI course in a Computer Science program: 14 men and 3 women, aged 22–25.

4.2 Procedure

As this investigation was carried out with students of an HCI course, the theoretical basics of UUX and its goals, as well as UUX assessment methods, had already been taught in earlier classes.

The task was as follows: each student classified 50 PRUs, arranged in a worksheet, according to two categories: (1) type of PRU; and (2) UUX goals. The types of PRUs are criticism, question, compliment, suggestion, help and comparison [26]. The UUX goals were those proposed by [14], listed in Table 1. We also presented the examples shown in Table 2. The deadline for classification was two weeks, at the end of which each student had to deliver the worksheet with the 50 classified PRUs and a completed questionnaire. In addition to personal information such as name, age and the SS analyzed, the questionnaire contained only three questions: (1) what user feeling did you notice most frequently in the postings? (2) what are the main complaints (problems faced in the system)? and (3) what are the main compliments (system benefits) perceived in the messages? Ten students worked with PRUs from Twitter and seven students worked with PRUs from SIGAA.
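As an illustration only, one worksheet row could be represented as the record below. The PRU types come from the procedure above; the class name, field names and the goal label in the example row are assumptions made for this sketch, not artifacts of the study.

```python
from dataclasses import dataclass, field

# Types of PRUs named in the procedure (taken from [26]).
PRU_TYPES = {"criticism", "question", "compliment", "suggestion", "help", "comparison"}


@dataclass
class ClassifiedPRU:
    """One worksheet row: a posting plus the labels a participant assigned."""
    text: str
    system: str      # "Twitter" or "SIGAA"
    pru_type: str    # must be one of PRU_TYPES
    uux_goals: set = field(default_factory=set)  # goal names as listed in Table 1

    def __post_init__(self):
        # Reject labels outside the agreed vocabulary early.
        if self.pru_type not in PRU_TYPES:
            raise ValueError(f"unknown PRU type: {self.pru_type!r}")


# Example row. The goal label is illustrative, not a classification
# reported in the paper.
row = ClassifiedPRU(
    text="Does anyone know how I can get a history of disciplines here?",
    system="SIGAA",
    pru_type="question",
    uux_goals={"learning"},
)
```

Keeping the goals as a set reflects the fact that a single PRU may be assigned more than one UUX goal.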

Table 2. Examples of PRUs classified

4.3 Results

After two weeks, the students delivered their classified PRUs, which were checked by two HCI specialists, authors of this paper. We noted that 80% of the postings were correctly classified. The main mistakes were:

  1. Regarding the type of PRU: confusion between the types question and help (for SIGAA). Although some PRUs were actually questions, the students classified them as help. For instance, “Does anyone know how I can get a history of disciplines here?” is a question, whereas “Go to ‘see previous courses’ on your homepage. Then click the small blue arrow in each discipline, then on the left you will see ‘students’ and then ‘see marks’.” is an example of help;

  2. Concerning the classification by UUX goals: confusion between the goals effectiveness, efficiency and utility.
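The paper does not state how the share of correctly classified postings was computed. One plausible reading is simple agreement between each student label and the label confirmed by the specialists. The sketch below shows that calculation on made-up labels; the function name and the data are hypothetical.

```python
def agreement_rate(student_labels, reference_labels):
    """Fraction of postings where the student's label equals the reference label."""
    if len(student_labels) != len(reference_labels):
        raise ValueError("label lists must have the same length")
    if not student_labels:
        raise ValueError("no labels to compare")
    matches = sum(s == r for s, r in zip(student_labels, reference_labels))
    return matches / len(student_labels)


# Made-up labels for five postings. The first one reproduces the kind of
# question/help confusion described in the list above.
student = ["help", "question", "criticism", "help", "compliment"]
reference = ["question", "question", "criticism", "help", "compliment"]

rate = agreement_rate(student, reference)
print(f"{rate:.0%} of the labels match the reference")
```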

The classification of 50 postings enabled each student to have a perception of the system, providing an evaluation result through their questionnaires. In their responses, they identified the main feelings of users towards the system, their complaints and compliments. Figures 1, 2 and 3 illustrate the students’ perception regarding the set of PRUs analyzed on SIGAA.

Fig. 1. Main feelings perceived on SIGAA

Fig. 2. Main causes/functionalities perceived on SIGAA

Fig. 3. Main compliments perceived on SIGAA

From the experiment with the students, we concluded that it is possible to obtain evaluation results from a set of PRUs. In this study, it was possible to relate the categories of PRUs classified by the students. For example, 48% of the criticisms were related to the effectiveness goal, and 86% of them were related to the frustration goal. However, we did not ask the students to indicate the functionality the user referred to in each PRU. Still, in order to answer the questionnaire, the students described the main features mentioned by users (Fig. 2). We note that the functionality is a required piece of information. With it, we would be able to discover the main features presenting UUX problems in the SS. We could also correlate the functionalities with the other classification features suggested in this paper, for instance: “x% of the questions were related to functionality y” or “w% of the criticisms were related to functionality y and to the effectiveness goal”.
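A correlation of this kind amounts to a cross-tabulation of PRU type against goal or functionality. The following sketch, with invented rows and hypothetical field names, computes the share of PRUs of one type that carry each label. Because one PRU can carry several goals, the shares need not add up to 100, which is consistent with the 48% and 86% figures above.

```python
from collections import Counter


def share_by_type(rows, pru_type, key):
    """Percentage of PRUs of `pru_type` that carry each label stored under `key`.

    `rows` is a list of dicts; `key` names a field holding a list of labels,
    here "goals" or "functionality".
    """
    selected = [r for r in rows if r["type"] == pru_type]
    if not selected:
        return {}
    # Count each label at most once per PRU.
    counts = Counter(label for r in selected for label in set(r[key]))
    return {label: 100 * n / len(selected) for label, n in counts.items()}


# Invented rows for illustration only; they are not data from the study.
rows = [
    {"type": "criticism", "goals": ["effectiveness", "frustration"], "functionality": ["enrollment"]},
    {"type": "criticism", "goals": ["frustration"], "functionality": ["enrollment"]},
    {"type": "criticism", "goals": ["effectiveness"], "functionality": ["forum"]},
    {"type": "criticism", "goals": ["efficiency", "frustration"], "functionality": ["enrollment"]},
    {"type": "question", "goals": ["learning"], "functionality": ["academic history"]},
]

by_goal = share_by_type(rows, "criticism", "goals")
by_functionality = share_by_type(rows, "criticism", "functionality")
print(by_goal)
print(by_functionality)
```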

5 Second Investigation

5.1 Participants

This investigation was carried out in November 2014 with 12 specialists in HCI, 4 from academia and 8 from industry. We consider academic specialists to be those who teach in universities and industry specialists to be those who work with HCI in companies; their profiles are shown in Table 3.

5.2 Procedure

The 12 specialists in HCI were invited two weeks in advance, and the investigation took place over one morning, following the schedule below (Tables 3 and 4).

Table 3. Profile of specialists in HCI
Table 4. Agenda of investigation with specialists in HCI

The specialists received the postings to be classified, the goals for UUX classification (Table 1) and examples of PRUs classified by UUX (Table 2). Each specialist received 30 PRUs to classify. Half of the specialists received PRUs from SIGAA, and the other half had PRUs from Twitter. In the validation step, the SS were exchanged; the specialist who had classified SIGAA validated Twitter, and vice versa. This was done so that each specialist could have a view of a different SS.
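The crossover between classification and validation can be sketched as below. The specialist identifiers and the function name are placeholders. Only the group sizes, the 30 PRUs per specialist and the swap rule come from the description above.

```python
def plan_sessions(specialists, systems=("SIGAA", "Twitter")):
    """Split the specialists in two halves.

    The first half classifies systems[0] and validates systems[1];
    the second half does the opposite.
    """
    if len(specialists) % 2:
        raise ValueError("expected an even number of specialists")
    half = len(specialists) // 2
    plan = {}
    for i, name in enumerate(specialists):
        first, second = systems if i < half else systems[::-1]
        plan[name] = {"classify": first, "validate": second}
    return plan


specialists = [f"S{i}" for i in range(1, 13)]  # placeholder identifiers
plan = plan_sessions(specialists)

PRUS_PER_SPECIALIST = 30
total_classified = PRUS_PER_SPECIALIST * len(specialists)
per_system = {
    system: PRUS_PER_SPECIALIST * sum(p["classify"] == system for p in plan.values())
    for system in ("SIGAA", "Twitter")
}
print(total_classified, per_system)
```

With 12 specialists this gives 360 classified PRUs, 180 per SS, which matches the totals reported in the final section.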

After the classification step, we held a brainstorming session in which the specialists discussed the users’ feelings, their intended uses, and the types and importance of PRUs (conclusions, actions or how they would represent them). Two authors of the research took notes on the whiteboard. After the brainstorming session, participants were asked to write on a blank sheet their main difficulties in classifying the PRUs.

5.3 Results

The results are presented in terms of classification difficulties and of perceptions about evaluation using the PRUs of SS users. The main difficulties of the specialists were: (1) the large number of goals, which led to uncertainty at the time of classification; and (2) the fact that some goals encompass others, such as usability (effectiveness, efficiency, satisfaction, utility, learning, security, memorization) and emotion (satisfaction, frustration, affection, pleasure, enchantment).

During the analysis of the PRUs classified by the specialists, we realized that some of them had only a UX classification and no usability classification. Take, for example, the following PRU: “Syndrome of screen exchange on Twitter. Whenever I refresh the screen, I place the cursor to the left and it goes to ‘discover’ tab Grrrr”. The feeling is so evident that the specialists classified the PRU under the UX goals frustration and emotion, but did not classify it under the security goal.

In another example, “I hate this feature of Twitter of favoriting a tweet by just laying the cursor on it, we cannot even stalk fearless anymore”, the specialists assigned the UX goals frustration and emotion, but missed the usability goal, which would have identified a security issue.
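An omission of this kind could be flagged automatically once the goals are separated into groups. The sketch below uses the usability and emotion groupings the specialists mentioned as overlapping. The function name is hypothetical, and the check is only one possible aid to the evaluator, not part of the method applied in the study.

```python
# Goal groups as listed among the specialists' difficulties.
# "satisfaction" appears in both groups in the source.
USABILITY_GOALS = {
    "effectiveness", "efficiency", "satisfaction", "utility",
    "learning", "security", "memorization",
}
EMOTION_GOALS = {"satisfaction", "frustration", "affection", "pleasure", "enchantment"}


def missing_usability(goals):
    """True when a PRU has at least one label but none from the usability group."""
    goals = set(goals)
    return bool(goals) and not (goals & USABILITY_GOALS)


# Labels the specialists gave to the two PRUs quoted above.
as_classified = {"frustration", "emotion"}
# Labels after adding the usability goal the authors say was missed.
as_expected = {"frustration", "emotion", "security"}

print(missing_usability(as_classified))  # flagged for review
print(missing_usability(as_expected))    # not flagged
```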

Although the specialists had a short time for the analysis (30 min for classification + 30 min for validation, with two different SS), they were able to report some information about the evaluated SS and its users, such as: “we can see that SIGAA users are beginners”; “SIGAA users are not yet used to the platform”; “Some features have not yet been implemented on SIGAA”; “Twitter users are satisfied, motivated and engaged! I had never imagined that…”. There were also other interesting discussions about the user: “What is their purpose when posting?”, “What do they want?”, “What is their priority? Expressing feeling? Asking for help?”.

Regarding the feelings expressed by users in PRUs, one specialist noted that a feeling is present in all cases and that the cases differ only in polarity and intensity, for example: very happy. Other specialists, however, noticed that the feeling was not always expressed, even when system problems were reported. They commented that it would be interesting to also classify what caused the feeling.

We noted that, as the specialists become familiar with the method, they classify the postings more easily. We also noted that analyzing a single posting is not enough, but that a set of PRUs can reveal information about the context of use. The specialists discussed what they would do with the results. We highlight the following answers: analysis of product acceptance, application development, interface improvement and correction of system bugs.

6 Final Considerations and Future Work

We conclude this research by addressing the questions raised in the study: (i) Is it possible to classify PRUs in dimensions of UUX? Yes, it is. The students and the specialists classified the PRUs by UUX goals. However, some measures should be taken to facilitate the classification process. The set of goals should be simplified by removing the intersections between them, so as not to confuse the specialist. The goals should also be separated by quality of use, into usability goals and UX goals, so that evaluators do not forget any of the criteria.

The second question was: (ii) Is it possible to discover, from a set of PRUs, what the major problems of the system are? Yes, it is. In some PRUs, the problem is clearly presented. For these, a classification of the problem (or the cause of the posting) could be made in order to establish a correlation with the other classifications. This classification would record the cause or the functionality the user refers to in the PRU.

The third question was: (iii) Is it possible to perceive, from a set of PRUs, the context of use of the system? Yes, it is possible to perceive it from a simple set of classified PRUs. As in other evaluation methods, such as heuristic evaluation, in which certain items of the system have to be evaluated in order to reach a conclusion about it, evaluation from PRUs requires a number of items (PRUs) to be evaluated before any relevant conclusion can be reached.

We can therefore conclude that it is possible to obtain UUX evaluation results from the analysis of PRUs. This investigation resulted in 1,210 PRUs analyzed, of which 850 were classified by students (350 from Twitter + 500 from SIGAA) and 360 were classified by HCI specialists (180 for each SS: Twitter and SIGAA).

Some studies have already been carried out from the PRUs of users in SS. In [25], we proposed an Evaluation Model of the User Textual Language. We intend to continue this work by studying characteristics of PRUs and how they can be useful in order to obtain user perceptions regarding the system.