
1 Introduction

Improving learning support for students is a vital task for teachers in order to reduce the number of at-risk students and to enhance student performance. To provide better support, teachers need to understand how students learn and what their characteristics are. For example, it is useful to identify the content that students find difficult in a lecture, because teachers can then explain that content more carefully and in more detail. However, investigating such information manually is time-consuming for teachers. To understand students' learning behavior automatically, recent work in learning analytics analyzes learning behaviors in educational data using machine learning and data mining techniques, and provides various results and findings for supporting students and teachers [1, 5].

Various types of educational data are collected from digital learning systems such as Massive Open Online Courses (MOOCs) and the M2B systems at Kyushu University [14]. The collected data include responses written in e-portfolio systems, the eye movements of students reading a digital textbook, and clickstream data and event logs such as students' access logs and page browsing logs. The number of students that can be measured at the same time and the level of detail of the measured learning behavior differ depending on the data type. For example, clickstream data can be collected from more than 100 students simultaneously, but it is difficult to capture detailed reading behaviors within e-book pages from such data. Therefore, the functions of systems for supporting teachers also differ depending on the collected educational data.

In this paper, we introduce our recent works [9,10,11] using clickstream data and eye movement data to discuss learning analytics-based approaches that help teachers support students with different types of educational data. The first topic is a visualization of the relationship between quiz scores and reading behaviors using clickstream data, to understand the characteristics of a large number of students; the second topic is page difficulty estimation using eye movement data, to help teachers find the content that students find difficult.

2 Related Work

In learning analytics, it is essential to understand student learning behaviors and to feed the results of the analysis back to teachers and students. Recent online digital learning management systems are useful for measuring student learning behaviors because they can store the interactions of students reading digital textbooks. In addition, teaching materials, quiz scores, and responses written by students are also collected in such systems.

One of the major types of educational data is clickstream data, which represents the interactions between e-learning systems and students. Several studies use clickstream data to predict student achievement and grades [2] and to monitor student learning behaviors [3, 16]. These studies, which use interactions such as page transitions, can help teachers detect and support at-risk students. In addition, interactions such as adding highlights and memos are used in a concept map tool [18] and a knowledge map tool [15]; teachers can use such tools to confirm whether students comprehend the material. Our method [10] also visualizes student characteristics based on clickstream data. As these studies show, a large amount of clickstream data can be collected from numerous students. However, it is difficult to measure more detailed learning activities, such as reading paths within the pages of digital textbooks.

On the other hand, eye trackers are used to measure the eye movements of students reading teaching materials. Reading paths measured with eye trackers while students learn are attractive for analyzing student performance and the difficulty of teaching materials. Nakamura et al. predicted the subjective difficulty of English word tests from eye movements [13]. The analysis of eye movement data can thus be useful for understanding learning behaviors within pages: findings related to effective attention guidance techniques are reported in [4], and relationships between students' reading paths and performance in [7, 17]. Therefore, eye movements can be used to analyze student learning behaviors in depth. However, it is hard to measure eye movements from many students because of the difficulty of preparing many eye trackers and the expertise required to use them.

3 Visualization of Quiz Scores and Reading Behaviors Based on Clickstream Data

We introduce a visualization method that provides an overview of students' quiz scores and reading behaviors [10]. Teachers can use quiz scores as a criterion to understand student achievement and to select learning-support strategies. However, learning support based only on quiz scores cannot take into account students' learning behaviors during lecture time: even if two students obtain the same score, the appropriate support for them is not always the same. In this study, we provide reading behaviors during lecture time as another criterion, giving teachers an opportunity to consider how to support each student individually. Our visualization method defines an action score to represent reading behaviors and shows the distribution of quiz scores and action scores. We apply the method to the Kyushu University clickstream data provided by the LAK19 data challenge [5].

3.1 Method

We compute an action score for each student from clickstream data, which contain the pages read by students, their timestamps, and the operations performed by students. The action score is defined based on the pages read by many students in a class and the number of operations other than page accesses.

In this study, we observed that many students read the same page at the same time. Figure 1 shows a heatmap of students reading a textbook, in which red indicates that many students read a page at that time. We assume that the teacher's instruction affects students' page transitions. Under this assumption, we focus on the majority page transition and compute the difference between a student's page transition and the majority page transition; when a student follows the majority page transition, the action score increases. This lets us evaluate rare reading patterns such as the yellow one in Fig. 1.

In addition, the clickstream data contain several operations such as "ADD MARKER" and "ADD BOOKMARK". Teachers did not force students to perform such operations, so analyzing them can help us understand students' learning behaviors. To take these operations into account, we count the number of operations performed by each student and quantize the count into a four-level score between 0 and 1; as the number of operations increases, the action score also increases. After computing the action scores, we plot each student's quiz score and action score to visualize them.
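The exact formula of the action score is not reproduced here, but the following sketch illustrates one way to implement the two ingredients described above: agreement with the majority page transition and a four-level score for non-page-access operations. The field names ("student", "time", "page", "op"), the list of page-access operations, the quartile-based quantization, and the equal weighting of the two ingredients are illustrative assumptions rather than the published definition.

```python
import numpy as np
from collections import Counter, defaultdict

def action_scores(logs, n_slots=50, op_levels=(0.0, 1/3, 2/3, 1.0),
                  page_access_ops=("OPEN", "CLOSE", "NEXT", "PREV")):
    """logs: list of dicts with keys 'student', 'time', 'page', 'op' (assumed schema)."""
    t_min = min(e["time"] for e in logs)
    t_max = max(e["time"] for e in logs)
    slot = lambda t: min(int((t - t_min) / (t_max - t_min + 1e-9) * n_slots), n_slots - 1)

    # Majority page per time slot over the whole class, assumed to reflect
    # the page transitions driven by the teacher's instruction.
    per_slot = defaultdict(list)
    for e in logs:
        per_slot[slot(e["time"])].append(e["page"])
    major = {s: Counter(pages).most_common(1)[0][0] for s, pages in per_slot.items()}

    # Per student: page occupied in each slot, and count of non-page-access operations.
    student_pages = defaultdict(dict)
    n_ops = Counter()
    for e in logs:
        student_pages[e["student"]][slot(e["time"])] = e["page"]
        if e["op"] not in page_access_ops:
            n_ops[e["student"]] += 1

    # Quantize operation counts into four levels between 0 and 1 (assumed quartile bins).
    counts = np.array([n_ops[s] for s in student_pages])
    bins = np.percentile(counts, [25, 50, 75])

    scores = {}
    for s, pages in student_pages.items():
        follow = np.mean([pages.get(k) == major[k] for k in major])  # agreement with majority
        ops = op_levels[int(np.digitize(n_ops[s], bins))]            # quantized operation score
        scores[s] = (follow + ops) / 2.0                             # assumed equal weighting
    return scores
```

The resulting scores can then be plotted against quiz scores to reproduce a distribution like the one in Fig. 2.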

Fig. 1. Reading behaviors in a course. The green and yellow lines represent instances of the reading behavior of two students. (Color figure online)

3.2 Results

Figure 2 shows the result of applying our visualization method to the Kyushu University data. The figure plots a quiz score and an action score for each student, and teachers can use it to select learning-support strategies. For example, consider the points in the bottom-right of Fig. 2: these students have higher action scores but lower quiz scores. Because the action scores are computed from behaviors such as page transitions and interactions with the e-learning system, this indicates that these students' activity did not translate into higher quiz scores.

Our visualization method can suggest how teachers might support the students in the bottom-right and bottom-left of Fig. 2, since these two groups have different learning behaviors during lectures. For students in the bottom-right, we can estimate that they follow the teacher's instruction or proactively perform operations, so teachers can provide supplementary teaching materials to help them learn the content in more detail. For students in the bottom-left, teachers can provide a summary of the lectures to encourage them to review previous lectures.

Fig. 2. Distribution of quiz scores and action scores. \(\mu\) and \(\sigma\) are the mean and standard deviation of the quiz scores.

4 Region-Wise Page Difficulty Using Eye Movement Data

We investigate the relationship between subjective impressions of a page's difficulty and the eye movements of students studying by themselves [9]. Eye movement data can capture more detailed reading behaviors than clickstream data. The purpose of this study is to estimate, from their eye movement data, where on a page students found the content difficult. We use a neural network to model eye movements on pages with difficult content. This work can contribute to helping teachers revise teaching materials.

4.1 Data Collection

To analyze the eye movements of students reading pages of teaching materials, we collected eye movement data from 15 Kyushu University students reading a digital textbook. A Tobii eye tracker (Tobii Pro Spectrum) was attached to a monitor in a dark room, and its sampling rate was set to 150 Hz. Before the experiment, we confirmed that all participants had little prior knowledge of the information science and statistical mathematics covered by the contents. The participants started to study the teaching materials after a calibration of the eye tracker. The teaching materials covered statistical tests and correlation and included figures, tables, text, formulas, and images; the layout of the content was not constrained. After finishing each page, the students were asked to report their subjective impression of that page's difficulty, and a black page was then displayed for one second before the next page was shown. After reading all the pages, the participants added highlights on each page where they found the content difficult. Through this procedure, we collected the students' eye movements, subjective impressions of page difficulty, and highlights.

To decide which pages contain difficult content based on the subjective impressions, we examined the distribution of subjective impressions for each student. We observed that the means and variances of these distributions differed between students. Therefore, the subjective impressions of each student are evaluated relatively, and for each student we select the pages with higher subjective impressions as "difficult pages".
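As a minimal sketch of this per-student relative labeling, the snippet below standardizes each student's ratings and marks pages above a threshold as difficult. The paper only states that pages with higher subjective impressions are chosen per student; the z-score criterion and its threshold are assumptions.

```python
import numpy as np

def difficult_pages(ratings, z_thresh=0.5):
    """ratings: dict student -> dict page -> subjective difficulty (e.g. 1-5)."""
    labels = {}
    for student, per_page in ratings.items():
        vals = np.array(list(per_page.values()), dtype=float)
        mu, sigma = vals.mean(), vals.std() + 1e-9
        # A page is "difficult" relative to this student's own mean and variance.
        labels[student] = {page: (v - mu) / sigma >= z_thresh
                           for page, v in per_page.items()}
    return labels
```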

4.2 Neural Network-Based Reading Pattern Modeling

We model eye movements on difficult pages using a neural network in order to find reading patterns related to subjective impressions of page difficulty. Neural networks have recently been shown to extract effective features automatically in tasks such as image classification [6]. Eye movement data consist of sequences of gaze points and contain both spatial and temporal information, which makes it difficult to design hand-crafted features for modeling them. Therefore, we choose a neural network as our model.

Our neural network estimates whether eye movement data belong to a difficult page; its output is a probability of page difficulty. As the input representation, we use a reading pattern code [12], which consists of T density maps of gaze points: the sequence of gaze points is divided into T time slots, and a density map is computed for each slot. A reading pattern code therefore preserves both spatial and temporal information. The network is trained on the difficult pages identified above, so that it models the relationships between eye movements and difficult pages.
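The following sketch shows how a reading pattern code in the sense described above could be built from a gaze sequence: the sequence is split into T time slots and a density map of gaze points is computed for each slot. The grid resolution, page size, and per-slot normalization are assumptions; [12] should be consulted for the exact construction.

```python
import numpy as np

def reading_pattern_code(gaze, T=8, grid=(16, 16), page_size=(1080, 1920)):
    """gaze: array of shape (N, 2) with (x, y) gaze points in pixels, ordered in time.
    page_size is (height, width) in pixels."""
    H, W = grid
    code = np.zeros((T, H, W), dtype=np.float32)
    slots = np.array_split(np.asarray(gaze, dtype=float), T)  # T consecutive time slots
    for t, pts in enumerate(slots):
        if len(pts) == 0:
            continue
        xs = np.clip((pts[:, 0] / page_size[1] * W).astype(int), 0, W - 1)
        ys = np.clip((pts[:, 1] / page_size[0] * H).astype(int), 0, H - 1)
        np.add.at(code[t], (ys, xs), 1.0)   # accumulate gaze points per grid cell
        code[t] /= code[t].sum()            # normalize into a per-slot density map
    return code                             # T x H x W input tensor for the classifier
```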

To interpret the relationships our model learns, we analyze which parts of the eye movements the neural network focuses on; in other words, we investigate the eye movements that lead the network to classify a page as difficult. We use layer-wise relevance propagation (LRP) [8] to compute relevance maps, which represent the contribution of each input element to the classification result. An element with a high relevance score is related to difficult pages. We therefore use the relevance maps to visualize the page areas where students found the content difficult.
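For illustration, the sketch below implements the LRP epsilon rule for a small fully connected ReLU network; the network is a stand-in, not the architecture used in [9], and the stabilizer constant is an assumption.

```python
import numpy as np

def lrp_epsilon(x, layers, eps=1e-6):
    """layers: list of (W, b) pairs for dense layers; ReLU after each hidden layer.
    Returns a relevance score for every element of the input x."""
    # Forward pass, storing the input activation of every layer.
    acts = [np.asarray(x, dtype=float)]
    for i, (W, b) in enumerate(layers):
        z = acts[-1] @ W + b
        acts.append(np.maximum(z, 0.0) if i < len(layers) - 1 else z)

    # Backward pass (epsilon rule): redistribute the output score for the
    # "difficult page" unit back to the input elements.
    R = acts[-1]
    for (W, b), a in zip(reversed(layers), reversed(acts[:-1])):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer against small denominators
        s = R / z
        R = a * (s @ W.T)                           # R_i = a_i * sum_j w_ij R_j / z_j
    return R
```

The returned relevance vector has the same shape as the flattened reading pattern code; reshaping it back into the T density maps and aggregating over time is one plausible way to obtain page-level relevance maps like those in Fig. 3, although the exact aggregation used in the original work is not detailed here.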

4.3 Evaluation

Qualitative Evaluation. Figure 3 illustrates relevance maps for pages on which more than five students found the content difficult. We compare the relevance maps with two other types of maps: gaze maps and highlight maps, which are heatmaps of the number of gaze points and of the highlights added by students, respectively. Blue indicates smaller values and red indicates larger values.

Overall, the relevance maps are similar to the highlight maps; in particular, both focus on equations. The gaze maps are spread over each page, whereas the relevance maps concentrate on specific regions within them. Therefore, we believe that our neural network learns the relationships between eye movements and the subjective impressions of pages with difficult content.

Table 1. Questionnaire about the quality of the relevance maps. n is the number of pages.

Evaluation for Modification of Teaching Materials. We investigate whether relevance maps help teachers revise teaching materials. To this end, we administered a questionnaire to the creator of the teaching materials. The relevance maps and the gaze maps were shown to the creator, who was told only that the maps were generated by two different systems; in Table 1, system A is the generator of the gaze maps and system B is the generator of the relevance maps. After viewing the maps, the creator answered two questions for each page.

According to Table 1, the relevance maps do not help the creator modify the teaching materials overall. However, since we expect teachers to modify the content of difficult pages, we focus the two questions on difficult pages. Table 2 shows the weighted averages obtained when easier pages are removed based on thresholds: a page is ignored if the number of students who found it difficult is below the threshold value. For Q2, almost all weighted averages increase, which suggests that the relevance maps can help the creator modify difficult pages.

Fig. 3. Highlight maps, relevance maps, and gaze maps for pages on which more than five students found the content difficult. (Color figure online)

5 Discussion

The number of students that can be analyzed and the level of detail of the results differ depending on the type of educational data. Clickstream-based approaches scale to many students and provide analytics at the page and course level. Eye movement-based approaches, on the other hand, can provide detailed analysis at the word level, but they are difficult to apply to more than 100 students. The same argument applies to other educational data such as electroencephalograms (EEG). We therefore discuss combining different types of educational data to perform a detailed analysis of many students, focusing on highlights and eye movement data [11].

Table 2. Results of the questionnaire when focusing on difficult pages. n is the number of pages used for calculating the weighted average.
Table 3. Averaged precision, recall, and F-measure values comparing the three maps.

To investigate the similarity between a large amount of clickstream data and a small amount of eye movement data, we collected highlights from approximately 1,200 students using the same e-book system. The students could add or delete highlights in the digital textbook wherever they found the content difficult, and highlight maps are generated from these highlights. For this discussion, we compare the relevance maps with the highlight maps and the gaze maps; the three maps are binarized for the comparison. Figure 4 illustrates the binarized highlight, relevance, and gaze maps for the three most difficult pages.

To evaluate the similarity between the different types of data, Table 3 shows the precision, recall, and F-measure values calculated by comparing the highlight maps with the two maps generated from eye movement data. According to Table 3, the relevance maps achieve higher precision than the gaze maps for the three most difficult pages. This indicates that, for difficult pages, the relevance maps contain information more similar to the highlight maps.
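A sketch of this comparison is given below, assuming the binarized highlight map is treated as the reference and the binarized relevance (or gaze) map as the prediction; the binarization threshold is an assumption.

```python
import numpy as np

def map_agreement(highlight_map, candidate_map, thresh=0.5):
    """Precision, recall, and F-measure of a candidate map against the highlight map."""
    gt = np.asarray(highlight_map) >= thresh     # reference: binarized highlight map
    pred = np.asarray(candidate_map) >= thresh   # prediction: relevance or gaze map
    tp = np.logical_and(gt, pred).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    f_measure = 2 * precision * recall / max(precision + recall, 1e-9)
    return precision, recall, f_measure
```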

Fig. 4. Binarized highlight maps, relevance maps, and gaze maps for the three most difficult pages.

We observed that the highlight maps are similar to, but not identical to, the relevance maps. Therefore, we expect that highlights and eye movements can be combined complementarily to estimate page difficulty. For instance, when a page area is detected in both a highlight map and a relevance map, that area can be treated with higher confidence than areas detected in only one of the two maps. In this study, we do not discuss reading paths within each page or other combinations of resources such as the text of memos. For example, the locations and the number of highlights may be related to the reading paths. We still need to determine how to choose related features.
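A toy sketch of this complementary combination might look as follows; the two-level confidence scheme is an assumption used only to illustrate the idea.

```python
import numpy as np

def combine_maps(highlight_bin, relevance_bin):
    """Combine two binarized maps: 2 = detected by both (high confidence),
    1 = detected by only one source (low confidence), 0 = neither."""
    both = np.logical_and(highlight_bin, relevance_bin)
    either = np.logical_xor(highlight_bin, relevance_bin)
    return np.where(both, 2, np.where(either, 1, 0))
```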

6 Conclusion

We introduced our methods for supporting teachers. In our studies, clickstream data and eye movement data were used to analyze the relationships between students' learning behaviors and their quiz scores and to estimate region-wise page difficulty. Our work helps teachers select learning-support strategies and revise teaching materials.

Each type of educational data has different advantages and disadvantages in terms of the number of students covered and the cost of data collection. To address these disadvantages, we evaluated the similarity between eye movement data and highlights in clickstream data and discussed combining different resources. In future work, we will investigate combinations with other resources, such as memos written by students. In addition, we need to develop methods for finding effective combinations and features.