Abstract
Integrating Artificial Intelligence (AI) technologies promises to open new possibilities for the development of smart systems and the creation of positive user experiences. While the acronym «AI» has often been used in an inflationary manner in recent marketese advertisements, the goal of this paper is to explore the relationship of AI and UX in concrete detail by referring to three case studies from our lab. The first case study is taken from a project targeted at the development of a clinical decision support system, while the second study focuses on the development of an autonomous mobility-on-demand system. The final project explores an innovative, AI-injected prototyping tool. We discuss challenges and the application of available guidelines when designing AI-based systems and provide insights into our learnings from the presented case studies.
Keywords
- User experience
- Artificial Intelligence
- AI and UX
- Human-AI Interaction
- Human factors
- Design
- Case studies
- Predictive prototyping
- ACT-R
- Clinical decision support systems
- Intensive care
- Autonomous mobility-on-demand
- Autonomous vehicles
1 Introduction
Trying to escape from the seemingly omnipresent acronym «AI» is almost impossible these days. Whether we look at bold promises of machine learning-based systems that claim to match all our business needs, at AI-powered predictive maintenance tools that optimize servicing, at smart recruitment applications that claim to find the best candidates out of gazillions of applicants, at self-driving cars pledging to enhance safety and convenience, or at AI-driven filters that allow intelligent replacements of skies to create truly dramatic (fake) photos – AI has become an irreplaceable buzzword to describe the impressive achievements stemming from recent advances in the development of smart machines. This marketese speech is, without doubt, not unfounded: in the past few years we have witnessed various impactful demonstrations of AI-injected systems [1] that nurtured bold expectations about the future capabilities of intelligent tools. AI is currently undergoing another hype phase—at least its third hype wave since the foundation of Artificial Intelligence as a scientific discipline at the famous Dartmouth conference in 1956. Hype naturally comes with valleys of disillusionment, and the course of AI is no exception.
The term AI winter was coined to describe a phase in which research funding and industry investment in AI decline, mostly due to disappointments in the light of excessive promises. The history of AI counts two such winters: the Lighthill report [2] heralded the first period of drought in the 1970s by pointing out the limitations of the then-current technology to meet the grand plans of early AI approaches. Initial successes of so-called expert systems in the 1980s helped to raise interest (and funding) in AI in a second wave, but the brittleness of the resulting systems—aside from their often very narrow domains—led to another decline and, subsequently, to a second AI winter. It is fair to characterize the vast majority of systems developed in these earlier periods of AI history as research demonstrators that illustrated the state of the art at their time. Applications flourishing in the current AI summer, however, have clearly left the protective walls of research labs and have long since found their way into commercial products. Speech-based systems such as Alexa and Siri have conquered our living rooms and kitchens, while machine learning approaches are now widely used to analyze large bodies of data in cloud-based computing to derive patterns and support decision making. AI is now described as a key technology of the millennium [3]. The unbridled blossoming of AI in its present third summer is facilitated by national-level research funding and gigantic industry budgets.
The broad commercial success of recent AI-injected systems comes, as a consequence, with a wide diversification of their user base. New functionalities of AI systems raise the question of how they can be used to address relevant user needs. Taming the complexity of AI applications requires appropriate interfaces for Human-AI interaction. Evaluating these systems calls for innovative approaches to understand how we can identify barriers and improve a user's experience with AI systems using formative evaluation methods. These considerations are, of course, just selected, non-exhaustive examples, but they point to a pool of questions critical to the future success of AI-based systems. In this paper, we argue that User Experience (UX) Design - as a human-centered discipline that is able to balance (potentially conflicting) requirements from users, technologies and businesses - provides a structured framework of methods to support the development of AI-based systems. To explore the relationship of AI and UX, we present three case studies from our lab and discuss the respective learnings from these projects. Other authors have already ascertained that the "field—where UX meets AI—is full of tensions" [4], and we can only agree. While it is generally accepted that a positive user experience is a core ingredient for the acceptance of a product or service (e.g. [5]), a discussion of the role of UX methods for the development of AI-injected systems has only recently seen calls to action and participation from both researchers and major journals, e.g. [4, 6, 7].
2 UX and AI
A tremendous number of attempts aiming to arrive at a universal definition of the term «Artificial Intelligence» can be found in the literature—with little consensus in their core conclusions. We certainly do not intend to add another facet to these attempts, but will use Nilsson's description (1998) as a working definition for the remainder of this paper: "Artificial Intelligence (AI), broadly (and somewhat circularly) defined, is concerned with intelligent behavior in artifacts. Intelligent behavior, in turn, involves perception, reasoning, learning, communicating, and acting in complex environments. AI has one of its long-term goals in the development of machines that can do these things as well as humans can, or possibly even better" [1, 8].
Today's AI systems—with the possible exception of very sophisticated robots—mostly do not comprehensively match all aspects of Nilsson's intelligent behavior, but typically focus on a selection of these. With regard to commercial systems, some that use the attribute «AI» in advertisements may even just refer to single "AI-powered tools", like Luminar's "AI Sky replacement" or "AI skin enhancer" in photo processing. Note that Nilsson's definition also does not require any similarity to (postulated) underlying human structures or processes when generating intelligent behavior, but takes an engineering perspective. With regard to the case studies reported in this article, the same stance is taken—with the exception of the ANTETYPE project (see Sect. 5.3), where the goal is explicitly targeted towards a simulation of human behavior based on a theory about the human cognitive architecture [9]. It is worth emphasizing that Nilsson's definition clearly surpasses a purely Machine Learning (ML) approach, which is sometimes illegitimately equated with AI (see [10]).
Although several different definitions of «User Experience» are used in the literature, they mostly converge in their core meaning. For the purpose of this paper, we consider the widely used definition put forward in [7, 11] adequate: User experience (UX) can be described as a "person's perceptions and responses resulting from the use and/or anticipated use of a product, system or service [... It] includes all the users' emotions, beliefs, preferences, perceptions, physical and psychological responses, behaviors and accomplishments that occur before, during and after use". This definition importantly emphasizes the temporal aspect of the concept of user experience: the expectations and anticipations of a prospective user contribute to the total of a user's experience with a system, just as her experience during the actual usage situation and her retrospective considerations after use do.
3 Challenges in Human-AI Interaction
Intelligent, AI-injected systems perform more and more tasks previously carried out by humans. This achievement fulfills the last part of the above-cited definition of AI by Nilsson: "machines that can do these things as well as humans can, or possibly even better" [1, 8]. If we consider this part from a broader, human factors-oriented perspective, we can describe this process as automation [12, 13]. Following the steps of human information processing, Parasuraman, Sheridan and Wickens [13] describe four corresponding functions that can be automated: information acquisition (sensory data gathering), information analysis (processing of acquired information), decision selection (choosing an option) and action implementation (execution of the selected decision). To differentiate automation levels, i.e. the degree to which tasks are performed by a machine, Flemisch et al. [14] propose an automation spectrum with five levels (manual - assisted - semi-automated - highly automated - autonomous/fully automated). While advantages such as increases in effectiveness and efficiency are prevalent throughout all of these levels, automation in general comes with a batch of well-known human factors challenges. Bibby et al. [15] argue that automated systems will always remain human-machine systems, no matter how advanced the technology gets. Bainbridge [16] notes an "irony of automation" in the sense that the role of the human operator becomes even more crucial the more advanced a system becomes. In today's cars, for example, systems like Adaptive Cruise Control and Lane Keeping Assistance are able to take over (parts of) the driving task. However, if a system limit is reached or an error occurs, the driver, i.e. the human operator, needs to understand what happened (or what is about to happen) in order to be able to take over control in potentially critical situations.
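To make the combination of the four automatable functions [13] and the five-level automation spectrum [14] more tangible, the following minimal Python sketch (our own illustration, not part of the cited models) profiles a system per function. The values chosen for the lane-keeping example are illustrative assumptions, not empirical classifications.

```python
from dataclasses import dataclass
from enum import IntEnum

class Level(IntEnum):
    """Automation spectrum after Flemisch et al. [14]."""
    MANUAL = 0
    ASSISTED = 1
    SEMI_AUTOMATED = 2
    HIGHLY_AUTOMATED = 3
    FULLY_AUTOMATED = 4

@dataclass
class AutomationProfile:
    """One level per function of the model by Parasuraman, Sheridan and Wickens [13]."""
    information_acquisition: Level
    information_analysis: Level
    decision_selection: Level
    action_implementation: Level

# Hypothetical profile of a lane-keeping assistant: sensing and analysis
# are largely automated, but decisions and actions remain with the driver.
lane_keeping = AutomationProfile(
    information_acquisition=Level.HIGHLY_AUTOMATED,  # camera-based lane detection
    information_analysis=Level.HIGHLY_AUTOMATED,     # lane-departure estimation
    decision_selection=Level.ASSISTED,               # driver decides on maneuvers
    action_implementation=Level.ASSISTED,            # corrective steering torque only
)
print(lane_keeping)
```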
The driving situation outlined above evokes automation problems that can be attributed to three main causes [17]: 1) inappropriate trust, 2) loss of manual skills and 3) insufficient situation awareness. For the scope of this paper, we focus on these causes, but want to point out that - depending on the context and scope of an application - other, more specific challenges may prevail, e.g. in terms of reliability, performance, expectancy, ethics, security (perception) or data privacy (see e.g. [18]).
3.1 Trust
Trust can be described as "a belief that something is expected to be reliable, good and effective" and as the mental state people have based on their expectations and perceptions [19]. With regard to the definition of User Experience in Sect. 2 of this paper, inappropriate trust can thus either be (a) the outcome of (positive or negative) expectations that precede actual driving and/or (b) established as a result of an earlier or the current driving experience. The level of trust towards a system depends on its reliability, its perceived usefulness and its transparency [20], i.e. the degree of its comprehensibility. For the use of intelligent systems it is essential that people's level of trust is appropriate. Neither "overtrust" nor "distrust" is desired [21], as either might eventually result in "misuse" or "disuse" [12, 13]. With increasing system experience, people calibrate their level of trust [22], i.e. they adjust their trust level to match system capabilities, which eventually leads to appropriate use [21]. Besides perceived usefulness and perceived ease of use [23], trust has a major influence on the (public and personal) acceptance of systems (e.g. [24]).
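As a purely illustrative aside (not a model from the cited literature), trust calibration [22] can be thought of as an estimate that drifts toward a system's actually experienced reliability. The following toy sketch uses simple exponential averaging and hypothetical numbers.

```python
import random

def calibrate(trust: float, outcome_ok: bool, rate: float = 0.1) -> float:
    """One calibration step: trust moves toward the experienced outcome
    (1.0 = system succeeded, 0.0 = system failed)."""
    target = 1.0 if outcome_ok else 0.0
    return trust + rate * (target - trust)

random.seed(42)
trust = 0.95            # initial overtrust, e.g. fueled by marketese promises
reliability = 0.7       # the system's (hidden) actual success rate
for _ in range(100):
    trust = calibrate(trust, outcome_ok=random.random() < reliability)
print(f"calibrated trust ~ {trust:.2f}")  # drifts toward the actual 0.7
```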
3.2 Loss of Skills
We can observe a change of the humans' role from active operating tasks to passive monitoring tasks in automated and AI-injected systems [25]. As a result, operators lose experience, training time and the associated motor and cognitive skills [17]. Since automation often takes over the 'standard case', but might eventually fail in critical situations, Bainbridge's [16] irony becomes even more paradoxical.
3.3 Insufficient Situation Awareness
Situation awareness describes "a person's state of knowledge about a dynamic environment" [26, p. 60]. It includes the perception and comprehension of elements that are part of this environment, as well as the projection of future states based on this understanding [26]. In dynamic systems control, situation awareness is a prerequisite for making adequate decisions. If situation awareness is insufficient, it is more likely that humans make wrong decisions. The human factors literature also describes this adverse state as «out-of-the-loop unfamiliarity», illustrating a situation in which the operator/user takes unnecessarily long to get back into the control 'loop' [27].
4 Design Guidelines
Besides general design processes and guidelines, e.g. [11, 28, 29], there are several attempts to provide specific guidance for designing Human-AI interactions. Recent collections, e.g. [1, 30], focus on adapting proven general guidelines from a human-centred perspective. For instance, the People + AI Guidebook by Google [30] emphasizes an explicit focus on solving an actual problem where the strengths of AI can be used to support user needs. It is mandatory to find the right balance between augmenting and automating tasks - instead of simply adding AI functions on top of existing products just because of their technological feasibility or for marketese advertisements. Furthermore, 'reward functions' of AI systems, which determine how an AI defines successes and failures, need to be designed and evaluated from various perspectives and - if possible - be communicated to users [30]. In the same spirit, Amershi et al. (Microsoft) [1] advise designers to make clear to the user what the system can do and how well it can do it. Building on the work of Horvitz [31], they propose 18 AI usability guidelines (see Table 1 for an excerpt), each complemented with a description and detailed examples. The set is split into four categories: initially (G1–G2), during interaction (G3–G6), when wrong (G7–G11) and over time (G12–G18). While Amershi et al. [1] introduce the set as generally applicable, they also note an inherent trade-off in terms of its validity for specific applications.
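As a minimal sketch (our own operationalization, not tooling proposed by [1]), the four categories can serve as the backbone of an expert-review checklist; the review notes below are hypothetical placeholders.

```python
# Phases and guideline IDs follow Amershi et al. [1]; the findings are invented.
GUIDELINE_PHASES: dict[str, list[str]] = {
    "initially": ["G1", "G2"],
    "during interaction": ["G3", "G4", "G5", "G6"],
    "when wrong": ["G7", "G8", "G9", "G10", "G11"],
    "over time": [f"G{i}" for i in range(12, 19)],
}

def report(findings: dict[str, str]) -> None:
    """Group reviewer findings (guideline ID -> note) by interaction phase."""
    for phase, ids in GUIDELINE_PHASES.items():
        for g in ids:
            if g in findings:
                print(f"[{phase}] {g}: {findings[g]}")

report({
    "G1": "System capabilities are not communicated on first launch.",
    "G9": "No efficient way to correct a wrong suggestion.",
})
```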
5 Case Studies
In this section, we present three case studies that exemplify the design of AI-injected systems in different domains, discuss the respective underlying interaction types and carve out some of the learnings we gathered. During the description of the projects, we also refer to the guidelines summarized in Table 1 to point to their usefulness. Using the model by Parasuraman, Sheridan and Wickens [13], Fig. 1 provides an overview of the studies by classifying their automation level along a continuum from low (fully manual performance) to high (full automation) in four functions: information acquisition, information analysis, decision selection and action implementation. The first case study, IMEDALytics (Sect. 5.1), is taken from a project targeted at the development of a decision support system (DSS) for individualized medical risk assessment, monitoring, and therapy management in intensive care medicine. The second project, APEROL (Sect. 5.2), focuses on the development of autonomous mobility-on-demand public buses and the required services for their operation. The final case study, predictive prototyping (Sect. 5.3), introduces ANTETYPE, a state-of-the-art user interface prototyping tool that we combined with the ACT-R cognitive architecture (see [9]) to support the prediction of human behavior based on synthetic cognitive models. We selected these examples from the pool of our current research projects not only to present cases from different domains, but also to consider different types of Human-AI interaction. While the interaction type «guardian angel» (see [32]) – resembling an automatic machine with a 'protective' character – prevails in the APEROL case, the interaction in IMEDALytics can adequately be characterized as the type «colleague». Similarly, users of the ANTETYPE prototyping tool interact with the respective AI-enhanced prediction module in a «best friend» style, where the tool acts as a partner who assists with delivering requested results.
Fig. 1. Description of our case studies using an adaptation of the model by Parasuraman, Sheridan, and Wickens [13].
5.1 IMEDALytics: Clinical DSS for Intensive Care
Personalized medicine is a research field that unites many disciplines. Its common aim is to treat patients based on their individual parameters, including their physiological constitution, gender-specific characteristics, or the results of an analysis of their genetic code. Even highly skilled professionals are unable to recognize the complex statistical interrelations between all these parameters. Algorithmic analyses, however, can successfully rise to that challenge and detect meaningful patterns in complex data sets. It is paramount to present the resulting information—and possibly identified patterns—to physicians in an unambiguous and understandable way, classifying the task as a problem of Explainable AI (XAI, see [33]). On the one hand, the healthcare staff's requirement to act rapidly under time pressure needs to be met. On the other hand, the system needs to offer enough informational depth to persuade physicians and nurses to even consider the information in the first place.
In IMEDALytics, an ongoing applied research project, we are designing an AI-based system to support high-consequence clinical decision-making in intensive care units (ICUs). This type of system belongs to the category of clinical decision support systems (CDSS). With the adoption of electronic patient records and significant advances in AI technology, CDSSs have the potential - from a technical perspective - to provide complementary insights into medical prediction, diagnosis and/or treatment choice [34,35,36,37,38,39,40]. Although CDSSs can potentially enhance the quality of care, it is noteworthy that many of them—despite considerable progress in AI technology—still fail to be adopted into clinical practice [41, 42]. An insufficient understanding of user needs due to a lack of user research, as well as deficient consideration of HCI guidelines in system design, are claimed to be the main causes for this failure of adoption [43,44,45,46,47]. Based on a critical review of CDSS papers focusing on user acceptance, Khairat et al. [45] indicate that poor workflow integration, questionable validity of systems, excessive interference by the systems and efficiency issues are often related to lower user acceptance.
Within a qualitative field study, Yang et al. [42] investigated a particular use case of a prognostic CDSS: the medical decision-making process for a ventricular assist device to partially replace heart functions. The authors likewise identified a lack of trust in the capabilities of an AI-injected CDSS to assist in difficult cases. Beyond this finding, they observed no perceived need for such support, as the observed clinicians felt that they "knew how to effectively factor patient conditions into clinical decisions" [42]. In sum, Yang et al. argued for the necessity to carefully consider the social context. There is an urgent need for designers to gain a deeper understanding of CDSSs, their (future) users and their particular contexts of use to maximize the opportunities of CDSSs.
While research within the fields of Medicine, Medical Informatics, and AI has so far mainly focused on supporting decisions for arriving at the correct diagnosis or on predicting a deterioration of a patient's state [36,37,38,39,40], our case study focuses on supporting continuous decisions for optimal therapy, more precisely volume therapy. Volume therapy is defined as infusion therapy that serves to compensate for a volume deficit inside the blood vessels. The particular challenge for treating ICU physicians is to determine the optimal, individualized indication for each patient based on medical guidelines and to administer the correct dose and the most suitable infusion solution. Incorrect therapy can result in undesired long-term consequences such as the need for long-term care or long-term ventilation. In IMEDALytics, we focus on assisting physicians in individualized medical risk assessment, monitoring, and therapy management for volume therapy.
We argue that, in order to holistically support decision-making processes in intensive care medicine, a change of perspective from classical problem solving through technology to the design of experience potentials is essential. The questions we faced during project work ranged from general questions regarding the creation of positive Human-AI interaction to specific questions on data visualization techniques:
1. How can we combine the human abilities of healthcare professionals - such as their general understanding, their previous experiences, their flexibility and creativity in the decision-making process - with the powerful possibilities of an AI-based system?
2. How can we make the diagnosis and therapy suggestions provided by the system accessible to healthcare professionals without depriving them of their self-efficacy?
3. Which design processes are needed to design an interactive interface that leads to a long-term positive UX?
4. What influence does the (type of) presented information - e.g. in the form of information visualizations - have on the perceived transparency of, or even trust in, a CDSS?
Understanding UX in Volume Therapy. To gradually address these questions and to derive solutions from the aforementioned perspective of "experiences before functionality (technologies)", we chose an experience design approach as proposed by Hassenzahl [48]. This approach focuses on the user and concentrates first on his or her experience. Experiences are analyzed using psychological needs to identify why an experience is considered positive.
At the very beginning of our project, our goal was to gain a detailed understanding of how physicians and nurses work together to make decisions around volume therapy and how CDSSs can be integrated into their daily clinical work. In particular, we wanted to understand decision-making within the specific organizational framework of an ICU and within a heterogeneous team. To gain insights into these experiences, we conducted contextual inquiries (observations and semi-structured interviews) [49] in three German ICUs [50]. We transferred our findings on workflows, situations, actions, emotions, context, and interactions that a (future) user may experience during a typical day into a user experience map [51], visualized along a time axis, thereby blending well-established UX methods and service design techniques. Against the background of the ICU context, we modified conventional user experience maps to emphasize collaboration by including two users instead of one. The goal of working with a user experience map was to aid the discovery of experience opportunities that a CDSS for volume therapy might bring. Our findings show that adapting a system's interface to both context and users facilitates collaboration and embraces interactions with a CDSS to combine human and machine intelligence [50].
Subsequent to this ethnographic approach, we chose to complement our insights with additional interviews to validate the gathered findings and to discuss initial design concepts derived from them with nurses and physicians. To this end, we are applying a method inspired by Séguin et al.'s Triptech approach [52], featuring storyboards to collect prospective users' reactions (likes/dislikes/potential use cases/questions/concerns) to early design concepts. In contrast to the Triptech approach, which is used in focus groups, we are using the storyboards in individual interviews (Fig. 2). In a first step, interviewees assess the extent to which psychological needs (see [53], e.g., autonomy, competence, security) are currently met in volume therapy. This enables us to place an increased focus on the psychological needs according to Hassenzahl et al. [53] within volume therapy and to use psychological need statements as an impulse for the presentation of first concept drafts (second step). To gather user feedback, we present three to five design concepts in this second step that address the psychological needs the interviewee prioritized in the initial step. The design concepts consist of storyboards (see Fig. 2) and help us to discuss design ideas with a focus on interviewees' experiences. Using storyboards, we enable prospective users to better imagine situations where CDSS support is desirable. In particular, we intend to gather information on how to provide proper granular feedback (G15) and on how to clearly communicate to users why the system did what it did (G11).
As the AI in the IMEDALytics case takes over the role of a 'colleague' (see [32]) supporting nurses and physicians particularly in their decision selection, considering this aspect is crucial to set the appropriate tone in the communication with users. That is, system design has to carefully take user needs and requirements into account to satisfy the mentioned guidelines by [1]. This in turn provides the preconditions to arrive at an appropriate level of trust and to facilitate system acceptance. In contrast to the 'colleague' metaphor, the AI-based system in APEROL, our second case study, can be described as an automatic 'guardian angel'.
5.2 APEROL: Autonomous Mobility-on-Demand
In autonomous mobility-on-demand (AMoD) systems, passengers are transported by robotic, self-driving cars [54], i.e. by vehicles with high or full driving automation (SAE levels 4 or 5 [55]). Due to the rapid progress in vehicle automation, such AI-driven autonomous vehicles (AVs) will soon be introduced to the public. As a result, the use of public, demand-oriented transport systems and autonomous ride sharing will become reality in our daily commuting. Since AMoD services will always be available and rely neither on scheduled timetables nor on fixed stops, they will provide spatial and temporal flexibility to passengers while increasing the efficiency and sustainability of transport systems [54, 56]. Consequently, fewer vehicles will be on our roads, both riding and parked. AMoD offers great potential to solve major challenges of today's public transport systems, e.g. regarding congestion prevention, accessibility and first/last-mile problems [54, 56,57,58]. Traffic simulations on the integration of AMoD systems in major metropolises - e.g. New York City and Singapore [54] - support this promising conclusion and provide evidence for their effectiveness and efficiency. In addition, the free time gained (due to not being engaged with the driving task) might increase our productivity or can be used for communication and relaxation [59], resulting in overall societal benefits.
Despite their advancing technical maturity, AVs face major challenges with regard to public adoption. Adoption barriers include (inappropriate) user expectations, concerns about the technology's reliability, performance and security, as well as privacy considerations—and, most important of all, trust issues [18]. To counteract these challenges, a precise understanding of people, systems, and their respective environment is essential [19]. A clear comprehension of a user's experience journey when using an AMoD system enables the thorough design of the corresponding touch points (i.e. HMIs) and Human-AV (i.e. Human-AI) interactions. Touch point design is a vital part of our publicly funded project APEROL (Autonomous, Personal Organization of Road Traffic and Digital Logistics; [60]). Having gathered a thorough understanding of the context of use through extensive user research in this project, we now focus on two main questions:
1. How can we create an enjoyable UX for (future) passengers when interacting with AVs before, during and after use?
2. How can we efficiently evaluate design concepts for the required interfaces at the respective touch points - especially in very early phases of the development?
In the next sections we provide insights on how we are tackling the aforementioned questions within the project APEROL.
Understanding UX in AMoD. In contrast to lower levels of driving automation, all occupants of AVs (SAE levels 4 and 5 [55]) are passengers who do not need to take care of the vehicle's driving at all. This situation can roughly be compared to taking a taxi. A main difference is, however, that no (human) driver, who controls the vehicle or can communicate with passengers, is present in AVs. Thus, there is no driver asking passengers where they want to go or notifying them when there is a traffic jam ahead. Instead, the AI has to take over both responsibilities. The AI-powered system conducts the primary driving tasks, i.e. navigation, steering and stabilization (see [61] for further elaboration), as well as secondary driving tasks (e.g. light control). To do this, environmental data from multiple sensor inputs (e.g. from stereo cameras, lidar and radar) is collected and analyzed in real time, applying AI-based and stochastic algorithms for object detection and tracking. The algorithms use - for instance - artificial neural networks to recognize roads, other vehicles and infrastructure (e.g. [62]) or to predict the paths of pedestrians and cyclists (e.g. [63]). Combining the sensor information with HD maps and GNSS data enables the AV to plan its movements through complex traffic environments. Even for seasoned researchers, such AI systems typically remain - at least to some extent - "black boxes" [64]. Ordinary passengers, having no knowledge about the systems' capabilities, can experience a loss of control and a corresponding feeling of insecurity, making it difficult to establish an appropriate level of trust. However, trust is considered an essential prerequisite for technology acceptance (e.g. [24]). By providing passengers with appropriate information and feedback about the AV's current state, its activities and its intentions, we intend to support trust calibration and aim to compensate for the absence of a human driver.
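The sense-plan cycle described above can be outlined as a minimal pipeline skeleton. This is our own simplified illustration with placeholder functions, not code from an actual AV stack:

```python
from dataclasses import dataclass, field

@dataclass
class TrackedObject:
    kind: str                                   # "vehicle", "pedestrian", "cyclist", ...
    position: tuple[float, float]               # (x, y) in vehicle coordinates [m]
    predicted_path: list[tuple[float, float]] = field(default_factory=list)

def perceive(camera_frame, lidar_scan, radar_scan) -> list[TrackedObject]:
    """Placeholder: a real AV fuses the sensor streams and runs neural
    detectors for vehicles and infrastructure (e.g. [62]) plus path
    predictors for pedestrians and cyclists (e.g. [63])."""
    return []

def plan(objects: list[TrackedObject], hd_map, gnss_fix) -> list[tuple[float, float]]:
    """Placeholder: combine tracked objects with HD-map geometry and the
    GNSS position to compute a collision-free trajectory."""
    return [(0.0, 0.0), (5.0, 0.0)]  # dummy straight-ahead waypoints

# One cycle of the loop: acquire & analyze -> decide -> act.
trajectory = plan(perceive(None, None, None), hd_map=None, gnss_fix=None)
```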
To foster a comprehensive understanding of future AMoD users and their needs and requirements, prospective users need to be continuously integrated into the development process [65] from early phases on. Within the APEROL project, we co-conducted a citizens' dialogue on autonomous driving with a representative sample of 76 prospective users of an AMoD service [66]. The findings of this dialogue confirmed the challenges of Human-AI interaction mentioned above and served as a foundation for design considerations and decisions.
Designing Human-AV Interactions. Strengthened by the insights from our user research activities, we consider well-designed and trustworthy systems with an enjoyable UX crucial to counteract the hurdles of AMoD adoption. Such systems inform and enable passengers (1) to understand the signals, intentions and actions of (AI-controlled) AVs, (2) to communicate their own intents and needs, and (3) to foster an adequate level of trust towards the technology. Based on our research results and their synthesis with AI design guidelines (Sect. 4), we developed two conceptual design proposals for Human-AV interaction: an in-vehicle passenger information display and a smartphone travel app. The interface proposals are still in early concept phases and are currently being evaluated in a study involving a representative user sample. By presenting these initial interface drafts we, nevertheless, hope to contribute to a discussion on the creation of an efficient and enjoyable UX for future AMoD systems.
Smartphone App. Since there are no driving-related controls (e.g. steering wheel, gas pedal) available in AVs, the main user interface in AMoD systems will probably be a (smartphone) app. Particularly following AI design guidelines G1, G3, G4, G17 and G18 (Table 1), our app concept focuses on providing users with adequate information to arrive at a profound level of situation awareness, as well as on offering control functionalities while taking a ride in a (shared) AV. Figure 3 shows three different states of the app concept's main screen during a ride. The app displays the AV's location, its planned route and traffic information in the map (Fig. 3: A) and provides - similar to hardware buttons in public buses - a "STOP" functionality (Fig. 3: B) to support efficient correction (G9; Table 1). In addition, an emergency button (see also [67]) provides direct access to customer support and emergency functionalities (Fig. 3: C).
In-Vehicle Passenger Information Display. Passenger information systems promise to increase user acceptance of and customer comfort in public transport systems [68]. Similar to the smartphone app, our in-vehicle HMI concept for a shared AV encompasses a map displaying the current location, route, planned stops as well as traffic conditions (Fig. 4: A, B, D). Furthermore, personal ticket IDs (Fig. 4: C) are displayed in a 'stop list' to anonymously communicate drop-off stops to the respective passengers. When booking a ride, passengers receive a ticket ID which then functions as an (anonymous) allocator for individual passenger information. The in-vehicle HMI enables passengers to get all required information without having to constantly monitor their smartphones, while at the same time protecting their privacy.
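As a minimal sketch of this allocation idea (hypothetical code, not the APEROL implementation), a booking service issues an opaque ticket ID per ride, and the in-vehicle display lists only pairs of stop and ID:

```python
import secrets

bookings: dict[str, str] = {}  # ticket ID -> drop-off stop

def book_ride(drop_off_stop: str) -> str:
    """Issue an opaque ticket ID; only the booking passenger sees it."""
    ticket_id = secrets.token_hex(3).upper()  # e.g. "A31F09"
    bookings[ticket_id] = drop_off_stop
    return ticket_id

def stop_list() -> list[tuple[str, str]]:
    """What the in-vehicle display shows: drop-off stops with anonymous
    ticket IDs instead of passenger names."""
    return sorted((stop, tid) for tid, stop in bookings.items())

my_ticket = book_ride("Market Square")  # hypothetical stop names
book_ride("Central Station")
print(stop_list())  # each passenger recognizes only their own ID
```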
Evaluating Human-AV Interactions. Since "autonomous ridesharing is still a theoretical subject [...,] users still lack the hands-on experience" [69], and field studies with AVs are only practicable within tight boundaries that impair the results, adequate methods and tools are needed for exploration, prototyping and evaluation (see also [7]). Such methods are, however, necessary to enable continuous and iterative evaluations of interface and service concepts. For expert-based evaluation, guidelines (e.g. [1]) are good starting points. Generally, the interaction with AVs and AMoD systems is highly context-sensitive, making the actual usage situation an essential aspect of evaluation setups. Context-based prototyping and empirical simulation studies are needed to conduct proper user experience evaluations. To meet these requirements, we constructed a simple video-based AV simulator with a CAVE-like environment. Placed in a standard office room, our AV simulator enables stakeholders and users to experience a simulated (shared) ride in an AV (see [70] for further elaboration). Initial user studies incorporating the setup [70] show promising results regarding both presence perception and its suitability for valid and context-sensitive usability testing.
5.3 AI-Based Predictive Prototyping
In the IMEDALytics case study we presented an AI-based clinical decision support system that assists physicians in individualized medical risk assessment, monitoring, and therapy management in the—clearly circumscribed—domain of volume therapy. The assumed interaction type with the system can be characterized as physicians dealing with a competent, non-human «colleague» (see [32]). Passengers of the AMoD system in the APEROL case are likely to experience the autonomous bus as a «guardian angel» that safely transports them to the location they wish to reach. In the case study discussed in this section we exemplify the interaction type «best friend»: a designer uses a prototyping tool to create interactive interface prototypes and asks an AI-injected module of the tool—her helpful «best friend»—to deliver quantitative performance predictions for given scenarios.
We have proposed such a predictive prototyping approach [71] and demonstrated how the interaction performance (e.g. in terms of efficiency) of user interface proposals can successfully be predicted by integrating generated AI models based on the ACT-R cognitive architecture [9]. A cognitive architecture embodies a comprehensive, computer-simulated scientific hypothesis about the structures and mechanisms of the human cognitive system that are regarded "as relatively constant over time and relatively independent of task" [72, p. 312]. The ACT-R framework allows the creation of models that can then be run to predict and explain human behavior. ACT-R models can interact with an environment and learn (on a symbolic and a sub-symbolic, neurally-inspired level) to adapt their behavior to the statistical structure of an environment. We have integrated ACT-R as a module in ANTETYPE, a commercial design tool for creating sophisticated, responsive UI prototypes for desktop, mobile and web-based applications [73]. ANTETYPE was designed to support a seamless transition from the development of early wireframes defining the layout of an interface, through the creation of visual design alternatives, to complex, responsive, interactive prototypes without switching between different software tools.
An ACT-R model is derived automatically in ANTETYPE's monitoring mode from observing a designer who demonstrates the interactions needed to complete a relevant key scenario with an interface prototype. If interactions depend on specific values shown in the interface, (simulated) user actions can alternatively be described using a graphical inspector interface in ANTETYPE's instruction mode. To run simulated users on a prototype, a designer simply (1) demonstrates the necessary steps to complete a task scenario and, if necessary, (2) instructs the model using ANTETYPE's instruction mode. After the designer has finished the task demonstration, an ACT-R model is automatically generated by mechanisms described in detail in [71]. The model is then run on the scenario to create a distribution of performance times for a number of trials using the respective interface prototype. In this setting, the designer interacts with the prediction module by asking a «best friend» for performance predictions: the AI-based friend then delivers the results just like a friend would do after running a study. In our case, however, the participants are generated, synthetic users and the study is run automatically for an arbitrary number of trials. Figure 5 shows an example of using predictive prototyping to comparatively predict the performance of three different interfaces for a given scenario (listening to a playlist on mobile music players, e.g. Spotify, QQMusic and a revised version of Spotify).
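To illustrate only the shape of this workflow's final step (a toy stand-in, not the ACT-R mechanism from [71]), the following Monte Carlo sketch runs hypothetical per-step interaction times for many simulated trials and reports a performance distribution per interface variant:

```python
import random
import statistics

def simulate_trial(step_times_ms: list[float], noise_sd: float = 40.0) -> float:
    """One simulated user run: sum per-step durations, each perturbed
    by Gaussian noise (a crude stand-in for cognitive/motor variance)."""
    return sum(max(0.0, random.gauss(t, noise_sd)) for t in step_times_ms)

def run_study(step_times_ms: list[float], trials: int = 500) -> tuple[float, float]:
    """Run many trials and return mean and standard deviation in ms."""
    times = [simulate_trial(step_times_ms) for _ in range(trials)]
    return statistics.mean(times), statistics.stdev(times)

random.seed(1)
# Hypothetical per-step times (ms) for two interface variants of one scenario.
variants = {"variant A": [900.0, 1200.0, 700.0], "variant B": [650.0, 1100.0, 600.0]}
for name, steps in variants.items():
    mean, sd = run_study(steps)
    print(f"{name}: {mean / 1000:.2f} s +/- {sd / 1000:.2f} s")
```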
The outlined predictive prototyping method illustrates how quantitative performance predictions (like time-on-task, from initial learning to skilled behavior) can support designers by providing quick and valid analyses of the performance consequences of design variants. Alternative design proposals can be compared with regard to defined quantitative performance metrics without the need to conduct effortful empirical usability evaluations. Predictive prototyping thus allows iteration cycles to be accelerated. It is, of course, not our goal to replace empirical usability tests. They remain an irreplaceable method to identify conceptual usability barriers or to receive qualitative information about a user's experience with a system. In fact, predictive prototyping is in some sense complementary to empirical studies since it provides a promising approach to gather quantitative performance data that is beyond the (practical) scope of usability tests in a lab. We argue that quantitative performance predictions cannot reliably be derived from empirical usability tests because (1) participants are typically not repeatedly exposed to a given test task, although skilled performance—and learning—are a function of the number of practice trials; (2) thinking aloud, as a standard requirement for participants during usability tests, interferes with the primary process of working on a task (see [74]); (3) most instructions in usability tests do not even require participants "to work as fast as possible"; and (4) participants are aware of being recorded during usability tests and might thus focus on avoiding errors instead of working as efficiently as possible on a (new) given task. By providing a solution to these objections, AI-based predictive modeling opens up new possibilities for interface designers. Initial applications of the method in real-world projects and encouraging goodness-of-fit comparisons of predicted and empirically observed user data provide evidence for the validity of the approach (see [71]).
The first and second case studies reported in this paper emphasize how the methodological apparatus of human-centered design approaches can contribute to the development of better AI-based systems, increasing the likelihood of their adoption. The case study in this section shows how the prototyping of user interfaces can directly benefit from the integration of an AI-based module that significantly enhances the scope of a prototyping tool. With regard to the guidelines by [1] (see Sect. 4), we want to especially highlight G11 (Make clear why the system did what it did) and G13 (Learn from user behavior). Learning from users (i.e. a designer demonstrating an interaction path) forms the basis of predictive prototyping. To support an understanding of why the model performs in the observed way, the module offers helpful visualization and tracing options to explain its behavior.
6 Conclusion
In order to explore the relationship of humans and AI, we presented three case studies for Human-AI interaction from our lab. These studies can, of course, only cover a small portion of the wide and 'tension-full' field where AI meets UX. The discussed challenges, guidelines, ideas and learnings might, however, be useful for further reference and exploration in other domains.
We illustrated the necessity to design understandable and trustworthy systems and the need to carefully consider contextual factors. CDSSs, for example, still lack adoption in clinical practice, although their performance and capabilities have improved considerably over the last years due to the progress in AI technology [41, 42]. We claim that a core reason for this can be traced back to a lack of acceptance caused by neglecting user requirements and context during the design process. As Lacher et al. [19] point out, it is crucial to understand people, systems and context in order to counteract the respective challenges—and this might be of particular importance when designing dynamic, machine learning-based systems.
We appreciate the rich value of AI capabilities for UX and contemplate AI as an enabler of new (product) experiences, while at the same time emphasizing the eminent role of UX methods and frameworks for envisioning and creating positive interactions between humans and AI. Established UX methods and service design techniques need to be applied and, where necessary, adapted to tackle the challenges of AI-based automation. We thus consider the relationship of UX and AI as mutually beneficial.
References
Amershi, S., et al.: Guidelines for Human-AI interaction. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, Glasgow (2019)
Lighthill, J.: Artificial intelligence: a general survey. In: Artificial Intelligence: a Paper Symposium (1973)
Glenn, J.C., Millennium Project Team: Work/Technology 2050: Scenarios and Actions, technical report, The Millennium Project, Washington (2019)
Cramer, H., Kim, J.: Confronting the tensions where UX meets AI. Interactions 26(6), 69–71 (2019)
Eden, G.: Transforming cars into computers: interdisciplinary opportunities for HCI. In: Proceedings of the 32nd International BCS Human Computer Interaction Conference (HCI 2018) (2018)
Loi, D., Wolf, C.T., Blomberg, J.L., Arar, R., Brereton, M.: Co-designing AI futures: integrating AI ethics, social computing, and design. In: DIS 2019 Companion - Companion Publication of the 2019 ACM Designing Interactive Systems Conference, pp. 381–384 (2019)
Churchill, E.F., Van Allen, P., Kuniavsky, M.: Designing AI. Interactions 25(6), 35–37 (2018)
Nilsson, N.J.: Artificial Intelligence: A New Synthesis. Morgan Kaufmann Publishers Inc., San Francisco (1998)
Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., Qin, Y.: An integrated theory of the mind. Psychol. Rev. 111(4), 1036–1060 (2004)
Vajapey, K.: What’s the Difference Between AI, ML, Deep Learning, and Active Learning? (2019)
DIN Deutsches Institut für Normung e. V.: Ergonomics of human-system interaction - Part 210: Human-centred design for interactive systems (ISO 9241-210:2010); English translation of DIN EN ISO 9241-210:2011-01 (2011)
Parasuraman, R., Riley, V.: Humans and automation: use, misuse, disuse, abuse. Hum. Factors: J. Hum. Factors Ergon. Soc. 39(2), 230–253 (1997)
Parasuraman, R., Sheridan, T.B., Wickens, C.D.: A model for types and levels of human interaction with automation. IEEE Trans. Syst. Man Cybern.-Part A: Syst. Hum. 30(3), 286–297 (2000)
Flemisch, F., Kelsch, J., Löper, C., Schieben, A., Schindler, J.: Automation spectrum, inner/outer compatibility and other potentially useful human factors concepts for assistance and automation. Hum. Factors Assist. Autom. 2008, 1–16 (2008)
Bibby, K.S., Margulies, F., Rijnsdorp, J.E., Withers, R.M.J., Makarov, I.M.: Man’s role in control systems. In: 6th IFAC Congress Boston (1975)
Bainbridge, L.: Ironies of automation. Automatica 19(6), 775–779 (1983)
Manzey, D.: Systemgestaltung und Automatisierung. In: Badke-Schaub, P., Hofinger, G., Lauche, K. (eds.) Human Factors, 2nd edn, pp. 333–352. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-19886-1_19. Chapter 19
Kaur, K., Rampersad, G.: Trust in driverless cars: investigating key factors influencing the adoption of driverless cars. J. Eng. Technol. Manage. 48, 87–96 (2018)
Lacher, A., Grabowski, R., Cook, S.: Autonomy, trust, and transportation. In: Proceedings of the 2014 AAAI Spring Symposium, pp. 42–49 (2014)
Wolf, I.: Wechselwirkung Mensch und autonomer agent. In: Maurer, M., Gerdes, J.C., Lenz, B., Winner, H. (eds.) Autonomes Fahren, pp. 103–125. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-45854-9_6
Lee, J.D., See, K.A.: Trust in automation: designing for appropriate reliance. Hum. factors 46(1), 50–80 (2004)
Muir, B.M.: Trust in automation. Part I: Theoretical issues in the study of trust and human intervention in automated systems. Ergonomics 37(11), 1905–1922 (1994)
Davis, F.D., Bagozzi, R.P., Warshaw, P.R.: User acceptance of computer technology: a comparison of two theoretical models. Manage. Sci. 35(8), 982–1003 (1989)
Carsten, O., Martens, M.H.: How can humans understand their automated cars? HMI principles, problems and solutions. Cognit. Technol. Work 21(1), 3–20 (2018). https://doi.org/10.1007/s10111-018-0484-0
Bubb, H.: Das Regelkreisparadigma der Ergonomie. Automobilergonomie. A, pp. 27–65. Springer, Wiesbaden (2015). https://doi.org/10.1007/978-3-8348-2297-0_2
Endsley, M.R., Kiris, E.O.: The out-of-the-loop performance problem and level of control in automation. Hum. Factors: J. Hum. Factors Ergon. Soc. 37(2), 381–394 (1995)
Wickens, C.D.: Designing for situation awareness and trust in automation. IFAC Proc. Vol. 28(23), 365–370 (1994)
DIN Deutsches Institut für Normung e. V.: DIN EN ISO 9241-110:2008-09, Ergonomics of human-system interaction - Part 110: Dialogue principles (ISO 9241-110:2006); English version of DIN EN ISO 9241-110:2008-09 (2008)
Nielsen, J.: Heuristic evaluation. In: Nielsen, J., Mack, R. (eds.) Usability Inspection Methods, ch. 2, pp. 25–62. John Wiley, New York (1994)
Google: People + AI Guidebook: User Needs + Defining Success (2020)
Horvitz, E.: Principles of mixed-initiative user interfaces. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1999), pp. 159–166. ACM (1999)
Alan, Y., Urbach, N., Hinsen, S., Jöhnk, J., Beisel, P., Weißert, M.: Think beyond tomorrow - KI, mein Freund und Helfer - Herausforderungen und Implikationen für die Mensch-KI-Interaktion, technical report, EY & Fraunhofer FIT, Bayreuth (2019)
Samek, W., Wiegand, T., Müller, K.-R.: Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models (2017)
McKinney, S.M., et al.: International evaluation of an AI system for breast cancer screening. Nature 577(7788), 89–94 (2020)
Gulshan, V., et al.: Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India. JAMA Ophthalmol. 137(9), 987–993 (2019)
Komorowski, M., Celi, L.A., Badawi, O., Gordon, A., Faisal, A.: The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nat. Med. 24, 11 (2018)
Krishnan, G.S., Sowmya Kamath, S.: A supervised learning approach for ICU mortality prediction based on unstructured electrocardiogram text reports. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds.) NLDB 2018. LNCS, vol. 10859, pp. 126–134. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91947-8_13
Ettori, F., et al.: Impact of a computer-assisted decision support system (CDSS) on nutrition management in critically ill hematology patients: the nutchoco study (nutritional care in hematology oncologic patients and critical outcome). Ann. Intensive Care 9(1), 53 (2019)
Tafelski, S., et al.: Supporting antibiotic therapy in German ICUs - analysis of user friendliness and satisfaction with a computer-assisted stewardship programme. Anästhesiologie und Intensivmedizin 57, 174–181 (2016)
Saeed, M., Lieu, C., Raber, G., Mark, R.G.: MIMIC II: a massive temporal ICU patient database to support research in intelligent patient monitoring. In: Computers in Cardiology, pp. 641–644, September 2002
Belard, A., et al.: Precision diagnosis: a view of the clinical decision support systems (CDSS) landscape through the lens of critical care. J. Clin. Monitor. Comput. 31, 02 (2016)
Yang, Q., Zimmerman, J., Steinfeld, A., Carey, L., Antaki, J.F.: Investigating the heart pump implant decision process: opportunities for decision support tools to help. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI 2016, New York, USA, pp. 4477–4488. ACM (2016)
McGinn, T.: CDS, UX, and system redesign - promising techniques and tools to bridge the evidence gap. eGEMs 3, 1184 (2015)
Sittig, D.F., et al.: Grand challenges in clinical decision support. J. Biomed. Inform. 41, 387–392 (2008)
Khairat, S., Marc, D., Crosby, W., Al Sanousi, A.: Reasons for physicians not adopting clinical decision support systems: critical analysis. JMIR Med. Inform. 6(2), e24 (2018)
Horsky, J., Schiff, G.D., Johnston, D., Mercincavage, L., Bell, D., Middleton, B.: Interface design principles for usable decision support: a targeted review of best practices for clinical prescribing interventions. J. Biomed. Inform. 45(6), 1202–1216 (2012)
Cai, C.J., Winter, S., Steiner, D., Wilcox, L., Terry, M.: "Hello AI": uncovering the onboarding needs of medical practitioners for human-AI collaborative decision-making. Proc. ACM Hum.-Comput. Interact. 3(CSCW) (2019)
Hassenzahl, M.: Experience design: technology for all the right reasons. Synth. Lect. Hum.-Centered Inform. 3(1), 1–95 (2010)
Beyer, H., Holtzblatt, K.: Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann Publishers Inc., San Francisco (1997)
Kaltenhauser, A., Rheinstädter, V., Butz, A., Wallach, D.: “You Have to Piece the Puzzle Together” - Designing for Decision Support in Intensive Care. In: Proceedings of the Designing Interactive Systems Conference 2020 (DIS 2020). Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3357236.3395436
Kalbach, J.: Mapping Experiences: A Complete Guide to Creating Value Through Journeys, Blueprints, and Diagrams, 1st edn. O’Reilly Media Inc., Newton (2016)
Séguin, J.A., Scharff, A., Pedersen, K.: Triptech: a method for evaluating early design concepts. In: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA 2019). Association for Computing Machinery, New York (2019)
Hassenzahl, M., Diefenbach, S., Göritz, A.: Needs, affect, and interactive products - facets of user experience. Interact. Comput. 22(5), 353–362 (2010)
Pavone, M.: Autonomous mobility-on-demand systems for future urban mobility. In: Maurer, M., Gerdes, J.C., Lenz, B., Winner, H. (eds.) Autonomes Fahren, pp. 399–416. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-45854-9_19
SAE International: J3016-JUN2018 - Surface Vehicle Recommended Practice: Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles (2018)
Spieser, K., Treleaven, K., Zhang, R., Frazzoli, E., Morton, D., Pavone, M.: Toward a systematic approach to the design and evaluation of automated mobility-on-demand systems: a case study in Singapore. In: Meyer, G., Beiker, S. (eds.) Road Vehicle Automation. LNM, pp. 229–245. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05990-7_20
Chong, Z.J., et al.: Autonomy for mobility on demand. In: Proceedings of the 12th International Conference on Intelligent Autonomous Systems (IAS 2013), vol. 293, pp. 671–682, Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33926-4_64
Hinderer, H., Stegmuller, J., Schmidt, J., Sommer, J., Lucke, J.: Acceptance of autonomous vehicles in suburban public transport. In: Proceedings of the 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC 2018) (2018)
Fraunhofer IAO and Horváth & Partners: The Value of Time - Nutzerbezogene Service-Potenziale durch autonomes Fahren, technical report, Stuttgart (2016)
APEROL i.V. PSI Logistics GmbH: APEROL - Autonome personenbezogene Organisation des Straßenverkehrs und digitale Logistik. www.autonomousshuttle.de (2019)
Bubb, H., Bengler, K., Breuninger, J., Gold, C., Helmbrecht, M.: Systemergonomie des Fahrzeugs. Automobilergonomie. A, pp. 259–344. Springer, Wiesbaden (2015). https://doi.org/10.1007/978-3-8348-2297-0_6
Sun, Z., Bebis, G., Miller, R.: On-road vehicle detection: a review. IEEE Trans. Pattern Anal. Mach. Intell. 28(5), 694–711 (2006)
Kooij, J.F., Flohr, F., Pool, E.A., Gavrila, D.M.: Context-based path prediction for targets with switching dynamics. Int. J. Comput. Vision 127(3), 239–262 (2019)
Olden, J.D., Jackson, D.A.: Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks. Ecol. Model. 154(1–2), 135–150 (2002)
Brell, T.: Aachener Bürgerdialog zum Thema autonome Mobilität (2019)
Brell, T., Philipsen, R., Ziefle, M.: Suspicious minds? - users’ perceptions of autonomous and connected driving. Theor. Issues Ergon. Sci. 20(3), 301–331 (2019)
Uber: Uber’s Emergency Button (2019)
Beul-Leusmann, S., Jakobs, E.M., Ziefle, M.: User-centered design of passenger information systems. In: Proceedings of the IEEE International Professional Communication 2013 Conference (IPCC 2013) (2013)
Philipsen, R., Brell, T., Ziefle, M.: Carriage Without a driver – user requirements for intelligent autonomous mobility services. In: Stanton, N. (ed.) AHFE 2018. AISC, vol. 786, pp. 339–350. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-93885-1_31
Flohr, L.A., Janetzko, D., Wallach, D.P., Scholz, S.C., Krüger, A.: Context-Based Interface Prototyping and Evaluation for (Shared) Autonomous Vehicles Using a Lightweight Immersive Video-Based Simulator. In: Proceedings of the Designing Interactive Systems Conference 2020 (DIS 2020). Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3357236.3395468
Wallach, D.P., Fackert, S., Albach, V.: Predictive prototyping for real-world applications: a model-based evaluation approach based on the ACT-R cognitive architecture. In: DIS 2019 - Proceedings of the 2019 ACM Designing Interactive Systems Conference, pp. 1495–1502 (2019)
Howes, A., Young, R.M.: The role of cognitive architecture in modeling the user: Soar's learning mechanism. Hum.-Comput. Interact. 12(4), 311–343 (1997)
Ergosign GmbH: Antetype.com (2020)
Wallach, D., Scholz, S.: Thinking aloud: foundations, prospects and practical challenges. In: Klopp, J., Schneider, F., Stark, R. (eds.) Thinking Aloud - The Mind in Action. Weimar: Bertuch (2019)
Acknowledgements
This work has been funded by the German Federal Ministry of Education and Research (BMBF) under the grant numbers 13GW0280B and 02L15A212 as well as by the German Federal Ministry of Transport and Digital Infrastructure (BMVI) under the grant number 16AVF2134A.