Urbano, an Interactive Mobile Tour-Guide Robot

2008, Advances in Service Robotics

Diego Rodriguez-Losada, Fernando Matia, Ramon Galan, Miguel Hernando, Juan Manuel Montero and Juan Manuel Lucas
Universidad Politecnica de Madrid, Spain

1. Introduction

Autonomous service robot applications can be divided into two main groups: outdoor and field robots, and indoor robots. Autonomous lawnmowers, de-mining and search and rescue robots, Mars rovers, automated cargo, and unmanned aerial and underwater vehicles are some applications of field robotics. The term indoor robotics usually applies to autonomous mobile robots that move in a typical populated indoor environment. Robotic vacuum cleaners, entertainment and companion robots, and security and surveillance applications are some examples of successful indoor robot applications.

Probably one of the first real-world applications of indoor service robots has been that of mobile robots serving as tour guides in museums or exhibitions. This is an extremely interesting application for researchers because it allows them to advance in fields such as autonomous navigation in dynamic environments, human-robot interaction, and indoor environment modelling with simultaneous localization and map building, while also serving as a showcase for attracting the general public as well as possible investors.

We have developed our own interactive mobile robot called Urbano, especially designed to be a tour guide in exhibitions. This chapter describes the Urbano robot system, its hardware and software, and the experience we have gained through its development and use up to its current mature stage. This chapter is not intended to be an exhaustive technical description of algorithms or of mathematical or implementation details, but just an overview of the system. The interested reader is referred to more specific bibliography for these details.
The rest of the chapter is structured as follows. This section presents the related work, other existing systems, and our motivation to develop our own robot. Section 2 presents an overview of Urbano, the description of its hardware and the software components in which the robot control is structured. These components are described in the subsequent sections: Section 3 describes the feature based mapping and navigation subsystem, while the interaction capabilities, including our own proprietary voice recognition and synthesis engine, are described in Section 4. Section 5 briefly describes the web based remote visit that Urbano is also able to perform. The integration of all these components is managed through a programmable kernel that allows a high level management of all modules, described in Section 6. The chapter ends with the presentation of some successful real deployments of Urbano in Section 7, and our conclusions in Section 8.

1.1 State of the art

As previously stated, Urbano is not the first mobile tour-guide robot. There have been many others that have served as a reference for the mobile robotics research community. Probably the first one was the Rhino robot (Burgard et al., 1999) from Bonn University, followed by Minerva (Thrun et al., 1999) from Pittsburgh, CMU and Bonn Universities. These robots brought important advances in mobile robot mapping, localization and reactive control, using commercial robot platforms as a hardware base for developments. Another approach was the Sage robot (Nourbakhsh et al., 1999), which pursued a more commercial vision that was later realized by Mobot Inc. The Sage robot uses artificial coloured landmarks in the environment to achieve robust navigation. These robots were later followed by derivative works such as the robotic assistants for the elderly called Flo and Pearl (Montemerlo et al., 2002).
These robots were afterwards followed by many others from different universities and research centers around the globe, such as Albert from Freiburg University (Germany), Lefkos from Forth institute (Greece), Tito from Valladolid University-Cartif (Spain) and Carl from Aveiro University (Portugal), just to cite some examples. All of them were interactive mobile robots built to serve as tour guides. Recently, several other robotic tour guides have been developed and commercialized by companies, such as the BlueBotics RoboX. Eleven RoboX robots took part for 5 months in the Robotics exhibition at the International Expo 02. Industry giants have also built their own robotic guides, such as the Toyota TPR-Robina operating in the company headquarters. Likewise, Fujitsu developed the Enon robot, which served as a guide in the Kyotaro Nishimura museum.

1.2 Previous work

Our initial trials with interactive robots were carried out with Blacky (Fig. 1), a robot for tour-guiding, tele-visit and entertainment. We developed navigation algorithms focussed on indoor, populated, complex and weakly structured environments.

Fig. 1. Blacky robot in a fair

Blacky was an MRV4 mobile platform from Denning Branch, Inc., with a ring of sonars, a three wheeled synchro-drive system, a radio link for wireless Ethernet connection, a horizontal rotating laser called LaserNav able to identify up to 32 different bar coded passive landmarks, and loudspeakers for voice synthesis. The robot implemented reactive behaviours such as follow corridor, go to point, escape from minimum, border by the right or the left, or intelligent escape, as well as tasks such as walk along this corridor or take this corridor in this direction. As the robot moved autonomously in a populated public environment making oral presentations and guided tours, interaction with people was an important point. Predefined sentences were used for greetings, welcomes, self presentations or asking people to clear the way.
A very simple web server was also developed to allow remote users to operate the robot. Blacky worked in long-term experiments and was tested in exhibition-like contexts, where the point of view of the exhibition organizers was also taken into account. Lessons learnt from that experience shaped the subsequent research of our group. Further details about the Blacky robot can be found in (Rodriguez-Losada et al., 2002).

1.3 Motivation

Our main research line is the development of autonomous navigation algorithms for mobile robots, especially focusing on the Simultaneous Localization and Map Building problem (Rodriguez-Losada et al. 2006a; 2006b; 2007; Pedraza et al., 2007). This research line was partly motivated by the fact that the setup of Blacky required a time consuming manual installation and measurement of landmarks. Nevertheless, the goal of Urbano is more than having a platform to evaluate our navigation algorithms. We also wanted a platform that could serve to present our advances to the general public, to help us get funding for our projects and, very importantly, to attract people and students to get involved in the research programmes of our group. We think that we have succeeded in these goals: Urbano has helped to present our research and publish our results, we have obtained increasing marks and funding from Spanish Government research programmes, and the number of people in our group has also increased. It could be said that Urbano is the spirit of our group.

2. System overview

This section describes the Urbano hardware, both the commercial base and our own developments: a mechatronic face and a robotic arm for gestures. The general structure of the software modules is also presented. Later sections give details about these software modules.

2.1 Urbano hardware

The Urbano robot (Fig. 2) is a B21r platform from iRobot, equipped with a four wheeled synchrodrive locomotion system, a SICK LMS200 laser scanner mounted horizontally on the top and used for navigation and SLAM, and a mechatronic face and a robotic arm used to express emotions such as happiness, sadness, surprise or anger. The robot is also equipped with two sonar rings and one infrared ring, which detect obstacles at different heights and can be used for obstacle avoidance and safety. The platform also has two onboard PCs and one touch screen. These PCs are mainly dedicated to accessing the hardware: low-level control of the base, interfacing the laser rangefinder, controlling the arm and face, and performing voice synthesis and recognition.

Fig. 2. Urbano, our interactive mobile robot

The hardware architecture is completed with two off-board PCs, as shown in Figure 3. Communication with them is implemented via wireless Ethernet. One of the external computers is dedicated to the system kernel, which handles the coordination of all modules. This system kernel can, however, also run on one of the onboard PCs, leaving the external one as a simple monitoring and supervision tool that could even be switched off.

Fig. 3. System hardware architecture. (Diagram blocks: camera and video emitter, wireless video link, video compressor and http server, web server, WWW, LAN, wireless Ethernet access point, onboard PC1 and PC2, kernel or supervisor.)

The second PC is fully dedicated to the web server, which communicates with the kernel via Ethernet and the TCP/IP protocol. The web server acts as the interface between the robot and the world, allowing remote users to connect, operate the robot and visualize dynamic information coming from its sensors. We can also keep the web server at our laboratory, while the rest of the equipment is physically present at the exhibition site.
The video stream is served through a dedicated http video compressor and server, whose output is simply redirected by the web server.

2.2 Robotic face

The robot was conceived to interact with people. In fact, all the environments in which the navigation tests were done were crowded with people. In order to achieve a satisfactory interaction with the public, the design and implementation of a robotic face was strictly necessary. People find in it a point of attraction to look at when talking to the robot. At the same time, the face allows the robot, in combination with the voice, to express basic emotions.

The first step was the face design, analyzing its ability to express emotions and taking into account that the design should be simple enough to be built by ourselves. Other existing robot faces were also analyzed, like Kismet (developed by MIT), which was too complex for our purposes. The face of Albert, a robot from the University of Freiburg, was finally our reference.

In an initial version we developed a face with 5 degrees of freedom: 2 to control the mouth, one for each eyebrow and another one for closing both eyes. The actuators were Futaba S3003 servomotors. In this version the eyes were completely static, the mouth could not be opened and both eyelids had to be moved together. Nevertheless, as shown in Figure 4, it was perfectly able to show basic emotions.

Fig. 4. Urbano face initial version with 5 degrees of freedom, showing happiness, sadness, being neutral, angry or asleep.

A simple board, the Mini SSC II, allows the servos to be controlled easily through a serial port, just by sending strings of three characters. This controller has low power consumption and small dimensions. This face is the one currently mounted on Urbano. Nevertheless, we have recently developed a new one (Fig. 5) with increased interaction capabilities.
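As an illustration of the three-character servo commands mentioned above, the following minimal sketch builds Mini SSC II frames, assuming the board's usual protocol of a sync byte (255), a servo number and a position (both 0 to 254). The channel assignment in FACE_CHANNELS and the function names are hypothetical and are not taken from the Urbano software.

```python
SYNC = 255  # sync marker that starts every Mini SSC II command


def ssc_frame(servo, position):
    """Build one 3-byte command: sync marker, servo number, position."""
    if not 0 <= servo <= 254:
        raise ValueError("servo number must be in 0..254")
    if not 0 <= position <= 254:
        raise ValueError("position must be in 0..254")
    return bytes([SYNC, servo, position])


# Hypothetical wiring of the 5-degree-of-freedom face (assumption for illustration only).
FACE_CHANNELS = {
    "mouth_left": 0,
    "mouth_right": 1,
    "left_eyebrow": 2,
    "right_eyebrow": 3,
    "eyelids": 4,
}


def expression_frames(pose):
    """Concatenate the commands for a pose given as {channel name: position}."""
    return b"".join(ssc_frame(FACE_CHANNELS[name], pos) for name, pos in pose.items())


if __name__ == "__main__":
    # Example pose with made-up positions; the bytes would be written to the serial port.
    print(expression_frames({"mouth_left": 200, "mouth_right": 200, "eyelids": 20}).hex())
```

The resulting bytes can be written to any serial port object; the baud rate to use depends on how the board is configured.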
We have increased the number of servos to eleven. Four of them control the mouth, which can be opened and closed, simulating speech in a much more realistic way. Four servos are used to move both eyes left and right independently (so the robot can go cross-eyed) and to close each eyelid independently for winking. Another servo moves both eyes together up and down. The eyebrows are controlled by two more servos. Jamara Mini-blue and Micro-blue servos are used in this version.

Fig. 5. Urbano face with 11 degrees of freedom, showing happiness, sadness, being neutral, angry, and moving the mouth.

2.3 Wired robotic arm

Since the beginning of the Urbano development, the need for a gestural interaction system between human and robot was considered. While showing the programmed tour without gestural communication, the attention of the visitors is easily lost due to the rigid interaction between the robot and the visitors. With the inclusion of a robotic arm in the system, a more direct interaction with the environment is achieved. Moreover, it increases the ability to attract attention and helps to give emphasis and to add emotional aspects to the speech. Clearly, the use of gestures makes the interaction more natural, friendly and attractive.

All this implies that the robotic arm must have a set of specific qualities and characteristics. Because it must reflect the common gestures of a speech, the structure, proportions and dimensions of the robotic arm should be similar to those of the human arm. The arm movements should have similar dynamics, which requires the movements to be free of stiffness, quick and as natural as possible. As a consequence, absolute accuracy and repeatability are not significant within a certain range, since relative motions are more important for gesticulating than absolute positions.

Urbano is conceived as a tour guide robot. Therefore it moves close to people and in a non-structured environment.
Safety issues are especially relevant both for the humans and for the robotic system. Due to its application, the system even has to allow contact with people without risk to them or to the robot hardware. However, as stated before, the robot arm must move with agility, and fast movements become more dangerous as the mass of the arm increases. As a consequence, a major requirement of the robot arm is that it has to be as light as possible. Reducing the inertia simplifies the actuators and reduces the safety problems. Moreover, the actuation system should be somewhat reversible, so that if the arm is moved manually it allows such movement or adopts a compliant behaviour.

The solution adopted to meet these requirements is to take the drives out of the arm and place them in the base of the robot, which is the equivalent of the shoulder blade. Placing the actuators in this way entails the problem of transmitting the mechanical power to the different joints, and in particular to the elbow. Figure 6 shows the robot arm kinematics. It has four degrees of freedom (dof), three in the shoulder articulation and the fourth in the elbow.

Fig. 6. Kinematics and joint drive pulleys of the robotic arm. (Panels: arm structure, with shoulder, elbow and wrist; shoulder detail, with the 1st to 4th pulleys.)

Fig. 7. Left) Schematic representation of the drive cables that are going through the shoulder. Right) Picture of the robot shoulder blade. (Labels: 1st to 4th actuators, 1st to 3rd pulleys, set of conducting pulleys, cables to the elbow.)

A cable based transmission system has been chosen as the solution rather than gears. Using a gear based transmission system would be more expensive and complex from the mechanical and control point of view.
In such systems a fine and complex control loop has to be used in order to cancel the coupling among the different joints, due to the effect that a joint movement has on the subsequent articulations.

Each joint has a pulley to which two wires are attached. These cables turn the joint in opposite directions, and therefore the length of wire that is wound must be the same as the length that is released. Wires that run through the articulations must not be affected by the joint movements. This can be accomplished by routing the cables through the axes of the previous joints. An added difficulty is the emergence of a friction that increases exponentially with the number of turns that the cable makes. Therefore, all the turns are made through a set of small polyester pulleys, because of their low friction coefficient with nylon.

Figure 7 (left) shows a scheme of the different drive cables that go through the shoulder. The two drive cables for the elbow articulation have to go through the three previous joint axes. Therefore they are routed by three different sets of pulleys, as represented in the figure.

Finally, the winding and releasing of the wires is done by four servo based drive units. Figure 7 (right) shows these units placed on the shoulder blade of the robot. In order to keep the cables tensed, each unit tightens them with an adjustable spring attached to the servo and the winding pulley. Given that absolute accuracy is not a major requirement, the position feedback is done in the drive itself, not in the joint. Therefore there are no electronic components on the arm, and no signal or power wires either.

Fig. 8. Several video frames captured during a speech of Urbano.

The actuator units are controlled through a microcontroller based control board, which is linked through a serial RS232 connection to the onboard robot computer.
This computer runs a server process that is responsible for executing the different commands received through a TCP/IP socket. Figure 8 shows several video frames captured during a speech of Urbano. Predefined sequences of joint movements are able to express messages such as "this", "goodbye", "at your command", "everybody", etc.

2.4 Software components overview

The software is structured in several executable modules (Fig. 9), connected via TCP/IP, to allow a decoupled development by several teams of programmers. Most of these executables are conceived as servers or service providers: the face control, the arm control, the navigation system, voice synthesis and recognition, and the web server. The client-server paradigm is used, the only client being a central module that we call the Urbano kernel. This kernel is responsible for managing the whole system, issuing requests to the services based on the input data defined by the exhibition database and on the established robot behaviour, which is defined in a high-level programming language that will be described later.

Fig. 9. Urbano software control modules overview. (Diagram blocks: face control, arm control, voice synthesis, voice recognition, navigation, web server, supervisor interface, exhibition database (DB), Urbano behaviour and the kernel, connected via TCP/IP.)

The supervisor interface acts as a client of the kernel, which relays all necessary information to the user. Although this is the common use, the supervisor is also able to connect directly to the server modules to check low level functionality.

3. Feature based mapping and navigation

We realized from our experiences with Blacky that automatic map building was required for an easy deployment of Urbano in new environments.
The analysis of the exhibitions and of the setup procedure indicated that a feature based approach could probably achieve better results and more robustness, both in the mapping procedure and in the later localization in the built map. We noted that the environments were rich in representative geometric entities, mainly straight walls, but they were also crowded, because the setup procedure had to be performed while the exhibitions were open to the public.

The most widespread approach for feature based SLAM is the EKF algorithm, but this filter is difficult to apply when the features of the environment cannot be completely observed, e.g. when a wall is only partially observed because of occlusions. Most of our recent research has focused on the feature based SLAM problem under an EKF approach. We developed our own version of the SPMap algorithm (Castellanos et al., 1999), which is probably the best existing solution to handle the problem of partial observations. Our algorithm (Rodriguez-Losada et al., 2006a) efficiently handles the edge information, which is extremely important when navigating in corridors.

We soon realized that the SLAM-EKF filter was quite optimistic due to the intrinsic inconsistency (Rodriguez-Losada et al., 2007) that arises from the EKF linearizations. We proposed the use of perfectly known shape constraints (parallelism, orthogonality, colinearity) between segments of a map to reduce the angular uncertainty of the robot, which is the main source of linearization errors (Rodriguez-Losada et al., 2006a; Rodriguez-Losada et al., 2007). With this solution, medium size maps with loops can be built in real time, which is more than enough for all the environments where Urbano has been deployed. Nevertheless, we also developed an algorithm based on the use of local maps (Rodriguez-Losada et al., 2006b) that allows multirobot mapping of large environments in real time.
The setup procedure is usually performed with a laptop connected to the robot base and used to drive the robot manually around the environment while the SLAM-EKF algorithm runs, building in real time the map that is shown to the operator. Nevertheless, the system can also be used for remote exploration and autonomous return, in a fashion similar to (Newman et al., 2002), as we showed in (Rodriguez-Losada et al., 2007).

Once the map is built, it is downloaded to the robot, which can then automatically start a simple pose tracking algorithm. This continuous localization or pose tracking is just a simplified version of the SLAM-EKF algorithm, with the map of the environment considered as perfectly known and static. Thus the estimation is only done over the robot position and orientation, resulting in a fast and robust algorithm.

Path planning in a feature based map is not recommended, as not every obstacle is represented in the map. Grid maps could be used, but the problem of obstacles at different heights still remains. Consider the existence of tables, stairs, fences, etc., which are basically undetectable by the robot perception system. The only way to achieve safe navigation is to constrain the robot to certain areas supervised by the installer of the system.

We used a graph based approach. While exploring, Urbano automatically builds a graph of the environment, deploying nodes in the virtual map that are connected by branches only when moving between them has been shown to be possible. Path planning is computed in this graph with an A-star heuristic, giving as a result an ordered sequence of nodes or waypoints to the final goal. The reactive controller moves the robot to the next waypoint with a simple regulator, while also avoiding obstacles by deviating from the direction provided by the regulator. Safety is obtained by permitting only a limited distance to the current branch.
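The waypoint planning described above can be sketched as follows. This is a minimal A-star search over a graph of nodes and branches, assuming Euclidean branch lengths as costs and straight-line distance to the goal as the heuristic; the node names, coordinates and the function name a_star are hypothetical and do not come from the Urbano navigation code.

```python
import heapq
import math


def a_star(positions, branches, start, goal):
    """Return the ordered list of waypoints from start to goal, or None if unreachable.

    positions: {node: (x, y)} in map coordinates
    branches:  iterable of (node_a, node_b) pairs, traversable in both directions
    """
    def dist(a, b):
        (xa, ya), (xb, yb) = positions[a], positions[b]
        return math.hypot(xa - xb, ya - yb)

    # Build an adjacency list from the undirected branches.
    neighbours = {node: [] for node in positions}
    for a, b in branches:
        neighbours[a].append(b)
        neighbours[b].append(a)

    open_heap = [(dist(start, goal), 0.0, start)]  # (cost + heuristic, cost, node)
    best_cost = {start: 0.0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry for an already expanded node
        if node == goal:
            path = []
            while node is not None:  # walk back through the parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for nxt in neighbours[node]:
            new_cost = cost + dist(node, nxt)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(open_heap, (new_cost + dist(nxt, goal), new_cost, nxt))
    return None


if __name__ == "__main__":
    # Made-up exhibition layout for illustration.
    positions = {"entrance": (0, 0), "hall": (4, 0), "stand_a": (4, 3),
                 "stand_b": (0, 3), "storage": (9, 9)}
    branches = [("entrance", "hall"), ("hall", "stand_a"), ("entrance", "stand_b")]
    print(a_star(positions, branches, "stand_b", "stand_a"))
```

Because the robot is only allowed to travel along branches, a node without branches (such as "storage" in the made-up layout) is simply unreachable, which matches the idea of constraining the robot to supervised areas.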
Usually, the graph computed by the robot is not enough to allow guided tours, so a graphical user interface allows the installer to add, delete, edit and move nodes and branches, as well as to assign tags to places, which can be used to identify the particular exhibits Urbano can show.

A GUI application has been developed to allow the supervision of the map building procedure and of the navigation performance. The SLAM and navigation kernel has been implemented in portable C++ for efficiency, and the interface (Figure 10) has been developed as a multidocument-view MFC application, using OpenGL for 3D rendering. This application has proved to be of critical importance for an easy deployment of Urbano.

Fig. 10. Map building and navigation GUI tool.

This navigation software has also been used in a different robot: Guido, the robotic smart walker of Haptica Ltd. (Dublin, Ireland), an assistive walker to support and guide the frail blind elderly. The feature based mapping and navigation approach proved to be an improvement in the control of Guido, as shown in (Lacey & Rodriguez-Losada, 2008).

4. Interactive subsystem

4.1 Interaction capabilities

As described above, Urbano possesses several features that can be used for interacting with people. If we conceive Urbano as a system, the interaction capabilities can be classified into inputs and outputs.

Outputs:
- The robotic arm is only able to perform gestures; no force feedback is available. Thus the arm is not able to sense the environment or feel any contact. Although this would be a very interesting feature, it would also be quite complex and expensive.
- The face is able to show basic emotions, to move the mouth while speaking and to direct the eyes to any point.
- The robot base itself is an element that can interact with people. It can move faster or slower, turn to look at the closest person, and perform basic movements such as steps, nodding and quick rotations, which can be used to complement the interaction.
- The voice synthesis is the most powerful and versatile output, being able to transmit any kind of information but also to change voice parameters (volume, speed, tone) and to speak with different emotional pronunciation. On the other hand, it also requires more complex handling.

Inputs:
- Voice recognition is the main input for interaction, despite its complexity. Both the difficulty of understanding the speaker in a noisy environment like an exhibition and the management of textual information make a general dialog manager impossible. Nevertheless, it is still quite a powerful tool when the dialog is managed by the robot.
- The robot navigation system provides useful information about close obstacles and people blocking its path, which can easily be used for initiating interaction.
- The face webcam is used for automatic face tracking with the robot eyes (Figure 11), using a simple threshold of the image in the hue space plus a geometric analysis of the binarized image.
- Some other information can be used to modify the interaction with the user, such as the battery level, which can be associated with fatigue, or the time taken to perform a task, which can produce stress in Urbano.

Fig. 11. Face detection for tracking with the robotic eyes.

4.2 Proprietary voice synthesis and recognition

In order to provide Urbano with an appropriate human-robot interface, a speech synthesizer and a speech recognizer had to be designed and developed. The proposed interface allows a natural but reliable speech dialogue between visitors and the robotic guide.
Although speech technology has progressively become a mature engineering area with several commercially available products, the development of robust applications in real-life, ever-changing environments is still a topic of extensive research.

4.2.1 Speech recognition and understanding

Commercial speech recognition products are mainly oriented to classical speaker-dependent dictation (products developed by Dragon Systems and IBM), telephone-based systems (a market currently dominated by Nuance) or restricted-domain applications (Philips has developed several products, mainly for hospitals). These systems come with limitations, as they cannot be used in open-access museums or trade fairs without a significant reduction in their performance, because human spontaneity and limited linguistic coverage minimize the potential benefits of commercial products (Fernandez et al, 2006).

In addition to this, available systems do not provide automatic speech understanding, but just speech recognition. The mobile robot needs procedures to extract concepts and values from the text output by the recogniser, and must be able to cope with recognition errors and ambiguities.

Fig. 12. Speech processing architecture in Urbano

Finally, commercial products generally do not provide a confidence measure on the result of the recognition process, a measure that allows robust behavior in noisy working conditions, for example when there are many children surrounding the robot.

Considering all these limitations, we have developed speech recognition software customized for use with a robot, just as we have developed adapted modules for an air-traffic control domain or for controlling a HIFI system (Cordoba et al, 2006).
The standard speech recognition technique (Hidden Markov Models) is based on stochastic modeling of each phoneme in its context and on a trainable language model (bigram) that uses the probability of two words being uttered consecutively in the specific application domain. We have trained our system with a 4000-speaker speech database to achieve speaker-independent models. In a 500-word command-and-control task, the word accuracy of the recogniser is greater than 95% (word accuracy takes into account speech recognition errors due to word substitutions, insertions and deletions). Although any microphone can be used successfully, close-talk head-mounted microphones are the best choice, due to their immunity to ambient noise (which can be high in children-oriented museums, for example).

To improve the recognition performance for certain special speakers, a speaker-adaptation module has been included. This module significantly improves on the general models trained with 4000 speakers. Error reduction can be as high as 20%, especially for female speakers.

As speech recognition is only the first component of the speech processing, we have also developed an automatic speech understanding module. In order to adapt the system to a new exhibition or trade fair, we must provide the system with a set of sample sentences that should be recognized by the robot, and the set of concepts and values involved in each sentence. The system automatically learns a set of understanding rules by induction. The rules that are learnt can convert the recognized speech into the suitable sequence of concepts and values, without the need for a human expert. Nevertheless, if the set of examples is small, new rules can be added manually.
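To make the word accuracy figure quoted above concrete, the following sketch computes it for one sentence under the usual definition, (N - S - D - I) / N, where N is the number of reference words and S, D and I are the substitutions, deletions and insertions of the best alignment. It is an illustrative assumption about how such a score is obtained, not the evaluation code used for Urbano, and the example sentences are made up.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy of a recognised sentence against its reference transcription.

    Uses a word-level edit distance, whose value is the minimum total number of
    substitutions, deletions and insertions. The reference must not be empty.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion of a reference word
                curr[j - 1] + 1,                       # insertion of an extra word
                prev[j - 1] + (ref_word != hyp_word),  # substitution, or match at no cost
            ))
        prev = curr
    errors = prev[-1]
    return (len(ref) - errors) / len(ref)


if __name__ == "__main__":
    # One deletion ("the") and one substitution ("hall" -> "hole") in a 6-word reference.
    print(word_accuracy("take me to the main hall", "take me to main hole"))
```

Note that, with this definition, many insertions can push the score below zero, which is why word accuracy is a stricter measure than the fraction of correctly recognised words.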
4.2.2 Emotional speech synthesis

While in recent years many speech synthesizers have managed to achieve a high degree of intelligibility, one important problem remains: the inability to simulate the variability in human speech conveyed by factors such as the emotional state of the speaker.

The approach of this work is based on formant synthesis, including four primary emotions, namely happiness, sadness, anger and surprise, as well as a neutral state. Although this approach produces less natural speech than other approaches such as concatenative synthesis, it provides a high degree of flexibility and control over acoustic parameters.

To improve on the results of previous approaches, we have optimized the prosodic models that mimic the rhythm and intonation of the reference professional speaker we recorded. Each emotion is prosodically modeled as a deviation from the neutral way of speaking. To simulate sadness we have included an artificial tremor that, although not used by the professional actor, has significantly increased the identifiability of this emotional synthetic speech.

Our actor simulated cold anger (instead of hot anger), which is a very controlled but menacing emotion. To simulate this kind of anger, he created a special noise during the articulation of most of the sounds, without modifying the quite neutral prosody. This non-prosodic anger is very difficult to simulate completely with formant synthesis. In this case, the significant improvement was obtained by combining an artificial articulation noise with an intensity pattern that progressively simulates hot anger.

Finally, to improve happiness, the most difficult emotion, we have increased the amount of high frequency content in the synthetic speech, to provide it with a richer sound that is easily associated with a happy state.

The simulation of emotions in synthetic speech was tested by a group of 24 non-trained listeners.
The confusion matrix obtained for the 25 sentences that composed the test is showed in next table. Identified emotion (%) Simulated Happines Emotion s 53.9 Happiness Cold Anger 7.0 Surprise 17.4 Sadness 0.0 Neutral 1.7 Cold Anger 9.6 70.4 2.6 1.7 3.5 Surprise Sadness 20.9 14.8 79.1 0.0 2.6 0.0 2.6 0.0 87.0 7.8 Neutral Other 7.8 3.5 0.0 10.4 83.5 7.8 1.7 0.9 0.9 0.9 Table 1. Confusion matrix from emotion identification experiments on speech synthesis We can observe that all the emotions present a recognition level above 50%, and for all the emotions with the exception of happiness, this level exceeds 70%. The mean identification rate in the new perceptual test is 75%, in this semantically-neutral short-sentence emotion identification experiment. When compared to previous formant-based results on the Spanish work package in VAESS project, (Montero et al, 2002) the improvement ranges from 65% for anger, 42% for neutral, 15% for happiness, to just 5% for sadness. These results are even better (>65%) for the last 10 sentences that composed the test, in spite of the fact that listeners did not receive information about the identification success or failure they were getting Urbano, an Interactive Mobile Tour-Guide Robot 243 4.3 Emotional manager Many investigations in the emotional model area have been done and many others are currently under way. It is quite a new field and it involves many different sciences, for that reason it is not common to find fix structures for studying or developing artificial emotional models. One of the most significant studies is (Picard, 1997). From a pure scientific point of view, emotional models are studied in psychology, neuroscience, biology, etc. Humaine Network of Excellence (http://emotion-research.net) aims to create an investigation community to study emotions in the frame of human-robot interaction. Fig. 13. 
Emotional state model

In order to approximate the human emotional system more closely, the Urbano model makes use of dynamic variables to represent the internal emotional state. The model follows the classic diagram shown in Fig. 13, in which the system stimuli u(k) are the input variables, the emotions x(k) the state variables, and the task modifiers y(k) the output variables. The following paragraphs introduce the concepts used to build the emotional model more precisely.

To define an emotional state in a human being, an emotion and its magnitude are used. For example, the statement “I am very happy” includes qualitative information, the emotion “happy”, and quantitative information, expressed by a term that gives an idea of the intensity of the emotion, “very”. Based on that, the emotional state at time k is defined as the set of considered emotions with their intensity levels. The intensity level of each emotion changes continually, giving dynamics to the emotional state. The emotional state tends naturally to a nominal emotional state in which a balance of emotion intensities exists. An emotion is an internal variable.

A system stimulus is any event that influences the system, producing a change in the emotional state. Many events may stimulate the system; the only limitation is the system's ability to sense them (sensors, cameras, etc.). Robotic stimuli may be internal or external. An example of an internal stimulus is the life or energy the robot has, usually considered as the battery state.

Urbano has scheduled tasks, and this schedule can be modified because of the instantaneous emotional state. All these changes are considered system task modifiers. An example of a scheduled task is the tour of a museum that a guide robot has to lead. Modifiers for this task could be the tour tempo, the information to give, the jokes used to build a better connection with the public, etc.
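The input, state and output roles of u(k), x(k) and y(k) can be illustrated with a short Python sketch. It assumes the standard linear discrete-time state-variable form x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k), with x measured as a deviation from the nominal emotional state so that, without stimuli, the state decays back to nominal. The two emotions, the two stimuli, the single task modifier and every numeric coefficient below are invented for illustration and are not Urbano's values; the matrices are also held constant here for simplicity.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def step(x, u, A, B, C, D):
    """Advance the model one time step.

    Returns (x_next, y) with
        x_next = A x + B u   (emotional state update)
        y      = C x + D u   (task modifiers at the current step)
    """
    y = [a + b for a, b in zip(matvec(C, x), matvec(D, u))]
    x_next = [a + b for a, b in zip(matvec(A, x), matvec(B, u))]
    return x_next, y


# Hypothetical example (all values are assumptions):
#   state  x = [happiness, anger], as deviations from the nominal state
#   input  u = [path_blocked, low_battery]
#   output y = [tour_tempo_change]
A = [[0.8, 0.0],    # each emotion decays toward nominal (factors below 1)
     [0.0, 0.9]]
B = [[-0.2, -0.1],  # being blocked or low on battery lowers happiness
     [0.5, 0.0]]    # being blocked raises anger
C = [[0.5, -0.5]]   # happier -> faster tour, angrier -> slower tour
D = [[0.0, 0.0]]    # no direct stimulus-to-modifier action in this sketch

if __name__ == "__main__":
    x = [0.0, 0.0]                              # start at the nominal state
    x, y = step(x, [1.0, 0.0], A, B, C, D)      # visitors block the robot once
    for k in range(5):
        x, y = step(x, [0.0, 0.0], A, B, C, D)  # no further stimuli
        print(k, [round(v, 3) for v in x], [round(v, 3) for v in y])
```

Running the script shows the anger deviation shrinking step by step once the stimulus disappears, which is one simple way to picture the tendency toward a nominal emotional state.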
Following the classic state variable model, four matrices have to be defined. The A-matrix, or emotional dynamic matrix, represents the model dynamics: the influence of each emotion on itself and on the other emotions. The B-matrix is the sensitivity matrix. The C-matrix holds the information about how the emotional state influences the modifiers; let us call it the emotional behavior matrix. The D-matrix is the direct action matrix.

Due to the difficulty of finding an analytic expression for the matrix coefficients, a set of fuzzy rules is used to obtain each coefficient. The matrix coefficients are functions of time k, giving dynamics to the system; because of that, the coefficients are calculated for each time k. Defining fuzzy rules is a simple task, since the information contained in the rules can be obtained from experts in emotions. The use of fuzzy knowledge bases opens the opportunity for future automatic adjustment, e.g. with genetic algorithms.

5. Web based remote visit

One of the project goals was the development of a Web server that allows users to visit an exhibition remotely, navigating through the robot's movement and watching through its sensors. The user can be an ordinary citizen who enjoys connecting from home, or a businessman who connects from the office. This saves the cost of travelling physically to the exhibition site, especially when the visitor lives or works in another city or country.

Three kinds of users are allowed:
Standard visitor, who can navigate through the web page, accessing general information and watching the behaviour of the robot, or ask for an account.
Privileged visitor, who can operate the robot and interact with the remote site, as well as with other connected users.
Administrator, who can manage users, creating new accounts and assigning access privileges.

A privileged user can:
Set a destination goal for the robot (high level command). The navigation system works in autonomous mode.
Chat with other users.
Command the robot by sending low level commands (move forward or backward, turn). A safety system prevents the robot from crashing.
Receive dynamic information about the surroundings of the robot.
View the robot environment through its camera.
Receive the audio signal present at the remote site.
Write sentences to be synthesized by the robot.
Select emotions to be expressed by the robot face.

The web server was developed using Jakarta Apache Tomcat 4.0, which includes Java support, on a Linux operating system (Debian 3.0 release1). The programming tools used were those included in Java 2 Platform, Enterprise Edition, J2EE (JavaServer Pages (JSP), JavaBeans, JavaXML), server and applet applications, and every program was written in standard Java 2. The Web pages are in standard HTML 4.0. Figure 14 shows the typical frames displayed during normal operation (map, camera, chat and control windows).

All data is stored in a MySQL database. Information is exchanged with the database using SQL (Structured Query Language) queries to a database server (MySQL Server) resident on the same PC. The development tool was the programming environment supplied by Sun Microsystems, SunOne Studio 4.1 Community Edition.

The web server was thoroughly tested at the INDUMATICA 2004 fair held at UPM. The server worked for 3 days, a total of 16 hours. 63 users registered: 18 professors, 31 students and 14 people from outside the university. The web server was also successfully tested at the Science Museum Príncipe Felipe of Valencia, and it allows remote tours of our laboratory at UPM.

Fig. 14. Urbano Web based remote visit

6. Integration of components: Multitask Kernel

From the point of view of Urbano's software components, it is an agent based architecture. A specific CORBA based mechanism is used as integration glue.
Every agent is a server and there is only one client, the Kernel module. Each computer has a Monitor program that interacts with the operating system to start, suspend or kill the applications assigned to that machine. Watchdog supervision mechanisms are used to detect blocked clients and, if necessary, restart them. Some agents need to save a safe state in order to recover full functionality (for example, the robot's most recent position).

There are different kinds of information involved in Urbano:
Configuration. All necessary configuration data (IP address, file names, etc.).
Working data. Each agent can use specific information, usually data files (for example, the sequence of movements for the “Hello” action in the Arm agent).
General information. Social, humorous and sports information that Urbano uses to interact with the public.
Corpus. Information about the specific domain in which Urbano works (museum or fair contents).

A relational database was implemented to support general and corpus information, and specific files are used for working and configuration data. There is no redundant or shared information. The agents and their functions are described in Table 2.

Agent        Function                           Computer               Activity
Kernel       Task scheduler, knowledge          OnBoardPC2 (win)       Client, Server
Speech       Voice synthesis                    OnBoardPC2 (win)       Server
Listen       Voice recognition                  OnBoardPC2 (win)       Server
Face         Face expression control            OnBoardPC1 (linux)     Server
Arm          Arm movements control              OnBoardPC1 (linux)     Server
Navigation   Base movements control             OnBoardPC1 (linux)     Server
Emotional    Emotional model control            OnBoardPC2 (win)       Server
Supervisor   Monitoring of kernel and modules   External PC1 (win)     Client of kernel
Web server   Serve web pages                    External PC2 (linux)   Server, Http server

Table 2. Agents and functions

Some other programs have been developed for different needs. Mapper was designed to build and manage maps and graphs for path planning.
UDE, the Urbano Development Environment, is a complex program designed to help the end user with maintenance and task development; it is also a supervisor program for the whole architecture. Figure 15 shows the main window of this program.

Fig. 15. Urbano Development Environment

The Kernel agent is a scheduler that executes Urbano tasks. Each task has a starting time and a priority. High priority tasks interrupt lower priority tasks. Tasks are coded by the user in a high-level programming language designed for this purpose. The tasks are compiled with yacc-lex technologies to avoid errors and to simplify their execution by the Kernel.

6.1 High-level programming

The high-level programming language is C-like. Variables can be numerical or string, and the first assignment defines the type. Expressions and execution control statements use the same syntax as the C language. There is an extensive set of functions for database access, string operations, global variables, system calls, etc. There are also functions to control the robot. Table 3 shows some of these functions:

Function   Description
listen     Waits for a specific sentence from the voice recognition module
listendb   Waits for a sentence defined in the database
say        Synthesizes a sentence
saydb      Synthesizes a random sentence of a category from the database
face       Shows a specific expression on the face
arm        Performs a set of arm movements that was defined as an expression
play       Shows a multimedia movie in the Touch-Window
image      Shows an image in the Touch-Window
buttons    Returns the identification of the selected button in the Touch panel
feeling    Performs an evaluation of the robot emotions
go         Goes to a specific point
where      Returns where the robot is
turn       Turns some degrees
isblock    Returns true if the robot is blocked

Table 3. Control functions of the Urbano programming language

The following text shows an example task.
The robot walks around the 14 places available in the map, greeting people. The task starts with an order to go to the next place. While the robot is moving, the task calls another task to check whether the robot's path is blocked by objects or people, says a random greeting phrase from the database and waits 30 seconds. When the robot reaches the next place, the task puts itself back on the agenda as a new task with a start delay of 5 seconds and a priority of 20.

    // Walking !
    destination=where()+1;
    if(destination ==14)
      destination = 0;          // wrap around after the last place
    endif
    go(destination);
    while (where() != destination)
      jump("blocks");           // run the task that checks for a blocked path
      saydb("Hello");           // random greeting from the database
      sleep(30);
    endwhile
    task("walking",5,20);       // reschedule: 5 s delay, priority 20
    end

In this second task example, the robot waits for a question (recorded in the database); the listening function (listendb) then returns a keyword, and an SQL query retrieves from the ‘explains’ table all records with this keyword. Every record contains a text, a voice type, an arm movement and a face expression, which are used to give the answer.

    // Questions
    say("What is your question?");
    Theme=listendb();
    pTable=dbsql(format("SELECT * FROM EXPLAINS WHERE KEYWORD = '%s' ORDER BY ORDEN", Theme));
    if (pTable>=0)
      ndatos=dbgetcount(pTable);
      while (ndatos>0)          // one record per part of the answer
        arm(dbgetint(pTable,"ARM"));
        setspeakingvoicetype(dbgetint(pTable,"VOICE"));
        face(dbgetint(pTable,"FACE"));
        say(dbgetstr(pTable,"TEXT"));
        dbnext(pTable);
        ndatos=ndatos-1;
      endwhile
      dbclose(pTable);
      gestobrazo("POSICION_CERO");
    else
      say("I don't know anything about this theme!");
    endif
    end

6.2 Managing visits

The Urbano database has an inventory of objects. Each object belongs to several categories; for example, a Picasso picture is a picture, modern art, cubist style, large size, etc. Each object is at a place in the map, and the order of visit is important in order to avoid going back and forth. For each object there are different kinds of information: general description, specific for experts, specific for children, components, history, details, anecdotes, etc.
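The inventory described above can be pictured as a small relational schema. The following Python sketch uses the standard sqlite3 module with an in-memory database; the table names, column names and sample rows are all hypothetical and chosen only for illustration, since the actual Urbano schema is not given here. It shows how objects could be selected by category and returned in visit order, and how the information of a given kind could be fetched for one object.

```python
import sqlite3


def build_inventory():
    """Create an in-memory inventory with hypothetical tables and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE object (id INTEGER PRIMARY KEY, name TEXT,
                             place INTEGER, visit_order INTEGER);
        CREATE TABLE object_category (object_id INTEGER, category TEXT);
        CREATE TABLE info (object_id INTEGER, kind TEXT, body TEXT);
    """)
    # Illustrative sample data only.
    db.executemany("INSERT INTO object VALUES (?, ?, ?, ?)", [
        (1, "Picasso picture", 3, 2),
        (2, "Steam engine", 1, 1),
        (3, "Cubist sculpture", 5, 3),
    ])
    db.executemany("INSERT INTO object_category VALUES (?, ?)", [
        (1, "picture"), (1, "modern art"), (1, "cubism"),
        (2, "machine"),
        (3, "modern art"), (3, "cubism"),
    ])
    db.executemany("INSERT INTO info VALUES (?, ?, ?)", [
        (1, "general", "A large cubist painting."),
        (1, "child", "Can you count the faces?"),
        (2, "general", "An early steam engine."),
        (3, "general", "A sculpture in cubist style."),
    ])
    return db


def objects_for_visit(db, categories):
    """Return names of objects in any of the categories, in visit order."""
    marks = ",".join("?" for _ in categories)
    rows = db.execute(
        "SELECT DISTINCT o.name, o.visit_order FROM object o "
        "JOIN object_category c ON c.object_id = o.id "
        f"WHERE c.category IN ({marks}) ORDER BY o.visit_order",
        list(categories),
    ).fetchall()
    return [name for name, _ in rows]


def explanations(db, name, kinds):
    """Return {kind: text} for one object, restricted to the given kinds."""
    marks = ",".join("?" for _ in kinds)
    rows = db.execute(
        "SELECT i.kind, i.body FROM info i "
        "JOIN object o ON o.id = i.object_id "
        f"WHERE o.name = ? AND i.kind IN ({marks}) ORDER BY i.kind",
        [name, *kinds],
    ).fetchall()
    return dict(rows)
```

Values are passed as bound parameters rather than formatted into the SQL string, a common precaution when the keyword comes from user input such as recognized speech.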
As a tour-guide robot, Urbano must guide a group of people on a visit to a museum or fair. For Urbano, a visit is defined as a set of categories to explain in a limited time to a given kind of visitor (Expert, Normal, Child, etc.). SQL queries to the database select the objects, and the information about each object, to be explained. If there is not enough time to present all the selected objects, a pruning process is executed to reduce the number of explanations given for each object, according to a priority value. This planning is done before the visit and can produce a trial visit, in which the robot performs the visit and monitors the travel time and the explanation time at each object. During the real visit, timing can vary depending on questions or travel time (for example, when visitors block Urbano); if time runs short, the pruning process is used. Free time can be used by Urbano to tell jokes or recent social news.

7. Urbano successful deployments

The Urbano robot has been successfully deployed in several environments, and has operated as a tour guide on many occasions:
Lab Tour: guided visit to our laboratory
Indumatica 2004 (ETSII, Madrid, Spain): industrial trade fair
Indumatica 2005 (ETSII, Madrid, Spain): industrial trade fair
Fitur 2006 (IFEMA, Madrid, Spain): international tourism fair
Principe Felipe Museum (CACSA, Valencia, Spain): science museum

Demonstration at UPM. A demonstration was performed at our university, starting with a teleoperated real time exploration and mapping at rush hour, with the environment crowded with students. The installer used the GUI tool to teleoperate the robot with the reference of the graphical map render, with the assistance of the robot's reactive control for safety, and of automatic graph building, path planning and execution for convenience and comfort.
In this way, the installer teleoperated the robot while exploring, but Urbano could go to any previously explored area fully autonomously, releasing the user from direct control most of the time. The experiment lasted 22 min 15 s, with a travelled distance of 134 meters. With a duration similar to that of the “Explore and return” experiment (Newman et al., 2002), the environment explored and mapped in real time is much bigger (Figure 16).

Fig. 16. Map of UPM built in an “Explore and return” experiment

The Urbano robot was also deployed at the Indumatica trade fair (Figure 17) on two occasions, in 2004 and 2005. On both occasions it had to be installed while the fair was open to the public, and the map building was accomplished with the exhibition crowded with people. Figure 17 (left) shows the map provided by the organizers; as can be seen, it is useless for navigation, as it does not resemble the actual environment. The built map accurately represents the features of the environment.

Fig. 17. Indumatica 2004 trade fair. Left) Map provided by organizers. Center) Actual environment. Right) Partial view of the built map.

The map of the environment was built in real time while manually driving Urbano along a 102-meter trajectory in less than five minutes. The complete map and navigation graph are shown in Figure 18, together with Urbano guiding two visitors around the fair.

Fig. 18. Indumatica 2004 trade fair. Left) Map built by Urbano in real time. Right) Urbano guiding two visitors.

The Urbano project has been supervised by the “Principe Felipe” museum at the City of Science and Arts of Valencia (CACSA), one of the biggest museums in Spain, as partner and potential end user of Urbano. A demonstration of the system deployment was performed (Figure 19), as well as of the functionality of Urbano as a tour guide. The map of the exhibition was correctly built in real time along a 130-meter trajectory in approximately 16 minutes.

Fig. 19.
Map building at Principe Felipe Museum. Left) Manually operating the robot. Center) Real time map building as seen by the installer. Right) Resulting map and navigation graph.

8. Conclusions and future work

The Urbano service robot system has been presented in this chapter, with an overview of both its hardware and control software. The hardware used for interaction (robotic face and arm), which has been specifically designed and built for Urbano following performance and cost criteria, has been shown to accomplish its task successfully. All the control, navigation, interaction (including speech) and management software has been developed from scratch according to our research lines. These developments have served to increase our scientific publication record, but have also resulted in a quite mature service robot system that has been successfully deployed and tested on several occasions in different scenarios. Moreover, due to its success, we have been asked many times by institutions and private companies to rent Urbano for several days in exhibitions. The only reason we could not continue with this renting was the lack of support in the University for this purpose, as our University is a public, non-profit organization. We are currently considering forming a spin-off to continue with Urbano along a more commercial line.

We are currently working on 3D data acquisition, modelling, mapping and navigation in order to achieve a much more robust system (able to detect stairs and obstacles at different heights) that would not require any human supervision (navigation graph editing), for a more automated setup. In fact our goal (Robonauta project, see Acknowledgement) is the fully automated deployment of Urbano by showing it the environment and guiding it with natural language, just as would be done with a new human guide joining a museum staff.
The interaction capabilities of Urbano are also being expanded, implementing people tracking and following behaviours, as well as an improved image processing system. The distributed software architecture will also be improved by the standardization of module interfaces using XML technologies and the Web Services Description Language (WSDL) specification. In this way, the modules will not need to have their interfaces hardwired, and more flexibility and simplicity will be possible for faster and less error-prone development. Also, the programming language will be replaced by a standard such as State Chart XML (SCXML), which could result in a more powerful and more manageable tool that takes full advantage of the new architecture.

9. Acknowledgment

The Urbano project has been the result of the work of many people, whose contributions we gratefully acknowledge: Agustin Jimenez and Jose M. Pardo for project management and supervision, Alberto Valero for web development, Andres Feito and Marcos Doblado for face design and building, Enrique Lillo for his work on the wired arm, Javier Diez for programming the Urbano high-level programming language and kernel, Jaime Gomez and Sergio Alvarez for improvements in the kernel, and all of the DISAM and IEL (both at UPM) staff for their support. This work is funded by the Spanish Ministry of Science and Technology (URBANO: DPI2001-3652C0201, ROBINT: DPI-2004-07907-C02, Robonauta: DPI2007-66846-C02-01) and the EU 5th R&D Framework Program (WebFAIR: IST-2000-29456), and supervised by CACSA, whose kindness we gratefully acknowledge.

10. References

Burgard W., Cremers A.B., Fox D., Hähnel D., Lakemeyer G., Schulz D., Steiner W., Thrun S. (1999). Experiences with an interactive museum tour-guide robot. Artificial Intelligence, Vol. 114, No. 1-2, pp. 3-55.
Thrun S., Bennewitz M., Burgard W., Cremers A.B., Dellaert F., Fox D., Hahnel D., Rosenberg C., Roy N., Schulte J., Schulz D. (1999).
MINERVA: A Second-Generation Museum Tour-Guide Robot. IEEE International Conference on Robotics and Automation. Vol. 3, pp. 1999-2005.
Nourbakhsh I., Bobenage J., Grange S., Lutz R., Meyer R., and Soto A. (1999). An Affective Mobile Educator with a Full-time Job. Artificial Intelligence, Vol. 114, No. 1-2, pp. 95-124.
Montemerlo M., Pineau J., Roy N., Thrun S., and Verma V. (2002). Experiences with a Mobile Robotic Guide for the Elderly. Proceedings of the AAAI National Conference on Artificial Intelligence, Edmonton, Canada.
Rodriguez-Losada D., Matia F., Galan R., Jimenez A. (2002). Blacky, an interactive mobile robot at a trade fair. IEEE International Conference on Robotics and Automation. Vol. 4. Washington DC, USA. pp. 3930-3935.
Rodriguez-Losada D., Matia F., and Galan R. (2006a). Building geometric feature based maps for indoor service robots. Robotics and Autonomous Systems, Vol. 54, pp. 546-558.
Rodriguez-Losada D., Matia F., Jimenez A., Galan R. (2006b). Local map fusion for real-time indoor simultaneous localization and mapping. Journal of Field Robotics. Wiley Interscience. Vol. 23, Issue 5, pp. 291-309, May 2006.
Rodriguez-Losada D., Matia F., Pedraza L., Jimenez A., Galan R. (2007). Consistency of SLAM-EKF Algorithms for Indoor Environments. Journal of Intelligent and Robotic Systems. Springer. ISSN 0921-0296, Vol. 50, No. 4, pp. 375-397.
Pedraza L., Dissanayake G., Valls Miró J., Rodriguez-Losada D., and Matía F. (2007). BS-SLAM: Shaping the world. In Proc. Robotics: Science and Systems, Atlanta, GA, USA, June 2007.
Castellanos J.A., Montiel J.M.M., Neira J., Tardos J.D. (1999). The SPmap: A Probabilistic Framework for Simultaneous Localization and Map Building. IEEE Transactions on Robotics and Automation. Vol. 15, No. 5, pp. 948-953.
Newman P., Leonard J., Tardos J.D., Neira J. (2002). Explore and Return: Experimental Validation of Real-Time Concurrent Mapping and Localization.
IEEE International Conference on Robotics and Automation. Washington DC, USA. pp. 1802-1809.
Lacey G. and Rodriguez-Losada D. (2008). The evolution of Guido: a smart walker for the blind. Accepted for publication in IEEE Robotics and Automation Magazine. To appear in 2008.
Fernández, F.; Ferreiros, J.; Pardo, J.M.; Sama, V.; Córdoba, R. de; Macías-Guarasa, J.; Montero, J.M.; San Segundo, R.; D'Haro, L.F.; Santamaría, M. & González G. (2006). Automatic understanding of ATC speech. IEEE Aerospace and Electronic Systems Magazine, Vol. 21, No. 9, pp. 12-17, ISSN: 0885-8985.
Córdoba, R. de; Ferreiros, J.; San Segundo, R.; Macías-Guarasa, J.; Montero, J.M.; Fernández, F.; D'Haro, L.F. & Pardo, J.M. (2006). Air traffic control speech recognition system cross-task & speaker adaptation. IEEE Aerospace and Electronic Systems Magazine, Vol. 12, No. 9, pp. 12-17, ISSN: 0885-8985.
Montero, J.M.; Gutiérrez-Arriola, J.; Córdoba, R.; Enríquez, E. & Pardo, J.M. (2002). The role of pitch and tempo in Spanish emotional speech: towards concatenative synthesis. In: Improvements in speech synthesis, Eric Keller, Gerard Bailey, A. Monahan, J. Terken, M. Huckvale (Eds.), pp. 246-251, John Wiley & Sons, Ltd.
Picard R. W. (1997). Affective Computing, The MIT Press, Massachusetts, USA. ISBN: 0-262-16170-2.
