Abstract
Robots are poised to interact with humans in unstructured environments. Despite increasingly robust control algorithms, failure modes arise whenever the underlying dynamics are poorly modeled, especially in unstructured environments. We contribute a set of recovery policies to deal with anomalies produced by external disturbances. The recoveries work when different types of anomalies are triggered any number of times at any point in the task, including during already-running recoveries. Our recovery critic stands atop a tightly integrated, graph-based online motion-generation and introspection system. Policies, skills, and introspection models are learned incrementally and contextually over time. Recoveries are studied via a collaborative kitting task in which the system experiences a wide range of anomalous conditions. We also contribute an extensive analysis of the performance of the tightly integrated anomaly identification, classification, and recovery system under extreme anomalous conditions. We show how the integration of such a system achieves performance greater than the sum of its parts.
Data Availability
An extensive dataset has been collected for this work as described in Appendix B and is available via our project page www.JuanRojas.net/spair or on GitHub at https://github.com/birlrobotics/ktting_anomaly_dataset. All code for this work is open-source and well documented; please refer to our supplementary page [26].
References
Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)
Paraschos, A., Daniel, C., Peters, J.R., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems, pp. 2616–2624 (2013)
Calinon, S., D’Halluin, F., Sauser, E.L., Caldwell, D.G., Billard, A.G.: Learning and reproduction of gestures by imitation. IEEE Robot. Autom. Magazine 17(2), 44–54 (2010)
Jain, A., Wojcik, B., Joachims, T., Saxena, A.: Learning trajectory preferences for manipulators via iterative improvement. In: Advances in Neural Information Processing Systems. [Online]. Available: http://pr.cs.cornell.edu/coactive (2013)
Konidaris, G., Kuindersma, S., Grupen, R., Barto, A.: Robot learning from demonstration by constructing skill trees. Int. J. Robot. Res. 31(3), 360–375 (2012)
Gutierrez, R.A., Chu, V., Thomaz, A.L., Niekum, S.: Incremental task modification via corrective demonstrations. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 1126–1133 (2018)
Bajcsy, A., Losey, D.P., O’Malley, M.K., Dragan, A.D.: Learning from physical human corrections, one feature at a time. In: ACM/IEEE International Conference on Human-Robot Interaction, pp. 141–149 (2018)
Hovland, G.E., McCarragher, B.J.: Hidden Markov models as a process monitor in robotic assembly. Model. Identif. Control 20(4), 201–223 (1999)
Pettersson, O.: Execution monitoring in robotics: a survey. Robot. Auton. Syst. 53(2), 73–88 (2005)
Kobayashi, Y., Matsumoto, T., Takano, W., Wollherr, D., Gabler, V.: Motion recognition by natural language including success and failure of tasks for co-working robot with human. In: IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM. Institute of Electrical and Electronics Engineers Inc., pp. 10–15 (2017)
Inceoglu, A., Ince, G., Yaslan, Y., Sariel, S.: Failure detection using proprioceptive, auditory and visual modalities. In: IEEE International Conference on Intelligent Robots and Systems. Institute of Electrical and Electronics Engineers Inc., pp. 2491–2496 (2018)
Di Lello, E., Klotzbucher, M., De Laet, T., Bruyninckx, H.: Bayesian time-series models for continuous fault detection and recognition in industrial robotic tasks. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5827–5833. IEEE (2013)
Cheng, X., Jia, Z., Mason, M.T.: Data-efficient process monitoring and failure detection for robust robotic screwdriving. In: IEEE International Conference on Automation Science and Engineering, vol. 2019-August. IEEE Computer Society, pp. 1705–1711 (2019)
Wu, H., Guan, Y., Rojas, J.: A latent state-based multimodal execution monitor with anomaly detection and classification for robot introspection. Appl. Sci. (Switzerland) 9(6), 1072 (2019). [Online]. Available: https://www.mdpi.com/2076-3417/9/6/1072
Park, D., Erickson, Z., Bhattacharjee, T., Kemp, C.C.: Multimodal execution monitoring for anomaly detection during robot manipulation. In: Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 407–414 (2016)
Park, D., Kim, H., Kemp, C.C.: Multimodal anomaly detection for assistive robots. Autonomous Robots 43(3), 611–629 (2019). [Online]. Available: https://doi.org/10.1007/s10514-018-9733-6
Luo, S., Wu, H., Lin, H., Duan, S., Guan, Y., Rojas, J.: Fast, robust, and versatile event detection through HMM belief state gradient measures. In: The 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2018). Nanjing, China: Institute of Electrical and Electronics Engineers Inc., pp. 1–8 (2018). [Online]. Available: arXiv:1709.07876
Park, D., Kim, H., Hoshi, Y., Erickson, Z., Kapusta, A., Kemp, C.C.: A multimodal execution monitor with anomaly classification for robot-assisted feeding. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2017-September, pp. 5406–5413 (2017)
Rodriguez, A., Mason, M.T., Srinivasa, S.S., Bernstein, M., Zirbel, A.: Abort and retry in grasping. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1804–1810. IEEE (2011)
Wu, H., Luo, S., Lin, H., Duan, S., Guan, Y., Rojas, J.: Recovering from external disturbances in online manipulation through state-dependent revertive recovery policies. In: RO-MAN 2018 - 27th IEEE International Symposium on Robot and Human Interactive Communication, pp. 166–173 (2018)
Chang, G., Kulić, D.: Robot task error recovery using Petri nets learned from demonstration. In: 2013 16th International Conference on Advanced Robotics (ICAR), pp. 1–6. IEEE (2013)
Kappler, D., Pastor, P., Kalakrishnan, M., Wüthrich, M., Schaal, S.: Data-driven online decision making for autonomous manipulation. In: Robotics: Science and Systems. Rome, Italy, vol. 11 (2015)
Niekum, S., Osentoski, S., Konidaris, G., Chitta, S., Marthi, B., Barto, A.G.: Learning grounded finite-state representations from unstructured demonstrations. Int. J. Robot. Res. 34(2), 131–157 (2015)
Wang, A.S., Kroemer, O.: Learning robust manipulation strategies with multimodal state transition models and recovery heuristics. In: Proceedings - IEEE International Conference on Robotics and Automation, vol. 2019-May, pp. 1309–1315. [Online]. Available: https://www.ri.cmu.edu/wp-content/uploads/2019/03/Kroemer_Wang_ICRA_2019.pdf (2019)
Wu, H., Lin, H., Guan, Y., Harada, K., Rojas, J.: Robot introspection with Bayesian nonparametric vector autoregressive hidden Markov models. In: IEEE-RAS International Conference on Humanoid Robots, pp. 882–888. IEEE (2017). [Online]. Available: http://www.juanrojas.net/shdp-var-hmm/
Wu, H., Luo, S., Chen, L., Duan, S., Chumkamon, S., Liu, D., Guan, Y., Rojas, J.: Endowing robots with longer-term autonomy by recovering from external disturbances in manipulation through grounded anomaly classification and recovery policies. [Online]. Available: http://www.juanrojas.net/spair (2018)
Kroemer, O., Daniel, C., Neumann, G., van Hoof, H., Peters, J.: Towards learning hierarchical skills for multi-phase manipulation tasks. In: International Conference on Robotics and Automation (ICRA), vol. 2015-June, no. June, pp. 1503–1510 (2015)
Rojas, J., Luo, S., Zhu, D., Du, Y., Lin, H., Huang, Z., Kuang, W., Harada, K.: Online robot introspection via wrench-based action grammars. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2017-September, pp. 5429–5436 (2017). [Online]. Available: http://www.juanrojas.net/online_introspection_wrench_grammar/
Lin, H.C., Shafran, I., Yuh, D., Hager, G.D.: Towards automatic skill evaluation: detection and segmentation of robot-assisted surgical motions. Comput. Aided Surg. 11(5), 220–230 (2006)
Rosen, J., Brown, J.D., Chang, L., Sinanan, M.N., Hannaford, B.: Generalized approach for modeling minimally invasive surgery as a stochastic process using a discrete Markov model. IEEE Trans. Biomed. Eng. 53(3), 399–413 (2006)
Le, T.H.L., Maslyczyk, A., Roberge, J.P., Duchaine, V.: A highly sensitive multimodal capacitive tactile sensor. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 407–412. IEEE (2017)
Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.: Learning movement primitives. In: Springer Tracts in Advanced Robotics, vol. 15, pp. 561–572. Springer (2005)
Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings. IEEE, pp. 261–266 (2010)
Rojas, J., Peters II, R.A.: Sensory integration with articulated motion on a humanoid robot. Appl. Bionics Biomechan. 2(3-4), 171–178 (2005)
Fox, E.B., Sudderth, E.B., Jordan, M.I., Willsky, A.S.: Bayesian nonparametric methods for learning Markov switching processes. IEEE Signal Process. Mag. 27(6), 43–54 (2010)
Hughes, M.C., Stephenson, W.T., Sudderth, E.B.: Scalable adaptation of state complexity for nonparametric hidden Markov models. Adv. Neural Inform. Process. Syst. 2015-January, 1198–1206 (2015)
Fox, E.B., Hughes, M.C., Sudderth, E.B., Jordan, M.I., et al.: Joint modeling of multiple time series via the beta process with application to motion capture segmentation. Ann. Appl. Stat. 8(3), 1281–1313 (2014)
Johnson, M.J., Willsky, A.S.: Stochastic variational inference for Bayesian time series models. In: 31st International Conference on Machine Learning, ICML 2014, vol. 5, pp. 3872–3880 (2014)
Foti, N.N.J., Xu, J., Laird, D., Fox, E.B.: Stochastic variational inference for hidden Markov models. In: Advances in Neural Information Processing Systems, vol. 4, no. January, pp. 3599–3607 (2014)
Chang, J., Fisher, J.W.: Parallel sampling of HDPs using sub-cluster splits. In: Advances in Neural Information Processing Systems, vol. 1, no. January, pp. 235–243 (2014)
Bnpy: Bayesian nonparametric machine learning for Python. [Online]. Available: https://github.com/bnpy/bnpy/ (2017)
Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)
Nakamura, A., Nagata, K., Harada, K., Yamanobe, N., Tsuji, T., Foissotte, T., Kawai, Y.: Error recovery using task stratification and error classification for manipulation robots in various fields. In: IEEE International Conference on Intelligent Robots and Systems, pp. 3535–3542. IEEE (2013)
National Research Council: Modeling Human and Organizational Behavior: Application to Military Simulations. National Academies Press (1998)
An, J., Cho, S.: Variational autoencoder based anomaly detection using reconstruction probability. SNU Data Mining Center, Seoul National University, Special Lecture on IE 2015-2, Tech. Rep. (2015)
Park, D., Hoshi, Y., Kemp, C.C.: A multimodal anomaly detector for robot-assisted feeding using an LSTM-based variational autoencoder. IEEE Robot. Autom. Lett. 3(3), 1544–1551 (2018)
Chen, R.-Q., Shi, G.-H., Zhao, W.-L., Liang, C.-H.: Sequential VAE-LSTM for anomaly detection on time series. arXiv:1910.03818 (2019)
Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Machine Learn. Res. 17(1), 1334–1373 (2016). [Online]. Available: http://www.jmlr.org/papers/volume17/15-522/15-522.pdf
Levine, S., Pastor, P., Krizhevsky, A., Ibarz, J., Quillen, D.: Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. Int. J. Robot. Res. 37(4-5), 173–184 (2016)
Haarnoja, T., Pong, V., Zhou, A., Dalal, M., Abbeel, P., Levine, S.: Composable deep reinforcement learning for robotic manipulation. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 6244–6251 (2018)
Jund, P., Eitel, A., Abdo, N., Burgard, W.: Optimization beyond the convolution: generalizing spatial relations with end-to-end metric learning. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 4510–4516 (2018)
Adjali, O., Ramdane-Cherif, A.: High-level MLN-based approach for spatial context disambiguation. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 2909–2915 (2018)
Aly, A., Taniguchi, T.: Towards understanding object-directed actions: a generative model for grounding syntactic categories of speech through visual perception. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 7143–7150 (2018)
Gong, Z., Zhang, Y.: Temporal spatial inverse semantics for robots communicating with humans. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 4451–4458 (2018)
Koppula, H.S., Gupta, R., Saxena, A.: Learning human activities and object affordances from RGB-D videos. Int. J. Robot. Res. 32(8), 951–970 (2013). [Online]. Available: https://doi.org/10.1177/0278364913478446
Koppula, H.S., Saxena, A.: Anticipating human activities using object affordances for reactive robotic response. IEEE Trans. Pattern Anal. Machine Intell. 38(1), 14–29 (2016)
Paulius, D., Huang, Y., Milton, R., Buchanan, W.D., Sam, J., Sun, Y.: Functional object-oriented network for manipulation learning. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2016-November. IEEE, pp. 2655–2662 (2016)
Jelodar, A.B., Sirajus Salekin, M., Sun, Y.: Identifying object states in cooking-related images. arXiv:1805.06956 (2018)
Radovanov, B., Marcikić, A., Larrue, D., Legeard, M.: A comparison of four different lens mappers. Croatian Oper. Res. Rev. 91(2), 189–202 (2014)
Forestier, G., Petitjean, F., Dau, H.A., Webb, G.I., Keogh, E.: Generating synthetic time series to augment sparse datasets. In: Proceedings - IEEE International Conference on Data Mining, ICDM, vol. 2017-November. IEEE, pp. 865–870 (2017)
Vinod, H.D., López-de Lacalle, J.: Maximum entropy bootstrap for time series: the meboot R package. J. Stat. Softw. 29(5), 1–19 (2009)
Le Guennec, A., Malinowski, S., Tavenard, R.: Data augmentation for time series classification using convolutional neural networks. In: ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data (2016)
Wu, H., Luo, S., Chen, L., Duan, S., Chumkamon, S., Liu, D., Guan, Y., Rojas, J.: Endowing robots with longer-term autonomy by recovering from external disturbances in manipulation through grounded anomaly classification and recovery policies. arXiv. [Online]. Available: http://arxiv.org/abs/1809.03979, http://www.juanrojas.net/re_enact_adapt/ (2018)
Pastor, P., Hoffmann, H., Asfour, T., Schaal, S.: Learning and generalization of motor skills by learning from demonstration. In: IEEE International Conference on Robotics and Automation, 2009. ICRA’09, vol. 2009, pp. 763–768. IEEE (2009)
Acknowledgments
We would like to thank Prof. Vincent Duchaine with the Department of Automated Manufacturing Engineering at Quebec University for his kind support in donating the multimodal tactile sensor used in this work [31].
Funding
This work is supported by grants from the Major Project of the Guangdong Province Department for Science and Technology [2019A050510040] and NSFC [61950410758], as well as the VC fund of the CUHK T Stone Robotics Institute (4930745).
Author information
Authors and Affiliations
Contributions
Shuangqi Luo did a large portion of the code development for the graph, anomaly identification, and testing along with ideas for all aspects of the work. Hongmin Wu strongly contributed in all areas of anomaly identification, classification, and experimentation. Shuangda Duan helped with manipulation coding, data collection, and experimentation. Yijiong Lin helped with performance comparisons and ablation studies. Juan Rojas provided the main ideas for the work as well as guidance for the team.
Corresponding author
Ethics declarations
Ethics approval
This study did not require ethics approval, as the information consists of naturalistic observations regarding choice. Choices remained anonymous, and there is no identifying information that would allow attribution of private information to an individual.
Conflict of Interests
All the authors of this paper have no conflicts of interest, financial or otherwise. The funding listed above poses no conflict to this work.
Consent to participate
Informed consent was obtained from all participants in the study.
Consent to publish
Participants consented to the submission of this article to the journal.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Graph Structure
1.1 Nodes
In principle, a node specifies a motion-generation model and an associated goal. Two types of nodes are specified in our system: nominal and adaptive nodes.
1.1.1 Nominal Nodes
Nominal nodes are implemented as ROS-SMACH states whose class definition contains member functions. In the specific case of DMPs, these are labeled “get_dmp_model” and “get_pose_goal” for model and goal retrieval, respectively. An additional integer attribute in the class definition acts as the ID of the nominal node.
1.1.2 Adaptive Nodes
Adaptive nodes are not implemented as a specific entity but rather as two procedures.
The first procedure concerns when and how to create an adaptive node. Since a new type of adaptation can only be brought into our system via human demonstration, we create a new adaptive node after a human demonstration has occurred. The new adaptive node simply contains a unique integer as its ID and a DMP model trained from that human demonstration.
The second procedure concerns how to determine the goal for an adaptive node. If we were to use the last frame of a human demonstration as the goal, the adaptive node would have little or no generalization ability due to the fixed structure. We thus propose that the goal of an adaptation be a linear transformation with respect to the previous goal of the system. This linear transformation is retrieved by computing the transformation matrix from the previous goal to the last frame of the human demonstration, and is saved alongside the model of the adaptive node. At runtime, we determine the skill goal of an adaptive node by applying the saved linear transformation to the system's previous goal.
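The goal-transformation procedure above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the assumption that goals are expressed as 4×4 homogeneous pose matrices are ours, not the system's actual API.

```python
import numpy as np

def learn_goal_transform(prev_goal, demo_last_frame):
    """Compute the transform mapping the previous system goal to the last
    frame of the human demonstration (both 4x4 homogeneous pose matrices).
    This is the quantity saved alongside the adaptive node's DMP model."""
    return demo_last_frame @ np.linalg.inv(prev_goal)

def adaptive_node_goal(saved_transform, prev_goal):
    """At runtime, apply the saved transform to the system's current
    previous goal to obtain the adaptive node's skill goal."""
    return saved_transform @ prev_goal
```

By construction, re-applying the saved transform to the original previous goal recovers the demonstration's last frame, while a different previous goal yields a correspondingly shifted skill goal, which is what gives the adaptive node its generalization ability.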
1.2 Node Transitions
1.2.1 Transitions Across Nominal Nodes
Since nominal nodes are implemented as SMACH states, we inherit SMACH’s state transition paradigm as our node transition paradigm. In the ROS-SMACH state definition, the member function named “determine_successor” is called by our system to determine a nominal node’s successor.
1.2.2 Transitions Among Adaptive Nodes
Since adaptive nodes are entered only after an anomaly has occurred, we create a mapping from anomalies to their corresponding adaptive nodes. A key aspect of the mapping is a “compound key” composed of the ID of the node in which the anomaly happened and the anomaly type. For example, a key could be “nominal_node_(4)_anomaly_type_(tool_collision)”.
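The compound-key lookup can be illustrated with a minimal Python sketch; the function and variable names are hypothetical, not the system's actual identifiers.

```python
def compound_key(node_id, anomaly_type):
    """Build the compound key from the failing node's ID and the anomaly type."""
    return "nominal_node_({})_anomaly_type_({})".format(node_id, anomaly_type)

adaptive_node_map = {}  # compound key -> adaptive node ID

def register_recovery(node_id, anomaly_type, adaptive_node_id):
    """Record which adaptive node recovers a given (node, anomaly) pair."""
    adaptive_node_map[compound_key(node_id, anomaly_type)] = adaptive_node_id

def lookup_recovery(node_id, anomaly_type):
    """Return the adaptive node for this anomaly, or None if no recovery
    has been demonstrated for this pair yet."""
    return adaptive_node_map.get(compound_key(node_id, anomaly_type))
```

A `None` result would correspond to the case where a human demonstration is still needed before the anomaly can be handled autonomously.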
After an adaptive node terminates its motion, we must determine the successor node. The system assumes that an adaptive node performs recovery for a nominal node that previously failed and must still reach the next phase or milestone of the task. When the adaptive node terminates, it therefore signals that a nominal state in the next phase has been attained. At that point, the nominal node that originally experienced the anomalous condition regains control in determining its successor, so that the original task control flow continues as if no anomaly had happened at all.
Appendix B: Kitting Anomaly Dataset
The dataset captures sensory-motor and video data regarding the kitting experiment under anomalous scenarios as outlined in this paper. The dataset consists of 538 rosbags. 85 of those rosbags are paired with RGB video captured by an external camera placed directly in front of the robot. The 538 rosbags total 37 GB, while the videos total 3.1 GB. The dataset is found as Supplement 2 in the paper as well as in [63].
1.1 Data Description
The main content of our dataset is the sensory-motor recordings of the robot manipulator’s experience while performing the manipulation task. Specifically for the Rethink Baxter robot, we use the following data modalities:
-
the right endpoint state: contains end-effector pose, twist, and a wrench defined from the joint torques (not used).
-
the stamped wrench: obtained from a Robotiq FT 180 force-torque sensor installed on the right wrist (see Fig. 4).
-
tactile data: obtained from a custom designed tactile sensor (see Section 8).
When anomalies are triggered, we also record: (i) the timestamp at which the anomaly is flagged and (ii) the anomaly classification label.
1.2 Recording Methodology
All sensory-motor signals exist as ROS topics in our system and as such are recorded as rosbags for offline use. When an anomaly is identified, we signal this event by sending a timestamped ROS message to a pre-defined topic that is also recorded in the rosbag. Anomaly classification labels are recorded in a text file on a line-by-line basis.
Mapping from data modalities to ROS topics is as follows:
-
Baxter right endpoint state
/robot/limb/right/endpoint_state
-
Robotiq force sensor FT 180
/robotiq_force_torque_wrench
-
Robotiq tactile sensor
/TactileSensor4/Accelerometer,
/TactileSensor4/Dynamic,
/TactileSensor4/EulerAngle,
/TactileSensor4/Gyroscope,
/TactileSensor4/Magnetometer,
/TactileSensor4/StaticData
1.3 Data Organization
The dataset is composed of folders that use the format "experiment_at_[time]". Each folder represents a test trial in the kitting experiment. Within a given folder there is a rosbag, "record.bag", and a text file, "anomaly_labels.txt". These contain, respectively, the rosbag topics mentioned in Section B and the recorded labels for the given experiment.
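Given the folder layout above, trials can be enumerated with a short Python sketch (the helper name is ours; only the folder and file names come from the dataset description):

```python
import os

def list_trials(dataset_root):
    """Yield (rosbag_path, labels_path) for every experiment folder,
    following the "experiment_at_[time]" layout described above."""
    for name in sorted(os.listdir(dataset_root)):
        folder = os.path.join(dataset_root, name)
        if name.startswith("experiment_at_") and os.path.isdir(folder):
            yield (os.path.join(folder, "record.bag"),
                   os.path.join(folder, "anomaly_labels.txt"))
```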
1.4 Anomaly Data Extraction
To extract anomaly data, one should first focus on the topic "/anomaly_detection_signal", whose messages are effectively timestamps indicating when anomalies were identified. It is worth noting that a burst of anomaly timestamps might have been published to this topic for a single anomaly. Therefore, timestamps that are adjacent in time should be ignored. We recommend ignoring a timestamp if its distance to its precursor is less than 1 second. After anomaly timestamps are extracted, the labels in the accompanying "anomaly_labels.txt" can be paired accordingly.
We have tried to purge the dataset of corrupted trials. However, if the number of anomaly timestamps does not equal the number of labels, that experiment should be discarded.
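The debouncing and pairing procedure above can be sketched as follows; this is an illustrative implementation of the recommended 1-second rule, not code shipped with the dataset.

```python
def extract_anomaly_times(stamps, min_gap=1.0):
    """Collapse bursts of anomaly-signal timestamps: keep a stamp only if
    it is at least `min_gap` seconds after the previously kept one."""
    kept = []
    for t in sorted(stamps):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept

def pair_with_labels(stamps, labels, min_gap=1.0):
    """Pair debounced anomaly times with the labels from anomaly_labels.txt;
    a count mismatch indicates a corrupted trial that should be discarded."""
    times = extract_anomaly_times(stamps, min_gap)
    if len(times) != len(labels):
        raise ValueError("count mismatch: discard this trial")
    return list(zip(times, labels))
```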
Appendix C: Notation Table
Appendix D: Motor Skills
The DMP framework encodes dynamical systems through a set of nonlinear differential equations whose point attractor system is defined by a nonlinear forcing function, which in turn depends on a canonical system for temporal scaling. In this section, we introduce the main concepts and refer the reader to the original text for details. Formally, for a one-DoF point attractor system, the point attractor system is defined as [64]:

\(\tau \dot{v} = K(g - x) - Dv - K(g - x_{0})s + Kf(s), \qquad \tau \dot{x} = v \qquad (14)\)

Equation 14 is an extended PD control signal with spring and damping constants K and D respectively, position and velocity x and v, start position x0, goal g, scaling s, and temporal scaling factor τ.
The scaling term is controlled by the canonical dynamical system \(\tau \dot {s} = -\alpha s\), where α can be an arbitrary constant.
The forcing term f(s) is an arbitrary function that, in our work, is provided by the user demonstration. The term is defined as a phase-dependent linear combination of Gaussian basis functions ψi(s) with variable weights [64]. Spatio-temporal scaling is possible through the (g − x) term in Eq. 14, which enables the system to adjust to varying goals. System speed-up is also possible through the τ variable in Eq. 14.
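The dynamics described above can be sketched numerically as follows, using the point-attractor formulation of [64]. The parameter values and function names are illustrative assumptions, not those used in the experiments; with zero forcing the system reduces to a damped spring pulled toward the goal g.

```python
import math

def canonical_phase(t, tau, alpha):
    """Closed-form solution of tau * ds/dt = -alpha * s with s(0) = 1."""
    return math.exp(-alpha * t / tau)

def dmp_rollout(x0, g, K=100.0, D=20.0, tau=1.0, alpha=4.0,
                dt=0.001, T=3.0, forcing=lambda s: 0.0):
    """Euler integration of the one-DoF point-attractor system (Eq. 14):
    tau*v' = K(g - x) - D*v - K(g - x0)*s + K*f(s), tau*x' = v."""
    x, v, t = x0, 0.0, 0.0
    while t < T:
        s = canonical_phase(t, tau, alpha)
        vdot = (K * (g - x) - D * v - K * (g - x0) * s + K * forcing(s)) / tau
        v += vdot * dt
        x += (v / tau) * dt
        t += dt
    return x
```

Because the \((g - x_{0})s\) term vanishes as the phase s decays, the trajectory converges to g regardless of the start position, which is the spatial-scaling property noted above; increasing τ slows the rollout uniformly.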
1.1 sHDP-AR-HMM Parameters & Hyperparameters
For the observation model, we use a first-order vector autoregressive process with regression coefficient matrices A and covariance matrices Σ for the latent states. Since both of these dynamic parameters are uncertain, they need to be learned. The MNIW is an appropriate prior distribution when both the mean and the covariance are uncertain [36].
We begin by determining the covariance Σ through the Inverse-Wishart (IW) component of the prior. For this computation, we must define the first moment of the distribution according to Eq. 4. Here, we set the degrees of freedom ν to the number of dimensions plus two: ν = d + 2. This setting ensures the conjugate MNIW prior has a valid mean (see Sec. 4.5.1 in [42]). As for the computation of the expectation of the covariance in Eq. 5, the scalar sF is set to 1.0 and multiplied by the scatter matrix (up to normalization, the empirical covariance). This choice is motivated by the fact that the covariance is computed by pooling all of the data and thus tends to overestimate latent-state-specific covariances; a constant slightly less than or equal to 1 on the scatter matrix mitigates the overestimation.
Then, to determine the matrix A of regression coefficients, the matrix-normal component of the MNIW uses a mean matrix M set to the zeros matrix M = 0d of size d × d. We do so to let new observations be primarily determined by the signal noise. For the covariance K across the columns, an identity matrix of the same dimension as Σ is used: K = 1.0 ∗ Id.
For the concentration parameter α of the HDP prior, a Gamma(a,b) distribution with values a = 0.5,b = 5 is used. For the self-transition parameter μ a weakly informative Beta(c,d) prior distribution is used with values c = 1,d = 10.
For the sticky HMM transition distribution, the self-transition bias κ is set to 50. The maximum number of iterations for the Split-Merge Monte Carlo method is set to 1000. Finally, the truncation (maximum) number of latent states is empirically set to K = 10 for both anomaly identification and classification.
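The MNIW hyperparameter settings described above can be summarized in a short sketch. This is an illustrative reconstruction of the stated choices (ν = d + 2, sF = 1.0, M = 0, K = I), not the actual bnpy configuration code.

```python
import numpy as np

def mniw_hyperparams(data, sF=1.0):
    """Assemble the MNIW prior hyperparameters described in the text.
    data: (T, d) array pooling all training observations."""
    d = data.shape[1]
    nu = d + 2                          # degrees of freedom: dims + 2
    centered = data - data.mean(axis=0)
    S = sF * (centered.T @ centered)    # scaled scatter matrix (empirical covariance up to normalization)
    M = np.zeros((d, d))                # zero mean for the regression matrix A
    K = 1.0 * np.eye(d)                 # identity column covariance
    return nu, S, M, K
```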
Appendix E: Experimentation
1.1 Human Subject Training
In Exps. 3-6, five different human subjects, after giving consent, took part in the experiment as human collaborators. They were trained to place consumer goods, one at a time, in the collection bin of the robot. We ask human subjects to assume they are multi-tasking and experiencing loss of attention. The loss of attention can lead (as recorded by the cataloging experiments in Section 2.2) to a number of anomalous events, including: (i) HCs, (ii) TCs, (iii) OSs, and (iv) NOs. Wall collisions (WCs) are introduced in Exp. 4, but these are not caused by humans; they result from using DMPs trained with objects of a particular geometry and size and testing with objects that differed from training. HCs may occur when the robot picks up objects from the collection bin while the human collaborator places new ones. TCs may occur when humans inadvertently place objects near each other such that when the robot attempts to pick an object, one of its fingers collides with an adjacent object (see Fig. 9b). OSs may occur after human collisions that rattle the gripper and cause heavier or smoother objects to fall. NO anomalies may occur when a human accidentally collides with or removes an object that the robot intended to pick up.
1.2 Signal Processing
Observations consist of a 7-DoF pose (using quaternions for orientation), a 6-DoF end-effector twist and wrench, and 56 taxel values (each finger has a 4-by-7 grid). Various pre-processing techniques were tested for combinations of these features, and we conducted validation to select the optimal feature set. Details for anomaly identification and classification are reported in Exps. 1 and 2, respectively.
All signals were scaled, resampled, and aligned.
Signals were scaled to lie in the range − 1 ≤ yi ≤ 1 by dividing by the maximum absolute value of each signal observed during training. Different signals publish at different rates (wrench: 1000 Hz, tactile: 1000 Hz, pose and twist: 100 Hz). We resample to acquire a single time-point to model the observations. Our code relies primarily on Python and ROS. Rospy nodes inherently use Python's multi-threading class to handle multiple publishers and subscribers. The class, however, lacks real-time performance support, and we have only achieved re-sampling rates of up to 50 Hz. Alignment takes place by syncing the timestamps from the varying ROS topics.
1.3 Anomaly Identification Baselines
The HMM baselines combine two observation models (Gaussian ‘G’ and autoregressive ‘AR’) with two inference algorithms (Expectation-Maximization ‘EM’ and Variational Bayes ‘VB’). We use three different values for the complexity k of the HMM (3, 5, 10).
For the machine learning baselines, the Isolation Forest uses sklearn's default values, with the maximum number of samples set to automatic and the contamination value set to 0.01. For LOF, default values are used, except that novelty is set to true and contamination to 0.01. The MLP and LSTM networks both use feature distribution #14, a batch size of 16, a learning rate of 0.0005, a leaky ReLU with fixed α = 0.2, and an outlier fraction of 0.2. For the MLP, we use 18 input dimensions and 128 hidden units, and for the VAE latent states we use 16 dimensions. For the LSTM-VAE, we use a 16-time-step input and 64 hidden states.
1.4 Exp. 4c
In Exp. 4c (Section 6.5.4), it was noted that one set of objects in particular posed challenges to the classification system. Under perfect classification, an adaptive behavior rotated the gripped object and caused a collision with other objects, leading to an irrecoverable situation. Under imperfect classification, there was a set of trials that led to 0 completions. Failure occurred during the adaptation to the persistent wall collision in node 3 as the system moved to the box. The culprit was the inability of the system to adapt its motion when an object with different shape attributes (height) was used compared to the one used during user demonstrations. This result points to a weakness in the system's ability to generalize adaptations when object shapes vary drastically from training, as no spatial reasoning is yet embedded in the system.
1.5 Exp. 6
In Exp. 6, three experiments failed due to the following situation: during the 2nd adaptation attempt to grasp the block, the approach pose was inaccurate. Normally, the fingers open when a pre-pick motion has terminated. The approach trajectory had some imprecision, which led to the fingers making contact with the block and causing it to tip (instead of sliding along the block to reach an optimal pick pose). After tipping, the block was displaced beyond the field of view of the camera. At this point, the system continued to correctly trigger an NO flag; however, on re-enactment the pose of the object was unavailable, thus holding up the execution of the re-enactment. This could be prevented by a better implementation of the manipulation skills taught to pick the object. In retrospect, we never envisioned that training the pick in this way would be problematic. It is unclear whether end-to-end training would suffer a similar problem from inception. Clearly, the adaptations could be re-trained or improved to address the issue under any manipulation scheme. The question remains as to which approach would be more robust to previously unseen situations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Luo, S., Wu, H., Duan, S. et al. Endowing Robots with Longer-term Autonomy by Recovering from External Disturbances in Manipulation Through Grounded Anomaly Classification and Recovery Policies. J Intell Robot Syst 101, 51 (2021). https://doi.org/10.1007/s10846-021-01312-6