Abstract
Robots are poised to interact with humans in unstructured environments. Despite increasingly robust control algorithms, failure modes arise whenever the underlying dynamics are poorly modeled, especially in unstructured environments. We contribute a set of recovery policies to deal with anomalies produced by external disturbances. The recoveries work when different types of anomalies are triggered any number of times at any point in the task, including during already-running recoveries. Our recovery critic stands atop a tightly integrated, graph-based online motion-generation and introspection system. Policies, skills, and introspection models are learned incrementally and contextually over time. Recoveries are studied via a collaborative kitting task in which the system experiences a wide range of anomalous conditions. We also contribute an extensive analysis of the performance of the tightly integrated anomaly identification, classification, and recovery system under extreme anomalous conditions. We show how the integration of such a system achieves performance greater than the sum of its parts.
Data Availability
An extensive dataset has been collected for this work as described in Appendix B and is available via our project page www.JuanRojas.net/spair or on GitHub at https://github.com/birlrobotics/ktting_anomaly_dataset. All code for this work is open-source and well documented; please refer to our supplementary page [26].
References
Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25(2), 328–373 (2013)
Paraschos, A., Daniel, C., Peters, J.R., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems, pp. 2616–2624 (2013)
Calinon, S., D’Halluin, F., Sauser, E.L., Caldwell, D.G., Billard, A.G.: Learning and reproduction of gestures by imitation. IEEE Robot. Autom. Magazine 17(2), 44–54 (2010)
Jain, A., Wojcik, B., Joachims, T., Saxena, A.: Learning trajectory preferences for manipulators via iterative improvement. In: Advances in Neural Information Processing Systems. [Online]. Available: http://pr.cs.cornell.edu/coactive (2013)
Konidaris, G., Kuindersma, S., Grupen, R., Barto, A.: Robot learning from demonstration by constructing skill trees. Int. J. Robot. Res. 31(3), 360–375 (2012)
Gutierrez, R.A., Chu, V., Thomaz, A.L., Niekum, S.: Incremental task modification via corrective demonstrations. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 1126–1133 (2018)
Bajcsy, A., Losey, D.P., O’Malley, M.K., Dragan, A.D.: Learning from physical human corrections, one feature at a time. In: ACM/IEEE International Conference on Human-Robot Interaction, pp. 141–149 (2018)
Hovland, G.E., McCarragher, B.J.: Hidden Markov models as a process monitor in robotic assembly. Model. Identif. Control 20(4), 201–223 (1999)
Pettersson, O.: Execution monitoring in robotics: a survey. Robot. Auton. Syst. 53(2), 73–88 (2005)
Kobayashi, Y., Matsumoto, T., Takano, W., Wollherr, D., Gabler, V.: Motion recognition by natural language including success and failure of tasks for co-working robot with human. In: IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM. Institute of Electrical and Electronics Engineers Inc., pp. 10–15 (2017)
Inceoglu, A., Ince, G., Yaslan, Y., Sariel, S.: Failure detection using proprioceptive, auditory and visual modalities. In: IEEE International Conference on Intelligent Robots and Systems. Institute of Electrical and Electronics Engineers Inc., pp. 2491–2496 (2018)
Di Lello, E., Klotzbucher, M., De Laet, T., Bruyninckx, H.: Bayesian time-series models for continuous fault detection and recognition in industrial robotic tasks. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5827–5833. IEEE (2013)
Cheng, X., Jia, Z., Mason, M.T.: Data-efficient process monitoring and failure detection for robust robotic screwdriving. In: IEEE International Conference on Automation Science and Engineering, vol. 2019-August. IEEE Computer Society, pp. 1705–1711 (2019)
Wu, H., Guan, Y., Rojas, J.: A latent state-based multimodal execution monitor with anomaly detection and classification for robot introspection. Appl. Sci. (Switzerland) 9(6), 1072 (2019). [Online]. Available: https://www.mdpi.com/2076-3417/9/6/1072
Park, D., Erickson, Z., Bhattacharjee, T., Kemp, C.C.: Multimodal execution monitoring for anomaly detection during robot manipulation. In: Proceedings - IEEE International Conference on Robotics and Automation, vol. 2016-June, pp. 407–414 (2016)
Park, D., Kim, H., Kemp, C.C.: Multimodal anomaly detection for assistive robots. Autonomous Robots 43(3), 611–629 (2019). [Online]. Available: https://doi.org/10.1007/s10514-018-9733-6
Luo, S., Wu, H., Lin, H., Duan, S., Guan, Y., Rojas, J.: Fast, robust, and versatile event detection through HMM belief state gradient measures. In: The 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2018). Nanjing, China: Institute of Electrical and Electronics Engineers Inc., pp. 1–8 (2018). [Online]. Available: arXiv:1709.07876
Park, D., Kim, H., Hoshi, Y., Erickson, Z., Kapusta, A., Kemp, C.C.: A multimodal execution monitor with anomaly classification for robot-assisted feeding. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2017-September, pp. 5406–5413 (2017)
Rodriguez, A., Mason, M.T., Srinivasa, S.S., Bernstein, M., Zirbel, A.: Abort and retry in grasping. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1804–1810. IEEE (2011)
Wu, H., Luo, S., Lin, H., Duan, S., Guan, Y., Rojas, J.: Recovering from external disturbances in online manipulation through state-dependent revertive recovery policies. In: RO-MAN 2018 - 27th IEEE International Symposium on Robot and Human Interactive Communication, pp. 166–173 (2018)
Chang, G., Kulić, D.: Robot task error recovery using Petri nets learned from demonstration. In: 2013 16th International Conference on Advanced Robotics (ICAR), pp. 1–6. IEEE (2013)
Kappler, D., Pastor, P., Kalakrishnan, M., Wüthrich, M., Schaal, S.: Data-driven online decision making for autonomous manipulation. In: Robotics: Science and Systems. Rome, Italy, vol. 11 (2015)
Niekum, S., Osentoski, S., Konidaris, G., Chitta, S., Marthi, B., Barto, A.G.: Learning grounded finite-state representations from unstructured demonstrations. Int. J. Robot. Res. 34(2), 131–157 (2015)
Wang, A.S., Kroemer, O.: Learning robust manipulation strategies with multimodal state transition models and recovery heuristics. In: Proceedings - IEEE International Conference on Robotics and Automation, vol. 2019-May, pp. 1309–1315. [Online]. Available: https://www.ri.cmu.edu/wp-content/uploads/2019/03/Kroemer_Wang_ICRA_2019.pdf (2019)
Wu, H., Lin, H., Guan, Y., Harada, K., Rojas, J.: Robot introspection with Bayesian nonparametric vector autoregressive hidden Markov models. In: IEEE-RAS International Conference on Humanoid Robots, pp. 882–888. IEEE (2017). [Online]. Available: http://www.juanrojas.net/shdp-var-hmm/
Wu, H., Luo, S., Chen, L., Duan, S., Chumkamon, S., Liu, D., Guan, Y., Rojas, J.: Endowing robots with longer-term autonomy by recovering from external disturbances in manipulation through grounded anomaly classification and recovery policies. [Online]. Available: http://www.juanrojas.net/spair (2018)
Kroemer, O., Daniel, C., Neumann, G., van Hoof, H., Peters, J.: Towards learning hierarchical skills for multi-phase manipulation tasks. In: International Conference on Robotics and Automation (ICRA), vol. 2015-June, no. June, pp. 1503–1510 (2015)
Rojas, J., Luo, S., Zhu, D., Du, Y., Lin, H., Huang, Z., Kuang, W., Harada, K.: Online robot introspection via wrench-based action grammars. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2017-September, pp. 5429–5436 (2017). [Online]. Available: http://www.juanrojas.net/online_introspection_wrench_grammar/
Lin, H.C., Shafran, I., Yuh, D., Hager, G.D.: Towards automatic skill evaluation: detection and segmentation of robot-assisted surgical motions. Comput. Aided Surg. 11(5), 220–230 (2006)
Rosen, J., Brown, J.D., Chang, L., Sinanan, M.N., Hannaford, B.: Generalized approach for modeling minimally invasive surgery as a stochastic process using a discrete Markov model. IEEE Trans. Biomed. Eng. 53(3), 399–413 (2006)
Le, T.H.L., Maslyczyk, A., Roberge, J.P., Duchaine, V.: A highly sensitive multimodal capacitive tactile sensor. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 407–412. IEEE (2017)
Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.: Learning movement primitives. In: Springer Tracts in Advanced Robotics, vol. 15, pp. 561–572. Springer (2005)
Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings. IEEE, pp. 261–266 (2010)
Rojas, J., Peters II, R.A.: Sensory integration with articulated motion on a humanoid robot. Appl. Bionics Biomechan. 2(3-4), 171–178 (2005)
Fox, E.B., Sudderth, E.B., Jordan, M.I., Willsky, A.S.: Bayesian nonparametric methods for learning Markov switching processes. IEEE Signal Process. Mag. 27(6), 43–54 (2010)
Hughes, M.C., Stephenson, W.T., Sudderth, E.B.: Scalable adaptation of state complexity for nonparametric hidden Markov models. Adv. Neural Inform. Process. Syst. 2015-January, 1198–1206 (2015)
Fox, E.B., Hughes, M.C., Sudderth, E.B., Jordan, M.I., et al.: Joint modeling of multiple time series via the beta process with application to motion capture segmentation. Ann. Appl. Stat. 8(3), 1281–1313 (2014)
Johnson, M.J., Willsky, A.S.: Stochastic variational inference for Bayesian time series models. In: 31st International Conference on Machine Learning, ICML 2014, vol. 5, pp. 3872–3880 (2014)
Foti, N.N.J., Xu, J., Laird, D., Fox, E.B.: Stochastic variational inference for hidden Markov models. In: Advances in Neural Information Processing Systems, vol. 4, no. January, pp. 3599–3607 (2014)
Chang, J., Fisher, J.W.: Parallel sampling of HDPs using sub-cluster splits. In: Advances in Neural Information Processing Systems, vol. 1, no. January, pp. 235–243 (2014)
Bnpy: Bayesian nonparametric machine learning for Python. [Online]. Available: https://github.com/bnpy/bnpy/ (2017)
Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2012)
Nakamura, A., Nagata, K., Harada, K., Yamanobe, N., Tsuji, T., Foissotte, T., Kawai, Y.: Error recovery using task stratification and error classification for manipulation robots in various fields. In: IEEE International Conference on Intelligent Robots and Systems, pp. 3535–3542. IEEE (2013)
National Research Council: Modeling Human and Organizational Behavior: Application to Military Simulations. National Academies Press (1998)
An, J., Cho, S.: Variational autoencoder based anomaly detection using reconstruction probability. SNU Data Mining Center, Seoul National University, Special Lecture on IE 2015-2, Tech. Rep. (2015)
Park, D., Hoshi, Y., Kemp, C.C.: A multimodal anomaly detector for robot-assisted feeding using an LSTM-based variational autoencoder. IEEE Robot. Autom. Lett. 3(3), 1544–1551 (2018)
Chen, R.-Q., Shi, G.-H., Zhao, W.-L., Liang, C.-H.: Sequential VAE-LSTM for anomaly detection on time series. arXiv:1910.03818 (2019)
Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Machine Learn. Res. 17(1), 1334–1373 (2016). [Online]. Available: http://www.jmlr.org/papers/volume17/15-522/15-522.pdf
Levine, S., Pastor, P., Krizhevsky, A., Ibarz, J., Quillen, D.: Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. Int. J. Robot. Res. 37(4-5), 173–184 (2016)
Haarnoja, T., Pong, V., Zhou, A., Dalal, M., Abbeel, P., Levine, S.: Composable deep reinforcement learning for robotic manipulation. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 6244–6251 (2018)
Jund, P., Eitel, A., Abdo, N., Burgard, W.: Optimization beyond the convolution: generalizing spatial relations with end-to-end metric learning. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 4510–4516 (2018)
Adjali, O., Ramdane-Cherif, A.: High-level MLN-based approach for spatial context disambiguation. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 2909–2915 (2018)
Aly, A., Taniguchi, T.: Towards understanding object-directed actions: a generative model for grounding syntactic categories of speech through visual perception. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 7143–7150 (2018)
Gong, Z., Zhang, Y.: Temporal spatial inverse semantics for robots communicating with humans. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 4451–4458 (2018)
Koppula, H.S., Gupta, R., Saxena, A.: Learning human activities and object affordances from RGB-D videos. Int. J. Robot. Res. 32(8), 951–970 (2013). [Online]. Available: https://doi.org/10.1177/0278364913478446
Koppula, H.S., Saxena, A.: Anticipating human activities using object affordances for reactive robotic response. IEEE Trans. Pattern Anal. Machine Intell. 38(1), 14–29 (2016)
Paulius, D., Huang, Y., Milton, R., Buchanan, W.D., Sam, J., Sun, Y.: Functional object-oriented network for manipulation learning. In: IEEE International Conference on Intelligent Robots and Systems, vol. 2016-November. IEEE, pp. 2655–2662 (2016)
Jelodar, A.B., Sirajus Salekin, M., Sun, Y.: Identifying object states in cooking-related images. arXiv:1805.06956 (2018)
Radovanov, B., Marcikić, A., Larrue, D., Legeard, M.: A comparison of four different lens mappers. Croatian Oper. Res. Rev. 91(2), 189–202 (2014)
Forestier, G., Petitjean, F., Dau, H.A., Webb, G.I., Keogh, E.: Generating synthetic time series to augment sparse datasets. In: Proceedings - IEEE International Conference on Data Mining, ICDM, vol. 2017-November. IEEE, pp. 865–870 (2017)
Vinod, H.D., López-de Lacalle, J.: Maximum entropy bootstrap for time series: the meboot R package. J. Stat. Softw. 29(5), 1–19 (2009)
Le Guennec, A., Malinowski, S., Tavenard, R.: Data augmentation for time series classification using convolutional neural networks. In: ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data (2016)
Wu, H., Luo, S., Chen, L., Duan, S., Chumkamon, S., Liu, D., Guan, Y., Rojas, J.: Endowing robots with longer-term autonomy by recovering from external disturbances in manipulation through grounded anomaly classification and recovery policies. arXiv. [Online]. Available: http://arxiv.org/abs/1809.03979, http://www.juanrojas.net/re_enact_adapt/ (2018)
Pastor, P., Hoffmann, H., Asfour, T., Schaal, S.: Learning and generalization of motor skills by learning from demonstration. In: IEEE International Conference on Robotics and Automation, 2009. ICRA’09, vol. 2009, pp. 763–768. IEEE (2009)
Acknowledgments
We would like to thank Prof. Vincent Duchaine with the Department of Automated Manufacturing Engineering at Quebec University for his kind support in donating the multimodal tactile sensor used in this work [31].
Funding
This work is supported by grants from the Major Project of the Guangdong Province Department for Science and Technology [2019A050510040] and NSFC [61950410758], as well as the VC fund of the CUHK T Stone Robotics Institute (4930745).
Author information
Authors and Affiliations
Contributions
Shuangqi Luo did a large portion of the code development for the graph, anomaly identification, and testing along with ideas for all aspects of the work. Hongmin Wu strongly contributed in all areas of anomaly identification, classification, and experimentation. Shuangda Duan helped with manipulation coding, data collection, and experimentation. Yijiong Lin helped with performance comparisons and ablation studies. Juan Rojas provided the main ideas for the work as well as guidance for the team.
Corresponding author
Ethics declarations
Ethics approval
This study did not require ethics approval, as the information consists of naturalistic observations regarding choice. Choices remained anonymous, and there is no identifying information that would allow attribution of private information to an individual.
Conflict of Interests
All the authors of this paper have no conflicts of interest, financial or otherwise. The funding listed above poses no conflict to this work.
Consent to participate
Informed consent was obtained from all participants in the study.
Consent to publish
Participants consented to the submission of this article to the journal.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Graph Structure
1.1 Nodes
In principle, a node specifies a motion-generation model and an associated goal. Two types of nodes are specified in our system: nominal and adaptive nodes.
1.1.1 Nominal Nodes
Nominal nodes are implemented as ROS-SMACH states whose class definition contains member functions. In the specific case of DMPs, these are labeled “get_dmp_model” and “get_pose_goal” for model and goal retrieval, respectively. An additional integer attribute in the class definition acts as the ID of the nominal node.
1.1.2 Adaptive Nodes
Adaptive nodes are not implemented as a specific entity but rather as two procedures.
The first procedure concerns when and how to create an adaptive node. Since a new type of adaptation can only be brought into our system via human demonstration, we create a new adaptive node after a human demonstration has occurred. The new adaptive node simply contains a unique integer as its ID and a DMP model trained from that human demonstration.
The second procedure concerns how to determine the goal for an adaptive node. If we were to use the last frame of a human demonstration as the goal, the adaptive node would have little or no generalization ability due to the fixed structure. We thus propose that the goal of an adaptation be a linear transformation with respect to the previous goal of the system. This linear transformation is retrieved by computing the transformation matrix from the previous goal to the last frame of the human demonstration, and is saved alongside the model of the adaptive node. At runtime, we determine the skill goal of an adaptive node by applying the saved linear transformation to the system's previous goal.
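The goal-transformation procedure above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the assumption that goals are expressed as 4×4 homogeneous pose matrices are ours, not the system's actual API.

```python
import numpy as np

def learn_goal_transform(prev_goal, demo_last_frame):
    """Compute the transform mapping the previous system goal to the last
    frame of the human demonstration (both 4x4 homogeneous pose matrices).
    This is the quantity saved alongside the adaptive node's DMP model."""
    return demo_last_frame @ np.linalg.inv(prev_goal)

def adaptive_node_goal(saved_transform, prev_goal):
    """At runtime, apply the saved transform to the system's current
    previous goal to obtain the adaptive node's skill goal."""
    return saved_transform @ prev_goal
```

By construction, re-applying the saved transform to the original previous goal recovers the demonstration's last frame, while a different previous goal yields a correspondingly shifted skill goal, which is what gives the adaptive node its generalization ability.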
1.2 Node Transitions
1.2.1 Transitions Across Nominal Nodes
Since nominal nodes are implemented as SMACH states, we inherit SMACH’s state transition paradigm as our node transition paradigm. In the ROS-SMACH state definition, the member function named “determine_successor” is called by our system to determine a nominal node’s successor.
1.2.2 Transitions Among Adaptive Nodes
Since adaptive nodes are entered only after an anomaly has occurred, we create a mapping from anomalies to their corresponding adaptive nodes. A key aspect of the mapping is a “compound key” composed of the ID of the node in which the anomaly happened and the anomaly type. For example, a key could be “nominal_node_(4)_anomaly_type_(tool_collision)”.
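The compound-key lookup can be illustrated with a minimal Python sketch; the function and variable names are hypothetical, not the system's actual identifiers.

```python
def compound_key(node_id, anomaly_type):
    """Build the compound key from the failing node's ID and the anomaly type."""
    return "nominal_node_({})_anomaly_type_({})".format(node_id, anomaly_type)

adaptive_node_map = {}  # compound key -> adaptive node ID

def register_recovery(node_id, anomaly_type, adaptive_node_id):
    """Record which adaptive node recovers a given (node, anomaly) pair."""
    adaptive_node_map[compound_key(node_id, anomaly_type)] = adaptive_node_id

def lookup_recovery(node_id, anomaly_type):
    """Return the adaptive node for this anomaly, or None if no recovery
    has been demonstrated for this pair yet."""
    return adaptive_node_map.get(compound_key(node_id, anomaly_type))
```

A `None` result would correspond to the case where a human demonstration is still needed before the anomaly can be handled autonomously.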
After an adaptive node terminates its motion, we must determine the successor node. The system assumes that an adaptive node performs recovery for a nominal node that previously failed and must still reach the next phase or milestone of the task. When the adaptive node terminates, it therefore signals that a nominal state in the next phase has been attained. At that point, the nominal node that originally experienced the anomalous condition regains control in determining its successor, so that the original task control flow continues as if no anomaly had happened at all.
Appendix B: Kitting Anomaly Dataset
The dataset captures sensory-motor and video data regarding the kitting experiment under anomalous scenarios as outlined in this paper. The dataset consists of 538 rosbags. 85 of those rosbags are paired with RGB video captured by an external camera placed directly in front of the robot. The 538 rosbags total 37 GB, while the videos total 3.1 GB. The dataset is found as Supplement 2 in the paper as well as in [63].
1.1 Data Description
The main content of our dataset is the sensory-motor recordings of the robot manipulator’s experience while performing the manipulation task. Specifically for the Rethink Baxter robot, we use the following data modalities:
-
the right endpoint state: contains end-effector pose, twist, and a wrench defined from the joint torques (not used).
-
the stamped wrench: obtained from a Robotiq FT 180 force-torque sensor installed on the right wrist (see Fig. 4).
-
tactile data: obtained from a custom designed tactile sensor (see Section 8).
When anomalies are triggered, we also record: (i) the timestamp at which the anomaly is flagged and (ii) the anomaly classification label.
1.2 Recording Methodology
All sensory-motor signals exist as ROS topics in our system and as such are recorded as rosbags for offline use. When an anomaly is identified, we signal this event by sending a timestamped ROS message to a pre-defined topic that is also recorded in the rosbag. Anomaly classification labels are recorded in a text file on a line-by-line basis.
Mapping from data modalities to ROS topics is as follows:
-
Baxter right endpoint state
/robot/limb/right/endpoint_state
-
Robotiq force sensor FT 180
/robotiq_force_torque_wrench
-
Robotiq tactile sensor
/TactileSensor4/Accelerometer,
/TactileSensor4/Dynamic,
/TactileSensor4/EulerAngle,
/TactileSensor4/Gyroscope,
/TactileSensor4/Magnetometer,
/TactileSensor4/StaticData
1.3 Data Organization
The dataset is composed of folders that use the format "experiment_at_[time]". Each folder represents a test trial in the kitting experiment. Within a given folder there is a rosbag, "record.bag", and a text file, "anomaly_labels.txt". These contain, respectively, the rosbag topics mentioned in Section B and the recorded labels for the given experiment.
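Given the folder layout above, trials can be enumerated with a short Python sketch (the helper name is ours; only the folder and file names come from the dataset description):

```python
import os

def list_trials(dataset_root):
    """Yield (rosbag_path, labels_path) for every experiment folder,
    following the "experiment_at_[time]" layout described above."""
    for name in sorted(os.listdir(dataset_root)):
        folder = os.path.join(dataset_root, name)
        if name.startswith("experiment_at_") and os.path.isdir(folder):
            yield (os.path.join(folder, "record.bag"),
                   os.path.join(folder, "anomaly_labels.txt"))
```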
1.4 Anomaly Data Extraction
To extract anomaly data, one should first focus on the topic "/anomaly_detection_signal", whose messages are effectively timestamps indicating when anomalies were identified. It is worth noting that a burst of anomaly timestamps might have been published to this topic for a single anomaly. Therefore, timestamps that are adjacent in time should be ignored. We recommend ignoring a timestamp if its distance to its precursor is less than 1 second. After anomaly timestamps are extracted, the labels in the accompanying "anomaly_labels.txt" can be paired accordingly.
We have tried to purge the dataset of corrupted trials. However, if the number of anomaly timestamps does not equal the number of labels, that experiment should be discarded.
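The debouncing and pairing procedure above can be sketched as follows; this is an illustrative implementation of the recommended 1-second rule, not code shipped with the dataset.

```python
def extract_anomaly_times(stamps, min_gap=1.0):
    """Collapse bursts of anomaly-signal timestamps: keep a stamp only if
    it is at least `min_gap` seconds after the previously kept one."""
    kept = []
    for t in sorted(stamps):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept

def pair_with_labels(stamps, labels, min_gap=1.0):
    """Pair debounced anomaly times with the labels from anomaly_labels.txt;
    a count mismatch indicates a corrupted trial that should be discarded."""
    times = extract_anomaly_times(stamps, min_gap)
    if len(times) != len(labels):
        raise ValueError("count mismatch: discard this trial")
    return list(zip(times, labels))
```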
Appendix C: Notation Table
Appendix D: Motor Skills
The DMP framework encodes dynamical systems through a set of nonlinear differential equations whose point attractor system is defined by a nonlinear forcing function, which in turn depends on a canonical system for temporal scaling. In this section, we introduce the main concepts and refer the reader to the original text for details. Formally, for a one-DoF point attractor system, the point attractor system is defined as [64]:

\(\tau \dot{v} = K(g - x) - Dv - K(g - x_{0})s + Kf(s), \qquad \tau \dot{x} = v \qquad (14)\)

Equation 14 is an extended PD control signal with spring and damping constants K and D respectively, position and velocity x and v, start position x0, goal g, scaling s, and temporal scaling factor τ.
The scaling term is controlled by the canonical dynamical system \(\tau \dot {s} = -\alpha s\), where α can be an arbitrary constant.
The forcing term f(s) is an arbitrary function that, in our work, is provided by the user demonstration. The term is defined as a phase-dependent linear combination of Gaussian basis functions ψi(s) with variable weights [64]. Spatio-temporal scaling is possible through the (g − x) term in Eq. 14, which enables the system to adjust to varying goals. System speed-up is also possible through the τ variable in Eq. 14.
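The dynamics described above can be sketched numerically as follows, using the point-attractor formulation of [64]. The parameter values and function names are illustrative assumptions, not those used in the experiments; with zero forcing the system reduces to a damped spring pulled toward the goal g.

```python
import math

def canonical_phase(t, tau, alpha):
    """Closed-form solution of tau * ds/dt = -alpha * s with s(0) = 1."""
    return math.exp(-alpha * t / tau)

def dmp_rollout(x0, g, K=100.0, D=20.0, tau=1.0, alpha=4.0,
                dt=0.001, T=3.0, forcing=lambda s: 0.0):
    """Euler integration of the one-DoF point-attractor system (Eq. 14):
    tau*v' = K(g - x) - D*v - K(g - x0)*s + K*f(s), tau*x' = v."""
    x, v, t = x0, 0.0, 0.0
    while t < T:
        s = canonical_phase(t, tau, alpha)
        vdot = (K * (g - x) - D * v - K * (g - x0) * s + K * forcing(s)) / tau
        v += vdot * dt
        x += (v / tau) * dt
        t += dt
    return x
```

Because the \((g - x_{0})s\) term vanishes as the phase s decays, the trajectory converges to g regardless of the start position, which is the spatial-scaling property noted above; increasing τ slows the rollout uniformly.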
1.1 sHDP-AR-HMM Parameters & Hyperparameters
For the observation model, we use a first-order vector autoregressive process with regression coefficient matrices A and covariance matrices Σ for the latent states. Since both of these dynamic parameters are uncertain, they need to be learned. The MNIW is an appropriate prior distribution when both the mean and the covariance are uncertain [36].
We begin by determining the covariance Σ through the Inverse-Wishart (IW) component of the prior. For this computation, we must define the first moment of the distribution according to Eq. 4. Here, we set the degrees of freedom ν to the number of dimensions plus two: ν = d + 2. This setting ensures the conjugate MNIW prior has a valid mean (see Sec. 4.5.1 in [42]). As for the computation of the expectation of the covariance in Eq. 5, the scalar sF is set to 1.0 and multiplied by the scatter matrix (up to normalization, the empirical covariance). This choice is motivated by the fact that the covariance is computed by pooling all of the data and thus tends to overestimate latent-state-specific covariances; a constant slightly less than or equal to 1 on the scatter matrix mitigates the overestimation.
Then, to determine the matrix A of regression coefficients, the matrix-normal component of the MNIW uses a mean matrix M set to the zeros matrix M = 0d of size d × d. We do so to let new observations be primarily determined by the signal noise. For the covariance K across the columns, an identity matrix of the same dimension as Σ is used: K = 1.0 ∗ Id.
For the concentration parameter α of the HDP prior, a Gamma(a,b) distribution with values a = 0.5,b = 5 is used. For the self-transition parameter μ a weakly informative Beta(c,d) prior distribution is used with values c = 1,d = 10.
For the sticky HMM transition distribution, the self-transition bias κ is set to 50. The maximum number of iterations for the Split-Merge Monte Carlo method is set to 1000. Finally, the truncation (maximum) number of latent states is empirically set to K = 10 for both anomaly identification and classification.
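The MNIW hyperparameter settings described above can be summarized in a short sketch. This is an illustrative reconstruction of the stated choices (ν = d + 2, sF = 1.0, M = 0, K = I), not the actual bnpy configuration code.

```python
import numpy as np

def mniw_hyperparams(data, sF=1.0):
    """Assemble the MNIW prior hyperparameters described in the text.
    data: (T, d) array pooling all training observations."""
    d = data.shape[1]
    nu = d + 2                          # degrees of freedom: dims + 2
    centered = data - data.mean(axis=0)
    S = sF * (centered.T @ centered)    # scaled scatter matrix (empirical covariance up to normalization)
    M = np.zeros((d, d))                # zero mean for the regression matrix A
    K = 1.0 * np.eye(d)                 # identity column covariance
    return nu, S, M, K
```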
Appendix E: Experimentation
1.1 Human Subject Training
In Exps. 3-6, five different human subjects, after giving consent, took part in the experiment as human collaborators. They were trained to place consumer goods, one at a time, in the collection bin of the robot. We ask human subjects to assume they are multi-tasking and experiencing loss of attention. The loss of attention can lead (as recorded by the cataloging experiments in Section 2.2) to a number of anomalous events, including: (i) HCs, (ii) TCs, (iii) OSs, and (iv) NOs. Wall collisions (WCs) are introduced in Exp. 4, but these are not caused by humans; they result from using DMPs trained with objects of a particular geometry and size and testing with objects that differed from training. HCs may occur when the robot picks up objects from the collection bin while the human collaborator places new ones. TCs may occur when humans inadvertently place objects near each other such that when the robot attempts to pick an object, one of its fingers collides with an adjacent object (see Fig. 9b). OSs may occur after human collisions that rattle the gripper and cause heavier or smoother objects to fall. NO anomalies may occur when a human accidentally collides with or removes an object that the robot intended to pick up.
1.2 Signal Processing
Observations consist of a 7-DoF pose (using quaternions for orientation), a 6-DoF end-effector twist and wrench, and 56 taxel values (each finger has a 4-by-7 grid). Various pre-processing techniques were tested for combinations of these features, and we conducted validation to select the optimal feature set. Details for anomaly identification and classification are reported in Exps. 1 and 2, respectively.
All signals were scaled, resampled, and aligned.
Signals were scaled to lie in the range − 1 ≤ yi ≤ 1 by dividing by the maximum absolute value of each signal observed during training. Different signals publish at different rates (wrench: 1000 Hz, tactile: 1000 Hz, pose and twist: 100 Hz). We resample to acquire a single time-point to model the observations. Our code relies primarily on Python and ROS. Rospy nodes inherently use Python's multi-threading class to handle multiple publishers and subscribers. The class, however, lacks real-time performance support, and we have only achieved re-sampling rates of up to 50 Hz. Alignment takes place by syncing the timestamps from the varying ROS topics.
1.3 Anomaly Identification Baselines
The HMM baselines combine two observation models (Gaussian ‘G’ and autoregressive ‘AR’) with two inference algorithms (Expectation-Maximization ‘EM’ and Variational Bayes ‘VB’). We use three different values for the complexity k of the HMM (3, 5, 10).
For the machine learning baselines, the Isolation Forest uses sklearn's default values, with the maximum number of samples set to automatic and the contamination value set to 0.01. For LOF, default values are used, except that novelty is set to true and contamination to 0.01. The MLP and LSTM networks both use feature distribution #14, a batch size of 16, a learning rate of 0.0005, a leaky ReLU with fixed α = 0.2, and an outlier fraction of 0.2. For the MLP, we use 18 input dimensions and 128 hidden units, and for the VAE latent states we use 16 dimensions. For the LSTM-VAE, we use a 16-time-step input and 64 hidden states.
1.4 Exp. 4c
In Exp. 4c (Section 6.5.4), it was noted that one set of objects in particular posed challenges to the classification system. Under perfect classification, an adaptive behavior rotated the gripped object and caused a collision with other objects, leading to an irrecoverable situation. Under imperfect classification, there was a set of trials that led to 0 completions. Failure occurred during the adaptation to the persistent wall collision in node 3 as the system moved to the box. The culprit was the inability of the system to adapt its motion when an object with different shape attributes (height) was used compared to the one used during user demonstrations. This result points to a weakness in the system's ability to generalize adaptations when object shapes vary drastically from training, as no spatial reasoning is yet embedded in the system.
1.5 Exp. 6
In Exp. 6, three experiments failed due to the following situation: during the 2nd adaptation attempt to grasp the block, the approach pose was inaccurate. Normally, the fingers open when a pre-pick motion has terminated. The approach trajectory had some imprecision, which led to the fingers making contact with the block and causing it to tip (instead of sliding along the block to reach an optimal pick pose). After tipping, the block was displaced beyond the field of view of the camera. At this point, the system continued to correctly trigger an NO flag; however, on re-enactment the pose of the object was unavailable, thus holding up the execution of the re-enactment. This could be prevented by a better implementation of the manipulation skills taught to pick the object. In retrospect, we never envisioned that training the pick in this way would be problematic. It is unclear whether end-to-end training would suffer a similar problem from inception. Clearly, the adaptations could be re-trained or improved to address the issue under any manipulation scheme. The question remains as to which approach would be more robust to previously unseen situations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Luo, S., Wu, H., Duan, S. et al. Endowing Robots with Longer-term Autonomy by Recovering from External Disturbances in Manipulation Through Grounded Anomaly Classification and Recovery Policies. J Intell Robot Syst 101, 51 (2021). https://doi.org/10.1007/s10846-021-01312-6