Abstract
Transfer learning approaches have been shown to significantly improve performance on downstream tasks. However, it is common for prior works to report only where transfer learning was beneficial, ignoring the significant trial-and-error required to find effective settings for transfer. Indeed, not all task combinations lead to performance benefits, and brute-force searching rapidly becomes computationally infeasible. Hence, the question arises: can we predict whether transfer between two tasks will be beneficial without actually performing the experiment? In this paper, we leverage explainability techniques to predict whether task pairs will be complementary, through comparison of neural network activations between single-task models. In this way, we can avoid grid-searches over all task and hyperparameter combinations, dramatically reducing the time needed to find effective task pairs. Our results show that, through this approach, it is possible to reduce training time by up to 83.5% at a cost of only a 0.034 reduction in positive-class F1 on the TREC-IS 2020-A dataset.
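The sketch below illustrates the general idea described in the abstract, not the paper's exact method: given two single-task models trained over the same input space, compute a feature-attribution profile for each and compare the profiles to decide whether the pair is worth trying for transfer. The attribution method (gradient × input), the similarity measure (cosine), and the decision threshold are all assumptions chosen for brevity; the paper itself may use different attribution techniques (e.g. Integrated Gradients) and a learned predictor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: score a task pair by comparing the feature attributions
# of their single-task models on a shared batch of inputs.

def attribution_vector(model: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Mean absolute gradient-x-input attribution over a batch of inputs."""
    inputs = inputs.detach().clone().requires_grad_(True)
    model(inputs).sum().backward()         # scalar objective for backprop
    attr = (inputs.grad * inputs).abs()    # gradient x input attribution
    return attr.mean(dim=0)                # average over the batch

def attribution_similarity(model_a: nn.Module, model_b: nn.Module,
                           inputs: torch.Tensor) -> float:
    """Cosine similarity between the two models' attribution profiles."""
    a = attribution_vector(model_a, inputs).flatten()
    b = attribution_vector(model_b, inputs).flatten()
    return F.cosine_similarity(a, b, dim=0).item()

if __name__ == "__main__":
    # Two toy single-task classifiers over the same 32-dimensional input space.
    model_a = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))
    model_b = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))
    shared_inputs = torch.randn(64, 32)

    sim = attribution_similarity(model_a, model_b, shared_inputs)
    # The 0.5 threshold is illustrative only; in practice it would be tuned
    # or replaced by a classifier over the attribution comparison features.
    print(f"attribution similarity: {sim:.3f} -> "
          f"{'try transfer' if sim > 0.5 else 'skip pair'}")
```

In this framing, only single-task models need to be trained once each; every candidate pair is then scored cheaply from their attributions rather than by running the full transfer experiment.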
Notes
1. More information on metrics and tasks can be found at http://trecis.org.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hepburn, A.J., McCreadie, R. (2022). Identifying Suitable Tasks for Inductive Transfer Through the Analysis of Feature Attributions. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_16
DOI: https://doi.org/10.1007/978-3-030-99739-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-99738-0
Online ISBN: 978-3-030-99739-7
eBook Packages: Computer Science, Computer Science (R0)