Abstract
Legal artificial intelligence (LegalAI) is an emerging field that leverages AI technology to enhance legal services. Similar Case Matching (SCM), which calculates the relevance between a candidate case and a target case, is a critical technique in LegalAI that underpins diverse legal intelligence applications. Existing approaches mainly rely on single query texts or specific keywords for retrieval, yet neglect the domain complexity and multi-faceted nature of queries. This motivates a multi-example matching paradigm, which raises three inherent challenges: 1) relevance assessment across multiple examples is complex; 2) legal documents are inherently lengthy and structured; 3) datasets with golden labels for multi-example-based legal text matching are lacking. To address these challenges, this paper develops a novel multi-example dataset and devises a Multi-level Correlation Semantic Matching (MCSM) approach to extract the similarity between cases given multi-example inputs. The proposed multi-level scheme has two aspects. First, we consider both content and structure correlations to evaluate relevance. Second, by dividing legal documents into distinctive segments, we hierarchically learn the intra- and inter-segment dependencies to model the long-term dependencies across components of legal documents. An attention mechanism is employed to capture the complex interconnections among the examples and to enable an attentive matching aggregation of content and structure. With multiple examples, MCSM tackles the intricate and diverse nature of legal queries, providing a comprehensive and multi-dimensional descriptive view. Extensive experimental evaluations show that the proposed MCSM outperforms baseline methods.
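The idea of attentive aggregation over multiple examples can be illustrated with a minimal sketch. This is not the MCSM architecture: the abstract does not specify how attention weights are computed, so the sketch assumes that each case is already encoded as a fixed-length vector, that cosine similarity serves as the matching signal, and that the similarities themselves act as attention logits. The function names `cosine`, `softmax`, and `multi_example_score` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def multi_example_score(candidate, examples):
    """Score a candidate case against several query examples.

    Assumption (not from the paper): the per-example similarities are reused
    as attention logits, so examples that match the candidate more closely
    contribute more to the aggregated score.
    """
    sims = [cosine(candidate, ex) for ex in examples]
    weights = softmax(sims)
    score = sum(w * s for w, s in zip(weights, sims))
    return score, weights

# Toy vectors standing in for encoded cases (purely illustrative).
candidate = [1.0, 0.0]
examples = [[1.0, 0.0], [0.0, 1.0]]
score, weights = multi_example_score(candidate, examples)
```

In this toy setup the first example is identical to the candidate and the second is orthogonal to it, so the first example receives the larger attention weight. A learned attention layer, separate content and structure channels, and segment-level encoders would replace these fixed choices in a model of the kind the abstract describes.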
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Huang, T., Xie, X., Liu, X. (2023). Multi-level Correlation Matching for Legal Text Similarity Modeling with Multiple Examples. In: Zhang, F., Wang, H., Barhamgi, M., Chen, L., Zhou, R. (eds) Web Information Systems Engineering – WISE 2023. WISE 2023. Lecture Notes in Computer Science, vol 14306. Springer, Singapore. https://doi.org/10.1007/978-981-99-7254-8_48
DOI: https://doi.org/10.1007/978-981-99-7254-8_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7253-1
Online ISBN: 978-981-99-7254-8
eBook Packages: Computer Science (R0)