Multi-level Correlation Matching for Legal Text Similarity Modeling with Multiple Examples

  • Conference paper
Web Information Systems Engineering – WISE 2023 (WISE 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14306))

Abstract

Legal artificial intelligence (LegalAI) is an emerging field that leverages AI technology to enhance legal services. Similar Case Matching (SCM), which calculates the relevance between a candidate case and a target case, is a critical LegalAI technique that underpins a range of legal intelligence applications. Existing approaches mainly rely on single query texts or specific keywords for retrieval, and neglect the domain complexity and multi-faceted nature of legal queries. This motivates a multi-example matching paradigm, which raises three inherent challenges: 1) assessing relevance across multiple examples is complex; 2) legal documents are inherently lengthy and structured; 3) datasets with golden labels for multi-example legal text matching are lacking. To address these challenges, this paper develops a novel multi-example dataset and devises a Multi-level Correlation Semantic Matching (MCSM) approach to extract similarity between cases given multi-example inputs. The multi-level scheme has two aspects. First, both content and structure correlations are considered when evaluating relevance. Second, by dividing legal documents into distinct segments, the approach hierarchically learns intra- and inter-segment dependencies, modeling the long-term dependencies across the components of a legal document. An attention mechanism captures the complex interconnections among the examples and enables an attentive aggregation of content and structure matching. With multiple examples, MCSM handles the intricate and diverse nature of legal queries by providing a comprehensive, multi-dimensional description of the query. Extensive experimental evaluations show that the proposed MCSM outperforms baseline methods.
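The idea of scoring a candidate case against several example cases, using both content and structure signals and an attention-style aggregation, can be illustrated with a toy sketch. This is not the authors' MCSM implementation: the segment vectors, the mean-pooled "content" score, the position-aligned "structure" score, the mixing weight `alpha`, and the softmax weighting over examples are all assumptions chosen only to make the general pattern concrete.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_pool(segments):
    """Average a document's segment vectors into one document-level vector."""
    n = len(segments)
    return [sum(col) / n for col in zip(*segments)]

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def match_score(candidate, examples, alpha=0.5):
    """Toy multi-example relevance score (hypothetical, for illustration only).

    candidate: list of segment vectors for one case.
    examples:  list of cases, each a list of segment vectors.
    alpha:     assumed mixing weight between content and structure scores.
    """
    per_example = []
    for ex in examples:
        # Content signal: compare whole-document (mean-pooled) vectors.
        content = cosine(mean_pool(candidate), mean_pool(ex))
        # Structure signal: compare segments at the same position.
        pairs = list(zip(candidate, ex))
        structure = sum(cosine(c, e) for c, e in pairs) / len(pairs)
        per_example.append(alpha * content + (1 - alpha) * structure)
    # Attention-style aggregation: examples that match better get more weight.
    weights = softmax(per_example)
    return sum(w * s for w, s in zip(weights, per_example))

# Two candidates with identical pooled content but different segment order.
examples = [[[1.0, 0.0], [0.0, 1.0]]]
same_order = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(match_score(same_order, examples) > match_score(swapped, examples))
```

In this sketch the two candidates have the same pooled content, so only the structure term separates them, which is the kind of distinction a content-only matcher would miss.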



Author information

Corresponding author

Correspondence to Xike Xie.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Huang, T., Xie, X., Liu, X. (2023). Multi-level Correlation Matching for Legal Text Similarity Modeling with Multiple Examples. In: Zhang, F., Wang, H., Barhamgi, M., Chen, L., Zhou, R. (eds) Web Information Systems Engineering – WISE 2023. WISE 2023. Lecture Notes in Computer Science, vol 14306. Springer, Singapore. https://doi.org/10.1007/978-981-99-7254-8_48

  • DOI: https://doi.org/10.1007/978-981-99-7254-8_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7253-1

  • Online ISBN: 978-981-99-7254-8

  • eBook Packages: Computer Science, Computer Science (R0)
