The 2015 Workshop on Automated Detection, Extraction and Analysis of Semantic Information in Legal Texts was held in conjunction with ICAIL 2015: XV International Conference on AI and Law, Friday June 12th 2015, San Diego, CA, USA

This workshop brought together an interdisciplinary group of scholars, academic and corporate researchers, legal practitioners, and legal service providers for an extended, collaborative discussion about applying natural language processing and machine learning to the semantic analysis of legal texts. Semantic analysis is the process of relating syntactic elements and structures, drawn from the levels of phrases, clauses, sentences, paragraphs, and whole documents, to their language-independent meanings in a given domain, including meanings specific to legal information. The focal texts range from:

The papers and discussion included retrieving documents with varying concepts of relevance, extracting legal norms from retrieved documents, and extracting various sorts of arguments.

The workshop was especially timely. Researchers have long been developing tools to aggregate, synthesize, structure, summarize, and reason about legal norms and arguments in texts. Recently, however, dramatic advances in natural language processing, text and argument mining, information extraction, and automated question answering are changing how automatic semantic analysis of legal rules and arguments will be performed in the future.


The workshop began with a brief tutorial session. University of Pittsburgh Intelligent Systems Program graduate students Matthias Grabmair and Jaromir Savelka gave a short introduction to natural language processing and machine learning and demonstrated some open source NLP tools.

Invited Speech

Computational linguist Paul Zhang, Consulting Research Scientist at the R&D Group of LexisNexis (a Division of Reed Elsevier Inc.), delivered the invited speech entitled "Semantic Annotation of Legal Texts". He described his project for generating semantic surrogates of legal documents in the LexisNexis corpus and for assessing semantic similarity. His approach focuses on legal concepts as points for discussion about legal issues. He defined legal concepts operationally as words or phrases that represent the same idea in legal discussion. Using a combination of automated and manual techniques, the team constructed a list of concepts and of the variations for expressing each concept. They mined the corpus for n-grams, filtered the n-grams with linguistic rules, grouped variant terms together, and selected the legal concepts, editing manually throughout. Zhang discussed the problems of how to maintain the consistency and integrity of the concept list, how to deal with infrequent terms and new terms, and how to determine the appropriate level of abstractness of concepts. He also discussed conceptual search, that is, searching for similar texts by matching texts at the concept level. He discussed annotating the semantics of texts through V-Triples and finding the sameness of predicates, that is, using multiple levels of annotation in terms of the linguistic constructs associated with a concept (verbs, nouns, and verb-centered predicates, or V-Triples). He said that the team has not evaluated the texts quantitatively but could survey users. The result is a list of key concepts of cases that can be used for identifying cases cited for a user’s concept of interest and as a basis for conceptual search. He characterized his approach as a naïve but practical text annotation technique. He closed with the observation that semantics is like the horizon, ever receding.
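The mining and filtering steps described in the talk can be illustrated with a minimal sketch. It is not the LexisNexis pipeline: the stop-word list stands in for the unspecified "linguistic rules", and the function names, thresholds, and sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for the "linguistic rules": a candidate concept
# may not begin or end with one of these function words.
STOPWORDS = {"the", "of", "a", "an", "to", "in", "for", "and", "was", "is", "by"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def mine_ngrams(docs, n_max=3, min_count=2):
    """Count 2..n_max-grams, then keep frequent ones that pass the filter."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for n in range(2, n_max + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    candidates = {}
    for gram, count in counts.items():
        if count < min_count:
            continue  # too rare to propose as a concept
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue  # fails the (assumed) linguistic rule
        candidates[" ".join(gram)] = count
    return candidates

# Invented sample sentences, not drawn from any real corpus.
docs = [
    "The court found a breach of contract by the seller.",
    "Damages for breach of contract were awarded to the buyer.",
    "The buyer alleged breach of contract and negligence.",
]
candidates = mine_ngrams(docs)
print(candidates)
```

On these three sentences only "breach of contract" survives both the frequency threshold and the filter. Grouping variant terms and the manual editing pass described in the talk are outside the scope of this sketch.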

Paper Presentations

Vern R. Walker, Parisa Bagheri and Andrew J. Lauria, Argumentation Mining from Judicial Decisions: The Attribution Problem and the Need for Legal Discourse Models

Walker focused on the importance of the "Attribution Problem" in extracting argument-related information from case texts. Ideally, the extraction method should identify who believes which propositions. Who treats or accepts the proposition as probably true? Who relies upon or uses the proposition as support in argument or reasoning? A proposition’s utility varies depending on whether it was asserted as the trier-of-fact’s own finding or that of an expert witness for one side or the other. Walker distinguished between the attribution object, subject, and cue (i.e., a lexical anchor such as "testified that" or "found the conclusion that ... to be reasonable"). He underscored the importance of presuppositional information, that is, background information shared among writers and readers that is useful for understanding the meaning of a legal document. This includes both legal procedural information and information about the substantive domain. This information (e.g., understanding which party has the burden of proof on which legal issues) is key to understanding attributions. He established the need for, and sketched, a legal discourse model to deal with the attribution problem in argumentation mining from judicial decisions. The legal discourse model includes types of actors, actions, and attribution relations. Walker provided a series of examples drawn from real cases illustrating simpler and more complex attribution issues. For instance, if the trier of fact cites consistent testimony of experts for both sides, one may infer that the trier regards the proposition as an uncontested finding and part of the court’s holding.
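The subject/cue/object distinction can be made concrete with a small sketch. It assumes a hypothetical, very short cue list and a single sentence pattern. It is not the authors' discourse model, and it ignores the presuppositional information that Walker argued is essential for real decisions.

```python
import re
from typing import NamedTuple, Optional

class Attribution(NamedTuple):
    subject: str  # who asserts or accepts the proposition
    cue: str      # lexical anchor signalling the attribution
    obj: str      # the attributed proposition

# Hypothetical cue list; a real system would need far more cues and context.
CUES = ["testified that", "found that", "concluded that", "argued that"]

PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<cue>"
    + "|".join(re.escape(cue) for cue in CUES)
    + r")\s+(?P<object>.+?)\.?$"
)

def extract_attribution(sentence: str) -> Optional[Attribution]:
    """Return the attribution triple if the sentence contains a known cue."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Attribution(match.group("subject"), match.group("cue"), match.group("object"))

# Invented example sentence.
result = extract_attribution("Dr. Smith testified that the vaccine caused the injury.")
print(result)
```

A surface pattern like this only identifies who uttered the proposition. Deciding whether the trier of fact adopted it as a finding requires the kind of procedural and substantive background knowledge the discourse model is meant to capture.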

Sara Frug and Thomas Bruce, Uses of Crowdsourcing for NLP and Machine-Learning Experiments in the Legal Domain

The authors addressed the important issue of who will annotate texts for purposes of machine learning and information extraction. They raised the enticing possibility of crowdsourced annotation. In particular, their Legal Information Institute (LII) at Cornell University has 29 million users who may be "just-expert-enough" to annotate documents. This is especially true where annotation tasks can be decomposed into "tiny tasks" that minimize the need for expertise. They focused on an LII tool that links defined terms in regulations to their definitions. Using a combination of rule-based extraction and machine learning, the tool detects the definition, parses the defined terms, parses the definitional phrase, retrieves definitions incorporated by reference, and detects scoping language that may limit the applicability of a definition. LII users (i.e., the "crowd") were asked to assess the quality of the definition offered. Key questions in deciding whether to use crowds for more detailed annotations are whether one can come up with a reasonable decomposition and, if so, whether one can build an interface people will use.
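The rule-based part of such a tool might look roughly like the following sketch, which splits one sentence into scoping language, defined term, and definitional phrase. The pattern is an assumption made for illustration, not the LII implementation. It handles only straight double quotes and the verb "means", and it does not retrieve definitions incorporated by reference.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern: optional scoping phrase, optional "the term",
# a quoted defined term, the verb "means", and the definitional phrase.
DEFINITION = re.compile(
    r'^(?:(?P<scope>(?:For purposes of|As used in) [^,]+),\s*)?'
    r'(?:the term\s+)?"(?P<term>[^"]+)"\s+means\s+(?P<definition>.+?)\.?$',
    re.IGNORECASE,
)

def parse_definition(sentence: str) -> Optional[Dict[str, Optional[str]]]:
    """Return scope, term, and definition, or None if no definition is found."""
    match = DEFINITION.match(sentence.strip())
    if match is None:
        return None
    return {
        "scope": match.group("scope"),
        "term": match.group("term"),
        "definition": match.group("definition"),
    }

# Invented example sentence, not quoted from any regulation.
parsed = parse_definition(
    'For purposes of this section, the term "vessel" means any watercraft used for transportation.'
)
print(parsed)
```

Each field returned here corresponds to one of the "tiny tasks" a crowd member could be asked to confirm or reject, which is one way the decomposition question raised by the authors might be approached.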

Rosario Girardi, Identifying Problems and Solutions on the Automatic Extraction of Ontology Elements from Legal Texts

Girardi provided a formal definition of an ontology, discussed text mining to extract concepts and relations for the ontology, and described an approach to assessing the correctness of the inferred ontology by comparing the results of an application using the inferred ontology with those of one using a reference ontology. Since manual construction of ontologies by domain experts and knowledge engineers is a costly task, automatic and/or semi-automatic approaches to their development are needed. This field of research is sometimes referred to as "ontology learning and population". Girardi discussed some problems and solutions for the automated acquisition of each of the components of an ontology (classes, taxonomic and non-taxonomic relationships, axioms, and instances) using examples from the family law domain.

Jaromir Savelka and Kevin D. Ashley, Multi-label Classification of Public Health System Emergency Statutes

Savelka reported on mining statutory texts for specific functional information using NLP techniques and a supervised ML approach. He focused on California regulatory provisions dealing with the general topic of public health system emergency preparedness and response. They investigated whether provisions can be assigned multiple labels in different categories more effectively using various multi-label classification techniques. He reported experiments suggesting that classifiers capable of predicting more than one label for an individual datapoint outperformed classifiers trained to assign single labels only.

Marc van Opijnen, Nico Verwer and Jan Meijer, Beyond the Experiment: the eXtendable Legal Link extractor

Van Opijnen presented a software framework for detecting and resolving references to national and EU legislation, case law, parliamentary documents, and official gazettes. It is designed to function in a large-scale production environment. The authors employ the pipeline architecture of Apache Cocoon, using a trie data structure for named entity recognition and a parsing expression grammar, rather than regular expressions, for pattern recognition. He also discussed the challenges they faced with regard to the diversity of citations for European Union legal documents and the changes in their standardization.
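To show why a trie suits this kind of named entity recognition, here is a minimal sketch of longest-match lookup over tokens. It is not the authors' framework: the statute names, the identifiers, and the function names are hypothetical, and the parsing expression grammar stage is not shown.

```python
_END = object()  # sentinel key marking the end of a known name in the trie

def build_trie(names):
    """Build a token-level trie from a mapping of name -> identifier."""
    root = {}
    for name, identifier in names.items():
        node = root
        for token in name.lower().split():
            node = node.setdefault(token, {})
        node[_END] = identifier
    return root

def find_references(text, trie):
    """Scan left to right and report the longest known name at each position."""
    tokens = text.split()
    norm = [token.lower().strip(".,;()") for token in tokens]
    found = []
    i = 0
    while i < len(tokens):
        node, best, j = trie, None, i
        while j < len(norm) and norm[j] in node:
            node = node[norm[j]]
            j += 1
            if _END in node:
                best = (j, node[_END])  # remember the longest match so far
        if best is None:
            i += 1
            continue
        end, identifier = best
        found.append((" ".join(tokens[i:end]).strip(".,;()"), identifier))
        i = end
    return found

# Hypothetical names and identifiers, for illustration only.
trie = build_trie({
    "Data Protection Act": "act:dpa",
    "Data Protection Act 1998": "act:dpa-1998",
    "Civil Code": "code:civil",
})
refs = find_references(
    "Under the Data Protection Act 1998, and the Civil Code, the claim fails.", trie
)
print(refs)
```

Because lookup walks the trie one token at a time, the cost of scanning a text does not grow with the number of known names in the way a long alternation of regular expressions might, which is a plausible reason for the design choice in a large-scale production setting.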

Llio Humphreys, Guido Boella, Livio Robaldo, Luigi di Caro, Loredana Cupi, Sepideh Ghanavati, Robert Muthuri and Leendert van der Torre, Classifying and Extracting Elements of Norms for Ontology Population using Semantic Role Labelling

Muthuri reported on initial experiments of classification and extraction of norm elements in European Directives using dependency parsing and semantic role labeling. The experimental system takes advantage of the way Eunomos and Legal URN present norms in a structured and searchable format for representing who may or must do what and when. He focused on how to extract prescriptions (i.e., norms) and other concepts (e.g., reason, power, obligation, nested norms) from legislation, how to automate ontology construction, and how to implement semantic role labeling to detect semantic arguments. He also contrasted EU directives and member states' implementing legislation. The goal is a role-based information system that can pull together the information relevant to particular roles (e.g., construction manager).

Akshay Minocha, Arjit Srivastava, Navjyoti Singh and Ashleigh Rhea Gonzales, Case Outcome Prediction using Legal Citation Network and Litigant Identification Features

Gonzales described a case-based prediction technique based on the authors' observation that content sentences in legal cases exhibit polarity under sentiment analysis, with strong emphasis assigned to the party to which the sentence refers, i.e., either the appellant or the respondent. They built a citation network from legal cases citing the Constitution of India, analyzed the extracted clusters, and related directionally connected node pairs by a text similarity score. They then predicted case outcomes using the sentiment analysis scores assigned by SentiWordNet and thematic features of the sentences. A complementary method adapted the approach for evaluation of case evidence.

Mauro Dragoni, Guido Governatori and Serena Villata, Automated Rules Generation from Natural Language Legal Texts

Governatori addressed automated rules generation from natural language legal texts. Technical legal documents and codes describe in natural language what is permitted, forbidden, or mandatory in the regulated context. Using normative reasoning techniques, business processes could be tested for compliance with the norms given only the texts of the regulations. The problem, however, is translating a normative text in natural language into a set of (semi-)formal rules of the kind: IF A occurred, THEN B is obligatory. Governatori reported on an evaluation of a natural language processing framework performing such a translation of the Australian "Telecommunications Consumer Protections Code". They employed shallow ontologies and a fine-grained parser for extracting sentences and terms corresponding to business process tasks. They employed deontic logic for combining terms to define and refine the rules (i.e., to identify the obligations, permissions, etc., and decide which logical templates apply, such as "if then", "if then unless", etc.). Testing the approach on 35 clauses, they found that one third were wrong, either because more complex patterns were needed or because the sentences were hard to parse.
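The target representation can be sketched as a small data structure covering the "if then" and "if then unless" templates. This is an illustrative assumption about what a (semi-)formal rule might look like, not the authors' formalism. The field names and the sample rule are hypothetical and are not taken from the Telecommunications Consumer Protections Code.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass(frozen=True)
class Rule:
    condition: str                   # A in "IF A THEN B is obligatory"
    modality: str                    # "obligatory", "permitted", or "forbidden"
    target: str                      # B, e.g. a business process task
    exception: Optional[str] = None  # present only for "if then unless"

    def applies(self, facts: Set[str]) -> bool:
        """The rule fires if its condition holds and its exception does not."""
        if self.condition not in facts:
            return False
        return self.exception is None or self.exception not in facts

    def render(self) -> str:
        """Render the rule using the IF/THEN/UNLESS template."""
        text = f"IF {self.condition} THEN {self.target} is {self.modality}"
        if self.exception is not None:
            text += f" UNLESS {self.exception}"
        return text

# Hypothetical rule, invented for illustration.
rule = Rule(
    condition="complaint_received",
    modality="obligatory",
    target="acknowledge_complaint",
    exception="complaint_withdrawn",
)
print(rule.render())
print(rule.applies({"complaint_received"}))
```

Checking a business process for compliance would then amount to evaluating which rules apply to the facts recorded at each step. The hard part reported in the talk, mapping natural language clauses onto such templates, is not addressed by this sketch.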

Moderated Discussion

A moderated discussion closed the workshop. Topics included the question of who will perform all of the annotations for these particular tasks and how they will be performed. It was observed that automating part of the annotation tasks can facilitate human effort dramatically. Humans are then employed primarily to certify the annotation output. The need for a process model of annotation seemed especially important. Tom Bruce raised the question of what corpus one should use. For funding, one needs to show proof of a corpus at a reasonable scale that is tractable, does not contain too much dirty data, and for which annotator participation can be motivated. Someone mentioned potential IP problems, such as the possibility that mark-up processes infringe patents. Walker observed that law is unlikely ever to be a big data project: the numbers of cases, issues, and concepts are not high enough. The need for dedicated annotated test sets was underscored. One needs to assess whether the subdomains share structures and features. Curation efforts are required.

Follow-Up Requests

Interested persons are invited to address questions or comments to the speakers and authors or to the workshop organizing committee.


Organizing Committee

Program Committee
