Artificial intelligence (AI) for business that learns to categorize information within a document, store it, and extract it on request

Researchers at the University of Maryland, Baltimore County (UMBC) have made strides in automated legal document analytics (ALDA) by creating a way to machine-process the Code of Federal Regulations (CFR). The CFR is a complex document containing policies related to doing business with the federal government. All business affiliates of the federal government must comply with the CFR. For government contracts to be equitably open to a broad range of businesses, policies within the CFR must be accessible.

This document automation is just one part of a broader project to help contractors and other entities manage and monitor their legal documents. Directed by Karuna Joshi, associate professor of information systems, the team has completed a full analysis of the CFR. Digital Government: Research and Practice recently published their methodology.

Automating document review through AI

The team’s method for analyzing the CFR involves artificial intelligence (AI), which learns how to categorize information within the document, store it and extract it when it is requested. Joshi and her team achieved this by creating a knowledge graph using Semantic Web technologies to illustrate all the key terms, rules, and regulations in the document. This basic framework enables users to ask an automated tool about a specific rule and be provided with the answer.

The Semantic Web language OWL, or Web Ontology Language, is used to represent concepts and to contextualize the relationships between them. According to Joshi, the framework of the knowledge graph can be “adopted by federal agencies and businesses to automate their internal processes that reference the CFR rules and policies.” To facilitate this, the team will make the framework available in the public domain.
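To make the idea of a knowledge graph concrete, here is a minimal sketch in plain Python. It is not the UMBC team's implementation, which uses OWL and Semantic Web tooling. It only illustrates how concepts and relationships can be stored as subject-predicate-object statements and how a class hierarchy can be followed. Every identifier in it (RequestForProposal, Section_X, governedBy and so on) is a hypothetical placeholder, not content taken from the CFR.

```python
# Toy knowledge graph: a set of (subject, predicate, object) triples.
# All names below are hypothetical placeholders, not real CFR content.
TRIPLES = {
    ("RequestForProposal", "subClassOf", "Solicitation"),
    ("Solicitation", "subClassOf", "ProcurementDocument"),
    ("RequestForProposal", "governedBy", "Section_X"),
    ("Section_X", "partOf", "ExampleTitle"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def superclasses(triples, concept):
    """Follow subClassOf edges transitively, starting from `concept`."""
    found = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match(triples, s=current, p="subClassOf"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


print(superclasses(TRIPLES, "RequestForProposal"))
# prints ['Solicitation', 'ProcurementDocument']
```

Storing facts as triples means a new relationship can be added without changing any schema, which is one reason graph representations suit large, heavily cross-referenced documents.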

Question and answer

General users can interact with the knowledge graph through a kind of question-and-answer process, similar to how many people use Amazon’s Alexa or Apple’s Siri. For example, Joshi suggests that someone could ask a policy-related question like, “How many days at a minimum must a Request for Proposal (RFP) be posted open/available?” The system would query the CFR knowledge graph to find sections in the document that answer this question.
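The lookup step can be pictured with a rough sketch. The article does not describe how the team's system interprets a question, so the code below swaps in simple keyword overlap to pick the most relevant section. The rule records, section labels, and keyword sets are all invented for illustration and do not come from the CFR.

```python
import re

# Hypothetical rule records: section labels, keywords, and text are
# placeholders for illustration, not values taken from the CFR.
RULES = [
    {
        "section": "Example Section A",
        "keywords": {"rfp", "request", "proposal", "posted", "days", "minimum"},
        "text": "Placeholder rule text about RFP posting periods.",
    },
    {
        "section": "Example Section B",
        "keywords": {"payment", "invoice", "days"},
        "text": "Placeholder rule text about invoice payment.",
    },
]


def tokenize(question):
    """Lowercase the question and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", question.lower()))


def find_section(question, rules=RULES):
    """Return the section whose keywords overlap most with the question,
    or None when no keyword matches at all."""
    words = tokenize(question)
    scored = [(len(words & rule["keywords"]), rule) for rule in rules]
    best_score, best_rule = max(scored, key=lambda pair: pair[0])
    return best_rule["section"] if best_score > 0 else None


question = ("How many days at a minimum must a Request for Proposal "
            "(RFP) be posted open/available?")
print(find_section(question))
# prints Example Section A
```

A production system would translate the question into a structured query over the knowledge graph rather than count shared words, but the overall flow is the same: parse the question, search the stored rules, and return the matching section.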

The researchers anticipate that the system will be highly useful for any business held to the CFR, because its automated process makes the document’s legal complexity easier to navigate.

Access and accountability

This project to automate and support users’ understanding of legal documents is an ongoing effort by the UMBC team. Beyond the CFR, they seek to help people understand the legally binding contracts they encounter every day, such as terms of service for major companies. Lavanya Elluri, a graduate student in information systems, adds, “Our research helps the organizations that use cloud services to understand the context from these textual documents quickly.”

Many users have found their data being used without their knowledge because the relevant information is buried within terms of service and privacy policies. Joshi predicts that the tools her team is developing to help users better understand these documents will be essential to holding companies accountable for their data use.

Source: UNIVERSITY OF MARYLAND BALTIMORE COUNTY