Artificial intelligence in (French) law: the reality behind the hype
Executive summary for my (long) French post

I maintain a long post (in French) on legal AI, mainly about French legal AI tools: Intelligence artificielle en droit : derrière la "hype", la réalité.

Here are ten points in English for those in a hurry (TL;DR) or who don’t read French:

1. Artificial intelligence is first and foremost a field of research, and also a marketing term that sells well but is a catch-all, one could even say a hype term. Specialists define it as the most advanced fringe of computer science applied to information processing. In other words, achievements worthy of the name, particularly in law, are rare. The rest is classic IT.

2. Technically, true AI in law (including the best of old-school, "daddy-style" AI) is characterized by the combined use of:

  • big data
  • machine learning (ML), increasingly used in place of regexes (string searches)
  • probability calculations, with all the limitations of statistics
  • and, above all, natural language processing (NLP), either boosted by machine learning and syntax analysis (a rare case until 2020) or based on expert systems that themselves rely on character strings (regex) (a less frequent case since 2020). This means that so-called "artificial intelligences" in law are in fact a) new-generation search engines (all of them) and b) decision support systems (in France, only Case Law Analytics), not legal brains. However, since 2022, "large language models" (LLMs) (GPT et al.) have entered the fray and, despite (or given) their inability to reason and their dependence on training data, their performance is at times stunning and at other times dismal (see 4. below)
  • and expert systems, which formalize the expertise of specialists, notably by means of hierarchical trees or in the field of vocabulary (which contributes to NLP).
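
To make the "regex (string searches)" approach mentioned above concrete, here is a minimal sketch in Python. The pattern, the function name and the sample sentences are all hypothetical, made up for illustration; no tool cited in this post is claimed to work this way.

```python
import re

# Hypothetical pattern: an amount in euros right after the words "damages of".
# It accepts "15000", "15,000" or "15 000" as the amount.
DAMAGES_PATTERN = re.compile(
    r"damages of\s+(\d+(?:[ ,]\d{3})*)\s*euros",
    re.IGNORECASE,
)


def extract_damages(text: str) -> list[int]:
    """Return every amount matched by the pattern, as integers."""
    amounts = []
    for match in DAMAGES_PATTERN.finditer(text):
        # Drop thousands separators before converting to an integer.
        digits = re.sub(r"[ ,]", "", match.group(1))
        amounts.append(int(digits))
    return amounts


# Wording that follows the expected pattern is found...
found = extract_damages("The court awarded damages of 15,000 euros to the claimant.")
# ...but the same information phrased differently is missed entirely.
missed = extract_damages("The claimant shall receive fifteen thousand euros in compensation.")
```

The second call returns an empty list: a string search only finds the wordings its author anticipated, which is one reason machine learning is increasingly used instead.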

3. The value created by artificial intelligence comes from the data required for learning, much more than from the algorithm, whose development (apart from OpenAI’s GPT) is open source. In other words, data is more important than software.

This should enable traditional legal publishers, who have mostly been lagging behind since 2016, to get back into the race, as they are the ones who hold the richest data in law, especially when it comes to legal analysis and commentary.

4. Thanks to a major marketing and communications offensive, much has been said about Ross, the legal AI built on IBM’s Watson, which shut down at the end of 2020. But its actual performance fell far short of the reputation it had been given by a skillful press and social media campaign. All it did was search and analyze U.S. case law in specific areas, such as bankruptcy and intellectual property.

Also from IBM, Debater, which is built for argumentation, might seem more promising. However, it has no legal specialization at this stage, not even in the United States.

What about ChatGPT, GPT-3 and GPT-4? From version 3 onwards, the GPT family has shown impressive writing skills. At the same time, these models don’t reason, and didn’t have access to commentary-rich legal content during their training. As a result, they are not legally rigorous and frequently produce errors and absurdities. I consider them unreliable in French law.

Developed from GPT-3, the American tool Lexion can already suggest a fully drafted clause from just a few words. Correctly configured, GPT-3.5 passed two subjects of the US bar exam. On the same test, GPT-4 passed all subjects. Based on GPT-3, Harvey.ai is an in-house ChatGPT for Anglo-Saxon law firms.

Perplexity.ai, trained on a corpus selected for its reliability and connected to the Web, can sometimes (not always) give correct answers where ChatGPT fails, even if to do so it essentially copies and pastes.

Since, as said before, "content (training dataset) is king", the performance of general-purpose generative AI on legal matters is inherently limited.

5. AI and legal publishers. Westlaw and Lexis Advance, then Doctrine, Lexbase or Lefebvre Dalloz (with Ok.Doc) have been integrating small bits of AI, essentially NLP, to improve search relevance through a kind of improved synonymy and disambiguation, but also, more recently, through statistics by judge or lawyer.
In 2023, the real challenge for publishers and legaltechs will be to offer an LLM chatbot and thus plug a GPT into their content collections. The first to do so appears to be Lefebvre Dalloz in Spain with GenIA-L in September 2023, closely followed in the USA by Lexis+ AI. In French law, it’s LegiGPT, developed by an independent developer on a GPT-3 API, that has beaten them to it.

6. This emphasis on research and so-called "predictive justice" overlooks the fact that the most widespread type of legal AI application is probably "contract review" software (detection, analysis and classification of clauses in Anglo-Saxon contracts): eBrevia, Kira or Luminance, for example.

7. In French law, only a limited number of applications currently qualify as (weak) AI :

  • in contract review, Softlaw and Hyperlex
  • from Lefebvre Dalloz : as a search engine, Ok.Doc, and the RAG (Retrieval Augmented Generation) GenIA-L (demonstrated in November at the IBA in Paris)
  • on Codes alone (for the time being), LegiGPT
  • Ordalie
  • in predictive justice, Case Law Analytics (acquired by LexisNexis France in August 2023), Predictice and Lexbase’s Legalmetrics. Case Law Analytics has a narrowly scoped, bespoke ("haute couture") approach, while Predictice is de facto more focused on employment law and civil liability. We could also add Francis Lefebvre’s pioneering Jurisprudence Chiffrée, which was already working with natural language in 2010. The benefits of these applications are easier searching, and calculation of the foreseeable amount of damages and of the chances of winning a case. This contribution is not always enough to convince judges, who have more suitable in-house tools (scales), but it is attracting growing interest from law firms and insurers
  • on official texts, Luxia’s RegMind, an application for automatic monitoring of banking and financial law, and Mlang, an open-source algorithm for calculating French income tax that the DGFiP is due to put into production in 2023.
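
For readers unfamiliar with the RAG (Retrieval Augmented Generation) pattern mentioned in the list above, here is a minimal sketch in Python. It assumes a tiny invented corpus and a naive word-overlap (cosine) search. It only shows the retrieval step and the construction of the prompt that would be sent to an LLM. It is not a description of how GenIA-L or any other product is built.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a publisher's content.
CORPUS = {
    "doc-lease": "The tenant must give three months notice before ending a residential lease.",
    "doc-dismissal": "An employee dismissed without real and serious cause is entitled to damages.",
    "doc-sale": "The seller is liable for hidden defects in the goods sold.",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the ids of the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        CORPUS,
        key=lambda doc_id: cosine(q, Counter(tokenize(CORPUS[doc_id]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, k: int = 1) -> str:
    """Assemble the retrieved passages and the question into one prompt."""
    passages = "\n".join(f"[{d}] {CORPUS[d]}" for d in retrieve(question, k))
    return (
        "Answer using only the sources below and cite them.\n"
        f"{passages}\nQuestion: {question}"
    )


prompt = build_prompt("What damages can a dismissed employee claim?")
```

The point of the design is that the language model is asked to write from passages selected in a controlled collection rather than from its training memory alone, which is where a publisher's commentary-rich content comes into play.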

8. The risk of a net loss of jobs in the legal sector is currently a theoretical subject of debate. The fact remains, however, that keyword searching will be greatly simplified, that simple tasks that can be broken down into steps will be automated, and that associates, paralegals and law librarians will have to learn to work with AI (and not alongside it), in other words, to use and improve it. As for AI connected to the brain, even if research exists, we’re a long way off.

9. So-called predictive justice could entail serious risks (although this has not been demonstrated in the current state of tools), foremost among which are judgments based on obscure criteria and judges conforming to what has already been decided. Legal limits already exist, and technical countermeasures have been proposed, such as open source code or verification tests based on data sets. The advantage of so-called predictive justice is that it makes it easier to calculate the chances of winning or losing a dispute, which encourages settlements and can reduce the congestion of the courts, which are underfunded.

The French executive and legislative branches are clearly pushing in this direction: under the Justice Reform Act of March 23, 2019 and its implementing decree of December 11, 2019, prior conciliation or mediation has become compulsory for disputes under 5,000 euros and for neighborhood disputes (even if the implementing texts ... complicate the (optional) certification side of this reform, and this ADR obligation lapses if conciliators remain unavailable for more than three months).
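
The calculation of the chances of winning and of the foreseeable damages discussed in this point can be sketched in a few lines of Python. The past decisions below are invented for illustration, and the method (a raw win rate, a median, and a Wilson confidence interval) is my own simplification, not the method of any tool cited in this post.

```python
import math
import statistics

# Hypothetical past decisions in comparable cases:
# (claimant_won, damages_awarded_in_euros)
PAST_DECISIONS = [
    (True, 8000), (True, 12000), (False, 0), (True, 15000),
    (False, 0), (True, 20000), (True, 30000), (False, 0),
]


def win_rate(decisions) -> float:
    """Share of past cases won by the claimant."""
    return sum(1 for won, _ in decisions if won) / len(decisions)


def median_damages(decisions) -> float:
    """Median amount awarded, over the cases that were won."""
    return statistics.median(amount for won, amount in decisions if won)


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson confidence interval for a proportion."""
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


rate = win_rate(PAST_DECISIONS)            # 5 wins out of 8
typical = median_damages(PAST_DECISIONS)   # middle value of the 5 awards
low, high = wilson_interval(5, 8)
```

With only eight decisions, the interval around the win rate is very wide. This is one concrete form of the limitations of statistics mentioned in point 2.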

10. In conclusion, given the importance of the issues at stake, which are basically very political and economic, and the gulf between hype and fantasy on the one hand and reality on the other, we strongly recommend testing these new applications for yourself. As none of them are open source or available as a free demo, you need to make up your own mind.

See the full post (in French): Intelligence artificielle en droit : derrière la "hype", la réalité.

Emmanuel Barthe
law librarian and researcher, specialist in legal monitoring, legal search engines and legal open data
law degree (licence en droit), Faculté de droit de Sceaux

Minority report by Philip K. Dick