Generative AI Tools for USD Law Students

How does ChatGPT work?

ChatGPT is a large language model (LLM) that takes text prompts from users and produces responses that mimic human writing. It is not a search engine, nor is it a database. In addition to generating text, it is a powerful tool for summarizing, annotating, analyzing, translating, categorizing, and interpreting language. In this guide we focus on ChatGPT, though other LLM-based generative AI tools such as Bing, Bard, Microsoft Copilot, and Claude are also gaining in popularity.

Background: Artificial intelligence has origins as early as the 1950s and 60s, when researchers first tried to answer the question of how to make a machine "smart." One approach was to create millions of flow charts and code them so that the computer would be "smart." The other approach was the neural net (connectionist) approach, which sought to emulate the brain. Although the neural net approach had legitimacy, it only began to gain steam as computer processing speed improved in the 1970s and 80s. ChatGPT is based on the neural network approach.

While AI made great strides in decision-making tasks like self-driving cars, gameplay (such as chess), and modeling, other areas like language proved more difficult to conquer. To emulate language, terms were initially coded in metadata one term at a time, in something like a large spreadsheet. Language models improved on this by assigning each word its own vector (a list of numbers). During training, the model adjusts these vectors again and again by playing a fill-in-the-blank guessing game, getting a little better with each round.

Example: The man went to the store to buy a gallon of [______]. The first time it plays the game, the model guesses the word [elephant] and then evaluates its own response against the actual text (self-supervision). Eventually it learns the correct response [milk] and retains that vector choice.
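To make the idea of word vectors more concrete, here is a minimal Python sketch. The three-number vectors are invented purely for illustration (a real model learns its own vectors, with far more dimensions), and the function names are hypothetical. It only shows how words that fit similar contexts can end up with vectors that point in similar directions.

```python
import math

# Hypothetical 3-dimensional vectors, made up for this illustration.
# A real language model learns its vectors during training.
vectors = {
    "milk":     [0.9, 0.8, 0.1],
    "juice":    [0.8, 0.9, 0.2],
    "elephant": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Return a score near 1.0 when two vectors point in a similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word):
    """Return the other word whose vector is closest to the given word's vector."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))

print(most_similar("milk"))  # prints "juice"
```

With these made-up numbers, "milk" sits much closer to "juice" than to "elephant," which is the kind of relationship a trained model relies on when filling in the blank.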

In 2017, a team of Google researchers developed the modern-day transformer. A transformer essentially boosts the speed of these models, allowing for a language model that uses millions of dimensions and billions of words/vectors in multiple languages. This is known as a large language model. ChatGPT stands for Chat Generative Pre-trained Transformer and is an example of a large language model.

It is crucial to note that although ChatGPT may appear to "know" something about a particular law, it does not actually know about its legislative intent, statute of limitations, or application in case law.  Its powerful algorithms and ability to mimic human language can give the appearance of knowledge and understanding.

Example: Jack and Jill went up the hill to fetch a pail of [_______].  In this instance, ChatGPT will likely respond with [water] due to the popular English nursery rhyme.  Factually though, this makes little sense.  Water flows downward. Jack and Jill shouldn't be heading up a hill to fetch a pail of water at all.  ChatGPT does not understand gravity or physics but does know that the word "water" often follows that text prompt. ChatGPT and large language models in general are trained to simply guess the next word, without any concern for accuracy. 
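The "guess the next word" behavior can be sketched with a toy word-count model in Python. This is an assumption-laden simplification: it is not a neural network or a transformer, the three-sentence corpus is invented, and the names `train` and `predict_next` are hypothetical. It illustrates only the narrow point above: the prediction comes from how often words follow one another, not from any understanding of water or hills.

```python
from collections import Counter, defaultdict

# Tiny made-up "training data"; real models train on billions of words.
corpus = [
    "jack and jill went up the hill to fetch a pail of water",
    "she carried a pail of water to the garden",
    "he filled a pail of sand at the beach",
]

def train(sentences, context_size=2):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(context_size, len(words)):
            context = tuple(words[i - context_size:i])
            counts[context][words[i]] += 1
    return counts

def predict_next(counts, prompt, context_size=2):
    """Return the most frequent follower of the prompt's last words, or None."""
    context = tuple(prompt.lower().split()[-context_size:])
    if context not in counts:
        return None  # this word sequence never appeared in the training data
    return counts[context].most_common(1)[0][0]

model = train(corpus)
print(predict_next(model, "Jack and Jill went up the hill to fetch a pail of"))  # prints "water"
```

In this toy corpus, "water" wins simply because it follows "pail of" twice while "sand" follows it once. Nothing in the code checks whether the answer is true or sensible.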

Release of ChatGPT-3.5: OpenAI launched several versions of its GPT models beginning in 2018 and fine-tuned them by adding elements like Reinforcement Learning from Human Feedback (RLHF). RLHF uses human feedback in the training loop to minimize untruthful, offensive, biased, or heavily nuanced outputs (like our Jack and Jill example above). ChatGPT-3.5 was released at the end of November 2022 to much fanfare. However, shortly afterward OpenAI CEO Sam Altman cautioned that ChatGPT really shouldn't be used "for anything important right now."

Tweet from Sam Altman: ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It's a mistake to be relying on it for anything important right now. It's a preview of progress; we have lots of work to do on robustness and truthfulness.

For more on what exactly ChatGPT and other large language models are doing: 

ChatGPT-3.5 versus 4

ChatGPT-3.5 is the free version offered by OpenAI, released at the end of November 2022. ChatGPT-4 is the newest publicly available version of ChatGPT and costs $20/month. Although the two were released within just six months of each other, ChatGPT-4 is significantly more sophisticated. GPT-4 more reliably articulates the relevant laws and orients the student or lawyer to the correct elements and terms of art. GPT-4 is also better able to handle longer and more sophisticated fact patterns.

  • Bar Exam - GPT-3.5 took the multiple-choice portion of the bar exam in early 2023 and failed to pass. GPT-4 has now passed not only the multiple-choice portion, but also the essay portion, and scored around the top 10% of test takers.
  • Legal Writing - GPT-3.5 writes like a law student who knows of laws but has no real understanding of how they are applied, so it gives a lot of legal jargon and limited or no elements. GPT-4 gives much more detail on the different aspects a court will consider in applying the law.
  • Legal Research - GPT-3.5 is well known for hallucinations (fictional cases and statutes). GPT-4 provides much better (mostly real) sources; however, its analysis is sometimes flawed.

Image title: Report Card for ChatGPT. ChatGPT-3.5: Legal Research F, Legal Writing C+. ChatGPT-4: Legal Research C+, Legal Writing B+

Image taken from Laura Killinger & Leslie A. Street, What Did I Miss? A Demonstration of the Differences Between ChatGPT-4 and 3.5 that Impact Legal Research and Writing (August 2023)

How to interact with GPT-4 without a subscription 

  • As a result of Microsoft's $1 billion investment in OpenAI, you can use the free generative AI tool Bing in "creative mode," which essentially runs GPT-4.
  • The legal writing education platform Write.law provides a place for people to interact with GPT-4 and learn how to use it.

GPT Strengths and Weaknesses

Screenshot of the ChatGPT landing page, July 2023, showing examples (e.g. creative ideas for a 10-year-old's birthday), capabilities (allows the user to provide follow-up instructions), and limitations (limited knowledge of world events after 2021).

Strengths

  • Able to summarize large amounts of text in very little time, e.g. it can summarize the Bible in less than a minute.
  • Conversational style allows the user to continue to refine prompts and improve outputs.
  • Able to identify typos/grammar mistakes. 
  • Able to provide incremental improvements to your drafts by rephrasing, changing the tone, or providing different examples of the same sentence for different audiences. 

Weaknesses 

  • Hallucinations - ChatGPT was built for language fluency, not factual accuracy. This makes it prone to hallucinations, that is, confident-sounding but fabricated content such as fictional cases and statutes.
  • Currency - ChatGPT has a limited scope of reference. The training data for ChatGPT-3.5 has a cutoff date of September 2021, which means it has no knowledge of anything that happened after that date. GPT-4 uses Browse with Bing to search the internet for more current information in order to answer questions, a functionality not present in ChatGPT-3.5. However, being able to search the internet is different from training an LLM on updated data.
  • Bias - The data sets that ChatGPT and other LLMs are trained on are not neutral.  They are created in a society where racism, sexism, and ableism already exist. 
  • Confidentiality - Most LLM providers follow industry-standard security practices, so there are some assurances of confidentiality when entering personal information. Even so, legal practitioners should never enter confidential client information.
  • Misunderstanding - ChatGPT’s model may not grasp the context of the prompt, and is known to struggle with “common sense” knowledge, idioms, and sarcasm.
  • Potential for plagiarism - ChatGPT may occasionally reproduce text from its training data wholesale, especially when the prompt is very basic or has too little context.