These authors are suing OpenAI and Meta for copyright infringement now

Jul 10, 2023, Hi-network.com

Sarah Silverman speaks on May 05, 2022 in New York City.

Cindy Ord/Getty Images for Variety

Sarah Silverman has joined forces with fellow authors Richard Kadrey and Christopher Golden to sue Meta and OpenAI, filing two separate claims of copyright infringement.

Each suit targets one of the companies. The authors claim they never consented to the use of their copyrighted books as training material for the large language models (LLMs) behind OpenAI's ChatGPT and Meta's LLaMA.

An LLM is a type of artificial intelligence algorithm trained on massive amounts of text from books and the internet. From that text it learns language patterns, grammar, and context until it can generate human-like text and hold chat interactions with users.

According to the lawsuits, the models "remix the copyrighted works of thousands of book authors -- and many others -- without consent, compensation, or credit." 

Copyright infringement has been one of the many concerns of AI skeptics since ChatGPT became widely available in November 2022, triggering the generative AI boom and raising questions about how AI will affect creativity and copyright.

The lawsuits claim the LLMs were trained on illegally acquired materials, such as those found on "shadow library" websites. According to the OpenAI suit:

"The OpenAI Books2 dataset can be estimated to contain about 294,000 titles. The only 'internet-based books corpora' that have ever offered that much material are notorious 'shadow library' websites like Library Genesis (aka LibGen), Z-Library (aka B-ok), Sci-Hub, and Bibliotik. The books aggregated by these websites have also been available in bulk via torrent systems."

The Meta suit makes similar claims and points to the sources from which the books in the training data were gathered. It divides them into two groups. The first is Project Gutenberg, an online archive of books that are out of copyright. The second is the "Books3 section of ThePile," a dataset that is available on the popular AI project hosting site Hugging Face and appears to represent all of Bibliotik, mentioned above.

The plaintiffs are represented by lawyers Joseph Saveri and Matthew Butterick, who also represent authors Mona Awad and Paul Tremblay in a copyright infringement lawsuit filed against OpenAI in June.

Artificial Intelligence

  • Generative AI will far surpass what ChatGPT can do. Here's everything on how the tech advances
  • ChatGPT's new web browsing feature is a big disappointment. Use this plugin instead
  • What is Amazon Bedrock? 4 ways it can help businesses use generative AI tools
  • Can generative AI solve computer science's greatest unsolved problem?
