Roland Lazenby's most recent release is “Magic: The Life of Earvin ‘Magic’ Johnson,” seen here at Book No Further in Roanoke. Several of his earlier sports biographies have been used to train AI, according to a database compiled by The Atlantic. Photo by Lisa Rowan.

A book deal doesn’t necessarily mean money in the bank, many authors have said. 

Charles Dickens’ books were so widely pirated early in his career that he came to the U.S. for an 1842 lecture tour in hopes of making some kind of cash. Jane Austen received a small lump sum for her classic “Sense and Sensibility,” but nothing after that.

Best-selling Roanoke County-based author Sharyn McCrumb relayed those stories in a recent email exchange to demonstrate that she was not “reeling from shock” at the publishing world’s latest news. A lawsuit filed in a New York federal court last month alleges that OpenAI fed works from fiction writers including David Baldacci, Jodi Picoult, George R.R. Martin and Sylvia Day into “large language models” to teach them how to communicate.

This screenshot from The Atlantic shows a searchable database of authors whose books were used to train AI.

The Atlantic magazine, in a September piece, reported that Meta and others used more than 191,000 pirated ebooks, most from the past 20 years, to train generative AI systems without permission. A database accompanying the story revealed that McCrumb, with 14 of her books, was among thousands of authors whose words fed the machine.

“Cheating authors is a centuries old racket, so this one is about as much of a novelty as someone developing a new recipe to cook chicken,” McCrumb wrote. “(The relationship of authors to publishing is roughly that of chickens to KFC — a necessary part of the process but with no input in the operation.)”

McCrumb, famous for her “Ballad Novel” series, has 14 books, including “Kings Mountain,” “The Devil Amongst The Lawyers” and “St. Dale” on The Atlantic database of books fed to artificial intelligence. 

Others in the database with Southwest Virginia ties include Beth Macy, Barbara Kingsolver, Roland Lazenby, Martin Clark, Adriana Trigiani and Baldacci.

Writers and their advocates are making multiple moves against those they say are stealing their work. The Authors Guild filed its case in the Southern District of New York as a class-action lawsuit, with part-time Smith Mountain Lake resident Baldacci, “Game of Thrones” author Martin, Jonathan Franzen and other fiction writers named as plaintiffs.

“These authors’ livelihoods derive from the works they create,” the suit reads. “But Defendants’ [large language models] endanger fiction writers’ ability to make a living, in that the LLMs allow anyone to generate — automatically and freely (or very cheaply) texts that they otherwise would pay writers to create. Moreover, Defendants’ LLMs can spit out derivative works: material that is based on, mimics, or summarizes Plaintiffs’ works, and harms the market for them.”

The Authors Guild, which says in the suit that public domain works could have been used instead, seeks certification as a class and damages of up to $150,000 per infringed work for the listed plaintiffs and others who may join the class, plus attorneys’ fees.

Other authors have filed suits against Meta, which owns Facebook and has AI interests, according to multiple reports.

Baldacci has about four dozen books, including “Absolute Power,” listed in The Atlantic database. Baldacci, through a representative, declined an interview with Cardinal News.

The guild also published an open letter to generative AI CEOs including OpenAI’s Sam Altman, Alphabet’s Sundar Pichai, Mark Zuckerberg of Meta and Emad Mostaque of Stability AI. Baldacci, Dan Brown, James Patterson, Jennifer Egan, Nora Roberts and Margaret Atwood are among the authors who have signed it.

“We understand that many of the books used to develop AI systems originated from notorious piracy websites,” the letter reads. “[G]enerative AI threatens to damage our profession by flooding the market with mediocre, machine-written books, stories, and journalism based on our work. In the past decade or so, authors have experienced a forty percent decline in income, and the current median income for full-time writers in 2022 was only $23,000. The introduction of AI threatens to tip the scale to make it even more difficult, if not impossible, for writers — especially young writers and voices from under-represented communities — to earn a living from their profession.”

That letter calls for AI leaders to get permission to use copyrighted material in generative AI programs; compensate writers for past and continuing use in AI; and compensate writers for AI output, “regardless of whether the outputs infringe upon current laws.”

Those companies’ press representatives did not respond to emails seeking comment for this story.

A display at Book No Further in Roanoke featuring Martin Clark’s novels. “It’s flat-out theft,” Clark said of the use of copyrighted materials to train AI. Photo by Lisa Rowan.

Clark, a retired Patrick County judge whose books have both sold well and been critical favorites, had three on the database: “Plain Heathen Mischief,” “The Many Aspects of Mobile Home Living” and “The Substitution Order.” He was on a book tour for his most recent, “The Plinko Bounce,” when asked for comment.

“It’s flat-out theft, and wrong on so many levels,” he wrote in an email exchange. “Just criminal.”

Or uncivil at least, if the Authors Guild has its way. But that is an open question.

“It may be beyond the scope of copyright law to address the harms being done to authors by generative AI, and the point remains that AI-training practices are secretive and fundamentally nonconsensual,” Alex Reisner, who built the database for The Atlantic, wrote. “Very few people understand exactly how these programs are developed, even as such initiatives threaten to upend the world as we know it.”

Beth Macy. Photo courtesy of Stephanie Klein-Davis.

Ebook chicanery has affected Macy, a Roanoke-based nonfiction writer whose best-sellers “Dopesick,” “Factory Man” and “Truevine” appear on The Atlantic’s database. [Macy is a member of the Cardinal News journalism advisory committee.] She said her books have been pirated from the beginning. She would create a Google alert for the book, receive notice that free downloads were available and notify her publisher.

“I think it’s been steady throughout,” she said. “I think it was more common back in ‘Factory Man’ days [the book was published in 2014]. But even [with 2016’s] ‘Truevine,’ they’ve all been ‘free downloads’ — the publisher has a person who was policing that. I never followed up to see what happened.”

What happened was “a game of Whac-A-Mole,” according to Lazenby, whose sports biographies “Michael Jordan: The Life,” “Mind Games: Phil Jackson’s Long Strange Journey” and “Showboat: The Life of Kobe Bryant” are in the database. Publishers sent cease-and-desist orders and the downloads would disappear, only to reappear elsewhere. Now, they’re becoming AI food.

He is in book-release mode for his latest, “Magic: The Life of Earvin ‘Magic’ Johnson.”

Roland Lazenby at the 2008 NBA Finals in Boston. Photo courtesy of Scott Cunningham.

“I did note one of my former students is working on a website for me, for the ‘Magic’ launch,” Lazenby said. “And he created a bio of me from AI, and … I even posted it to Facebook, laughing about it. That was the most flattering thing. It’s just ridiculously flattering. And I thought to myself, maybe that’s how AI is making a crease in the human soul. They’re going to flatter us to death.”

He said he’s more concerned for future writers than for himself. The Jordan book, released in 2014, has seen multiple reprints and translations, and the royalties from publisher Little, Brown and Co. have been good.

“It involves a lot of other people, and they’ve got to get the rules straight or it’s going to be like the Alaska Gold Rush — they’re going to be knocking the little guy off of his claim,” he said. “It becomes a power game. … It’s a global battle, all of the AI stuff. It’s going to take a tremendous amount of will to bend this technology to a common good, I believe.”

Tad Dickens is technology reporter for Cardinal News. He previously worked for the Bristol Herald Courier...