A new study has found that Meta’s Llama 3.1 AI model reproduces passages from popular books, including Harry Potter, far more often than expected. Researchers say the model has memorized nearly 42% of the first Harry Potter book well enough to reproduce 50-token excerpts of it at least half the time.
The research was conducted by researchers from Stanford, Cornell, and West Virginia University. They examined how five major AI models handled Books3, a dataset containing thousands of books, many of them still under copyright.
Meta’s Llama 3.1 stood out for retaining large portions of well-known books such as The Hobbit, 1984, and Harry Potter and the Sorcerer’s Stone. An older model, Llama 1, memorized only about 4% of the same Harry Potter book, so the newer model appears to be holding on to far more copyrighted material.
Interestingly, the model didn’t memorize all books equally. For example, it retained only 0.13% of a lesser-known novel called Sandman Slim. That difference could make it harder for authors to sue tech companies as a group, since not all books are affected in the same way.
So why is this happening? Researchers think Meta may have used the same books too many times while training the AI. Others say the data could include quotes from fan websites, reviews, or student essays. It’s also possible that tweaks to the training process made the memorization worse without anyone realizing it.
The findings add to growing concerns about how AI models are trained, and about whether they cross legal lines when it comes to copyrighted content. As more authors push back, this could become a major issue for tech companies like Meta.