The University of Michigan has allegedly sold 85 hours of audio recordings from various academic settings, including lectures, interviews, office hours, study groups, and student presentations, to third parties for the purpose of training artificial intelligence. The school has also sold a dataset of 829 academic papers from students to help fine-tune large language models (LLMs).
It is unclear whether those included in the data consented to having their audio and texts used in such a manner. However, a sample dataset downloaded by The Daily Beast included a recording of a lecture from 1999, making it highly unlikely that those recorded knew their data would be used to train future generative AI models.
AI engineer Susan Zhang took to X to post a screenshot of what looks to be an advertisement she recently received on LinkedIn from Catalyst Research Alliance, a firm selling the UM data. The sender wrote that they were “reaching out because, based on your profile, you may be working with” LLMs.