Runway, a startup building generative AI for content creators, allegedly stole thousands of YouTube videos to train its AI video generator without permission
In the fiercely competitive landscape of AI startups, the race to replicate the success of OpenAI has led some to questionable practices. Runway, a prominent startup building generative AI for content creators, has been accused of illegally using thousands of YouTube videos, including more than 1,600 from renowned tech YouTuber Marques Brownlee, to train its AI video generator without permission.
According to a bombshell report from 404 Media, Runway’s AI, designed for content creators, was secretly trained using an extensive collection of YouTube videos and pirated content. A leaked internal document revealed a coordinated effort within the company to “collect thousands of YouTube videos and pirated content for training data.”
Generative AI Startup Runway Scraped YouTube Videos Without Permission, Including Nintendo’s and YouTuber Marques Brownlee’s
According to the report, Runway’s highly touted AI video generation tool was secretly trained on videos scraped from popular YouTube channels and brands, along with pirated films. The finding is based on a comprehensive internal spreadsheet of training data obtained by 404 Media. The scraped videos potentially include content from several of Nintendo’s channels.
“A highly-praised AI video generation tool made by multi-billion dollar company Runway was secretly trained by scraping thousands of videos from popular YouTube creators and brands, as well as pirated films, according to a massive internal spreadsheet of training data obtained by 404 Media,” the report states.
Marques Brownlee, one of YouTube’s most influential tech reviewers, took to X (formerly Twitter) to express his dismay:
“Well well well. Runway AI video generator was trained on YouTube videos without permission, including 1600+ MKBHD videos.”
https://twitter.com/MKBHD/status/1816487078265344313
The controversial model, initially codenamed Jupiter and later released as Gen-3, garnered widespread acclaim from the AI community and tech media at its launch in June.
Last year, Runway raised $141 million in a Series C extension round from investors like Google and Nvidia, achieving a valuation of $1.5 billion. At the time, Runway CEO Cris Valenzuela emphasized the company’s commitment to innovation in tools for artists and creators.
However, when TechCrunch inquired in June about the origins of the training data for Gen-3, Runway co-founder Anastasis Germanidis remained vague.
“We have an in-house research team that oversees all of our training, and we use curated, internal datasets to train our models,” Germanidis told TechCrunch.
Founded in 2018 by Valenzuela, Alejandro Matamala, and Germanidis, Runway emerged from NYU’s art school, where the trio bonded over their shared fascination with AI’s creative potential. Initially focused on AI-powered tools for filmmakers and photographers, Runway has since pivoted to generative AI, with a particular emphasis on video.
Runway’s flagship product, Gen-2, an AI model that generates videos from text prompts or images, has been central to its recent success. Yet, this latest controversy casts a shadow over the company’s practices and raises questions about the ethical implications of its data acquisition methods. According to funding data from CrunchBase, Runway has raised a total of $236.5 million since inception.