OpenAI to train LLMs on Financial Times content

The Financial Times (full disclosure: the owner of The Next Web) has inked a deal with OpenAI. The American firm will use the British publisher’s content to train its generative AI models.

The deal is the latest in a string of new partnerships between OpenAI and global news publishers like Axel Springer, Associated Press, and Le Monde. The company did not disclose the financial terms of any of the contracts.

In 2023 alone, hundreds of pages of litigation and countless articles accused tech firms of stealing artists’ and publishers’ work to train their AI models.

OpenAI has come under fire for training its GPT models on content scraped from the web without consent. Last year, The New York Times even sued OpenAI and Microsoft for copyright infringement.

OpenAI’s recent tie-ups with publishers will allow it to continue to train its algorithms on web content. But, this time, it will have permission.

Strategic partnership

The FT called the deal with OpenAI a “strategic partnership.”

The 100 million-plus users of ChatGPT will have direct access to summaries, quotes, and links to the publisher’s articles. This content is usually hidden behind a paywall. OpenAI will attribute all information from the FT to the publication.

In exchange, OpenAI will help the news organisation develop new AI tools. The FT already uses OpenAI products, including ChatGPT Enterprise, we can confirm.

FT Group CEO John Ridding said the publisher was still committed to “human journalism.”

“This is an important agreement in a number of respects,” said Ridding. “It recognises the value of our award-winning journalism and will give us early insights into how content is surfaced through AI.”

“Apart from the benefits to the FT, there are broader implications for the industry. It’s right, of course, that AI platforms pay publishers for the use of their material,” Ridding continued. “OpenAI understands the importance of transparency, attribution, and compensation – all essential for us. At the same time, it’s clearly in the interests of users that these products contain reliable sources.”

Fair use or unfair?

However, just because OpenAI is cozying up to publishers doesn’t mean it’s not still scraping information from the web without permission.

Earlier this month, The New York Times reported that OpenAI was using transcripts of YouTube videos to train its models. According to the publication, this contravenes copyright law, since YouTube creators who upload videos to the platform still retain the copyright to the content they create.

OpenAI, however, insists its use of online material constitutes “fair use.” The firm, and many other tech companies, claim their large language models (LLMs) transform information gathered online into something entirely new.

Yet, as we’ve previously reported in depth, studies have shown that LLMs consistently regurgitate large chunks of their original training text verbatim.

Agreements with publishers could mark a step forward in resolving AI copyright disputes. However, they are likely to remain the exception rather than the rule.

Story by Siôn Geschwindt

Siôn is a reporter at TNW. From startups to tech giants, he covers the length and breadth of the European tech ecosystem. With a background in environmental science, Siôn has a bias for solutions delivering environmental and social impact at scale.
