OpenAI transcribed over a million hours of YouTube videos to train GPT-4

April 7, 2024

Earlier this week, The Wall Street Journal reported that AI companies were running into a wall when it comes to gathering high-quality training data. Today, The New York Times detailed some of the ways companies have dealt with this. Unsurprisingly, they involve doing things that fall into the hazy gray area of AI copyright law.

The story opens on OpenAI which, desperate for training data, reportedly developed its Whisper audio transcription model to get over the hump, transcribing over a million hours of YouTube videos to train GPT-4, its most advanced large language model. That’s according to The New York Times, which reports that the company knew this was legally questionable but believed it to be fair use. OpenAI president Greg…
