Supervised


Web crawlers and precision data sets

OpenAI released details about a crawling bot it is using to collect data from the web for its training set. But the best model might not just be built on the biggest training set.

Matthew Lynley
Aug 08, 2023
A group of friendly robots vacuuming paper off the floor (image: Midjourney)

Author’s note: Wednesday’s issue will be coming out on Thursday this week.

In addition, Tuesday’s issue next week will be coming out on Thursday. This is due to a trip I’ll be taking to New York for the next two weeks. If you’re based in New York, let’s hang out! You can find my email …

This post is for paid subscribers
