PigiVinci83 · on Dec 10, 2024

I’d like to try it! I’ll write to you in the next few days.

PigiVinci83 · on Oct 26, 2024

It’s a custom GPT for learning scraping. The newsletter is called The Web Scraping Club, and the archive of articles can be found here: https://substack.thewebscraping.club/archive

PigiVinci83 · on Aug 7, 2024

Nice article, enjoyed reading it. I’m Pier, co-founder of https://Databoutique.com, a marketplace for web-scraped data. If you’d like to monetize your data extractions, you can list them on our website. We just started with the grocery industry and it would be great to have you on board.

bob_theslob646 · on Aug 11, 2024

This looks like a really cool website, but my only critique is this: how are you verifying that the data is actually real and not just generated randomly?

redblacktree · on Aug 7, 2024

Do you have data on which data is in higher demand? Do you keep a list of frequently-requested datasets?

PigiVinci83 · on April 20, 2024

Strength and courage ("forza e coraggio"), from another Italian in tech and fully remote worker.

yousuke86 · on April 21, 2024

Thank you! Love what you've built; it must've been a lot of work!

PigiVinci83 · on March 24, 2024

Just open the dataset description to see it; you can even download a sample of the file. https://www.databoutique.com/buy-data-page-detail/balenciaga...

But thanks for the feedback, we should probably make the website clearer.

PigiVinci83 · on March 24, 2024

Well, alternative data in general is anonymized and contains absolutely no personal info (not least because PII is useless to hedge funds: they need to see trends, not sell something to people).

danw1979 · on March 24, 2024

I meant data belonging to other companies really, not individuals.

PigiVinci83 · on March 24, 2024

Unless it’s proprietary data (or data acquired from third parties and then processed), the other main source is web scraping, and this is regulated. You need to have the rights to scrape this data, which means that it has to be public data.

PigiVinci83 · on Dec 12, 2023

At Databoutique.com we’re trying to solve the web data accessibility problem with a marketplace that connects web data sellers and buyers. Buyers can get the data in three clicks, delivered to an S3 bucket or downloaded from the website. It’s pre-scraped, quality-checked and legally compliant. If a website is not listed, you can ask sellers to provide it. Sellers deliver data using standard data structures and set their own prices. It’s far from perfect since we launched three months ago, but we’re working on it.

PigiVinci83 · on Oct 12, 2023

  The biggest risk I perceived was something I warned the team about: "The first rule of Scrape Club is: Don't Talk About Scrape Club."

It seems someone broke this rule: https://www.google.com/search?q=the+web+scraping+club

PigiVinci83 · on March 22, 2023

This project is interesting. We at https://www.databoutique.com are building something similar: a curated dataset marketplace for web scraping data. We believe that with standardization, quality controls and high-density verticals, we can cut prices and time to value for web scraping data.

PigiVinci83 · on Nov 16, 2022

Since I’m quite experienced in the field, I opened a Substack called The Web Scraping Club as a side gig. It is mostly free but has some paid articles, and it already has hundreds of dollars in MRR after 2 months.

It is a niche where tutorials and info are pretty sparse around the web, so having a centralized blog is useful for practitioners.

Hope this helps you find your way.