Hacker News | pronoiac's comments

I think SciOp is doing something in that area, with a catalog site and webseeds. https://sciop.net/


The Archive Team - not part of the Internet Archive - worked on a distributed backup of a portion of the Internet Archive - https://wiki.archiveteam.org/index.php/INTERNETARCHIVE.BAK

It's been dormant / on hiatus for a few years now.


That can only cover other collections though, because the WARC files from the Wayback Machine web scrapes are not public.


I wonder if they'll go with "toploaders" - like Backblaze Storage Pods - later. They have better density and faster setup, as they don't have to screw in every drive.

They got used drives. I wonder if they did any testing? I've gotten used drives that were DOA, which showed up in tests - SMART tests, short and long, then writing pseudorandom data to verify capacity.
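For the last step, here's a rough sketch of what I mean by writing pseudorandom data and reading it back. It's only an illustration: the function names, the seed, and the 1 MiB block size are my own assumptions, and it targets an ordinary file path rather than a raw device. Every block is derived from a seed plus a counter, so a drive that lies about its capacity and wraps writes around should hand back the wrong block on verification.

```python
import hashlib
import os


def _stream(seed: bytes, n_blocks: int, block_size: int):
    """Yield deterministic pseudorandom blocks derived from seed + block index."""
    for i in range(n_blocks):
        # SHAKE-256 is an extendable-output hash: the same input always
        # produces the same block, and each index yields a different one.
        yield hashlib.shake_256(seed + i.to_bytes(8, "big")).digest(block_size)


def write_pattern(path: str, seed: bytes, n_blocks: int, block_size: int = 1 << 20) -> None:
    """Fill the target with n_blocks of reproducible pseudorandom data."""
    with open(path, "wb") as f:
        for block in _stream(seed, n_blocks, block_size):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device


def verify_pattern(path: str, seed: bytes, n_blocks: int, block_size: int = 1 << 20) -> int:
    """Return the index of the first mismatching block, or -1 if all match."""
    with open(path, "rb") as f:
        for i, expected in enumerate(_stream(seed, n_blocks, block_size)):
            if f.read(block_size) != expected:
                return i
    return -1
```

Caveat: on a real drive the read-back can be served from the OS page cache, so you'd want to write far more than fits in RAM, or remount / drop caches between the write and verify passes.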


yeah we're very interested in trying toploaders, we'll do a test rack next time we expand and switch to that if it goes well.

w.r.t. testing, the main thing we did was try to buy a bit from each supplier a month or two ahead of time, so by the time we were doing the full build that rack was a known variable. We did find one drive lot which was super sketchy and just didn't include it in the bulk orders later. Diversity in suppliers helps a lot with tail risk.


"don't have to screw in every drive" is relative, but at least tool-less drive carriers are a thing now.

A lot of older toploaders from vendors like Dell are not tool-free. If you bought vendor drives and one fails, you RMA it and move on. However, if you want to replace failed drives in the field, or want to go it alone from the start with refurbished drives... you'll be doing a lot of screwing. The carriers are quite fragile and the plastic snaps easily. It's pretty tedious work.


Used Supermicro machines of this generation are very cheap (all things considered)

https://www.theserverstore.com/supermicro-superstorage-ssg-6...


There's a flamewar detector, which triggers when there are far more comments than upvotes.


Kern Type, perhaps? https://type.method.ac/


This is absolutely brilliant! And an example of why every time I come up with an idea, I should check to see whether someone else has made it before. But brilliant.

I got 100/100 on the first six, except for "Yves" where I got 70/100. I think they're wrong on that one. From any distance, the v should really nestle beneath the Y.

Gonna send this to all my design nerd friends, thank you.


I attempted OCR, and while it's not great, it's a start. I considered adding a reference to "software wants to be free!" or the Open Letter, but I'm winding down for the night. https://github.com/pronoiac/altair-basic-source-code


I attempted OCR with OCRmyPDF / Tesseract. It's not great, but it's under 1% the size, at least. https://github.com/pronoiac/altair-basic-source-code


Maybe you should try something like EasyOCR instead: https://github.com/JaidedAI/EasyOCR


Feel free to run EasyOCR against it and submit a PR


Checking diskprices.com - https://diskprices.com/?locale=us&condition=new,used&disk_ty... - there's a cheaper outlier for DVD-R, then it's 25GB BD-Rs for a bit.

LTO tape can be cheaper, but the cost of the drives has long been an obstacle to dabbling.


Yeah, the prices don't seem to be correct: a new 16TB HDD for $200, a DVD+R 25x pack for $2, etc. Clicking the links shows different prices on Amazon and elsewhere.


I've used ocrit, which uses those APIs. https://github.com/insidegui/ocrit

There are also:

* swiftocr - https://github.com/fny/swiftocr

* macos-vision-ocr - https://github.com/bytefer/macos-vision-ocr


They asked for something like Bluesky starter packs on Mastodon, not Bluesky starter packs on Bluesky.

