
That's one reason I'd prefer that academics just put data into some kind of local university archive, where possible. Many universities provide resources to host scientific data (and have done so for decades, since the days of ftp.dept.university.edu servers), and putting it there makes it more likely that it'll still be there in 10 years. Torrents by comparison tend to be: 1) slow, as you rely on random seeders rather than a university that's peered onto Internet2 or the LambdaRail; and 2) unreliably seeded, as people drop off. Plus the workflow of "curl -O URL" is nicer than torrenting.

Universities typically have great bandwidth and good peering, and already host much larger data repositories than this seems to be targeting (e.g. here's a 30-terabyte repository, http://gis.iu.edu/), so they should be able to provide space for your local scientific data. Complain if not!



Another alternative is something like the Dryad Digital Repository:

http://datadryad.org/

It's meant to include companion datasets for published papers, and gives out DOIs so datasets can be cited in other works. And it's mirrored at various universities to prevent loss.


Perhaps universities could robustly seed their staff's and students' torrents?


Kind of solves a problem that doesn't exist, though, doesn't it? It isn't like these universities are crying about bandwidth costs, and it isn't like demand is maxing out their upstream.


It's not just about their bandwidth; some countries / zones / networks have much better local connectivity than external, particularly international.

For example, until a few years ago, some of our ISPs had different caps for national vs international traffic, and there were popular forks of P2P clients that allowed you to filter based on that.

We have since moved to unlimited everything, but I wouldn't be surprised if some countries still had different caps or speeds for international traffic.


I agree about university data.

But there is a need for a way to distribute large datasets that come out of non-academic projects.

For example, the DBPedia data dumps are very slow to download at the moment.


You can have both and use a web seed with most clients.
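
For the curious: a web seed lets a torrent fall back to a plain HTTP(S) server for pieces, so the university archive serves as a permanent "seeder" while peers share bandwidth on top. As a rough sketch of creating such a torrent with libtorrent's Python bindings (the paths, tracker, and archive URL here are placeholders, not anything from the thread):

    import libtorrent as lt

    # Collect the dataset files into a file_storage.
    fs = lt.file_storage()
    lt.add_files(fs, "dataset/")

    t = lt.create_torrent(fs)
    t.add_tracker("udp://tracker.example.org:6969")  # hypothetical tracker
    # The web seed: an HTTP mirror of the same files, e.g. on a
    # university server. Clients fetch from here if no peers are up.
    t.add_url_seed("http://data.example.edu/dataset/")

    # Hash the pieces from the parent directory of "dataset/".
    lt.set_piece_hashes(t, ".")

    with open("dataset.torrent", "wb") as f:
        f.write(lt.bencode(t.generate()))

Anyone downloading the .torrent then gets at least the HTTP server's speed, plus whatever peers happen to be seeding, which addresses both the "slow" and "unreliably seeded" objections upthread.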



