This time once archive.org is back online again… is it possible to get torrents of some of their popular data storage? For example I wouldn’t imagine their catalog of books with expired copyright to be very big. Would love a community way to keep the data alive if something even worse happens in the future (and their track record isn’t looking good now)
Yep, that seems like the ideal decentralized solution. If all the info can be distributed via torrent, anyone with spare disk space can help back up the data and anyone with spare bandwidth can help serve it.
Most of us can’t afford the sort of disk capacity they use, but it would be really cool if there were a project to give volunteers pieces of the archive so that information was spread out. Then volunteers could specify if they want to contribute a few gigabytes to multiple terabytes of drive space towards the project and the software could send out packets any time the content changes. Hmm this description sounds familiar but I can’t think of what else might be doing something similar – anyone know of anything like that that could be applied to the archive?
Yeah, the projects I’ve heard about that have done something like this broke it into multiples.
For example, 1000GB could be broken into forty 25GB torrents and within that, you can tell the client to only download some of the files.
At scale, a webpage can show the seed/leach numbers and averages foe each torrent over a time period to give an idea of what is well mirrored and what people can shore up. You could also change which torrent is shown as the top download when people go to the contributor page and say they want to help host it ensuring a better distribution.
I’m pretty sure all their content is available by torrent, so you could mirror the data and provide the torrent files for direct download. It’ll probably be here when it’s back up: https://archive.org/details/public-domain-archive
This again??
This time once archive.org is back online again… is it possible to get torrents of some of their popular data storage? For example I wouldn’t imagine their catalog of books with expired copyright to be very big. Would love a community way to keep the data alive if something even worse happens in the future (and their track record isn’t looking good now)
Like this idea
Yep, that seems like the ideal decentralized solution. If all the info can be distributed via torrent, anyone with spare disk space can help back up the data and anyone with spare bandwidth can help serve it.
Most of us can’t afford the sort of disk capacity they use, but it would be really cool if there were a project to give volunteers pieces of the archive so that information was spread out. Then volunteers could specify if they want to contribute a few gigabytes to multiple terabytes of drive space towards the project and the software could send out packets any time the content changes. Hmm this description sounds familiar but I can’t think of what else might be doing something similar – anyone know of anything like that that could be applied to the archive?
Yeah, the projects I’ve heard about that have done something like this broke it into multiples.
For example, 1000GB could be broken into forty 25GB torrents and within that, you can tell the client to only download some of the files.
At scale, a webpage can show the seed/leach numbers and averages foe each torrent over a time period to give an idea of what is well mirrored and what people can shore up. You could also change which torrent is shown as the top download when people go to the contributor page and say they want to help host it ensuring a better distribution.
I’m pretty sure all their content is available by torrent, so you could mirror the data and provide the torrent files for direct download. It’ll probably be here when it’s back up: https://archive.org/details/public-domain-archive
Anna’s Archive does this. I think its a really good way to make it difficult to take them down.
Hopefully this hack starts some conversations on how they can ensure longevity for their project. Seems they’re being attacked on multiple fronts now.