Can has source? I’ve been hacking on a per-instance community scraper (it pulls the list of communities from the community directory page) for !lemmy411@lemmy.ca, in an attempt to publish a community list regularly, but I’m a bit bogged down.
I have links to the source for my crawler (written in Node.js with Redis task queuing) on https://lemmyverse.net (GitHub link top right) 🤗
It’s a bit messy at the moment but has all the community crawling stuff.
I’m probably going to put the source up shortly, but the way I find communities is by scraping my instance’s “federated with” list from https://toast.ooo/api/v3/site
Then going through /.well-known/nodeinfo -> /nodeinfo/v2.0.json (or whatever URL the well-known document gives) to check the software and make sure it’s a Lemmy instance
Then using the Lemmy REST API to paginate through the open communities for that instance
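The three steps above can be sketched roughly as follows. This is a minimal Python illustration, not the actual crawler (which is written in Node.js). The function names are hypothetical, and it assumes that the /api/v3/site response carries a `federated_instances.linked` list (newer Lemmy versions may expose this elsewhere), that the first link in the well-known document is the one to follow, and that "open communities" means `type_=Local` on `/api/v3/community/list`.

```python
import json
from typing import Any, Callable, Iterator, List
from urllib.request import urlopen

# Any callable that maps a URL to decoded JSON; injectable so the
# logic can be exercised without touching the network.
FetchJson = Callable[[str], Any]


def http_get_json(url: str) -> Any:
    """Default fetcher: plain HTTP GET, body decoded as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def linked_hosts(site: dict) -> List[str]:
    """Step 1: pull hostnames out of the 'federated with' list.

    Assumption: entries are either bare hostnames or objects with a
    'domain' key, depending on the Lemmy version.
    """
    linked = (site.get("federated_instances") or {}).get("linked") or []
    return [e["domain"] if isinstance(e, dict) else e for e in linked]


def is_lemmy(host: str, fetch_json: FetchJson = http_get_json) -> bool:
    """Step 2: follow /.well-known/nodeinfo and check the software name."""
    try:
        well_known = fetch_json(f"https://{host}/.well-known/nodeinfo")
        # Use whatever URL the well-known document gives (first link here).
        nodeinfo = fetch_json(well_known["links"][0]["href"])
        return nodeinfo["software"]["name"] == "lemmy"
    except (OSError, KeyError, IndexError, TypeError, ValueError):
        # Unreachable host, missing fields, or a non-JSON response.
        return False


def local_communities(
    host: str, fetch_json: FetchJson = http_get_json, limit: int = 50
) -> Iterator[dict]:
    """Step 3: page through the community list until an empty page."""
    page = 1
    while True:
        url = (
            f"https://{host}/api/v3/community/list"
            f"?type_=Local&limit={limit}&page={page}"
        )
        batch = fetch_json(url).get("communities", [])
        if not batch:
            return
        yield from batch
        page += 1
```

Chaining them would look something like `for host in linked_hosts(http_get_json("https://toast.ooo/api/v3/site")): ...`, skipping hosts where `is_lemmy(host)` is false. A real crawler would also want rate limiting and retries, which is presumably where the task queue comes in.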
Cool!