i tried to get access to facebook's api to mess around (as a student) but they declined my request. i ended up making a bot that ran in a headless browser, wasting far more of facebook's resources, and i used it to create shitposts that updated the post with the number of reactions lmao.
fun fact: on the r-site, you can still append `.json` to the end of any path (before the query params) to get the formatted data

fun fact 2: on the same site you get similar json if you grab the script that says `id="data"` (trivial with jsdom if you run nodejs), eval it in a sandbox (node's built-in `vm` module), and look for the `$.___r` param on the global object you passed in

fun fact 3: also on the same site, the old interface is full of data attributes intended for css, jsdom goes brrr
fun fact 4: even if they stopped all of this you could use a headless browser and grab the data in flight from the api calls (virgin dom scrubber vs chad api capturer)
i don’t know much about the t-site and can’t check right now because you can’t even access it the normal way, lol
Scraping my beloved… using more resources from a company's server makes me drool
This cracked me up. Especially the 10 minute delay and rate limiting making it better to just scrape.
Can someone eli5 this for me? What's scraping and how does it work? For example, in the context of twitter with their current limitations, will scraping still work?
Scraping is fetching a webpage as if you were a normal user visiting it in firefox/chrome and extracting the bits you want from it. If Twitter makes you sign in to view tweets (which I guess it will now?) then scraping won't help much; otherwise it probably will, though it may take a fair bit of trickery to get working.
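Mechanically it's just two steps: download the HTML, then pull out the pieces you care about. A toy sketch — the page, class names, and "tweet" structure below are all invented, and a regex only works because the toy markup is fixed (real pages want a proper parser like jsdom or cheerio):

```javascript
// Step 1 in a real script would be:
//   const html = await fetch(url).then((r) => r.text());
// A canned string stands in for the download so this runs offline.
const html = `<html><body>
<article><p class="text">hello world</p><span class="likes">3</span></article>
<article><p class="text">second post</p><span class="likes">7</span></article>
</body></html>`;

// Step 2: extract the bits you want. Naive regex extraction; fine for a
// fixed toy page, fragile on real markup.
function scrapePosts(page) {
  const posts = [];
  const re = /<p class="text">(.*?)<\/p><span class="likes">(\d+)<\/span>/g;
  for (const m of page.matchAll(re)) {
    posts.push({ text: m[1], likes: Number(m[2]) });
  }
  return posts;
}

console.log(scrapePosts(html));
// [ { text: 'hello world', likes: 3 }, { text: 'second post', likes: 7 } ]
```

The "trickery" on a real site is everything this sketch skips: cookies, login walls, rate limits, and markup that shifts under you.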