• 0 Posts
  • 17 Comments
Joined 1 year ago
Cake day: June 17th, 2023


  • Here’s the heart of the not-so-obvious problem:

    Websites treat the Google crawler like a first-class citizen. Paywalled sites give Google unpaid, junk-free access. Google search results then direct people to a website that treats humans differently (worse), so Google users are led to sites they cannot access. The heart of the problem is access inequality: Google effectively refers people to sites that are not publicly accessible.

    I do not want to see search results I cannot access. Google cache was the equalizer that neutralized that problem. Now that problem is back in our face.


  • From the article:

    “was meant for helping people access pages when way back, you often couldn’t depend on a page loading. These days, things have greatly improved. So, it was decided to retire it.” (emphasis added)

    Bullshit! The web gets increasingly enshittified and content is less accessible every day.

    For now, you can still build your own cache links even without the button, just by going to “https://webcache.googleusercontent.com/search?q=cache:” plus a website URL, or by typing “cache:” plus a URL into Google Search.

    You can also use 12ft.io.
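
    A minimal sketch of building both kinds of link by hand. The Google prefix is the one quoted from the article; the function names are made up for illustration, the `https://` scheme on the 12ft.io link is an assumption (the FAQ only says to prepend `12ft.io/`), and percent-encoding the query characters is my own precaution rather than anything the article specifies.

```python
from urllib.parse import quote

# Prefix quoted in the article for hand-built Google cache links.
GOOGLE_CACHE_PREFIX = "https://webcache.googleusercontent.com/search?q=cache:"


def google_cache_link(page_url: str) -> str:
    """Return the cache prefix followed by the page URL.

    Assumption: characters such as '?' and '&' are percent-encoded so they
    are not mistaken for parts of the outer query string.
    """
    return GOOGLE_CACHE_PREFIX + quote(page_url, safe=":/")


def twelve_ft_link(page_url: str) -> str:
    """Prepend 12ft.io/ to the page URL (the https:// scheme is assumed)."""
    return "https://12ft.io/" + page_url


if __name__ == "__main__":
    print(google_cache_link("https://example.com/page"))
    print(twelve_ft_link("https://example.com/page"))
```

    Neither function checks whether the service still answers; they only assemble the URL.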

    Cached links were great if the website was down or quickly changed, but they also gave some insight over the years about how the “Google Bot” web crawler views the web. … A lot of Google Bot details are shrouded in secrecy to hide from SEO spammers, but you could learn a lot by investigating what cached pages look like.

    Okay, so there’s a more plausible theory about the real reason for this move. Google may be trying to increase the secrecy of how its crawler functions.

    The pages aren’t necessarily rendered like how you would expect.

    More importantly, they don’t render the way authors expect. And that’s a fucking good thing! It’s how caching helps give us some escape from enshittification. From the 12ft.io FAQ:

    “Prepend 12ft.io/ to the URL webpage, and we’ll try our best to remove the popups, ads, and other visual distractions.”

    It also circumvents #paywalls. No doubt there must be legal pressure on Google from angry website owners who want to force their content to come with garbage.

    The death of cached sites will mean the Internet Archive has a larger burden of archiving and tracking changes on the world’s webpages.

    The possibly good news is that Google’s role shrinks a bit. Any Google shrinkage is a good outcome overall. But there is a concerning relationship between archive.org and Cloudflare. I depend heavily on archive.org largely because Cloudflare has broken ~25% of the web. The day #InternetArchive becomes Cloudflared itself, we’re fucked.

    We need several non-profits to archive the web in parallel redundancy with archive.org.





  • Not exactly. !showerthoughts@lemmy.world was a poor choice, as are:

    • !showerthoughts@zerobytes.monster ← Cloudflare
    • !showerthoughts@sh.itjust.works ← Cloudflare
    • !showerthoughts@lemmy.ca ← Cloudflare
    • !showerthoughts@lemm.ee ← Cloudflare
    • !hotshowerthoughts@x69.org ← Cloudflare, and possibly irrelevant
    • !showerthoughts@lemmy.ml ← not CF, but copious political baggage, abusive moderation & centralized by disproportionate size

    They’re all shit & the OP’s own account is limited to creating a new community on #lemmyWorld. !showerthoughts@lemmy.ml would be the lesser of evils, but the best move would be to create an account on a digital-rights-respecting instance that allows community creation and then create a showerthoughts community there.

    (EDIT) !showerThoughts@fedia.io should address these issues.


  • Normal users don’t have these issues.

    That’s not true. Cloudflare marginalizes both normal users and street-wise users. In particular:

    • users whose ISP uses CGNAT to distribute a limited range of IPv4 addresses (this generally impacts poor people in impoverished regions)
    • the Tor community
    • VPN users
    • users of public libraries, and generally networks where IP addresses are shared
    • privacy enthusiasts who will not disclose ~25% of their web traffic to one single corporation in a country without privacy safeguards
    • blind people who disable images in their browsers (which triggers false positives for robots, as scripts are generally not interested in images either)
    • the permacomputing community and people on limited internet connections, who also disable browser images to reduce bandwidth which makes them appear as bots
    • people who actually run bots – Cloudflare is outspokenly anti-robot and treats beneficial bots the same as malicious bots

    There are likely more oppressed groups beyond that because there is no transparency with Cloudflare.




  • And cf also allows you to block and report child porn

    That’s been tried. When someone reported CP to Cloudflare, CF demanded the identity of the whistleblower, then doxxed them to the offending CF user, who then published the whistleblower’s identity so their users could retaliate. When the CEO (Matthew Prince) was confronted about this, his reply was that the whistleblowers “should have used fake names”. Then this company you support had the nerve to claim to have a privacy pledge: “[A]ny personal information you provide to us is just that: personal and private.”

    Also cf is about the only way to make federation affordable and safe. (emphasis mine)

    Forcing children to reveal their residential IP addresses to the fedi, whereby any interested person (read: child predators) can derive their approximate location: do you really think that’s a good idea for safety?

    What are you even thinking? It most certainly is not safe to expose 20%+ of everyone’s traffic to a single corporation.






  • And IIRC, license plates only need to be censored if bad behavior is demonstrated. Notice that the car to the left which was correctly parked has an exposed license plate.

    What baffles me is that the plate number is only meaningful to law enforcement. The public does not get access to the records associated with a plate number. I see no reason to hide the info from law enforcement. The evidence may fall short of the standard needed to be usable, but so be it.