It’s excruciatingly obnoxious to have to rely on third-party sources for what should be a first-party feature.
Like, I select all and then search a query. “Oh no, nobody on your server used a third-party service to find it, so you won’t see it here.”
Like, how short-sighted is that, really? If I search for a string in the ‘all’ servers, I should have a list of ‘all’ the servers containing that string.
It’s a really simple concept. Not sure why this post even has to be made, but I’m wondering if there’s something I can do to make these ‘features’ more intuitive.
totally understand the frustration, and i’m not going to try and invalidate it!
… however, it’s definitely not a problem with a simple solution
since anyone can start an instance, when you search “all”, where should it search? i don’t mean generally like “all the instances”, i mean where specifically? things like lemmy.world, lemmy.ml, kbin.social, etc are obvious… but what about lemmy.mydomainforfriends.social (not real but let’s pretend someone created their own little instance for friends there!)?
let’s say you say yes that should be searched, okay… how does your instance know it’s there? does it tell all other instances that it exists at some point? where does IT get that list from? (the current solution to this is that your instance starts to “know about” an instance after someone interacts with it, but this has the problem you’ve described)
let’s say that instance shouldn’t be searched… now, what are the rules (automatic, i’d assume; not with human intervention) that would allow an instance to be added to some big list somewhere? also where is that list? now we’re back at problem 1: how do you store a federated list of servers?
the problem gets even harder when you consider mastodon, pixelfed, peertube, etc… all these services interact: should all include them? only certain things in them?
While it has problems of its own, instances could pool and share that knowledge. The first time an instance talks to a different instance it could just ask “hey, what other instances are you aware of?”. The main issue there is just instances obsessively sending exponentially growing lists of instances back and forth.
But no, that is the main bane of federated social media: discoverability without a center of truth.
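For what it's worth, the "lists going back and forth forever" part can be tamed by only sending each peer the hosts it hasn't been sent yet. A rough Python sketch of that idea follows. The `InstanceDirectory` class and its method names are made up for illustration and are not anything Lemmy actually implements.

```python
class InstanceDirectory:
    """Hypothetical per-instance record of known hosts (not a real Lemmy API)."""

    def __init__(self, own_host: str) -> None:
        self.known: set[str] = {own_host}
        # For each peer, the hosts we have already told it about.
        self.sent_to: dict[str, set[str]] = {}

    def hosts_to_share(self, peer: str) -> list[str]:
        """Return only the hosts this peer has not been sent yet (a delta)."""
        already = self.sent_to.setdefault(peer, set())
        delta = self.known - already - {peer}
        already |= delta  # remember what was sent so it is never resent
        return sorted(delta)

    def receive(self, hosts: list[str]) -> set[str]:
        """Merge a peer's list into ours and return the hosts that were new."""
        cleaned = {h.strip().lower() for h in hosts}
        cleaned.discard("")
        new = cleaned - self.known
        self.known |= new
        return new
```

After the first exchange, each message only carries whatever was learned since the last one, so traffic grows with the number of new instances rather than with the size of the whole list. It does nothing about who you should trust to tell you about instances in the first place.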
yup! 100% agree! federation is kind of a new thing and we have some issues to work out that’s for sure!
heck, i could even see some kind of federated search service: activitypub instances could submit their content for indexing, and individual instances could choose an existing federated fediverse search service or run their own… importantly, there would need to be choice for each individual instance with no centralised repository
So many options, doing none seems lazy. I can source all kinds of lists for my pihole to block traffic. I can put a lot of repos in my yum.conf. It’s not like this should be reliant on any one single source of truth. There could certainly be an open source list maintained. I’m surprised this is considered such a difficult problem with so many smart folks involved; I’m obviously really ignorant of how this stuff works. I just don’t get how a problem that seems to have been solved across a litany of technical products using shared sources in defederated environments is such an exotic hurdle here.
okay so now you have a decentralised list with 1000 servers on it. does your instance… make 1000 requests when you search?
Lists can be cached and updated. Even if the ‘all’ feed doesn’t include all active content, it would be very manageable to have queries cover communities across instances based on names and other fields. All this shit is already a solved problem.
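To make the caching idea concrete, here is a minimal Python sketch, assuming each instance could hand over a plain list of its community names. `CommunityCache`, the one-hour TTL, and the injectable `clock` are all hypothetical choices for illustration. The point is that a search hits the local cache instead of making one request per known server.

```python
import time


class CommunityCache:
    """Hypothetical local cache of community names per remote host."""

    def __init__(self, ttl_seconds: float = 3600, clock=time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without sleeping
        self.entries: dict[str, tuple[float, list[str]]] = {}

    def store(self, host: str, communities: list[str]) -> None:
        """Record a freshly fetched community list for one host."""
        self.entries[host] = (self.clock(), list(communities))

    def stale_hosts(self) -> list[str]:
        """Hosts whose list is older than the TTL and should be refetched."""
        now = self.clock()
        return sorted(h for h, (t, _) in self.entries.items() if now - t >= self.ttl)

    def search(self, query: str) -> list[str]:
        """Case-insensitive substring match over cached names, no network calls."""
        q = query.lower()
        return sorted(
            f"{name}@{host}"
            for host, (_, names) in self.entries.items()
            for name in names
            if q in name.lower()
        )
```

A background job would refetch whatever `stale_hosts()` returns, so the 1000 requests happen slowly over time rather than on every search.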
Easy! It should search all the servers your server is federated with! Servers should contain a list of their community names that can be easily and quickly queried by other servers.
Federation isn’t opt-in though. It would be VERY easy to spin up a bunch of instances with millions or billions of fake communities and use them to DDOS a server’s search function.
Searching current active subscriptions helps mitigate that vector a little.
I would suggest that instances should have settings that allow them to decide whether to “advertise” a community list, with configurable options like “all”, “most active”, “top X”, or even a manually maintained list, depending on the admins’ and instance’s preferences.
Then your home instance, when searching, should have its own settings to decide what results it’s going to ping other servers for. Big/popular/high confidence instances can have an open all/all relationship, while you might query only the top 10 communities from unknown or new instances to handle the scenario you describe.
Federation can be binary yes/no but there should be room to add more logic around enabling search on communities from your instance and controlling the search results from other instances. I don’t think the two are mutually exclusive, unless I fundamentally misunderstand how federation works!
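A rough sketch of what those two sets of settings might look like, in Python. Everything here (the `Community` fields, the mode names, the default limit of 10) is an assumption made up for illustration, not an existing Lemmy setting.

```python
from dataclasses import dataclass


@dataclass
class Community:
    name: str
    monthly_active_users: int


def advertised(communities, mode="all", top_n=10, manual=()):
    """Remote side: decide which community names to advertise."""
    if mode == "all":
        return [c.name for c in communities]
    if mode == "top":
        ranked = sorted(communities, key=lambda c: c.monthly_active_users, reverse=True)
        return [c.name for c in ranked[:top_n]]
    if mode == "manual":
        allowed = set(manual)
        return [c.name for c in communities if c.name in allowed]
    raise ValueError(f"unknown advertise mode: {mode}")


def accepted(host, names, trusted_hosts, default_limit=10):
    """Home side: take everything from trusted hosts, a capped slice from others."""
    names = list(names)
    if host in trusted_hosts:
        return names
    return names[:default_limit]
```

The two functions are independent on purpose: the remote instance controls what it offers, and the home instance controls how much of that offer it is willing to take.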
I… don’t think you know what ddossing means but okay.
Would it really be very easy? Especially considering once instances find you’re doing that, they just block you? Would it be worth people’s time?
Is there any way around this, perhaps querying a global repository of federated instances and sorting them by popularity?
In all honesty, you don’t have a point. If you did, third-party services already wouldn’t offer this. Seeing as they can, it’s clearly possible.
Sorry, you’re right that I wasn’t being precise with my terminology. It’s not a DDOS but it could be used to slow down targeted features, take up some HTTP connections, inflate the target’s DB, and waste CPU cycles, so it shares some characteristics of one.
In general, you want to be very, very careful about implementing features that allow untrusted parties to supply potentially unbounded resources to your server.
And yeah, it would be trivial to write a set of scripts that pretend to be a lemmy instance and supply an endless number of fake communities to the target server. The nice thing about this attack vector is that it’s also not bound by the normal rate limiting since it’s the target server making the requests. There are definitely a bunch of ways lemmy could mitigate such an attack, but the current approach of “list communities current users are subscribed to” seems like a decent first approach.
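One of those mitigations, sketched in Python: put a hard cap on how much the target server will pull from any single remote, no matter how much the remote claims to have. `take_bounded` and its limits are hypothetical names and numbers, and `pages` stands in for whatever paginated fetch a real implementation would do.

```python
from itertools import islice


def take_bounded(pages, max_pages=5, max_items=500):
    """Collect items from an untrusted paginated source, within hard limits.

    `pages` is any iterable of pages, each page an iterable of items.
    Stops before requesting another page once either limit is reached.
    """
    collected = []
    page_iter = iter(pages)
    for _ in range(max_pages):
        room = max_items - len(collected)
        if room <= 0:
            break  # item cap reached: do not even ask for the next page
        page = next(page_iter, None)
        if page is None:
            break  # the remote ran out of pages on its own
        collected.extend(islice(page, room))
    return collected
```

With this in place, a fake instance serving an endless stream of communities costs the target at most `max_pages` requests and `max_items` rows, so the damage is bounded by the target's own settings rather than by the attacker's patience.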