cross-posted from: https://lemmy.g97.top/post/761
cross-posted from: https://lemmy.g97.top/post/723
Hi! I spun up my own Lemmy instance on my server and discovered new things about how Lemmy and federation work, and I have a lot of questions. I don’t know whether these are problems with my setup or normal behavior, so:
- My main account is on lemmy.world, and I see that new posts from communities I follow show up on lemmy.world first and only later on my instance. Is that normal?
- The same happens with comments, and they are even slower to “sync”. Why?
- If a community has never been discovered through the search form using the full format !community@instance, it will never appear on my instance. This means it is not possible to search for a topic (e.g. Steam Deck) and find all the posts and communities about it. Is this normal, or is it a feature that we/you would like to see in the future, and is it compatible with the concept of the fediverse? On a big instance with a lot of users I might find that specific community or post, because someone else has already discovered it, but on a smaller instance like mine it will never appear if I don’t know the exact name.
- I created a community on my instance and subscribed to it from lemmy.world, but I don’t see any posts there and the two are not in sync. Why? https://lemmy.world/c/[email protected] vs https://lemmy.g97.top/c/announcements.
- From my instance I am unable to follow lemmy.ml communities (they stay pending; on lemmy.world the pending status usually clears faster).
- I am unable to search for communities on Kbin.social, and when I try I see a log message of type “couldnt_find_object: error decoding response body: missing field `properties` at line 1 column 206” from my docker instance:
2023-06-20T22:02:16.056226139Z 2023-06-20T22:02:16.055937Z ERROR HTTP request{http.method=GET http.scheme="https" http.host=lemmy.g97.top http.target=/api/v3/ws otel.kind="server" request_id=8211e6a4-2b30-4f8c-98b3-d93843a0e293 http.status_code=101 otel.status_code="OK"}: lemmy_server::api_routes_websocket: couldnt_find_object: error decoding response body: missing field `properties` at line 1 column 206
2023-06-20T22:02:16.056276976Z 0: lemmy_apub::fetcher::search::search_query_to_object_id
2023-06-20T22:02:16.056286500Z at crates/apub/src/fetcher/search.rs:17
2023-06-20T22:02:16.056293804Z 1: lemmy_apub::api::resolve_object::perform
2023-06-20T22:02:16.056300316Z with self=ResolveObject { q: "[email protected]", auth: Some(Sensitive) }
2023-06-20T22:02:16.056307712Z at crates/apub/src/api/resolve_object.rs:21
2023-06-20T22:02:16.056314152Z 2: lemmy_server::root_span_builder::HTTP request
2023-06-20T22:02:16.056320693Z with http.method=GET http.scheme="https" http.host=lemmy.g97.top http.target=/api/v3/ws otel.kind="server" request_id=8211e6a4-2b30-4f8c-98b3-d93843a0e293 http.status_code=101 otel.status_code="OK"
2023-06-20T22:02:16.056351870Z at src/root_span_builder.rs:16
- I have a lot of warnings in the lemmy log of type “Error encountered while processing the incoming HTTP request: lemmy_server::root_span_builder: Header is expired” such as:
2023-06-20T21:58:12.484449111Z 2023-06-20T21:58:12.484275Z WARN Error encountered while processing the incoming HTTP request: lemmy_server::root_span_builder: Header is expired
2023-06-20T21:58:12.484510012Z 0: lemmy_server::root_span_builder::HTTP request
2023-06-20T21:58:12.484517559Z with http.method=POST http.scheme="https" http.host=lemmy.g97.top http.target=/inbox otel.kind="server" request_id=caf194c5-cac3-4c37-a29c-577d65deb050 http.status_code=400 otel.status_code="OK"
2023-06-20T21:58:12.484525578Z at src/root_span_builder.rs:16
2023-06-20T21:58:12.484530286Z LemmyError { message: None, inner: Header is expired, context: "SpanTrace" }
I have more questions, but I think this is enough for now! Thank you!
Thank you for the detailed response! I’ll open issues for 6 and 7.
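For readers who want to trigger the discovery step from the list above without going through the search form, here is a minimal sketch that only builds the request URL. It assumes the instance exposes an HTTP endpoint at `/api/v3/resolve_object` taking the same `q` and `auth` fields that appear in the `ResolveObject` log entry; the host name, community name, and the helper `resolve_object_url` are hypothetical placeholders.

```python
from typing import Optional
from urllib.parse import urlencode


def resolve_object_url(instance: str, query: str, auth: Optional[str] = None) -> str:
    """Build a URL asking `instance` to look up a remote community.

    Assumption: the endpoint path and the `q` / `auth` parameter names
    mirror the ResolveObject request seen in the log; check them against
    the API of the Lemmy version you are running.
    """
    params = {"q": query}
    if auth is not None:
        params["auth"] = auth  # login token, only if the instance requires one
    return f"https://{instance}/api/v3/resolve_object?{urlencode(params)}"


# Hypothetical names, for illustration only.
url = resolve_object_url("lemmy.example", "!somecommunity@other.example")
print(url)
```

Fetching that URL (for example with `urllib.request.urlopen`) should make the local instance fetch the remote community once, after which it can show up in local search.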
The IP addresses (both IPv4 and IPv6) I’m seeing for lemmy.ml trace to OVH, not Cloudflare.
Cloudflare does not complicate the federation process. Cloudflare proxies requests and filters out bad actors, that’s all. The federation issues we are seeing now are due to the way the ActivityPub protocol is being used. Every write interaction on the host Lemmy instance must be announced to each federated instance. For example, when I checked yesterday there were 3.6K daily active users on lemmy.world and 2200+ linked servers. If there’s a popular post and 10% of the daily users comment on it, the lemmy.world server needs to send 360 x 2200 = 700K+ outbound messages to the federated servers. If a message arrives more than 10 seconds after the action is performed, the receiving instance tosses it out as expired, so that instance never shows the comment. This is the crux of the problem. Cloudflare has nothing to do with the issue.
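The arithmetic above can be sketched in a few lines. This is a rough illustration, not Lemmy’s actual code: the user and server counts are the snapshot figures quoted in the comment, the 10-second window is taken from the comment as stated, and the function names are made up for the example. It also assumes the worst case where every linked server receives every comment.

```python
from datetime import datetime, timedelta, timezone

# Snapshot figures quoted in the comment above, not live data.
DAILY_ACTIVE_USERS = 3600
LINKED_SERVERS = 2200
COMMENTING_SHARE = 0.10

# Expiry window as described in the comment; treat it as an assumption.
EXPIRY_WINDOW = timedelta(seconds=10)


def outbound_messages(users: int, share: float, servers: int) -> int:
    """Upper bound: one delivery per comment per linked server."""
    comments = round(users * share)
    return comments * servers


def is_expired(signed_at: datetime, received_at: datetime,
               window: timedelta = EXPIRY_WINDOW) -> bool:
    """True if the message arrived after the allowed window."""
    return received_at - signed_at > window


total = outbound_messages(DAILY_ACTIVE_USERS, COMMENTING_SHARE, LINKED_SERVERS)
print(total)  # 792000

sent = datetime(2023, 6, 20, 21, 58, 0, tzinfo=timezone.utc)
print(is_expired(sent, sent + timedelta(seconds=12)))  # True
```

With numbers like these, even a short delivery backlog on the sending side is enough to push messages past the window, which would match the “Header is expired” warnings in the log.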
There was a GitHub issue where an instance, while federating, was being sent to the CF “I am not a robot” prompt, which of course meant some federation requests like subscriptions simply weren’t going through, so CF actually can complicate federation. Additionally, if a community is pretty active on an instance, it’s going to be sending a lot of requests over and over, which could look like a “bad actor”. Not that it matters for this particular case, since lemmy.ml doesn’t use Cloudflare, as someone else pointed out. Anyway, I only mentioned CF as a possibility. I didn’t check myself; the main thing I was pointing out was volume. Comments do seem to be federating eventually, though some are indeed being thrown out, probably due to the issue you pointed out.
I think there’s a lot of misunderstanding because federation is not one-directional.
Please allow me to clarify a couple of things:
So with the combination of those, the reality is that if you’re self-hosting at home, using a residential IP that is likely to be flagged as abusive because neighbors on your ISP misbehave, you will see bot challenges when you’re browsing. However, because your self-hosted instance is unlikely to have a lot of active communities that people subscribe to, it will not be sending the volume of outbound ActivityPub messages that would trigger a CF bot challenge on other instances that use CF.
On the flip side, larger instances with active communities will be sending out a lot of federation messages. As they are generally in data centers with stricter abuse prevention policies than residential ISPs (lemmy.ml is on OVH; lemmy.world is on Hetzner; Beehaw is on Digital Ocean; etc.), their outbound messages are less likely to be flagged should they be posting to another instance behind Cloudflare. In the unlikely event that they do trigger a bot challenge, subscribing instances (e.g. my own lemmy.chiisana.net, which is behind Cloudflare) can whitelist them on the Cloudflare side to let them through without problems.
Overall, it is a good idea to have CF in front of any public-facing instance. No matter how powerful the individual instance servers are, we’re in an age where it is very easy to take down a single service. CF is unlikely to introduce complications, and the good it provides significantly outweighs those unlikely complications.
Source: I’ve been running servers since the late 90s, building hugely popular web applications (used by a significant portion of the internet at some point) since the early 00s, and have worked with CF since their super early days.