What is the long-term storage plan for Lemmy instances?

@[email protected] · edit-2 3 years ago

ubergeek77 · edit-2 3 years ago

Pictrs 0.4 recently added support for object storage. This is fantastic, because object storage is dirt cheap compared to traditional block storage (like a VM filesystem). This helps a lot for image storage, which is a large part of the problem, but it’s not the whole problem.

I know Lemmy uses Postgres for everything else, but they should really invest time in moving towards something more sustainable for long-term, permanent hosting. Paid Postgres services are obscenely marked up and prohibitively expensive, so that’s not an option.

I’m armchair architecting here so I’m not sure what that would look like for Lemmy (Cloudflare KV? Redis?)

Still, even my own private instance has been growing at a rate of about 700MB per day, and I don’t even subscribe to that many things. I can’t imagine what the major instances are dealing with. This isn’t sustainable unless we want to start purging old data, which will kill Lemmy long term.
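For a rough sense of scale, here is a back-of-the-envelope projection of that rate. It is only a sketch: it assumes growth stays linear at 700MB per day (it may well not) and uses decimal units. The function name is made up for illustration.

```python
# Back-of-the-envelope projection. Assumption: growth stays linear
# at the observed rate, which real instances may not follow.
DAILY_GROWTH_MB = 700  # figure quoted in the comment above


def projected_gb(days: int, daily_mb: float = DAILY_GROWTH_MB) -> float:
    """Return projected storage growth in GB (decimal, 1 GB = 1000 MB)."""
    return days * daily_mb / 1000


for label, days in [("30 days", 30), ("1 year", 365), ("3 years", 3 * 365)]:
    print(f"{label}: {projected_gb(days):.1f} GB")
```

At that rate a small private instance would add roughly 255 GB per year.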

@[email protected] · 3 years ago

Is the 700MB just the Postgres data, or everything including the images?

I’m under the impression that text should be very cheap to store in Postgres.

@[email protected] · edit-2 3 years ago

I’m not really sure that a K/V service is a more scalable option than Postgres for storing text posts and the like. If you’re not performing complex queries or requiring microsecond latencies, then Postgres doesn’t require that much compute or memory.

People can get unnecessarily scared of relational databases if they’ve had bad experiences with databases that were used poorly, but attempting to force relational data into a K/V store can lead to the application layer essentially just doing a less efficient job of the same types of queries that the database would normally handle. Maybe there’ll be some future need to offload post and comment bodies into object storage or something, but that seems incredibly premature.
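To make that concrete, here is a minimal sketch of the same "top comments for a post" lookup done both ways. Everything in it is an assumption for illustration: the table and column names are invented (this is not Lemmy's actual schema), `sqlite3` stands in for Postgres, and a plain dict stands in for a K/V service.

```python
import sqlite3

# Hypothetical sample data: (comment_id, post_id, score, body)
comments = [
    (1, 10, 5, "first"),
    (2, 10, 12, "second"),
    (3, 11, 7, "other post"),
    (4, 10, 1, "third"),
]

# Relational store. sqlite3 is only a stand-in for Postgres here.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE comment ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, score INTEGER, body TEXT)"
)
db.execute("CREATE INDEX idx_comment_post ON comment (post_id, score)")
db.executemany("INSERT INTO comment VALUES (?, ?, ?, ?)", comments)


def top_comments_sql(post_id: int, limit: int) -> list[str]:
    """The database filters, sorts, and limits for us."""
    rows = db.execute(
        "SELECT body FROM comment WHERE post_id = ? "
        "ORDER BY score DESC LIMIT ?",
        (post_id, limit),
    )
    return [body for (body,) in rows]


# K/V store: a plain dict keyed by comment id.
kv = {
    f"comment:{cid}": {"post_id": pid, "score": score, "body": body}
    for cid, pid, score, body in comments
}


def top_comments_kv(post_id: int, limit: int) -> list[str]:
    """Without a secondary index, the application scans every value,
    then filters and sorts by hand."""
    matching = [v for v in kv.values() if v["post_id"] == post_id]
    matching.sort(key=lambda v: v["score"], reverse=True)
    return [v["body"] for v in matching[:limit]]


print(top_comments_sql(10, 2))
print(top_comments_kv(10, 2))
```

Both functions return the same result, but the K/V version reimplements filtering and sorting in application code, and would need hand-maintained secondary indexes to avoid a full scan.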

Object storage for pictrs is definitely a fantastic addition, though.

Qazwsxedcrfv000 · 3 years ago

It would be even better if it could also leverage IPFS. That way we could have unique identifiers per media object, and hence deduplication across a P2P network, which in my opinion fits the fediverse better. I have been thinking of making such an alternative media backend for a while.
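The deduplication idea can be sketched with plain content addressing. This is only an illustration under loud assumptions: it uses a bare SHA-256 hex digest as the identifier, whereas real IPFS CIDs use multihash encoding and chunk files, and the `ContentAddressedStore` class is hypothetical, not part of pict-rs or any IPFS library.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store where the key is derived from the content.

    Assumption: a plain SHA-256 hex digest is used as the identifier.
    Real IPFS CIDs are encoded differently.
    """

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content-derived identifier."""
        content_id = hashlib.sha256(data).hexdigest()
        # Identical bytes map to the same key, so duplicates are stored once.
        self._blobs.setdefault(content_id, data)
        return content_id

    def get(self, content_id: str) -> bytes:
        return self._blobs[content_id]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
meme = b"same image bytes uploaded by two different instances"
id_a = store.put(meme)
id_b = store.put(meme)  # second upload of identical content
id_c = store.put(b"a different image")

print(id_a == id_b)  # identical content yields the identical identifier
print(len(store))    # number of distinct blobs actually stored
```

Because the identifier depends only on the bytes, two instances that receive the same image independently would arrive at the same key without coordinating.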