- Version snapshots are now a local-only feature; they are no longer shared over the network. They will be cleared whenever you reset your local data, clear your browser cache, use LK in a different browser, etc. Technical details below.
Details 4 Nerdz
We've been having database-related latency issues over the past few weeks. In particular, LK is old enough that some documents have grown to be pretty large. In order for LK to work its offline/online/multiplayer/seamless-merging magic, documents accumulate change metadata forever; i.e., whenever you type a character or delete something, that information is saved. This may sound bad, but the metadata only costs about 50% of the space of the actual content. Yjs, the technology powering our collab system, keeps the space usage down by doing garbage collection, or GC. The garbage collector looks at the history of edits and merges or deletes redundant information.
For example, let's say you have a document with this text:
"Hi, these are details for nerds."
So the metadata would have 32 inserts recorded, one for each character I typed. But Yjs is smart and can look at the history and compress it. After compaction, instead of 32 separate inserts, there would be a single insert with size 32. Additionally, let's say I backspace-deleted each character one by one. That's 32 delete operations added to the metadata: "delete ., delete s, delete d, etc." But again, Yjs is smart, so it can turn that into a single datapoint: "delete(32) by braden". So after all those edits, the metadata is just those 2 transactions, instead of 64 separate inserts and deletes. This metadata is what lets LK gracefully interweave other users' edits when necessary.
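To make the compaction idea concrete, here's a toy sketch in Python. This is NOT Yjs's actual algorithm (Yjs's real structures are far more sophisticated); the op tuples and the `compact` function are made up for illustration only.

```python
# Toy model of CRDT metadata compaction -- not Yjs internals, just an
# illustration of merging runs of adjacent same-type, same-author ops.

def compact(ops):
    """Merge consecutive ops of the same kind by the same author."""
    compacted = []
    for kind, author, size in ops:
        if compacted and compacted[-1][0] == kind and compacted[-1][1] == author:
            # Same kind and author as the previous op: grow its size instead
            # of keeping a separate record.
            prev_kind, prev_author, prev_size = compacted.pop()
            compacted.append((prev_kind, prev_author, prev_size + size))
        else:
            compacted.append((kind, author, size))
    return compacted

# 32 single-character inserts followed by 32 single-character deletes.
history = [("insert", "braden", 1)] * 32 + [("delete", "braden", 1)] * 32
print(compact(history))
# 64 tiny ops collapse into just two: insert(32) and delete(32).
```

The same principle is what keeps real-world Yjs documents small when GC is enabled: long runs of keystrokes collapse into a handful of merged records.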
If we use the version snapshots feature, we have to disable the garbage collector. It makes sense: let's say I take a version snapshot, delete all the content, and then want to restore it. How can I get the data back? If the garbage collector compressed all the deletes into one operation ("deleted something of size 32"), there's no way to reconstruct that data; we only know how big it was. So, for wiki articles, LK disables the garbage collector in exchange for being able to use version snapshots.
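A minimal sketch of why a GC'd delete makes restores impossible. Again, this is an illustration, not Yjs's actual data model; the dict shapes and the `restore` helper are invented for this example.

```python
# Toy illustration: with GC on, a delete record keeps only its size;
# the deleted characters themselves are discarded forever.

text = "Hi, these are details for nerds."

# GC enabled: the content is thrown away, only the size survives.
delete_with_gc = {"op": "delete", "size": len(text)}
# GC disabled: the content is retained, so a snapshot can bring it back.
delete_no_gc = {"op": "delete", "size": len(text), "content": text}

def restore(op):
    """A snapshot restore needs the original characters back."""
    return op.get("content")  # None if GC already discarded them

assert restore(delete_no_gc) == text
assert restore(delete_with_gc) is None  # unrecoverable: we only know the size
```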
Herein lies the problem: LK has been around a few years now, and some documents are getting CHONKY, even if they don't look chonky. For example, you might have a scratch-pad article where you draft content and then paste it out into other articles. I guarantee you this scratch-pad article, which may even be empty, is the largest document in your project, because all that history is still there and the garbage collector is turned off.
When these uber-documents enter the processing pipeline, LK's compute nodes and database allocate a huge amount of resources to load them into memory and process them, starving all the other documents and overloading the DB. (Processing involves saving the documents to long-term database storage and indexing them for public worlds usage.)
So how do we fix this? We could just remove the version snapshot feature, but it's a nice feature. It makes it easy to recover deleted stuff, see what's changed, etc. So the compromise I made was this:
The server now garbage collects all documents. The client doesn't, so it can keep snapshots, but it no longer shares those snapshots with the server, since the server won't have the metadata needed to understand them. The client can generally handle large docs, since the data is already stored locally. That said, you can recover some performance by resetting local data and pulling down the garbage-collected version of your doc from the server.
Additionally, the server now has a hard limit on the size of documents it will accept. I won't disclose the limit, but it's very high and affects statistically few documents: fewer than 0.0001%.
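The size gate itself is conceptually simple. Here's a hedged sketch of what such a check might look like; the actual limit is undisclosed, so `LIMIT_BYTES`, the function name, and the error handling here are all placeholders, not LK's real code.

```python
# Hypothetical server-side document size gate. LIMIT_BYTES is a
# placeholder -- the real limit is intentionally not public.

LIMIT_BYTES = 10_000_000  # hypothetical value for illustration

def accept_update(doc_size_bytes: int) -> bool:
    """Reject documents whose encoded state exceeds the hard limit."""
    return doc_size_bytes <= LIMIT_BYTES

assert accept_update(50_000)               # typical doc: accepted
assert not accept_update(LIMIT_BYTES + 1)  # over the cap: rejected
```

Enforcing the cap at write time keeps a single runaway document from ever reaching the processing pipeline, rather than discovering it after it has already starved other work.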
Anyways, I'm excited to get back to the editor. This issue has been a thorn in my side for a week now!