Skip to content

problem saving a file to disk #5823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
williamstein opened this issue Apr 1, 2022 · 6 comments
Closed

problem saving a file to disk #5823

williamstein opened this issue Apr 1, 2022 · 6 comments

Comments

@williamstein
Copy link
Contributor

There has again been a report of a file not saving to disk.

I don't know how to reproduce this. It might be fixed in #5674, since I did a deep dive into the realtime sync code, and fixed a bug that could I think have caused exactly this behavior. (It was a bug where in rare instances something could get stuck causing sync to not move forward.) So when that is merged (soon!), this might be fixed.

WORKAROUND: Though painful, restarting the project resets the backend sync state, which works around this.

@williamstein
Copy link
Contributor Author

The code I mentioned above in #5674 is now live, and there haven't been any new reports of this issue. I'll close this for now, but if any new reports come up (esp with a project restarted since yesterday!), then I'll reopen it.

@williamstein
Copy link
Contributor Author

I've seen evidence of this, even after the change. Observations:

  • I saw this myself when I worked heavily on a specific file for many hours, then turned off my laptop (closed screen and suspended) for just over an hour, then came back and resumed.
  • It's really not specifically a problem of saving to disk, but of nothing getting sync'd to the project at all.
  • If you open the file in another browser sync between the browsers (which goes through the project) does work.
  • Patches from editing are being saved to the central database.

So the one thing that isn't happening is the patches aren't being applied to the version of the file in memory in the project.

@williamstein
Copy link
Contributor Author

More observations:

  • command+shift+f to autoformat doesn't do anything. This is evidence that the version of the file seen by the project is not getting updated to the latest version (even though all patches are there in the project).
  • waiting 5 minutes for the file to autoclose: everything works fine again upon reopening the file.

I carefully studied logs, and this the problem, I think:

2022-04-06T23:46:45.750Z: 2022-04-06T23:46:45.750Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  save_and_close_if_possible: after save there are 0 connections and delay=494008
2022-04-06T23:46:45.750Z: 2022-04-06T23:46:45.750Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  save_and_close_if_possible: close this SyncTableChannel atomically
2022-04-06T23:46:45.750Z: 2022-04-06T23:46:45.750Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closing
2022-04-06T23:46:45.750Z: 2022-04-06T23:46:45.750Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closing
2022-04-06T23:46:45.750Z: 2022-04-06T23:46:45.750Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closed
2022-04-06T23:46:45.751Z: 2022-04-06T23:46:45.751Z cocalc:debug:websocket-api init_syncdoc --  syncstring table closed, so closing syncdoc cocalc/src/packages/frontend/chat/input.tsx
2022-04-06T23:46:47.346Z: 2022-04-06T23:46:47.346Z cocalc:debug:websocket-api primus-api request object {"cmd":"synctable_channel","query":{"syncstrings":[{"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","path":"cocalc/src/packages/frontend/chat/input.tsx","users":null,"last_snapshot":null,"snapshot_interval":null,"save":null,"last_active":null,"init":null,"read_only":null,"last_file_change":null,"doctype":null,"archived":null,"settings":null}]},"options":[]}
2022-04-06T23:46:47.346Z: 2022-04-06T23:46:47.346Z cocalc:debug:websocket-api synctable_channel {"syncstrings":[{"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","path":"cocalc/src/packages/frontend/chat/input.tsx","users":null,"last_snapshot":null,"snapshot_interval":null,"save":null,"last_active":null,"init":null,"read_only":null,"last_file_change":null,"doctype":null,"archived":null,"settings":null}]} sync:2aae74165d6e2dd2a7ffda8bafaf102202104030

There is a request to OPEN from a frontend about 1.5 seconds after closing the sync tables, but before (I suspect) the syncdoc itself is closed.

A way to simulate this:

  1. make closing of syncdoc happen very quickly (not 5 minutes), then
  2. where the code does "syncstring table closed, so closing syncdoc", make it wait 30 seconds, so it's easy to try to open the file again during this window.
  3. If it then doesn't work in the same way... bingo, we've totally solved the problem, and I just need to ensure that everything really does happen atomically as mentioned above in the log entry.

Here's what should happen, and did happen later (after waiting 5 minutes, trying again, and it working):

2022-04-07T00:34:35.255Z: 2022-04-07T00:34:35.255Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  save_and_close_if_possible: close this SyncTableChannel atomically
2022-04-07T00:34:35.255Z: 2022-04-07T00:34:35.255Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closing
2022-04-07T00:34:35.255Z: 2022-04-07T00:34:35.255Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closing
2022-04-07T00:34:35.255Z: 2022-04-07T00:34:35.255Z cocalc:debug:websocket-api SyncTableChannel('sync:2aae74165d6e2dd2a7ffda8bafaf102202104030', '{"syncstrings":[{"archived":null,"doctype":null,"init":null,"last_active":null,"last_file_change":null,"last_snapshot":null,"path":"cocalc/src/packages/frontend/chat/input.tsx","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","read_only":null,"save":null,"settings":null,"snapshot_interval":null,"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","users":null}]}'):  close: closed
2022-04-07T00:34:35.256Z: 2022-04-07T00:34:35.256Z cocalc:debug:websocket-api init_syncdoc --  syncstring table closed, so closing syncdoc cocalc/src/packages/frontend/chat/input.tsx

...
SEVERAL MINUTES ELAPSE WITH NO LOGGING about input.tsx
...

2022-04-07T00:36:58.466Z: 2022-04-07T00:36:58.466Z cocalc:debug:websocket-api primus-api request object {"cmd":"realpath","path":"cocalc/src/packages/frontend/chat/input.tsx"}
2022-04-07T00:36:58.722Z: 2022-04-07T00:36:58.722Z cocalc:debug:websocket-api primus-api request object {"cmd":"synctable_channel","query":{"syncstrings":[{"string_id":"e130ad1228a76cd26b68e35d15dbdccd72c10a31","project_id":"10f0e544-313c-4efe-8718-2142ac97ad11","path":"cocalc/src/packages/frontend/chat/input.tsx","users":null,"last_snapshot":null,"snapshot_interval":null,"save":null,"last_active":null,"init":null,"read_only":null,"last_file_change":null,"doctype":null,"archived":null,"settings":null}]},"options":[]}


@williamstein
Copy link
Contributor Author

Good news. I put this code:

    setTimeout(() => {
      log("DOING delayed patches_table re-close", opts.path);
      syncdocs[opts.path]?.patches_list.close();
    }, 15000);

in src/packages/project/sync/sync-doc.ts right after

    log("syncstring table closed, so closing syncdoc", opts.path);
    close_syncdoc(opts.path);

and was able to exactly reproduce the bug. Finally!

@williamstein
Copy link
Contributor Author

I think a root cause that made this vanishing unlikely event highly likely is

c98e510#diff-f5abc78278efe51cc4ae9ad7a67c1f96ed281d04816aec6740e2adc857efeecaR16

which I just fixed.

@williamstein
Copy link
Contributor Author

It's now been over two weeks with me heavily using CoCalc, and I haven't seen any indication of this (or any other) sync bug. Moreover, there have been no reports from anybody else either. Finally!

Maybe our sync implementation is "bug free", at least within the constraints of what is possible give our choices...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant