Description
⚠️ Before submitting, please verify the following: ⚠️
- This is a bug, not a question or a configuration issue.
- This issue is not already reported on Github (I've searched it).
- Nextcloud Server and Desktop Client are up to date. See Server Maintenance and Release Schedule and Desktop Releases for supported versions.
- I agree to follow Nextcloud's Code of Conduct
Bug description
I am running Nextcloud server 25.0.3 and use the Windows Desktop Client 3.6.6 on several Windows 10 installations. While working with the Nextcloud Virtual Drive / Files in Windows, I ran into problems with large virtual disk files that I wanted to synchronize via the virtual drive from my Windows clients to the Nextcloud server. These files are up to 120GB in size, but could be larger.
Regardless of what I tried, the Desktop Client aborted the synchronization of such files 30 minutes after the upload progress bar had reached 100%, reporting a "Connection timed out" error message.
So I started to dig deeper. This is what I came up with while testing with a 77GB file.
Synchronization of larger files via the Desktop Client consists of two major stages for files which do not yet exist on the Nextcloud server. In the first stage, the Desktop Client uploads the file in chunks to a temporary upload folder on the Nextcloud server. Once this is completed, the Desktop Client asks the Nextcloud server to assemble these chunks back to one file at the destination folder.
The first stage of the upload works fine. The large file gets chunked and uploaded to the upload folder of the Nextcloud server. While this is ongoing, the Desktop Client continuously updates the remaining time and the progress bar on its "Settings" screen. Once all chunks have been uploaded, the status information of the Desktop Client changes to "A few seconds left". Then it starts the second stage of the synchronization run.
The Desktop Client sends a MOVE command to the Nextcloud server and starts waiting for the reply to this request. The Nextcloud server begins to assemble the chunks at the final folder. While the Nextcloud server is assembling, the Desktop Client keeps showing the "A few seconds left" status message and visually seems to be "stuck". However, it is still maintaining the connection to the Nextcloud server, waiting for the reply to the MOVE command.
Based on its size and the speed of the disk drives of my Nextcloud server, assembling my 76GB test file takes about 40 minutes (sometimes even more). If the file already exists on the Nextcloud server, the overall processing time roughly doubles, because the server has to copy the previous version to the files_versions folder prior to the MOVE operation.
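To put those numbers next to the 30 minute limit, here is a rough sketch. It assumes that assembly time scales linearly with file size, which I have not measured, and the helper names are made up for this example. The reference point is the 76GB / 40 minute figure from above.

```cpp
// Hypothetical helpers, not part of the Desktop Client or the server.

// Estimated server-side assembly time in minutes, ASSUMING the time grows
// linearly with file size. refSizeGb/refMinutes is one measured data point.
constexpr double estimateAssemblyMinutes(double fileSizeGb,
                                         double refSizeGb,
                                         double refMinutes)
{
    return fileSizeGb * (refMinutes / refSizeGb);
}

// True if the client would give up before the server finishes assembling.
constexpr bool exceedsTimeout(double assemblyMinutes, double timeoutMinutes)
{
    return assemblyMinutes > timeoutMinutes;
}

// Reference measurement from this report: 76GB took about 40 minutes,
// which is longer than the 30 minute client timeout.
static_assert(exceedsTimeout(estimateAssemblyMinutes(76.0, 76.0, 40.0), 30.0),
              "the 76GB test file should exceed a 30 minute timeout");
```

Under the same assumption, a 120GB file would need roughly an hour of assembly time on my server, twice the current limit.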
After waiting for 30 minutes on the response to the MOVE command from the Nextcloud server, the Desktop Client terminates the connection with the Nextcloud server and displays a "Connection timed out" error message.
The Nextcloud server however does not mind and finishes the MOVE operation properly. Following the time out, the Desktop Client marks the transfer as incomplete and starts the next attempt to synchronize the file. Because the file already exists on the Nextcloud server, the server creates a new version of the file, starts to assemble the chunks of the new upload, and while doing so, the Desktop Client runs into the next 30 minute timeout and the procedure starts all over again.
To make things even worse, after the second timeout the Desktop Client detects that there are chunks left over on the Nextcloud server which it believes belong to a failed synchronization. And because of that, it requests the deletion of those chunks from the Nextcloud server. The server deletes the chunks in a second thread, while the first thread initiated by the timed out connection is still assembling. As a result, the assembling thread fails in the middle of its execution, because the remaining chunks are no longer available. It stops, leaving a partially assembled fragment of the original file at the destination folder behind. Hence the second synchronization creates a corrupted new version of the file on the server.
While trying to find the origin of that 30 minute timeout, I checked the source code of the Desktop Client and found that it is caused by a hardcoded maximum value inside the method PropagateUploadFileCommon::adjustLastJobTimeout in the file libsync\propagateupload.cpp.
In order to verify my assumption, I built my own version of the Desktop Client with that value set to 120 minutes (which would still cause issues with files larger than mine). I was able to confirm that this time my files synchronized as expected. The Desktop Client did not run into the 30 minute timeout. It waited until the MOVE operation had finished and completed the synchronization successfully with the green check mark.
The method contains a formula which calculates the MOVE timeout based on the size of the file. The calculated value would have worked for me, but the method then limits it to a hardcoded maximum of 30 minutes. This limit might make sense to avoid "stuck" Desktop Client synchronization runs, but for larger files, which take longer to assemble, it leads to this exact problem.
To make a long story short: I would really appreciate it if the hardcoded limit were increased or, even better, could be set or disabled via a configuration parameter of the Desktop Client.
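To illustrate the effect of the cap, here is a minimal standalone sketch. It is not the actual client code: the function names are hypothetical, and the rate of 3 minutes per gigabyte is an assumption chosen for the example. The only value taken from this report is the 30 minute maximum.

```cpp
#include <algorithm>
#include <cstdint>

constexpr std::int64_t kMsPerMinute = 60 * 1000;

// ASSUMPTION for illustration only: timeout budget per gigabyte of data.
constexpr std::int64_t kAssumedMinutesPerGb = 3;

// Hypothetical helper: size-based timeout in milliseconds, before any cap.
constexpr std::int64_t sizeBasedTimeoutMs(std::int64_t fileSizeBytes)
{
    return kAssumedMinutesPerGb * kMsPerMinute * fileSizeBytes / 1'000'000'000;
}

// Hypothetical helper: raise the size-based value to at least the job's
// current timeout, then limit it to capMs. If the two bounds conflict,
// the cap wins.
constexpr std::int64_t cappedTimeoutMs(std::int64_t currentTimeoutMs,
                                       std::int64_t fileSizeBytes,
                                       std::int64_t capMs)
{
    return std::min(std::max(sizeBasedTimeoutMs(fileSizeBytes), currentTimeoutMs),
                    capMs);
}

// With the assumed rate, a 77GB file gets a 231 minute budget...
static_assert(sizeBasedTimeoutMs(77'000'000'000) == 231 * kMsPerMinute,
              "size-based budget for 77GB");
// ...but a 30 minute cap cuts it down to 30 minutes.
static_assert(cappedTimeoutMs(5 * kMsPerMinute, 77'000'000'000,
                              30 * kMsPerMinute) == 30 * kMsPerMinute,
              "cap limits the budget");
```

Passing the cap as a parameter instead of a constant is the point of the sketch: the value could then come from a configuration setting, and a large enough cap would leave the size-based result untouched.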
I am really sorry for the long post, but it took me almost a week to figure out why my uploads aborted. So I wanted to share as much information as possible.
Please consider changing this behaviour in one of the future releases. Thank you very much for all of your past and future contributions to this project.
Steps to reproduce
- Move a file larger than approx. 75GB to a folder of your virtual drive
- Wait for the synchronization to start
- Wait until the progress bar of the Settings screen is at 100% and the status message above the progress bar changes to "A few seconds left". Note the time.
- Wait for 30 minutes. The Desktop Client will then report a "Connection timed out" error.
Expected behavior
The synchronization will successfully finish without a timeout error.
Which files are affected by this bug
libsync\propagateupload.cpp - method PropagateUploadFileCommon::adjustLastJobTimeout
Operating system
Windows
Which version of the operating system you are running.
Windows 10
Package
Appimage
Nextcloud Server version
25.0.3
Nextcloud Desktop Client version
3.6.6
Is this bug present after an update or on a fresh install?
Fresh desktop client install
Are you using the Nextcloud Server Encryption module?
Encryption is Disabled
Are you using an external user-backend?
- Default internal user-backend
- LDAP/ Active Directory
- SSO - SAML
- Other
Nextcloud Server logs
No response
Additional info
No response