
Conversation

@maheshsooryambylu (Collaborator) commented Mar 8, 2025

Fixes the issues related to large file upload using the SDM Java plugin; performance is on par with that of the application option when compared at similar network speeds.
The crux of the problem: the existing upload function in SDMServiceImpl works even for large files when run against our locally hosted container, but it fails miserably in the real setup (even a half-GB file takes around 15 minutes to upload). It works locally because the InputStream fills up quickly enough to keep the InputStreamReader served, so the upload logic can read the stream in chunks and send them to DI from the locally hosted middleware. Now contrast this behaviour with the application container hosted in BTP: by the time the upload logic in SDMServiceImpl executes, the InputStream does not yet hold the complete stream data; only a few chunks are available to read. That throttling was enough to manifest in the large file upload scenario.

The gist of the fix is as follows:
DocumentUploadService is the new class created to upload documents. Large files are uploaded by sending a multipart upload to DI; smaller files still use the single-chunk upload. For now the file size that decides whether a file counts as large is hardcoded to 100 MB. I used Java's reactive framework to make the HTTP calls; it would be the better choice for continuous streaming, and while it does not make much difference here since we send chunk by chunk, I am keeping it that way to introduce ourselves to the more adept reactive framework! Also note that I have used the MemoryManagement bean to emit the current heap usage at least once every 5 chunks of processing (to understand whether GC is keeping up with our usage of the heap). Also note that I am using System.out.println for now until I understand the logging framework being used (these can be changed to log messages once we know where and how the log object is wired). I avoided SDMServiceImpl because this fix uses httpclient5 libraries throughout, whereas other functions in SDMServiceImpl use httpclient4.
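A minimal sketch of the size-based dispatch described above, for illustration only: the 100 MB threshold comes from this comment, while the method signatures and the `diSingleUpload`/`diAppendChunk` helpers are hypothetical stand-ins for the actual httpclient5 calls to DI.

```java
import java.io.IOException;
import java.io.InputStream;

public class DocumentUploadService {

    // Hardcoded for now, as noted above; made configurable later in this thread.
    private static final int CHUNK_SIZE = 100 * 1024 * 1024; // 100 MB

    public void upload(InputStream content, long fileSize) throws IOException {
        if (fileSize <= CHUNK_SIZE) {
            // Small file: keep the existing single-chunk upload path.
            diSingleUpload(content, fileSize);
            return;
        }
        byte[] buffer = new byte[CHUNK_SIZE];
        long sent = 0;
        int chunkIndex = 0;
        int read;
        while ((read = readFully(content, buffer)) > 0) {
            sent += read;
            // Multipart upload: append one chunk at a time to DI.
            diAppendChunk(buffer, read, chunkIndex, /* isLast */ sent >= fileSize);
            if (chunkIndex % 5 == 0) {
                // Emit heap usage at least once every 5 chunks, to watch GC behaviour.
                Runtime rt = Runtime.getRuntime();
                System.out.println("heap used: " + (rt.totalMemory() - rt.freeMemory()) + " bytes");
            }
            chunkIndex++;
        }
    }

    // Fill the buffer completely unless the stream ends first; returns bytes read.
    private static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break;
            total += n;
        }
        return total;
    }

    // Placeholders for the actual httpclient5 calls to DI; not the real API.
    private void diSingleUpload(InputStream content, long size) throws IOException { /* ... */ }
    private void diAppendChunk(byte[] chunk, int length, int index, boolean isLast) throws IOException { /* ... */ }
}
```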
ReadAheadInputStream overrides InputStream for a simple reason: processing the InputStream chunk by chunk and calling the DI API serially won't scale, because DI spends around 20-30 seconds processing a 100 MB chunk. That is precious time lost if we sit idle for that duration on every chunk. Instead, the next chunk is kept ready while the current chunk is being sent to DI and we wait for DI's synchronous response. This is achieved with the Executor framework: a spawned thread reads the next chunk and populates a FIFO queue, which lets us take the next chunk from the head of the queue as soon as the DI response for the current chunk (appendContentStream) is received. This repeats for every chunk until the last one.
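A minimal sketch of that read-ahead mechanism, assuming a single producer thread filling a bounded FIFO queue; the class and method names here are illustrative, and the PR's actual ReadAheadInputStream (which subclasses InputStream) may differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReadAheadChunks implements AutoCloseable {

    private static final byte[] EOF = new byte[0]; // sentinel marking end of stream

    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(2); // bounded read-ahead
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public ReadAheadChunks(InputStream source, int chunkSize) {
        // Producer: keeps the next chunk ready while the consumer waits on DI.
        executor.submit(() -> {
            try (InputStream in = source) {
                while (true) {
                    byte[] buf = new byte[chunkSize];
                    int total = 0;
                    while (total < chunkSize) {
                        int n = in.read(buf, total, chunkSize - total);
                        if (n < 0) break;
                        total += n;
                    }
                    if (total == 0) break; // nothing left to read
                    queue.put(total == chunkSize ? buf : Arrays.copyOf(buf, total)); // blocks if full
                    if (total < chunkSize) break; // short read => source exhausted
                }
                queue.put(EOF);
            } catch (IOException e) {
                queue.offer(EOF); // best effort: unblock the consumer on read failure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Blocks until the next chunk is available; returns null once the source is exhausted. */
    public byte[] nextChunk() throws InterruptedException {
        byte[] chunk = queue.take();
        return chunk == EOF ? null : chunk;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

The uploader then loops on nextChunk(): while it waits for DI's synchronous appendContentStream response for the current chunk, the producer thread is already reading the next one into the queue.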

Here is the comparative performance of this fix vs. the application option; please refer to the attached files with the corresponding names.

  - 2.13 GB file: plugin 10.9 min vs. application option 11.2 min, at 69 Mbps upload bandwidth
  - 167 MB file: plugin 1.1 min vs. application option 1.1 min, at 73 Mbps upload bandwidth

Attachments: PluginUploading2200MBFileAt69mbpsSpeed, PluginUploadin167MBFile, ApplicationOptionUploading2200MBFileAt69mbpsSpeed, ApplicationOptionUploading167MBFile

@ashishjain14 (Collaborator) left a comment

Thanks for the excellent root cause analysis @maheshsooryambylu.
Can you please review the following:

  1. Replace the System.out.println statements with log messages.
  2. Can we keep the chunk size configurable instead of hardcoding it to 100 MB, for example on the basis of the file size?
  3. Do we consider retries in case of failures?

@maheshsooryambylu (Collaborator, Author)

Thanks for the excellent root cause analysis @maheshsooryambylu. Can you please review the following:

  1. Replace the System.out.println statements with log messages.
  2. Can we keep the chunk size configurable instead of hardcoding it to 100 MB, for example on the basis of the file size?
  3. Do we consider retries in case of failures?

Thanks for the review, Ashish. Regarding the comments:

  1. Yes, those messages are just placeholders, as at present I don't see any logging framework being used in our plugin code. Will align with @rashmiangadi11 and use SLF4J, maybe.
  2. The chunk size will be configurable via an environment variable, defaulting to 100 MB when the variable is not present (see the sketch after this list). I would refrain from making the chunk size a dynamically deduced value, for the simple reason that there is not much we can deduce dynamically: the ideal chunk size would depend on what, on average, counts as a large file for the given application, and network speed would be the only other factor that could help deduce an ideal chunk size, which is in any case difficult to figure out from the code.
  3. Yes, that is very much needed. Will add it in the code.
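A small sketch of point 2, assuming a plain environment variable; the name SDM_UPLOAD_CHUNK_SIZE_MB is hypothetical, only the 100 MB default comes from this thread.

```java
// Resolve the chunk size from an environment variable, defaulting to 100 MB.
// SDM_UPLOAD_CHUNK_SIZE_MB is a hypothetical variable name for illustration.
static int chunkSizeBytes() {
    String raw = System.getenv("SDM_UPLOAD_CHUNK_SIZE_MB");
    int mb = 100; // default when the variable is absent or malformed
    if (raw != null) {
        try {
            mb = Integer.parseInt(raw.trim());
        } catch (NumberFormatException ignored) {
            // keep the 100 MB default
        }
    }
    return mb * 1024 * 1024;
}
```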

…dcontent failing the whole operation wont fail
@ashishjain14 (Collaborator)

Hi @maheshsooryambylu: Can we close this PR?
