-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[WIP] fix 5287 optimize series discovery on load time #5327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[WIP] fix 5287 optimize series discovery on load time #5327
Conversation
… reflect expected return type. Signed-off-by: Luis M. Santos <[email protected]>
…g metadata. WADO metadata request is more likely to return JSON and some VNAs do not like the extra options in the Accept header. Also, passing an empty array is not sufficient because somewhere we still include a comma that breaks the Accept header. It's better to omit it for the metadata request only. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…s typed checked. Of course, I upgraded the module to a TypeScript module. Moved the request header interfaces into its own TypeScript module (RequestHeaders.ts) in the core types. Signed-off-by: Luis M. Santos <[email protected]>
…s a result, upgraded the source file to TypeScript. Removed the User interface from RequestHeaders module and moved them to the user module. Signed-off-by: Luis M. Santos <[email protected]>
…ests are passing. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…urely from QIDO as possible. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…ccount so that slightly different requests are attempted. In particular, it fixes the instance metadata retrieval getting blocked because they share a study uid. Signed-off-by: Luis M. Santos <[email protected]>
… used by the display protocol engine. Signed-off-by: Luis M. Santos <[email protected]>
…rging of data from the qido search into copies of a single reference slice metadata. It saves on unnecessary data retrieval and thus makes the viewer loading very snappy. Signed-off-by: Luis M. Santos <[email protected]>
…ive on what to retrieve. Signed-off-by: Luis M. Santos <[email protected]>
…compatibility. Signed-off-by: Luis M. Santos <[email protected]>
… allow usage of type definition elsewhere for TS source files. Signed-off-by: Luis M. Santos <[email protected]>
…hat I can retrieve multiple slice metadata blocks. Signed-off-by: Luis M. Santos <[email protected]>
…ed to pass a list of promises as opposed to a single promise so we can reconstruct the spatial information. Signed-off-by: Luis M. Santos <[email protected]>
…we now fetch 2 slice metadata blocks (first and last slices) instead of one. Signed-off-by: Luis M. Santos <[email protected]>
…econstructed metadata array can be used by the viewer and correctly perform a 3D reconstruction. Adjusted the type definitions to improve self documentation of the input data. Since dcmjs is a JS library, the naturalized dataset lacks proper type annotations so this is the beginning work to add such annotations. Signed-off-by: Luis M. Santos <[email protected]>
…ction for the retrieval of metadata from the dicom source. Made stylistic refactors and added basic documentation. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…h level flow of the loading logic so I can figure out a small defect. Signed-off-by: Luis M. Santos <[email protected]>
…make the extraction logic cleaner and allow for sharing of types with the wado modules. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
… level since it is a core definition to the rest of the local codebase. Signed-off-by: Luis M. Santos <[email protected]>
…an cleaning the _retrieveMetaDataAsync logic to help in finding the root cause of a defect introduced in my sync retrieval changes. Signed-off-by: Luis M. Santos <[email protected]>
const imageIds = getImageId({ instance, frame }); | ||
return imageIds; | ||
}, | ||
getImageIdsForInstance: getImageIdsForInstance, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be just getImageIdsForInstance,
as your assignment variable name is identical to what you are assigning it to.
It is indeed interesting to be able to query for studies with qido and display them, fetching first/last etc, but it definitely needs to be something that can be turned on/off with a flag because the performance of this can be either much better or much worse than the existing behaviour. The reason is that not all systems store all the metadata as individual pieces, but if they store it like static dicomweb or other stream optimized variants, then the fastest way to get the data is actually to fetch metadata for the entire series at once - that allows the best compression to be applied, and works equally well for both single and multiframe. |
To make this more configurable/manageable without introducing bugs, could you create a new data source with these changes? I would set it up so that it is easy to fall back to the basic data source retrieval options on conditions like only have <= N sop instances - that way you can use N~=10 and just fallback to retrieving full metadata, and only use the qido options when they actually make a significant difference. It might also need to have some way to customize when that occurs, and almost certainly needs to have a way to go back and reload everything using the full metadata set on a per-series basis. |
Let me make it a flag for now. You bring excellent points that I need to consider. In terms of the WADO/QIDO part of the standard, it's a REST API, so I think a standards compliant system has to comply with expected metadata. WADO example,
https://dicom.nema.org/medical/dicom/current/output/html/part18.html#sect_10.4.1.1.2 QIDO example,
https://dicom.nema.org/medical/dicom/current/output/html/part18.html#sect_10.6.3.3.1 I do understand helping gzip or zstd achieve better compression, but I am not convinced that it is necessarily faster to retrieve MB of data in most cases. In my particular case, the response is already gzip compressed. This is something that would have to be extensively benchmarked across vendor implementations and it is well outside my scope and resources at the moment. With that said, if we stick strictly to the standard, QIDO as deep into the tree as possible and then WADO at the leaf should include much fewer metadata than WADO at the higher levels (series) despite having multiple separate requests and potentially poor compression per request. Also, we get to benefit from request parallelization and keeping them close to less than 14KB to avoid the TCP slow start tax per connection. Meaning, it should be near instantaneous which is what I found in my crude testing. This is not to discredit your observations. I am thinking that most studies should have a lot of metadata because that is the nature of DICOM (CT, MRI, PET, DWI, Perfusion, etc...) and DICOMweb lacks a GraphQL specification like FHIR has. PS..... I just read your follow up comment and I fully agree. That is a better way to approach this. I was avoiding adding to the project since I am new here, but I do like that suggestion the more I think about this. One thing we (more like I) might need to revisit the correct threshold for falling back to full retrieval. Starting at 10 sounds good to me for now. I will check CINE and CS3D if possible. Expect changes starting tomorrow since I need to take care of something this afternoon at my institution. Thank you. This is a good discussion! Also, I already saw a few other emails about style and I will look at those as well for the other PRs, but I wanted to share that TypeScript is not my primary language so occasionally you might see more of a C or Rust or Pythony style when writing things. I am going to take this opportunity to learn from you! |
Signed-off-by: Luis M. Santos <[email protected]>
…s into its own module. Signed-off-by: Luis M. Santos <[email protected]>
Moved my changes to its own module as requested. |
…ces. Reorganized the utility functionality so that different data sources can select the functionality they need while minimizing duplication and migrating code into TypeScript land. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…to the new utils toolbox. Signed-off-by: Luis M. Santos <[email protected]>
…ring this portion of the API in the DataSources because I may have made a mistake previously. I think it is part of the API to have their own copy-paste version of getImageIdsForInstance. I will need to revisit refactoring this at a later time. Signed-off-by: Luis M. Santos <[email protected]>
…ataSource API as far as I could see doing a codebase search. I am implementing the same elsewhere. Signed-off-by: Luis M. Santos <[email protected]>
…ataSource API and I wanted to make all of these guys similar so I can more consistently refactor the general API in the future. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…it uses the same utilities as the DicomWebMinimalDataSource. Signed-off-by: Luis M. Santos <[email protected]>
…se these functions across several data sources. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
…the study contains fewer than 10 slices. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
… generateWadoHeader. Signed-off-by: Luis M. Santos <[email protected]>
TODO:
|
…ents were expected to also change per slice. Signed-off-by: Luis M. Santos <[email protected]>
…com Data Sources directory. Signed-off-by: Luis M. Santos <[email protected]>
…rrectly. Signed-off-by: Luis M. Santos <[email protected]>
Signed-off-by: Luis M. Santos <[email protected]>
The different queries do in fact need to be implemented, but the performance of different types of queries and what is available to actually return differs between systems. As well, the actual parsing performance differs between systems. Given that, to figure out how to better deal with slower PACS systems or ones not supporting multiframes:
For the first one, try the hanging protocols which don't load the entire study, but only load the required series. That means that it has to be determinable which series are displayed first by things available in the series query, without needing to load all the metadata. That can be a problem for some study types. For the second, you idea of using the instances query is probably the best option - however, you still need to be able to determine the split/sort rules that are applicable. There are ways to "guess" that, but the query data returned by the initial query isn't really good enough to guess that - something like: You can start seeing why a general solution to not load all the metadata up front becomes a problem because you need to deal with a lot of different use cases. If the hanging protocol used to determine initial load can also determine whether it is safe to do a query instance metadata for some of the instances, then it would help you to switch between full metadata modes and partial, even on a single series. Those hanging protocols probably need to be tuned to the exact back end response data you can get on the instance level query and series level queries. |
Context
The synchronous load method pulls too much metadata during the loading phase. This wastes resources such as bandwidth, server connections, and user patience. In my test case, I was downloading up to 29 MB of JSON metadata which could take up to 10 seconds. The vast majority of the metadata is not used at all when setting up the UI/Hanging Protocol.
We can have a snappy synchronous experience by requesting and making use of as much of the
QIDO
metadata as possible.We can also skip downloading the full study metadata and simply download two slices worth of metadata per series. With a little
IPP
patching, the Viewer can still enable the 3D tools.Breaking the process in smaller QIDO and WADO queries allows for parallelization of the request against the server thus minimizing time spent requesting data.
Fixes #5287
Note
I apologize for the number of changes but I saw opportunities to improve code quality while chasing a few defects prior to PR release. I took time to add more type definitions, reorganize some of the files, and testing both the lazy load and sync load features.
Also, note that this only takes care of the metadata portion itself but not the image load. I will tackle the image load optimization in a separate PR as an opt in setting. For now, the UI should update pretty fast although the thumbnail retrieval will prevent it from showing the images which can take a second in my environment.
Ignore commits from August 4th since those are being reviewed in #5293. They are only showing here because I had to base my new branch off the one associated with that PR so I could test against my live environment as well.
Finally, I will be adding a few more commits but those will be unit tests. I decided to open this PR to allow initial reviews. I should be adding the rest of the unit tests soon (tomorrow).
Changes & Results
DicomWebDataSource
module/component.Before
After
Testing
Checklist
PR
semantic-release format and guidelines.
Code
etc.)
Public Documentation Updates
additions or removals.
Tested Environment