luissantosHCIT

Context

Fixes #5340
Requires #5293, #5327, and #5337 before merging.
Commits start August 15th.

This PR brings a React component that can handle SR encapsulated contents.
Currently, the viewer assumes all data in the SR tree are pieces of formatted text.
However, in my environment, I come across SR series that are encapsulated reports.
The encapsulated report is an HTML report in my case.
As part of the component bundle, I included a component for handling PDF if the encountered data is PDF.
The new components default to text like before.

Also, because of the danger of rendering HTML, I added a dependency on sanitize-html and take steps to clean the input HTML as much as possible. I make sure no scripts are allowed to execute when viewing HTML reports.

A follow-up PR or two will be needed to:

  • Refactor the UTF-8 encoding logic to use a newer version of the dcmjs library once my PR (455) gets merged.
  • Introduce test data and unit tests.

Changes & Results

  • Added HTML rendering component.
  • Added PDF rendering component.
  • Preserved Text Rendering behavior as default.
  • Added dependency on sanitize-html to help clean up document.
  • Added utility function to make sure DICOM inputs are encoded in utf-8.
  • Added regexes to make sure we isolate HTML in the payload and discard anything else.
  • Corrected logic bug with obtaining the CodeMeaning elements in SR nodes.
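
As a rough illustration of the "isolate HTML in the payload" item above, a minimal sketch might look like the following. The function name `extractHtmlDocument` and the regex are hypothetical and are not the code in this PR; the extracted markup would still need to go through sanitize-html before rendering.

```typescript
// Hypothetical pattern (an assumption, not the PR's regex): match the first
// complete <html>...</html> element, case-insensitively, across newlines.
const HTML_DOCUMENT_PATTERN = /<html[\s\S]*?<\/html\s*>/i;

// Hypothetical helper: keep only the HTML document found in a payload and
// discard anything before or after it.
function extractHtmlDocument(payload: string): string | null {
  const match = HTML_DOCUMENT_PATTERN.exec(payload);
  // Return null when no complete <html> element is found so the caller can
  // fall back to the default text rendering.
  return match ? match[0] : null;
}
```

Returning `null` instead of an empty string is a design choice in this sketch: it lets the caller distinguish "no HTML found" from "an empty HTML document".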

Before

image

After

image

Testing

E2E tests and test dataset will need to be introduced in a later PR because I need to figure out how to propose test data.
Also, the PR is getting much bigger than I feel comfortable with, and I think it will put more burden on the reviewer, so I am breaking this into a few chunks.

TODO: Finalize before and after testing with current test suites in the morning.

Checklist

PR

  • My Pull Request title is descriptive, accurate and follows the
    semantic-release format and guidelines.

Code

  • My code has been well-documented (function documentation, inline comments,
    etc.)

Public Documentation Updates

  • [] The documentation page has been updated as necessary for any public API
    additions or removals.

Tested Environment

  • OS: Ubuntu Linux
  • Node version: 20.18.0
  • Browser: Firefox 141, Chromium 138.0.7204.157

… reflect expected return type.

Signed-off-by: Luis M. Santos <[email protected]>
…g metadata. WADO metadata request is more likely to return JSON and some VNAs do not like the extra options in the Accept header. Also, passing an empty array is not sufficient because somewhere we still include a comma that breaks the Accept header. It's better to omit it for the metadata request only.

Signed-off-by: Luis M. Santos <[email protected]>
…s typed checked. Of course, I upgraded the module to a TypeScript module.

Moved the request header interfaces into its own TypeScript module (RequestHeaders.ts) in the core types.

Signed-off-by: Luis M. Santos <[email protected]>
…s a result, upgraded the source file to TypeScript.

Removed the User interface from RequestHeaders module and moved them to the user module.

Signed-off-by: Luis M. Santos <[email protected]>
…urely from QIDO as possible.

Signed-off-by: Luis M. Santos <[email protected]>
…ccount so that slightly different requests are attempted. In particular, it fixes the instance metadata retrieval getting blocked because they share a study uid.

Signed-off-by: Luis M. Santos <[email protected]>
… used by the display protocol engine.

Signed-off-by: Luis M. Santos <[email protected]>
…rging of data from the qido search into copies of a single reference slice metadata. It saves on unnecessary data retrieval and thus makes the viewer loading very snappy.

Signed-off-by: Luis M. Santos <[email protected]>
…ive on what to retrieve.

Signed-off-by: Luis M. Santos <[email protected]>
… allow usage of type definition elsewhere for TS source files.

Signed-off-by: Luis M. Santos <[email protected]>
…hat I can retrieve multiple slice metadata blocks.

Signed-off-by: Luis M. Santos <[email protected]>
…ed to pass a list of promises as opposed to a single promise so we can reconstruct the spatial information.

Signed-off-by: Luis M. Santos <[email protected]>
…we now fetch 2 slice metadata blocks (first and last slices) instead of one.

Signed-off-by: Luis M. Santos <[email protected]>
…econstructed metadata array can be used by the viewer and correctly perform a 3D reconstruction.

Adjusted the type definitions to improve self documentation of the input data. Since dcmjs is a JS library, the naturalized dataset lacks proper type annotations so this is the beginning work to add such annotations.

Signed-off-by: Luis M. Santos <[email protected]>
…ction for the retrieval of metadata from the dicom source.

Made stylistic refactors and added basic documentation.

Signed-off-by: Luis M. Santos <[email protected]>
…h level flow of the loading logic so I can figure out a small defect.

Signed-off-by: Luis M. Santos <[email protected]>
…make the extraction logic cleaner and allow for sharing of types with the wado modules.

Signed-off-by: Luis M. Santos <[email protected]>
… level since it is a core definition to the rest of the local codebase.

Signed-off-by: Luis M. Santos <[email protected]>
…an cleaning the _retrieveMetaDataAsync logic to help in finding the root cause of a defect introduced in my sync retrieval changes.

Signed-off-by: Luis M. Santos <[email protected]>
luissantosHCIT and others added 15 commits August 15, 2025 15:44
…ted report contents.

Basically, the viewer can now display HTML, Markdown, PDF, and Text reports with the appropriate page object.
Added sanitize-html, Markdown, and @types/sanitize-html as dependencies.
TODO: Add proper HTML sanitization options so we can preserve the core portions of the document but remove anything that can cause security issues.
TODO: Hook the html sanitization routine into the component logic.

Signed-off-by: Luis M. Santos <[email protected]>
Adjusted the html extraction regex to ensure we capture html contents only.

Signed-off-by: Luis M. Santos <[email protected]>
…evels of abstraction better.

Signed-off-by: Luis M. Santos <[email protected]>
… it was not meeting the standard specification of looking into the ConceptNameCodeSequence array. It assumed that the CodeMeaning field was at the array object level.

Refactored code for clarity and added type definitions.

Signed-off-by: Luis M. Santos <[email protected]>
…codesequence node. This is the same bug as at the container level.

Minor API refactor for clarity.
Restored logic from initial commit since it handles the continuity of content, which I typically don't, and I want to avoid breaking someone else's expectations. If the current solution turns out not to be complete for embedded reports, I shall revisit this file.

Signed-off-by: Luis M. Santos <[email protected]>
…ction for enforcing utf-8 encoding in payload. Updated the sanitizeHTML function to go through all steps necessary to end up with a clean HTML document.

Signed-off-by: Luis M. Santos <[email protected]>
…s to track the specified encoding type so that we can convert to utf-8 accordingly

Signed-off-by: Luis M. Santos <[email protected]>
…o that we can guarantee that PACS or VNA payloads are safe to render.

Signed-off-by: Luis M. Santos <[email protected]>
… render encapsulated reports correctly.

Signed-off-by: Luis M. Santos <[email protected]>

netlify bot commented Aug 19, 2025

Deploy Preview for ohif-dev ready!

Name Link
🔨 Latest commit fe2c5c7
🔍 Latest deploy log https://app.netlify.com/projects/ohif-dev/deploys/68a5e7c4347cf70008b4af22
😎 Deploy Preview https://deploy-preview-5345--ohif-dev.netlify.app

@luissantosHCIT
Author

I am not sure what Cypress is complaining about.
The test suite runs without errors in my local environment and I have not changed the tests at all.

image

@luissantosHCIT
Author

I see lots of Segmentation-related errors in Playwright, which do not look related to anything I have done so far.

I am thinking the last set of commits I synced from upstream might be faulty or incomplete.

Removing the [WIP] designation since I do not see anything else to do for this PR until further guidance.

@luissantosHCIT luissantosHCIT changed the title from "[WIP] feat 5340 addition of html and pdf report rendering of encapsulated SR reports." to "feat 5340 addition of html and pdf report rendering of encapsulated SR reports." on Aug 20, 2025
@sedghi
Member

sedghi commented Aug 25, 2025

Same comment I had in the other PR: it seems like the diff includes other changes. This makes merging your PRs slower. If you can include only the SR changes in this one, we can review and test faster and merge while waiting for the others to get reviewed.

@wayfarer3130
Contributor

Can you provide an anonymized example of both pdf and html reports which can be added to the automated test data set? I can get those uploaded.
You must include a notification in that which indicates there is no PHI data in them, and that OHIF has permission to include and distribute them in the viewer test dataset.

@@ -0,0 +1,64 @@
import React, { useEffect } from 'react';
import Markdown from 'marked-react';
Contributor

Can you do lazy inclusion of marked-react? This component isn't used elsewhere, so loading it with a dynamic import statement, which splits it into a separate load container, will keep the overall OHIF size a bit smaller.

Member

why are you not using react-markdown which is far more utilized in other projects?

https://npmtrends.com/marked-react-vs-react-markdown

Author

Thank you. I will read more about this and educate myself and then submit a commit.

*/
export function getPayloadType(payload: string, suggested_mime: string = 'text/plain') {
// PDF
if (!payload.indexOf('%PDF-')) {
Contributor

Don't you want payload.indexOf('%PDF-') to test if it IS PDF?

Author

I am using the magic number approach I would use in C, especially when avoiding deep reads into data contents for performance reasons. Since we are talking about JS here, which is many abstraction layers above bare metal, I tend to be even less liberal with potentially costly checks. I am primarily a systems programmer, so hopefully you can see a pattern here.

With that said, perhaps it is best that I regex the input, if PDF has a magic pattern at the end of the file, so that we ignore any potentially malicious payloads at the boundaries, and then verify here that the payload starts with the magic number and otherwise ignore it.
There are arguments in either direction, but I would rather detect that the payload is a potentially well-formed PDF (starts with the correct signature) and then let the browser object actually test whether the PDF is valid. That way, I should have guarantees that the browser implementors have better ways to test this while being potentially more performant (the payload would be on the binary side of the JS engine at that point).

That's my thought process at least.
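
To make the check under discussion concrete: `!payload.indexOf('%PDF-')` is true only when `indexOf` returns `0`, that is, when the payload starts with the signature. A sketch of an equivalent and more explicit form is below; the helper names `looksLikePdf` and `looksLikePdfViaIndexOf` are hypothetical and not part of this PR.

```typescript
const PDF_SIGNATURE = '%PDF-';

// Hypothetical helper: true when the payload begins with the PDF signature.
function looksLikePdf(payload: string): boolean {
  return payload.startsWith(PDF_SIGNATURE);
}

// Same result as the expression in the diff. indexOf returns 0 (falsy) for a
// match at the start, and -1 or a positive index (both truthy) otherwise.
function looksLikePdfViaIndexOf(payload: string): boolean {
  return !payload.indexOf(PDF_SIGNATURE);
}
```

One possible reason to prefer `startsWith` is that it only inspects the beginning of the string, whereas `indexOf` keeps searching the rest of the payload when the signature is not at the start.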

* @param {string} suggested_mime Default MIME to use if we cannot identify the content's MIME
* @return string
*/
export function getPayloadType(payload: string, suggested_mime: string = 'text/plain') {
Contributor

Please add an options object containing the suggestedMime (note lower camel case) and a flag for whether it should try scanning for html/pdf. It shouldn't do that automatically, since the MIME type is supposed to be provided in the DICOM itself for encapsulated objects. You could then set a flag to only do the test if a certain customization was present. That customization could be a simple test function for the given type. That way other types can be recognized as needed, and no special types would need to be provided by default.
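
One way the suggested signature could look is sketched below. This is only an illustration of the request: the `PayloadTypeOptions` shape, the `detectors` map, and the revised parameter list are assumptions and do not exist in the PR.

```typescript
// Hypothetical type: a test function that recognizes one kind of payload.
type PayloadDetector = (payload: string) => boolean;

// Hypothetical options object, using lower camel case as requested.
interface PayloadTypeOptions {
  // MIME type to fall back on, ideally the one declared in the DICOM object.
  suggestedMime?: string;
  // Optional content tests keyed by the MIME type they recognize.
  detectors?: { [mime: string]: PayloadDetector };
}

function getPayloadType(payload: string, options: PayloadTypeOptions = {}): string {
  const { suggestedMime = 'text/plain', detectors = {} } = options;
  // Content scanning only happens when the caller supplies detectors.
  for (const mime of Object.keys(detectors)) {
    if (detectors[mime](payload)) {
      return mime;
    }
  }
  return suggestedMime;
}
```

With no detectors supplied, the function simply returns the suggested MIME type, so nothing is scanned by default; a customization would opt in by passing, for example, `{ detectors: { 'application/pdf': p => p.startsWith('%PDF-') } }`.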

* @param {string} mime MIME to add to Blob so other components can know how to handle contents.
* @return Blob
*/
export function stringToBlob(data: string, mime: string = payloadMIMEOptions.DEFAULT): Blob {
Contributor

This should be a shared external function. It gets repeated several times in the code so sharing it means any bugs in it could be fixed.

Author

Sorry, you are probably asking about a TypeScript specific thing. I am not understanding this request. The function is already exported from the payload module in utils. I thought I had all areas pointing to this function as an import. Is it that I left places accidentally duplicating code?

My brain cleans the code in passes and sometimes I forget to do one last pass.

export function getCodeMeaningFromConceptNameCodeSequence(
conceptNameCodeSequence: ConceptNameCodeSequence
): string {
let item: ConceptNameCodeSequenceItem = conceptNameCodeSequence[0];
Contributor

const

): string {
let item: ConceptNameCodeSequenceItem = conceptNameCodeSequence[0];
const { CodeMeaning } = item;
return CodeMeaning ?? "";
Contributor

Why not
{
return conceptNameCodeSequence?.[0]?.CodeMeaning || ""
}

Really, all you are doing is extracting path values, and a single extractor is easier to manage.
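
Spelled out as a complete function, the suggestion might look like the sketch below. The type definitions are minimal stand-ins assumed for the example; the PR's real `ConceptNameCodeSequence` types may differ.

```typescript
// Assumed minimal types for illustration only.
interface ConceptNameCodeSequenceItem {
  CodeMeaning?: string;
}
type ConceptNameCodeSequence = ConceptNameCodeSequenceItem[];

function getCodeMeaningFromConceptNameCodeSequence(
  conceptNameCodeSequence?: ConceptNameCodeSequence
): string {
  // Optional chaining tolerates a missing sequence or an empty array, and
  // `|| ''` also maps an empty CodeMeaning to the empty string.
  return conceptNameCodeSequence?.[0]?.CodeMeaning || '';
}
```

Compared with indexing `[0]` and then destructuring, this form does not throw when the sequence is empty, because destructuring `undefined` raises a TypeError.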

Author

I just assumed I was done cleaning up the code after some debugging and before the PR, which clearly is not true. I will correct that.

Question: would the "compiler" elide the extra lines? In other words, do TS compilers optimize code (not just minimize) like a C or Rust compiler? I know TS is mostly linters on top of JS, but I am curious how far the compiler goes through and if the extra lines actually have effects on final package. Thank you for educating me on this one.

@luissantosHCIT
Author

Can you provide an anonymized example of both pdf and html reports which can be added to the automated test data set? I can get those uploaded. You must include a notification in that which indicates there is no PHI data in them, and that OHIF has permission to include and distribute them in the viewer test dataset.

Let me get back to you on this. I believe I have access to purely fictitious examples that have no patient involvement. How should the notice be provided? What's the expected format so I can ask for proper permission?

Thank you!

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add HTML Report display widget
3 participants