
Conversation

@shntu (Contributor) commented on Sep 30, 2025

Description

I am currently testing this changeset and noticed a significant slowdown when loading large instance segmentation masks into memory: currently, it is not possible to access a prediction's class or confidence without also loading its entire mask. By using a generator, this change reduces memory usage by allowing the classes and confidences of predictions to be read without materializing every mask.
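A minimal sketch of the generator approach, assuming hypothetical names (`iter_predictions`, and the `pred[6]` class-index layout taken from the diff context below) rather than the repository's actual API:

```python
from typing import Iterator, Optional, Sequence, Tuple

# Sketch only: function name, argument layout, and the pred[6] class index
# are assumptions based on the diff context, not the repository's real API.
def iter_predictions(
    batch_predictions: Sequence,
    batch_masks: Sequence,
    class_names: Sequence[str],
    class_filter: Optional[set] = None,
) -> Iterator[Tuple[object, object]]:
    """Lazily yield (prediction, mask) pairs instead of building a list.

    A consumer that only reads each prediction's class or confidence can
    stop early, or skip masks entirely, without every mask having been
    processed up front.
    """
    for pred, mask in zip(batch_predictions, batch_masks):
        if class_filter and class_names[int(pred[6])] not in class_filter:
            continue
        yield pred, mask
```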

Type of change


  • Bug fix (non-breaking change which fixes an issue)

How has this change been tested? Please provide a testcase or example of how you tested the change.

Currently profiling the change to see whether it improves performance on edge devices (Jetson Orin 16GB).

Any specific deployment considerations

This changes the predictions container from a list to a generator, but it should not change any public APIs or dependencies; see the illustration below.
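As a rough illustration of what the list-to-generator swap does and does not affect (the consumer code below is hypothetical, not from the repository):

```python
def gen():
    # Stand-in for the new generator of (prediction, mask) pairs.
    yield from [("pred_a", "mask_a"), ("pred_b", "mask_b")]

# Plain iteration is unaffected by the swap:
for pred, mask in gen():
    print(pred)

# List-only operations, however, no longer work:
try:
    gen()[0]                # generators are not subscriptable
except TypeError as exc:
    print(exc)

# And unlike a list, a generator is exhausted after one pass:
predictions = gen()
print(len(list(predictions)))   # 2
print(len(list(predictions)))   # 0: already consumed
```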

Docs

N/A

```python
predictions = []
for pred, mask in zip(batch_predictions, batch_masks):
    if class_filter and self.class_names[int(pred[6])] not in class_filter:
        # TODO: logger.debug
```
@shntu (Contributor, Author) commented:

Would a debug log be necessary? This PR deletes the TODO since it didn't seem especially helpful, but I can add logging to the generator if needed.
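If logging does turn out to be useful, one possible shape for it inside the generator (illustrative only; the logger name, function name, and surrounding structure are assumptions):

```python
import logging

logger = logging.getLogger(__name__)

def iter_filtered(batch_predictions, batch_masks, class_names, class_filter=None):
    # Illustrative placement of the debug log the deleted TODO referred to.
    for pred, mask in zip(batch_predictions, batch_masks):
        label = class_names[int(pred[6])]
        if class_filter and label not in class_filter:
            logger.debug("Skipping prediction with filtered class %s", label)
            continue
        yield pred, mask
```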

```diff
 responses = [
     InstanceSegmentationInferenceResponse(
-        predictions=[
+        predictions=(
```
A reviewer (Contributor) commented:

Isn't it converted to a list before rendering? I'm not sure this helps with memory usage.

@shntu (Contributor, Author) replied on Oct 2, 2025:

Currently, there are a handful of enterprise users who import InferencePipeline and run it without actually using any of the HTTP responses.

I'm closing this PR because it does not actually solve the problem: memory usage during instance segmentation execution is much higher overall, and the memory consumed by these masks is only a small fraction of it. Actually reducing memory pressure on the Jetson would require a much larger change.

@shntu closed this on Oct 2, 2025
@shntu deleted the sb/generate-seg-masks branch on October 2, 2025 at 14:33