-
Notifications
You must be signed in to change notification settings - Fork 7
[Feature] New Matcher Class: Voronoi #229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a new Voronoi-based region matching algorithm for instance matching, along with CuPy GPU backend support for connected components analysis. The implementation adds RegionBasedMatching as a new matching scheme that uses spatial distance rather than traditional overlap-based metrics.
- Implements RegionBasedMatching algorithm using distance transforms and connected components
- Adds CuPy GPU backend support for accelerated connected components analysis
- Updates result handling to support NaN values for region-based matching metrics
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
unit_tests/test_region_matching.py | Basic test for RegionBasedMatching functionality |
unit_tests/test_region_integration.py | Integration test with full panoptic evaluation pipeline |
unit_tests/test_region_comprehensive.py | Comprehensive test suite covering various matching scenarios |
unit_tests/test_cupy_connected_components.py | Complete test suite for CuPy backend functionality |
unit_tests/test_config.py | Updates config tests to include CuPy backend |
pyproject.toml | Adds CuPy as optional GPU dependency with CUDA variants |
panoptica/utils/constants.py | Adds CuPy backend to CCABackend enum |
panoptica/panoptica_result.py | Updates type annotations and metric calculations for NaN support |
panoptica/panoptica_evaluator.py | Adds region-based matching detection and NaN handling |
panoptica/instance_matcher.py | Implements RegionBasedMatching algorithm |
panoptica/_functionals.py | Adds CuPy backend support to _connected_components |
panoptica/init.py | Exports RegionBasedMatching class |
benchmark/benchmark.py | Adds CuPy benchmarking functions |
.github/workflows/tests.yml | Adds separate CUDA testing workflow |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
region_mask = (labeled_array == region_label) | ||
|
||
# Compute distance transform | ||
distance = distance_transform_edt(~region_mask) |
Copilot
AI
Sep 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Computing distance transforms for all regions sequentially can be inefficient for large numbers of regions. Consider vectorizing this operation or using parallel processing for better performance with many ground truth regions.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would move this to a functionals function and import that here.
Otherwise, I agree with copilot, this can be made faster I think.
_connected_components(mask, CCABackend.cupy) | ||
|
||
panoptica_cupy_time = timeit.timeit(label_panoptica_cupy, number=10) | ||
|
Copilot
AI
Sep 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmark_panoptica_cupy function lacks GPU memory cleanup unlike benchmark_cupy. Consider adding memory pool cleanup: cp.get_default_memory_pool().free_all_blocks()
after the timing to ensure consistent memory usage across benchmarks.
# Clean up GPU memory | |
cp.get_default_memory_pool().free_all_blocks() |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
except (ImportError, Exception) as e: | ||
# CuPy not available or CUDA issues, skip this test | ||
if "CUDA" in str(e) or "cupy" in str(e).lower(): | ||
self.skipTest( | ||
f"CuPy/CUDA not available for connected components test: {e}" | ||
) | ||
else: | ||
raise |
Copilot
AI
Sep 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching broad 'Exception' and then checking error message strings is unreliable. Consider catching specific exceptions like CuPy's runtime errors (e.g., cupy.cuda.runtime.CUDARuntimeError) for more precise error handling.
Copilot uses AI. Check for mistakes.
@MarcelRosier #231 seems to have crashed our tests 🙈 |
I think the lockfile is simply outdated in this branch. @aymuos15 Can you try to run |
Ah okay. I will do this GMT evening. Unfortunately do not have access to my machine atm. Thanks for pointing it out. |
@MarcelRosier sorry for the delay but does this work? |
else: | ||
torch = None | ||
|
||
if TYPE_CHECKING: | ||
import torch | ||
|
||
|
||
def load_torch_image(image_path: str | Path) -> torch.Tensor: | ||
def load_torch_image(image_path: Union[str, Path]): | ||
if torch is None: | ||
raise ImportError("torch is not available. Please install torch to use this functionality.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that an issue? The input checker already governs package availability and should not even call this if it is not available, no?
pyproject.toml
Outdated
SimpleITK = "^2.2.2" | ||
nibabel = "^5.1.0" | ||
pynrrd = "^1.1.3" | ||
joblib = "^1.3.2" | ||
tqdm = ">=4.62.3" | ||
auxiliary = ">=0.1.0" | ||
pytest = ">=8.1.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want people to install every package we support by default, do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mandatory for the tests, thats why I added this. Auxiliary I think is gonna go away #217
Otherwise these are very lightweight anyways right? My theory is that people need not install anything themselves in order to run tests. Obviously happy to revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't call nibabel, simpleITK, pynrrd, joblib, and torch together lightweight^^
I believe @neuronflow and my idea was to keep panoptica's requirement to a minimum. And if people develop it and require the unittests, that's the point where they install the dev dependencies, right? That's what they are for. Am I mistaken?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry about the torch one, that was wobbly on my part. I am fixing that now. I do think the others are okay.
but yes, I guess I do agree need not install everything for the tests, I will revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @Hendrik-code
from importlib.util import find_spec | ||
|
||
# Optional torch import | ||
_spec = find_spec("torch") | ||
if _spec is not None: | ||
import torch | ||
HAS_TORCH = True | ||
else: | ||
torch = None | ||
HAS_TORCH = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is why torch is in the dev dependancies, and thus this should not be a problem, is it for you?
Sorry about the mess but I think things finally work now in terms of CI? If this is okay< I will port the same to #228 |
# For region-based matching, set TP to NaN since it doesn't use traditional counting | ||
tp_value = processing_pair.tp | ||
if instance_matcher.__class__.__name__ == "RegionBasedMatching": | ||
tp_value = np.nan | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh I see. you create more labelmap entries and thus indirectly create the region-based loop we discussed.
Although that is fine, I really don't like something hacky like this thing.
I much rather would say the matcher just takes care of the rest by calling the remainder of the pipeline itself (no need to create labelamp entries, that all is practically useless computation) and then the matcher itself can set those values to nan after the corresponding result objects have been created. You agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. This seemed the most minimal approach in hindsight and thats why I went ahead with it. How do you feel about this then:
git diff --no-color panoptica/instance_matcher.py panoptica/utils/processing_pair.py panoptica/instance_evaluator.py panoptica/panoptica_evaluator.py | cat
diff --git a/panoptica/instance_evaluator.py b/panoptica/instance_evaluator.py
index 8b5727b..fb5707d 100644
--- a/panoptica/instance_evaluator.py
+++ b/panoptica/instance_evaluator.py
@@ -30,7 +30,11 @@ def evaluate_matched_instance(
], "decision metric not contained in eval_metrics"
assert decision_threshold is not None, "decision metric set but no threshold"
# Initialize variables for True Positives (tp)
- tp = len(matched_instance_pair.matched_instances)
+ # For non-traditional counting (e.g., region-based matching), tp should be NaN
+ if hasattr(matched_instance_pair, 'use_traditional_counting') and not matched_instance_pair.use_traditional_counting:
+ tp = float('nan')
+ else:
+ tp = len(matched_instance_pair.matched_instances)
score_dict: dict[Metric, list[float]] = {m: [] for m in eval_metrics}
reference_arr, prediction_arr = (
diff --git a/panoptica/instance_matcher.py b/panoptica/instance_matcher.py
index 0eb4afc..8df679a 100644
--- a/panoptica/instance_matcher.py
+++ b/panoptica/instance_matcher.py
@@ -118,7 +118,7 @@ class InstanceMatchingAlgorithm(SupportsConfig, metaclass=ABCMeta):
**kwargs,
)
- return map_instance_labels(unmatched_instance_pair.copy(), instance_labelmap)
+ return map_instance_labels(unmatched_instance_pair.copy(), instance_labelmap, use_traditional_counting=True)
def _calculate_matching_metric_pairs(
self,
@@ -164,7 +164,9 @@ class InstanceMatchingAlgorithm(SupportsConfig, metaclass=ABCMeta):
def map_instance_labels(
- processing_pair: UnmatchedInstancePair, labelmap: InstanceLabelMap
+ processing_pair: UnmatchedInstancePair,
+ labelmap: InstanceLabelMap,
+ use_traditional_counting: bool = True
) -> MatchedInstancePair:
"""
Map instance labels based on the provided labelmap and create a MatchedInstancePair.
@@ -198,6 +200,7 @@ def map_instance_labels(
matched_instance_pair = MatchedInstancePair(
prediction_arr=prediction_arr_relabeled,
reference_arr=processing_pair.reference_arr,
+ use_traditional_counting=use_traditional_counting,
)
return matched_instance_pair
@@ -524,6 +527,39 @@ class RegionBasedMatching(InstanceMatchingAlgorithm):
cca_backend (CCABackend): Backend for connected component analysis.
"""
self._cca_backend = cca_backend
+
+ def match_instances(
+ self,
+ unmatched_instance_pair: UnmatchedInstancePair,
+ label_group=None,
+ num_ref_labels=None,
+ processing_pair_orig_shape=None,
+ **kwargs,
+ ) -> MatchedInstancePair:
+ """
+ Override to set use_traditional_counting=False for region-based matching.
+ """
+ # Create context if needed
+ context = None
+ if (
+ label_group is not None
+ or num_ref_labels is not None
+ or processing_pair_orig_shape is not None
+ ):
+ context = MatchingContext(
+ label_group=label_group,
+ num_ref_labels=num_ref_labels,
+ processing_pair_orig_shape=processing_pair_orig_shape,
+ )
+
+ instance_labelmap = self._match_instances(
+ unmatched_instance_pair,
+ context,
+ **kwargs,
+ )
+
+ # Use use_traditional_counting=False since region-based matching doesn't use traditional TP semantics
+ return map_instance_labels(unmatched_instance_pair.copy(), instance_labelmap, use_traditional_counting=False)
def _get_gt_regions(self, gt: np.ndarray) -> Tuple[np.ndarray, int]:
"""
diff --git a/panoptica/panoptica_evaluator.py b/panoptica/panoptica_evaluator.py
index 6b2e2cf..2c8feee 100644
--- a/panoptica/panoptica_evaluator.py
+++ b/panoptica/panoptica_evaluator.py
@@ -442,10 +442,8 @@ def panoptic_evaluate(
else instance_metadata["original_num_refs"]
)
- # For region-based matching, set TP to NaN since it doesn't use traditional counting
+ # Use tp from processing_pair (already handles NaN for non-traditional counting methods)
tp_value = processing_pair.tp
- if instance_matcher.__class__.__name__ == "RegionBasedMatching":
- tp_value = np.nan
processing_pair = PanopticaResult(
reference_arr=processing_pair.reference_arr,
diff --git a/panoptica/utils/processing_pair.py b/panoptica/utils/processing_pair.py
index fed0b99..a061269 100644
--- a/panoptica/utils/processing_pair.py
+++ b/panoptica/utils/processing_pair.py
@@ -322,11 +322,13 @@ class MatchedInstancePair(_ProcessingPairInstanced):
missed_reference_labels (list[int]): Reference labels with no matching prediction.
missed_prediction_labels (list[int]): Prediction labels with no matching reference.
matched_instances (list[int]): Labels matched between prediction and reference arrays.
+ use_traditional_counting (bool): Whether this matching uses traditional TP/FP/FN semantics.
"""
missed_reference_labels: list[int]
missed_prediction_labels: list[int]
matched_instances: list[int]
+ use_traditional_counting: bool
def __init__(
self,
@@ -337,6 +339,7 @@ class MatchedInstancePair(_ProcessingPairInstanced):
matched_instances: list[int] | None = None,
n_prediction_instance: int | None = None,
n_reference_instance: int | None = None,
+ use_traditional_counting: bool = True,
) -> None:
"""Initializes a MatchedInstancePair
@@ -348,6 +351,7 @@ class MatchedInstancePair(_ProcessingPairInstanced):
matched_instances (int | None, optional): matched instances labels, i.e. unique matched labels in both maps. Defaults to None.
n_prediction_instance (int | None, optional): Number of prediction instances. Defaults to None.
n_reference_instance (int | None, optional): Number of reference instances. Defaults to None.
+ use_traditional_counting (bool, optional): Whether this matching uses traditional TP/FP/FN semantics. Defaults to True.
For each argument: If none, will calculate on initialization.
"""
@@ -360,6 +364,7 @@ class MatchedInstancePair(_ProcessingPairInstanced):
if matched_instances is None:
matched_instances = [i for i in self.pred_labels if i in self.ref_labels]
self.matched_instances = matched_instances
+ self.use_traditional_counting = use_traditional_counting
if missed_reference_labels is None:
missed_reference_labels = list(
@@ -389,6 +394,7 @@ class MatchedInstancePair(_ProcessingPairInstanced):
missed_reference_labels=self.missed_reference_labels,
missed_prediction_labels=self.missed_prediction_labels,
matched_instances=self.matched_instances,
+ use_traditional_counting=self.use_traditional_counting,
)
if np.isnan(res.tp): | ||
return np.nan | ||
return res.num_pred_instances - res.tp | ||
|
||
|
||
def fn(res: PanopticaResult): | ||
if np.isnan(res.tp): | ||
return np.nan | ||
return res.num_ref_instances - res.tp | ||
|
||
|
||
def prec(res: PanopticaResult): | ||
if np.isnan(res.tp): | ||
return np.nan | ||
return res.tp / (res.tp + res.fp) | ||
|
||
|
||
def rec(res: PanopticaResult): | ||
if np.isnan(res.tp): | ||
return np.nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a better solution for this? I feel like putting these if-clauses everywhere is not the best solution (especially when we have to think about long-term here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff --git a/panoptica/panoptica_result.py b/panoptica/panoptica_result.py
index aaaf560..b080ee7 100644
--- a/panoptica/panoptica_result.py
+++ b/panoptica/panoptica_result.py
@@ -105,6 +105,10 @@ class PanopticaResult(object):
default_value=tp,
was_calculated=True,
)
+
+ # Check if tp is NaN and handle dependent metrics centrally
+ self._tp_is_nan = np.isnan(tp) if isinstance(tp, float) else False
+
# endregion
#
# region Basic
@@ -112,36 +116,46 @@ class PanopticaResult(object):
self._add_metric(
"fp",
MetricType.MATCHING,
- fp,
+ fp if not self._tp_is_nan else None,
long_name="False Positives",
+ default_value=np.nan if self._tp_is_nan else None,
+ was_calculated=self._tp_is_nan,
)
self.fn: int | float # Allow NaN for region-based matching
self._add_metric(
"fn",
MetricType.MATCHING,
- fn,
+ fn if not self._tp_is_nan else None,
long_name="False Negatives",
+ default_value=np.nan if self._tp_is_nan else None,
+ was_calculated=self._tp_is_nan,
)
self.prec: int | float # Allow NaN for region-based matching
self._add_metric(
"prec",
MetricType.NO_PRINT,
- prec,
+ prec if not self._tp_is_nan else None,
long_name="Precision (positive predictive value)",
+ default_value=np.nan if self._tp_is_nan else None,
+ was_calculated=self._tp_is_nan,
)
self.rec: int | float # Allow NaN for region-based matching
self._add_metric(
"rec",
MetricType.NO_PRINT,
- rec,
+ rec if not self._tp_is_nan else None,
long_name="Recall (sensitivity)",
+ default_value=np.nan if self._tp_is_nan else None,
+ was_calculated=self._tp_is_nan,
)
self.rq: float
self._add_metric(
"rq",
MetricType.MATCHING,
- rq,
+ rq if not self._tp_is_nan else None,
long_name="Recognition Quality / F1-Score",
+ default_value=np.nan if self._tp_is_nan else None,
+ was_calculated=self._tp_is_nan,
)
# endregion
#
@@ -773,26 +787,18 @@ class PanopticaResult(object):
# region Basic
def fp(res: PanopticaResult):
- if np.isnan(res.tp):
- return np.nan
return res.num_pred_instances - res.tp
def fn(res: PanopticaResult):
- if np.isnan(res.tp):
- return np.nan
return res.num_ref_instances - res.tp
def prec(res: PanopticaResult):
- if np.isnan(res.tp):
- return np.nan
return res.tp / (res.tp + res.fp)
def rec(res: PanopticaResult):
- if np.isnan(res.tp):
- return np.nan
return res.tp / (res.tp + res.fn)
@@ -803,8 +809,6 @@ def rq(res: PanopticaResult):
Returns:
float: Recognition Quality (RQ).
"""
- if np.isnan(res.tp):
- return np.nan
if res.tp == 0:
return 0.0 if res.num_pred_instances + res.num_ref_instances > 0 else np.nan
return res.tp / (res.tp + 0.5 * res.fp + 0.5 * res.fn)
I defintely agree with that. Should've thought of that myself. How does this feel?
Great work here. I think we need to adress some points (see comments), but otherwise this is nice! @aymuos15 |
Kicking off the voronoi matcher.
I found it to be the easiest through a new matching scheme (which will be easier then to recursively put on more matching schemes on top of it).
Obviously open to change and discussion based on what everyone feels is right.
(apologies for the extra code, I branched off the wrong way -> They should disappear once the cupy PR is merged! Hope that is okay)