Skip to content

Conversation

@jiajic
Copy link
Member

@jiajic jiajic commented May 2, 2025

  • Bumps {GiottoClass} to v0.5.0

Changes

  • harmonization of params in calculateOverlap() and overlapToMatrix()

New

  • aggregateFeatures() wrapper for running calculateOverlap() and overlapToMatrix()
  • overlapPointDT() and overlapIntensityDT() classes to store overlaps relationships efficiently and help with aggregation pipeline

Enhancements

  • aggregateFeatures() now has a method param to switch between "raster" and "vector" overlapping methods. The default method remains as "raster" for now, but may change in the future.

bug fixes

  • overlaps() will now properly find image overlaps

- Also adds some tests for aggregation and overlaps objects

Summary by CodeRabbit

  • New Features

    • Introduced the aggregateFeatures function for streamlined spatial feature aggregation.
    • Added new data classes for overlap information: overlapPointDT and overlapIntensityDT.
    • New methods to convert overlap objects to data frames and matrices, and to display and combine them.
  • Improvements

    • Harmonized and updated parameters across overlap calculation and aggregation functions for consistency.
    • Enhanced subsetting, extraction, and display capabilities for overlap data.
    • Improved documentation for new and updated functions and methods.
  • Bug Fixes

    • Corrected overlap detection in image overlaps.
  • Tests

    • Added comprehensive tests for aggregation and overlap functionalities.
  • Chores

    • Updated package version and minimum R version requirement.

 hotfix: createGiottoPolygons() and CI changes
feat: composable data processing framework
- new: `aggregateFeatures()` gobject function
- param harmonizations:
    - `spatial_info` -> `spat_info` in `calculateOverlap()`
    - `poly_info` -> `spat_info` in `overlapToMatrix()`
    - `feat_subset_ids` -> `feat_subset_values` in `calculateOverlap()`
    - `count_info_column` -> `feat_count_column` in `calculateOverlap()` and `overlapToMatrix()`
    - `aggr_function` -> `fun` in `overlapToMatrtix()`

- Deprecated functions:
    - `calculateOverlapRaster()`
    - `overlapImageToMatrix()`
also should support qptiff
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lintr found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@coderabbitai
Copy link

coderabbitai bot commented Jun 16, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update introduces a new aggregation workflow for spatial features in the Giotto framework. It adds new classes (overlapPointDT, overlapIntensityDT), a high-level function (aggregateFeatures), and supporting methods for overlap calculation, subsetting, coercion, and display. Documentation, tests, and method signatures are updated for consistency and clarity.

Changes

Files/Group Change Summary
DESCRIPTION Version bump to 0.5.0; R dependency raised to 4.4.1.
NAMESPACE Registers new S3 method as.data.frame for overlapPointDT; exports aggregateFeatures.
NEWS.md Documents new features: parameter harmonization, new wrapper, new classes, and bug fix.
R/aggregate.R, man/aggregateFeatures.Rd Adds aggregateFeatures function and documentation; refactors overlap and aggregation workflow; standardizes parameters; deprecates old methods.
R/classes.R Adds new S4 classes for overlap info; updates object validation and conversion logic.
R/methods-coerce.R, man/as.data.table.Rd, man/as.matrix.Rd Adds coercion methods for new classes (as.data.frame, as.matrix); documents new methods.
R/methods-dims.R, man/dims-generic.Rd Adds dim methods for new overlap classes; updates documentation.
R/methods-extract.R Adds and refactors subsetting methods for overlap classes and helper functions.
R/methods-initialize.R Removes legacy feature/spatial info validation from initialization.
R/methods-overlaps.R Refines overlap retrieval logic for polygons.
R/methods-rbind.R, man/rbind-generic.Rd Adds rbind2 and rbind methods for overlapPointDT; documents new method.
R/methods-show.R Adds show methods for new overlap classes.
R/save_load.R Removes saving/loading of overlaps for v0.5.0+ objects.
R/slot_show.R, man/dot-abbrev_mat.Rd Simplifies internal documentation for matrix abbreviation; removes Rd file.
R/subset.R Refactors overlap subsetting using helper functions.
man/calculateOverlap.Rd, man/calculateOverlapRaster.Rd Updates documentation for parameter renaming and deprecation.
man/overlapToMatrix.Rd Documents new methods for overlap classes; standardizes argument names.
tests/testthat/test-aggregate.R Adds comprehensive tests for new aggregation and overlap features.
tests/testthat/test-create_mini_vizgen.R Updates tests for new argument names and output classes.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Giotto
    participant aggregateFeatures
    participant calculateOverlap
    participant overlapToMatrix

    User->>Giotto: Call aggregateFeatures(...)
    Giotto->>aggregateFeatures: aggregateFeatures(...)
    aggregateFeatures->>calculateOverlap: Calculate overlaps (points or images)
    calculateOverlap-->>aggregateFeatures: Overlap object (overlapPointDT/overlapIntensityDT)
    aggregateFeatures->>overlapToMatrix: Summarize overlaps (as matrix/exprObj)
    overlapToMatrix-->>aggregateFeatures: Aggregated matrix or exprObj
    aggregateFeatures-->>Giotto: Return updated Giotto object or exprObj
Loading

Poem

In a patchwork field of spatial delight,
New overlaps hop into view, crisp and bright.
With features and polygons, points and intensity,
Aggregation now flows with elegant consistency.
The rabbit leaps—code refactored, tests anew—
0.5.0 hops forward, bringing clarity through and through!
🐇✨

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jiajic
Copy link
Member Author

jiajic commented Jun 16, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Jun 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

♻️ Duplicate comments (1)
man/overlapToMatrix.Rd (1)

66-72: Same inconsistency for overlapIntensityDT

Mirror the fix/proviso applied to overlapPointDT.

🧹 Nitpick comments (23)
man/as.data.table.Rd (1)

8-25: Mixed documentation for as.data.table() vs as.data.frame() can confuse users

This Rd file is titled/documented for as.data.table, yet the new alias/usage block registers as.data.frame.overlapPointDT.

  1. If the intent is to export both coercions, create a separate Rd topic for as.data.frame so each generic is discoverable via ?as.data.frame.
  2. At minimum, add the matching alias & usage for as.data.table.overlapPointDT (and the symmetric overlapIntensityDT) to keep the file internally consistent.
NEWS.md (1)

3-14: Minor repetition & style tweaks for the changelog

Static analysis flagged a double “wrapper for” wording. While harmless, a quick edit smooths the prose:

- - `aggregateFeatures()` wrapper for running `calculateOverlap()` and `overlapToMatrix()`
+ - `aggregateFeatures()` – a wrapper around `calculateOverlap()` and `overlapToMatrix()`
R/methods-rbind.R (2)

118-120: Index created only for poly

setindex(x@data, "poly") is helpful, but look-ups on feat_id_index are equally common. Adding a secondary index keeps both access paths O(log n) with negligible cost.

data.table::setindex(x@data, c("poly", "feat_id_index"))

146-153: Variadic rbind leaks in-place mutation to first argument

The variadic wrapper ultimately calls rbind2(xs[[1L]], …) which mutates and returns the first object, surprising callers who expect a new object. Consider cloning the first operand (data.table::copy() or methods::copy) before passing it to rbind2.

R/methods-show.R (1)

670-678: features count should derive from feat_ids, not nfeats

nfeats is mutable during rbind2 and may be inaccurate (see previous comment).
Displaying length(object@feat_ids) avoids stale numbers.

-    cat(sprintf("* features : %d\n", object@nfeats))
+    cat(sprintf("* features : %d\n", length(object@feat_ids)))
man/as.matrix.Rd (1)

26-29: Please add default value & example for the new feat_count_column argument

The parameter is introduced in the docs, but:

  1. The default value (NULL) is not explicitly mentioned in the \usage{} block.
  2. No example shows how the argument affects the output.

That makes it harder for users to discover how to use the feature.
A short example similar to the tests in test-aggregate.R would fix this quickly.

tests/testthat/test-aggregate.R (1)

134-135: Floating-point comparison should set a tolerance

expect_equal(m[1,10], 536529.94) may fail on alternate BLAS / OS combinations due to tiny FP drift.

- expect_equal(m[1,10], 536529.94)
+ expect_equal(m[1,10], 536529.94, tolerance = 1e-6)
R/combine_metadata.R (1)

490-492: Inefficient filter converts the whole points set to data.table first

getFeatureInfo() returns a SpatVector → converted to DT via .spatvector_to_dt(pts) after sub-setting would avoid materialising unnecessary rows/columns.

Consider:

idx <- pts$feat_ID_uniq %in% feat_overlap@data$feat
feat_overlap_info <- .spatvector_to_dt(pts[idx, ])
R/methods-coerce.R (1)

153-161: as.data.frame.overlapPointDT can silently mis-label columns

poly_ID/feat_ID are generated with vector indexed look-ups. If any index is NA or out-of-range you get NA without warning.
Add basic bounds checking or at least stopifnot(all(!is.na(poly_ID), …)) to avoid hard-to-trace downstream errors.

man/aggregateFeatures.Rd (1)

44-45: Typo: “expresssion”

expresssion has three “s”.
Replace with expression to avoid grep/build warnings.

man/calculateOverlap.Rd (1)

17-18: spat_info argument undocumented

spat_info was added to the usage but is missing from the \arguments section (only deprecated spatial_info is listed).
Please add a proper description and mark spatial_info as deprecated to keep roxygen checks happy.

R/classes.R (1)

1647-1654: Potential memory blow-up when converting SpatVector

terra::as.data.frame(x) materialises the whole vector; for large datasets this can explode RAM.
Consider streaming rows or using terra::values() with selected columns instead.

R/methods-extract.R (4)

1122-1135: Potential implicit recycling in poly subsetting

x@overlaps[[feat]][i, ids = FALSE] relies on gIndex dispatch, but no guard exists for negative/zero indices or out-of-range values.
Consider validating i with checkmate::assert_integerish(i, lower = 1, any.missing = FALSE) (after resolving logical/character) to catch user mistakes early.


1178-1191: Logical recycling may give unexpected duplicates

x@data <- x@data[i, ] after expanding a recycled logical vector can duplicate rows if i is longer than nrow(x@data). Usually users expect simple recycling, not row replication. Consider using i <- rep_len(i, nr)[seq_len(nr)] to avoid this.


1283-1325: API surprise: ids = TRUE returns vectors not objects

Overloading [ so that ids = TRUE changes the return type from object → vector violates the usual S4 subsetting contract and will trip up users (class(x[i]) now depends on a flag). Strongly advise introducing a dedicated accessor (e.g., polys() / features()) or exposing this through separate functions to keep [ for structural subsetting only.


1327-1364: Helper could skip extra allocation

.select_overlap_point_dt_i/j() repeatedly uses match() and builds temporary vectors. These are on the hot path during aggregation. Caching a lookup table or using keyed joins (setkey) would cut runtime for large datasets.

R/aggregate.R (1)

84-93: Unimported helper operators & setters trigger R CMD check warnings

Functions/operators like %null%, spatUnit<-, setGiotto, prov<-, featType<-, and deprecated() are used but not imported in this file, generating “no visible global function definition” notes.
Declare them in NAMESPACE via @importFrom or @import to keep the package CRAN-clean.

man/overlapToMatrix.Rd (6)

9-10: Alias section should reflect full S4 method signatures

The new overlapPointDT and overlapIntensityDT aliases are correctly added.
Double-check that their corresponding roxygen blocks include @aliases overlapToMatrix,overlapPointDT-method overlapToMatrix,overlapIntensityDT-method; otherwise the next roxygen2::roxygenise() run will drop these lines.


16-26: Parameter order / deprecated args doc drift

spat_info, feat_count_column, and fun were inserted before the ... placeholder, while the three deprecated arguments remain after it.
For consistency with the rest of the package (and to avoid mismatched usage examples), consider moving all deprecated arguments to the very end of the signature and adding @deprecated tags in the roxygen source, e.g.:

- aggr_function = deprecated(),
- poly_info = deprecated(),
- count_info_column = deprecated(),
...
+ ...,
+ aggr_function = deprecated(),
+ poly_info     = deprecated(),
+ count_info_column = deprecated()

This keeps the public interface clean and reduces the risk that users think the old names are still first-class.


75-82: Argument description for x omits the two new classes

The description still reads “giotto object or SpatVector points or data.table”.
Please append overlapPointDT and overlapIntensityDT so help users discover the new workflow.


88-90: Clarify that fun can be a function object

Current text: “character. Function to aggregate …”.
In practice, fun can also be an actual function (e.g. mean). Spell this out to prevent misuse:

Function or single-character name of a function used to aggregate …


95-100: Mark deprecated parameters with @keyword internal (or similar)

Listing deprecated args inline clutters the rendered help.
Consider adding @keywords internal or a separate “Deprecated Arguments” section so they’re still searchable but visually separated.


110-112: sort description could note performance impact

Mixed sorting on large matrices can be costly. Add a short note such as “may be slower for very large matrices” so users understand the trade-off.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0ff9c8 and 01e16ab.

📒 Files selected for processing (28)
  • DESCRIPTION (2 hunks)
  • NAMESPACE (2 hunks)
  • NEWS.md (1 hunks)
  • R/aggregate.R (34 hunks)
  • R/classes.R (4 hunks)
  • R/combine_metadata.R (1 hunks)
  • R/methods-coerce.R (3 hunks)
  • R/methods-crop.R (1 hunks)
  • R/methods-dims.R (1 hunks)
  • R/methods-extract.R (2 hunks)
  • R/methods-initialize.R (0 hunks)
  • R/methods-overlaps.R (1 hunks)
  • R/methods-rbind.R (2 hunks)
  • R/methods-show.R (1 hunks)
  • R/save_load.R (1 hunks)
  • R/slot_show.R (1 hunks)
  • R/subset.R (3 hunks)
  • man/aggregateFeatures.Rd (1 hunks)
  • man/as.data.table.Rd (2 hunks)
  • man/as.matrix.Rd (2 hunks)
  • man/calculateOverlap.Rd (8 hunks)
  • man/calculateOverlapRaster.Rd (2 hunks)
  • man/dims-generic.Rd (2 hunks)
  • man/dot-abbrev_mat.Rd (0 hunks)
  • man/overlapToMatrix.Rd (2 hunks)
  • man/rbind-generic.Rd (2 hunks)
  • tests/testthat/test-aggregate.R (1 hunks)
  • tests/testthat/test-create_mini_vizgen.R (3 hunks)
💤 Files with no reviewable changes (2)
  • man/dot-abbrev_mat.Rd
  • R/methods-initialize.R
🧰 Additional context used
🪛 LanguageTool
NEWS.md

[duplication] ~9-~9: Possible typo: you repeated a word.
Context: ...rapper for running calculateOverlap() and overlapToMatrix() - overlapPointDT() and overlapIntensityDT() classes to store...

(ENGLISH_WORD_REPEAT_RULE)

🪛 GitHub Check: lintr
R/aggregate.R

[notice] 50-50:
Lines should not be more than 80 characters. This line is 81 characters.


[warning] 102-102:
no visible global function definition for '%null%'


[warning] 105-105:
no visible global function definition for '%null%'


[warning] 181-181:
no visible global function definition for 'spatUnit<-'


[warning] 183-183:
no visible global function definition for 'setGiotto'


[warning] 337-337:
no visible global function definition for 'deprecated'


[warning] 513-513:
no visible global function definition for 'deprecated'


[warning] 513-513:
no visible global function definition for 'deprecated'


[warning] 541-541:
no visible global function definition for 'prov<-'


[warning] 541-541:
no visible global function definition for 'spatUnit'


[warning] 542-542:
no visible global function definition for 'spatUnit<-'


[warning] 542-542:
no visible global function definition for 'spatUnit'


[warning] 543-543:
no visible global function definition for 'featType<-'


[warning] 595-595:
no visible global function definition for 'affine'


[warning] 639-639:
no visible global function definition for 'prov<-'


[warning] 639-639:
no visible global function definition for 'spatUnit'


[warning] 640-640:
no visible global function definition for 'spatUnit<-'


[warning] 640-640:
no visible global function definition for 'spatUnit'


[warning] 641-641:
no visible global function definition for 'featType<-'


[warning] 739-739:
no visible global function definition for 'deprecated'


[warning] 739-739:
no visible global function definition for 'deprecated'


[warning] 855-855:
no visible global function definition for 'deprecated'


[warning] 855-855:
no visible global function definition for 'deprecated'


[warning] 857-857:
no visible global function definition for 'deprecate_warn'


[warning] 912-912:
Use && in conditional expressions.


[notice] 955-955:
Hanging indent should be 37 spaces but is 8 spaces.


[warning] 982-982:
no visible global function definition for ':='


[warning] 985-985:
no visible global function definition for ':='


[warning] 989-989:
no visible global function definition for ':='


[warning] 992-992:
no visible global function definition for ':='

🔇 Additional comments (15)
DESCRIPTION (1)

3-3: Verify necessity of tighter version constraints

Bumping the package to 0.5.0 while simultaneously tightening the minimal R requirement from 4.4.0 → 4.4.1 may exclude users on the latest 4.4-series stable release (4.4.0) without an obvious technical reason.

Please double-check that (a) the new code genuinely needs ≥ 4.4.1, and (b) the NEWS / changelog clearly explains the rationale so downstream maintainers aren’t surprised.

Also applies to: 29-29

NAMESPACE (1)

32-32: Export confirmation for aggregateFeatures

Export looks good and avoids name collisions with existing generics.

man/rbind-generic.Rd (1)

9-10: Documentation addition is consistent

The new aliases correctly expose the rbind2,overlapPointDT,… method. No further action needed.

Also applies to: 20-21

R/slot_show.R (1)

1107-1110: Internal helper now documented inline – OK

Replacing the roxygen header with simple comments is fine for a non-exported utility; it prevents an unnecessary Rd file without impacting users.

man/dims-generic.Rd (1)

27-28: Alias & usage entries look consistent for the new overlap classes

dim() is now documented for both overlapPointDT and overlapIntensityDT, and the usage stubs are present. No further action required.

Also applies to: 75-77

R/methods-crop.R (1)

229-237: Crop now delegates overlap filtering to .subset_overlaps_poly – validate side-effects

Good move to centralise overlap subsetting. Please double-check that .subset_overlaps_poly()

  1. updates both point- and intensity-based overlap objects for the removed polygons,
  2. preserves the class (overlapPointDT, overlapIntensityDT) after subsetting so downstream coercion & dim() methods keep working.

No code change required; this is a verification reminder.

R/methods-rbind.R (1)

93-122: Missing duplicate-ID guard

Unlike the existing cellMetaObj/spatLocsObj methods, this method doesn’t call .check_id_dups().
If two overlapPointDT objects contain the same poly/feat combination the result will silently contain duplicates. Consider re-using .check_id_dups() (or an equivalent) before merging.

man/calculateOverlapRaster.Rd (1)

14-20: Documentation looks coherent with code changes

Parameter renaming and deprecation notes are consistent; no issues spotted.

Also applies to: 36-40, 45-47

tests/testthat/test-create_mini_vizgen.R (3)

144-145: Argument rename reflected correctly

Switch from largeImages to images aligns with the API; good catch.


162-176: calculateOverlap() calls: missing return_gobject?

The earlier calculateOverlapRaster() defaulted to return_gobject = TRUE.
If the new function preserved that default this is fine; otherwise the test will fail. Please verify.


203-213: Parameter rename in overlapToMatrix() verified

The switch from poly_info to spat_info is correctly reflected in the test code.

R/methods-extract.R (2)

1062-1064: Good refactor – but double-check class coercion

Calling .subset_overlaps_poly() instead of hand-rolled filtering greatly clarifies intent.
Just verify that every legacy object arriving in x@overlaps is updated to the new overlap*DT classes before this call; otherwise dispatch on [ may fall back to the default method and silently drop the ids = FALSE flag.


1067-1112: NULL-safety for intensity branch

names(x@overlaps$intensity) is accessed unguarded.
When x@overlaps$intensity is NULL (not just length-0 list) this will raise an error.
Add an early return or wrap with is.null() to avoid surprising failures.

-    is_int_feat <- feat_type %in% names(x@overlaps$intensity)
+    is_int_feat <- !is.null(x@overlaps$intensity) &&
+        feat_type %in% names(x@overlaps$intensity)

Likely an incorrect or invalid review comment.

R/aggregate.R (1)

2075-2085: Missing namespace for mixedsort

.mixedsort_rowcols() calls mixedsort() without qualification; unless gtools (or equivalent) is attached this will
error at runtime. Add an explicit import or use gtools::mixedsort.

- match(mixedsort(mat_r_names), mat_r_names),
+ match(gtools::mixedsort(mat_r_names), mat_r_names),

and add @importFrom gtools mixedsort to the roxygen header or NAMESPACE.

man/overlapToMatrix.Rd (1)

50-55: data.table method: missing feat_count_column & sort args

The data.table S4 method now exposes fun, but does not expose feat_count_column (added elsewhere) or sort.
If these arguments are intentionally unsupported for plain data.tables, please document that explicitly; otherwise add them to avoid an inconsistent API surface.

S3method(.DollarNames,spatEnrObj)
S3method(.DollarNames,spatLocsObj)
S3method(.DollarNames,terraVectData)
S3method(as.data.frame,overlapPointDT)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Missing symmetrical as.data.frame for overlapIntensityDT

You registered as.data.frame.overlapPointDT, but its sibling class overlapIntensityDT did not get the same treatment. Down-stream code that relies on as.data.frame() for both overlap classes will break.

+S3method(as.data.frame,overlapIntensityDT)

Add the roxygen tag to the method implementation and re-document.

🤖 Prompt for AI Agents
In the NAMESPACE file at line 11, you registered the S3 method as.data.frame for
overlapPointDT but missed doing the same for overlapIntensityDT. To fix this,
add a corresponding S3method(as.data.frame, overlapIntensityDT) line to the
NAMESPACE file. Also, in the R script where the as.data.frame.overlapIntensityDT
method is defined, add the appropriate roxygen2 @method tag for as.data.frame
and overlapIntensityDT, then re-document the package to update the NAMESPACE
accordingly.

Comment on lines +149 to +155
#' @rdname dims-generic
#' @export
setMethod("dim", signature("overlapPointDT"), function(x) dim(x@data))

#' @rdname dims-generic
#' @export
setMethod("dim", signature("overlapIntensityDT"), function(x) dim(x@data))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add nrow/ncol for the new overlap classes for API parity

You introduced dim() methods but left out nrow() and ncol(). Many client-side calls use the simpler helpers and will now fall back to dim() + indexing, which is less efficient and could surprise users.

+#' @rdname dims-generic
+#' @export
+setMethod("nrow", signature("overlapPointDT"), function(x) nrow(x@data))
+
+#' @rdname dims-generic
+#' @export
+setMethod("ncol", signature("overlapPointDT"), function(x) ncol(x@data))
+
+#' @rdname dims-generic
+#' @export
+setMethod("nrow", signature("overlapIntensityDT"), function(x) nrow(x@data))
+
+#' @rdname dims-generic
+#' @export
+setMethod("ncol", signature("overlapIntensityDT"), function(x) ncol(x@data))

Implementing these keeps the public surface consistent with other Giotto classes.

🤖 Prompt for AI Agents
In R/methods-dims.R around lines 149 to 155, you defined dim() methods for
overlapPointDT and overlapIntensityDT classes but did not implement
corresponding nrow() and ncol() methods. To maintain API consistency and improve
efficiency, add nrow() and ncol() methods for these classes that directly return
the number of rows and columns of the underlying data slot (x@data), similar to
how dim() is implemented.

Comment on lines +100 to +114
x_spat_map <- match(x@spat_ids, comb_spat)
y_spat_map <- match(y@spat_ids, comb_spat)
x_feat_map <- match(x@feat_ids, comb_feat)
y_feat_map <- match(y@feat_ids, comb_feat)

# replace id dictionaries for x (output object)
x@spat_ids <- comb_spat
x@feat_ids <- comb_feat

# remap indices
x@data[, poly := x_spat_map[poly]]
y@data[, poly := y_spat_map[poly]]
x@data[, feat_id_index := x_feat_map[feat_id_index]]
y@data[, feat_id_index := y_feat_map[feat_id_index]]

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

feat column/key is left inconsistent after re-indexing

You remap feat_id_index but leave the character column feat untouched, then set a data-table key on it:

data.table::setkeyv(x@data, "feat")

If the two objects had different ordering this key will no longer match the new indices, breaking joins and look-ups.
Either update the feat column with comb_feat[feat_id_index] after re-mapping, or set the key on feat_id_index.

🤖 Prompt for AI Agents
In R/methods-rbind.R around lines 100 to 114, after remapping the feat_id_index
column, the feat character column remains unchanged, causing inconsistency when
setting a data.table key on feat. To fix this, update the feat column by
assigning comb_feat indexed by the updated feat_id_index (i.e., feat :=
comb_feat[feat_id_index]) before setting the key, ensuring the feat column
aligns with the new indices and maintains correct joins and look-ups.

Comment on lines +93 to +122
#' @rdname rbind-generic
#' @export
setMethod("rbind2", signature("overlapPointDT", "overlapPointDT"),
function(x, y, ...) {
comb_spat <- unique(c(x@spat_ids, y@spat_ids))
comb_feat <- unique(c(x@feat_ids, y@feat_ids))

x_spat_map <- match(x@spat_ids, comb_spat)
y_spat_map <- match(y@spat_ids, comb_spat)
x_feat_map <- match(x@feat_ids, comb_feat)
y_feat_map <- match(y@feat_ids, comb_feat)

# replace id dictionaries for x (output object)
x@spat_ids <- comb_spat
x@feat_ids <- comb_feat

# remap indices
x@data[, poly := x_spat_map[poly]]
y@data[, poly := y_spat_map[poly]]
x@data[, feat_id_index := x_feat_map[feat_id_index]]
y@data[, feat_id_index := y_feat_map[feat_id_index]]

x@data <- rbind(x@data, y@data)
x@nfeats <- x@nfeats + y@nfeats

data.table::setkeyv(x@data, "feat")
data.table::setindex(x@data, "poly")
x
}
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

nfeats is incremented instead of recomputed ‒ causes double-counting when feature sets overlap

x@nfeats <- x@nfeats + y@nfeats assumes that the two objects have disjoint feature dictionaries.
When x and y share features the count is inflated (e.g. 3 + 3 = 6 while the merged set has only 5).
Recompute from the merged dictionary instead:

-    x@nfeats <- x@nfeats + y@nfeats
+    x@nfeats <- length(comb_feat)
🤖 Prompt for AI Agents
In R/methods-rbind.R around lines 93 to 122, the code increments x@nfeats by
y@nfeats, which causes double-counting when feature sets overlap. Instead of
adding, recompute x@nfeats as the length of the combined feature dictionary
(comb_feat) after merging to accurately reflect the unique feature count.

Comment on lines +1138 to +1176
.subset_overlap_point_dt_i <- function(x, i) {
if (is.numeric(i) || is.logical(i)) {
i <- x@spat_ids[i]
}

poly <- NULL # NSE vars
idx <- match(i, x@spat_ids) # poly indices to keep
idx <- idx[!is.na(idx)] # drop unmatched NAs
x@spat_ids <- x@spat_ids[x@spat_ids %in% i] # replace spatial ids

x@data <- x@data[poly %in% idx]
x@data[, poly := match(poly, idx)]
data.table::setkeyv(x@data, "feat")
data.table::setindex(x@data, "poly")
x
}

.subset_overlap_point_dt_j <- function(x, j) {
# ---- convert j to numerical index ---- #
if (is.logical(j)) {
if (length(j) != length(x@feat_ids)) {
# recycle logical if needed
j <- rep(j, length.out = length(x@feat_ids))
}
j <- which(j)
}
if (is.character(j)) {
j <- match(j, x@feat_ids)
}

x@feat_ids <- x@feat_ids[j] # replace feature ids

# subset on feat_id_index matches
x@data <- x@data[feat_id_index %in% j]
x@data[, feat_id_index := match(feat_id_index, j)]
data.table::setkeyv(x@data, "feat")
data.table::setindex(x@data, "poly")
x
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Negative index semantics are lost

.subset_overlap_point_dt_i() treats numeric i as positive positions only; negative indices (common in R to drop rows) will silently mis-select. Either:

  1. Explicitly forbid negatives with an assertion, or
  2. Implement the standard R rule (setdiff(seq_len(n), abs(i))).

Fail-fast behaviour is preferable to silent mis-subset.

🤖 Prompt for AI Agents
In R/methods-extract.R around lines 1138 to 1176, the function
.subset_overlap_point_dt_i() does not handle negative numeric indices correctly,
which in R are used to exclude elements. To fix this, add a check for negative
numeric indices in the input i. If negatives are present, either raise an error
to forbid them or convert them to positive indices by excluding the absolute
values from the full sequence of x@spat_ids indices using
setdiff(seq_along(x@spat_ids), abs(i)). This ensures the function respects
standard R negative indexing semantics and avoids silent mis-selection.

Comment on lines +1193 to +1206
.subset_overlap_intensity_dt_j <- function(x, j) {
# convert j to char col reference
if (is.numeric(j)) {
j <- j + 1L
} else if (is.logical(j)) {
j <- c(TRUE, j)
}
if (!is.character(j)) {
j <- colnames(x@data)[j]
}
j <- unique(c("poly_ID", j))
x@data <- x@data[, .SD, .SDcols = j]
x
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Numeric column index offset is fragile

Shifting numeric j by +1L assumes poly_ID is always the first column. If column order changes (e.g., after setcolorder) this will select the wrong data. Safer: look up columns by name once and compute the offset dynamically.

-if (is.numeric(j)) {
-    j <- j + 1L
+if (is.numeric(j)) {
+    all_cols <- colnames(x@data)
+    j <- j[ j > 0 & j <= length(all_cols) ]
+    j <- unique(c(match("poly_ID", all_cols), j))
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.subset_overlap_intensity_dt_j <- function(x, j) {
# convert j to char col reference
if (is.numeric(j)) {
j <- j + 1L
} else if (is.logical(j)) {
j <- c(TRUE, j)
}
if (!is.character(j)) {
j <- colnames(x@data)[j]
}
j <- unique(c("poly_ID", j))
x@data <- x@data[, .SD, .SDcols = j]
x
}
.subset_overlap_intensity_dt_j <- function(x, j) {
# convert j to char col reference
if (is.numeric(j)) {
all_cols <- colnames(x@data)
j <- j[j > 0 & j <= length(all_cols)]
j <- unique(c(match("poly_ID", all_cols), j))
} else if (is.logical(j)) {
j <- c(TRUE, j)
}
if (!is.character(j)) {
j <- colnames(x@data)[j]
}
j <- unique(c("poly_ID", j))
x@data <- x@data[, .SD, .SDcols = j]
x
}
🤖 Prompt for AI Agents
In R/methods-extract.R around lines 1193 to 1206, the code shifts numeric column
index j by +1L assuming "poly_ID" is always the first column, which is fragile
if column order changes. To fix this, dynamically find the position of "poly_ID"
in colnames(x@data) and adjust j accordingly instead of hardcoding +1L. This
ensures correct column selection regardless of column order changes.

Comment on lines +794 to +804
.calculate_overlap_vector <- function(spatvec, pointvec, keep = NULL) {
checkmate::assert_character(keep, null.ok = TRUE)
res <- terra::extract(spatvec, pointvec)
cn <- colnames(res)
if (all(c("id.y", "poly_ID") %in% cn)) {
res_keep <- c("id.y", "poly_ID")
} else {
res_keep <- cn[c(1L, 2L)]
}
res <- res[!is.na(res[[2]]), res_keep] # drop NAs (sparsify) + col select
if (!is.null(keep)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Fragile column selection in .calculate_overlap_vector

The code picks res_keep <- c("id.y","poly_ID") by position and falls back to the first two columns if names do not match.
terra::extract() does not guarantee column order across versions; a safer pattern is to subset by names and error-out when expected columns are absent, e.g.:

expected <- c("id", "poly_ID")
if (!all(expected %in% names(res)))
    stop(".calculate_overlap_vector: expected columns not found")
res <- res[!is.na(res$poly_ID), expected]

This avoids silent mis-alignment when terra changes.

🤖 Prompt for AI Agents
In R/aggregate.R around lines 794 to 804, the current code selects columns by
position which is fragile because terra::extract() does not guarantee column
order. To fix this, explicitly check for the presence of expected columns by
name (e.g., "id.y" and "poly_ID"), stop with an error if they are missing, and
subset the data frame by these column names instead of by position. This ensures
stable and correct column selection regardless of terra version changes.

Comment on lines +770 to 785
res <- switch(method,
"raster" = .calculate_overlap_raster(
spatvec = x,
pointvec = y,
count_info_column = feat_count_column,
verbose = verbose
),
"vector" = .calculate_overlap_vector(
spatvec = x,
pointvec = y,
keep = feat_count_column
)
)

.create_overlap_point_dt(x, y, res, feat_ids = feat_ids)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

feat_count_column silently dropped

calculateOverlap(SpatVector,SpatVector) adds feat_count_column to calculate_overlap_params, but when control reaches
.create_overlap_point_dt() the argument keep is left NULL, so the count column (and any other extras) is lost.

- .create_overlap_point_dt(x, y, res, feat_ids = feat_ids)
+ .create_overlap_point_dt(
+     x, y,
+     res,
+     keep = feat_count_column,   # preserve requested meta
+     feat_ids = feat_ids
+ )

Without this, later aggregation in overlapToMatrix() cannot use the counts.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
res <- switch(method,
"raster" = .calculate_overlap_raster(
spatvec = x,
pointvec = y,
count_info_column = feat_count_column,
verbose = verbose
),
"vector" = .calculate_overlap_vector(
spatvec = x,
pointvec = y,
keep = feat_count_column
)
)
.create_overlap_point_dt(x, y, res, feat_ids = feat_ids)
}
res <- switch(method,
"raster" = .calculate_overlap_raster(
spatvec = x,
pointvec = y,
count_info_column = feat_count_column,
verbose = verbose
),
"vector" = .calculate_overlap_vector(
spatvec = x,
pointvec = y,
keep = feat_count_column
)
)
.create_overlap_point_dt(
x, y,
res,
keep = feat_count_column, # preserve requested meta
feat_ids = feat_ids
)
}
🤖 Prompt for AI Agents
In R/aggregate.R around lines 770 to 785, the argument feat_count_column is
passed to the raster and vector overlap calculation functions but is not
forwarded to the .create_overlap_point_dt() call, causing the count column to be
lost. To fix this, pass feat_count_column as the keep argument in the
.create_overlap_point_dt() function call so that the count column and any
additional columns are preserved for later aggregation.

Comment on lines +57 to +65
\S4method{overlapToMatrix}{overlapPointDT}(
x,
name = "raw",
sort = TRUE,
feat_count_column = NULL,
output = c("Matrix", "exprObj"),
...
)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

overlapPointDT method: fun parameter absent but documented above

The generic argument list documents fun, yet the overlapPointDT signature omits it.
Either add fun here or note in the docs that aggregation is fixed for this class to avoid user confusion.

-\S4method{overlapToMatrix}{overlapPointDT}(
-  x,
-  name = "raw",
-  sort = TRUE,
-  feat_count_column = NULL,
-  output = c("Matrix", "exprObj"),
-  ...
-)
+\S4method{overlapToMatrix}{overlapPointDT}(
+  x,
+  name = "raw",
+  sort = TRUE,
+  feat_count_column = NULL,
+  fun = "sum",
+  output = c("Matrix", "exprObj"),
+  ...
+)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
\S4method{overlapToMatrix}{overlapPointDT}(
x,
name = "raw",
sort = TRUE,
feat_count_column = NULL,
output = c("Matrix", "exprObj"),
...
)
\S4method{overlapToMatrix}{overlapPointDT}(
x,
name = "raw",
sort = TRUE,
feat_count_column = NULL,
fun = "sum",
output = c("Matrix", "exprObj"),
...
)
🤖 Prompt for AI Agents
In man/overlapToMatrix.Rd around lines 57 to 65, the overlapPointDT method
signature is missing the 'fun' parameter that is documented in the generic
argument list. To fix this, either add the 'fun' parameter to the overlapPointDT
method signature to match the documentation or update the documentation to
clarify that aggregation is fixed for this class and 'fun' is not applicable,
preventing user confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant