Skip to content

Conversation

@anoushkajain
Copy link

Fix #324

@stephprince stephprince added this to the 0.6.4 milestone Jun 27, 2025
Copy link
Contributor

@stephprince stephprince left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @anoushkajain!

I think this looks good but there are a couple more pieces to add before merging.

  1. We may need to expand the check to other array-like dataset objects that could be present in NWB files that haven’t been written to disk yet. I've added a suggestion to the code related to this.
  2. Can you add tests including passing and failing cases with the different types of datasets.
  3. Can you add a new best practice section to the documentation recommending that objects with empty datasets should not be included in the file.

Also in the future, feel free to @ me or request a review whenever you are ready for a PR to be looked at!

Inspector messages for each empty dataset found, or None if no empty datasets are found.
"""
for field_name, field in getattr(nwb_container, "fields", dict()).items():
if not isinstance(field, (h5py.Dataset, zarr.Array)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may also want to check for other empty array-like objects here, including numpy arrays and DataIO objects. Could use something like:

Suggested change
if not isinstance(field, (h5py.Dataset, zarr.Array)):
is_array_data = hasattr(value, "shape") and hasattr(value, "dtype")
if not is_array_data:

continue

# Check if the dataset has zero elements
if field.size == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it will be necessary but for the DataIO objects you may need to use get_data_shape from the nwbinspector.utils module.

return None


@register_check(importance=Importance.CRITICAL, neurodata_type=NWBContainer)
Copy link
Contributor

@stephprince stephprince Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure yet whether this should be a critical check or a best practice violation. There was a discussion here about whether empty datasets should be allowed: NeurodataWithoutBorders/pynwb#2065.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Add Check]: Datasets are not empty

2 participants