Skip to content

NickAkhmetov/scFind #3667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 66 commits into from
May 5, 2025
Merged

NickAkhmetov/scFind #3667

merged 66 commits into from
May 5, 2025

Conversation

NickAkhmetov
Copy link
Collaborator

@NickAkhmetov NickAkhmetov commented Jan 15, 2025

Summary

This PR includes:

  • Updates to include hooks for the latest scFind endpoints
  • Improvements to storybook stories for scFind hooks to decrease repetition
  • Updates to the molecular/cellular queries page to decrease it to two steps
  • Updates to the homepage graph to display donor information
  • Updates to the entities tables to enable adding expandable content
  • Adds tracking where necessary

Remaining TODOs to take care of in follow-up branches:

  • Grouped/Stacked Bar Chart for the dataset overview
  • Gene search

Design Documentation/Original Tickets

https://hms-dbmi.atlassian.net/browse/CAT-915 and its subtasks

Testing

  • Unit tests for URL param construction
  • Storybook page with interfaces for interacting with hooks/scFind API
  • Unit tests for

Screenshots/Video

Expand me for screenshots

image

image

image

image

Checklist

  • Code follows the project's coding standards
    • Lint checks pass locally
    • New CHANGELOG-your-feature-name-here.md is present in the root directory, describing the change(s) in full sentences.
  • Unit tests covering the new feature have been added
  • All existing tests pass
  • Any relevant documentation in JIRA/Confluence has been updated to reflect the new feature
  • Any new functionalities have appropriate analytics functionalities added

Additional Notes

The js/api directory is a new addition working towards CAT-919, which involves reorganizing data fetching hooks into a consistent format.

The approach I've taken should minimize coupling between data fetching hooks and should encourage us to use them to compose other hooks, rather than integrating data fetching and processing into a single hook. In the long term, if we are able to switch the project from Webpack/CJS to Vite/ESM, this approach will enable more fine-grained code splitting.

@NickAkhmetov NickAkhmetov force-pushed the nickakhmetov/scfind-preliminary branch from de588cb to 0943a71 Compare February 12, 2025 16:36
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the scFind integration by adding new data fetching hooks and revising Storybook stories to use a unified StoryControlTemplate, while also introducing a new endpoint on the backend for aggregated datasets.

  • Updated Storybook stories to use the StoryControlTemplate
  • Added several new hooks for fetching scFind data (e.g., dataset genes, cell types, CLID-to-label)
  • Improved the backend routes for dataset counting

Reviewed Changes

Copilot reviewed 153 out of 153 changed files in this pull request and generated no comments.

Show a summary per file
File Description
context/app/static/js/api/scfind/useFindGeneSignatures.stories.tsx Replaced custom layout with StoryControlTemplate for consistency
context/app/static/js/api/scfind/useFindDatasetForGenes.ts Introduced new hook; note potential naming mismatch between file name and exported function
context/app/static/js/api/scfind/useFindDatasetForGenes.stories.tsx Updated story using new hook and StoryControlTemplate
context/app/static/js/api/scfind/useFindDatasetForCellTypes.ts New hook for cell types using SWR for parallel fetching
context/app/static/js/api/scfind/useFindCellTypeSpecificities.stories.tsx Updated argTypes for consistency with story controls
context/app/static/js/api/scfind/useEvaluateMarkers.stories.tsx Migrated story layout to use StoryControlTemplate
context/app/static/js/api/scfind/useCellTypeNames.ts Added memoized hooks for mapping cell types and organs with color support
context/app/static/js/api/scfind/useCellTypeNames.stories.tsx Adjusted story to use a streamlined template
context/app/static/js/api/scfind/useCellTypeMarkers.stories.tsx Revised story layout with StoryControlTemplate
context/app/static/js/api/scfind/useCellTypeCountForTissue.ts & stories New hook and story for tissue dataset counts
context/app/static/js/api/scfind/useCellTypeCountForDataset.ts & stories New hook for dataset counts; potential naming issue observed
context/app/static/js/api/scfind/useCLIDToLabel.ts & stories New hook for mapping CLID to label; potential naming issue observed
context/app/routes_cells.py Added backend route for total datasets count
CHANGELOG-scfind.md Updated changelog to reflect layout and functionality changes
Comments suppressed due to low confidence (3)

context/app/static/js/api/scfind/useFindDatasetForGenes.ts:23

  • The exported function name 'useCellTypeMarkers' does not match the file name 'useFindDatasetForGenes.ts'. Consider renaming the function to 'useFindDatasetForGenes' for clarity.
export default function useCellTypeMarkers(props: DatasetsForGenesParams) {

context/app/static/js/api/scfind/useCellTypeCountForDataset.ts:30

  • The exported function name 'useCellTypeMarkers' does not match the expected naming for dataset count functionality. Rename it to 'useCellTypeCountForDataset' to align with the file purpose.
export default function useCellTypeMarkers(props: CellTypeCountForDatasetParams) {

context/app/static/js/api/scfind/useCLIDToLabel.ts:25

  • The exported function 'useCellTypeMarkers' is inconsistent with the file name 'useCLIDToLabel.ts' and its intended functionality. Rename the function to 'useCLIDtoLabel' for better clarity and consistency.
export default function useCellTypeMarkers(props: CellTypeLabelsForCLIDParams) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of code related to the datasets overview chart that is currently on hold here - e.g. the nested queries and so on. I figured it'd be easier to leave that intact.

@NickAkhmetov NickAkhmetov requested a review from Copilot May 1, 2025 20:23
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates the scFind code and storybook stories to integrate new endpoints and improve UI components. Key changes include:

  • Introducing new hooks and story templates for gene signatures, dataset, cell type queries, and labels.
  • Updating endpoints and key generation functions to support scFind’s evolving API.
  • Minor backend additions (e.g., a new route for total datasets) and updates to the CHANGELOG.

Reviewed Changes

Copilot reviewed 157 out of 157 changed files in this pull request and generated no comments.

Show a summary per file
File Description
context/app/static/js/api/scfind/useFindGeneSignatures.stories.tsx Converted traditional Stack/Typographic layout to use StoryControlTemplate.
context/app/static/js/api/scfind/useFindDatasetForGenes.ts Introduced a new hook for finding datasets for genes with key generation.
context/app/static/js/api/scfind/useFindDatasetForGenes.stories.tsx Added corresponding story using StoryControlTemplate.
context/app/static/js/api/scfind/useFindDatasetForCellTypes.ts New hook for retrieving datasets for cell types using SWR multi-fetch.
context/app/static/js/api/scfind/useFindDatasetForCellTypes.stories.tsx Added a story to test the hook with controls.
context/app/static/js/api/scfind/useFindCellTypeSpecificities.stories.tsx Updated story argTypes and converted layout to StoryControlTemplate.
context/app/static/js/api/scfind/useEvaluateMarkers.stories.tsx Converted layout to StoryControlTemplate.
context/app/static/js/api/scfind/useCellTypeNames.ts Updated to use swr/immutable and added helper hooks using useMemo and scaleOrdinal.
context/app/static/js/api/scfind/useCellTypeNames.stories.tsx Updated to use StoryControlTemplate for a cleaner display.
context/app/static/js/api/scfind/useCellTypeMarkers.stories.tsx Switched to StoryControlTemplate for presenting markers.
context/app/static/js/api/scfind/useCellTypeCountForTissue.ts & stories Introduced a hook and story to fetch cell type counts per tissue.
context/app/static/js/api/scfind/useCellTypeCountForDataset.ts & stories Added a hook and story for counting cell types in a dataset.
context/app/static/js/api/scfind/useCLIDToLabel.ts & stories Added a hook and story for translating CLIDs to cell type labels.
context/app/static/js/api/scfind/StoryTemplate.tsx New re-usable component to standardize story display.
context/app/routes_cells.py New endpoint to return total dataset count.
CHANGELOG-scfind.md Documented major UI and API updates for scFind.
Comments suppressed due to low confidence (3)

context/app/static/js/api/scfind/useFindDatasetForGenes.ts:14

  • The function name 'createCellTypeMarkersKey' is used for constructing keys in the context of finding datasets for genes, which may be misleading. Consider renaming it to 'createFindDatasetForGenesKey' for clearer intent.
export function createCellTypeMarkersKey(

context/app/static/js/api/scfind/useCellTypeCountForDataset.ts:21

  • The function name 'createCellTypeMarkersKey' doesn't clearly reflect its purpose of generating a key for cell type counts in a dataset. Renaming it to 'createCellTypeCountForDatasetKey' would improve clarity.
export function createCellTypeMarkersKey(

context/app/static/js/api/scfind/useCLIDToLabel.ts:16

  • The use of the name 'createCellTypeMarkersKey' is confusing in the context of converting a CLID to a label. Consider renaming it to something like 'createCLIDToLabelKey' to better indicate its purpose.
export function createCellTypeMarkersKey(

@NickAkhmetov NickAkhmetov marked this pull request as ready for review May 2, 2025 18:11
john-conroy
john-conroy previously approved these changes May 5, 2025
Copy link
Collaborator

@john-conroy john-conroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick pass given the time constraint, I think we could benefit from some tests and breaking down some of the longer components with lots of logic in the jsx, but nothing blocking.

Comment on lines +62 to +74
const ORGAN_COLORS = [
'#F0F3EB', // success-90 in figma
'#FBEBF3', // primary-90 in figma
'#EAF0F8', // info-90 in figma
'#FBEEEB', // warning-90 in figma
];
export function useCellTypeOrgansColorMap() {
const organs = useCellTypeOrgans();
return scaleOrdinal({
domain: organs,
range: ORGAN_COLORS,
});
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use the theme hook to avoid hard coding the hex values?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely should as a best practice, I'll need to take a pass through the homepage hero as well to make that change. I'll make a quick ticket to keep this in mind.

Comment on lines +23 to +39
const unselectedCellColors = [
grey[300],
grey[400],
grey[500],
grey[600],
grey[700],
blueGrey[300],
blueGrey[400],
blueGrey[500],
blueGrey[600],
blueGrey[700],
teal[300],
teal[400],
teal[500],
teal[600],
teal[700],
];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably fine to hard code as it is?

Comment on lines 110 to 113
{
currentOffset: 0,
offsets: [] as number[],
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Comment on lines +116 to +122
aggs: {
donor_uuids: {
terms: {
field: 'donor.uuid.keyword',
size: 10000,
},
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify the intent here?

@NickAkhmetov NickAkhmetov merged commit 0064414 into main May 5, 2025
8 checks passed
@NickAkhmetov NickAkhmetov deleted the nickakhmetov/scfind-preliminary branch May 5, 2025 20:38
Copy link
Collaborator

@austenem austenem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also took a quick pass - this has been merged already, but wanted to leave these notes here for future reference. Great work!

)}
</>
);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be split into sub-components to improve readability somewhat

);
const total = data.cellTypeCounts.reduce((acc, result) => acc + result.count, 0);
return [counts, total] as const;
}, [data]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some shared logic between this useMemo and the one in the CrossModalityCellTypesChart - might be enough to extract that into a shared helper?

@@ -58,7 +56,7 @@ function CellsCharts({ uuid }: DatasetCellsChartsProps) {
<Skeleton />
)}
</StyledTypography>
</PaddedDiv>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the only place this style was being used

</TableRow>
</TableBody>
</Table>
</>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some mapping could be used here to avoid duplication - maybe on an object like:

const rows = [
  {
    label: 'Datasets',
    key: 'totalDatasets',
    showPercentage: true,
  },
  {
    label: 'Unique Donors',
    key: 'totalDonors',
    showPercentage: true,
  }, 
  ...

);

field.onChange(formattedValue);
}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be moved into a useCallback rather than being defined inline

}

// scfind does not provide a matching query, so we can do this locally
// TODO: move this logic up to the flask server and cache it there? (routes_cells)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's optional, I'm still debating whether it's necessary - everything else is already being fetched directly from the SCFind server, so I'm thinking we can nix that todo.

@NickAkhmetov NickAkhmetov mentioned this pull request May 8, 2025
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants