
Use rdkit for SSSR and RCs (bug fix + Python upgrade) #2796


Draft: wants to merge 2 commits into main

JacksonBurns
Contributor



@JacksonBurns marked this pull request as draft on May 25, 2025 at 21:43
We currently use `RingDecomposerLib` to find the Smallest Set of Smallest Rings (SSSR) and to get the Relevant Cycles (RCs). This package does not support Python 3.10+ and is therefore blocking further upgrades to RMG.

@KnathanM in particular is looking to get RMG to Python 3.11 in order to add support for ChemProp v2.

I believe we can use RDKit for these operations instead. The original paper mentions that the functionality was being moved upstream to RDKit. With the help of AI I've taken a first pass at reimplementing it, with these notes:
 - I opted to use the symmetrized SSSR (RDKit's `GetSymmSSSR`) in place of the 'true' SSSR, because the latter is non-unique (see [RDKit's "The SSSR Problem"](https://www.rdkit.org/docs/GettingStartedInPython.html#the-sssr-problem)). This should also resolve #2562.
 - I still need to read more about the "Relevant Cycles".
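
The non-uniqueness mentioned in the first note can be seen without any chemistry toolkit. The sketch below is illustrative only and is not code from this PR: it builds cubane's carbon skeleton as a cube graph, counts its four-membered rings, and compares that count with the cycle rank (bonds - atoms + 1 for a connected graph), which is the number of rings any SSSR contains. The names `atoms`, `bonds`, and `four_membered_rings` are made up for the example.

```python
from itertools import combinations

# Cubane's carbon skeleton as a cube graph: each atom is a 3-bit corner label,
# and two corners are bonded when their labels differ in exactly one bit.
atoms = range(8)
bonds = {
    frozenset((a, b))
    for a, b in combinations(atoms, 2)
    if bin(a ^ b).count("1") == 1
}


def four_membered_rings(atoms, bonds):
    """Return the 4-cycles of the graph, each as a frozenset of atom indices.

    A 4-cycle a-b-c-d-a exists whenever the opposite atoms a and c share
    two neighbors b and d. Identifying a ring by its atom set is enough
    for the cube, where no four atoms carry more than one 4-cycle.
    """
    neighbors = {a: {b for b in atoms if frozenset((a, b)) in bonds} for a in atoms}
    rings = set()
    for a, c in combinations(atoms, 2):
        for b, d in combinations(sorted(neighbors[a] & neighbors[c]), 2):
            rings.add(frozenset((a, b, c, d)))
    return rings


rings = four_membered_rings(atoms, bonds)
cycle_rank = len(bonds) - len(atoms) + 1  # valid because the graph is connected

print(f"four-membered rings: {len(rings)}, rings in any SSSR: {cycle_rank}")
```

The cube has six equivalent faces but an SSSR holds only five rings, so any 'true' SSSR has to leave one face out arbitrarily. A symmetrized set keeps all six, which is why its result does not depend on atom ordering.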

This PR will stay a draft for now, since it depends on Python 3.9 support being available (which it nearly is in #2741).
@JacksonBurns
Contributor Author

Cantera 2.6 isn't available for Python 3.12, so this PR will also need to upgrade Cantera to version 3, which is mostly done in #2751.

@JacksonBurns
Contributor Author

A note on the path forward for this PR: the `get_relevant_cycles` and `get_smallest_set_of_smallest_rings` functions will need to move to `Molecule`. They currently live in `Graph`, and a `Graph` has no way to convert itself to RDKit so that RDKit's `GetSymmSSSR` can be applied to it. A `Molecule`, however, can be converted via `converter.to_rdkit_mol` (also exposed as `Molecule.to_rdkit_mol`), after which RDKit can be used. This may have implications for inheritance elsewhere in the codebase, but should be OK.
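
As a rough sketch of what that move could look like, assuming RDKit is installed: the `Graph` and `Molecule` classes below are hypothetical, heavily simplified stand-ins and not RMG's real classes. The stand-in `Molecule` is built from a SMILES string purely for brevity, and it assumes that vertex `i` corresponds to RDKit atom index `i`. Only `Chem.MolFromSmiles`, `GetNumAtoms`, and `Chem.GetSymmSSSR` are actual RDKit calls.

```python
from rdkit import Chem


class Graph:
    """Hypothetical stand-in: holds vertices only and has no route to RDKit."""

    def __init__(self, vertices):
        self.vertices = list(vertices)


class Molecule(Graph):
    """Hypothetical stand-in for a molecule that can produce an RDKit Mol."""

    def __init__(self, smiles):
        self._rdmol = Chem.MolFromSmiles(smiles)
        # Assumption for this sketch: vertex i maps to RDKit atom index i.
        super().__init__(range(self._rdmol.GetNumAtoms()))

    def to_rdkit_mol(self):
        return self._rdmol

    def get_smallest_set_of_smallest_rings(self):
        """Return RDKit's symmetrized SSSR as lists of this molecule's vertices."""
        rings = Chem.GetSymmSSSR(self.to_rdkit_mol())
        return [[self.vertices[i] for i in ring] for ring in rings]


naphthalene = Molecule("c1ccc2ccccc2c1")
ring_sizes = sorted(len(r) for r in naphthalene.get_smallest_set_of_smallest_rings())
print(ring_sizes)
```

In a real implementation the conversion would also need a reliable mapping between the molecule's own atoms and RDKit atom indices, so that the returned rings refer to the original atoms; the sketch sidesteps this by using indices directly.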

Base automatically changed from feat/py39_rebase to main May 29, 2025 19:55