Skip to content

Can't construct surface rmgpy.molecule.Molecule() #2603

@sevyharris

Description

@sevyharris

Bug Description

I'm trying to create a rmgpy.molecule.Molecule() using smiles

rmgpy.molecule.Molecule(smiles='O=C=[X]')

But it fails from an RDKit error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_18284/1886713532.py in <module>
----> 1 rmgpy.molecule.Molecule(smiles='O=C=[X]')
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.__init__()
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._read()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._rdkit_translator()
ValueError: Could not interpret the identifier 'O=C=[X]'

Possible Solution

I'm wondering if we can get around RDKit's limitations with the following procedure:

  • replace the surface sites X with a nonreactive species like Ar
  • generate that molecule's adjacency list
  • replace Ar with X and delete extra pairs of electrons on the X

It would look something like adding this to the _rdkit_translator() function in RMG-Py/rmgpy/molecule/translator.py:

from rdkit import Chem
import rmgpy.molecule

COX_smiles = 'O=C=[X]'

rdkitmol = Chem.MolFromSmiles(COX_smiles.replace('X', 'Ar'))
mol = rmgpy.molecule.molecule.Molecule()
output = rmgpy.molecule.converter.from_rdkit_mol(mol, rdkitmol)

lines = output.to_adjacency_list().split('\n')

for i, line in enumerate(lines):
    if 'Ar' in line:
        lines[i] = lines[i].replace('Ar', 'X')
        # remove any extra electron pairs...
        lines[i] = lines[i].replace('p3', 'p0')
        lines[i] = lines[i].replace('p2', 'p0')
        lines[i] = lines[i].replace('p1', 'p0')
adj_list = '\n'.join(lines)
COX = rmgpy.molecule.molecule.Molecule().from_adjacency_list(adj_list)

print(COX.smiles)
print(COX.to_adjacency_list())

Other Possible Solution

One workaround is to just instantiate the molecule using the adjacency list. But sometimes that's annoying- especially since we can use smiles for gas phase molecules without any issue. At the very least, a more descriptive error message is probably required, telling the user that you have to use the adjacency list.

Metadata

Metadata

Assignees

Labels

Topic: Catalysisstalestale issue/PR as determined by actions bot

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions