Commit 1157374

Merge pull request #187 from OpenFreeEnergy/charge_tutorial
CLI charging update
2 parents cf41975 + 7f7e045 commit 1157374

3 files changed

+173
-7
lines changed

cli_tutorials/cli_charge_molecules.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
1+
# Generating Partial Charges with the OpenFE CLI
2+
3+
It is recommended to use a single set of charges for each ligand to ensure reproducibility between repeats or consistent
4+
charges between different legs of a calculation involving the same ligand, such as a relative binding affinity calculation (see [Osato et al.](https://chemrxiv.org/engage/chemrxiv/article-details/67579833085116a133e39d86)).
5+
As such, both the `plan-rbfe-network` and `plan-rhfe-network` commands calculate partial charges for the ligands, making it expensive
6+
to run multiple network mappings while finding the optimal one for the resources available.
7+
8+
Here we present a CLI tool to do this ahead of time, reducing overheads and further improving reproducibility.
9+
This tutorial will show you how to use the OpenFE CLI command `charge-molecules` to generate and store partial charges for a series of ligands
10+
into an SDF file which can be used with OpenFE protocols.
11+
12+
## Charging Molecules
13+
14+
The `charge-molecules` command allows you to generate partial charges for a series of small molecules saved in SDF or MOL2
15+
format using the `am1bcc` method calculated using `ambertools`:
16+
17+
```bash
18+
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf
19+
```
20+
21+
This will result in a new SDF file `charged_tyk2_ligands.sdf` which contains the same ligands and their partial charges
22+
stored in a new SD tag like so:
23+
24+
```text
25+
lig_ejm_42
26+
RDKit 3D
27+
28+
35 36 0 0 0 0 0 0 0 0999 V2000
29+
-4.7651 -2.8327 -16.5085 H 0 0 0 0 0 0 0 0 0 0 0 0
30+
-5.3566 -3.6931 -16.2274 C 0 0 0 0 0 0 0 0 0 0 0 0
31+
-4.7703 -4.9699 -16.2000 C 0 0 0 0 0 0 0 0 0 0 0 0
32+
[continues]
33+
28 31 1 0
34+
M END
35+
36+
> <ofe-name>
37+
lig_ejm_42
38+
39+
> <atom.dprop.PartialCharge>
40+
0.14794282857142857 -0.096057171428571425 -0.12905717142857143 ...
41+
[continues]
42+
```
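
The stored charges can be read back with nothing more than the Python standard library. The sketch below is a minimal illustration, not part of OpenFE or RDKit: it assumes the `atom.dprop.PartialCharge` tag holds whitespace-separated values in atom-block order, and `read_sd_tag`, `partial_charges`, and the hand-written `TOY_RECORD` fragment (with made-up charge values) are hypothetical names and data used only for this example.

```python
def read_sd_tag(sdf_record: str, tag: str) -> list[str]:
    """Return the data lines stored under the `> <tag>` header of one SDF record."""
    data_lines: list[str] = []
    collecting = False
    for line in sdf_record.splitlines():
        if collecting:
            # A blank line terminates an SD data item.
            if not line.strip():
                break
            data_lines.append(line)
        elif line.startswith(">") and f"<{tag}>" in line:
            collecting = True
    return data_lines


def partial_charges(sdf_record: str) -> list[float]:
    """Parse the per-atom charges (assumed to be in atom-block order)."""
    data = read_sd_tag(sdf_record, "atom.dprop.PartialCharge")
    return [float(value) for value in " ".join(data).split()]


# Hypothetical fragment holding only the data items of a record;
# the charge values are invented for illustration.
TOY_RECORD = """\
> <ofe-name>
toy_ligand

> <atom.dprop.PartialCharge>
-0.8 0.4 0.4

$$$$
"""

charges = partial_charges(TOY_RECORD)
print(charges)
print("net charge:", round(sum(charges), 6))
```

Summing the parsed values is a quick sanity check that the net charge matches the expected formal charge of the ligand.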
43+
44+
Generating partial charges with the `am1bcc` method can be slow, as it requires a semi-empirical quantum chemical calculation.
45+
We can, however, take advantage of multiprocessing to calculate the charges for each ligand in parallel, which offers a
46+
significant speed-up. The number of processors available for the workflow can be specified using the `-n` flag. For
47+
example, to spread the calculation over 4 cores:
48+
49+
```bash
50+
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf -n 4
51+
```
52+
53+
## Customizing the Charge Method
54+
55+
A wide range of partial charge generation methods is available, with `am1bcc`-based schemes being the most commonly
56+
used with OpenFF force fields. The choice of charge scheme can be easily customised by providing a settings file in `.yaml` format.
57+
For example, to recreate the current default settings in the workflow, you would do the following:
58+
59+
1. Provide a file like `settings.yaml` with the desired settings:
60+
61+
```yaml
62+
partial_charge:
63+
method: am1bcc
64+
settings:
65+
off_toolkit_backend: ambertools
66+
```
67+
68+
2. Charge the ligands with an additional `-s` flag for passing the settings:
69+
70+
```bash
71+
openfe charge-molecules -M tyk2_ligands.sdf -o charged_tyk2_ligands.sdf -n 4 -s settings.yaml
72+
```
73+
74+
3. The output of the CLI program will now reflect the changes made:
75+
76+
```text
77+
SMALL MOLECULE PARTIAL CHARGE GENERATOR
78+
_________________________________________
79+
80+
Parsing in Files:
81+
Got input:
82+
Small Molecules: SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_ejm_48) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_jmc_28)
83+
Using Options:
84+
Partial Charge Generation: am1bcc
85+
```
86+
87+
The full range of partial charge settings can be found in the snippet below; note that some may require installing extra packages.
88+
89+
```yaml
90+
partial_charge:
91+
method: am1bcc
92+
# method: am1bccelf10
93+
# method: espaloma
94+
# method: nagl
95+
settings:
96+
off_toolkit_backend: ambertools
97+
# off_toolkit_backend: openeye # required for the am1bccelf10 method
98+
number_of_conformers: null # null specifies the use of the input conformer, a value requests that a new conformer be generated
99+
# nagl_model: null # null specifies the use of the latest nagl model
100+
```
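
The one constraint noted in the snippet above (`am1bccelf10` needs the `openeye` backend) can be checked before launching a long charging run. The following is a hedged sketch that works on an already-loaded settings dictionary; `check_partial_charge_settings` and `KNOWN_METHODS` are hypothetical names, the fallback values simply mirror the defaults described in this tutorial, and none of this is OpenFE's own validation logic.

```python
# Methods listed in the settings snippet of this tutorial.
KNOWN_METHODS = {"am1bcc", "am1bccelf10", "espaloma", "nagl"}


def check_partial_charge_settings(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    section = config.get("partial_charge") or {}
    # Assumed defaults, taken from the tutorial text: am1bcc with ambertools.
    method = section.get("method", "am1bcc")
    backend = (section.get("settings") or {}).get("off_toolkit_backend", "ambertools")

    problems = []
    if method not in KNOWN_METHODS:
        problems.append(f"unknown method: {method!r}")
    if method == "am1bccelf10" and backend != "openeye":
        problems.append("am1bccelf10 requires off_toolkit_backend: openeye")
    return problems


# Dictionary equivalent of a settings.yaml that selects am1bccelf10
# but leaves the backend at ambertools.
settings = {
    "partial_charge": {
        "method": "am1bccelf10",
        "settings": {"off_toolkit_backend": "ambertools"},
    }
}
print(check_partial_charge_settings(settings))
```

In practice the dictionary would come from parsing the `.yaml` file with a YAML library of your choice.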
101+
102+
## Overwriting Charges
103+
104+
By default, the `charge-molecules` command will only assign partial charges to ligands which **do not** already have charges.
105+
This behaviour can be changed via the `--overwrite-charges` flag, which will assign new charges using the specified settings.
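
The skip-unless-overwritten behaviour described above can be pictured with a small toy model. This is only a sketch under stated assumptions: molecules are represented as plain dictionaries, and `assign_missing_charges` and `zero_charges` are hypothetical stand-ins, not OpenFE functions or the actual implementation behind the flag.

```python
def assign_missing_charges(molecules, compute_charges, overwrite=False):
    """Return new molecule dicts, charging only those without charges unless overwrite is set."""
    charged = []
    for mol in molecules:
        if overwrite or mol.get("charges") is None:
            # Copy the dict so the input molecules are left untouched.
            mol = {**mol, "charges": compute_charges(mol)}
        charged.append(mol)
    return charged


def zero_charges(mol):
    """Placeholder charge model: every atom gets 0.0 (illustration only)."""
    return [0.0] * mol["n_atoms"]


ligands = [
    {"name": "lig_a", "n_atoms": 3, "charges": [-0.8, 0.4, 0.4]},  # already charged
    {"name": "lig_b", "n_atoms": 2, "charges": None},              # needs charges
]

default_run = assign_missing_charges(ligands, zero_charges)
forced_run = assign_missing_charges(ligands, zero_charges, overwrite=True)

print([m["charges"] for m in default_run])  # lig_a keeps its existing charges
print([m["charges"] for m in forced_run])   # both ligands are recharged
```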

rbfe_tutorial/cli_tutorial.md

Lines changed: 31 additions & 6 deletions
@@ -50,9 +50,21 @@ we do the following:
5050
- Instruct `openfe` to output files into a directory called `network_setup`
5151
with the `-o network_setup` option.
5252

53-
Planning the campaign may take some time, as it tries to find the best
54-
network from all possible transformations. This will create a directory called
55-
`network_setup/`, which is structured like this:
53+
Planning the campaign may take some time due to the complex series of tasks involved:
54+
55+
- partial charges are generated for each of the ligands to ensure reproducibility; by default this requires a semi-empirical quantum
56+
chemical calculation to calculate `am1bcc` charges
57+
- atom mappings are created and scored based on the perceived difficulty for all possible ligand pairs
58+
- an optimal network is extracted from all possible pairwise transformations which balances edge redundancy and the total difficulty score of the network
59+
60+
Partial charge generation can take advantage of multiprocessing, which offers a significant speed-up; you can specify
61+
the number of processors available using the `-n` flag:
62+
63+
```bash
64+
openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -n 4
65+
```
66+
67+
This will result in a directory called `network_setup/`, which is structured like this:
5668

5769
<!-- top lines from `tree network_setup` -->
5870

@@ -87,7 +99,7 @@ The files that describe each individual simulation we will run are located withi
8799
leg to run and contains all the necessary information to run that leg.
88100
Filenames indicate ligand names as taken from the SDF; for example, the file
89101
`easy_rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json` is the leg
90-
associated with the tranformation of the ligand `lig_ejm_31` into `lig_ejm_42`
102+
associated with the transformation of the ligand `lig_ejm_31` into `lig_ejm_42`
91103
while in complex with the protein.
92104

93105
A single RBFE between a pair of ligands requires running two legs of an alchemical cycle (JSON files):
@@ -112,8 +124,9 @@ OpenFE contains many different options and methods for setting up a simulation c
112124
The options can be easily accessed and modified by providing a settings
113125
file in the `.yaml` format.
114126
Let's assume you want to exchange the LOMAP atom mapper with the Kartograf
115-
atom mapper and the Minimal Spanning Tree
116-
Network Planner with the Maximal Network Planner, then you could do the following:
127+
atom mapper, the Minimal Spanning Tree
128+
Network Planner with the Maximal Network Planner, and the `am1bcc` charge method with the `am1bccelf10` version from OpenEye,
129+
then you could do the following:
117130

118131
1. provide a file like `settings.yaml` with the desired changes:
119132

@@ -123,6 +136,11 @@ mapper:
123136

124137
network:
125138
method: generate_maximal_network
139+
140+
partial_charge:
141+
method: am1bccelf10
142+
settings:
143+
off_toolkit_backend: openeye
126144
```
127145
128146
2. Plan your rbfe network with an additional `-s` flag for passing the settings:
@@ -148,6 +166,7 @@ Using Options:
148166
Mapper: <kartograf.atom_mapper.KartografAtomMapper object at 0x7fea079de790>
149167
Mapping Scorer: <function default_lomap_score at 0x7fea1b423d80>
150168
Networker: functools.partial(<function generate_maximal_network at 0x7fea18371260>)
169+
Partial Charge Generation: am1bccelf10
151170
```
152171

153172
That concludes the straightforward process of tailoring your OpenFE setup to your specifications.
@@ -166,6 +185,12 @@ network:
166185
# method: generate_radial_network
167186
# method: generate_maximal_network
168187
# method: generate_minimal_redundant_network
188+
189+
partial_charge:
190+
method: am1bcc
191+
# method: am1bccelf10
192+
# settings:
193+
# off_toolkit_backend: openeye # required for the am1bccelf10 method
169194
```
170195

171196
**Customize away!**

rbfe_tutorial/python_tutorial.ipynb

Lines changed: 37 additions & 1 deletion
@@ -47,6 +47,42 @@
4747
"ligands = [openfe.SmallMoleculeComponent.from_rdkit(mol) for mol in supp]"
4848
]
4949
},
50+
{
51+
"cell_type": "markdown",
52+
"id": "8e5de19a",
53+
"metadata": {},
54+
"source": [
55+
"## Charging the ligands\n",
56+
"\n",
57+
"It is recommended to use a single set of charges for each ligand to ensure reproducibility between repeats or consistent charges between different legs of a calculation involving the same ligand, like a relative binding affinity calculation for example. \n",
58+
"\n",
59+
"Here we will use some utility functions from OpenFE which can assign partial charges to a series of molecules with a variety of methods which can be configured via the `OpenFFPartialChargeSettings` class. In this example \n",
60+
"we will charge the ligands using the `am1bcc` method from `ambertools` which is the default charge scheme used by OpenFE."
61+
]
62+
},
63+
{
64+
"cell_type": "code",
65+
"execution_count": null,
66+
"id": "5219106c",
67+
"metadata": {},
68+
"outputs": [],
69+
"source": [
70+
"from openfe.protocols.openmm_utils.omm_settings import OpenFFPartialChargeSettings\n",
71+
"from openfe.protocols.openmm_utils.charge_generation import bulk_assign_partial_charges\n",
72+
"\n",
73+
"charge_settings = OpenFFPartialChargeSettings(partial_charge_method=\"am1bcc\", off_toolkit_backend=\"ambertools\")\n",
74+
"\n",
75+
"charged_ligands = bulk_assign_partial_charges(\n",
76+
" molecules=ligands,\n",
77+
" overwrite=False, \n",
78+
" method=charge_settings.partial_charge_method,\n",
79+
" toolkit_backend=charge_settings.off_toolkit_backend,\n",
80+
" generate_n_conformers=charge_settings.number_of_conformers,\n",
81+
" nagl_model=charge_settings.nagl_model,\n",
82+
" processors=1\n",
83+
")"
84+
]
85+
},
5086
{
5187
"cell_type": "markdown",
5288
"id": "6963be83",
@@ -93,7 +129,7 @@
93129
"outputs": [],
94130
"source": [
95131
"ligand_network = network_planner(\n",
96-
" ligands=ligands,\n",
132+
" ligands=charged_ligands,\n",
97133
" mappers=[mapper],\n",
98134
" scorer=scorer\n",
99135
")"
