by Sriniketh Vangaru, Debswapna Bhattacharya
This repository contains the code and an overview of the results from our large-scale benchmarking of the predictive performance of various protein side-chain packing (PSCP) methods.
We generated predicted protein structures using AlphaFold2 and AlphaFold3, ran the PSCP methods on both the experimental backbone conformations and the AlphaFold-generated backbone coordinates, and evaluated their performance on a variety of metrics (described below). We further explored whether AlphaFold's self-assessment confidence scores can improve packing by implementing a confidence-aware integrative approach that performs a weighted combination of several PSCP methods. The scripts created for these studies are in the scripts folder.
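To make the idea of a confidence-weighted combination concrete, here is a minimal sketch for a single dihedral angle of a single residue. It is not the algorithm implemented in this repository: the weighting rule (AlphaFold's own angle gets weight pLDDT/100, the remainder is split evenly among the packers) and the function names `weighted_circular_mean_deg` and `combine_chi` are assumptions made purely for illustration. The one non-negotiable detail is that angles are averaged on the circle, so that 170° and -170° average to 180° rather than 0°.

```python
import math


def weighted_circular_mean_deg(angles, weights):
    """Weighted mean of angles in degrees, computed on the unit circle."""
    s = sum(w * math.sin(math.radians(a)) for a, w in zip(angles, weights))
    c = sum(w * math.cos(math.radians(a)) for a, w in zip(angles, weights))
    return math.degrees(math.atan2(s, c))


def combine_chi(af_chi, packer_chis, plddt):
    """Blend AlphaFold's chi angle with several packers' predictions.

    ASSUMPTION (illustrative only): the AlphaFold angle is weighted by
    plddt / 100 and the remaining weight is shared equally by the packers.
    """
    if not packer_chis:
        return af_chi
    w_af = max(0.0, min(1.0, plddt / 100.0))
    w_each = (1.0 - w_af) / len(packer_chis)
    angles = [af_chi] + list(packer_chis)
    weights = [w_af] + [w_each] * len(packer_chis)
    return weighted_circular_mean_deg(angles, weights)
```

With a rule like this, a residue with high confidence keeps an angle close to AlphaFold's own prediction, while a low-confidence residue is dominated by the consensus of the packing methods.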
We evaluated the PSCP methods on protein targets from the 14th and 15th rounds of the Critical Assessment of Structure Prediction (CASP) experiments. Specifically, we used 66 targets from CASP14 and 71 targets from CASP15.
The side-chains generated by the following methods were assessed, including multiple variations for some:
- AlphaFold2 and AlphaFold3. The side-chains were taken unchanged from the full protein structures that these tools output from their sequence-based prediction pipelines.
- FlowPacker
- PIPPack
- DiffPack
- AttnPacker
- DLPacker
- FASPR
- PyRosetta Packer. Here, we used the Packer tool from the PyRosetta Python library, very closely following the linked script created by PIPPack's developers.
- SCWRL4
We used a script built on AttnPacker's side-chain assessment library to compute several standard metrics for the side-chains predicted by the PSCP methods:
- Root Mean Square Deviation (RMSD): The square root of the mean squared distance between corresponding atoms of the predicted and native structures in 3-dimensional Euclidean space. We report it over all residues and separately for core and surface residues.
- Dihedral Angle Mean Absolute Error (χ-MAE): The average angular error for each of the first four side-chain dihedral angles across all the residues in a given target.
- Recovery Rate (RR) for Rotamers: The percentage of amino acid residues in a target for which all of the residue's side-chain dihedral angles (up to four, χ1 through χ4) are within 20° of the correct (native) value.
- Clash score: The number of pairs of atoms whose distance is less than a specific threshold. We used three thresholds (100%, 90%, and 80%), each a proportion of the sum of the two atoms' van der Waals (vdW) radii; this sum approximates the closest distance at which two non-bonded atoms can sit. The clash score helps describe how biophysically realistic a predicted structure is.
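The metric definitions above can be sketched in a few lines of dependency-free Python. This is only an illustration of the definitions, not the AttnPacker-based script used for the benchmark: the function names and input layouts are hypothetical, and it makes simplifying assumptions (a boundary of exactly 20° counts as recovered, symmetric side-chains are not given special treatment, and bonded pairs must be supplied by the caller).

```python
import math
from itertools import combinations


def angle_diff_deg(pred, native):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(pred - native) % 360.0
    return min(d, 360.0 - d)


def rmsd(pred_xyz, native_xyz):
    """Root mean square deviation over corresponding atom coordinates."""
    sq = [math.dist(p, n) ** 2 for p, n in zip(pred_xyz, native_xyz)]
    return math.sqrt(sum(sq) / len(sq))


def chi_mae(pred_chis, native_chis, chi_index):
    """Mean absolute error of one chi angle over residues that define it.

    Each residue is a list of chi angles in degrees; a shorter list means
    the residue has fewer side-chain dihedrals.
    """
    errors = [
        angle_diff_deg(p[chi_index], n[chi_index])
        for p, n in zip(pred_chis, native_chis)
        if chi_index < len(p) and chi_index < len(n)
    ]
    return sum(errors) / len(errors) if errors else float("nan")


def recovery_rate(pred_chis, native_chis, tol=20.0):
    """Percent of residues whose defined chi angles are all within tol."""
    scored = [(p, n) for p, n in zip(pred_chis, native_chis) if n]
    if not scored:
        return float("nan")
    hits = sum(
        all(angle_diff_deg(a, b) <= tol for a, b in zip(p, n))
        for p, n in scored
    )
    return 100.0 * hits / len(scored)


def count_clashes(atoms, fraction=1.0, bonded=frozenset()):
    """Count atom pairs closer than fraction * (sum of vdW radii).

    atoms is a list of (xyz, vdw_radius); bonded holds frozenset index
    pairs to skip (ASSUMPTION: the caller knows which atoms are bonded).
    """
    clashes = 0
    for i, j in combinations(range(len(atoms)), 2):
        if frozenset((i, j)) in bonded:
            continue
        (p, rp), (q, rq) = atoms[i], atoms[j]
        if math.dist(p, q) < fraction * (rp + rq):
            clashes += 1
    return clashes
```

Calling `count_clashes` with `fraction` set to 1.0, 0.9, and 0.8 corresponds to the three clash thresholds. The all-pairs loop is quadratic in the number of atoms, so a real implementation would typically use a spatial index instead.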
Each metric was calculated per target and then averaged across targets.
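As a small illustration of this aggregation, the sketch below averages per-target metric values so that every target contributes equally, regardless of its length. The function name `average_across_targets` and the nested-dictionary layout are hypothetical choices for this example, not the format used by the repository's scripts.

```python
from statistics import mean


def average_across_targets(per_target):
    """Average each metric over targets, giving every target equal weight.

    per_target maps a target ID to a dict of metric name -> value that was
    already computed for that target alone.
    """
    metric_names = sorted({m for vals in per_target.values() for m in vals})
    return {
        name: mean(vals[name] for vals in per_target.values() if name in vals)
        for name in metric_names
    }


# Made-up values for two hypothetical targets, for illustration only.
example = {
    "target_a": {"rmsd": 0.8, "rr": 60.0},
    "target_b": {"rmsd": 1.0, "rr": 50.0},
}
summary = average_across_targets(example)
```

Averaging per target first means that a long protein with many residues does not dominate the reported number, which differs from pooling all residues of all targets into a single average.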
Note: To see per-target results, simply click on the corresponding tool in the leftmost column.
Input Backbone / Method | RMSD All (Å) ↓ | RMSD Core (Å) ↓ | RMSD Surface (Å) ↓ | χ1 MAE (°) ↓ | χ2 MAE (°) ↓ | χ3 MAE (°) ↓ | χ4 MAE (°) ↓ | RR χ1-4 (%) ↑ | Clashes at 100% vdW (#) ↓ | Clashes at 90% vdW (#) ↓ | Clashes at 80% vdW (#) ↓ |
---|---|---|---|---|---|---|---|---|---|---|---|
Native | |||||||||||
FlowPacker | 0.80 | 0.40 | 1.01 | 23.02 | 25.82 | 46.09 | 52.80 | 57.1 | 102.0 | 21.7 | 6.4 |
PIPPack | 0.79 | 0.43 | 0.99 | 21.57 | 25.25 | 41.93 | 51.27 | 58.1 | 131.2 | 36.2 | 14.4 |
DiffPack | 0.79 | 0.41 | 0.98 | 22.92 | 25.23 | 46.97 | 55.33 | 57.6 | 104.2 | 26.8 | 9.8 |
AttnPacker | 0.79 | 0.44 | 0.98 | 24.19 | 28.79 | 48.34 | 50.37 | 51.3 | 84.6 | 22.8 | 8.1 |
DLPacker | 0.90 | 0.50 | 1.11 | 27.45 | 30.03 | 52.82 | 70.34 | 50.6 | 83.2 | 16.8 | 5.1 |
FASPR | 1.03 | 0.62 | 1.24 | 31.97 | 31.27 | 49.43 | 55.74 | 47.8 | 152.9 | 41.8 | 13.0 |
PyRosetta | 1.00 | 0.55 | 1.23 | 30.98 | 31.29 | 49.31 | 55.58 | 48.9 | 104.3 | 22.1 | 8.4 |
SCWRL4 | 1.04 | 0.61 | 1.26 | 32.22 | 31.65 | 50.21 | 55.10 | 47.5 | 158.3 | 40.2 | 11.8 |
AlphaFold2-Generated | |||||||||||
AlphaFold2 | 1.07 | 0.66 | 1.26 | 34.59 | 31.51 | 50.81 | 51.42 | 46.0 | 41.4 | 1.9 | 0.0 |
FlowPacker | 1.09 | 0.67 | 1.30 | 35.62 | 33.04 | 51.12 | 55.85 | 46.1 | 86.1 | 13.1 | 2.7 |
PIPPack | 1.10 | 0.68 | 1.30 | 35.71 | 33.06 | 51.07 | 54.55 | 45.1 | 102.3 | 20.6 | 6.5 |
DiffPack | 1.12 | 0.70 | 1.34 | 36.69 | 32.99 | 53.41 | 56.13 | 44.9 | 57.2 | 11.7 | 3.7 |
AttnPacker | 1.06 | 0.68 | 1.25 | 36.08 | 34.92 | 52.85 | 51.78 | 43.3 | 68.5 | 15.3 | 4.5 |
DLPacker | 1.11 | 0.70 | 1.33 | 36.28 | 35.19 | 57.87 | 72.25 | 42.9 | 67.7 | 11.0 | 2.1 |
FASPR | 1.18 | 0.75 | 1.38 | 38.94 | 34.70 | 53.43 | 55.79 | 41.9 | 121.1 | 27.0 | 5.7 |
PyRosetta | 1.16 | 0.72 | 1.38 | 38.16 | 35.19 | 52.84 | 55.31 | 42.7 | 73.9 | 7.7 | 1.2 |
SCWRL4 | 1.20 | 0.77 | 1.40 | 39.01 | 35.09 | 52.93 | 56.20 | 41.6 | 132.8 | 29.0 | 5.7 |
AlphaFold3-Generated | |||||||||||
AlphaFold3 | 1.04 | 0.64 | 1.25 | 34.14 | 30.35 | 49.42 | 50.31 | 47.4 | 45.8 | 5.2 | 0.7 |
FlowPacker | 1.07 | 0.66 | 1.29 | 34.85 | 31.91 | 51.16 | 54.77 | 47.3 | 79.0 | 11.6 | 2.5 |
PIPPack | 1.08 | 0.67 | 1.29 | 35.18 | 32.38 | 50.02 | 52.42 | 46.6 | 95.1 | 19.5 | 6.3 |
DiffPack | 1.11 | 0.69 | 1.33 | 36.43 | 32.66 | 51.65 | 56.08 | 45.8 | 56.0 | 12.1 | 4.0 |
AttnPacker | 1.04 | 0.66 | 1.24 | 35.26 | 34.11 | 51.57 | 51.90 | 43.9 | 62.4 | 14.3 | 4.3 |
DLPacker | 1.10 | 0.69 | 1.31 | 36.28 | 34.13 | 56.91 | 72.26 | 43.3 | 64.5 | 10.3 | 2.2 |
FASPR | 1.16 | 0.76 | 1.36 | 38.06 | 33.48 | 52.29 | 53.81 | 43.2 | 115.3 | 25.6 | 5.7 |
PyRosetta | 1.15 | 0.71 | 1.37 | 37.66 | 34.51 | 52.23 | 54.16 | 43.4 | 69.3 | 7.1 | 1.4 |
SCWRL4 | 1.17 | 0.77 | 1.37 | 37.69 | 34.12 | 51.83 | 57.18 | 43.1 | 127.7 | 28.1 | 6.4 |

Input Backbone / Method | RMSD All (Å) ↓ | RMSD Core (Å) ↓ | RMSD Surface (Å) ↓ | χ1 MAE (°) ↓ | χ2 MAE (°) ↓ | χ3 MAE (°) ↓ | χ4 MAE (°) ↓ | RR χ1-4 (%) ↑ | Clashes at 100% vdW (#) ↓ | Clashes at 90% vdW (#) ↓ | Clashes at 80% vdW (#) ↓ |
---|---|---|---|---|---|---|---|---|---|---|---|
Native | |||||||||||
FlowPacker | 0.69 | 0.33 | 0.90 | 18.99 | 22.04 | 40.93 | 52.62 | 66.4 | 100.8 | 14.6 | 3.3 |
PIPPack | 0.70 | 0.34 | 0.91 | 18.27 | 22.16 | 40.21 | 53.36 | 66.1 | 129.0 | 30.5 | 10.9 |
DiffPack | 0.68 | 0.34 | 0.87 | 18.29 | 22.47 | 42.91 | 56.88 | 65.7 | 95.3 | 20.3 | 7.2 |
AttnPacker | 0.71 | 0.37 | 0.90 | 20.29 | 26.10 | 47.09 | 54.68 | 59.2 | 96.4 | 25.5 | 9.5 |
DLPacker | 0.76 | 0.38 | 0.97 | 21.88 | 26.29 | 50.86 | 67.53 | 59.5 | 89.4 | 14.0 | 3.2 |
FASPR | 0.92 | 0.52 | 1.14 | 27.12 | 29.07 | 50.39 | 59.05 | 55.8 | 160.5 | 37.4 | 9.7 |
PyRosetta | 0.87 | 0.43 | 1.12 | 25.84 | 27.57 | 47.95 | 55.32 | 58.0 | 98.5 | 13.5 | 3.1 |
SCWRL4 | 0.94 | 0.50 | 1.17 | 27.89 | 29.12 | 49.81 | 57.25 | 55.5 | 168.3 | 36.3 | 7.7 |
AlphaFold2-Generated | |||||||||||
AlphaFold2 | 0.90 | 0.58 | 1.11 | 28.05 | 27.90 | 48.04 | 55.00 | 53.9 | 48.2 | 2.0 | 0.0 |
FlowPacker | 0.94 | 0.59 | 1.15 | 29.38 | 29.18 | 50.18 | 57.00 | 55.1 | 100.4 | 13.9 | 2.3 |
PIPPack | 0.96 | 0.61 | 1.19 | 30.08 | 29.89 | 50.22 | 56.32 | 53.5 | 124.5 | 27.3 | 9.4 |
DiffPack | 0.99 | 0.63 | 1.21 | 31.07 | 30.10 | 51.90 | 56.39 | 52.8 | 69.7 | 13.7 | 4.5 |
AttnPacker | 0.92 | 0.61 | 1.12 | 30.34 | 31.48 | 52.10 | 55.52 | 50.4 | 87.9 | 23.6 | 7.7 |
DLPacker | 0.97 | 0.62 | 1.18 | 30.94 | 31.05 | 55.90 | 69.34 | 51.3 | 84.2 | 12.9 | 2.7 |
FASPR | 1.06 | 0.71 | 1.27 | 33.67 | 32.53 | 53.56 | 59.75 | 50.0 | 147.0 | 32.5 | 7.9 |
PyRosetta | 1.04 | 0.67 | 1.26 | 32.78 | 32.67 | 51.42 | 58.65 | 51.0 | 91.6 | 10.7 | 2.4 |
SCWRL4 | 1.07 | 0.71 | 1.29 | 33.95 | 32.73 | 53.35 | 61.55 | 50.0 | 160.7 | 35.0 | 7.5 |
AlphaFold3-Generated | |||||||||||
AlphaFold3 | 0.95 | 0.60 | 1.16 | 30.18 | 28.90 | 48.92 | 53.94 | 53.8 | 58.4 | 8.1 | 1.0 |
FlowPacker | 0.98 | 0.62 | 1.19 | 31.12 | 30.44 | 50.88 | 57.02 | 53.7 | 92.5 | 13.3 | 2.5 |
PIPPack | 1.00 | 0.64 | 1.22 | 31.57 | 30.65 | 49.81 | 57.75 | 52.7 | 115.7 | 26.5 | 9.0 |
DiffPack | 1.02 | 0.66 | 1.23 | 32.44 | 31.03 | 52.27 | 61.20 | 51.6 | 70.8 | 14.5 | 4.6 |
AttnPacker | 0.96 | 0.64 | 1.15 | 31.52 | 32.98 | 53.16 | 55.48 | 50.1 | 83.4 | 22.5 | 7.2 |
DLPacker | 1.01 | 0.65 | 1.22 | 32.59 | 32.39 | 55.95 | 72.09 | 50.1 | 77.3 | 12.9 | 3.0 |
FASPR | 1.08 | 0.72 | 1.29 | 34.11 | 32.03 | 53.99 | 60.17 | 49.7 | 133.2 | 28.6 | 6.6 |
PyRosetta | 1.06 | 0.70 | 1.28 | 33.66 | 32.78 | 52.45 | 58.19 | 50.5 | 81.7 | 8.3 | 1.7 |
SCWRL4 | 1.10 | 0.73 | 1.31 | 34.70 | 32.80 | 52.67 | 60.59 | 49.4 | 147.9 | 31.7 | 7.3 |