RDKit
Open-source cheminformatics and machine learning.
RDKit::DGeomHelpers Namespace Reference

Classes

struct  EmbedParameters
 Parameter object for controlling embedding.
 

Functions

RDKIT_DISTGEOMHELPERS_EXPORT void initBoundsMat (DistGeom::BoundsMatrix *mmat, double defaultMin=0.0, double defaultMax=1000.0)
 Set default upper and lower distance bounds in a distance matrix.
 
RDKIT_DISTGEOMHELPERS_EXPORT void initBoundsMat (DistGeom::BoundsMatPtr mmat, double defaultMin=0.0, double defaultMax=1000.0)
 
RDKIT_DISTGEOMHELPERS_EXPORT void setTopolBounds (const ROMol &mol, DistGeom::BoundsMatPtr mmat, bool set15bounds=true, bool scaleVDW=false)
 Set upper and lower distance bounds between atoms in a molecule based on topology.
 
RDKIT_DISTGEOMHELPERS_EXPORT void setTopolBounds (const ROMol &mol, DistGeom::BoundsMatPtr mmat, std::vector< std::pair< int, int >> &bonds, std::vector< std::vector< int >> &angles, bool set15bounds=true, bool scaleVDW=false)
 
RDKIT_DISTGEOMHELPERS_EXPORT void EmbedMultipleConfs (ROMol &mol, INT_VECT &res, unsigned int numConfs, const EmbedParameters &params)
 
INT_VECT EmbedMultipleConfs (ROMol &mol, unsigned int numConfs, const EmbedParameters &params)
 
int EmbedMolecule (ROMol &mol, const EmbedParameters &params)
 Compute an embedding (in 3D) for the specified molecule using Distance Geometry.
 
int EmbedMolecule (ROMol &mol, unsigned int maxIterations=0, int seed=-1, bool clearConfs=true, bool useRandomCoords=false, double boxSizeMult=2.0, bool randNegEig=true, unsigned int numZeroFail=1, const std::map< int, RDGeom::Point3D > *coordMap=0, double optimizerForceTol=1e-3, bool ignoreSmoothingFailures=false, bool enforceChirality=true, bool useExpTorsionAnglePrefs=false, bool useBasicKnowledge=false, bool verbose=false, double basinThresh=5.0, bool onlyHeavyAtomsForRMS=false)
 Compute an embedding (in 3D) for the specified molecule using Distance Geometry.
 
void EmbedMultipleConfs (ROMol &mol, INT_VECT &res, unsigned int numConfs=10, int numThreads=1, unsigned int maxIterations=30, int seed=-1, bool clearConfs=true, bool useRandomCoords=false, double boxSizeMult=2.0, bool randNegEig=true, unsigned int numZeroFail=1, double pruneRmsThresh=-1.0, const std::map< int, RDGeom::Point3D > *coordMap=0, double optimizerForceTol=1e-3, bool ignoreSmoothingFailures=false, bool enforceChirality=true, bool useExpTorsionAnglePrefs=false, bool useBasicKnowledge=false, bool verbose=false, double basinThresh=5.0, bool onlyHeavyAtomsForRMS=false)
 
INT_VECT EmbedMultipleConfs (ROMol &mol, unsigned int numConfs=10, unsigned int maxIterations=30, int seed=-1, bool clearConfs=true, bool useRandomCoords=false, double boxSizeMult=2.0, bool randNegEig=true, unsigned int numZeroFail=1, double pruneRmsThresh=-1.0, const std::map< int, RDGeom::Point3D > *coordMap=0, double optimizerForceTol=1e-3, bool ignoreSmoothingFailures=false, bool enforceChirality=true, bool useExpTorsionAnglePrefs=false, bool useBasicKnowledge=false, bool verbose=false, double basinThresh=5.0, bool onlyHeavyAtomsForRMS=false)
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 

Variables

const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters KDG
 Parameters corresponding to Sereina Riniker's KDG approach.
 
const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters ETDG
 Parameters corresponding to Sereina Riniker's ETDG approach.
 
const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters ETKDG
 Parameters corresponding to Sereina Riniker's ETKDG approach.
 
const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters ETKDGv2
 Parameters corresponding to Sereina Riniker's ETKDG approach - version 2.
 

Function Documentation

◆ EmbedMolecule() [1/2]

int RDKit::DGeomHelpers::EmbedMolecule ( ROMol &  mol,
const EmbedParameters &  params 
)
inline

Compute an embedding (in 3D) for the specified molecule using Distance Geometry.

Definition at line 192 of file Embedder.h.

References EmbedMultipleConfs().

Referenced by EmbedMolecule().

◆ EmbedMolecule() [2/2]

int RDKit::DGeomHelpers::EmbedMolecule ( ROMol &  mol,
unsigned int  maxIterations = 0,
int  seed = -1,
bool  clearConfs = true,
bool  useRandomCoords = false,
double  boxSizeMult = 2.0,
bool  randNegEig = true,
unsigned int  numZeroFail = 1,
const std::map< int, RDGeom::Point3D > *  coordMap = 0,
double  optimizerForceTol = 1e-3,
bool  ignoreSmoothingFailures = false,
bool  enforceChirality = true,
bool  useExpTorsionAnglePrefs = false,
bool  useBasicKnowledge = false,
bool  verbose = false,
double  basinThresh = 5.0,
bool  onlyHeavyAtomsForRMS = false 
)
inline

Compute an embedding (in 3D) for the specified molecule using Distance Geometry.

The following operations are performed (in order) here:

  1. Build a distance bounds matrix based on the topology, including 1-5 distances but not vdW scaling
  2. Triangle smooth this bounds matrix
  3. If step 2 fails, repeat step 1, this time without 1-5 bounds and with vdW scaling, then repeat step 2
  4. Pick a distance matrix at random using the bounds matrix
  5. Compute initial coordinates from the distance matrix
  6. Repeat steps 4 and 5 until maxIterations is reached or embedding is successful
  7. Adjust initial coordinates by minimizing a Distance Violation error function

NOTE: if the molecule has multiple fragments, they will be embedded separately; this means that they will likely occupy the same region of space.
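
Steps 2 and 4 of the list above can be illustrated with a minimal, self-contained sketch. It does not use RDKit's own classes: `Bounds`, `triangleSmooth`, and `pickDistances` are hypothetical names, the matrices are plain symmetric `std::vector`s, and the smoothing shown is the textbook triangle-inequality tightening, which may differ in detail from what RDKit does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for a bounds matrix (not DistGeom::BoundsMatrix):
// symmetric lower and upper distance bounds with a zero diagonal.
struct Bounds {
  Matrix lower;
  Matrix upper;
};

// Step 2: tighten the bounds using the triangle inequality.
// Returns false if some lower bound ends up above its upper bound.
bool triangleSmooth(Bounds &b) {
  const std::size_t n = b.upper.size();
  // Upper bounds: u_ij <= u_ik + u_kj (all-pairs shortest paths).
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j)
        b.upper[i][j] = std::min(b.upper[i][j], b.upper[i][k] + b.upper[k][j]);
  // Lower bounds: l_ij >= l_ik - u_kj and l_ij >= l_kj - u_ik.
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j) {
        if (i == j) continue;
        const double viaK = std::max(b.lower[i][k] - b.upper[k][j],
                                     b.lower[k][j] - b.upper[i][k]);
        b.lower[i][j] = std::max(b.lower[i][j], viaK);
        if (b.lower[i][j] > b.upper[i][j]) return false;  // inconsistent bounds
      }
  return true;
}

// Step 4: pick each pairwise distance uniformly at random between its bounds.
// The caller owns (and optionally seeds) the random number generator.
Matrix pickDistances(const Bounds &b, std::mt19937 &rng) {
  const std::size_t n = b.upper.size();
  Matrix d(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 1; i < n; ++i)
    for (std::size_t j = 0; j < i; ++j) {
      const double lo = b.lower[i][j];
      const double hi = b.upper[i][j];
      std::uniform_real_distribution<double> dist(lo, hi);
      d[i][j] = d[j][i] = (lo < hi) ? dist(rng) : lo;
    }
  return d;
}
```

In this sketch, a `false` return from `triangleSmooth` corresponds to the situation handled by step 3, where the bounds matrix would be rebuilt with looser settings before smoothing again.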

Parameters
mol  Molecule of interest
maxIterations  Max. number of times the embedding will be tried if coordinates are not obtained successfully. The default value is 10x the number of atoms.
seed  Provides a seed for the random number generator (so that the same coordinates can be obtained for a molecule on multiple runs). If negative, the RNG will not be seeded.
clearConfs  Clear all existing conformations on the molecule
useRandomCoords  Start the embedding from random coordinates instead of using eigenvalues of the distance matrix.
boxSizeMult  Determines the size of the box that is used for random coordinates. If this is a positive number, the side length will equal the largest element of the distance matrix times boxSizeMult. If this is a negative number, the side length will equal -boxSizeMult (i.e. independent of the elements of the distance matrix).
randNegEig  Picks coordinates at random when an embedding process produces negative eigenvalues
numZeroFail  Fail embedding if we find this many or more zero eigenvalues (within a tolerance)
coordMap  A map of int to Point3D, between atom IDs and their locations. If this container is provided, the coordinates are used to set distance constraints on the embedding. The resulting conformer(s) should have distances between the specified atoms that reproduce those between the points in coordMap. Because the embedding produces a molecule in an arbitrary reference frame, an alignment step is required to actually reproduce the provided coordinates.
optimizerForceTol  Set the tolerance on forces in the DGeom optimizer (this shouldn't normally be altered in client code).
ignoreSmoothingFailures  Try to embed the molecule even if triangle bounds smoothing fails
enforceChirality  Enforce the correct chirality if chiral centers are present
useExpTorsionAnglePrefs  Impose experimental torsion-angle preferences
useBasicKnowledge  Impose "basic knowledge" terms such as flat aromatic rings, ketones, etc.
verbose  Print output of experimental torsion-angle preferences
basinThresh  Set the basin threshold for the DGeom force field (this shouldn't normally be altered in client code).
onlyHeavyAtomsForRMS  Only use the heavy atoms when doing RMS filtering
Returns
ID of the conformation added to the molecule, or -1 if the embedding failed
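
The sign convention of boxSizeMult is easy to misread, so here is a small sketch that restates it as code. `boxSideLength` is a hypothetical helper written for illustration only; it is not part of RDKit, and the value zero (which the description above does not cover) simply falls into the second branch.

```cpp
#include <cassert>

// Hypothetical helper (not an RDKit function) restating the boxSizeMult rule:
//   positive -> side length scales with the largest distance-matrix element
//   negative -> side length is the absolute value, independent of the matrix
double boxSideLength(double largestDistance, double boxSizeMult) {
  if (boxSizeMult > 0.0) {
    return largestDistance * boxSizeMult;
  }
  return -boxSizeMult;  // zero is not covered by the documentation text
}
```

With the default boxSizeMult of 2.0 and a largest distance of 7.5, this gives a side length of 15.0; passing -20.0 gives a fixed side length of 20.0 regardless of the distances.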

Definition at line 273 of file Embedder.h.

References EmbedMolecule().

◆ EmbedMultipleConfs() [1/4]

RDKIT_DISTGEOMHELPERS_EXPORT void RDKit::DGeomHelpers::EmbedMultipleConfs ( ROMol &  mol,
INT_VECT &  res,
unsigned int  numConfs,
const EmbedParameters &  params 
)

◆ EmbedMultipleConfs() [2/4]

void RDKit::DGeomHelpers::EmbedMultipleConfs ( ROMol &  mol,
INT_VECT &  res,
unsigned int  numConfs = 10,
int  numThreads = 1,
unsigned int  maxIterations = 30,
int  seed = -1,
bool  clearConfs = true,
bool  useRandomCoords = false,
double  boxSizeMult = 2.0,
bool  randNegEig = true,
unsigned int  numZeroFail = 1,
double  pruneRmsThresh = -1.0,
const std::map< int, RDGeom::Point3D > *  coordMap = 0,
double  optimizerForceTol = 1e-3,
bool  ignoreSmoothingFailures = false,
bool  enforceChirality = true,
bool  useExpTorsionAnglePrefs = false,
bool  useBasicKnowledge = false,
bool  verbose = false,
double  basinThresh = 5.0,
bool  onlyHeavyAtomsForRMS = false 
)
inline

This is roughly equivalent to calling EmbedMolecule multiple times, except that the bounds matrix is computed only once from the topology.

NOTE: if the molecule has multiple fragments, they will be embedded separately; this means that they will likely occupy the same region of space.

Parameters
mol  Molecule of interest
res  Used to return the resulting conformer ids
numConfs  Number of conformations to be generated
numThreads  Sets the number of threads to use (more than one thread will only be used if the RDKit was built with multithread support). If set to zero, the max supported by the system will be used.
maxIterations  Max. number of times the embedding will be tried if coordinates are not obtained successfully. The default value is 10x the number of atoms.
seed  Provides a seed for the random number generator (so that the same coordinates can be obtained for a molecule on multiple runs). If negative, the RNG will not be seeded.
clearConfs  Clear all existing conformations on the molecule
useRandomCoords  Start the embedding from random coordinates instead of using eigenvalues of the distance matrix.
boxSizeMult  Determines the size of the box that is used for random coordinates. If this is a positive number, the side length will equal the largest element of the distance matrix times boxSizeMult. If this is a negative number, the side length will equal -boxSizeMult (i.e. independent of the elements of the distance matrix).
randNegEig  Picks coordinates at random when an embedding process produces negative eigenvalues
numZeroFail  Fail embedding if we find this many or more zero eigenvalues (within a tolerance)
pruneRmsThresh  Retain only the conformations out of 'numConfs' after embedding that are at least this far apart from each other. RMSD is computed on the heavy atoms. Pruning is greedy; i.e. the first embedded conformation is retained and from then on only those that are at least pruneRmsThresh away from already retained conformations are kept. The pruning is done after embedding and bounds violation minimization. No pruning by default.
coordMap  A map of int to Point3D, between atom IDs and their locations. If this container is provided, the coordinates are used to set distance constraints on the embedding. The resulting conformer(s) should have distances between the specified atoms that reproduce those between the points in coordMap. Because the embedding produces a molecule in an arbitrary reference frame, an alignment step is required to actually reproduce the provided coordinates.
optimizerForceTol  Set the tolerance on forces in the DGeom optimizer (this shouldn't normally be altered in client code).
ignoreSmoothingFailures  Try to embed the molecule even if triangle bounds smoothing fails
enforceChirality  Enforce the correct chirality if chiral centers are present
useExpTorsionAnglePrefs  Impose experimental torsion-angle preferences
useBasicKnowledge  Impose "basic knowledge" terms such as flat aromatic rings, ketones, etc.
verbose  Print output of experimental torsion-angle preferences
basinThresh  Set the basin threshold for the DGeom force field (this shouldn't normally be altered in client code).
onlyHeavyAtomsForRMS  Only use the heavy atoms when doing RMS filtering
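
The greedy pruning described for pruneRmsThresh can be sketched as follows. This is an illustration under stated assumptions, not RDKit's implementation: `greedyPrune` is a hypothetical function, and it takes a precomputed symmetric RMSD matrix between conformers rather than computing RMSD from coordinates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an RDKit function): greedy RMSD-based pruning.
// rmsd[i][j] is assumed to hold the RMSD between conformers i and j.
// Returns the indices of the conformers that would be retained.
std::vector<std::size_t> greedyPrune(
    const std::vector<std::vector<double>> &rmsd, double pruneRmsThresh) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < rmsd.size(); ++i) {
    bool farEnough = true;
    // Compare only against conformers that were already retained.
    for (std::size_t k : kept) {
      if (rmsd[i][k] < pruneRmsThresh) {
        farEnough = false;
        break;
      }
    }
    // The first conformer is always kept because `kept` is still empty.
    if (farEnough) kept.push_back(i);
  }
  return kept;
}
```

Because RMSD values are never negative, a negative threshold such as the default -1.0 keeps every conformer in this sketch, which matches the "no pruning by default" behaviour described above.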

Definition at line 365 of file Embedder.h.

References EmbedMultipleConfs().

◆ EmbedMultipleConfs() [3/4]

INT_VECT RDKit::DGeomHelpers::EmbedMultipleConfs ( ROMol &  mol,
unsigned int  numConfs,
const EmbedParameters &  params 
)
inline

Definition at line 183 of file Embedder.h.

References EmbedMultipleConfs().

◆ EmbedMultipleConfs() [4/4]

INT_VECT RDKit::DGeomHelpers::EmbedMultipleConfs ( ROMol &  mol,
unsigned int  numConfs = 10,
unsigned int  maxIterations = 30,
int  seed = -1,
bool  clearConfs = true,
bool  useRandomCoords = false,
double  boxSizeMult = 2.0,
bool  randNegEig = true,
unsigned int  numZeroFail = 1,
double  pruneRmsThresh = -1.0,
const std::map< int, RDGeom::Point3D > *  coordMap = 0,
double  optimizerForceTol = 1e-3,
bool  ignoreSmoothingFailures = false,
bool  enforceChirality = true,
bool  useExpTorsionAnglePrefs = false,
bool  useBasicKnowledge = false,
bool  verbose = false,
double  basinThresh = 5.0,
bool  onlyHeavyAtomsForRMS = false 
)
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 385 of file Embedder.h.

References EmbedMultipleConfs().

◆ initBoundsMat() [1/2]

RDKIT_DISTGEOMHELPERS_EXPORT void RDKit::DGeomHelpers::initBoundsMat ( DistGeom::BoundsMatPtr  mmat,
double  defaultMin = 0.0,
double  defaultMax = 1000.0 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ initBoundsMat() [2/2]

RDKIT_DISTGEOMHELPERS_EXPORT void RDKit::DGeomHelpers::initBoundsMat ( DistGeom::BoundsMatrix *  mmat,
double  defaultMin = 0.0,
double  defaultMax = 1000.0 
)

Set default upper and lower distance bounds in a distance matrix.

Parameters
mmat  Pointer to the bounds matrix to be altered
defaultMin  Default value for the lower distance bounds
defaultMax  Default value for the upper distance bounds
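
As a rough illustration of what setting default bounds means, the sketch below fills a toy bounds structure with the same default values as the signature above. `ToyBounds` and `makeDefaultBounds` are hypothetical names; the real function operates on a DistGeom::BoundsMatrix, whose storage layout is not shown here, and keeping the diagonal at zero is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bounds matrix (not DistGeom::BoundsMatrix).
struct ToyBounds {
  std::vector<std::vector<double>> lower;
  std::vector<std::vector<double>> upper;
};

// Give every atom pair the same default lower and upper distance bound.
// Assumption: the distance from an atom to itself is kept at zero.
ToyBounds makeDefaultBounds(std::size_t nAtoms, double defaultMin = 0.0,
                            double defaultMax = 1000.0) {
  ToyBounds b;
  b.lower.assign(nAtoms, std::vector<double>(nAtoms, defaultMin));
  b.upper.assign(nAtoms, std::vector<double>(nAtoms, defaultMax));
  for (std::size_t i = 0; i < nAtoms; ++i) {
    b.lower[i][i] = 0.0;
    b.upper[i][i] = 0.0;
  }
  return b;
}
```

Starting from such wide defaults, topology-based rules can then overwrite individual pairs with tighter bounds.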

◆ setTopolBounds() [1/2]

RDKIT_DISTGEOMHELPERS_EXPORT void RDKit::DGeomHelpers::setTopolBounds ( const ROMol &  mol,
DistGeom::BoundsMatPtr  mmat,
bool  set15bounds = true,
bool  scaleVDW = false 
)

Set upper and lower distance bounds between atoms in a molecule based on topology.

This consists of setting 1-2, 1-3 and 1-4 distances based on bond lengths, bond angles and torsion angle ranges. Optionally, 1-5 bounds can also be set, in particular for paths that contain rigid 1-4 paths.

The final step involves setting the lower bound to the sum of the vdW radii for the remaining atom pairs.

Parameters
mol  The molecule of interest
mmat  Bounds matrix to which the bounds are written
set15bounds  If true, try to set 1-5 bounds also based on topology
scaleVDW  If true, scale the sum of the vdW radii while setting lower bounds so that a smaller value (0.7*(vdw1 + vdw2)) is used for paths that are less than five bonds apart.

Note: For some strained systems the bounds matrix resulting from setting 1-5 bounds may fail triangle smoothing. In these cases it is recommended to back out and recompute the bounds matrix with no 1-5 bounds and with vdW scaling.
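
The effect of scaleVDW on the final lower-bound step can be written out as a short sketch. `vdwLowerBound` and its `bondsApart` argument are hypothetical and exist only to restate the rule from the parameter description; the factor 0.7 and the five-bond cutoff are taken from that text, and the actual RDKit code may apply additional or different scaling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not an RDKit function): lower distance bound for an
// atom pair that is not covered by the 1-2, 1-3, 1-4 or 1-5 rules.
// bondsApart is assumed to be the topological distance between the atoms.
double vdwLowerBound(double vdw1, double vdw2, int bondsApart, bool scaleVDW) {
  const double sum = vdw1 + vdw2;
  if (scaleVDW && bondsApart < 5) {
    return 0.7 * sum;  // relaxed bound for topologically close atoms
  }
  return sum;  // unscaled sum of the vdW radii
}
```

A smaller lower bound leaves more room for nearby atoms, which is why recomputing with vdW scaling is suggested above when smoothing fails for strained systems.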

◆ setTopolBounds() [2/2]

RDKIT_DISTGEOMHELPERS_EXPORT void RDKit::DGeomHelpers::setTopolBounds ( const ROMol &  mol,
DistGeom::BoundsMatPtr  mmat,
std::vector< std::pair< int, int >> &  bonds,
std::vector< std::vector< int >> &  angles,
bool  set15bounds = true,
bool  scaleVDW = false 
)

Variable Documentation

◆ ETDG

const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters RDKit::DGeomHelpers::ETDG

Parameters corresponding to Sereina Riniker's ETDG approach.

◆ ETKDG

const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters RDKit::DGeomHelpers::ETKDG

Parameters corresponding to Sereina Riniker's ETKDG approach.

◆ ETKDGv2

const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters RDKit::DGeomHelpers::ETKDGv2

Parameters corresponding to Sereina Riniker's ETKDG approach - version 2.

◆ KDG

const RDKIT_DISTGEOMHELPERS_EXPORT EmbedParameters RDKit::DGeomHelpers::KDG

Parameters corresponding to Sereina Riniker's KDG approach.