Metadata-Version: 2.1
Name: CDK_pywrapper
Version: 0.1.0
Summary: Python wrapper for CDK molecular descriptors and fingerprints
Home-page: https://github.com/OlivierBeq/CDK_pywrapper
Author: Olivier J. M. Béquignon
Author-email: "olivier.bequignon.maintainer@gmail.com"
Maintainer: Olivier J. M. Béquignon
Maintainer-email: "olivier.bequignon.maintainer@gmail.com"
Keywords: Chemistry Development Kit,molecular descriptors,molecular fingerprints,cheminformatics,QSAR
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: more-itertools
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: rdkit
Requires-Dist: install-jdk ==0.3.0
Requires-Dist: bounded-pool-executor ==0.0.3
Provides-Extra: docs
Requires-Dist: sphinx ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme ; extra == 'docs'
Requires-Dist: sphinx-autodoc-typehints ; extra == 'docs'
Provides-Extra: testing
Requires-Dist: pytest ; extra == 'testing'
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
# CDK Python wrapper
Python wrapper to ease the calculation of [CDK](https://cdk.github.io/) molecular descriptors and fingerprints.
## Installation
From source:

    git clone https://github.com/OlivierBeq/CDK_pywrapper.git
    pip install ./CDK_pywrapper

With pip:
```bash
pip install CDK-pywrapper
```
### Get started
```python
from CDK_pywrapper import CDK
from rdkit import Chem
smiles_list = [
    # erlotinib
    "n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
    # midecamycin
    "CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
    # selenofolate
    "C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
    # cisplatin
    "N.N.Cl[Pt]Cl"
]
mols = [Chem.AddHs(Chem.MolFromSmiles(smiles)) for smiles in smiles_list]
cdk = CDK()
print(cdk.calculate(mols))
```
The above calculates 222 molecular descriptors (23 1D and 199 2D).<br/>
The additional 65 three-dimensional (3D) descriptors may be obtained with the following:
:warning: Molecules are required to have conformers for 3D descriptors to be calculated.<br/>
```python
from rdkit.Chem import AllChem
for mol in mols:
    # Generate a 3D conformer in place; the returned conformer ID is not needed here
    _ = AllChem.EmbedMolecule(mol)
cdk = CDK(ignore_3D=False)
print(cdk.calculate(mols))
```
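`AllChem.EmbedMolecule` returns the ID of the new conformer, or -1 when embedding fails, so discarding the return value can hide molecules that end up without 3D coordinates. The sketch below separates molecules that embedded from those that did not. The helper name `embed_all` and its `embed` argument are illustrative and not part of CDK_pywrapper; the embedding function is passed in so that the helper itself does not depend on RDKit.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Mol = TypeVar("Mol")


def embed_all(mols: Iterable[Mol], embed: Callable[[Mol], int]) -> Tuple[List[Mol], List[int]]:
    """Embed each molecule and report which ones failed.

    `embed` is expected to behave like rdkit's AllChem.EmbedMolecule:
    it adds a conformer in place and returns its ID, or -1 on failure.
    """
    embedded, failed_indices = [], []
    for index, mol in enumerate(mols):
        if embed(mol) == -1:
            # Embedding failed: remember the position of the molecule
            failed_indices.append(index)
        else:
            embedded.append(mol)
    return embedded, failed_indices
```

With RDKit this could be called as `ok_mols, failed = embed_all(mols, AllChem.EmbedMolecule)`, passing only `ok_mols` to `cdk.calculate`.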
To obtain molecular fingerprints, one can use the following:
```python
from CDK_pywrapper import CDK, FPType
cdk = CDK(fingerprint=FPType.PubchemFP)
print(cdk.calculate(mols))
```
The following fingerprints can be calculated:
| FPType | Fingerprint name |
|-----------|------------------------------------------------------------------------------------|
| FP | CDK fingerprint |
| ExtFP | Extended CDK fingerprint (includes 25 bits for ring features and isotopic masses) |
| EStateFP | Electrotopological state fingerprint (79 bits) |
| GraphFP | CDK fingerprinter ignoring bond orders |
| MACCSFP | Public MACCS fingerprint |
| PubchemFP | PubChem substructure fingerprint |
| SubFP | Fingerprint describing 307 substructures |
| KRFP | Klekota-Roth fingerprint |
| AP2DFP | Atom pair 2D fingerprint as implemented in PaDEL |
| HybridFP | CDK fingerprint ignoring aromaticity |
| LingoFP | LINGO fingerprint |
| SPFP | Fingerprint based on the shortest paths between two atoms |
| SigFP | Signature fingerprint |
| CircFP | Circular fingerprint |
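
As a sketch of how the table might be used programmatically, the helper below builds a calculator from a fingerprint name given as a string. The name `make_fingerprint_calculator` is hypothetical, and the `getattr` lookup assumes that every name in the table is an attribute of `FPType`. The import is deferred so that an invalid name is rejected before the package is loaded.

```python
# Names taken from the FPType column of the table above
FINGERPRINT_NAMES = (
    "FP", "ExtFP", "EStateFP", "GraphFP", "MACCSFP", "PubchemFP", "SubFP",
    "KRFP", "AP2DFP", "HybridFP", "LingoFP", "SPFP", "SigFP", "CircFP",
)


def make_fingerprint_calculator(name: str, **kwargs):
    """Return a CDK calculator for the fingerprint called `name`.

    Extra keyword arguments (for example nbits or depth) are forwarded
    to the CDK constructor unchanged.
    """
    if name not in FINGERPRINT_NAMES:
        raise ValueError(f"Unknown fingerprint {name!r}; expected one of {FINGERPRINT_NAMES}")
    # Deferred import: only needed once the name is known to be valid
    from CDK_pywrapper import CDK, FPType
    # Assumption: each table entry is exposed as an attribute of FPType
    return CDK(fingerprint=getattr(FPType, name), **kwargs)
```

For example, `make_fingerprint_calculator("MACCSFP").calculate(mols)` would be equivalent to constructing `CDK(fingerprint=FPType.MACCSFP)` directly, provided the assumption about `FPType` holds.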
## Documentation
```python
class CDK(ignore_3D=True, fingerprint=None, nbits=1024, depth=6):
```
Constructor of a CDK calculator for molecular descriptors or fingerprints
Parameters:
- ***ignore_3D : bool***
Whether to skip 3D molecular descriptors (default: True, so 3D descriptors are not calculated). Ignored if a fingerprint is set.
- ***fingerprint : FPType***
Type of fingerprint to calculate (default: None). If None, calculate descriptors.
- ***nbits : int***
Number of bits in the fingerprint (default: 1024).
- ***depth : int***
Depth of the fingerprint (default: 6).
<br/>
<br/>
```python
def calculate(mols, show_banner=True, njobs=1, chunksize=1000):
```
Default method to calculate CDK molecular descriptors and fingerprints.
Parameters:
- ***mols : Iterable[Chem.Mol]***
RDKit molecule objects for which to obtain CDK descriptors or fingerprints.
- ***show_banner : bool***
Whether to display the default notice about CDK (default: True).
- ***njobs : int***
Maximum number of simultaneous processes.
- ***chunksize : int***
Maximum number of molecules handled by each process (default: 1000).
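
To illustrate how `njobs` and `chunksize` interact, the sketch below splits a sequence of inputs into consecutive chunks and counts how many rounds of work they imply. This is not the library's implementation: the helper names `chunked` and `rounds_needed` are hypothetical, and the assumption that chunks are consecutive slices processed `njobs` at a time is made only for the sake of the example.

```python
import math
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunksize: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `chunksize` items."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


def rounds_needed(n_items: int, njobs: int, chunksize: int) -> int:
    """Number of rounds if `njobs` chunks are processed at the same time."""
    n_chunks = math.ceil(n_items / chunksize)
    return math.ceil(n_chunks / njobs)
```

Under these assumptions, 2500 molecules with `chunksize=1000` form three chunks, so `njobs=4` would leave one process idle, whereas `chunksize=625` would give each of the four processes exactly one chunk.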