How to obtain SMILES of ligands using PDB ligand IDs?

Converting a set of SDF files of chemical compounds into SMILES strings is straightforward: RDKit or Open Babel can do it quickly. But what if you don't have SDF files of the ligands in the first place, and all you have is their ligand IDs from the PDB? With only a few ligands you could download the SDF files manually, but that becomes time-consuming as the number of compounds grows. Therefore, we provide a Python script that reads all ligand IDs, fetches their SDF files, and finally converts them into SMILES strings.

Contents

Requirements, Usage, Availability, References

pdb_ligand_id-to-smi.ipynb is a Python notebook that uses RDKit [1] to fetch SMILES for each ligand ID provided in a CSV file.

Requirements

This script requires Python 3 and uses RDKit along with some additional packages. Install them using the following commands.

$ conda create -c conda-forge -n my-rdkit-env rdkit
$ conda activate my-rdkit-env
$ conda install pandas

Usage

List all ligand IDs in the 'lig-ids.csv' file and save it, then run the Jupyter notebook to get the results. The script reads the ligand IDs, downloads their respective SDF files, and combines them into a single SDF file. Finally, it generates SMILES with RDKit and writes the results to the 'smiles.txt' file.
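The steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It makes several assumptions: 'lig-ids.csv' holds one ligand ID per row in its first column with no header row, the SDF files come from the RCSB URL pattern shown in the code, and 'smiles.txt' is written as tab-separated name and SMILES. The helper names (read_ligand_ids, sdf_url, download_sdf, combine_sdf, sdf_to_smiles, main) are hypothetical, and the standard csv module stands in for pandas to keep the sketch light.

```python
import csv
import io
import urllib.request

# Assumed URL pattern for RCSB "ideal" ligand coordinates (not taken from the notebook).
RCSB_SDF_URL = "https://files.rcsb.org/ligands/download/{lig_id}_ideal.sdf"


def read_ligand_ids(csv_text):
    """Return upper-cased ligand IDs from the first CSV column, skipping blank rows."""
    ids = []
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].strip():
            ids.append(row[0].strip().upper())
    return ids


def sdf_url(lig_id):
    """Build the download URL for one ligand ID."""
    return RCSB_SDF_URL.format(lig_id=lig_id)


def download_sdf(lig_id, timeout=30):
    """Download the SDF text for one ligand ID (requires network access)."""
    with urllib.request.urlopen(sdf_url(lig_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


def combine_sdf(sdf_blocks):
    """Join SDF records into one multi-record SDF string."""
    records = []
    for block in sdf_blocks:
        text = block.rstrip()
        if not text.endswith("$$$$"):  # make sure every record is terminated
            text += "\n$$$$"
        records.append(text)
    return "\n".join(records) + "\n"


def sdf_to_smiles(combined_sdf):
    """Convert each record of a combined SDF string to a (name, SMILES) pair."""
    from rdkit import Chem  # imported here so the helpers above work without RDKit

    supplier = Chem.SDMolSupplier()
    supplier.SetData(combined_sdf)
    results = []
    for mol in supplier:
        if mol is None:  # RDKit yields None for records it cannot parse
            continue
        name = mol.GetProp("_Name") if mol.HasProp("_Name") else ""
        results.append((name, Chem.MolToSmiles(mol)))
    return results


def main(csv_path="lig-ids.csv", out_path="smiles.txt"):
    """Read IDs, download and combine SDF files, then write SMILES to a text file."""
    with open(csv_path, newline="") as handle:
        ligand_ids = read_ligand_ids(handle.read())
    combined = combine_sdf(download_sdf(lig_id) for lig_id in ligand_ids)
    with open(out_path, "w") as out:
        for name, smiles in sdf_to_smiles(combined):
            out.write(f"{name}\t{smiles}\n")
```

Calling main() from the directory that contains 'lig-ids.csv' would run the whole sequence. Records that RDKit cannot parse are skipped silently in this sketch, so it is worth comparing the number of output lines with the number of input IDs.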

Availability

The script is available on GitHub in the 'cheminformatics' repository.

References

[1] Landrum, G. (2013). RDKit documentation. Release, 1(1-79), 4.

