Bioinformatics Review

tanimoto_similarities.py: A Python script to calculate Tanimoto similarities of multiple compounds using RDKit.

Dr. Muniba Faiza
Last updated: July 2, 2022 11:31 pm

RDKit [1] is an open-source cheminformatics toolkit that supports a wide range of operations on chemical compounds and ligands. We provide a Python script that uses RDKit to calculate pairwise Tanimoto similarities between the fingerprints of multiple compounds.

Contents
  • Introduction
  • Availability
  • Requirements
  • Usage
  • References

Introduction

The tanimoto_similarities.py script calculates pairwise Tanimoto similarities between molecules supplied as SMILES strings.
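For reference, the Tanimoto coefficient of two fingerprints is the number of bits set in both divided by the number of bits set in either. The following is a minimal pure-Python sketch of that definition, with each fingerprint represented as a set of on-bit indices. The function name `tanimoto` and the example bit sets are illustrative and are not taken from the script, which relies on RDKit for this calculation.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: returning 0.0 is a convention chosen
        # for this sketch only (assumption).
        return 0.0
    return len(a & b) / union


fp1 = {1, 2, 3}  # made-up on-bit indices
fp2 = {2, 3, 4}
print(tanimoto(fp1, fp2))  # 2 shared bits / 4 distinct bits = 0.5
```

A value of 1.0 means the two fingerprints are identical, and 0.0 means they share no bits.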

Let’s say we have the SMILES of 15 molecules in a CSV file named ‘smiles.csv’. This file may also contain other information, such as ligand name or serial number. In that case, only the SMILES column needs to be extracted from the CSV file. The script expects the SMILES under a column named “SMILES”; if your file uses a different column name, edit it in the script.
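Extracting a single column from such a file can be sketched as follows. This example uses Python's standard csv module to stay dependency-free; the script itself may read the file differently (for example with pandas). The helper name `read_smiles`, the extra columns, and the sample molecules are assumptions made for illustration.

```python
import csv
import io


def read_smiles(handle, column="SMILES"):
    """Return the non-empty values of one column from an open CSV file."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in CSV header")
    # Skip rows where the SMILES cell is missing or blank.
    return [row[column].strip() for row in reader
            if row[column] and row[column].strip()]


# In-memory stand-in for smiles.csv; columns and molecules are made up.
sample = io.StringIO(
    "S.No,Name,SMILES\n"
    "1,ethanol,CCO\n"
    "2,benzene,c1ccccc1\n"
    "3,acetic acid,CC(=O)O\n"
)
smiles = read_smiles(sample)
print(smiles)  # ['CCO', 'c1ccccc1', 'CC(=O)O']
```

With a file on disk, the same helper would be called as `with open("smiles.csv", newline="") as fh: smiles = read_smiles(fh)`.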

The script calculates the similarities and saves them as text files and heatmaps. The heatmaps help you visualize the similarity matrix. Sample SMILES are provided in the ‘smiles.csv’ file.
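One way such a similarity matrix could be computed with RDKit is sketched below. This is not the script's actual code: the function name `tanimoto_matrix` and the example molecules are hypothetical, and the fingerprint type (RDKit's default topological fingerprint from `Chem.RDKFingerprint`) is an assumption, since the article does not state which fingerprint the script uses.

```python
import numpy as np
from rdkit import Chem, DataStructs


def tanimoto_matrix(smiles_list):
    """Pairwise Tanimoto similarities between RDKit topological fingerprints."""
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    # MolFromSmiles returns None for SMILES it cannot parse.
    bad = [s for s, m in zip(smiles_list, mols) if m is None]
    if bad:
        raise ValueError(f"could not parse SMILES: {bad}")
    # Assumption: default topological fingerprint; the script may use another type.
    fps = [Chem.RDKFingerprint(m) for m in mols]
    n = len(fps)
    matrix = np.zeros((n, n))
    for i, fp in enumerate(fps):
        # Similarity of fingerprint i to every fingerprint, one row at a time.
        matrix[i, :] = DataStructs.BulkTanimotoSimilarity(fp, fps)
    return matrix


smiles = ["CCO", "CCN", "c1ccccc1"]  # illustrative molecules
sim = tanimoto_matrix(smiles)
print(np.round(sim, 3))
```

The resulting matrix is symmetric with ones on the diagonal. A call such as `np.savetxt("similarities.txt", sim, fmt="%.3f")` is one option for writing it to a text file, although the script's own output format may differ.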

Availability

The script is available on GitHub under the package ‘tanimoto_similarities’.

Requirements

This script requires Python 3 and uses RDKit along with some additional packages (seaborn, Matplotlib, pandas, and NumPy). Install them using the following commands.

$ conda create -c conda-forge -n my-rdkit-env rdkit
$ conda activate my-rdkit-env
$ pip3 install seaborn
$ pip3 install matplotlib
$ conda install pandas
$ pip3 install numpy
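
After installation, a quick check from inside the activated environment can confirm that the packages are visible to Python. The sketch below is an optional helper and not part of the script; the name `missing_packages` is hypothetical. It only looks the packages up and does not import them.

```python
import importlib.util

# Packages named in the installation commands above.
REQUIRED = ["rdkit", "seaborn", "matplotlib", "pandas", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If rdkit is reported as missing, the conda environment is most likely not active in the current shell.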

Usage

The script consists of two functions. One calculates the full similarity matrix, displays it as a heatmap, and saves the output as ‘similarities.txt’. The other calculates the similarity matrix in lower triangular form and saves the output as ‘similarities_lower_tri.txt’.
Run the script as:

$ python3 tanimoto_similarities.py

Note: If you get an error stating “rdkit not found”, the RDKit environment is probably not activated. Run the conda activate my-rdkit-env command again, then rerun the script.
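
Because a Tanimoto similarity matrix is symmetric, its lower triangle carries all the information, which is the idea behind the second function described above. A minimal NumPy sketch of that reduction follows. The function name `lower_triangle`, the choice to keep the diagonal, the use of NaN for blanked cells, and the similarity values are all assumptions for illustration; the script may handle these details differently.

```python
import numpy as np


def lower_triangle(matrix):
    """Keep the diagonal and everything below it; set the upper triangle to NaN."""
    m = np.asarray(matrix, dtype=float)
    # Boolean mask that is True strictly above the diagonal.
    upper = np.triu(np.ones(m.shape, dtype=bool), k=1)
    return np.where(upper, np.nan, m)


full = np.array([
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])  # made-up similarity values
lower = lower_triangle(full)
print(lower)
```

A boolean mask like `upper` can also be passed to seaborn's `heatmap` through its `mask` argument, which hides the cells where the mask is True.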


References

  1. Landrum, G. (2013). RDKit documentation. Release, 1(1-79), 4.