How to download small molecules from ZINC database for virtual screening?

Dr. Muniba Faiza

It is difficult to manage thousands of compounds at once while performing virtual high-throughput screening. Compound databases let you download molecules in different formats; the ZINC database [1], for example, provides a batch file that is run afterward to fetch the structures. In this article, we will download small molecules from the ZINC database [1] for use in virtual screening.

Downloading batch file

  1. Go to the download page of the ZINC database.
  2. There you will see several subsets available for download, such as drug-like, lead-like, and clean.
  3. Select the subset appropriate for your work and click on it. A new page will be displayed showing download options for Linux and Windows.
  4. Choose the MOL2 or SDF format.
  5. For Linux, a csh script is downloaded; for Windows, a batch file.

Downloading structures

On Ubuntu

  1. Open a terminal (Ctrl+Alt+T).
  2. Change to the directory where you downloaded the csh script: $ cd Downloads/
  3. Make the script executable: $ chmod +x usual.sdf.csh
  4. Run it: $ csh usual.sdf.csh
  5. It will download all structures in sdf.gz format.
  6. Decompress the files, which yields thousands of structures: $ gunzip -v *.sdf.gz
  7. To combine these files into one for easier virtual high-throughput screening, type: $ cat *.sdf > all_clean.sdf
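
Steps 6 and 7 can also be done in one pass. The following is a minimal sketch, not part of the ZINC tooling: the function name combine_sdf and its two arguments are hypothetical, and it assumes the archives sit in a single directory. Unlike the commands above, it streams each archive straight into the combined file and leaves the .sdf.gz files in place.

```shell
# Sketch: decompress every .sdf.gz in a directory into one combined SD file.
# combine_sdf is a hypothetical helper name, not a ZINC command.
# The body runs in a subshell "( ... )" so its variables stay local.
combine_sdf() (
    dir="$1"
    out="$2"
    : > "$out"                      # create or empty the output file
    for f in "$dir"/*.sdf.gz; do
        [ -e "$f" ] || continue     # the glob matched nothing
        gunzip -c "$f" >> "$out"    # -c writes to stdout and keeps the .gz
    done
)

# Example call (assumes the archives are in ~/Downloads):
# combine_sdf ~/Downloads ~/Downloads/all_clean.sdf
```

Because the output name does not end in .sdf.gz, it can safely live in the same directory as the archives and the function can be re-run without the combined file feeding back into itself.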

On Windows

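You will have to install wget on Windows as follows:

  1. Download a wget executable for Windows.
  2. Open the command prompt (cmd) and type >path to list the directories that Windows searches for executables.
  3. Copy the wget executable into one of those directories, for example C:\WINDOWS\System32.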

After that, restart the command prompt and extract all molecules as shown below:

  1. Go to the folder where you have downloaded the batch file.
  2. Right-click on the file and choose "Run as administrator". A command prompt will open and download all the files. This step can take a long time.

     NOTE: The command prompt may show a "404 NOT FOUND" error. In that case, open the batch file in a text editor and correct the URL on its second line, which reads set base=http://zinc.docking.org/db/bysubset/16. Change it to set base=http://zinc12.docking.org/db/bysubset/16 and run the file again.
  3. Extract these files. The downloads are gzip archives, so unzip will not open them, and gunzip and cat are not built into the Windows command prompt; run this step and the next in a Unix-like shell such as Git Bash or Cygwin: >gunzip *.sdf.gz
  4. Combine these files as: >cat *.sdf > all_clean.sdf
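
The URL correction described in the note can also be applied from a shell instead of a text editor. This is a hedged sketch: fix_zinc_host is a hypothetical helper name, and it assumes the downloaded script (the Windows batch file, or the Linux csh script if it contains the same host) refers to http://zinc.docking.org as shown in the note.

```shell
# Sketch: point a downloaded ZINC script at zinc12.docking.org.
# fix_zinc_host is a hypothetical helper name; the old and new hosts
# are the ones quoted in the note above.
fix_zinc_host() (
    script="$1"
    cp "$script" "$script.bak"      # keep a backup of the original
    # Read from the backup and overwrite the script in place, which
    # preserves its permissions. Lines that already use zinc12 do not
    # match the pattern, so running this twice is harmless.
    sed 's#http://zinc\.docking\.org#http://zinc12.docking.org#g' \
        "$script.bak" > "$script"
)

# Example call (file name taken from the Ubuntu steps):
# fix_zinc_host usual.sdf.csh
```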

Now, you can use these structures for virtual screening.
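
Before starting a screen, it can be worth checking how many molecules ended up in the combined file. In the SDF format each record ends with a line containing $$$$, so counting those lines gives the number of molecules. A small sketch, with count_sdf_records as a hypothetical helper name:

```shell
# Sketch: count molecules in an SD file by counting the "$$$$" lines
# that terminate each record. count_sdf_records is a hypothetical name.
count_sdf_records() {
    # No end-of-line anchor, so files with Windows line endings
    # ("$$$$" followed by a carriage return) are counted too.
    grep -c '^\$\$\$\$' "$1"
}

# Example call:
# count_sdf_records all_clean.sdf
```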


References

  1. Irwin, J. J., & Shoichet, B. K. (2005). ZINC - a free database of commercially available compounds for virtual screening. Journal of Chemical Information and Modeling, 45(1), 177-182.

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
