Converting file formats using Open Babel

Open Babel [1] offers a wide range of operations, of which file format conversion is the most widely used. In this article, we describe the commands that convert between file formats.

Assuming you have already installed Open Babel on your system, you should be able to run it as obabel in the terminal (older 2.x releases also provide the legacy babel command). You can also use the Open Babel GUI, which you will have to compile during installation.

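The general command syntax for file conversion is:

$ obabel -i<input_format> <input_filename> -o<output_format> -O <output_filename> [other_options]

The input and output formats are optional: if you omit them, Open Babel infers each format from the file extension. Stating them explicitly is still the safer choice, because it avoids surprises with unusual or missing extensions.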
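The syntax above can be sketched as a small shell helper. This is only an illustration: `obabel_cmd` is a hypothetical function name, not part of Open Babel, and it assumes that each file extension is also a valid Open Babel format ID (true for `smi`, `can`, `inchi` and `sdf`, but not guaranteed for every extension). It prints the command instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open Babel): print the obabel command
# that converts one file into another, taking both formats from the
# file extensions. Assumes each extension is a valid Open Babel format ID.
obabel_cmd() (
    in_file=$1
    out_file=$2
    in_fmt=${in_file##*.}     # text after the last dot, e.g. "smi"
    out_fmt=${out_file##*.}   # e.g. "inchi"
    printf 'obabel -i%s %s -o%s -O %s\n' \
        "$in_fmt" "$in_file" "$out_fmt" "$out_file"
)

# Prints: obabel -ismi input.smi -oinchi -O output.inchi
obabel_cmd input.smi output.inchi
```

Once Open Babel is installed, piping the printed line to `sh` executes it. The sketch does not quote file names, so it is only suitable for names without spaces.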

Let’s convert ordinary SMILES to canonical SMILES. The output format can selects the canonical SMILES writer:

$ obabel -ismi input.smi -ocan -O output.smi

Similarly, you can convert SMILES to InChI as shown below:

$ obabel -ismi input.smi -oinchi -O output.inchi

You can also pass additional options, for example to ignore stereochemistry or to remove duplicates. Open Babel may stop processing when it encounters an invalid molecule; to make it continue after such errors, add -e to the command.

$ obabel -ismi input.smi -oinchi -O output.inchi -e --unique

Here, --unique skips duplicate molecules, so each structure is written only once. By default, duplicates are detected by comparing InChI strings.
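To see how many molecules were skipped by -e (invalid entries) or --unique (duplicates), the input and output files can be compared after the conversion. The sketch below assumes one molecule per non-empty line in both files, which holds for line-based formats such as .smi and .inchi; `count_molecules` and `report_dropped` are hypothetical helper names, not Open Babel commands.

```shell
#!/bin/sh
# Hypothetical helper: count non-empty lines, i.e. one record per
# molecule in line-based formats such as .smi and .inchi.
count_molecules() {
    grep -c . "$1"
}

# Report how many input molecules did not reach the output file.
# Arguments: <input_file> <output_file>
report_dropped() {
    n_in=$(count_molecules "$1")
    n_out=$(count_molecules "$2")
    echo "$1: $n_in molecules, $2: $n_out molecules, dropped: $((n_in - n_out))"
}
```

After running the conversion, `report_dropped input.smi output.inchi` prints both counts and their difference. The difference covers invalid and duplicate molecules together; the messages Open Babel writes to the terminal help tell the two apart.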

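If you want to ignore the stereochemistry, charge, or isotope information of the molecules, you can truncate the InChI with the InChI writer's -xT option. The truncation parameters are combined into a single specification, and /nostereo already covers both /noEZ (double-bond stereochemistry) and /nosp3 (tetrahedral stereochemistry):

$ obabel -ismi input.smi -oinchi -O output.inchi -e -xT /nostereo/nochg/noiso

You can also keep only the molecules that are unique under the same criteria by passing the specification to --unique. Molecules that differ only in the ignored layers are then treated as duplicates:

$ obabel -ismi input.smi -oinchi -O output.inchi -e --unique /nostereo/nochg/noiso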

For further information on supported file formats, see the Open Babel documentation, or run obabel -L formats to list them in the terminal.


References

  1. O’Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R. (2011). Open Babel: An open chemical toolbox. Journal of Cheminformatics, 3(1), 1-14.

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
