In this article, we summarize the latest interesting findings made this month in bioinformatics.
1. Mechanistic understanding of the cryptic transmission of SARS-CoV-2 in Europe & USA.
A new study describes the cryptic transmission of SARS-CoV-2 during the first wave in Europe and the USA [1]. The authors used a global metapopulation epidemic model to provide a mechanistic understanding of how SARS-CoV-2 spread in the two regions. The study indicates that SARS-CoV-2 was already being transmitted in Europe and the USA in January 2020, and that by early March only 1 to 3 in 100 infections were detected. This approach complements phylogenetic analyses and other methods for understanding the spread of the infection [1].
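To make the idea of a metapopulation model more concrete, here is a minimal sketch in Python. It is not the model used in [1]: it is a toy deterministic SIR model on two coupled patches, and the function names, travel fractions, and the `beta` and `gamma` values are all illustrative assumptions. The last lines apply the 1 to 3 in 100 detection range quoted above to the simulated cumulative infections.

```python
def move(compartment, travel):
    """Redistribute one compartment between patches.

    travel[i][j] is the (assumed) daily fraction of patch i moving to patch j.
    """
    n = len(compartment)
    out = [0.0] * n
    for i in range(n):
        leaving = 0.0
        for j in range(n):
            if i != j:
                flow = compartment[i] * travel[i][j]
                out[j] += flow
                leaving += flow
        out[i] += compartment[i] - leaving
    return out


def simulate_metapopulation(populations, initial_infected, travel,
                            beta=0.3, gamma=0.1, days=60):
    """Toy discrete-time SIR model on coupled patches (illustrative only)."""
    n = len(populations)
    S = [float(populations[i] - initial_infected[i]) for i in range(n)]
    I = [float(x) for x in initial_infected]
    R = [0.0] * n
    infected = [list(I)]
    for _ in range(days):
        # Local epidemic step inside each patch.
        for i in range(n):
            total = S[i] + I[i] + R[i]
            new_inf = beta * S[i] * I[i] / total if total > 0 else 0.0
            new_rec = gamma * I[i]
            S[i] -= new_inf
            I[i] += new_inf - new_rec
            R[i] += new_rec
        # Mobility step: people travel between patches.
        S, I, R = move(S, travel), move(I, travel), move(R, travel)
        infected.append(list(I))
    return {"S": S, "I": I, "R": R, "infected": infected}


# Hypothetical example: the outbreak starts in patch 0 only.
result = simulate_metapopulation(
    populations=[1_000_000, 500_000],
    initial_infected=[10, 0],
    travel=[[0.0, 0.001], [0.002, 0.0]],
    days=30,
)
cumulative = sum(result["I"]) + sum(result["R"])
# If only 1 to 3 in 100 infections are detected, reported cases would be roughly:
detected_low, detected_high = 0.01 * cumulative, 0.03 * cumulative
print(round(cumulative), round(detected_low), round(detected_high))
```

Even in this toy version, infections appear in the second patch without any being seeded there, which is the kind of undetected spread the word "cryptic" refers to.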
For more information, read here.
2. New method to predict variant pathogenicity using deep generative models of evolutionary data.
A new method based on deep generative models of evolutionary data has been developed to predict variant pathogenicity [2]. The method does not rely on known disease labels, which avoids the biases and limited reliability of machine learning models trained on such labels. It performs better than other state-of-the-art computational methods and is on par with, if not better than, predictions from high-throughput experiments. The model is freely accessible and available on GitHub.
For more information, read here.
3. New open-source toolkit for genomic data visualization.
A new toolkit called CoolBox has been developed for the visual analysis of genomic data [3]. It makes it easy to visualize patterns in large-scale genomic datasets. CoolBox is an open-source toolkit that is highly compatible with the Python ecosystem, and its plotting system is based on the matplotlib package. CoolBox is implemented in Python and is available for Linux, Windows, and macOS. It can be downloaded from GitHub.
For more information, read here.
4. New software for the discovery of circular RNAs from paired-end RNA sequencing data.
A new tool called Circall has been developed for the discovery of circular RNAs from paired-end RNA sequencing data [4]. Circall is highly efficient because it uses a quasi-mapping algorithm for fast and accurate RNA read alignment. It controls false positives by applying a robust multidimensional local false discovery rate method. Circall is written in C++ and R. It is freely available at https://www.meb.ki.se/sites/biostatwiki/circall and can be downloaded from GitHub.
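For readers unfamiliar with the local false discovery rate, the sketch below shows the basic idea in one dimension. It is not Circall's implementation, which is multidimensional. It assumes a simple two-component mixture in which null scores follow a standard normal distribution and true signals follow a shifted normal. The values of `pi0`, `alt_mean`, and `alt_sd` are assumptions for illustration, whereas a real method would estimate them from the data.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def local_fdr(z, pi0=0.9, alt_mean=3.0, alt_sd=1.0):
    """Local false discovery rate: pi0 * f0(z) / f(z).

    f0 is the null density, f the mixture density of null and alternative.
    All parameters here are assumed rather than estimated.
    """
    f0 = normal_pdf(z)
    f1 = normal_pdf(z, alt_mean, alt_sd)
    mixture = pi0 * f0 + (1.0 - pi0) * f1
    return pi0 * f0 / mixture


# Hypothetical candidate scores; keep those unlikely to be false positives.
scores = {"cand_a": 0.2, "cand_b": 2.1, "cand_c": 4.8}
kept = [name for name, z in scores.items() if local_fdr(z) < 0.05]
print(kept)
```

A candidate with a score near the centre of the null distribution gets a local false discovery rate close to 1 and is discarded, while a score far in the tail gets a value close to 0 and is kept.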
For more information, read here.
5. New method for compound activity prediction based on binding pocket information.
A new end-to-end learning method has been proposed for compound activity prediction based on binding pocket information [5]. The method uses structural information about the binding pocket of a target protein. It implements end-to-end learning with a graph neural network that learns the important features of both the compound and the binding pocket structure. The new method showed accuracy equivalent to docking simulation with AutoDock Vina, with a much shorter computing time.
For more information, read here.
6. New tool for progressive MSA with Poisson Indel Process.
A new tool called ProPIP [6] has been developed that uses the Poisson Indel Process (PIP) to describe insertions and deletions. The tool is developed in a frequentist framework. It is available for Linux, macOS, and Microsoft Windows platforms. The source code is freely available on GitHub.
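As a rough illustration of an indel process of this kind, the sketch below tracks only sequence length along a single branch. It assumes insertions arrive as a Poisson process with rate `insertion_rate` over the whole sequence and each character is deleted independently at rate `deletion_rate`. This is a toy simulation with hypothetical parameter values, not ProPIP's alignment algorithm.

```python
import math
import random


def expected_length(initial_length, insertion_rate, deletion_rate, t):
    """Expected sequence length after time t under the assumed process."""
    survive = math.exp(-deletion_rate * t)
    return initial_length * survive + (insertion_rate / deletion_rate) * (1.0 - survive)


def simulate_length(initial_length, insertion_rate, deletion_rate, t, rng):
    """Simulate the sequence length after evolving for time t."""
    survive = math.exp(-deletion_rate * t)
    # Each original character survives independently.
    length = sum(1 for _ in range(initial_length) if rng.random() < survive)
    # Insertion times form a Poisson process on [0, t].
    time = rng.expovariate(insertion_rate)
    while time < t:
        # An inserted character must survive the remaining time t - time.
        if rng.random() < math.exp(-deletion_rate * (t - time)):
            length += 1
        time += rng.expovariate(insertion_rate)
    return length


rng = random.Random(0)
runs = [simulate_length(100, 5.0, 0.1, 2.0, rng) for _ in range(2000)]
mean_length = sum(runs) / len(runs)
print(mean_length, expected_length(100, 5.0, 0.1, 2.0))
```

The simulated mean should land close to the analytical expectation, and for long branches the expected length tends to `insertion_rate / deletion_rate` regardless of the starting length.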
For more information, read here.
References
[1] Davis, J.T., Chinazzi, M., Perra, N. et al. (2021). Cryptic transmission of SARS-CoV-2 and the first COVID-19 wave. Nature.
[2] Frazer, J., Notin, P., Dias, M. et al. (2021). Disease variant prediction with deep generative models of evolutionary data. Nature.
[3] Xu, W., Zhong, Q., Lin, D. et al. (2021). CoolBox: a flexible toolkit for visual analysis of genomics data. BMC Bioinformatics 22, 489.
[4] Nguyen, D.T., Trac, Q.T., Nguyen, T.H. et al. (2021). Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data. BMC Bioinformatics 22, 495.
[5] Tanebe, T., Ishida, T. (2021). End-to-end learning for compound activity prediction based on binding pocket information. BMC Bioinformatics 22, 529.
[6] Maiolo, M., Gatti, L., Frei, D. et al. (2021). ProPIP: a tool for progressive multiple sequence alignment with Poisson Indel Process. BMC Bioinformatics 22, 518.