We summarize the latest research published this month in the field of bioinformatics.
1. A map of human cell structure fusing protein images and interactions.
A multi-scale map of human cell structure has been built by integrating immunofluorescence images from the Human Protein Atlas with affinity purification data from BioPlex [1]. The machine-learning-based approach, known as the multi-scale integrated cell (MuSIC 1.0), produces a unified hierarchical map that resolves 69 subcellular systems [1]. The MuSIC 1.0 code is freely available on GitHub.
2. New machine-learning-based tool for evaluating mammalian genome assembly quality.
A new tool called EvalDNA (Evaluation of De Novo Assemblies) has been developed to evaluate the quality of mammalian genome assemblies [2]. EvalDNA uses supervised machine learning to assign a quality score to a genome assembly, and it does not require a reference genome to assess accuracy. The score allows direct comparison of assemblies from different species. EvalDNA is freely available on GitHub.
3. New method for drug-target interaction prediction based on neural networks.
A new method called NeuRank has been developed to predict drug-target interactions using neural networks [3]. The authors treat drug-target interaction prediction as a ranking problem. They also take drug and target similarities into account, addressing the issue that similar drug compounds may interact with similar proteins. NeuRank predicted drug-target interactions better than the other available methods.
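To illustrate what treating interaction prediction as a ranking problem can look like, here is a minimal sketch in plain Python. It is not NeuRank's actual model or loss function: the pairwise hinge loss, the function names, and the example scores are all assumptions chosen for illustration. The idea is that known interacting targets should score higher than non-interacting ones by some margin, and that predictions are read off as an ordered list.

```python
def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Average hinge loss over all (interacting, non-interacting) score pairs.

    The loss is zero when every interacting target outscores every
    non-interacting target by at least `margin`.
    """
    losses = [
        max(0.0, margin - (pos - neg))
        for pos in pos_scores
        for neg in neg_scores
    ]
    return sum(losses) / len(losses)


def rank_targets(scores):
    """Return target names ordered from highest to lowest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores for one drug against three candidate targets.
example_scores = {"target_A": 0.2, "target_B": 0.9, "target_C": 0.5}
print(rank_targets(example_scores))
```

In a setup like this, a model would be trained to lower the pairwise loss, and the ranked list would be evaluated with ranking metrics rather than plain classification accuracy.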
4. New tool for multi-species splice site prediction based on convolutional neural networks.
A new tool called Spliceator has been developed to predict splice sites across multiple species, in both model and non-model organisms [4]. It is based on convolutional neural networks and is trained on validated data from over 100 organisms. It achieves high accuracy of 89% to 92% compared with existing methods. Spliceator is implemented in Python and is freely available to download at https://git.unistra.fr/nscalzitti/spliceator. The webserver is available at http://www.lbgi.fr/spliceator/.
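Convolutional networks that operate on DNA typically need the sequence converted into a numeric matrix first. The sketch below shows one common way to do this, one-hot encoding, in plain Python. It is an illustration only: the channel order (A, C, G, T), the all-zero row for ambiguous bases, and the function name are assumptions, and Spliceator's own preprocessing may differ.

```python
# Assumed channel order; any fixed order works as long as it is consistent.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        index = ALPHABET.find(base)
        if index != -1:
            row[index] = 1
        rows.append(row)
    return rows


# A short window around a hypothetical donor site (GT dinucleotide).
for base, row in zip("CAGGTAAG", one_hot_encode("CAGGTAAG")):
    print(base, row)
```

A matrix like this, with one row per position and one column per nucleotide, is the kind of input a 1D convolutional layer can scan for sequence motifs.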
5. New method for white blood cell counting in bone marrow images based on deep learning.
A new deep-learning-based method has been introduced for counting white blood cells in color bone marrow images [5]. It uses Faster R-CNN and a feature pyramid network to handle varying illumination levels and the stability of color components. Tested on 609 white blood cell images at 2560 × 1920 resolution, the method achieved 98.8% accuracy [5].
6. New toolkit for file system virtualization of compressed FASTA files.
A new Linux-based toolkit called FASTAFS has been developed for file system virtualization of random access compressed FASTA files [6]. It uses Filesystem in Userspace (FUSE) to virtualize FASTA archives into the file system. FASTAFS also provides fast decompression by combining bit encodings with Zstandard (zstd). It is a single executable with several subcommands and can be downloaded from GitHub.
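To give a feel for why bit encodings help with nucleotide data, the following sketch packs four bases into each byte using two bits per base. It is not the FASTAFS file format: the code table, the function names, and the handling of the final partial byte are assumptions made for illustration, and the Zstandard step is left out.

```python
# Assumed 2-bit code table for the four unambiguous nucleotides.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_2bit(sequence):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    packed = bytearray()
    for start in range(0, len(sequence), 4):
        chunk = sequence[start:start + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so decoding always reads from the top bits.
        byte <<= 2 * (4 - len(chunk))
        packed.append(byte)
    return bytes(packed)


def unpack_2bit(packed, length):
    """Decode bytes produced by pack_2bit back into a string of `length` bases."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


sequence = "ACGTAC"
packed = pack_2bit(sequence)
print(len(sequence), "bases stored in", len(packed), "bytes")
print(unpack_2bit(packed, len(sequence)) == sequence)
```

Because every base sits at a fixed bit offset, the position of any base can be computed directly, which is what makes random access into the packed data cheap. A real format also has to deal with ambiguous bases such as N, lowercase masking, and sequence headers, none of which this sketch handles.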
References
[1] Qin, Y., Huttlin, E.L., Winsnes, C.F. et al. (2021). A multi-scale map of cell structure fusing protein images and interactions. Nature.
[2] MacDonald, M.L., Lee, K.H. (2021). EvalDNA: a machine learning-based tool for the comprehensive evaluation of mammalian genome assembly quality. BMC Bioinformatics 22, 570.
[3] Wu, X., Zeng, W., Lin, F. et al. (2021). NeuRank: learning to rank with neural networks for drug–target interaction prediction. BMC Bioinformatics 22, 567.
[4] Scalzitti, N., Kress, A., Orhand, R. et al. (2021). Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics 22, 561.
[5] Wang, D., Hwang, M., Jiang, W.C. et al. (2021). A deep learning method for counting white blood cells in bone marrow images. BMC Bioinformatics 22, 94.
[6] Hoogstrate, Y., Jenster, G.W. & van de Werken, H.J.G. (2021). FASTAFS: file system virtualisation of random access compressed FASTA files. BMC Bioinformatics 22, 535.