FEGS: A New Feature Extraction Model for Protein Sequence Analysis

Tariq Abdullah

Protein sequence analysis covers tasks such as protein similarity assessment, protein function prediction, and protein interaction prediction. A new feature extraction model has been developed to make such analyses easier.

This model is called FEGS (Feature Extraction based on Graphical and Statistical features) [1]. It combines a graphical representation of a protein sequence, built from the physicochemical properties of its amino acids, with statistical features of the sequence. Using both kinds of features, FEGS transforms a protein sequence into a 578-dimensional numerical vector.

How does FEGS work?

FEGS takes protein sequences as input. First, it builds 158 space curves for each protein sequence. Second, it computes the L/L matrix of each curve and takes its normalized maximum eigenvalue. Third, it calculates the frequencies of the 20 amino acids and the 400 dipeptides in the sequence, which gives a frequency vector for that sequence. Fourth, it assembles the feature vector of the sequence (158 + 20 + 400 = 578 values). The feature vectors can later be used for phylogenetic analysis.
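The statistical step (the frequencies of the 20 amino acids and the 400 dipeptides) can be illustrated with a short Python sketch. This is not the FEGS code itself: the function name `composition_features`, the ordering of the features, the normalization by sequence length, and the handling of non-standard residues are all assumptions made for illustration, and the FEGS release may differ on each point.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes) and all 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def composition_features(sequence):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies.

    Hypothetical helper: normalization and ordering are assumptions,
    not taken from the FEGS implementation.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two residues")

    aa_counts = dict.fromkeys(AMINO_ACIDS, 0)
    dp_counts = dict.fromkeys(DIPEPTIDES, 0)

    # Count single residues; non-standard symbols (X, B, ...) are skipped.
    for residue in seq:
        if residue in aa_counts:
            aa_counts[residue] += 1

    # Count overlapping adjacent pairs: a sequence of length n has n - 1 pairs.
    for i in range(n - 1):
        pair = seq[i:i + 2]
        if pair in dp_counts:
            dp_counts[pair] += 1

    aa_freq = [aa_counts[a] / n for a in AMINO_ACIDS]
    dp_freq = [dp_counts[d] / (n - 1) for d in DIPEPTIDES]
    return aa_freq + dp_freq


features = composition_features("MKVLAAGIVK")
print(len(features))  # 20 + 400 = 420 statistical features
```

Together with one value per space curve from the graphical step, a 420-value vector of this kind would account for the 578 dimensions mentioned above.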

FEGS is user-friendly software, freely downloadable from SourceForge at https://sourceforge.net/projects/transcriptomeassembly/files/Feature%20Extraction/. Its performance was tested on five protein sequence datasets, where it reportedly outperformed the existing methods it was compared against [1].

For more information, see the original paper [1].


References

  1. Mu, Z., Yu, T., Liu, X. et al. (2021). FEGS: a novel feature extraction model for protein sequences and its applications. BMC Bioinformatics 22, 297.

Tariq is the founder of Bioinformatics Review and CEO at IQL Technologies. His areas of expertise include algorithm design, phylogenetics, microarray analysis, plant systematics, and genome data analysis. If you have questions, reach out to him via his homepage.
