MOCCA: A New Suite for Modelling Motif Occurrence Combinatorics in cis-regulatory Elements

Tariq Abdullah

cis-regulatory elements are DNA sequence segments that regulate gene expression. They include regions such as promoters and enhancers, which in turn contain specific sequence motifs.

A newly expanded suite has been developed to model motif occurrence combinatorics in DNA sequences [1]. The suite, called MOCCA (Motif Occurrence Combinatorics Algorithms), supports models based on support vector machines (SVM) and random forests (RF): SVM-MOCCA and RF-MOCCA, respectively.

Users supply motifs, training sequences, and model specifications. Sequences can be provided in FASTA format or generated by either an i.i.d. model or an N-th order Markov chain. Motifs can be given as IUPAC nucleotide codes or as Position Weight Matrices (PWMs).
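To make these inputs concrete, here is a minimal Python sketch of two of the ideas above: counting occurrences of an IUPAC motif in a sequence, and generating background sequences from an i.i.d. model or an N-th order Markov chain. It is an illustration only and not MOCCA's implementation or interface. The function names (count_occurrences, iid_sequence, markov_sequence) are hypothetical, matching is done on the forward strand only, and the fallback to uniform sampling for unseen contexts is an assumption made for this example.

```python
import random
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif into a compiled regular expression."""
    parts = []
    for code in motif.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    # The lookahead lets overlapping occurrences be counted separately.
    return re.compile("(?=(" + "".join(parts) + "))")


def count_occurrences(motif, sequence):
    """Count forward-strand (possibly overlapping) matches of a motif."""
    return sum(1 for _ in iupac_to_regex(motif).finditer(sequence.upper()))


def iid_sequence(length, probs, rng):
    """Draw each base independently from fixed base probabilities."""
    bases = list(probs)
    weights = [probs[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


def markov_sequence(length, order, training, rng):
    """Generate a sequence from an order-N Markov chain.

    Transition counts are estimated from `training`, which is assumed
    to be longer than `order`.
    """
    counts = {}
    for i in range(len(training) - order):
        context = training[i:i + order]
        nxt = training[i + order]
        counts.setdefault(context, {}).setdefault(nxt, 0)
        counts[context][nxt] += 1

    # Seed the chain with a context that occurs in the training data.
    start = rng.randrange(len(training) - order)
    seq = list(training[start:start + order])
    while len(seq) < length:
        context = "".join(seq[len(seq) - order:])
        options = counts.get(context)
        if not options:
            # Assumption for this sketch: unseen context, sample uniformly.
            seq.append(rng.choice("ACGT"))
            continue
        bases = list(options)
        weights = [options[b] for b in bases]
        seq.append(rng.choices(bases, weights=weights)[0])
    return "".join(seq[:length])


if __name__ == "__main__":
    rng = random.Random(0)
    background = iid_sequence(200, {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}, rng)
    print(count_occurrences("RCGY", background))
```

Occurrence counts like these, computed for several motifs over a sequence window, are the kind of numeric features a classifier such as an SVM or a random forest can be trained on.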

MOCCA implements the following models, including a new hierarchical model, RF-MOCCA:

  • CPREdictor
  • Dummy PREdictor
  • SVM-MOCCA
  • RF-MOCCA

MOCCA is implemented in C++ with a minimal number of dependencies to ease the process of installation. It can be installed on Unix-based systems.

MOCCA is freely available to download from GitHub.

For more details, see the original publication [1].


References

  1. Bredesen, B.A., Rehmsmeier, M. (2021). MOCCA: a flexible suite for modelling DNA sequence motif occurrence combinatorics. BMC Bioinformatics 22, 234.