Bioinformatics Review

Bioinformatics data mining: an introduction

Dr. Muniba Faiza
Last updated: May 20, 2020 5:48 pm

Bioinformaticians handle large amounts of data, often gigabytes and sometimes terabytes, so it is important not only to store such massive data but also to make sense of it. In this article, I will explain what data mining is and how bioinformaticians can benefit from it.

What is data mining?

Data mining is the process of discovering new patterns, information, or understandable models from a huge amount of data that already exists. It is sometimes also referred to as “Knowledge Discovery in Databases” (KDD). It has been successfully applied in bioinformatics, a data-rich field with tasks such as gene expression analysis, protein modeling, and drug discovery. The development of novel data mining methods provides a useful way to understand rapidly expanding biological data. Let’s first discuss the basic concepts of data mining and then move on to its applications in bioinformatics. I will also discuss some data mining tools in upcoming articles.

As defined above, data mining is the process of automatically extracting information from existing data. Its two major goals are “prediction” and “description”. The main tasks that can be performed with it are as follows:

  • Classification: Learning a function that maps (classifies) an input data item into one of several predefined classes.
  • Estimation: Assigning a continuous numeric value to an input data item.
  • Prediction: Similar to classification and estimation, except that the data are classified on the basis of some future behavior or an estimated future value.
  • Association rules: Also known as dependency modeling, this task determines which data items are associated with each other and what the outcomes may be.
  • Clustering: Separating the population into subgroups, or clusters, of similar items.
  • Description & Visualization: Representing the data with the help of visualization techniques and tools.
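To make the association rules task more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the gene names, the sample sets, and the helper functions `support`, `confidence`, and `pair_rules` are all hypothetical, and the thresholds are arbitrary. It computes the two standard measures for a rule "X implies Y": support (how often X and Y occur together) and confidence (how often Y occurs when X does).

```python
from itertools import combinations

# Hypothetical toy data: each set lists genes found over-expressed in one sample.
samples = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB"},
    {"geneA", "geneC"},
    {"geneB", "geneC"},
    {"geneA", "geneB", "geneC"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """List single-item rules (lhs -> rhs) that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, c in pair_rules(samples):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Real association rule mining uses algorithms such as Apriori to avoid enumerating every item combination; the brute-force loop above is only practical for very small item sets.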

Learning in data mining falls into two main categories:

Directed (supervised) learning and undirected (unsupervised) learning.

Classification, estimation, and prediction fall under supervised learning, while the remaining three tasks (association rules, clustering, and description & visualization) fall under unsupervised learning. In supervised learning, relationships are established between the input variables and a known target; in unsupervised learning, patterns are identified in the data without a predefined target.
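The difference between the two categories can be sketched with a small example in plain Python. Everything here is an assumption made for illustration: the expression values, the "healthy" and "disease" labels, and the function names are hypothetical. The supervised part uses the known labels to build a nearest-mean classifier, while the unsupervised part groups the same values with a simple one-dimensional k-means and never sees the labels.

```python
from statistics import mean

# Hypothetical toy data: expression level of one gene in labelled samples.
labelled = [
    (1.0, "healthy"), (1.2, "healthy"), (0.9, "healthy"),
    (3.1, "disease"), (2.9, "disease"), (3.3, "disease"),
]


def train_centroids(data):
    """Supervised: use the known labels to compute one mean per class."""
    by_class = {}
    for value, label in data:
        by_class.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_class.items()}


def classify(value, centroids):
    """Assign a new value to the class whose mean is nearest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def kmeans_1d(values, k=2, iterations=10):
    """Unsupervised: group unlabelled values into k clusters."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted values.
    step = max(k - 1, 1)
    centres = [ordered[(i * (len(ordered) - 1)) // step] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters


if __name__ == "__main__":
    centroids = train_centroids(labelled)
    print(classify(2.8, centroids))           # label predicted from the trained means
    unlabelled = [value for value, _ in labelled]
    print(kmeans_1d(unlabelled, k=2))         # groups found without any labels
```

Note that the clusters returned by `kmeans_1d` carry no names: deciding whether a cluster corresponds to a biological condition is left to the analyst, which is the practical difference from the supervised case.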

Data mining has proved to be very effective and useful in bioinformatics, in areas such as microarray analysis, gene finding, domain identification, protein function prediction, disease identification, and drug discovery.

For follow-up questions, please write to muniba@bioinformaticsreview.com.

 


Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug designing. When she is not reading, she enjoys spending time with her family.