

How to define features/predictors of user dataset for SOM analysis in MATLAB?




Most examples that demonstrate SOM (self-organizing map) analysis use the default Iris dataset. The MATLAB SOM example also mentions that, for a user dataset, the predictors need to be defined [1]. However, it is not quite clear how to define the features or predictors of your own dataset. In this article, we will learn how to define predictors for user datasets.

In order to define a problem, you specify predictors like this:
predictors = [7 0 6 2 6 5 6 1 0 1; 6 2 5 0 7 5 5 1 2 2]

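Here, each row contains 10 values, so the matrix has 2 rows and 10 columns. A semicolon separates the rows. MATLAB treats each row as one feature (predictor) and each column as one sample, so this matrix describes 10 samples with 2 features each.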
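A quick way to confirm the shape is to ask MATLAB for the matrix dimensions. The sketch below reuses the matrix from the example; the variable names numRows and numCols are only illustrative.

```matlab
% Same matrix as in the example: two rows separated by a semicolon.
predictors = [7 0 6 2 6 5 6 1 0 1; 6 2 5 0 7 5 5 1 2 2];

% size returns the number of rows first, then the number of columns.
[numRows, numCols] = size(predictors);
fprintf('%d rows, %d columns\n', numRows, numCols);
```

This should report 2 rows and 10 columns.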

So, for example, if your dataset is an Excel sheet where each feature occupies a column, write the values of each feature as one row here and separate the rows with semicolons. Every row must contain the same number of values: if the first row has 10 values, each following row must also have 10.
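If the spreadsheet is already available in MATLAB as a numeric matrix with one sample per row and one feature per column, transposing it gives the same layout without retyping the values. This is a minimal sketch, assuming the sheet holds the same numbers as the earlier example. The variable sheet is hypothetical and stands in for data you might import from your own file, for instance with readmatrix.

```matlab
% Hypothetical spreadsheet contents: 10 samples (rows) x 2 features (columns).
% With a real file, this matrix would come from an import step instead.
sheet = [7 6; 0 2; 6 5; 2 0; 6 7; 5 5; 6 5; 1 1; 0 2; 1 2];

% Transpose so that each feature becomes a row and each sample a column.
predictors = sheet.';

disp(size(predictors));   % number of rows, then number of columns
```

After the transpose, predictors is the same 2 x 10 matrix as the one typed by hand earlier.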

Let’s take another example. If we have a 3 x 5 dataset, the predictors will look like this:

predictors = [3 0 6 0 1; 5 5 1 2 2; 4 8 3 1 9]

As you can see, there are 5 values in each row.
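Once the predictors are defined, they can be passed to a self-organizing map. The following is only a rough sketch, assuming the Deep Learning Toolbox is installed. The 2-by-2 map size and the epoch count are arbitrary choices for illustration, not recommendations, and a real analysis would normally use far more than 5 samples.

```matlab
% 3 x 5 matrix from the example: each column is treated as one input vector.
predictors = [3 0 6 0 1; 5 5 1 2 2; 4 8 3 1 9];

% Assumption: a small 2-by-2 map; choose dimensions that suit your data.
net = selforgmap([2 2]);
net.trainParam.showWindow = false;  % do not open the training window
net.trainParam.epochs = 50;         % illustrative value only

% Train the map on the predictors (unsupervised, so no targets are needed).
net = train(net, predictors);

% One output column per input column; vec2ind gives the winning neuron index.
outputs = net(predictors);
classes = vec2ind(outputs);
disp(classes);
```

The vector classes has one entry per column of predictors, giving the map neuron each sample was assigned to. The exact assignments depend on the training run.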



Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.