Trimmomatic is a read-trimming tool for Illumina NGS data [1]. It is flexible and offers several operations that can be applied to reads, including trailing, leading, and other quality-control steps. In this article, we perform trailing on paired-end NGS reads using the Galaxy platform [2].
Trailing means trimming low-quality bases from the 3′ end of each read (i.e., working from right to left). We start with an interleaved file of paired-end reads; let’s call it ‘input.fastq’.
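To make the idea of trailing concrete, here is a minimal Python sketch of the logic. It is not Trimmomatic's own code, only an illustration under two assumptions: qualities are Phred+33 encoded, and a base is removed only while its score is strictly below the threshold. The function name `trim_trailing` is made up for this example.

```python
def trim_trailing(seq: str, qual: str, threshold: int = 20, offset: int = 33):
    """Drop bases from the 3' end while their Phred score is below `threshold`.

    Assumes Phred+33 quality encoding (offset=33); illustrative only.
    """
    end = len(seq)
    # Walk from the last base toward the 5' end and stop at the first base
    # whose quality is at or above the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40, '5' encodes Q20, '#' encodes Q2 and '!' encodes Q0.
seq, qual = trim_trailing("ACGTACGT", "IIIII5#!")
print(seq, qual)  # ACGTAC IIIII5
```

Note that only the end of the read is examined: a low-quality base in the middle of a read is left untouched by this operation.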
Deinterlacing the input file
- Because this file is interleaved, we first have to deinterlace it. Click on ‘FASTQ de-interlacer’ in the Galaxy tools panel.
- Select the FASTQ dataset, i.e., your interleaved file containing paired-end reads.
- Click ‘Execute’.
This creates four files: one for the forward (left) reads, one for the reverse (right) reads, and two more for reads whose mate is missing (unpaired forward reads and unpaired reverse reads).
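The splitting into four outputs can be sketched in Python as follows. This is not the Galaxy tool's implementation, only a small illustration. It assumes well-formed four-line FASTQ records and read names ending in `/1` (forward) or `/2` (reverse). The helper names `read_fastq`, `mate_key`, and `deinterlace` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, quality


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Group lines into 4-line FASTQ records (assumes well-formed input)."""
    buf: List[str] = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []


def mate_key(header: str) -> Tuple[str, str]:
    """Split '@read1/1' into ('read1', '1'); assumes the /1 and /2 convention."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2], name[-1]
    raise ValueError(f"this sketch only handles /1 and /2 read names: {header}")


def deinterlace(records: Iterable[Record]):
    """Return (left mates, right mates, left singles, right singles)."""
    left: Dict[str, Record] = {}
    right: Dict[str, Record] = {}
    for rec in records:
        base, mate = mate_key(rec[0])
        (left if mate == "1" else right)[base] = rec
    # A read is a "mate" only if its partner is present in the other set.
    left_mates = [left[k] for k in left if k in right]
    right_mates = [right[k] for k in left if k in right]
    left_singles = [left[k] for k in left if k not in right]
    right_singles = [right[k] for k in right if k not in left]
    return left_mates, right_mates, left_singles, right_singles


example = [
    "@r1/1", "ACGT", "+", "IIII",
    "@r1/2", "TGCA", "+", "IIII",
    "@r2/1", "GGGG", "+", "IIII",  # forward read without a mate
    "@r3/2", "CCCC", "+", "IIII",  # reverse read without a mate
]
lm, rm, ls, rs = deinterlace(read_fastq(example))
print(len(lm), len(rm), len(ls), len(rs))  # 1 1 1 1
```

The two mate lists come out in the same order, which matters because paired-end tools expect the n-th record of the forward file to correspond to the n-th record of the reverse file.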
Running Trimmomatic
After deinterlacing the file, we perform trailing on the first two output files, i.e., the paired forward and reverse reads.
- Select ‘Trimmomatic’.
- Select ‘Paired-end (two separate input files)’ from the dropdown.
- For the R1 input file, select the forward-read output from the previous step, namely ‘FASTQ de-interlaced left mates from data2’, and then for the R2 input file, select the reverse-read output, namely ‘FASTQ de-interlaced right mates from data2’, from the next dropdown menu.
- Now, select the trimming operation. We need to trim from the end of the reads, so select ‘Cut bases off the end of a read, if below a threshold quality (TRAILING)’.
- A commonly used minimum quality for keeping a base is a Phred score of 20, which corresponds to roughly a 1% chance that the base call is wrong. So, enter 20 in the minimum quality text box.
- Click ‘Execute’. If the de-interlacer job has already finished, trailing starts right away; otherwise, the Trimmomatic job waits until the de-interlacer job completes.
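For readers who want to relate these steps to a command-line run, the sketch below builds a roughly equivalent Trimmomatic paired-end command in Python and shows what a Phred score of 20 means numerically. Several things here are assumptions rather than part of the Galaxy workflow: the jar path `trimmomatic.jar`, all input and output file names, and the `-phred33` flag (the encoding of your data may differ). The command is only printed, not executed.

```python
from typing import List


def phred_error_probability(q: float) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def trimmomatic_pe_trailing(jar: str, r1: str, r2: str, prefix: str,
                            min_quality: int = 20) -> List[str]:
    """Build a paired-end Trimmomatic command that applies only TRAILING.

    Paired-end mode takes two inputs and writes four outputs: paired and
    unpaired reads for each of R1 and R2. Output names are hypothetical.
    """
    return [
        "java", "-jar", jar, "PE", "-phred33",  # -phred33 is an assumption
        r1, r2,
        f"{prefix}_R1_paired.fastq", f"{prefix}_R1_unpaired.fastq",
        f"{prefix}_R2_paired.fastq", f"{prefix}_R2_unpaired.fastq",
        f"TRAILING:{min_quality}",
    ]


cmd = trimmomatic_pe_trailing(
    "trimmomatic.jar",    # hypothetical path to the Trimmomatic jar
    "left_mates.fastq",   # hypothetical name for the forward reads
    "right_mates.fastq",  # hypothetical name for the reverse reads
    "trimmed",
)
print(" ".join(cmd))
print(phred_error_probability(20))  # about 0.01, i.e. a 1% error chance
```

To actually run the command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where Java and Trimmomatic are installed.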
References
- [1] Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114-2120.
- [2] Galaxy. https://usegalaxy.org/