A new assembler called SAUTE has been developed for sequence assembly based on target enrichment [1].
SAUTE stands for sequence assembly using target enrichment. It efficiently assembles repeat regions and reports multiple well-supported variants. Users provide target sequences to guide the assembly. The authors have developed two assemblers, namely SAUTE and SAUTE_PROT.
How does SAUTE work?
First, SAUTE reads input files in FASTA, FASTQ, or SRA format and trims the reads. It then builds two de Bruijn graphs, and finds and assembles subgraphs. These subgraphs are filtered by aligning reads and read pairs to the assembled graph. If the user opts in, SAUTE also assembles additional sequence beyond the ends. Finally, it reports the output in Graphical Fragment Assembly (GFA) format along with two FASTA files.
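To make the graph-building step more concrete, here is a minimal Python sketch of how a de Bruijn graph can be built from short reads. It is only an illustration of the general technique, not SAUTE's actual implementation: the function name `build_de_bruijn`, the `min_count` threshold, and the toy reads are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k, min_count=1):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it.

    Hypothetical helper for illustration; not part of SAUTE.
    Only k-mers seen at least `min_count` times contribute an edge.
    """
    # Count every k-mer across all reads (reads shorter than k add nothing).
    kmer_counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    graph = defaultdict(set)
    for kmer, count in kmer_counts.items():
        if count >= min_count:
            # Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


reads = ["ACGTAC", "CGTACG", "GTACGA"]  # toy reads, not real data
graph = build_de_bruijn(reads, k=4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy graph the node `ACG` has two successors, which is the kind of fork where an assembler has to decide whether it is looking at a real variant or a sequencing error. Raising `min_count` drops edges backed by few reads, a simple stand-in for the read-support filtering described above.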
On RNA-seq data, SAUTE_PROT outputs more coding sequences that translate to the benchmark proteins than TRINITY, RNASPADES, SPALIGNER, and SPADES.
SAUTE is currently available for Linux only and can be downloaded from GitHub.
For more information, see the paper [1].
References
[1] Souvorov, A., Agarwala, R. (2021). SAUTE: sequence assembly using target enrichment. BMC Bioinformatics 22, 375.