Title: DeBaser: An online tool for fast RNA-Seq data assembly and polymorphism discovery
Abstract:
The advent of Next Generation Sequencing (NGS) represented a major leap in the capacity to study the genomic basis of variation within and between species. Knowledge of this variation is important for understanding possible causes of phenotypic diversity and is also crucial for the accurate design of many molecular research tools, such as RNAi constructs. However, full-scale de novo assembly of NGS-generated transcriptomes can take many months, and raw data sets are accumulating faster than they can be processed into usable assemblies. To expedite the extraction of sequence information from raw NGS data, we have designed ‘DeBaser’, a web-based bioinformatic pipeline that aligns raw NGS reads to a reference CDS set to produce a full transcriptome in just 48-72 hours. Users run DeBaser by uploading their data and, if required, a reference genome. After assembly, the transcriptome is stored permanently on the website and can be retrieved in full, or the user can request individual transcripts by entering gene identifiers. Polymorphisms between assembled transcriptomes can also be determined by selecting multiple varieties in the web interface. Sequence information for each selected variety is retrieved by entering gene identifiers or supplying FASTA files, and multiple sequence alignment files showing polymorphisms between varieties are generated via MultAlin or MUSCLE. We also provide pre-assembled plant transcriptomes, which can be used alongside user-provided data. DeBaser is in the final stages of development and will be released in the second half of 2017.
Keywords: Next Generation Sequencing, polymorphism, multiple sequence alignment
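
Illustrative sketch: the polymorphism step described in the abstract amounts to comparing the same transcript across varieties once the sequences have been aligned. The Python below is a minimal sketch of that idea only and is not DeBaser's code. The function names (`parse_fasta`, `polymorphic_sites`), the variety and gene labels, and the example sequences are all hypothetical, and the input is assumed to be an already aligned FASTA of the kind a multiple sequence alignment tool would produce.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record identifier -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the identifier.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def polymorphic_sites(aligned):
    """Return (1-based position, {variety: base}) for each alignment
    column in which the varieties do not all share the same character."""
    if len({len(seq) for seq in aligned.values()}) > 1:
        raise ValueError("aligned sequences must have equal length")
    names = list(aligned)
    sites = []
    columns = zip(*(aligned[name] for name in names))
    for position, column in enumerate(columns, start=1):
        if len(set(column)) > 1:
            sites.append((position, dict(zip(names, column))))
    return sites


# Hypothetical aligned transcript from two varieties ('-' marks a gap).
ALIGNED_FASTA = """>variety_A gene_1
ATGGCTAAC-GT
>variety_B gene_1
ATGGCTTACCGT
"""

aligned = parse_fasta(ALIGNED_FASTA)
sites = polymorphic_sites(aligned)
for position, bases in sites:
    print(position, bases)
```

In this sketch a gap is treated as a difference, so both substitutions and indels are reported; a real pipeline would likely distinguish the two and account for read depth and base quality.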