Publication

NGS4Cloud: Cloud-based NGS Data Processing

File: INForum_2016_43.pdf (128.93 KB, Adobe PDF)

Abstract(s)

Motivation and challenges: Next-Generation Sequencing (NGS) technologies are greatly increasing the amount of genomic data, revolutionizing the biosciences and driving the development of more complex NGS Data Analysis techniques [2]. These techniques, known as pipelines or workflows, consist of running and refining a series of intertwined computational analysis and visualization tasks on large amounts of data. Pipelines use multiple software tools and data resources in a staged fashion, with the output of one tool passed as input to the next. To simplify the design and execution of biomedical workflows by end users, a number of scientific workflow systems have been developed over the past decade. Examples include Galaxy [1] and Swift [3]. However, most of these systems cannot be easily deployed and are usually available only to users with access to specialized IT support. There are two main issues to address in the design of an execution environment for these pipelines. First, because configuring and parametrizing pipelines is complex, applying NGS Data Analysis techniques is difficult for a user without IT knowledge. Second, because input data can reach terabytes or even petabytes, pipeline execution generally requires a large amount of computational resources.

Keywords

NGS Data Analysis techniques Pipelines

Citation

FORJA, João; [et al.] – NGS4Cloud: Cloud-based NGS Data Processing. In Inforum 2016. Lisboa, Portugal: Inforum, 2016. Pp. 1-2.

Publisher

Inforum 2016
