Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/694
Title: DNA Short Read Alignment on Parallel Platforms
Authors: Maryam Abdulrahman Al-Jame 
Supervisor: Prof. Imtiaz Ahmad
Keywords: DNA : Parallel
Issue Date: 2018
Publisher: Kuwait University - College of Graduate Studies
Abstract: Advances in technology have created many challenges by generating massive amounts of data. Biological data in particular has grown exponentially in recent years, introducing several computational challenges. DNA short read alignment is an important problem in bioinformatics, and the exponential growth in the number of short reads has increased the need for a platform that can accelerate the alignment process. The aim of this thesis is to explore the capabilities of two parallel platforms for accelerating DNA short read alignment: the Micron Automata Processor (AP) and Apache Spark. The AP is a new DRAM technology that natively implements a large number of Non-deterministic Finite Automata (NFAs) in hardware. It exploits parallelism by executing the NFAs concurrently over a single input data stream. Apache Spark, on the other hand, is a cluster-computing framework that provides data parallelism and fault tolerance. In this thesis, we propose two algorithms to accelerate DNA short read alignment. The first algorithm uses the AP; the second, called Spark-DNAligning, is based on Apache Spark. Spark-DNAligning exploits Apache Spark’s performance optimizations, such as join after partitioning, caching, and in-memory computation. Spark-DNAligning is evaluated in terms of performance by comparing it with CloudBurst, a MapReduce-based algorithm. All experiments are conducted on Amazon Web Services (AWS). The results demonstrate that Spark-DNAligning outperforms CloudBurst, providing a speedup in the range of 160 to 863 when aligning gigabytes of short reads to the human genome. The empirical evaluation indicates that parallel platforms are promising solutions to the DNA short read alignment problem.
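Illustrative note: the idea of running many NFAs concurrently over a single input stream can be sketched in software. The Python below is a minimal sketch and is not taken from the thesis. It assumes alignment means matching each read against the reference with at most a fixed number of substitutions (Hamming distance), and the name `align_reads` and its parameters are hypothetical. Each read gets its own small automaton whose states are (position in read, mismatches so far), and every automaton consumes the same reference symbol at each step. A software loop visits the automata one after another, whereas the AP is described as evaluating them in hardware at the same time.

```python
from typing import List, Set, Tuple


def align_reads(reference: str, reads: List[str],
                max_mismatches: int) -> List[List[Tuple[int, int]]]:
    """Simulate one Hamming-distance NFA per read over a single reference stream.

    Returns, for each read (in input order), a list of
    (start_offset, mismatches) pairs for every reference window that matches
    the read with at most `max_mismatches` substitutions.
    Assumes every read is non-empty.
    """
    # Active NFA states per read: (next position in read, mismatches so far).
    active: List[Set[Tuple[int, int]]] = [set() for _ in reads]
    hits: List[List[Tuple[int, int]]] = [[] for _ in reads]

    # The reference is streamed once; every automaton sees each symbol.
    for i, symbol in enumerate(reference):
        for r, read in enumerate(reads):
            next_states: Set[Tuple[int, int]] = set()
            # A new match attempt may begin at every input symbol.
            for pos, mismatches in active[r] | {(0, 0)}:
                mismatches += symbol != read[pos]
                if mismatches > max_mismatches:
                    continue  # this path of the NFA dies
                if pos + 1 == len(read):
                    # Accepting state: the whole read has been consumed.
                    hits[r].append((i - len(read) + 1, mismatches))
                else:
                    next_states.add((pos + 1, mismatches))
            active[r] = next_states
    return hits
```

For example, `align_reads("ACGTACGTTACG", ["ACGT", "TTAC"], 1)` reports the read `ACGT` at offsets 0 and 4 with no mismatches, and the read `TTAC` at offset 2 with one mismatch and at offset 7 with none.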
URI: http://hdl.handle.net/123456789/694
Appears in Programs: 0612 Computer Engineering

Files in This Item:
File        Description  Size     Format
Thesis.pdf               4,44 MB  Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.