Cover Image for System.Linq.Enumerable+EnumerablePartition`1[System.Char]

Hybrid Approach for Enhancing Performance of Genomic Data for Stream Matching

OAI: oai:igi-global.com:274063 DOI: 10.4018/IJCINI.20211001.oa38
Published by: IGI Global

Abstract

In gene expression analysis, the expression levels of thousands of genes are analyzed, such as separate stages of treatments or diseases. Identifying particular gene sequence pattern is a challenging task with respect to performance issues. The proposed solution addresses the performance issues in genomic stream matching by involving assembly and sequencing. Counting the k-mer based on k-input value and while performing DNA sequencing tasks, the researches need to concentrate on sequence matching. The proposed solution addresses performance issue metrics such as processing time for k-mer counting, number of operations for matching similarity, memory utilization while performing similarity search, and processing time for stream matching. By suggesting an improved algorithm, Revised Rabin Karp(RRK) for basic operation and also to achieve more efficiency, the proposed solution suggests a novel framework based on Hadoop MapReduce blended with Pig & Apache Tez. The measure of memory utilization and processing time proposed model proves its efficiency when compared to existing approaches.