Article Abstract
International Journal of Advance Research in Multidisciplinary, 2023;1(1):738-743
Routine examination of MapReduce-based Apriori algorithm on Hadoop cluster: Performance analysis and optimization
Authors: Shweta Mittal and Dr. Prerna Sidana
Abstract
This paper examines the MapReduce-based Apriori algorithm implemented on a Hadoop cluster, focusing on its performance and on strategies for optimizing it. The Apriori algorithm is a foundational tool in frequent itemset mining, commonly used for market basket analysis and association rule learning. On large datasets its computational cost becomes a critical issue, because the algorithm rescans the data for every candidate itemset size; this has driven the adoption of distributed frameworks such as Hadoop and MapReduce. This study evaluates the performance of the Apriori algorithm on a small Hadoop cluster, identifies bottlenecks, and proposes optimization strategies. Through a series of experiments, we analyze the algorithm’s execution time, resource utilization, and scalability. The findings indicate that while MapReduce enhances the algorithm’s capability to handle large datasets, significant improvements can be achieved by optimizing data partitioning, load balancing, and resource allocation. The paper concludes with recommendations for further research and potential improvements in implementing the Apriori algorithm in distributed computing environments.
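As an illustration of the general approach the abstract describes, the following is a minimal single-process sketch of how Apriori's support counting can be phrased as map and reduce steps. It is not the authors' implementation and does not use Hadoop: the function names (`map_phase`, `reduce_phase`, `generate_candidates`, `apriori_mapreduce`), the in-memory list of partitions standing in for input splits, and the brute-force candidate generation are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(partition, candidates):
    """Mapper (hypothetical): count candidate itemsets within one data split."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate is contained in the transaction
                counts[cand] += 1
    return counts


def reduce_phase(partial_counts, min_support):
    """Reducer (hypothetical): sum per-split counts, keep itemsets meeting min_support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning).

    Brute force over item combinations; simpler than the classic join step.
    """
    items = sorted({item for itemset in frequent for item in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def apriori_mapreduce(partitions, min_support):
    """Run one map/reduce round per itemset size k, as one job per pass would."""
    all_items = sorted({i for part in partitions for t in part for i in t})
    candidates = [frozenset([i]) for i in all_items]
    result = {}
    k = 1
    while candidates:
        partials = [map_phase(part, candidates) for part in partitions]
        frequent = reduce_phase(partials, min_support)
        if not frequent:
            break
        result.update(frequent)
        k += 1
        candidates = generate_candidates(frequent, k)
    return result


if __name__ == "__main__":
    # Two toy "splits" of a transaction database (made-up data).
    splits = [
        [["bread", "milk"], ["bread", "butter"]],
        [["bread", "milk", "butter"], ["milk"]],
    ]
    for itemset, support in apriori_mapreduce(splits, min_support=2).items():
        print(sorted(itemset), support)
```

In this sketch every pass rescans all partitions, which is the cost that makes data partitioning and load balancing relevant: a split with many long transactions takes longer in `map_phase` than the others and delays the reduce step for that pass.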
Keywords
Routine, MapReduce-based, Apriori algorithm, Hadoop