HScheduler: an optimal approach to minimize the makespan of multiple MapReduce jobs
Tian, Wenhong (1,3,4); Li, Guozhong (1); Yang, Wutong (1); Buyya, Rajkumar (2)
2016-06-01
Abstract: Large-scale MapReduce clusters that routinely process big data pose challenges for cloud computing. One of the key challenges is to reduce the response time of these MapReduce clusters by minimizing their makespans. The order in which jobs are executed can have a significant impact on their overall makespan and on resource utilization. In this work, we consider a scheduling model for multiple MapReduce jobs. The goal is to design a job scheduler that minimizes the makespan of such a set of MapReduce jobs. We exploit the classical Johnson model and propose a novel framework, HScheduler, which combines features of both the classical Johnson's algorithm and MapReduce to minimize the makespan for both offline and online jobs. Our Offline HScheduler reaches the theoretical lower bound (optimum), and Online HScheduler is 2-competitive, which is the best-known constant ratio for minimizing the makespan. Through extensive tests on real data, we find that HScheduler outperforms the best-known approach by 10.6-11.7% on average for offline scheduling and by 8-10% on average for online scheduling. HScheduler can be applied to improve response time, throughput, and energy efficiency in cloud computing.
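The abstract builds on the classical Johnson's algorithm for two-stage flow shops. As a rough illustration only, the sketch below applies the textbook Johnson's rule to jobs modeled as a map stage followed by a reduce stage. It is not the HScheduler implementation, which this record does not describe. The job tuples, the function names `johnson_order` and `makespan`, and the assumption that each stage runs one job at a time are all hypothetical simplifications.

```python
from typing import List, Tuple

# Hypothetical job representation: (name, map_time, reduce_time).
Job = Tuple[str, float, float]


def johnson_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by the classical Johnson's rule for a two-stage flow shop."""
    # Jobs whose map stage is no longer than their reduce stage run first,
    # sorted by increasing map time.
    first = sorted((j for j in jobs if j[1] <= j[2]), key=lambda j: j[1])
    # The remaining jobs run last, sorted by decreasing reduce time.
    last = sorted((j for j in jobs if j[1] > j[2]), key=lambda j: j[2], reverse=True)
    return first + last


def makespan(order: List[Job]) -> float:
    """Completion time of the last reduce stage, assuming each stage
    processes one job at a time and a reduce starts only after its own map."""
    map_done = 0.0
    reduce_done = 0.0
    for _, map_time, reduce_time in order:
        map_done += map_time
        reduce_done = max(reduce_done, map_done) + reduce_time
    return reduce_done


if __name__ == "__main__":
    # Made-up durations, purely for illustration.
    jobs = [("A", 4, 5), ("B", 4, 1), ("C", 30, 4), ("D", 6, 30), ("E", 2, 3)]
    print("submission order makespan:", makespan(jobs))
    ordered = johnson_order(jobs)
    print("Johnson order:", [name for name, _, _ in ordered])
    print("Johnson order makespan:", makespan(ordered))
```

In this simplified model the ordering alone changes the makespan, which is the effect the abstract refers to. Real clusters run map and reduce tasks in parallel across slots, so the model above should be read as an approximation.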
Keywords: Hadoop MapReduce; Batch workloads; Optimized schedule; Minimized makespan
DOI: 10.1007/s11227-016-1737-4
Journal: JOURNAL OF SUPERCOMPUTING
ISSN: 0920-8542
Volume: 72; Issue: 6; Pages: 2376-2393
Corresponding author: Tian, WH (reprint author), Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China; Tian, WH (reprint author), Univ Elect Sci & Technol China, Big Data Res Ctr, Chengdu 610054, Peoples R China; Tian, WH (reprint author), Chinese Acad Sci, Chongqing Inst Green & Intelligent Technol, Chongqing 400714, Peoples R China.
Indexed in: SCI
WOS accession number: WOS:000376650000015
Language: English