
Inventi Impact - Cloud Computing

Articles

  • Inventi:ecc/77/14
    AN IMPROVED TASK ASSIGNMENT SCHEME FOR HADOOP RUNNING IN THE CLOUDS
    Wei Dai, Mostafa Bassiouni

    Data-intensive problems are now so prevalent that organizations in many industries face them in their business operations. It is often crucial for enterprises to be able to analyze large volumes of data effectively and in a timely manner. MapReduce and its open-source implementation, Hadoop, dramatically simplified the development of parallel data-intensive computing applications for ordinary users, and the combination of Hadoop and cloud computing made large-scale parallel data-intensive computing far more accessible to potential users than before. Although Hadoop has become the most popular data management framework for parallel data-intensive computing in the clouds, the Hadoop scheduler is not a perfect match for cloud environments. In this paper, we discuss the issues with the Hadoop task assignment scheme and present an improved scheme for heterogeneous computing environments, such as public clouds. The proposed scheme is based on an optimal minimum makespan algorithm. It projects and compares the completion times of the next data block of every task slot, and explicitly strives to shorten the completion time of the map phase of MapReduce jobs. We conducted extensive simulations to evaluate the performance of the proposed scheme against the Hadoop scheme in two types of heterogeneous computing environments that are typical of public cloud platforms. The simulation results showed that the proposed scheme could remarkably reduce the map phase completion time, and that it could reduce the amount of remote processing to an even greater extent, which makes data processing less vulnerable to both network congestion and disk contention.

    How to Cite this Article
    CC Compliant Citation: Dai and Bassiouni: An improved task assignment scheme for Hadoop running in the clouds. Journal of Cloud Computing: Advances, Systems and Applications 2013 2:23, doi:10.1186/2192-113X-2-23. © Dai and Bassiouni; licensee Springer. 2013. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
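
The abstract above describes a scheme that projects the completion time of each task slot's next data block and compares them. The following is only a rough illustration of that idea, not the authors' algorithm: it is a simple greedy loop, and the function name `assign_map_tasks`, the per-slot cost tables, and the rule that a slot prefers a local block are all assumptions made for this sketch.

```python
def assign_map_tasks(blocks_on_slot, local_cost, remote_cost):
    """Greedy sketch: give the next block to the slot projected to finish it first.

    blocks_on_slot: dict mapping slot name -> set of block ids stored locally
    local_cost:     dict mapping slot name -> time to process one local block
    remote_cost:    dict mapping slot name -> time to process one remote block
    All three inputs are hypothetical; real Hadoop exposes none of them this way.
    """
    remaining = set().union(*blocks_on_slot.values())
    busy_until = {slot: 0.0 for slot in blocks_on_slot}
    schedule = []

    while remaining:
        best = None
        for slot in sorted(blocks_on_slot):
            # A slot's "next block" is a local one if any is left,
            # otherwise a remote block at the higher remote cost.
            local = sorted(blocks_on_slot[slot] & remaining)
            if local:
                block, cost = local[0], local_cost[slot]
            else:
                block, cost = min(remaining), remote_cost[slot]
            finish = busy_until[slot] + cost
            # Keep the slot with the earliest projected completion time.
            if best is None or finish < best[0]:
                best = (finish, slot, block)

        finish, slot, block = best
        busy_until[slot] = finish
        remaining.discard(block)
        schedule.append((slot, block, finish))

    # The map phase ends when the last slot finishes (the makespan).
    return schedule, max(busy_until.values())


schedule, makespan = assign_map_tasks(
    blocks_on_slot={"fast": {0, 1}, "slow": {2, 3}},
    local_cost={"fast": 1.0, "slow": 4.0},
    remote_cost={"fast": 1.5, "slow": 6.0},
)
print(schedule)
print(makespan)
```

In this toy input the fast slot takes one block remotely because doing so still finishes earlier than waiting for the slow slot, which is the kind of trade-off between locality and completion time that a heterogeneous environment forces.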