Journal of Shandong University (Natural Science), 2015, Vol. 50, Issue 07: 66-70. doi: 10.6040/j.issn.1671-9352.0.2014.461


Optimization of the MapReduce algorithm based on a hash table

LI Rui-xia, LIU Ren-jin, ZHOU Xian-cun   

  1. School of Information Engineering, West Anhui University, Lu'an 237012, Anhui, China
  • Received: 2014-10-20; Online: 2015-07-20; Published: 2015-07-31

Abstract: Distributed parallel computing is commonly used to improve computing performance. However, because requirements differ from one application to another, there is no uniform way to design and implement parallel programs, and parallel programming depends largely on the developer's experience. MapReduce, a distributed parallel programming model proposed by Google, supports the development and execution of a particular class of parallel programs. In this work, MapReduce is optimized with a hash table, which reduces the fragments produced by the Map function, makes redundant steps such as the Combiner function unnecessary, lowers the transmission load, and improves computing efficiency. At the same time, the properties of the Map and Reduce functions are preserved, so that MapReduce remains parallel.
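
The idea summarized in the abstract can be illustrated with a small word-count sketch. This is not the authors' implementation, which the abstract does not show; it assumes that the hash-table optimization resembles aggregating counts inside the mapper before anything is emitted. All function names (`map_plain`, `map_with_hash_table`, `shuffle`, `reduce_counts`, `run_job`) are hypothetical, and the shuffle step is simulated in a single process rather than run on Hadoop.

```python
from collections import defaultdict


def map_plain(chunk):
    """Baseline mapper: emit one (word, 1) pair per token."""
    return [(word, 1) for word in chunk.split()]


def map_with_hash_table(chunk):
    """Assumed optimization: aggregate counts in a local hash table,
    then emit one (word, count) pair per distinct word."""
    table = defaultdict(int)
    for word in chunk.split():
        table[word] += 1
    return list(table.items())


def shuffle(mapped_outputs):
    """Group every emitted value by key (simulated in one process)."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reducer: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(chunks, mapper):
    """Run map, shuffle, and reduce; also report how many pairs were emitted."""
    mapped = [mapper(chunk) for chunk in chunks]
    emitted = sum(len(pairs) for pairs in mapped)
    return reduce_counts(shuffle(mapped)), emitted


chunks = ["a b a c a", "b b c a"]
plain_result, plain_emitted = run_job(chunks, map_plain)
hashed_result, hashed_emitted = run_job(chunks, map_with_hash_table)
print(plain_result == hashed_result, plain_emitted, hashed_emitted)
```

In this toy run both mappers lead to the same final counts, but the hash-table mapper emits fewer intermediate pairs, so a separate Combiner pass would have nothing left to merge. Each input chunk is still mapped independently, which is why the job stays parallel.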

Key words: distributed, MapReduce, Map function, Hash table, parallel, Hadoop

CLC Number: TP311