
Journal of Shandong University (Natural Science), 2019, Vol. 54, Issue (1): 103-115. doi: 10.6040/j.issn.1671-9352.3.2018.143


Dynamic level scheduling algorithm for cloud computing based on failure regularity-aware

QI Ping1,2, WANG Fu-cheng2, WANG Bi-qing1,2, LIANG Chang-yong1

  1. School of Management, Hefei University of Technology, Hefei 230039, Anhui, China;
  2. Department of Mathematics and Computer Science, Tongling University, Tongling 244000, Anhui, China
  • Published: 2019-01-23
  • About the author: QI Ping (b. 1981), male, Ph.D., associate professor; research interests: trusted computing and granular computing. E-mail: qiping929@gmail.com
  • Supported by:
    Key Program of the National Natural Science Foundation of China (71331002); Anhui Province Domestic and Overseas Visiting and Training Program for Outstanding Young Backbone Talents in Universities (gxfx2017113); Tongling University Talent Research Start-up Fund (2015tlxyrc08)

Abstract: Because cloud resources are dynamic, heterogeneous, and widely distributed, parallel tasks running in a cloud computing environment are vulnerable to resource node failures and communication link faults, and may therefore fail to complete. This paper addresses two problems: dynamically provisioned cloud resources have low reliability, and the parameters of the resource failure law change dynamically under the failure recovery mechanism. First, a two-parameter Weibull distribution is used to describe the local characteristics of the failure law of resource nodes and communication links in different time periods. Then, based on an analysis of the various interactions among parallel tasks, a resource reliability evaluation model based on variable-parameter failure rules is proposed. Finally, the model is incorporated into the DLS algorithm to obtain the reliable dynamic level scheduling algorithm CFR-DLS, so that the reliability of candidate resources is fully considered when the scheduling level is calculated. Simulation results show that, when appropriate failure recovery parameters are chosen, the proposed CFR-DLS algorithm greatly improves the reliability of cloud services while adding only a small amount of extra failure recovery overhead.
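The reliability evaluation and scheduling steps described in the abstract can be illustrated with a minimal sketch. It assumes the standard two-parameter Weibull survival function R(t) = exp(-(t/scale)^shape) and the classic DLS dynamic level (static level minus earliest start time, plus the difference between median and actual execution time). The function names, the sample node parameters, and in particular the reliability-weighted dynamic level are hypothetical illustrations; the abstract does not give the exact CFR-DLS formula, so this should not be read as the paper's method.

```python
import math


def weibull_reliability(t, shape, scale):
    """Two-parameter Weibull survival function R(t) = exp(-(t/scale)^shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0; shape and scale must be > 0")
    return math.exp(-((t / scale) ** shape))


def conditional_reliability(start, duration, shape, scale):
    """Probability that a resource alive at `start` survives `duration` longer.

    Equals R(start + duration) / R(start), computed in the exponent to avoid
    dividing by a survival value that has underflowed to zero.
    """
    if start < 0 or duration < 0 or shape <= 0 or scale <= 0:
        raise ValueError("times must be >= 0; shape and scale must be > 0")
    return math.exp((start / scale) ** shape
                    - ((start + duration) / scale) ** shape)


def reliable_dynamic_level(static_level, earliest_start, exec_time,
                           median_exec_time, reliability):
    """Hypothetical reliability-weighted dynamic level.

    Scaling the classic DLS level by the reliability is an assumption made
    for illustration only; it is meaningful when the level is positive.
    """
    level = static_level - earliest_start + (median_exec_time - exec_time)
    return reliability * level


# Hypothetical candidates: (name, Weibull shape, Weibull scale, exec time).
candidates = [
    ("node-a", 0.8, 500.0, 3.0),
    ("node-b", 1.5, 200.0, 2.0),
]
now = 50.0  # assumed current uptime of every node, in the same time unit


def score(candidate):
    """Score one candidate node for a task with fixed, made-up DLS inputs."""
    _, shape, scale, exec_time = candidate
    reliability = conditional_reliability(now, exec_time, shape, scale)
    return reliable_dynamic_level(10.0, 1.0, exec_time, 2.5, reliability)


best = max(candidates, key=score)
print(best[0], round(score(best), 4))
```

To reflect failure-law parameters that differ between time periods, the fixed `shape` and `scale` values could be replaced by a per-period lookup; that extension is omitted here to keep the sketch short.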

Key words: cloud computing, failure regularity, Weibull distribution, failure recovery mechanism, reliability estimation

CLC number: 

  • TP301.6