Hybrid adaptive checkpointing for virtual machine fault tolerance
2018 (English). In: Proceedings - 2018 IEEE International Conference on Cloud Engineering, IC2E 2018, Institute of Electrical and Electronics Engineers Inc., 2018, p. 12-22. Conference paper, Published paper (Refereed)
Abstract [en]
Active Virtual Machine (VM) replication is an application-independent and cost-efficient mechanism for high availability and fault tolerance, with several recently proposed implementations based on checkpointing. However, these methods may suffer from large impacts on application latency, excessive resource usage overheads, and/or unpredictable behavior for varying workloads. To address these problems, we propose a hybrid approach that uses a Proportional-Integral (PI) controller to dynamically switch between periodic and on-demand checkpointing. Our mechanism automatically selects the method that minimizes application downtime by adapting itself to changes in workload characteristics. The implementation is based on modifications to QEMU, LibVirt, and OpenStack, to seamlessly provide fault-tolerant VM provisioning and to enable the controller to dynamically select the best checkpointing mode. Our evaluation is based on experiments with a video streaming application, an e-commerce benchmark, and a software development tool. The experiments demonstrate that our adaptive hybrid approach improves both application availability and resource usage compared to static selection of a checkpointing method, with application performance gains and negligible overheads.
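The abstract does not give the controller's input signal, gains, or switching rule, so the following is only a minimal sketch of how a PI controller could choose between the two checkpointing modes. The class name `PIModeSelector`, the gain values, the setpoint, the threshold, and the normalized workload signal passed to `update` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

PERIODIC = "periodic"
ON_DEMAND = "on-demand"


@dataclass
class PIModeSelector:
    """Hypothetical PI-based selector; not the paper's actual controller."""
    kp: float = 0.5         # proportional gain (assumed value)
    ki: float = 0.1         # integral gain (assumed value)
    setpoint: float = 0.2   # target level of the workload signal (assumed)
    threshold: float = 0.0  # control output above this selects periodic mode
    integral: float = 0.0   # accumulated error over time
    mode: str = ON_DEMAND   # currently selected checkpointing mode

    def update(self, measured: float, dt: float = 1.0) -> str:
        """Feed one sample of an assumed normalized workload signal
        and return the checkpointing mode selected for the next interval."""
        error = measured - self.setpoint
        self.integral += error * dt
        # Standard PI law: proportional term plus integral term.
        control = self.kp * error + self.ki * self.integral
        self.mode = PERIODIC if control > self.threshold else ON_DEMAND
        return self.mode


if __name__ == "__main__":
    selector = PIModeSelector()
    for sample in (0.2, 0.8, 0.0):
        print(sample, selector.update(sample))
```

The integral term makes the selector react to sustained changes in the signal rather than to a single sample alone; a real deployment would likely also need hysteresis or a minimum dwell time to avoid rapid mode flapping, which this sketch omits.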
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2018. p. 12-22
Keywords [en]
Checkpoint, COLO, Control theory, Fault tolerance, Resource management, Application programs, Benchmarking, Network security, Software design, Two term control systems, Application performance, Proportional integral controllers, Software development tools, Video Streaming Applications, Workload characteristics, Virtual machine
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-39981
DOI: 10.1109/IC2E.2018.00023
ISI: 000759774400002
Scopus ID: 2-s2.0-85048315473
ISBN: 9781538650080 (print)
OAI: oai:DiVA.org:mdh-39981
DiVA, id: diva2:1222402
Conference
2018 IEEE International Conference on Cloud Engineering, IC2E 2018, 17 April 2018 through 20 April 2018
2018-06-21, 2018-06-21, 2022-06-07. Bibliographically approved