Hybrid adaptive checkpointing for virtual machine fault tolerance
Department of Computing Science, Umeå University, Sweden.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0002-1364-8127
Department of Computing Science, Umeå University, Sweden.
Red Hat Inc., United States.
2018 (English). In: Proceedings - 2018 IEEE International Conference on Cloud Engineering, IC2E 2018, Institute of Electrical and Electronics Engineers Inc., 2018, p. 12-22. Conference paper, Published paper (Refereed)
Abstract [en]

Active Virtual Machine (VM) replication is an application-independent and cost-efficient mechanism for high availability and fault tolerance, with several recently proposed implementations based on checkpointing. However, these methods may suffer from large impacts on application latency, excessive resource usage overheads, or unpredictable behavior under varying workloads. To address these problems, we propose a hybrid approach that uses a Proportional-Integral (PI) controller to dynamically switch between periodic and on-demand checkpointing. Our mechanism automatically selects the method that minimizes application downtime by adapting itself to changes in workload characteristics. The implementation is based on modifications to QEMU, LibVirt, and OpenStack that seamlessly provide fault-tolerant VM provisioning and enable the controller to dynamically select the best checkpointing mode. Our evaluation is based on experiments with a video streaming application, an e-commerce benchmark, and a software development tool. The experiments demonstrate that our adaptive hybrid approach improves both application availability and resource usage compared to static selection of a checkpointing method, with application performance gains and negligible overheads.
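
The abstract does not give the controller's inputs, gains, or switching rule, so the following is only a minimal sketch of how a discrete-time PI controller could drive a choice between two checkpointing modes. Everything concrete in it is an assumption made for illustration: the measured metric (a hypothetical fraction of time the VM is paused for checkpoints), the setpoint, the gain values, the names `PIController` and `select_mode`, and the rule that a positive control signal maps to periodic mode. None of it is taken from the paper or from QEMU, LibVirt, or OpenStack.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Textbook discrete-time PI controller (illustrative, not the paper's design)."""
    kp: float              # proportional gain (assumed value)
    ki: float              # integral gain (assumed value)
    setpoint: float        # target value for the measured metric (assumed)
    integral: float = 0.0  # accumulated error over time

    def update(self, measurement: float, dt: float) -> float:
        # Error is positive when the metric exceeds its target.
        error = measurement - self.setpoint
        # Integrate the error over the sampling interval.
        self.integral += error * dt
        # Control signal = proportional term + integral term.
        return self.kp * error + self.ki * self.integral


def select_mode(control_signal: float) -> str:
    # Hypothetical switching rule: a positive signal selects periodic
    # checkpointing, otherwise on-demand. The real rule is not in the abstract.
    return "periodic" if control_signal > 0 else "on-demand"


if __name__ == "__main__":
    controller = PIController(kp=0.5, ki=0.1, setpoint=0.2)
    # Hypothetical samples of the pause-time fraction, one per second.
    for sample in (0.5, 0.1):
        signal = controller.update(sample, dt=1.0)
        print(f"metric={sample:.2f} signal={signal:+.3f} mode={select_mode(signal)}")
```

The integral term is what lets such a controller react to a sustained drift in the workload rather than to a single noisy sample; a real deployment would also need some hysteresis to avoid rapid mode flapping, which this sketch omits.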

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2018. p. 12-22
Keywords [en]
Checkpoint, COLO, Control theory, Fault tolerance, Resource management, Application programs, Benchmarking, Network security, Software design, Two term control systems, Application performance, Proportional integral controllers, Software development tools, Video Streaming Applications, Workload characteristics, Virtual machine
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-39981; DOI: 10.1109/IC2E.2018.00023; Scopus ID: 2-s2.0-85048315473; ISBN: 9781538650080; OAI: oai:DiVA.org:mdh-39981; DiVA, id: diva2:1222402
Conference
2018 IEEE International Conference on Cloud Engineering, IC2E 2018, 17 April 2018 through 20 April 2018
Available from: 2018-06-21 Created: 2018-06-21 Last updated: 2018-06-21 Bibliographically approved
Authority records BETA

Papadopoulos, Alessandro