mdh.se Publications
1 - 16 of 16
  • 1.
    Danielsson, Jakob
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Marcus, Jägemar
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Ericsson AB, Stockholm, Sweden.
    Behnam, Moris
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Seceleanu, Tiberiu
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Measurement-based evaluation of data-parallelism for OpenCV feature-detection algorithms. 2018. In: Staying Smarter in a Smartening World COMPSAC'18, 2018, p. 701-710. Conference paper (Refereed)
    Abstract [en]

    We investigate the effects on execution time, shared cache usage and speed-up gains when using data-partitioned parallelism for the feature detection algorithms available in the OpenCV library. We use a data set of three different images, each scaled to six different sizes, to exercise the different cache memories of our test architectures. Our measurements reveal that the algorithms, using the default settings of OpenCV, behave very differently when using data-partitioned parallelism. Our investigation shows that the executions of the algorithms SURF, Dense and MSER correlate with L3-cache usage, and they are therefore not suitable for data-partitioned parallelism on multi-core CPUs. The other algorithms (BRISK, FAST, ORB, HARRIS, GFTT, SimpleBlob and SIFT) do not correlate with L3-cache usage to the same extent, and they are therefore more suitable for data-partitioned parallelism. Furthermore, the SIFT algorithm provides the most stable speed-up, resulting in an execution between 3 and 3.5 times faster than the original execution time for all image sizes. We have also evaluated the hardware resource usage by measuring the algorithm execution time simultaneously with the L3-cache usage. We have used our measurements to conclude which algorithms are suitable for parallelization on hardware with shared resources.

  • 2.
    Danielsson, Jakob
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Marcus, Jägemar
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Seceleanu, Tiberiu
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Behnam, Moris
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Run-time Cache-Partition Controller for Multi-core Systems. 2019. Conference paper (Refereed)
  • 3.
    Danielsson, Jakob
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Seceleanu, Tiberiu
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ABB AB, Västerås, Sweden.
    Marcus, Jägemar
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Ericsson AB, Stockholm, Sweden.
    Behnam, Moris
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Testing Performance-Isolation in Multi-Core Systems. 2019. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a methodology for quantifying the level of performance isolation in a multi-core system. We have devised a test that can be applied to breaches of isolation in different computing resources that may be shared between different cores. We use this test to determine the level of isolation gained by using the Jailhouse hypervisor compared to a regular Linux system in terms of CPU isolation, cache isolation and memory bus isolation. Our measurements show that the Jailhouse hypervisor provides performance isolation of local computing resources such as the CPU. We have also evaluated whether any isolation could be gained for shared computing resources such as the system-wide cache and the memory bus controller. Our tests show no measurable difference in partitioning between a regular Linux system and a Jailhouse-partitioned system for shared resources. Using the Jailhouse hypervisor adds only a small noticeable overhead when executing multiple shared-resource-intensive tasks on multiple cores, which implies that running Jailhouse in a memory-saturated system will not be harmful. However, contention still exists in the memory bus and in the system-wide cache.

  • 4.
    Hallmans, Daniel
    et al.
    ABB, Sweden.
    Jägemar, Marcus
    Ericsson, Sweden.
    Larsson, Stig
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Nolte, Thomas
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Identifying Evolution Problems for Large Long Term Industrial Evolution Systems. 2014. In: 38th Annual IEEE International Computer Software and Applications Conference Workshops (COMPSACW 2014), 2014, no 6th, p. 384-389. Conference paper (Refereed)
    Abstract [en]

    Large infrastructure systems with a life time of more than 30 years, such as telecommunication or power transmission systems, are difficult to maintain since they suffer from the end-of-life plague of software, hardware and knowledge. Large companies have traditionally tackled this problem successfully, but maybe not with complete efficiency in all cases. We find system evolution to be an increasingly interesting problem as infrastructure becomes more complicated. Our increasingly complex and advanced society demands more of the infrastructure, making system evolution an interesting alternative to system replacement. From the point of view of the ISO/IEC 15288 development process we have identified life cycle issues connected to long life time scenarios and the different life cycle stages. In this paper we contribute a modification of the utilisation and support stage in ISO/IEC 15288 into an evolution stage, where a system is not only retired and replaced but rather evolved into the next generation. Using this approach changes the view of system development for this specific type of system towards incremental development, where new functions can be added at the same time as old legacy parts are replaced with functionally equivalent modules based on new hardware. We have based our solution on experience from investigations of life cycle issues for two large infrastructure systems.

  • 5.
    Inam, Rafia
    et al.
    Mälardalen University, School of Innovation, Design and Engineering.
    Sjödin, Mikael
    Mälardalen University, School of Innovation, Design and Engineering.
    Marcus, Jägemar
    Mälardalen University, School of Innovation, Design and Engineering. Ericsson AB.
    Bandwidth Measurement using Performance Counters for Predictable Multicore Software. 2012. In: IEEE Symposium on Emerging Technologies and Factory Automation, ETFA 2012, 2012, 4 p., Article number: 6489714. Conference paper (Other (popular science, discussion, etc.))
    Abstract [en]

    Memory contention is one of the largest sources of inter-core interference in statically partitioned multicore systems; the contention reduces the overall performance of applications and causes unpredictable execution times. A first step in achieving predictable execution is to accurately measure the amount of memory bandwidth consumed by each application. Such measurements can be used to track down bottlenecks, provide better partitioning among cores, and ultimately be used to arbitrate and police access to the memory bus. We propose to use hardware performance counters to continuously track the memory bandwidth consumed by different applications executing in parallel. In this paper we describe ongoing efforts exploring suitable performance counters on core level and on system-on-chip level for the 8-core Freescale P4080 processor. The aim is to accurately and efficiently track consumed memory bandwidth per application, with the final goal of using these measurements to improve the predictability of multicore real-time software.
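
As a rough illustration of the measurement idea in this abstract, the sketch below turns two readings of a memory-transaction counter into a bandwidth estimate. The counter itself, the 64-byte transaction size, the 32-bit counter width and all names are assumptions made for illustration; none of them is taken from the paper or from the P4080 documentation.

```python
from dataclasses import dataclass

# Assumed number of bytes moved per counted memory transaction (hypothetical).
BYTES_PER_TRANSACTION = 64


@dataclass(frozen=True)
class CounterSample:
    """One reading of a hypothetical per-application memory-transaction counter."""
    timestamp_s: float   # time of the reading, in seconds
    transactions: int    # cumulative transaction count at that time


def bandwidth_bytes_per_s(prev: CounterSample, curr: CounterSample,
                          counter_bits: int = 32) -> float:
    """Estimate the memory bandwidth consumed between two counter readings."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Hardware counters wrap around; modular subtraction tolerates one wrap.
    delta = (curr.transactions - prev.transactions) % (1 << counter_bits)
    return delta * BYTES_PER_TRANSACTION / elapsed
```

Sampling such a counter periodically per application would give the continuous tracking the abstract describes; how the counter is attributed to one application is outside the scope of this sketch.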

  • 6.
    Jagemar, Marcus
    et al.
    Ericsson AB, Sweden .
    Eldh, S.
    Ericsson AB, Sweden .
    Ermedahl, A.
    Ericsson AB, Sweden .
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Adaptive online feedback controlled message compression. 2014. In: Proceedings - International Computer Software and Applications Conference, 2014, p. 558-567. Conference paper (Refereed)
    Abstract [en]

    Communication is a vital part of computer systems today. One current problem is that computational capacity is growing faster than the bandwidth of interconnected computers. Maximising performance is a key objective for industries, both on new and existing software systems, which further extends the need for more powerful systems at the cost of additional communication. Our contribution is to let the system selectively choose the best compression algorithm from a set of available algorithms if it provides a better overall system performance. The online selection mechanism can adapt to a changing environment, such as temporary network congestion or a change of message content, while still selecting the optimal algorithm. Additionally, it is autonomous and does not require any human intervention, making it suitable for large-scale systems. We have implemented and evaluated this autonomous selection and compression mechanism in an initial trial situation as a proof of concept. The message round trip time was decreased by 7.1%, while still providing ample computational resources for other co-existing services.

  • 7.
    Jägemar, Marcus
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Ericsson, Sweden.
    Utilizing Hardware Monitoring to Improve the Performance of Industrial Systems. 2016. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    The drastically increasing use of Information and Communications Technology has resulted in a growing demand for network capacity. In this Licentiate thesis, we show how to monitor, model and finally improve network performance for large industrial systems. We also show how to use modeling techniques to move performance testing to an earlier design phase, with the aim to reduce the total development time of large systems. Our first contribution is a low-intrusive method for long-term hardware characteristic measurements of production nodes located at customer sites. Our second contribution is a technique to mimic the hardware usage of a production environment by creating a characteristics model. The cloned environment makes function test suites more realistic. The goal when creating the model is to reduce the system development time by moving late-stage performance testing to early design phases thereby improving the quality of the test environment. The third and final contribution is a network performance improvement where we dynamically trade computational capacity for a message round-trip time reduction when there are CPU cycles to spare. We have implemented an automatic feedback controlled mechanism for transparent message compression resulting in improved messaging performance between interconnected network nodes. Our mechanism continuously evaluates eleven compression algorithms on message stream content and network congestion level. The message subsystem will use the compression algorithm that provides the lowest messaging time. If the message content or network load change, a new evaluation is performed. We have conducted several case studies in an industrial environment and verified all contributions on a large telecommunication system manufactured by Ericsson. System engineers frequently use the monitoring and modeling functionality for debugging purposes in production environments. 
    We have deployed all techniques in a complicated industrial legacy system with minimal impact. We show that we can provide not only a solution but a cost-effective solution, which is an important requirement for industrial systems.

  • 8.
    Jägemar, Marcus
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Utilizing Hardware Monitoring to Improve the Quality of Service and Performance of Industrial Systems. 2018. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    The drastically increased use of information and communications technology has resulted in a growing demand for telecommunication network capacity. The demand for radically increased network capacity coincides with industrial cost-reductions due to an increasingly competitive telecommunication market. We have addressed the capacity and cost-reduction problems in three ways.

    Our first contribution is a method to support shorter development cycles for new functionality and more powerful hardware. We reduce the development time by replicating the hardware usage of production systems in our test environment. Having a realistic test environment allows us to run performance tests at early design phases, thereby reducing the overall system development time.

    Our second contribution is a method to improve the communication performance through selective and automatic message compression. The message compression functionality monitors transmissions continuously and selects the most efficient compression algorithm. The message compression functionality evaluates several parameters such as network congestion level, CPU usage, and message content. Our implementation extends the communication capacity of a legacy communication API running on Linux where it emulates a legacy real-time operating system.

    In our third and final contribution, we implement a process allocation and scheduling framework to allow higher system performance and quality of service. The framework continuously monitors selected processes and correlates their performance to the usage of hardware such as caches, the floating point unit and similar. The framework uses the performance-hardware correlation to minimize shared hardware resource congestion by efficiently allocating processes on multi-core CPUs. We have also designed a shared-hardware-resource-aware process scheduler that makes it possible for multiple processes to co-exist on a CPU without affecting each other's performance through hardware resource congestion. The allocation and scheduling techniques can be used to consolidate several functions on shared hardware, thus reducing the system cost. We have implemented and evaluated our process scheduler as a new scheduling class in Linux.

    We have conducted several case studies in an industrial environment and verified all contributions in the scope of a large telecommunication system manufactured by Ericsson. We have deployed all techniques in a complicated industrial legacy system with minimal impact. We show that we can provide a cost-effective solution, which is an essential requirement for industrial systems.

  • 9.
    Jägemar, Marcus
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Dodig-Crnkovic, Gordana
    Chalmers University of Technology, Gothenburg, Sweden.
    Cognitively Sustainable ICT with Ubiquitous Mobile Services - Challenges and Opportunities. 2015. In: The 37th International Conference on Software Engineering ICSE, 2015, no 37, p. 531-540. Conference paper (Refereed)
    Abstract [en]

    Information and Communication Technology (ICT) has led to an unprecedented development in almost all areas of human life. It forms the basis for what is called “the cognitive revolution”: a fundamental change in the way we communicate, feel, think and learn, based on an extension of individual information processing capacities by communication with other people through technology. This so-called “extended cognition” shapes human relations in a radically new way. It is accompanied by a decrease of shared attention and affective presence within closely related groups. This weakens the deepest and most important bonds that used to shape human identity. Sustainability, both environmental and social (economic, technological, political and cultural), is one of the most important issues of our time. In connection with “extended cognition” we have identified a new, basic type of social sustainability that everyone takes for granted, and which we claim is in danger due to our changed ways of communication. We base our conclusion on a detailed analysis of the current state of the practice and observed trends. The contribution of our article consists of identifying cognitive sustainability and explaining its central role for all other aspects of sustainability, showing how it relates to the cognitive revolution, its opportunities and challenges. Complex social structures with different degrees of proximity have always functioned as mechanisms behind belongingness and identity. To create long-term cognitive sustainability, we need to rethink and design new communication technologies that support differentiated and complex social relationships.

  • 10.
    Jägemar, Marcus
    et al.
    Ericsson, Stockholm, Sweden.
    Eldh, Sigrid
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ermedahl, Andreas
    Ericsson, Stockholm, Sweden.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Automatic Message Compression with Overload Protection. 2016. In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 121, no 1 nov, p. 209-222. Article in journal (Refereed)
    Abstract [en]

    In this paper, we show that it is possible to increase the message throughput of a large-scale industrial system by selectively compressing messages. The demand for new high-performance message processing systems conflicts with the cost effectiveness of legacy systems. The result is often a mixed environment with several concurrent system generations. Such a mixed environment does not allow a complete replacement of the communication backbone to provide the increased messaging performance. Thus, performance-enhancing software solutions are highly attractive. Our contribution is 1) an online compression mechanism that automatically selects the most appropriate compression algorithm to minimize the message round trip time; 2) a compression overload mechanism that ensures ample resources for other processes sharing the same CPU. We have integrated 11 well-known compression algorithms/configurations and tested them with production node traffic. In our target system, automatic message compression results in a 9.6% reduction of message round trip time. The selection procedure is fully automatic and does not require any manual intervention. The automatic behavior makes it particularly suitable for large systems where it is difficult to predict future system behavior.
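
The two mechanisms named in this abstract (automatic algorithm selection and overload protection) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the three standard-library compressors stand in for the eleven algorithms/configurations that the abstract does not list, the cost model counts only compression time plus one-way transfer time rather than a measured round trip, and `cpu_limit`, `cpu_load` and `bandwidth_bytes_per_s` are hypothetical parameters.

```python
import bz2
import lzma
import time
import zlib
from typing import Callable, Dict

# Candidate compressors from the Python standard library (an assumed set).
COMPRESSORS: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def select_compressor(sample: bytes, bandwidth_bytes_per_s: float,
                      cpu_load: float, cpu_limit: float = 0.8) -> str:
    """Return the name of the option with the lowest estimated send time.

    The result "none" means: send the message uncompressed.
    """
    # Overload protection: skip compression when the CPU is already busy.
    if cpu_load >= cpu_limit:
        return "none"

    best_name = "none"
    best_cost = len(sample) / bandwidth_bytes_per_s  # transfer time only
    for name, compress in COMPRESSORS.items():
        start = time.perf_counter()
        compressed = compress(sample)
        cpu_time = time.perf_counter() - start
        # Estimated cost = time spent compressing + time on the wire.
        cost = cpu_time + len(compressed) / bandwidth_bytes_per_s
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name
```

Re-running the selection whenever the message content or the measured bandwidth changes would mirror the re-evaluation behavior described in the abstract.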

  • 11.
    Jägemar, Marcus
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Eldh, Sigrid
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ermedahl, Andreas
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Automatic Multi-Core Cache Characteristics Modelling. 2013. Conference paper (Refereed)
    Abstract [en]

    When updating low-level software for large computer systems it is difficult to verify whether performance requirements are met or not. Common practice is to measure the performance only when the new software is fully developed and has reached system verification. Since this gives long lead-times it becomes costly to remedy performance problems. Our contribution is that we have deployed a new method to synthesise production workload. We have, using this method, created a multi-core cache characteristics model. We have validated our method by deploying it in a production system as a case study. The result shows that the method is sufficiently accurate to detect changes and mimic cache characteristics and performance, thus giving early characteristics feedback to engineers. We have also applied the model to a real software update, detecting changes in performance characteristics similar to the real system.

  • 12.
    Jägemar, Marcus
    et al.
    Mälardalen University, School of Innovation, Design and Engineering.
    Eldh, Sigrid
    Mälardalen University, School of Innovation, Design and Engineering.
    Ermedahl, Andreas
    Mälardalen University, School of Innovation, Design and Engineering.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering.
    Feedback-Based Generation of Hardware Characteristics. 2012. Report (Other academic)
    Abstract [en]

    In large complex server-like computer systems it is difficult to characterise hardware usage in early stages of system development. Many times the applications running on the platform are not ready at the time of platform deployment leading to postponed metrics measurement. In our study we seek answers to the questions: (1) Can we use a feedback-based control system to create a characteristics model of a real production system? (2) Can such a model be sufficiently accurate to detect characteristics changes instead of executing the production application? The model we have created runs a signalling application, similar to the production application, together with a PID-regulator generating L1 and L2 cache misses to the same extent as the production system. Our measurements indicate that we have managed to mimic a similar environment regarding cache characteristics. Additionally we have applied the model on a software update for a production system and detected characteristics changes using the model. This has later been verified on the complete production system, which in this study is a large scale telecommunication system with a substantial market share.
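
To make the feedback idea in this abstract concrete, here is a minimal sketch of a textbook discrete PID regulator steering a load generator toward a target cache-miss count. Everything beyond "a PID regulator generates cache misses to match a target" is an assumption: the gains, the linear toy plant (`misses_per_load_unit`) and all names are hypothetical, and the real system would read hardware counters instead of simulating them.

```python
from typing import Optional


class PIDRegulator:
    """Textbook discrete PID controller (positional form)."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float = 1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self._integral = 0.0
        self._prev_error: Optional[float] = None

    def update(self, target: float, measured: float) -> float:
        """Return the new control output for one sampling period."""
        error = target - measured
        self._integral += error * self.dt
        if self._prev_error is None:
            derivative = 0.0  # no history on the first sample
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


def simulate(target_misses: float, steps: int,
             misses_per_load_unit: float = 2.0) -> float:
    """Drive a toy load generator toward a target miss count; return final misses."""
    pid = PIDRegulator(kp=0.1, ki=0.2, kd=0.0)
    measured = 0.0
    for _ in range(steps):
        load = max(0.0, pid.update(target_misses, measured))
        # Toy plant (assumption): misses grow linearly with generated load.
        measured = misses_per_load_unit * load
    return measured
```

With these assumed gains the simulated miss count settles on the target within a few dozen iterations; a real load generator would need gains tuned against the measured hardware response.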

  • 13.
    Jägemar, Marcus
    et al.
    Ericsson AB.
    Eldh, Sigrid
    Ericsson AB.
    Ermedahl, Andreas
    Ericsson AB.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering.
    Towards Feedback-Based Generation of Hardware Characteristics. 2012. In: 7th International Workshop on Feedback Computing, 2012. Conference paper (Refereed)
    Abstract [en]

    In large complex server-like computer systems it is difficult to characterise hardware usage in early stages of system development. Many times the applications running on the platform are not ready at the time of platform deployment leading to postponed metrics measurement. In our study we seek answers to the questions: (1) Can we use a feedback-based control system to create a characteristics model of a real production system? (2) Can such a model be sufficiently accurate to detect characteristics changes instead of executing the production application? The model we have created runs a signalling application, similar to the production application, together with a PID-regulator generating L1 and L2 cache misses to the same extent as the production system. Our measurements indicate that we have managed to mimic a similar environment regarding cache characteristics. Additionally we have applied the model on a software update for a production system and detected characteristics changes using the model. This has later been verified on the complete production system, which in this study is a large scale telecommunication system with a substantial market share.

  • 14.
    Jägemar, Marcus
    et al.
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Ericsson, Stockholm, Sweden.
    Lisper, Björn
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Eldh, Sigrid
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Ermedahl, Andreas
    Ericsson, Stockholm, Sweden.
    Andai, Gabor
    Automatic Benchmarking for Early-Stage Performance Verification of Industrial Systems. 2016. Manuscript (preprint) (Other academic)
  • 15.
    Marcus, Jägemar
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    Mallocpool: Improving Memory Performance Through Contiguously TLB Mapped Memory. 2018. In: International Conference on Emerging Technologies and Factory Automation ETFA'18, 2018. Conference paper (Refereed)
    Abstract [en]

    Many computer systems allocate and free many memory chunks over the application lifespan. One problem with allocating many chunks is that they may not be contiguously allocated, causing a massive strain on caches, translation lookaside buffers (TLB), and the memory subsystem. We have devised a method that preallocates a large memory fragment, maps it with a variable size TLB, and then allocates subsequently requested chunks from that fragment. Our method has two advantages. The first is that all chunks allocated by malloc() are allocated contiguously, thus allowing better cache locality. The second advantage is that we can map the whole memory region with one variable size TLB, reducing much of the 4 kB TLB strain. These two advantages drastically improve the memory access performance. We have implemented our method in a Linux library which we can either dynamically or statically link to an existing application. The library is API-compatible with the GlibC library and can act as a drop-in replacement, removing any need for legacy application changes.
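
The allocation half of this idea (preallocate one large region, then hand out consecutive chunks from it) can be sketched as a simple bump allocator. This is only an illustration in Python: it does not model the variable size TLB mapping, freeing of chunks, or the malloc()-compatible interface of the actual library, and the class name, the 16-byte alignment and the method names are hypothetical.

```python
from typing import Tuple


class MemoryPool:
    """Bump allocator handing out consecutive slices of one preallocated buffer."""

    def __init__(self, size: int, alignment: int = 16):
        self._buffer = bytearray(size)          # one large, contiguous region
        self._view = memoryview(self._buffer)
        self._alignment = alignment
        self._offset = 0                        # next free byte in the region

    def alloc(self, nbytes: int) -> Tuple[int, memoryview]:
        """Return (offset, writable view) for a chunk of nbytes."""
        if nbytes <= 0:
            raise ValueError("nbytes must be positive")
        start = self._offset
        end = start + nbytes
        if end > len(self._buffer):
            raise MemoryError("pool exhausted")
        # Round up so that the next chunk starts on an aligned offset.
        aligned_end = -(-end // self._alignment) * self._alignment
        self._offset = min(aligned_end, len(self._buffer))
        return start, self._view[start:end]
```

Because every chunk is carved from the same buffer in request order, chunks allocated one after another sit next to each other in memory, which is the locality property the abstract relies on.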

  • 16.
    Marcus, Jägemar
    et al.
    Ericsson AB, Stockholm, Sweden.
    Ermedahl, Andreas
    Ericsson AB, Stockholm, Sweden.
    Eldh, Sigrid
    Ericsson AB, Stockholm, Sweden.
    Behnam, Moris
    Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
    A Scheduling Architecture for Enforcing Quality of Service in Multi-Process Systems. 2017. In: International Conference on Emerging Technologies And Factory Automation ETFA'17, 2017, p. 1-8. Conference paper (Refereed)
    Abstract [en]

    There is a massive deployment of multi-core CPUs. It requires a significant drive to consolidate multiple services while still achieving high performance on these off-the-shelf CPUs. Each function previously had its own execution environment, which guaranteed a certain Quality of Service (QoS). Consolidating multiple services can give rise to shared resource congestion, resulting in lower and non-deterministic QoS. We describe a method to increase the overall system performance by assisting the operating system process scheduler to utilize shared resources more efficiently. Our method utilizes hardware- and system-level performance counters to profile the shared resource usage of each process. We also use a big-data approach to analyze statistics from many nodes. The outcome of the analysis is a decision support model that is utilized by the process scheduler when allocating and scheduling processes. Our scheduler can distribute processes more efficiently than traditional CPU-load based process schedulers by considering the hardware capacity and previous scheduling and allocation decisions.
