Measurement-based evaluation of data-parallelism for OpenCV feature-detection algorithms
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0002-3755-562X
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. Ericsson AB, Stockholm, Sweden. ORCID iD: 0000-0003-2612-4135
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0002-1687-930X
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0001-7586-0409
2018 (English). In: Staying Smarter in a Smartening World COMPSAC'18, 2018, p. 701-710. Conference paper, Published paper (Refereed)
Abstract [en]

We investigate the effects on execution time, shared cache usage and speed-up when using data-partitioned parallelism for the feature-detection algorithms available in the OpenCV library. We use a data set of three different images, each scaled to six different sizes, to exercise the different cache memories of our test architectures. Our measurements reveal that the algorithms, run with the default settings of OpenCV, behave very differently under data-partitioned parallelism. Our investigation shows that the execution of the algorithms SURF, Dense and MSER correlates with L3-cache usage, and these algorithms are therefore not suitable for data-partitioned parallelism on multi-core CPUs. The other algorithms (BRISK, FAST, ORB, HARRIS, GFTT, SimpleBlob and SIFT) do not correlate with L3-cache usage to the same extent and are therefore more suitable for data-partitioned parallelism. Furthermore, the SIFT algorithm provides the most stable speed-up, executing between 3 and 3.5 times faster than the original execution time for all image sizes. We have also evaluated the hardware resource usage by measuring the algorithm execution time simultaneously with the L3-cache usage. We have used our measurements to conclude which algorithms are suitable for parallelization on hardware with shared resources.
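
As a rough illustration of the terms used in the abstract, the following Python sketch shows one way data-partitioned parallelism, speed-up and a correlation measure could be expressed. It is not the authors' measurement setup: `detect` is a hypothetical stand-in for an OpenCV detector (a simple threshold test on a list-of-lists image), the horizontal strip partitioning is an assumption, and `pearson` is just one possible way to quantify how execution time relates to L3-cache usage.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def split_rows(height, parts):
    """Return (start, stop) row ranges that cover 0..height in `parts` strips."""
    base, extra = divmod(height, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def detect(image, row_offset=0, threshold=200):
    """Hypothetical stand-in detector: report pixels at or above a threshold."""
    return [(row_offset + r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value >= threshold]


def detect_partitioned(image, parts):
    """Run the detector on each horizontal strip in its own worker, then merge."""
    ranges = split_rows(len(image), parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # Each worker gets one strip plus its row offset in the full image.
        futures = [pool.submit(detect, image[a:b], a) for a, b in ranges]
        return [point for f in futures for point in f.result()]


def speed_up(sequential_time, parallel_time):
    """Speed-up as the ratio of sequential to parallel execution time."""
    return sequential_time / parallel_time


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

The thread pool here only illustrates the structure of the partitioning, not its performance. A real detector that inspects pixel neighbourhoods would also need overlapping strip borders, which this sketch leaves out.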

Place, publisher, year, edition, pages
2018. p. 701-710
Keywords [en]
Multi-core, OpenCV, Cache
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:mdh:diva-40855; DOI: 10.1109/COMPSAC.2018.00105; ISI: 000904976500090; Scopus ID: 2-s2.0-85055434865; ISBN: 9781538626665 (print); OAI: oai:DiVA.org:mdh-40855; DiVA, id: diva2:1249860
Conference
42nd IEEE Computer Software and Applications Conference, COMPSAC 2018; Tokyo; Japan; 23 July 2018 through 27 July 2018
Projects
DPAC - Dependable Platforms for Autonomous systems and Control
Available from: 2018-09-20. Created: 2018-09-20. Last updated: 2023-04-12. Bibliographically approved.
In thesis
1. Characterization of Shared Resource Contention in Multi-core Systems
2019 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Multi-core computers are infamous for being hard to use in time-critical systems due to execution-time variations caused by shared resource contention. In this thesis we study the problem of shared resource contention, which occurs when multiple applications executing on different cores do not have exclusive ownership of a shared resource. We investigate performance variations of parallel tasks in multi-core systems and present a method to pinpoint the source of the resource contention using existing hardware performance counters. Furthermore, we investigate methods to mitigate performance variations using resource isolation techniques. We present a methodology for verifying isolation and test the achieved isolation using the Jailhouse hypervisor. We further investigate shared cache memory isolation techniques using a page coloring tool called PALLOC. Page coloring is used for partitioning the cache, assigning specific cache lines to specific processes. Page coloring can, however, degrade system performance, since it decreases the total amount of cache memory available to each process. Finally, we propose a dynamic partitioning assignment policy which assigns cache partitions to a process according to an adaptive model based on the process performance. The general conclusion from our investigations is that a large body of applications can suffer from shared resource contention and that techniques for mitigating resource contention are sorely needed. Our methods measure and characterise applications, identify resource contention and, finally, study isolation techniques.
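
The idea behind page coloring can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not PALLOC's interface: it assumes a physically indexed cache with plain modulo set indexing (some processors hash addresses across cache slices, in which case the mapping differs), and the function names `num_colors`, `page_color` and `assign_colors` are hypothetical.

```python
def num_colors(cache_size, associativity, page_size=4096):
    """Number of page colors: bytes per cache way divided by the page size."""
    way_size = cache_size // associativity
    return max(1, way_size // page_size)


def page_color(phys_addr, colors, page_size=4096):
    """Color of the page that holds a physical address (modulo indexing assumed)."""
    return (phys_addr // page_size) % colors


def assign_colors(colors, shares):
    """Give each process a disjoint, contiguous slice of colors per its share."""
    assignment, next_color = {}, 0
    for name, count in shares.items():
        if next_color + count > colors:
            raise ValueError("not enough colors for the requested shares")
        assignment[name] = list(range(next_color, next_color + count))
        next_color += count
    return assignment
```

For example, an 8 MiB, 16-way cache with 4 KiB pages gives 512 KiB per way and therefore 128 colors under these assumptions. A process restricted to 32 of them can use at most a quarter of the cache, which is the capacity loss the abstract refers to.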

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2019. p. 160
Series
Mälardalen University Press Licentiate Theses, ISSN 1651-9256 ; 287
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-45932 (URN); 978-91-7485-449-7 (ISBN)
Presentation
2019-12-17, Paros, Mälardalens högskola, Västerås, 13:15 (English)
Available from: 2019-11-11. Created: 2019-11-11. Last updated: 2022-11-08. Bibliographically approved.
2. Automatic Characterization and Mitigation of Shared-resource Contention in Multi-core Systems
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Multi-core computers are infamous for being hard to use in time-critical systems due to execution-time variations caused by shared resource contention. In this thesis, we study the problem of shared resource contention, which occurs when multiple applications executing on different cores do not have exclusive access to a shared hardware resource. We investigate performance variations of parallel tasks in multi-core systems and present a method to pinpoint the source of the resource contention using hardware performance counters. We investigate mitigation methods for performance variations due to resource contention, including the Jailhouse hypervisor and the cache-partitioning tool PALLOC. We propose a benchmark strategy that quantifies the isolation gained from a specific isolation technique and exemplify this strategy using the Jailhouse hypervisor. We furthermore present and implement solutions for cache-partition allocation during application runtime. Our implementation aims to avoid over-provisioning of cache through pre-runtime estimations of an application's dependence on the cache and continuous re-partitioning of the cache memory during application runtime.

The primary goal of this thesis is to contribute to a process that automates some of the tedious manual testing needed to detect resource contention bottlenecks. The methods we present in this thesis provide a holistic solution for automatically mitigating resource contention in a multi-core system. First, we evaluate the risk of shared resource contention when several applications execute simultaneously. We then allocate partitions to mitigate resource contention for applications that risk severe performance degradation. We finally present methods that dynamically re-allocate partition space to meet the performance requirements of the running applications.
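
The final step, dynamic re-allocation of partition space, can be pictured with a small control loop. The Python sketch below is purely illustrative and is not the policy proposed in the thesis: the function `repartition`, the `margin` and `minimum` parameters, and the convention that performance is a "higher is better" value compared against a per-application target are all assumptions made for the example.

```python
def repartition(allocation, performance, targets, total, margin=0.1, minimum=1):
    """Return a new allocation {app: partitions} after one control period."""
    new = dict(allocation)

    # Release one partition from each application comfortably above its target.
    for app, perf in performance.items():
        if perf > targets[app] * (1 + margin) and new[app] > minimum:
            new[app] -= 1

    free = total - sum(new.values())

    # Grant free partitions to applications below target, worst ratio first.
    lagging = sorted(
        (app for app in performance if performance[app] < targets[app]),
        key=lambda app: performance[app] / targets[app],
    )
    for app in lagging:
        if free == 0:
            break
        new[app] += 1
        free -= 1
    return new
```

Called once per measurement period, such a loop shifts partitions from applications that exceed their requirement to those that miss it, while never exceeding the `total` number of partitions.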

Abstract [sv], translated to English

Multi-core computers are notorious for being hard to use in time-critical systems because of performance variations caused by the concurrent sharing of hardware resources. In this thesis we study the problem of shared resources that arises when several applications running on different cores do not have exclusive ownership of a shared resource. We investigate performance variations of parallel tasks in multi-core systems and present a method for identifying the source of the resource conflict using existing hardware performance counters. We investigate methods for mitigating performance variations due to resource contention, including the Jailhouse hypervisor and the cache-partitioning tool PALLOC. We propose a benchmark strategy that quantifies the isolation provided by a specific isolation technique and exemplify this strategy using the Jailhouse hypervisor. We also present and implement solutions for controlling the allocation of cache partitions during application runtime. Our implementation aims to avoid unnecessary cache allocations by estimating the program's dependence on the cache and continuously re-allocating the cache memory while the application runs.

The main goal of this thesis is to ease the manual testing for resource-contention bottlenecks and to propose automatic methods instead. The methods we present provide a holistic solution for automatically mitigating resource conflicts in a multi-core system. First, we evaluate the risk of negative impact from shared resources when several applications run simultaneously. We then allocate partitions to mitigate resource conflicts for applications that risk severe performance degradation. Finally, we present methods that dynamically re-allocate cache memory to meet the performance requirements of the running applications.

Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2021. p. 254
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 348
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-56075 (URN); 978-91-7485-527-2 (ISBN)
Public defence
2021-11-19, Paros, Mälardalens högskola, Västerås, 13:00 (English)
Available from: 2021-10-07. Created: 2021-10-01. Last updated: 2022-11-08. Bibliographically approved.

Authority records

Danielsson, Jakob; Marcus, Jägemar; Behnam, Moris; Seceleanu, Tiberiu

By author/editor
Danielsson, Jakob; Marcus, Jägemar; Behnam, Moris; Sjödin, Mikael; Seceleanu, Tiberiu