BELIEF: A distance-based redundancy-proof feature selection method for Big Data
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0001-9857-4317
2021 (English). In: Information Sciences, ISSN 0020-0255, E-ISSN 1872-6291, Vol. 558, pp. 124-139. Article in journal (Refereed). Published
Abstract [en]

With the advent of the Big Data era, data reduction methods are in high demand given their ability to simplify huge data and ease complex learning processes. Concretely, algorithms able to select relevant dimensions from a set of millions are of huge importance. Although effective, these techniques also suffer from the “scalability” curse when they are brought in to tackle large-scale problems. In this paper, we propose a distributed feature weighting algorithm which precisely estimates feature importance in large datasets using the well-known algorithm RELIEF in small problems. Our solution, called BELIEF, incorporates a novel redundancy elimination measure that generates similar schemes to those based on entropy, but at a much lower time cost. Furthermore, BELIEF provides a smooth scale-up when more instances are required to increase precision in estimations. Empirical tests performed on our method illustrate the estimation ability of BELIEF in manifold huge sets, both in number of features and instances, as well as its reduced runtime cost as compared to other state-of-the-art methods.

Place, publisher, year, edition, pages
Elsevier Inc., 2021. Vol. 558, pp. 124-139
Keywords [en]
Apache Spark, Big Data, Feature selection (FS), High-dimensional, Redundancy elimination
National subject category
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-53524; DOI: 10.1016/j.ins.2020.12.082; ISI: 000634824100008; Scopus ID: 2-s2.0-85100519874; OAI: oai:DiVA.org:mdh-53524; DiVA, id: diva2:1541622
Available from: 2021-04-01. Created: 2021-04-01. Last updated: 2022-08-29. Bibliographically approved

Open Access in DiVA

No full text in DiVA


Person

Xiong, Ning
