BELIEF: A distance-based redundancy-proof feature selection method for Big Data
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Department of Computer Science and Artificial Intelligence, CITIC-UGR, University of Granada, Granada, Spain.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0001-9857-4317
2021 (English). In: Information Sciences, ISSN 0020-0255, E-ISSN 1872-6291, Vol. 558, p. 124-139. Article in journal (Refereed). Published
Abstract [en]

With the advent of the Big Data era, data reduction methods are in high demand given their ability to simplify huge data and ease complex learning processes. Concretely, algorithms able to select relevant dimensions from a set of millions are of huge importance. Although effective, these techniques also suffer from the “scalability” curse when they are brought in to tackle large-scale problems. In this paper, we propose a distributed feature weighting algorithm which precisely estimates feature importance in large datasets using the well-known RELIEF algorithm for small problems. Our solution, called BELIEF, incorporates a novel redundancy elimination measure that generates similar schemes to those based on entropy, but at a much lower time cost. Furthermore, BELIEF provides a smooth scale-up when more instances are required to increase precision in estimations. Empirical tests performed on our method illustrate the estimation ability of BELIEF in manifold huge sets (huge both in number of features and in number of instances), as well as its reduced runtime cost compared to other state-of-the-art methods.
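
For orientation, the following is a minimal sketch of the classic single-machine RELIEF weighting that the abstract names as its starting point. It is not BELIEF: it leaves out the distributed execution and the redundancy elimination measure, whose details this record does not give. The function name `relief_weights`, its parameters, and the toy data are assumptions made for illustration only.

```python
import random


def relief_weights(X, y, n_samples=None, seed=0):
    """Classic RELIEF feature weights for numeric features (illustrative sketch)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    # Use every instance unless a smaller random sample is requested.
    if n_samples is None or n_samples >= n:
        picks = list(range(n))
    else:
        picks = rng.sample(range(n), n_samples)

    w = [0.0] * d
    for i in picks:
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda j: dist(X[i], X[j]))
        near_miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss,
            # penalise features that differ on the nearest hit.
            w[f] += (diff(f, X[i], X[near_miss]) - diff(f, X[i], X[near_hit])) / len(picks)
    return w


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.2], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
```

On this toy data, feature 0 receives a higher weight than feature 1. The nearest-neighbour search over all instances is quadratic in the number of instances, which is the kind of cost that motivates a distributed formulation for large datasets.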

Place, publisher, year, edition, pages
Elsevier Inc., 2021. Vol. 558, p. 124-139
Keywords [en]
Apache Spark, Big Data, Feature selection (FS), High-dimensional, Redundancy elimination
National Category
Identifiers
URN: urn:nbn:se:mdh:diva-53524; DOI: 10.1016/j.ins.2020.12.082; ISI: 000634824100008; Scopus ID: 2-s2.0-85100519874; OAI: oai:DiVA.org:mdh-53524; DiVA, id: diva2:1541622
Available from: 2021-04-01. Created: 2021-04-01. Last updated: 2022-08-29. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Xiong, Ning
