TVFace: towards large-scale unsupervised face recognition in video streams
Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci, Islamabad, Pakistan.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
Univ Reading, Dept Comp Sci, Whiteknights House, Reading RG6 6UR, England.
Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci, Islamabad, Pakistan.
2025 (English). In: Pattern Analysis and Applications, ISSN 1433-7541, E-ISSN 1433-755X, Vol. 28, no 2, article id 88. Article in journal (Refereed). Published.
Abstract [en]

Recent advances in deep learning have led to significant improvements in face recognition systems, but face clustering, particularly in video streams, remains a challenging problem. Current video face clustering approaches are primarily tailored for short-form content, such as movies and television shows, that features a limited number of face images and individuals. The few existing large-scale face datasets are derived from web images and do not effectively capture the complexities of the video domain. In view of these limitations, we present TVFace, the first large-scale dataset of face images extracted from long-form video content. TVFace has been sourced from public live streams of international news channels and contains a total of 2.6 million face images of 33 thousand individuals. To address the challenge of identity annotation in unstructured video streams, we design a semi-automatic annotation framework that combines unsupervised face clustering with human validation, ensuring scalable and high-quality labeling. TVFace is well suited to evaluate and advance face representation and identity classification components of face recognition systems across both image and video domains. We also demonstrate the effectiveness of TVFace in evaluating real-time person retrieval systems using a novel tree-search-based Hierarchical Retrieval Index tailored for online face clustering. In conclusion, our work centers around the preparation of TVFace, a dataset poised to reshape the landscape of face recognition in the video domain, making it a crucial resource for the research community. The dataset and code are available at https://github.com/Vision-At-SEECS/streamface.

Place, publisher, year, edition, pages
Springer, 2025. Vol. 28, no 2, article id 88
Keywords [en]
Face dataset, Video face clustering, Visual analysis, Live television, Hierarchical retrieval index
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:mdh:diva-71289
DOI: 10.1007/s10044-025-01464-3
ISI: 001467616300003
OAI: oai:DiVA.org:mdh-71289
DiVA, id: diva2:1955494
Available from: 2025-04-30 Created: 2025-04-30 Last updated: 2025-04-30 Bibliographically approved

Authority records

Khan, Bostan
