This is the official repository for the publication "HQSwap: A Challenging High-Resolution Face-Swap Dataset for Improved Manipulation Detection", created by da/sec at HDA.
Generalization to cross-dataset and unknown scenarios is one of the main open issues in deepfake detection research. This work provides a new face-swap dataset generated with six algorithms, which can be used to strengthen current classifiers. Our work focuses on tackling open challenges in current datasets, such as a small number of bona fide images, a limited number of subjects, and unevenly split data. Our new dataset consists of around 70k high-resolution (1024x1024), subject-disjoint, face-swapped face images without lossy compression. This is essential for accurately evaluating current classifiers and improving the accuracy of future ones, as the resolution of face-swaps will likely increase in the future.
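The stated image properties (1024x1024 resolution, no lossy compression) can be sanity-checked programmatically. The sketch below assumes the images ship as PNG files, which is an assumption on our part; the exact file format is not stated here. It reads the width and height directly from the PNG IHDR chunk using only the standard library:

```python
import struct

# Standard 8-byte PNG file signature.
PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data: bytes):
    """Read (width, height) from a PNG byte stream's IHDR chunk."""
    if not data.startswith(PNG_SIG):
        raise ValueError("not a PNG stream")
    # Layout: 8-byte signature, 4-byte chunk length, b'IHDR',
    # then big-endian width (bytes 16-19) and height (bytes 20-23).
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def is_hqswap_resolution(data: bytes) -> bool:
    """Check the resolution stated for HQSwap images: 1024x1024 PNG."""
    try:
        return png_dimensions(data) == (1024, 1024)
    except ValueError:
        return False

# Demo on a synthetic IHDR header (first 24 bytes of a 1024x1024 PNG):
header = PNG_SIG + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 1024)
print(is_hqswap_resolution(header))  # True
```

Because PNG is lossless by design, a passing PNG signature check also covers the "no lossy compression" property for this assumed format.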
To download the dataset, request access from: **Available soon.**
To generate the proposed dataset, we identified the following six face-swap tools:
- Blendswap: Published by Shiohara et al., "Blendswap" is a SoTA identity encoder for face-swapping. The method improves the disentanglement of identity and attributes for face-swapping, focusing only on the inner face. Link
- Ghost: The Generative High-fidelity One Shot Transfer (Ghost) is a one-shot face-swapping pipeline developed by Groshev et al. for image-to-image and image-to-video transfer. According to the authors, it contributes "several major architecture improvements which include a new eye-based loss function, face mask smooth algorithm, a new face-swap pipeline for image-to-video face transfer, a new stabilization technique to decrease face jittering on adjacent frames and a super-resolution stage". Link
- Simswap: Chen et al. published "Simswap", a high-fidelity, identity-specific face-swap method. The approach is defined by a weak feature matching loss that helps the framework maintain its attribute-preservation ability, and the authors showed that it generates visually appealing results. Link
- Uniface: Xu et al. published "Uniface", proposing an end-to-end unified framework that combines face-swapping and face-reenactment. They make use of feature disentanglement, then apply attribute transfer and identity transfer to adaptively fuse the identities. This approach exploits the "underlying similarity of attribute and identity transfer" to achieve "better identity consistency in reenactment and better attribute preservation in swapping". Link
- Hyperswap: "Hyperswap", advertised as hyper-accurate face-swapping for everyone, is an open-source face-swapping model by Facefusion that produces highly realistic face-swap results. The model was trained on the VGGFace2 dataset. Link
- Inswapper: This model provides a "One-click face-swapper and restoration powered by insightface". Information about training data or model structure is not given. The Inswapper algorithm used is the Inswapper128 model. Link
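Despite their architectural differences, all six tools above share the same one-shot contract: a source image supplies the identity, a target image supplies the attributes, and the output is the target with the source identity blended in. A minimal sketch of that shared interface follows; the class and method names are hypothetical and are not part of any tool's real API:

```python
from abc import ABC, abstractmethod

class FaceSwapper(ABC):
    """Hypothetical common interface for the one-shot tools listed above:
    each maps (source identity, target attributes) to a swapped image."""

    @abstractmethod
    def swap(self, source_image: bytes, target_image: bytes) -> bytes:
        """Return target_image with the identity of source_image transferred."""

class NoOpSwapper(FaceSwapper):
    """Trivial stand-in used only to exercise the interface."""

    def swap(self, source_image: bytes, target_image: bytes) -> bytes:
        # A real tool would disentangle identity and attributes here;
        # this placeholder simply returns the target unchanged.
        return target_image

print(NoOpSwapper().swap(b"src", b"tgt"))  # b'tgt'
```

Wrapping each tool behind one such interface is what makes it practical to generate a multi-algorithm dataset like this one from a single driver script.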
```bibtex
@inproceedings{srock2026hqswap,
  title     = {HQSwap: A Challenging High-Resolution Face-Swap Dataset for Improved Manipulation Detection},
  author    = {Srock, Philipp and Tapia, Juan E. and Busch, Christoph},
  booktitle = {IEEE International Workshop on Biometrics and Forensics (IWBF 2026), Côte d'Azur, EURECOM, April 23--24, 2026},
  year      = {2026},
  pages     = {1--10},
  publisher = {IEEE},
  doi       = {soon}
}
```
The HQSwap dataset is provided for academic and research purposes only. It is intended to support research in facial image manipulation detection, deepfake analysis, and related fields. Any use of this dataset must comply with ethical guidelines and the terms of use set forth by the authors. Commercial use, redistribution, or any application outside of legitimate research must be coordinated with the authors.

The dataset contains synthetic or manipulated facial images generated via face-swapping techniques. These images are not representative of real individuals and should not be used to infer identity, consent, or personal attributes.

For questions, collaboration, or clarification regarding the dataset, please contact: Philipp Srock (philipp.srock@h-da.de) and Juan E. Tapia (juan.tapia-farias@h-da.de) at the Department of Computer Science, Hochschule Darmstadt, University of Applied Sciences, Germany.
