MaskedFace-Net is a dataset of human faces with a correctly or incorrectly worn mask (133,783 images), based on the Flickr-Faces-HQ (FFHQ) dataset. Wearing a face mask is one way to limit the spread of COVID-19. In this context, efficient recognition systems are expected to check that people's faces are masked in regulated areas. Training deep learning models to distinguish people wearing masks from those not wearing them requires a large dataset of masked faces. Some large datasets of masked faces are available in the literature. However, at the moment, no large dataset of masked face images is available that makes it possible to check whether the masks on detected faces are worn correctly. Indeed, many people do not wear their masks correctly because of bad practices, bad behaviors, or individual vulnerability (e.g., children, old people). For these reasons, several mask-wearing campaigns aim to raise awareness of this problem and of good practices.
To this end, this work proposes three masked face detection datasets: the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD), and their combination for global masked face detection (MaskedFace-Net). These realistic masked face datasets serve a twofold objective: i) to detect whether people's faces are masked or not, and ii) to detect whether masks are worn correctly or incorrectly (e.g., at airport portals or in crowds). To the best of our knowledge, no other large dataset of masked faces provides this granularity of classification for mask-wearing analysis. Moreover, this work outlines the mask-to-face deformable model that was applied, so that other masked face images can be generated, notably with specific masks.
For more details about this work:
Update November 29, 2020:
Updated reference following the online publication in "Smart Health", Elsevier:
Adnane Cabani, Karim Hammoudi, Halim Benhabiles, and Mahmoud Melkemi, "MaskedFace-Net - A dataset of correctly/incorrectly masked face images in the context of COVID-19", Smart Health, ISSN 2352-6483, Elsevier, 2020. https://doi.org/10.1016/j.smhl.2020.100144 [Preprint version available at arXiv:2008.08016]
Update March 11, 2022:
Our article mentioned above has received the Best Paper Award 2021 of the journal Smart Health, Elsevier.
Project leaders:
- Adnane Cabani, ESIGELEC/IRSEEM
- Karim Hammoudi, Université de Haute-Alsace, IRIMAS
Note: the project leaders contributed equally to this work.
Contributors:
- Halim Benhabiles, Yncrea Hauts-de-France, IEMN Lille
- Mahmoud Melkemi, Université de Haute-Alsace, IRIMAS
- Junhao Cao, Internship student at ESIGELEC
MaskedFace-Net is available below:
Update January 28, 2020:
Refined selection of the incorrectly masked face images.
67,049 images with Correctly Masked Face Dataset (CMFD) at 1024×1024: Go to OneDrive (19 GB)
66,734 images with Incorrectly Masked Face Dataset (IMFD) at 1024×1024: Go to OneDrive (19 GB)
To make downloading this dataset easier, we suggest using a download manager or the OneDrive sync client.
Update January 15, 2021:
Added mirror links on Google Drive.
Zip file CMFD: Part 1 - Part 2
Zip file IMFD: Part 1 - Part 2
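Once both archives are extracted, the two-class labelling implied by CMFD and IMFD can be sketched as follows. This is only a minimal illustration, not part of the dataset's tooling: the folder names "CMFD" and "IMFD", the ".jpg" extension, the label values, and the placeholder file names are all assumptions to adapt to your local copy.

```python
import tempfile
from pathlib import Path

# Assumption: the two archives were extracted into folders named "CMFD" and "IMFD".
LABELS = {"CMFD": 1, "IMFD": 0}  # 1 = correctly masked, 0 = incorrectly masked


def index_images(root):
    """Return (image_path, label) pairs, CMFD first, sorted within each folder."""
    samples = []
    for folder, label in LABELS.items():
        # rglob also finds images stored in nested subfolders.
        for path in sorted((Path(root) / folder).rglob("*.jpg")):
            samples.append((path, label))
    return samples


# Small self-contained demo on empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("CMFD/00000/00000_Mask.jpg", "IMFD/00000/00001_Mask_Chin.jpg"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    demo = [(p.name, label) for p, label in index_images(tmp)]
```

On the real dataset, `index_images` would be called with the directory that contains both extracted folders; the resulting pairs can then be fed to any training pipeline.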
Metadata: Each image of the MaskedFace-Net dataset is named after its corresponding image in the FFHQ dataset. Hence, the metadata file “ffhq-dataset-v2.json” (see the FFHQ webpage) can be used when processing MaskedFace-Net.
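A possible way to use this correspondence is sketched below. It rests on two assumptions that should be checked against your local files: that each MaskedFace-Net file name starts with the FFHQ image number (the name "00005_Mask.jpg" is a hypothetical example), and that “ffhq-dataset-v2.json” is a dictionary keyed by that number written as a string without leading zeros. The `sample` dictionary is a stand-in for the loaded JSON, not real metadata.

```python
import re
from pathlib import Path


def ffhq_index(filename):
    """Extract the leading image number from a MaskedFace-Net file name."""
    match = re.match(r"(\d+)", Path(filename).name)
    if match is None:
        raise ValueError(f"no leading image number in {filename!r}")
    return int(match.group(1))


def lookup_metadata(ffhq_json, filename):
    """Return the FFHQ entry for a MaskedFace-Net image.

    Assumption: ffhq_json is the parsed content of ffhq-dataset-v2.json,
    keyed by the image number as a string ("0", "1", ...).
    """
    return ffhq_json[str(ffhq_index(filename))]


# Stand-in for json.load(open("ffhq-dataset-v2.json")); values are placeholders.
sample = {"5": {"metadata": {"author": "example author", "license": "example license"}}}
entry = lookup_metadata(sample, "CMFD/00005_Mask.jpg")
```

Looking up the entry this way gives access to the per-image license and author fields, which matter for the attribution requirements of the source images.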
Licenses of the original FFHQ dataset: The individual images were published in Flickr by their respective authors under either Creative Commons BY 2.0, Creative Commons BY-NC 2.0, Public Domain Mark 1.0, Public Domain CC0 1.0, or U.S. Government Works license. All of these licenses allow free use, redistribution, and adaptation for non-commercial purposes. However, some of them require giving appropriate credit to the original author, as well as indicating any changes that were made to the images. The license and original author of each image are indicated in the metadata.
- https://creativecommons.org/licenses/by/2.0/
- https://creativecommons.org/licenses/by-nc/2.0/
- https://creativecommons.org/publicdomain/mark/1.0/
- https://creativecommons.org/publicdomain/zero/1.0/
- http://www.usa.gov/copyright.shtml
The dataset itself (including JSON metadata, download script, and documentation) is made available under Creative Commons BY-NC-SA 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt it for non-commercial purposes, as long as you (a) give appropriate credit by citing our paper, (b) indicate any changes that you've made, and (c) distribute any derivative works under the same license. https://creativecommons.org/licenses/by-nc-sa/4.0/
License of the MaskedFace-Net dataset: The dataset is made available under the Creative Commons BY-NC-SA 4.0 license by NVIDIA Corporation. You can use, redistribute, and adapt it for non-commercial purposes, as long as you:
- give appropriate credit by citing our papers:
- Adnane Cabani, Karim Hammoudi, Halim Benhabiles, and Mahmoud Melkemi, "MaskedFace-Net - A dataset of correctly/incorrectly masked face images in the context of COVID-19", Smart Health, ISSN 2352-6483, Elsevier, 2020, DOI:10.1016/j.smhl.2020.100144
- Karim Hammoudi, Adnane Cabani, Halim Benhabiles, and Mahmoud Melkemi, "Validating the correct wearing of protection mask by taking a selfie: design of a mobile application "CheckYourMask" to limit the spread of COVID-19", CMES-Computer Modeling in Engineering & Sciences, Vol.124, No.3, pp. 1049-1059, 2020, DOI:10.32604/cmes.2020.011663
- indicate any changes that you've made,
- and distribute any derivative works under the same license. https://creativecommons.org/licenses/by-nc-sa/4.0/
Adnane Cabani, Karim Hammoudi, Halim Benhabiles, and Mahmoud Melkemi, "MaskedFace-Net - A dataset of correctly/incorrectly masked face images in the context of COVID-19", Smart Health, ISSN 2352-6483, Elsevier, 2020, DOI:10.1016/j.smhl.2020.100144
@Article{cabani.hammoudi.2020.maskedfacenet,
title={MaskedFace-Net -- A Dataset of Correctly/Incorrectly Masked Face Images in the Context of COVID-19},
author={Adnane Cabani and Karim Hammoudi and Halim Benhabiles and Mahmoud Melkemi},
journal={Smart Health},
year={2020},
url ={http://www.sciencedirect.com/science/article/pii/S2352648320300362},
issn={2352-6483},
doi ={10.1016/j.smhl.2020.100144}
}
Karim Hammoudi, Adnane Cabani, Halim Benhabiles, and Mahmoud Melkemi, "Validating the correct wearing of protection mask by taking a selfie: design of a mobile application "CheckYourMask" to limit the spread of COVID-19", CMES-Computer Modeling in Engineering & Sciences, Vol.124, No.3, pp. 1049-1059, 2020, DOI:10.32604/cmes.2020.011663
@Article{cmes.2020.011663,
title={Validating the Correct Wearing of Protection Mask by Taking a Selfie: Design of a Mobile Application “CheckYourMask” to Limit the Spread of COVID-19},
author={Karim Hammoudi and Adnane Cabani and Halim Benhabiles and Mahmoud Melkemi},
journal={Computer Modeling in Engineering \& Sciences},
volume={124},
year={2020},
number={3},
pages={1049--1059},
url={http://www.techscience.com/CMES/v124n3/39927},
issn={1526-1506},
doi={10.32604/cmes.2020.011663}
}
The authors would like to thank:
- Mr. Jie Feng, a Columbia University PhD graduate and Creator of VisualData, for referencing our dataset on his website https://www.visualdata.io/discovery
- Dr. Amin Sarafraz for referencing our dataset on his website https://www.aitribune.com/dataset/2021021108
Jolibrain, AI software and services (March 26, 2021), “Removing face masks with JoliGAN”, https://www.deepdetect.com/blog/15-face-masks-gan/
Arthur Fortes (February 4, 2021), “Building a Personalized Face Mask Detection Using OpenCV and Deep Learning”, https://fortes-arthur.medium.com/building-a-personalized-face-mask-detection-using-opencv-and-deep-learning-4aae008c95a0
OrangeTree Global, Business Intelligence and Business Analytics Training Organization (January 9, 2021), “Top Free COVID dataset for Analytics professionals”, https://orangetreeglobal.com/top-free-covid-dataset-for-analytics-professionals/
Stacy Stanford, Roberto Iriondo, Pratik Shukla (January 5, 2021), "Best Public Datasets for Machine Learning and Data Science", https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
Purva Huilgol (December 15, 2020), "Top 15 Open-Source Datasets of 2020 that every Data Scientist Should add to their Portfolio!", https://www.analyticsvidhya.com/blog/2020/12/top-15-datasets-of-2020-that-every-data-scientist-should-add-to-their-portfolio/