A list of datasets for face recognition and detection, OCR, object detection, GANs, SLAM, motion tracking and pose estimation, ReID, and related tasks. Suggestions and pull requests are welcome.
- [Kaggle] Datasets used for Kaggle competitions: https://www.kaggle.com/datasets
- [google] Dataset search engine: https://toolbox.google.com/datasetsearch
- [AWS] A diverse collection of datasets (transportation, satellite imagery, and more) with descriptions and tutorial material: https://registry.opendata.aws/
- [UCI] The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators used by the machine learning community for the empirical analysis of machine learning algorithms: http://archive.ics.uci.edu/ml/about.html
- [AwesomeData] The vision of the AwesomeData community is to contribute a pure list of high-quality datasets for open communities such as academia, research, and education: https://github.com/awesomedata/awesome-public-datasets
- [LSP] http://sam.johnson.io/research/lsp.html
- [FLIC] https://bensapp.github.io/flic-dataset.html
- [MPII] http://human-pose.mpi-inf.mpg.de/
- [MSCOCO] http://cocodataset.org/#download
- [AI Challenge] https://challenger.ai/competition/keypoint/subject
- [Visual Tracker Benchmark] http://cvlab.hanyang.ac.kr/tracker_benchmark/datasets.html
- [visualdata] Datasets for classification, object detection, semantic, automatic, OCR, etc.: https://www.visualdata.io/
- [VOC-360] VOC-360 is the first dataset for object detection, segmentation, and classification in fisheye images; it contains 39,575 fisheye images: https://researchdata.sfu.ca/islandora/object/sfu%3A2724
- [tusimple] Lane detection application (task: lane detection)
- [lesions] A large collection of multi-source dermatoscopic images of pigmented lesions (task: classification)
- [DeepFashion2] DeepFashion2 is a comprehensive fashion dataset. It contains 491K diverse images of 13 popular clothing categories from both commercial shopping stores and consumers. It has a total of 801K clothing items, where each item in an image is labeled with scale, occlusion, zoom-in, viewpoint, category, style, bounding box, dense landmarks, and per-pixel mask. There are also 873K commercial-consumer clothes pairs (task: clothing classification & detection)
- [CrowdPose] CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark (task: pose estimation)
- [face] https://github.com/becauseofAI/HelloFace
- [CASIA-SURF] A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing (task: anti-spoofing)
- [VggFace] VGGFace2: A dataset for recognising faces across pose and age (task: face recognition)
- [FEI]
- [LFW] (task: face detection)
- [WFLW] Contains 10,000 faces (7,500 for training and 2,500 for testing) with 98 fully manually annotated landmarks (task: face alignment)
- DiF: Diversity in Faces [project] [blog]
- FRVT: Face Recognition Vendor Test [project] [leaderboard]
- IMDb-Face: The Devil of Face Recognition is in the Noise (59k people in 1.7M images) [paper] [dataset]
- Trillion Pairs: Challenge 3: Face Feature Test/Trillion Pairs (MS-Celeb-1M-v1c with 86,876 ids/3,923,399 aligned images + Asian-Celeb 93,979 ids/2,830,146 aligned images) [benchmark] [dataset] [result]
- MF2: Level Playing Field for Million Scale Face Recognition (672K people in 4.7M images) [paper] [dataset] [result] [benchmark]
- MegaFace: The MegaFace Benchmark: 1 Million Faces for Recognition at Scale (690k people in 1M images) [paper] [dataset] [result] [benchmark]
- UMDFaces: An Annotated Face Dataset for Training Deep Networks (8k people in 367k images with pose, 21 key-points and gender) [paper] [dataset]
- MS-Celeb-1M: A Dataset and Benchmark for Large Scale Face Recognition (100K people in 10M images) [paper] [dataset] [result] [benchmark] [project]
- VGGFace2: A dataset for recognising faces across pose and age (9k people in 3.3M images) [paper] [dataset]
- VGGFace: Deep Face Recognition (2.6k people in 2.6M images) [paper] [dataset]
- CASIA-WebFace: Learning Face Representation from Scratch (10k people in 500k images) [paper] [dataset]
- LFW: Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments (5.7k people in 13k images) [report] [dataset] [result] [benchmark]
- WiderFace: WIDER FACE: A Face Detection Benchmark (400k people in 32k images with a high degree of variability in scale, pose and occlusion) [paper] [dataset] [result] [benchmark]
- FDDB: A Benchmark for Face Detection in Unconstrained Settings (5k faces in 2.8k images) [report] [dataset] [result] [benchmark]
- LS3D-W: A large-scale 3D face alignment dataset constructed by annotating the images from AFLW, 300VW, 300W and FDDB in a consistent manner with 68 points using the automatic method [paper] [dataset]
- AFLW: Annotated Facial Landmarks in the Wild: A Large-scale, Real-world Database for Facial Landmark Localization (25k faces with 21 landmarks) [paper] [benchmark]
- CelebA: Deep Learning Face Attributes in the Wild (10k people in 202k images with 5 landmarks and 40 binary attributes per image) [paper] [dataset]
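
The people and image counts quoted for the recognition sets above can be turned into a rough images-per-identity figure, which hints at whether a set is "deep" (many images per person) or "wide" (many people, few images each). The sketch below is only illustrative: the numbers are the rounded counts copied from the list entries, and `FaceDataset` and `images_per_identity` are hypothetical names, not part of any dataset toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceDataset:
    name: str
    identities: int  # rounded count quoted in the list above
    images: int      # rounded count quoted in the list above


def images_per_identity(ds: FaceDataset) -> float:
    """Average number of images per person, from the rounded counts."""
    return ds.images / ds.identities


# Counts copied from the list entries (rounded, so results are approximate).
DATASETS = [
    FaceDataset("IMDb-Face", 59_000, 1_700_000),
    FaceDataset("MS-Celeb-1M", 100_000, 10_000_000),
    FaceDataset("VGGFace2", 9_000, 3_300_000),
    FaceDataset("VGGFace", 2_600, 2_600_000),
    FaceDataset("CASIA-WebFace", 10_000, 500_000),
    FaceDataset("LFW", 5_700, 13_000),
]

if __name__ == "__main__":
    # Sort from the most images per person to the fewest.
    for ds in sorted(DATASETS, key=images_per_identity, reverse=True):
        print(f"{ds.name:15s} {images_per_identity(ds):8.1f} images/identity")
```

Because the inputs are rounded, the output is only good for order-of-magnitude comparisons between datasets.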
- ICDAR 2015: 1000 training images and 500 testing images
- ICDAR 2017: Competition on Multi-lingual scene text detection and script identification
- MLT 2017: 7200 training, 1800 validation images
- COCO-Text (Computer Vision Group, Cornell): 63,686 images, 173,589 text instances, 3 fine-grained text attributes
- Synthetic Word Dataset (Oxford, VGG): 9 million images covering 90k English words
- IIIT: 5000 images from Scene Texts and born-digital (2k training and 3k testing images). Each image is a cropped word image of scene text with case-insensitive labels
- StanfordSynth: Small single-character images of 62 characters (0-9, a-z, A-Z)
- MSRA-TD500
- Street View Text (SVT): 100 images for training and 250 images for testing
- KAIST Scene_Text: 3000 images of indoor and outdoor scenes containing text
- Chars74k: Small single-character images of 62 characters (0-9, a-z, A-Z). Over 74K images from natural images, as well as a set of synthetically generated characters
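
Several of the character-level sets above (StanfordSynth, Chars74k) use 62 classes: 0-9, a-z, A-Z. A minimal sketch of a label mapping for that alphabet follows. The class order (digits, then lowercase, then uppercase) simply mirrors the order written in the list and is an assumption, since each dataset defines its own label indexing; `encode` and `decode` are hypothetical helper names.

```python
import string
from typing import List

# 62 characters in the order written above: 0-9, a-z, A-Z.
# Assumption: real datasets may index their classes differently.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(text: str, case_insensitive: bool = False) -> List[int]:
    """Map each character of `text` to its class index.

    With case_insensitive=True, letters are lowercased first, which suits
    labels that ignore case (as described for the IIIT entry above).
    Characters outside the 62-character alphabet raise KeyError.
    """
    if case_insensitive:
        text = text.lower()
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode(indices: List[int]) -> str:
    """Inverse of encode for case-sensitive labels."""
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    print(len(ALPHABET))          # number of classes
    print(encode("Text2"))        # case-sensitive indices
    print(encode("Text2", True))  # case-insensitive indices
```

Raising on unknown characters is deliberate here: scene-text labels with punctuation or non-Latin script would otherwise be silently dropped.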
- [LaSOT] A High-quality Benchmark for Large-scale Single Object Tracking (task: object tracking)
- [Moments in Time] Moments in Time: one million videos for event understanding (task: video understanding)
- [UCF101] An action recognition dataset of realistic action videos collected from YouTube, with 101 action categories. It extends the UCF50 dataset, which has 50 action categories (task: action recognition)
- [DAVIS] DAVIS Challenge on Video Object Segmentation (task: video object segmentation)
- [Sports1M] Contains 1,133,158 video URLs that have been annotated automatically with 487 sports labels using the YouTube Topics API (task: video classification)
- [Kinetics] Kinetics consists of approximately 650,000 video clips and covers 700 human action classes, with at least 600 video clips for each action class. Each clip lasts around 10 seconds and is labeled with a single class (task: video understanding)
- [CityFlow] A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification (task: vehicle ReID)
- [Argoverse] 3D Tracking and Forecasting With Rich Maps (task: object tracking)
- [JHMDB] J-HMDB is more than a dataset of human actions; it can also serve as a benchmark for pose estimation and human detection. [motion understanding](http://jhmdb.is.tue.mpg.de/dataset)
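
The figures quoted for Kinetics in this list can be cross-checked with simple arithmetic: 700 classes with at least 600 clips each imply a lower bound on the total, which should not exceed the stated approximate total of 650,000 clips. The sketch below uses only the numbers from the entry; the constant and function names are hypothetical, and the duration estimate assumes every clip is exactly 10 seconds, which the entry only states approximately.

```python
# Figures copied from the Kinetics entry in this list.
NUM_CLASSES = 700
MIN_CLIPS_PER_CLASS = 600
APPROX_TOTAL_CLIPS = 650_000
CLIP_SECONDS = 10  # "around 10 seconds" per clip; treated as exact here


def min_total_clips() -> int:
    """Smallest total consistent with the per-class minimum."""
    return NUM_CLASSES * MIN_CLIPS_PER_CLASS


def approx_total_hours() -> float:
    """Rough total video duration implied by the quoted figures."""
    return APPROX_TOTAL_CLIPS * CLIP_SECONDS / 3600


if __name__ == "__main__":
    print(f"lower bound on clips: {min_total_clips():,}")
    print(f"stated total is consistent: {min_total_clips() <= APPROX_TOTAL_CLIPS}")
    print(f"approximate duration: {approx_total_hours():.0f} hours")
```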