Researchers from Indiana University and Stony Brook University have collected and published a large, diverse video dataset for action recognition. The dataset, called AViD, contains videos from many different countries, which sets it apart from existing publicly available video datasets.
The dataset was collected from videos already available on the Internet, from sites such as Flickr, Instagram, and YouTube. The researchers defined 887 action classes and organized them into a hierarchy that captures the relationships between individual classes. The final dataset consists of more than 800K videos and is completely anonymized: all faces appearing in the videos were blurred to preserve privacy.
The motivation for collecting this dataset was to show that existing video datasets are biased towards specific countries and settings and that, consequently, models trained on them perform poorly in transfer learning and when tested on data from a different distribution.
The researchers conducted experiments that empirically show that AViD overcomes this bias towards specific countries and therefore allows for training more robust models.
The dataset is published under a Creative Commons license and is available here. Details about the data collection and the dataset statistics can be found in the paper.