• Know Your Data: Google’s Dataset Analysis Tool

    Know Your Data (KYD) is a Google tool for exploring and analyzing datasets. It is designed to help find incorrect annotations and imbalanced classes.

    Know Your Data is a graphical application that lets you visually explore and check a dataset. At the moment, the tool works only with image datasets. With Know Your Data, you can answer questions such as:

    • Is the data corrupted? (This covers both images and annotations.)
    • Is the data sensitive? (For example, do the images show specific people or prohibited content?)
    • Are there any gaps in the data?
    • Is the dataset balanced across different attributes?
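The balance question can also be checked by hand on a plain list of labels. The following is a minimal sketch and not part of Know Your Data, which is a graphical tool; the function name `class_balance` and the sample labels are hypothetical and used only for illustration.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the dataset and the imbalance ratio.

    The imbalance ratio is the largest class count divided by the smallest;
    1.0 means the classes are perfectly balanced.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    ratio = max(counts.values()) / min(counts.values())
    return shares, ratio


# Hypothetical annotations for a small image dataset.
labels = ["cat", "cat", "cat", "dog", "dog", "bird"]
shares, ratio = class_balance(labels)
print(shares)
print(f"imbalance ratio: {ratio:.1f}")
```

A ratio well above 1 suggests that some classes are underrepresented and may need more samples or reweighting.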

    One of the key features of Know Your Data is that it lets users study a dataset using information that was not originally part of it. KYD annotates the existing data with additional signals, including machine learning outputs such as Cloud Vision labels and Cloud Vision face detection, as well as general image quality metrics (for example, sharpness and brightness). In addition, Know Your Data provides features for building interactive dashboards with statistics and filters that update in real time.
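To give a sense of what image quality metrics like brightness and sharpness can look like, here is a rough sketch. The article does not say how Know Your Data computes them, so this is an assumption: brightness is taken as mean pixel intensity and sharpness as the variance of a Laplacian filter, a common heuristic. The function names and the tiny test images are hypothetical.

```python
from statistics import mean, pvariance


def brightness(image):
    """Mean pixel intensity of a grayscale image given as rows of 0-255 values."""
    return mean(pixel for row in image for pixel in row)


def sharpness(image):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    Higher values indicate more edges; flat or blurry images score lower.
    This is an assumed heuristic, not KYD's documented metric.
    """
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = [
        image[y - 1][x] + image[y + 1][x] + image[y][x - 1] + image[y][x + 1]
        - 4 * image[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


# Two hypothetical 4x4 grayscale images: a uniform grey and a checkerboard.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

print(brightness(flat), sharpness(flat))
print(brightness(checker), sharpness(checker))
```

Per-image scores like these can then be used as extra attributes for filtering or grouping, which is the kind of derived information the paragraph above describes.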

    At the moment, Know Your Data is in beta testing. To get access, you need to sign up for a waiting list.
