FAIR has opened Dynabench, a free platform for benchmarking models, to everyone. The service lets you evaluate models for bias, accuracy, and resource intensity.
Dynatask is a new Dynabench feature for working with natural language processing models. Its main distinction is flexible configuration: a task can have one or more owners who manage its settings, and users can choose which datasets and metrics are used to evaluate models.
Working with Dynatask:
- Step 1: Log in to your Dynabench account and fill out the “Request new task” form on the profile page.
- Step 2: After approval, the task gets its own page with a dashboard where you can configure it.
- Step 3: In the dashboard, select the datasets and metrics with which you want to evaluate the model.
- Step 4: Upload the model.
- Step 5: If further training improves the model, upload the new version for another iteration of evaluation.
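To make the idea behind Step 3 concrete, here is a minimal sketch of evaluating one model against a chosen set of datasets and metrics. It is not the Dynabench or Dynatask API: `TaskConfig`, `evaluate`, `keyword_model`, the metric functions, and the toy dataset are all hypothetical names invented for illustration, and the latency metric is only a rough stand-in for resource intensity.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, str]      # (input text, expected label)
Model = Callable[[str], str]   # maps input text to a predicted label
Metric = Callable[[Model, Sequence[Example]], float]


def accuracy(model: Model, examples: Sequence[Example]) -> float:
    """Fraction of examples whose predicted label matches the expected one."""
    correct = sum(1 for text, label in examples if model(text) == label)
    return correct / len(examples)


def mean_latency_seconds(model: Model, examples: Sequence[Example]) -> float:
    """Average wall-clock time per prediction, a crude proxy for resource use."""
    start = time.perf_counter()
    for text, _ in examples:
        model(text)
    return (time.perf_counter() - start) / len(examples)


@dataclass
class TaskConfig:
    """Hypothetical task settings: which datasets and metrics to evaluate with."""
    datasets: Dict[str, List[Example]]
    metrics: Dict[str, Metric]


def evaluate(model: Model, config: TaskConfig) -> Dict[str, Dict[str, float]]:
    """Run every selected metric on every selected dataset."""
    return {
        dataset_name: {
            metric_name: metric(model, examples)
            for metric_name, metric in config.metrics.items()
        }
        for dataset_name, examples in config.datasets.items()
    }


def keyword_model(text: str) -> str:
    """Toy sentiment model: predicts 'positive' if the text contains 'good'."""
    return "positive" if "good" in text.lower() else "negative"


config = TaskConfig(
    datasets={
        "toy_sentiment": [
            ("A good film", "positive"),
            ("Dull and slow", "negative"),
            ("Not good at all", "negative"),  # the toy model gets this one wrong
            ("Good fun", "positive"),
        ]
    },
    metrics={"accuracy": accuracy, "mean_latency_seconds": mean_latency_seconds},
)

results = evaluate(keyword_model, config)
print(results["toy_sentiment"]["accuracy"])  # 0.75
```

Keeping metrics as plain functions in a dictionary means a new measure, such as a bias score, can be added to the configuration without changing the evaluation loop.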
The goal of Dynatask is to make the evaluation of artificial intelligence models more holistic by going beyond accuracy alone.
The service is available here.