Documentation

Welcome to the documentation of Computer Vision Annotation Tool.

CVAT is a free, online, interactive video and image annotation tool for computer vision. It is being developed and used by Intel to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from a professional data annotation team. Try it online at app.cvat.ai.

Our documentation provides information for AI researchers, system administrators, developers, simple and advanced users. The documentation is divided into three sections, and each section is divided into subsections basic and advanced.

Getting started

Basic information and sections needed for a quick start.

FAQ

Answers to frequently asked questions.

GitHub Repository

Computer Vision Annotation Tool GitHub repository.

Manual

This section contains documents for CVAT simple and advanced users.

Administration

This section contains documents for system administrators.

Contributing

This section contains documents for developers.

1 - Getting started

This section contains basic information and links to sections necessary for a quick start.

Installation

The first step is to install CVAT on your system.

To learn how to create a superuser and log in to CVAT, go to the authorization section.

Getting started in CVAT

To create a task, go to the Tasks section. Click Create new task to go to the task creation page.

Set the name of the future task.

Set the label using the constructor: first click Add label, then enter the name of the label and choose the color.

You need to upload images or videos for your future annotation. To do so, simply drag and drop the files.

To learn more, go to creating an annotation task

Annotation

Basic

When the task is created, you will see a corresponding message in the top right corner. Click the Open task button to go to the task page.

Once on the task page, open a link to the job in the jobs list.

Choose the section that matches your task type and start annotating.

Shape | Annotation | Interpolation
Rectangle | Shape mode (basics) | Track mode (basics)
Polygon | Annotation with polygons | Track mode with polygons
Polyline | Annotation with polylines |
Points | Points in shape mode | Linear interpolation with one point
Cuboids | Annotation with cuboids | Editing the cuboid
Tag | Annotation with tags |

Advanced

CVAT supports automatic and semi-automatic annotation, which can speed up the annotation process.

Export dataset

  1. To download the annotations, first save all changes. Click the Save button or press Ctrl+S to save annotations quickly.

  2. After you have saved the changes, click the Menu button.

  3. Then click the Export dataset button.

  4. Lastly choose a format of the dataset. Exporting is available in formats from the list of supported formats.

To learn more, go to export/import datasets section.

1.1 - Product tour

1.1.1 - CVAT intro

In this video, we show how to use CVAT - how to sign up, upload your data, annotate it, and download it. This video is the first one in a series that’ll guide you on our product and its many features.

1.1.2 - Introduction to CVAT and Datumaro

We are excited to introduce the first video in our course series designed to help you annotate data faster and better using CVAT. In this introductory 4 minute video, we walk through:

  1. what problems CVAT and Datumaro solve,
  2. how they can speed up your model training process, and
  3. some resources you can use to learn more about how to use them.

2 - Integrations

This section contains information about the tools that are integrated with CVAT.

2.1 - FiftyOne

FiftyOne is an open-source tool for building high-quality datasets and computer vision models. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively.

FiftyOne provides an API to create tasks and jobs, upload data, define label schemas, and download annotations using CVAT, all programmatically in Python. All of the following label types are supported, for both image and video datasets:

  • Classifications
  • Detections
  • Instance segmentations
  • Polygons and polylines
  • Keypoints
  • Scalar fields
  • Semantic segmentation

2.2 - Toloka

To have your dataset annotated through Toloka, establish a project in CVAT, set the pricing, and let Toloka annotators take care of the annotations for you.


Glossary

This page contains several terms used to describe interactions between systems and actors. Refer to the table below for clarity on how we define and use them.

Term | Explanation
Toloka | Toloka is a crowdsourcing platform that allows users to assign tasks to a broad group of participants, often termed “crowd workers”. In the context of this article, when we mention Toloka, we are specifically referring to one of its UI interfaces.
CVAT | CVAT is a tool designed for annotating video and image data for computer vision tasks. In this article’s context, when we reference CVAT, we mean one of its UI interfaces.
Requester | An individual who establishes an annotation project within CVAT, determines the price, creates tasks and jobs within the project, and then releases it on Toloka.
Toloker | A person who annotates the Requester’s dataset.

Preconditions

Actor: Requester.

The Requester must have a CVAT account and a Toloka Requester account.

To get access to the feature in CVAT, send a request to CVAT Support.

Creating Toloka project

The requester can set up a project within CVAT and subsequently connect it to Toloka, making it accessible for annotations by Tolokers.

To initiate your Toloka project, proceed with the following steps:

  1. Log in to CVAT and initiate a new Project.
    Here you can set up a user guide that will be shown on the Toloka platform; see Adding instructions for annotators to the Toloka project.

  2. Navigate to the project page and select Actions > Setup crowdsourcing project.

    Setting Up Toloka Project

  3. Fill in the following fields in the Setup crowdsourcing project form:

    Toloka Project Configuration

    • Provider: Choose Toloka as your provider.
    • Environment: Select either Sandbox for a testing environment or Production for a live environment.
    • API key: Enter the Requester API key.
    • Project description (Optional): Provide a brief description of your project.
  4. Click Submit. A pop-up indicating Toloka Project Created will appear, along with the Update project form.

    Project Update Form

    • Open Project in Toloka will take you to the published project in Toloka.
    • Open Project in CVAT will take you back to the project in CVAT.

In CVAT, all projects related to Toloka will be labeled as Toloka.

Toloka Label on Project

The status indicator changes based on the state of the project:

  • Green - active
  • Dark grey - archived

Adding tasks and jobs to the Toloka project

To add tasks to the Toloka project, see Create annotation task.

On step 2 of the Create annotation task procedure, from the drop-down list, select your Toloka project.

Toloka Add Task

Adding instructions for annotators to the Toloka project

To add instructions for annotators to the Toloka project, see Adding specification to Project documentation or Adding instructions video tutorial.

Toloka pool setup

After creating a task and its associated jobs in CVAT, you’ll need to configure a Toloka pool, specifying all task requirements and setting the price for task completion.

To set up the Toloka pool, do the following:

  1. Open the Toloka task and go to Actions > Setup Toloka pool.

    Set up Toloka pool

  2. In the Create crowdsourcing task form, fill in the following fields:

    Set up Toloka pool

    • Price per job: Specify the payment amount for completing one job within the project.
    • Use project description: Switch this toggle if you want to use the overarching project description for individual tasks.
    • Description: Provide details about the pool. This field is visible only when the Use project description toggle is off.
    • Sensible content: Switch this toggle if your dataset contains images intended for an adult audience.
    • Accept solutions automatically: Enable this if you wish for completed jobs to be automatically accepted.
    • Close pool after completion, sec: The interval during which the pool will remain open from the moment all tasks are completed. Minimum — 0, maximum — 259200 seconds (three days).
    • Time per task suite, sec: Enter the time limit, in seconds, within which each job must be completed. The Toloker will see the deadline in the task information on the main Toloka page and also in CVAT interface. Uncompleted tasks are redistributed to other Tolokers.
    • Days for manual review by requester: Specify the Review period in days — the number of days for the review (from 1 to 21 days from the task completion date). The Toloker will see the deadline in the task information on the main Toloka page.
    • Audience: Add rules to make jobs available only to Tolokers who meet certain criteria. For example, you might require Tolokers to be proficient in English and have higher education. These rules operate based on filter principles. For more information, see Toloka Filters documentation, CVAT Filters documentation or CVAT Filters video tutorial.
    • Toloka Rules
  3. Click Submit. You will see the Toloka task was created pop-up and the Update pool form.

    Update pool

    • Open pool in Toloka opens pool in Toloka.
    • Open task in CVAT opens task in CVAT.
  4. Open the CVAT task that was published to Toloka, go to Actions > Start Toloka pool.
    The project that you created will now be visible to Tolokers.

    Toloka Project

The pool status indicator has the following states:

  • Green - open for annotating
  • Light grey - closed
  • Dark grey - archived
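
The numeric limits listed in the pool form above can be checked before you submit it. The following is a minimal Python sketch, not part of CVAT or the Toloka API: the function name and parameter names are hypothetical, and only the ranges (0 to 259200 seconds, 1 to 21 days) come from the field descriptions.

```python
# Hypothetical helper: checks pool settings against the limits listed above.
# Three days expressed in seconds (3 * 24 * 60 * 60 = 259200).
MAX_CLOSE_AFTER_COMPLETION_SEC = 3 * 24 * 60 * 60


def validate_pool_settings(close_after_completion_sec, review_days):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 0 <= close_after_completion_sec <= MAX_CLOSE_AFTER_COMPLETION_SEC:
        problems.append(
            "Close pool after completion must be between 0 and "
            f"{MAX_CLOSE_AFTER_COMPLETION_SEC} seconds"
        )
    if not 1 <= review_days <= 21:
        problems.append("Days for manual review must be between 1 and 21")
    return problems


# One valid configuration and one that violates both limits.
print(validate_pool_settings(close_after_completion_sec=3600, review_days=7))
print(validate_pool_settings(close_after_completion_sec=300000, review_days=30))
```

Returning a list of problems rather than raising on the first one lets you report every out-of-range field at once.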

Changing Toloka pool

To change a Toloka pool that has already been started, you need to stop it first.

  1. Open the Toloka task and go to Actions > Stop Toloka pool.
  2. Implement changes.
  3. Open the Toloka task and go to Actions > Start Toloka pool.

Reviewing annotated jobs

If the pool you’ve created is not in the Accept solutions automatically mode, you need to manually review and accept the jobs within the time limits defined in the Toloka pool settings.

To approve or reject the job, use the Accept and Reject buttons.

Toloka Project

Accepting job

To accept the annotated job, do the following:

  1. Go to the Toloka task and open the job.

  2. Review the annotation result and, if everything is fine, click Accept on the top menu.

  3. (Optional) Add a comment.

    Toloka Project

  4. Click OK.

Rejecting job

Note that a Toloker can open a dispute and appeal the rejected job on the Toloka platform.

To reject the annotated job, do the following:

  1. Go to the Toloka task and open the job.
    On the top menu, you will see the Accept and Reject buttons.

    Toloka Project

  2. Review the annotation result and, if something is wrong, click Reject on the top menu.

  3. Add a comment explaining why the work was rejected.

    Toloka Project

  4. Click OK.

After you reject the job, the menu bar will change and only the Accept button will be active.

A rejected job can be accepted later.

Moving Toloka pool to archive

After annotation is complete, you can move Toloka pools to the archive without archiving the whole project.

Note that to archive a pool, all jobs within the task must be in the Complete state.

Note that the pool must be accepted and have no active assignments on the Toloka side.

Keep in mind that if you rejected a job, it will not become unassigned immediately, so that the Toloker has time to open a dispute.

To archive completed jobs, do the following:

  1. Open the Toloka task and go to Actions.
  2. (Optional) If the task is ongoing, select Stop Toloka pool.
  3. Select Archive Toloka pool.
  4. In the pop-up, click OK.

Moving Toloka project to archive

After annotation is completed, you can move the Toloka project to the archive.

Note that all jobs must be complete. Tasks must not have active assignments or assignments that are being disputed. All project pools must be closed/archived.

  1. Open the Toloka project and go to Actions > Archive Toloka project.

    Toloka Project

  2. In the pop-up, click Yes.

Resource sync between CVAT and Toloka

There are two types of synchronization between CVAT and Toloka:

  • Explicit synchronization: Triggered manually by the requester by clicking the Sync Toloka project/Sync Toloka pool button within the CVAT interface.
  • Implicit Synchronization: Occurs automatically at predetermined intervals. Resources that have been requested by users will be synchronized without any manual intervention.

Acceptance/Rejection synchronization

In addition to project and pool synchronization, it is essential to synchronize the status of assignments. If a requester accepts or rejects an assignment through Toloka’s client interface, this action automatically synchronizes with CVAT to ensure that the data remains current and consistent across both platforms.

3 - Frequently asked questions

Answers to frequently asked questions

How to migrate data from CVAT.org to CVAT.ai

Please follow the export tasks and projects guide to download an archive with data which corresponds to your task or project. The backup for a project will have all tasks which are inside the project. Thus you don’t need to export them separately.

Please follow the import tasks and projects guide to upload your backup with a task or project to a CVAT instance.

See a quick demo below. It is really a simple process. If your data is huge, it may take some time. Please be patient.

Export and import backup demo

How to upgrade CVAT

Before upgrading, please follow the backup guide and back up all CVAT volumes.

Follow the upgrade guide.

How to change default CVAT hostname or port

To change the hostname, set the CVAT_HOST environment variable:

export CVAT_HOST=<YOUR_HOSTNAME_OR_IP>

Note: if you’re using docker compose with sudo to run CVAT, add the -E (or --preserve-env) flag so that the environment variable set above is preserved and takes effect in your docker containers:

sudo -E docker compose up -d

If you want to change the default web application port, change the ports part of the traefik service configuration in docker-compose.yml:

services:
  traefik:
    ...
    ...
    ports:
      - <YOUR_WEB_PORTAL_PORT>:8080
      - 8090:8090

Note that changing the port does not make sense if you are using HTTPS - port 443 is conventionally used for HTTPS connections, and is needed for Let’s Encrypt TLS challenge.

How to configure connected share folder on Windows

Follow the Docker manual and configure the directory that you want to use as a shared directory:

After that, it should be possible to use this directory as a CVAT share:

services:
  cvat_server:
    volumes:
      - cvat_share:/home/django/share:ro
  cvat_worker_import:
    volumes:
      - cvat_share:/home/django/share:ro
  cvat_worker_export:
    volumes:
      - cvat_share:/home/django/share:ro
  cvat_worker_annotation:
    volumes:
      - cvat_share:/home/django/share:ro

volumes:
  cvat_share:
    driver_opts:
      type: none
      device: /d/my_cvat_share
      o: bind

How to make unassigned tasks not visible to all users

Set the reduce_task_visibility variable to True.

Where are uploaded images/videos stored

The uploaded data is stored in the cvat_data docker volume:

volumes:
  - cvat_data:/home/django/data

Where are annotations stored

Annotations are stored in the PostgreSQL database. The database files are stored in the cvat_db docker volume:

volumes:
  - cvat_db:/var/lib/postgresql/data

How to mark job/task as completed

The status is set by the user in the Info window of the job annotation view. There are three types of status: annotation, validation or completed. The status of the job changes the progress bar of the task.

How to install CVAT on Windows 10 Home

Follow this guide.

I do not have the Analytics tab on the header section. How can I add analytics

You should build CVAT images with ‘Analytics’ component.

How to upload annotations to an entire task from UI when there are multiple jobs in the task

You can upload annotations for a multi-job task from the Dashboard view or the Task view. Uploading annotations from the Annotation view only affects the current job.

How to specify multiple hostnames

To do this, edit the traefik.http.routers.<router>.rule docker label for both the cvat and cvat_ui services, like so (see the documentation on Traefik rules for more details):

  cvat:
    labels:
      - traefik.http.routers.cvat.rule=(Host(`example1.com`) || Host(`example2.com`)) &&
          PathPrefix(`/api/`, `/git/`, `/analytics/`, `/static/`, `/admin`, `/documentation/`, `/django-rq`)

  cvat_ui:
    labels:
      - traefik.http.routers.cvat-ui.rule=Host(`example1.com`) || Host(`example2.com`)
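
When the hostname list is long, the rule strings can be generated instead of typed by hand. The following is a minimal Python sketch that reproduces the two labels shown above; the helper name host_rule is hypothetical, and the path list is copied from the cvat label in this section.

```python
# Hypothetical helper: builds Host(...) rules like the ones in the labels above.
def host_rule(hostnames):
    """Join hostnames into a Traefik rule: Host(`a`) || Host(`b`)."""
    if not hostnames:
        raise ValueError("at least one hostname is required")
    return " || ".join(f"Host(`{name}`)" for name in hostnames)


hosts = ["example1.com", "example2.com"]
ui_rule = host_rule(hosts)

# The backend rule wraps the host part in parentheses before adding PathPrefix,
# so that && applies to the whole group of hosts.
backend_paths = [
    "/api/", "/git/", "/analytics/", "/static/",
    "/admin", "/documentation/", "/django-rq",
]
path_part = ", ".join(f"`{path}`" for path in backend_paths)
backend_rule = f"({ui_rule}) && PathPrefix({path_part})"

print(f"traefik.http.routers.cvat-ui.rule={ui_rule}")
print(f"traefik.http.routers.cvat.rule={backend_rule}")
```

Paste the printed values into the corresponding labels entries in your compose file.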

How to create a task with multiple jobs

Set the segment size when you create a new task. This option is available in the Advanced configuration section.

How to transfer CVAT to another machine

Follow the backup/restore guide.

How to load your own DL model into CVAT

See the Serverless tutorial.

My server uses a custom SSL certificate and I don’t want to check it.

You can control the SSL certificate check with the --insecure CLI argument. For the SDK, you can specify ssl_verify = True/False in the cvat_sdk.core.client.Config object.

4 - Paid features

Setting up paid features in CVAT.

We provide a variety of premium features exclusively for our paying customers.

For further details, please visit:

4.1 - Subscription management

How to manage your subscription

This article provides tips on how to effectively manage your CVAT subscriptions, including tracking expenses and canceling unnecessary subscriptions, to optimize your finances and save time.

Whether you’re a business owner or an individual, you’ll learn how to take control of your subscriptions and manage them.


Billing

This section describes the billing model and gives a short description of the limitations of each plan.

For more information, see: Pricing Plans

Pro plan

Account/Month: The Pro plan has a fixed price and is designed for personal use only. It doesn’t allow collaboration with team members, but removes all the other limits of the Free plan.

Note: Although it allows the creation of an organization and access for up to 3 members – it is for trial purposes only, organization and members will have all the limitations of the Free plan.

Team plan

Member/Month: The Team plan allows you to create an organization and add team members who can collaborate on projects. The monthly payment for the plan depends on the number of team members you’ve added. All limits of the Free plan will be removed.

Note: The organization owner is also part of the team. So, if you have three annotators working, you’ll need to pay for 4 seats (3 annotators + 1 organization owner).
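
The seat arithmetic from the note above can be written out as a short Python sketch. The function names are hypothetical, and the per-member price used here is a placeholder for illustration only, not an actual CVAT price; see Pricing Plans for real figures.

```python
# Hypothetical helpers; the price passed below is a placeholder, not a real price.
def team_seats(annotators):
    """Seats to pay for: every annotator plus the organization owner."""
    if annotators < 0:
        raise ValueError("annotators must be non-negative")
    return annotators + 1


def monthly_cost(annotators, price_per_member):
    """Monthly payment = number of seats * per-member price."""
    return team_seats(annotators) * price_per_member


print(team_seats(3))          # 4 seats: 3 annotators + 1 organization owner
print(monthly_cost(3, 10.0))  # 40.0 with the placeholder price of 10.0
```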

Payment methods

This section describes how to change or add payment methods.

Paying with bank transfer

Note: at the moment, this payment method works only with US banks.

To pay with bank transfer:

  1. Go to Upgrade to Pro/Team plan > Get started.
  2. Click US Bank Transfer.
  3. Upon successful completion of the payment, you will receive a receipt via email.

Note that the completion of the payment process may take up to three banking days.

Bank Transfer Payment

Change payment method on Pro plan

Access Manage Pro Plan > Manage and click +Add Payment Method.

Payment pro

Change payment method on Team plan

Access Manage Team Plan > Manage and click +Add Payment Method.

Payment team

Adding and removing team members

This section describes how to add team members to collaborate within one team.

Pro plan

Not available.

Team plan

Go to Manage Team plan > Manage > Update quantity.

Add members

If you add a user before the current billing period ends, the payment will be prorated for the remaining time until the next billing cycle begins. From the following month onward, the full payment will be charged.

If you remove a user before the current billing period ends, funds will not be returned to your account, but your payment for the next month will be reduced by the amount of unused funds.

Change plan

The procedure is the same for both Pro and Team plans.

If for some reason you want to change your plan, you need to:

  1. Unsubscribe from the previous plan.
  2. If you need a refund, contact us at accounting@cvat.ai.
  3. Subscribe to a new plan.

Can I subscribe to several plans?

Paid plans are not mutually exclusive. You can have several active subscriptions, for example, the Pro plan and several Team plans for different organizations.

Cancel plan

This section describes how to cancel your CVAT subscription and what will happen to your data.

What will happen to my data?

Once you have terminated your subscription, your data will remain accessible within the system for a month. During this period, you will be unable to add new tasks, and Free plan limits will be applied.

If you have a substantial amount of data, it will be switched to read-only mode. This means you will not be able to save annotations, add any resources, and so on.

After that month, you will receive a notification asking you to remove the excess data; otherwise, it will be deleted automatically.

Pro plan

Access Manage Pro Plan > Manage > Cancel plan.

Please fill out the feedback form to help us improve our platform.

Cancel pro

Team plan

Access Manage Team plan > Manage > Cancel plan.

Please fill out the feedback form to help us improve our platform.

Cancel team

Plan renewal

This section describes how to renew your CVAT subscription

Pro plan

Access Manage Pro Plan > Manage > Renew plan

Team plan

Access Manage Team Plan > Manage > Renew plan

4.2 - Social auth configuration

Social accounts authentication for Self-Hosted solution

Note: This is a paid feature available for Enterprise clients.

You can now easily set up authentication with popular social services, which opens doors to such benefits as:

  • Convenience: you can use the existing social service credentials to sign in to CVAT.
  • Time-saving: with just two clicks, you can sign in without the hassle of typing in credentials, saving time and effort.
  • Security: social auth service providers have high-level security measures in place to protect your accounts.

Currently, we offer three options:

  • Authentication with GitHub.
  • Authentication with Google.
  • Authentication with Amazon Cognito.

With more to come soon. Stay tuned!


Enable authentication with a Google account

To enable authentication, do the following:

  1. Log in to the Google Cloud console

  2. Create a project, and go to APIs & Services

  3. On the left menu, select OAuth consent, then select User type (Internal or External), and click Create.

  4. On the OAuth consent screen fill all required fields, and click Save and Continue.

  5. On the Scopes screen, click Add or remove scopes and select auth/userinfo.email, auth/userinfo.profile, and openid. Click Update, and Save and Continue.
    For more information, see Configure Auth Consent.

  6. On the left menu, click Credentials, on the top menu click + Create credentials, and select OAuth client ID.

  7. From the Application Type select Web application and configure: Application name, Authorized JavaScript origins, Authorized redirect URIs.
    For example, if you plan to deploy CVAT instance on https://localhost:8080, add https://localhost:8080 to authorized JS origins and https://localhost:8080/api/auth/social/goolge/login/callback/ to redirect URIs.

  8. Create a configuration file in CVAT:

    1. Create the auth_config.yml file with the following content:
    ---
    social_account:
      enabled: true
      google:
        client_id: <some_client_id>
        client_secret: <some_client_secret>
    
    2. Set the AUTH_CONFIG_PATH="<path_to_auth_config>" environment variable.
  9. In a terminal, run the following command:

    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.override.yml up -d --build
    

Enable authentication with a GitHub account

To enable authentication, do the following:

  1. Open the GitHub settings page.

  2. On the left menu, click <> Developer settings > OAuth Apps > Register new application.
    For more information, see Creating an OAuth App

  3. Fill in the name field, set the homepage URL (for example: https://localhost:8080), and authorization callback URL (for example: https://localhost:8080/api/auth/social/github/login/callback/).

  4. Create a configuration file in CVAT:

    1. Create the auth_config.yml file with the following content:
    ---
    social_account:
      enabled: true
      github:
        client_id: <some_client_id>
        client_secret: <some_client_secret>
    
    2. Set the AUTH_CONFIG_PATH="<path_to_auth_config>" environment variable.
  5. In a terminal, run the following command:

    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.override.yml up -d --build
    

Note: You can also configure a GitHub App, but don’t forget to add the required permissions.
Under Permission > Account permissions, Email addresses must be set to read-only.

Enable authentication with Amazon Cognito

To enable authentication, do the following:

  1. Create a user pool. For more information, see Amazon Cognito user pools

  2. Fill in the name field, set the homepage URL (for example: https://localhost:8080), and authorization callback URL (for example: https://localhost:8080/api/auth/social/amazon-cognito/login/callback/).

  3. Create a configuration file in CVAT:

    1. Create the auth_config.yml file with the following content:
    ---
    social_account:
      enabled: true
      amazon_cognito:
        client_id: <some_client_id>
        client_secret: <some_client_secret>
        domain: https://<domain-prefix>.auth.us-east-1.amazoncognito.com
    
    2. Set the AUTH_CONFIG_PATH="<path_to_auth_config>" environment variable.
  4. In a terminal, run the following command:

    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.override.yml up -d --build
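
The three auth_config.yml files above share one layout and differ only in the provider key and, for Amazon Cognito, the extra domain field. The following is a minimal Python sketch that renders such a file with plain string formatting; the function and parameter names are hypothetical, and it assumes the values need no YAML quoting.

```python
# Sketch: render auth_config.yml for one provider using plain string formatting.
# Function and parameter names are hypothetical; the keys mirror the files above.
def render_auth_config(provider, client_id, client_secret, domain=None):
    """Return the text of auth_config.yml for a single social provider."""
    if provider not in ("google", "github", "amazon_cognito"):
        raise ValueError(f"unsupported provider: {provider}")
    if provider == "amazon_cognito" and not domain:
        raise ValueError("amazon_cognito requires a domain")
    lines = [
        "---",
        "social_account:",
        "  enabled: true",
        f"  {provider}:",
        f"    client_id: {client_id}",
        f"    client_secret: {client_secret}",
    ]
    # Only the Amazon Cognito configuration above carries a domain entry.
    if provider == "amazon_cognito":
        lines.append(f"    domain: {domain}")
    return "\n".join(lines) + "\n"


text = render_auth_config("github", "<some_client_id>", "<some_client_secret>")
print(text)
```

Write the returned text to a file and point AUTH_CONFIG_PATH at that file, as described in the steps above.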
    

5 - Manual

This section contains documents for CVAT simple and advanced users

5.1 - Basics

This section contains basic documents for CVAT users

5.1.1 - Registration

App CVAT user registration and account access.

To start annotating in CVAT, you need to create an account or log in to an existing one.

This section describes App CVAT, which is suitable for small personal projects that do not require user management. It is also a good option if you just want to try CVAT.

While it is easy to use, it has some limitations. For example, in App CVAT you cannot create a superuser (admin account) or administer user roles. All these features are available to the Admin user in a local installation of CVAT.

To create an account or log in, go to the App CVAT login page:

Note: By default authentication and registration with Google and GitHub work only for App CVAT.
If you want to use Google and GitHub authentication on a local installation, see Social auth configuration.

User registration

To register as a non-admin user, do the following:

  1. Click Create an account.

    Create account

  2. Fill in all blank fields, accept terms of use, and click the Create an account button.

Account form


A username is generated from the email automatically. You can edit it if needed.

Username generation

To register with Google or GitHub, click the button with the name of the service, and follow instructions on the screen.

Account access

To access your account, do the following:

  1. Go to the login page.
  2. Enter username or email. The password field will appear.
  3. Enter the password and click Next.

To log in with Google or GitHub, click the button with the name of the service.

5.1.2 - Create annotation task

How to create and configure an annotation task.

To start annotating in CVAT, you need to create an annotation task and specify its parameters.

To create a task, on the Tasks page click + and select Create new task.

Create new task


Create a task

To create a new task, open the task configurator:

Basic configurator

And specify the following parameters:

  1. In the Name field, enter the name of the new task.

    Name of task

  2. (Optional) From the Projects drop-down, select a project for the new task.
    Leave this field empty if you do not want to assign the task to any project.

    Select project

    Note: The following steps apply only if the task does not belong to a project.
    If the task has been assigned to a project, the project’s labels will be applied to the task.

  3. On the Constructor tab, click Add label.
    The label constructor menu will open:

    Label constructor

  4. In the Label name field, enter the name of the label.

  5. (Optional) To limit the use of the label to a certain shape tool, from the Label shape drop-down select the shape.

  6. (Optional) Select the color for the label.

    label shape and color

  7. (Optional) Click Add an attribute and set up its properties.

  8. Click Select files to upload files for annotation.

  9. Click Continue to submit the label and start adding a new one,
    or Cancel to discard the current label and return to the label list.

  10. Click Submit and open to submit the configuration and open the created task,
    or Submit and continue, to submit the configuration and start a new task.

Label shape

Labels (or classes) are categories of objects that you can annotate.

Label shape limits the use of the label to a certain shape tool.

Any is the default setting that does not limit the use of the label to any particular shape tool.

For example, you added:

  • Label sun with the Label shape type ellipse
  • Label car with the Label shape type any

As a result:

  • The sun label will be available only for ellipse shape.

  • The car label will be available for all shapes.

    Label shape

The tools on the Controls sidebar will be limited to the selected types of shapes.

For example, if you select Any, all tools will be available, but if you select Rectangle for all labels, only the Rectangle tool will be visible on the sidebar.

Note: You cannot apply the Label shape to the AI and OpenCV tools. These tools will always be available.

Type control sidebar

You can change the shape of the label as needed. This change will not affect the existing annotation.

For example, if you created objects using polygons and then changed the label shape to polylines, all previously created objects will remain polygons. However, you will not be able to add new polygon objects with the same label.

Note: You cannot change the shape of the skeleton label.
The Label shape field for the skeleton label is disabled.

Add an attribute

An attribute is a property of an annotated object, such as color, model, or other quality.

For example, you have a label for face and want to specify the type of face. Instead of creating additional labels for male and female, you can use attributes to add this information.

There are two types of attributes:

  • Immutable attributes are unique and do not change from frame to frame. For example, age, gender, and color.
  • Mutable attributes are temporary and can change from frame to frame. For example, pose, quality, and truncated.

Added attributes will be available from the Objects menu:

Attributes

To add an attribute, do the following:

  1. Go to the Constructor tab and click Add attribute.

    Attributes

  2. In the Name field enter the name of the attribute.

  3. From the drop-down, select how to display the attribute in the Objects menu:

    • Select enables a drop-down list, from which you can select an attribute.
      If in the Attribute value field you add __undefined__, the drop-down list will have a blank value.
      This is useful for cases where the attribute of the object cannot be clarified:

    • Undefined value

    • Radio enables the selection of one option from several options.

    • Checkbox enables the selection of multiple options.

    • Text sets the attribute to a text field.

    • Number sets the attribute to a numerical field in the following format: min;max;step.

  4. In the Attribute values field, add attribute values.
    To separate values, press Enter.
    To delete a value, press Backspace or click x next to the value name.

  5. (Optional) For mutable attributes, select Mutable.

  6. (Optional) To set the default attribute value, hover over it with the mouse cursor and click it. The default value will change color to blue.

Default attribute

To delete an attribute, click Delete attribute.
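
To make the min;max;step format of the Number attribute concrete, here is a minimal sketch that parses such a string and lists the values it allows. The helper names are hypothetical, and treating max as inclusive is an assumption rather than documented behavior.

```python
from decimal import Decimal


def parse_number_attribute(spec: str) -> tuple[Decimal, Decimal, Decimal]:
    """Split a 'min;max;step' string into three numbers and validate them."""
    parts = [part.strip() for part in spec.split(";")]
    if len(parts) != 3:
        raise ValueError(f"expected 'min;max;step', got {spec!r}")
    minimum, maximum, step = (Decimal(part) for part in parts)
    if step <= 0 or minimum > maximum:
        raise ValueError(f"invalid range in {spec!r}")
    return minimum, maximum, step


def allowed_values(spec: str) -> list[Decimal]:
    """List every value reachable from min in increments of step, up to max."""
    minimum, maximum, step = parse_number_attribute(spec)
    values = []
    current = minimum
    while current <= maximum:  # assumption: max itself is allowed
        values.append(current)
        current += step
    return values


print(allowed_values("0;10;5"))  # [Decimal('0'), Decimal('5'), Decimal('10')]
```

Decimal is used instead of float so that fractional steps such as 0.1 add up exactly.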

Select files

There are several ways to upload files:

Data source Description
My computer Use this option to select files from your laptop or PC.
To select files:
1. Click the Select files field.
2. Select the files to upload.
Connected file share Advanced option.
Upload files from a local or cloud shared folder.
Note that you need to mount a file share first.
For more information, see Share path.
Remote source Enter a list of URLs (one per line) in the field.
Cloud Storage Advanced option.
To upload files from cloud storage, type the cloud storage name, choose the manifest file, and select the required files.
For more information, see Attach cloud storage.

Editing labels in RAW format

Raw is a way of working with labels intended for advanced users.

It is useful when you need to copy labels from one independent task to another.

Note: Be careful when changing the raw specification of an existing task or project. Removing any “id” properties will cause existing annotations to be lost. This property is removed automatically from any text you insert into this field.

Raw presents label data in JSON format, with the option of editing and copying labels as text. The Done button applies the changes and the Reset button cancels them.
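
When copying labels between independent tasks, it can help to see what a specification looks like once its "id" properties are gone. The sketch below removes every "id" key from a parsed JSON structure. The sample text and its field names ("name", "attributes", "values") are assumptions made for illustration, not an authoritative schema, and `strip_ids` is a hypothetical helper.

```python
import json


def strip_ids(node):
    """Return a copy of a parsed JSON structure with every "id" key removed."""
    if isinstance(node, dict):
        return {key: strip_ids(value) for key, value in node.items() if key != "id"}
    if isinstance(node, list):
        return [strip_ids(item) for item in node]
    return node


# Hypothetical Raw text copied from an existing task.
raw_text = """
[
  {"id": 7, "name": "car", "attributes": [
    {"id": 12, "name": "color", "values": ["red", "blue"]}
  ]},
  {"id": 8, "name": "sun", "attributes": []}
]
"""

portable = strip_ids(json.loads(raw_text))
print(json.dumps(portable, indent=2))
```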

Data formats for a 3D task

To create a 3D task, you must prepare an archive with one of the following directory structures.

Note: You can’t mix 2D and 3D data in the same task.

  VELODYNE FORMAT
    Structure:
      velodyne_points/
        data/
          image_01.bin
      IMAGE_00/ # unknown dirname,
                # generally image_01.png can be under IMAGE_00, IMAGE_01, IMAGE_02, IMAGE_03, etc
        data/
          image_01.png
  3D POINTCLOUD DATA FORMAT
    Structure:
      pointcloud/
        00001.pcd
      related_images/
        00001_pcd/
          image_01.png # or any other image
  3D, DEFAULT DATAFORMAT Option 1
    Structure:
      data/
        image.pcd
        image.png
  3D, DEFAULT DATAFORMAT Option 2
    Structure:
      data/
        image_1/
          image_1.pcd
          context_1.png # or any other name
          context_2.jpg
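
As an illustration of the 3D pointcloud layout above, the following sketch packs .pcd files and their context images into a zip archive with the pointcloud/ and related_images/ directories. The `build_pointcloud_archive` helper and the placeholder files are hypothetical. The sketch only arranges paths and does not validate the point cloud data.

```python
import tempfile
import zipfile
from pathlib import Path


def build_pointcloud_archive(archive_path, frames):
    """Write a zip with pointcloud/<name>.pcd and related_images/<name>_pcd/<image>.

    frames maps a .pcd file path to a list of context image paths.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pcd_path, image_paths in frames.items():
            pcd_path = Path(pcd_path)
            archive.write(pcd_path, f"pointcloud/{pcd_path.name}")
            for image_path in image_paths:
                image_path = Path(image_path)
                archive.write(
                    image_path,
                    f"related_images/{pcd_path.stem}_pcd/{image_path.name}",
                )
    return archive_path


# Demo with placeholder files created in a temporary directory.
work_dir = Path(tempfile.mkdtemp())
(work_dir / "00001.pcd").write_bytes(b"placeholder point cloud")
(work_dir / "image_01.png").write_bytes(b"placeholder image")
archive = build_pointcloud_archive(
    work_dir / "task.zip",
    {work_dir / "00001.pcd": [work_dir / "image_01.png"]},
)
with zipfile.ZipFile(archive) as zf:
    archive_names = sorted(zf.namelist())
print(archive_names)
# ['pointcloud/00001.pcd', 'related_images/00001_pcd/image_01.png']
```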

Advanced configuration

Use advanced configuration to set additional parameters for the task and customize it to meet specific needs or requirements.

The following parameters are available:

Element Description
Sorting method Note: Does not work for the video data.

Several methods to sort the data.
For example, the sequence 2.jpeg, 10.jpeg, 1.jpeg after sorting will be:

  • Lexicographical: 1.jpeg, 10.jpeg, 2.jpeg
  • Natural: 1.jpeg, 2.jpeg, 10.jpeg
  • Predefined: 2.jpeg, 10.jpeg, 1.jpeg
  • Random: uploads data in random order.

    Use zip/video chunks Use this parameter to divide your video or image dataset for annotation into short video clips or zip files of frames.
    Zip files are larger but do not require decoding on the client side, and video clips are smaller but require decoding.
    It is recommended to turn off this parameter for video tasks to reduce traffic between the client side and the server.
    Use cache Select the checkbox to enable on-the-fly data processing, which reduces task creation time and stores data in a cache with a policy of
    evicting less popular items.

    For more information, see Data preparation on the fly.
    Image Quality CVAT has two types of data: original quality and compressed. Original quality images are used for dataset export
    and automatic annotation. Compressed images are used only for annotations to reduce traffic between the server
    and client side.
    It is recommended to adjust the compression level only if the images contain small objects that are not
    visible in the original quality.
    Values range from 5 (highly compressed images) to 100 (not compressed).
    Overlap Size Use this parameter to create overlapped segments, making tracking continuous from one segment to another.

    Note that this functionality only works for bounding boxes.

    This parameter has the following options:

    Interpolation task (video sequence). If you annotate with a bounding box on two adjacent segments, they will be
    merged into a single bounding box. In case the overlap is zero or the bounding box is inaccurate (not enclosing the object
    properly, misaligned or distorted) on the adjacent segments, it may be difficult to accurately interpolate the object’s
    movement between the segments. As a result, multiple tracks will be created for the same object.

    Annotation task (independent images). If an object exists on overlapped segments with overlap greater than zero,
    and the annotation of these segments is done properly, then the segments will be automatically merged into a single
    object. If the overlap is zero or the annotation is inaccurate (not enclosing the object properly, misaligned, distorted) on the
    adjacent segments, it may be difficult to accurately track the object. As a result, multiple bounding boxes will be
    created for the same object.

    If the annotations on different segments (on overlapped frames) are very different, you will have two shapes
    for the same object.

    To avoid this, accurately annotate the object on the first segment and the same object on the second segment to create a track
    between two annotations.
    Segment size Use this parameter to divide a dataset into smaller parts. For example, if you want to share a dataset among multiple
    annotators, you can split it into smaller sections and assign each section to a separate job.
    This allows annotators to work on the same dataset concurrently.
    Start frame Defines the first frame of the video.
    Stop frame Defines the last frame of the video.
    Frame Step Use this parameter to filter video frames or images in a dataset. Specify frame step value to include only
    certain frames or images in the dataset.
    For example, if the frame step value is 25, the dataset will include every 25th frame or image. If a video
    has 100 frames, setting the frame step to 25 will include only frames 1, 26, 51, and 76 in the dataset.
    This can be useful for reducing the size of the dataset, or for focusing on specific frames or images that are
    of particular interest.
    Chunk size Defines the number of frames to be packed in a chunk when sent from client to server.
    If the field is left empty, the server defines the value automatically.
    Recommended values:
  • 1080p or less: 36
  • 2k or less: 8 - 16
  • 4k or less: 4 - 8
  • More: 1 - 4

    Dataset repository Advanced option.
    URL link of the repository that specifies the path to the repository for storage (default: annotation / <dump_file_name> .zip).
    Supports .zip and .xml formats.

    Field format: URL [PATH] example: https://github.com/project/repos.git [1/2/3/4/annotation.xml]

    Supported URL formats:
  • https://github.com/project/repos[.git]
  • github.com/project/repos[.git]
  • git@github.com:project/repos[.git]

    After the task is created, the synchronization status will show up on the task page.
    If you specify a dataset repository, when you create a task, you will see a message about the need to grant access with
    the ssh key.
    This is the key you need to add to your github account.
    For other git systems, you can learn about adding an ssh key in their documentation.

    Use LFS Advanced option.
    Use this parameter for big annotation files, to create a repository with LFS support.
    Issue tracker Use this parameter to specify the issue tracker URL.
    Source storage Specify the source storage for importing resources like annotations and backups.
    If the task was assigned to the project, use the Use project source storage toggle to determine whether to
    use project values or specify new ones.
    Target storage Specify the target storage (local or cloud) for exporting resources like annotations and backups.
    If the task is created in the project, use the Use project target storage toggle to determine whether to
    use project values or specify new ones.
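
The orderings listed under Sorting method can be reproduced with a short script. This is a sketch of the general techniques (plain string sorting versus a natural sort key), not CVAT's own implementation. `natural_key` is a hypothetical helper.

```python
import random
import re


def natural_key(name: str):
    """Split a name into text and integer runs so that 2 sorts before 10."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", name)
    ]


files = ["2.jpeg", "10.jpeg", "1.jpeg"]

lexicographical = sorted(files)                # compares character by character
natural = sorted(files, key=natural_key)       # compares digit runs as numbers
predefined = list(files)                       # keeps the given order
shuffled = random.sample(files, k=len(files))  # a random order

print(lexicographical)  # ['1.jpeg', '10.jpeg', '2.jpeg']
print(natural)          # ['1.jpeg', '2.jpeg', '10.jpeg']
print(predefined)       # ['2.jpeg', '10.jpeg', '1.jpeg']
```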

    To save and open the task, click Submit & Open.

    To create several tasks in sequence, click Submit & Continue.

    Created tasks will be displayed on the tasks page.
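
The Frame Step selection from the advanced configuration can be sketched with a plain range. The sketch assumes frames are numbered from 1, as in the example in the table, and that the rule is strictly "every Nth frame". `select_frames` is a hypothetical helper, not a CVAT function.

```python
def select_frames(first_frame: int, last_frame: int, step: int) -> list[int]:
    """Return every step-th frame number from first_frame through last_frame."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(first_frame, last_frame + 1, step))


print(select_frames(1, 100, 25))  # [1, 26, 51, 76]
print(select_frames(1, 10, 1))    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The Start frame and Stop frame parameters correspond to `first_frame` and `last_frame` in this sketch.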

    5.1.3 - Create multi tasks

    Step-by-step guide on how to create and set up multiple tasks

    Use Create multi tasks to create multiple video annotation tasks with the same configuration.

    The Create multi tasks feature is available for videos only.

    To create the multi tasks, on the Tasks page click + and select Create multi tasks.

    Create multi tasks

    To add several tasks in one go, open the task configurator:

    Multitask configurator

    And specify the following parameters:

    1. In the Name field, enter the name of the new task:

      • Enter the name of the task. If the name includes more than one word, use the underscore: Word1_word2_word3
      • (Optional) {{index}} adds an index to the file in the set (starting from 0).
      • (Optional) {{file_name}} adds the file’s name to the task’s name.

        Note: use a hyphen between the three parameters: Word1_word2_word3-{{index}}-{{file_name}}

    2. (Optional) From the Projects drop-down, select a project for the tasks.
      Leave this field empty if you do not want to assign tasks to any project.

      Select project

      Note: The following steps apply only if the tasks do not belong to a project.
      If the tasks have been assigned to a project, the project’s labels will be applied to the tasks.

    3. On the Constructor tab, click Add label.

    4. In the Label name field, enter the name of the label.

    5. (Optional) Select the color for the label.

    6. (Optional) Click Add an attribute and set up its properties.

    7. Click Select files to upload files for annotation.

      Note: You cannot upload multiple tasks from the cloud storage.

    8. Click Submit N tasks
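
To see how the {{index}} and {{file_name}} placeholders from step 1 expand, here is a minimal sketch that renders one task name per selected file. The file names are made up, `render_task_names` is a hypothetical helper, and the assumption that {{file_name}} includes the extension is not confirmed by the text.

```python
def render_task_names(template: str, file_names: list[str]) -> list[str]:
    """Fill {{index}} (starting from 0) and {{file_name}} for each selected file."""
    names = []
    for index, file_name in enumerate(file_names):
        name = template.replace("{{index}}", str(index))
        name = name.replace("{{file_name}}", file_name)
        names.append(name)
    return names


task_names = render_task_names(
    "Create_multitask-{{index}}-{{file_name}}",
    ["intro.mp4", "outro.mp4"],  # hypothetical video files
)
print(task_names)
# ['Create_multitask-0-intro.mp4', 'Create_multitask-1-outro.mp4']
```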

    Example

    A step-by-step example for creating the multiple tasks:

    1. In the Name field, enter the Create_multitask-{{index}}-{{file_name}}.

    2. Add labels.

    3. Select files.
      In case there are more than four files, only the total number of selected files will be displayed:

    4. Click Submit N tasks

    5. You will see a progress bar that shows the progress of the tasks being created:

    6. Click Ok.

    The result will look like the following:

    Errors

    During the process of adding multiple tasks, the following errors may occur:

    Error Description
    Wrong file format. You can add only video files.
    In the process of creating a task, CVAT was not able to process the video file.
    The name of the failed file will be displayed on the progress bar.

    To fix this issue:
  • If you want to try again, click Retry failed tasks.
  • If you want to skip the file, click OK.

    Advanced configuration

    Use advanced configuration to set additional parameters for the task and customize it to meet specific needs or requirements.

    For more information, see Advanced configuration

    5.1.4 - Jobs page

    On the jobs page, users (for example, with the worker role) can see the jobs that are assigned to them without having access to the task page, as well as track progress, sort and apply filters to the job list.

    On the jobs page there is a list of jobs presented in the form of tiles, where each tile is one job. Each tile contains:

    • job ID
    • dimension 2D or 3D
    • preview
    • stage and state
    • when hovering over an element, you can see:
      • size
      • assignee
    • menu to navigate to a task, project, or bug tracker.

    To open the job in a new tab, click the job while holding Ctrl.

    In the upper left corner there is a search bar that you can use to find a job by assignee, stage, state, etc. In the upper right corner there are sorting, quick filters, and a filter.

    Filter

    Applying a filter disables the quick filters.

    The filter works similarly to the filters for annotation: you can create rules from properties, operators and values, and combine rules into groups. For more details, see the filter section. Learn more about date and time selection.

    To clear all filters, press Clear filters.

    Supported properties for jobs list

    Properties Supported values Description
    State all the state names The state of the job
    (can be changed in the menu inside the job)
    Stage all the stage names The stage of the job
    (is specified by a drop-down list on the task page)
    Dimension 2D or 3D Depends on the data format
    (read more in creating an annotation task)
    Assignee username Assignee is the user who is working on the job.
    (is specified on task page)
    Last updated last modified date and time (or value range) The date can be entered in the dd.MM.yyyy HH:mm format
    or by selecting the date in the window that appears
    when you click on the input field
    ID number or range of job ID
    Task ID number or range of task ID
    Project ID number or range of project ID
    Task name task name Set when creating a task,
    can be changed on the (task page)
    Project name project name Specified when creating a project,
    can be changed on the (project section)
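
The dd.MM.yyyy HH:mm format accepted by the Last updated property corresponds to the `strptime` pattern `%d.%m.%Y %H:%M` in Python. The sketch below parses such a value and checks it against a value range. The constant and helper names are hypothetical and only illustrate the format.

```python
from datetime import datetime

# Python equivalent of the dd.MM.yyyy HH:mm pattern.
LAST_UPDATED_FORMAT = "%d.%m.%Y %H:%M"


def parse_last_updated(text: str) -> datetime:
    """Parse a date written as dd.MM.yyyy HH:mm."""
    return datetime.strptime(text, LAST_UPDATED_FORMAT)


def in_range(value: datetime, start: str, end: str) -> bool:
    """Check whether value lies between two dd.MM.yyyy HH:mm bounds, inclusive."""
    return parse_last_updated(start) <= value <= parse_last_updated(end)


updated = parse_last_updated("05.03.2023 14:30")
print(updated.isoformat())                                        # 2023-03-05T14:30:00
print(in_range(updated, "01.03.2023 00:00", "31.03.2023 23:59"))  # True
```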

    5.1.5 - Tasks page

    Overview of the Tasks page.

    The tasks page contains elements, each of which relates to a separate task. They are sorted in creation order. Each element contains: task name, preview, progress bar, the Open button, and the Actions menu. Each button in the Actions menu is responsible for a specific function:

    • Export task dataset — download annotations or annotations and images in a specific format. More information is available in the export/import datasets section.
    • Upload annotation - upload annotations in a specific format. More information is available in the export/import datasets section.
    • Automatic Annotation — automatic annotation with OpenVINO toolkit. Presence depends on how you build the CVAT instance.
    • Backup task — make a backup of this task into a zip archive. Read more in the backup section.
    • Move to project - move a task to a project (you can move only a task that does not belong to any project). In case of a label mismatch, you can create or delete the necessary labels in the project/task. Some task labels can be matched with the target project labels.
    • Delete — delete task.

    In the upper left corner there is a search bar that you can use to find a task by assignee, task name, etc. In the upper right corner there are sorting, quick filters, and a filter.

    Filter

    Applying a filter disables the quick filters.

    The filter works similarly to the filters for annotation: you can create rules from properties, operators and values, and combine rules into groups. For more details, see the filter section. Learn more about date and time selection.

    To clear all filters, press Clear filters.

    Supported properties for tasks list

    Properties Supported values Description
    Dimension 2D or 3D Depends on the data format
    (read more in creating an annotation task)
    Status annotation, validation or completed
    Data video, images Depends on the data format
    (read more in creating an annotation task)
    Subset test, train, validation or custom subset (read more about subsets)
    Assignee username Assignee is the user who is working on the project, task or job.
    (is specified on task page)
    Owner username The user who owns the project, task, or job
    Last updated last modified date and time (or value range) The date can be entered in the dd.MM.yyyy HH:mm format
    or by selecting the date in the window that appears
    when you click on the input field
    ID number or range of job ID
    Project ID number or range of project ID
    Name name On the tasks page - name of the task,
    on the project page - name of the project
    Project name project name Specified when creating a project,
    can be changed on the (project section)
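
The idea of building a filter from rules (property, operator, value) combined into groups can be illustrated with a small evaluator. The rule and group representation below is hypothetical and chosen for readability. It is not the format CVAT stores or sends, and the task records are made-up sample data.

```python
import operator

# Comparison operators available to a rule; the set is illustrative.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
}


def matches(item: dict, node) -> bool:
    """Evaluate a rule tuple or a group dict against one task record."""
    if isinstance(node, dict):
        # A group has exactly one key: "all" (AND) or "any" (OR).
        (kind, children), = node.items()
        combine = all if kind == "all" else any
        return combine(matches(item, child) for child in children)
    prop, op, value = node
    return OPERATORS[op](item[prop], value)


tasks = [
    {"id": 1, "dimension": "2D", "status": "annotation"},
    {"id": 2, "dimension": "3D", "status": "annotation"},
    {"id": 3, "dimension": "2D", "status": "completed"},
]

# 2D tasks that are still in annotation or validation.
query = {
    "all": [
        ("dimension", "==", "2D"),
        {"any": [("status", "==", "annotation"), ("status", "==", "validation")]},
    ]
}

selected_ids = [task["id"] for task in tasks if matches(task, query)]
print(selected_ids)  # [1]
```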

    Click Open to go to the task details.

    5.1.6 - Task details

    Overview of the Task details page.

    Task details is a task page which contains a preview, a progress bar and the details of the task (specified when the task was created) and the jobs section.

    • The following actions are available on this page:

      1. Change the task’s title.

      2. Open Actions menu.

      3. Change issue tracker or open issue tracker if it is specified.

      4. Change labels (available only if the task is not related to the project). You can add new labels or add attributes for the existing labels in the Raw mode or the Constructor mode. By clicking Copy you will copy the labels to the clipboard.

      5. Assigned to — is used to assign a task to a person. Start typing an assignee’s name and/or choose the right person out of the dropdown list. In the list of users, you will only see the users of the organization where the task is created.

      6. Dataset Repository

        • Repository link
        • Synchronization status with dataset repository. When you click on the status, the current annotation will be sent. It has several states:
          • Synchronized - the task is synchronized, that is, a pull request with an up-to-date annotation file has been created.
          • Merged - the pull request with the up-to-date annotation file has been merged.
          • Synchronize - highlighted in red, annotations are not synced.
        • Use a format - a drop-down list of formats in which the annotation can be synchronized.
        • Support for large files - enables the use of LFS.
    • Jobs - a list of all jobs for a particular task. Here you can find the following data:

      • Jobs name with a hyperlink to it.
      • Frames — the frame interval.
      • A stage of the job. The stage is specified by a drop-down list. There are three stages: annotation, validation, and acceptance. This value affects the task progress bar.
      • A state of the job. The state can be changed by an assigned user in the menu inside the job. There are several possible states: new, in progress, rejected, completed.
      • Started on — start date of this job.
      • Duration - the amount of time spent working on the job.
      • Assignee is the user who is working on the job. You can start typing an assignee’s name and/or choose the right person out of the dropdown list.
      • Reviewer – a user assigned to carry out the review, read more in the review section.
      • Copy. By clicking Copy you will copy the job list to the clipboard. The job list contains direct links to jobs.

      You can filter or sort jobs by status, as well as by assignee or reviewer.

    Follow a link inside the Jobs section to start the annotation process. In some cases, you can have several links, depending on the size of your task and the Overlap Size and Segment Size parameters. To improve UX, only the first chunk of several frames is loaded at first so that you can start annotating the first images. Other frames are loaded in the background.

    5.1.7 - Interface of the annotation tool

    Main user interface

    The tool consists of:

    • Header - pinned header used to navigate CVAT sections and account settings;

    • Top panel — contains navigation buttons, main functions and menu access;

    • Workspace — space where images are shown;

    • Controls sidebar — contains tools for navigating the image, zoom, creating shapes and editing tracks (merge, split, group);

    • Objects sidebar — contains label filter, two lists: objects (on the frame) and labels (of objects on the frame) and appearance settings.

    Pop-up messages

    Pop-up message

    In CVAT, you’ll receive pop-up messages in the upper-right corner, on any page. Pop-up messages can contain useful information, links, or error messages.

    Information message

    Informational messages inform about the end of the auto-annotation process. Learn more about auto-annotation.

    Jump Suggestion Messages

    Open a task

    After creating a task, you can immediately open it by clicking Open task. Learn more about creating a task.

    Continue to the frame on which the work on the job is finished

    When you open a job that you previously worked on, you will receive pop-up messages with a proposal to go to the frame that was visited before closing the tab.

    Error Messages

    If you perform impossible actions, you may receive an error message. The message may contain information about the error or a prompt to open the browser console (shortcut F12) for information. If you encounter a bug that you can’t solve yourself, you can create an issue on GitHub.

    5.1.8 - Basic navigation

    Overview of basic controls.
    1. Use arrows below to move to the next/previous frame. Use the scroll bar slider to scroll through frames. Almost every button has a shortcut. To get a hint about a shortcut, just move your mouse pointer over a UI element.

    2. To navigate the image, use the button on the controls sidebar. Another way to move the image is to hold the left mouse button inside an area without annotated objects. If the mouse wheel is pressed, all annotated objects are ignored. Otherwise, a highlighted bounding box will be moved instead of the image itself.

    3. You can use the button on the controls sidebar to zoom in on a region of interest. Use the Fit the image button to fit the image in the workspace. You can also use the mouse wheel to scale the image (the image will be zoomed relative to your current cursor position).

    5.1.9 - Top Panel

    Overview of controls available on the top panel of the annotation tool.


    It is the main menu of the annotation tool. It can be used to download, upload and remove annotations.

    Button assignment:

    • Upload Annotations — uploads annotations into a task.

    • Export as a dataset — download a data set from a task in one of the supported formats. You can also enter a Custom name and enable the Save images checkbox if you want the dataset to contain images.

    • Remove Annotations - opens a confirmation window. If you click Delete, the annotations of the current job will be removed. If you click Select range, you can remove annotations in a range of frames. If you select the Delete only keyframe for tracks checkbox, only keyframes will be deleted from the tracks in the selected range.

    • Open the task — opens a page with details about the task.

    • Change job state - changes the state of the job (new, in progress, rejected, completed).

    • Finish the job/Renew the job - changes the job stage and state to acceptance and completed / annotation and new, respectively.
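
The effect of Finish the job and Renew the job on the stage and state can be summarized with a small model. This is a hypothetical sketch of the transitions described in the last item of the list, not CVAT's data model.

```python
from dataclasses import dataclass

STAGES = ("annotation", "validation", "acceptance")
STATES = ("new", "in progress", "rejected", "completed")


@dataclass
class Job:
    stage: str = "annotation"
    state: str = "new"

    def finish(self) -> None:
        """Finish the job: stage becomes acceptance, state becomes completed."""
        self.stage, self.state = "acceptance", "completed"

    def renew(self) -> None:
        """Renew the job: stage becomes annotation, state becomes new."""
        self.stage, self.state = "annotation", "new"


job = Job(stage="validation", state="in progress")
job.finish()
print(job)  # Job(stage='acceptance', state='completed')
job.renew()
print(job)  # Job(stage='annotation', state='new')
```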

    Save Work

    Saves annotations for the current job. The button has an indication of the saving process.

    Undo-redo buttons

    Use buttons to undo actions or redo them.


    Done

    Used to complete the creation of the object. This button appears only when the object is being created.


    Block

    Used to pause automatic line creation when drawing a polygon with OpenCV Intelligent scissors. Also used to postpone server requests when creating an object using AI Tools. When blocking is activated, the button turns blue.


    Player

    Go to the first/last frame.

    Go to the next/previous frame with a predefined step. Shortcuts: V — step backward, C — step forward. By default the step is 10 frames (change at Account Menu —> Settings —> Player Step).

    The button to go to the next / previous frame can be customized. To customize it, right-click the button and select one of three options:

    1. The default option - go to the next / previous frame (the step is 1 frame).
    2. Go to the next / previous frame that has any objects (in particular, filtered ones). Read the filter section for details on how to use it.
    3. Go to the next / previous frame that has no annotations at all. Use this option when you need to find missed frames quickly.

    Shortcuts: D - previous, F - next.

    Play the sequence of frames or the set of images. Shortcut: Space (change at Account Menu —> Settings —> Player Speed).

    Go to a specific frame. Press ~ to focus on the element.

    Delete the frame.

    Shortcut: Alt+Del


    Fullscreen Player

    The fullscreen player mode. The keyboard shortcut is F11.

    Info

    Open the job info.

    Overview:

    • Assignee - the one to whom the job is assigned.
    • Reviewer – a user assigned to carry out the review, read more in the review section.
    • Start Frame - the number of the first frame in this job.
    • End Frame - the number of the last frame in this job.
    • Frames - the total number of all frames in the job.

    Annotations statistics:

    This is a table with the number of created shapes, sorted by label (e.g. vehicle, person) and type of annotation (shape, track), as well as the number of manual and interpolated frames.

    UI switcher

    Switching between user interface modes.

    5.1.10 - Controls sidebar

    Overview of available functions on the controls sidebar of the annotation tool.

    Navigation block - contains tools for moving and rotating images.

    Icon Description
    Cursor (Esc) - a basic annotation editing tool.
    Move the image - a tool for moving around the image without
    the possibility of editing.
    Rotate - two buttons to rotate the current frame
    clockwise (Ctrl+R) and anticlockwise (Ctrl+Shift+R).
    You can enable Rotate all images in the settings to rotate all the images in the job.

    Zoom

    Zoom block - contains tools for image zoom.

    Icon Description
    Fit image- fits image into the workspace size.
    Shortcut - double click on an image
    Select a region of interest- zooms in on a selected region.
    You can use this tool to quickly zoom in on a specific part of the frame.

    Shapes

    Shapes block - contains all the tools for creating shapes.

    Icon Description Links to section
    AI Tools AI Tools
    OpenCV OpenCV
    Rectangle Shape mode; Track mode;
    Drawing by 4 points
    Polygon Annotation with polygons; Track mode with polygons
    Polyline Annotation with polylines
    Points Annotation with points
    Ellipses Annotation with ellipses
    Cuboid Annotation with cuboids
    Brushing tools Annotation with brushing
    Tag Annotation with tags
    Open an issue Review (available only in review mode)

    Edit

    Edit block - contains tools for editing tracks and shapes.

    Icon Description Links to section
    Merge Shapes(M) - starts/stops the merging shapes mode. Track mode (basics)
    Group Shapes (G) - starts/stops the grouping shapes mode. Shape grouping
    Split - splits a track. Track mode (advanced)

    5.1.11 - Objects sidebar

    Overview of available functions on the objects sidebar of the annotation tool.

    In the objects sidebar, you can see the list of available objects on the current frame. The following figure is an example of what the list might look like:

    Shape mode Track mode

    Objects properties

    Filter input box

    How to use filters is described in the advanced guide.

    List of objects

    • Switch lock property for all - switches lock property of all objects in the frame.
    • Switch hidden property for all - switches the hidden property of all objects in the frame.
    • Expand/collapse all - collapses/expands the details field of all objects in the frame.
    • Sorting - sort the list of objects: updated time, ID - ascent, ID - descent

    Objects on the sidebar

    The type of shape can be changed by selecting the Label property. For instance, it can look like shown in the figure below:

    Object action menu

    The action menu is opened with the button:

    The action menu contains:

    • Create object URL - puts a link to an object on the clipboard. After you open the link, this object will be filtered.

    • Make a copy - copies an object. The keyboard shortcut is Ctrl + C > Ctrl + V.

    • Propagate - copies the shape to multiple frames and displays a dialog box where you can specify the number of copies or the frame to which you want to copy the object. The keyboard shortcut is Ctrl + B.
      There are two options available:

      • Propagate forward (Fw propagate) creates a copy of the object on N subsequent frames at the same position.
      • Propagate backward (Back propagate) creates a copy of the object on N previous frames at the same position.

    • To background - moves the object to the background. The keyboard shortcut is - or _.

    • To foreground - moves the object to the foreground. The keyboard shortcut is + or =.

    • Change instance color - choose a color using the color picker (available only in instance mode).

    • Remove - removes the object. The keyboard shortcuts are Del and Shift+Del.
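
The Propagate options in the action menu copy an object onto N neighboring frames. Which frames receive a copy can be sketched as follows. Clipping to the first and last frame of the job is an assumption, and `propagate_targets` is a hypothetical helper.

```python
def propagate_targets(
    current: int, count: int, first: int, last: int, direction: str = "forward"
) -> list[int]:
    """Frames that receive a copy of the object, clipped to the job's frame range."""
    if direction == "forward":
        return list(range(current + 1, min(current + count, last) + 1))
    if direction == "backward":
        return list(range(max(current - count, first), current))
    raise ValueError("direction must be 'forward' or 'backward'")


print(propagate_targets(10, 3, first=0, last=100))                       # [11, 12, 13]
print(propagate_targets(2, 5, first=0, last=100, direction="backward"))  # [0, 1]
```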

    A shape can be locked to prevent it from being modified or moved by accident. Shortcut to lock an object: L.

    A shape can be Occluded. Shortcut: Q. Such shapes have dashed boundaries.

    You can change the way an object is displayed on a frame (show or hide).

    Switch pinned property - when enabled, a shape cannot be moved by dragging or dropping.

    Tracker switcher - enables/disables tracking for the object.

    By clicking on the Details button you can collapse or expand the field with all the attributes of the object.


    Labels

    In this tab, you can lock or hide objects of a certain label. To change the color for a specific label, go to the task page and select the color by clicking the edit button. This changes the label color for all jobs in the task.

    Fast label change

    You can change the label of an object using hotkeys. To do so, you need to assign a number (from 0 to 9) to labels. By default, the numbers 1, 2, …, 0 are assigned to the first ten labels. To assign a number, click the button to the right of a label name on the sidebar.

    After that, you will be able to assign a corresponding label to an object by hovering your mouse cursor over it and pressing Ctrl + Num(0..9).

    In case you do not point the cursor to the object, pressing Ctrl + Num(0..9) will set a chosen label as default, so that the next object you create (use the N key) will automatically have this label assigned.
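
The default assignment of numbers to the first ten labels can be pictured as a simple mapping. This sketch only mirrors the default described above (keys 1 to 9, then 0). The helper name and the sample labels are hypothetical.

```python
# Keys in the default order: 1, 2, ..., 9, then 0.
HOTKEYS = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"]


def default_label_hotkeys(labels: list[str]) -> dict[str, str]:
    """Map Ctrl + <key> to the first ten labels, in label order."""
    return dict(zip(HOTKEYS, labels))


print(default_label_hotkeys(["car", "person", "sun"]))
# {'1': 'car', '2': 'person', '3': 'sun'}
```

Labels beyond the tenth get no key in this sketch, which matches the statement that only the first ten labels receive a number by default.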


    Appearance

    Color By options

    Change the color scheme of the annotation:

    • Instance — every shape has a random color

    • Group - every group of shapes has its own random color, ungrouped shapes are white

    • Label — every label (e.g. car, person) has its own random color

      You can change any random color by pointing to the needed box on a frame or on the objects sidebar.

    Fill Opacity slider

    Change the opacity of every shape in the annotation.

    Selected Fill Opacity slider

    Change the opacity of the selected object’s fill. It is possible to change the opacity while drawing an object in the case of rectangles, polygons, and cuboids.

    Outlines borders checkbox

    You can change a special shape border color by clicking on the Eyedropper icon.

    Show bitmap checkbox

    If enabled, all shapes are displayed in white and the background is black.

    Show projections checkbox

    Enables/disables the display of auxiliary perspective lines. Only relevant for cuboids.

    Hide objects sidebar

    Hide - the button hides the object’s sidebar.

    5.1.12 - Workspace

    Overview of available functions on the workspace of the annotation tool.

    This is the main field in which drawing and editing objects takes place. In addition the workspace also has the following functions:

    • Right-clicking on an object calls up the Object card - this is an element containing the necessary controls for changing the label and attributes of the object, as well as the action menu.

    • Right-clicking a point deletes it.

    • Z-axis slider - allows you to switch annotation layers, hiding the upper layers (the slider is enabled if there are several z layers on a frame). This element has a button for adding a new layer. Pressing it adds a new layer and switches to it. You can move objects between layers using the + and - keys.

    • Image settings panel - used to set up the grid and to adjust image brightness, contrast, and saturation.

      • Show Grid, change grid size, choose color and transparency:

      • Adjust Brightness/Contrast/Saturation of overexposed or too dark images using color settings (it affects only how a user sees the image, not the image itself).

      • Reset color settings to default values.

    5.1.13 - 3D task workspace

    If the related_images folder contains any images, a context image will be available in the perspective window. The contextual image can be compared to the 3D data and helps to identify the labels of marked objects.

    Perspective - the main window for working with objects in a 3D task.

    Projections - projections are tied to an object so that a cuboid is in the center and looks like a rectangle. Projections show only the selected object.

    • Top - a projection of the view from above.
    • Side - a projection of the left side of the object.
    • Front - a frontal projection of the object.

    5.1.14 - Standard 3D mode (basics)

    Standard 3D mode - designed to work with 3D data. The mode is automatically available if you add PCD or KITTI BIN format data when you create a task. Read more.

    To adjust the size of the projections, drag the boundary between them.

    5.1.15 - Settings

    To open the settings, open the user menu in the header and select the Settings item, or press F2.

    Settings have two tabs:

    In tab Player you can:

    • Control step of C and V shortcuts.
    • Control speed of Space/Play button.
    • Select canvas background color. You can choose a background color or enter it manually (in RGB or HEX format).
    • Reset zoom - show every image in full size, or keep the zoom level of the previous image (enabled by default for interpolation mode and disabled for annotation mode).
    • Rotate all images checkbox - switch between rotating all frames and rotating an individual frame.
    • Smooth image checkbox - smooth the image when zooming in (smoothed instead of pixelized).

    In tab Workspace you can:

    • Enable auto save checkbox — turned off by default.

    • Auto save interval (min) input box — 15 minutes by default.

    • Show all interpolation tracks checkbox — shows hidden objects on the side panel for every interpolated object (turned off by default).

    • Always show object details - show text for an object on the canvas not only when the object is activated:

    • Content of a text - setup of the composition of the object details:

      • ID - object identifier.
      • Attributes - attributes of the object.
      • Label - object label.
      • Source - how the object was created: MANUAL, AUTO or SEMI-AUTO.
      • Descriptions - description of attributes.
    • Position of a text - text positioning mode selection:

      • Auto - the object details will be automatically placed where free space is.
      • Center - the object details will be embedded to a corresponding object if possible.
    • Font size of a text - specifies the text size of the object details.

    • Automatic bordering - enable automatic bordering for polygons and polylines during drawing/editing. To find out more, go to the section annotation with polygons.

    • Intelligent polygon cropping - activates intelligent cropping when editing the polygon (read more in the section edit polygon).

    • Show tags on frame - shows/hides frame tags on the current frame.

    • Attribute annotation mode (AAM) zoom margin input box - defines margins (in px) for a shape in the attribute annotation mode.

    • Control points size - defines the size of any interactive points in the tool (polygon’s vertices, rectangle dragging points, etc.).

    • Default number of points in polygon approximation - with this setting, you can choose the default number of points in a polygon. Works for serverless interactors and OpenCV scissors.

    • Click Save to save settings (settings will be saved on the server and will not change after the page is refreshed). Click Cancel or press F2 to return to the annotation.

    5.1.16 - Types of shapes

    List of shapes available for annotation.

    There are several shapes with which you can annotate your images:

    • Rectangle or Bounding box
    • Polygon
    • Polyline
    • Points
    • Ellipse
    • Cuboid
    • Cuboid in 3d task
    • Skeleton
    • Tag

    And here is how they all look:

    Tag - has no shape in the workspace, but is displayed in objects sidebar.

    5.1.17 - Shape mode (basics)

    Usage examples and basic operations available during annotation in shape mode.

    Usage examples:

    • Create new annotations for a set of images.
    • Add/modify/delete objects for existing annotations.
    1. You need to select Rectangle on the controls sidebar:

      Before you start, select the correct Label (should be specified by you when creating the task) and Drawing Method (by 2 points or by 4 points):

    2. Creating a new annotation in Shape mode:

      • Create a separate Rectangle by clicking on Shape.

      • Choose the opposite points. Your first rectangle is ready!

      • To learn more about creating a rectangle read here.

      • It is possible to adjust the boundaries and location of the rectangle using a mouse. The rectangle’s size is shown in the top right corner; you can check it by clicking on any point of the shape. You can also undo your actions using Ctrl+Z and redo them with Shift+Ctrl+Z or Ctrl+Y.

    3. You can see the Object card in the objects sidebar or open it by right-clicking on the object. You can change the attributes in the details section. You can perform basic operations or delete an object by clicking on the action menu button.

    4. The following figure is an example of a fully annotated frame with separate shapes.

      Read more in the section shape mode (advanced).

    5.1.18 - Track mode (basics)

    Usage examples and basic operations available during annotation in track mode.

    Usage examples:

    • Create new annotations for a sequence of frames.
    • Add/modify/delete objects for existing annotations.
    • Edit tracks, merge several rectangles into one track.
    1. Like in the Shape mode, you need to select a Rectangle on the sidebar, in the appearing form, select the desired Label and the Drawing method.

    2. Creating a track for an object (look at the selected car as an example):

      • Create a Rectangle in Track mode by clicking on Track.

      • In Track mode the rectangle will be automatically interpolated on the next frames.

      • The cyclist starts moving on frame #2270. Let’s mark the frame as a key frame. You can press K for that or click the star button (see the screenshot below).

      • If the object starts to change its position, you need to modify the rectangle where it happens. It isn’t necessary to change the rectangle on each frame, simply update several keyframes and the frames between them will be interpolated automatically.

      • Let’s jump 30 frames forward and adjust the boundaries of the object. See an example below:

      • After that the rectangle of the object will be changed automatically on frames 2270 to 2300:

    3. When the annotated object disappears or becomes too small, you need to finish the track. You have to choose Outside Property, shortcut O.

    4. If the object isn’t visible on a couple of frames and then appears again, you can use the Merge feature to merge several individual tracks into one.

      • Create tracks for moments when the cyclist is visible:

      • Click Merge button or press key M and click on any rectangle of the first track and on any rectangle of the second track and so on:

      • Click Merge button or press M to apply changes.

      • The final annotated sequence of frames in Interpolation mode can look like the clip below:

        Read more in the section track mode (advanced).
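    The interpolation between keyframes described in step 2 can be sketched as follows. This is a minimal illustration that assumes linear interpolation of the four rectangle coordinates; the function name and the sample coordinates are invented for the example, and only the frame numbers 2270 and 2300 come from the text above.

```python
def interpolate_box(box_a, box_b, frame_a, frame_b, frame):
    """Linearly interpolate (xtl, ytl, xbr, ybr) between two keyframes."""
    if frame_a == frame_b or not frame_a <= frame <= frame_b:
        raise ValueError("frame must lie between two distinct keyframes")
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + (b - a) * t for a, b in zip(box_a, box_b))


# Hypothetical rectangles on the two keyframes (pixel coordinates).
key_2270 = (100.0, 200.0, 180.0, 320.0)
key_2300 = (160.0, 200.0, 240.0, 320.0)

# Frame 2285 is halfway between the keyframes.
print(interpolate_box(key_2270, key_2300, 2270, 2300, 2285))
```

    Only the two keyframes are edited by hand; every frame in between is derived from them, which is why adjusting a few keyframes is enough.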

    5.1.19 - 3D Object annotation

    Overview of basic operations available when annotating 3D objects.

    Use the 3D Annotation tool for labeling 3D objects and scenes, such as vehicles, buildings, landscapes, and others.


    The 3D annotation canvas looks like the following

    3D canvas

    Note: if you added contextual images to the dataset, the canvas will include them. For more information, see Contextual images

    For information on the available tools, see Controls sidebar.

    You can navigate using the mouse or the navigation keys:

    You can also use keyboard shortcuts to navigate:

    Action          | Keys
    Camera rotation | Shift + Arrow (Up, Down, Left, Right)
    Left/Right      | Alt+J / Alt+L
    Up/Down         | Alt+U / Alt+O
    Zoom in/out     | Alt+K / Alt+I

    Annotation with cuboids

    There are two options available for 3D annotation:

    • Shape: for tasks like object detection.
    • Track: uses interpolation to predict the position of objects in subsequent frames. A unique ID will be assigned to each object and maintained throughout the sequence of images.

    Annotation with shapes

    To add a 3D shape, do the following:

    1. On the objects pane, select Draw new cuboid > select the label from the drop-down list > Shape.

    2. The cursor will be followed by a cuboid. Place the cuboid on the 3D scene.

    3. Use projections to adjust the cuboid. Click and hold the left mouse button to edit the label shape on the projection.

    4. (Optional) Move one of the four points to change the size of the cuboid.

    5. (Optional) To rotate the cuboid, click on the middle point and then drag the cuboid up/down or to left/right.

    Tracking with cuboids

    To track with cuboids, do the following:

    1. On the objects pane, select Draw new cuboid > select the label from the drop-down list > Track.

    2. The cursor will be followed by a cuboid. Place the cuboid on the 3D scene.

    3. Use projections to adjust the cuboid. Click and hold the left mouse button to edit the label shape on the projection.

    4. (Optional) Move one of the four points to change the size of the cuboid.

    5. (Optional) To rotate the cuboid, click on the middle point and then drag the cuboid up/down or to left/right.

    6. Move several frames forward. You will see the cuboid you’ve added in frame 1. Adjust it, if needed.

    7. Repeat to the last frame with the presence of the object you are tracking.

    For more information about tracking, see Track mode

    5.1.20 - Attribute annotation mode (basics)

    Usage examples and basic operations available in attribute annotation mode.
    • In this mode you can edit attributes with fast navigation between objects and frames using a keyboard. Open the drop-down list in the top panel and select Attribute annotation Mode.

    • In this mode, the objects panel changes to a special panel:

    • The active attribute will be red. In this case it is gender. Look at the bottom side panel to see all possible shortcuts for changing the attribute. Press key 2 on your keyboard to assign a value (female) for the attribute or select from the drop-down list.

    • Press Up Arrow/Down Arrow on your keyboard or click the buttons in the UI to go to the next/previous attribute. In this case, after pressing Down Arrow you will be able to edit the Age attribute.

    • Use Right Arrow/Left Arrow keys to move to the next/previous image with annotation.

    To see all the hot keys available in the attribute annotation mode, press F2. Read more in the section attribute annotation mode (advanced).

    5.1.21 - Vocabulary

    List of terms pertaining to annotation in CVAT.

    Label

    Label is a type of an annotated object (e.g. person, car, vehicle, etc.)


    Attribute

    Attribute is a property of an annotated object (e.g. color, model, quality, etc.). There are two types of attributes:

    Unique

    Unique attributes are immutable and cannot be changed from frame to frame (e.g. age, gender, color, etc.).

    Temporary

    Temporary attributes are mutable and can be changed on any frame (e.g. quality, pose, truncated, etc.).


    Track

    Track is a set of shapes on different frames which corresponds to one object. Tracks are created in Track mode


    Annotation

    Annotation is a set of shapes and tracks. There are several types of annotations:

    • Manual which is created by a person
    • Semi-automatic which is created mainly automatically, but the user provides some data (e.g. interpolation)
    • Automatic which is created automatically without a person in the loop

    Approximation

    Approximation allows you to reduce the number of points in the polygon. Can be used to reduce the annotation file and to facilitate editing polygons.
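    To make the idea of reducing the number of points concrete, here is a sketch of one common simplification technique, the Ramer-Douglas-Peucker algorithm. The source does not say which algorithm CVAT uses, so treat this purely as an illustration. The function names, the tolerance parameter, and the sample outline are assumptions made for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop points that lie within `tolerance` of the simplified outline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    distances = [point_segment_distance(p, first, last) for p in points[1:-1]]
    index = max(range(len(distances)), key=distances.__getitem__)
    if distances[index] <= tolerance:
        return [first, last]
    split = index + 1  # position of the farthest point in `points`
    left = simplify(points[: split + 1], tolerance)
    right = simplify(points[split:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


outline = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0), (5, 0.02), (6, 0)]
print(simplify(outline, tolerance=0.1))
```

    Points that barely deviate from a straight line are removed, while the sharp corner at (3, 2) is kept, so the shape stays recognizable with fewer vertices to edit.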


    Trackable

    A trackable object will be tracked automatically if the previous frame was the latest keyframe for the object. More details in the section trackers.


    Mode

    Interpolation

    Mode for video annotation, which uses track objects. Only objects on keyframes are annotated manually, and intermediate frames are linearly interpolated.

    Annotation

    Mode for image annotation, which uses shape objects.


    Dimension

    Depends on the task data type that is defined when the task is created.

    2D

    The data format of 2D tasks is images and videos.

    3D

    The data format of 3D tasks is a point cloud. See Data formats for a 3D task.


    State

    State of the job. The state can be changed by an assigned user in the menu inside the job. There are several possible states: new, in progress, rejected, completed.


    Stage

    Stage of the job. The stage is specified with the drop-down list on the task page. There are three stages: annotation, validation, and acceptance. This value affects the task progress bar.
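    The two job properties can be modeled as two independent enumerations. This is an illustrative sketch only: the class names and the dictionary layout are assumptions, and only the listed values come from the definitions above.

```python
from enum import Enum


class JobState(Enum):
    NEW = "new"
    IN_PROGRESS = "in progress"
    REJECTED = "rejected"
    COMPLETED = "completed"


class JobStage(Enum):
    ANNOTATION = "annotation"
    VALIDATION = "validation"
    ACCEPTANCE = "acceptance"


# A job carries both values at once: the stage is set on the task page,
# the state is changed by the assigned user inside the job.
job = {"stage": JobStage.ANNOTATION, "state": JobState("in progress")}
print(job["stage"].value, "/", job["state"].value)
```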


    Subset

    A project can have subsets. Subsets are groups of tasks that make it easier to work with the dataset. A subset can be test, train, validation, or a custom subset.


    Credentials

    Credentials are one of the following: Key & secret key, Account name and token, Anonymous access, or Key file. They are used to attach cloud storage.


    Resource

    A resource is a bucket name or a container name. It is used to attach cloud storage.

    5.1.22 - Cloud storages page

    Overview of the cloud storages page.

    The cloud storages page contains elements, each of them relating to a separate cloud storage. Each element contains: preview, cloud storage name, provider, creation and update info, status, ? button for displaying the description, and the actions menu.

    Each button in the action menu is responsible for a specific function:

    • Update - update this cloud storage.
    • Delete - delete this cloud storage.

    This preview will appear when it is impossible to get a real preview (e.g. storage is empty or invalid credentials were used).

    In the upper left corner there is a search bar that lets you find a cloud storage by display name, provider, etc. In the upper right corner there are sorting, quick filters, and filter controls.

    Filter

    Applying a filter disables the quick filter.

    The filter works similarly to the filters for annotation: you can create rules from properties, operators, and values, and combine rules into groups. For more details, see the filter section. Learn more about date and time selection.

    To clear all filters, press Clear filters.

    Supported properties for cloud storages list

    Properties       | Supported values | Description
    ID               | number or range of task ID |
    Provider type    | AWS S3, Azure, Google cloud |
    Credentials type | Key & secret key, Account name and token, Anonymous access, Key file |
    Resource name    | | Bucket name or container name
    Display name     | | Set when creating cloud storage
    Description      | | Description of the cloud storage
    Owner            | username | The user who owns the project, task, or job
    Last updated     | last modified date and time (or value range) | The date can be entered in the dd.MM.yyyy HH:mm format or by selecting the date in the window that appears when you click on the input field
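    For reference, the dd.MM.yyyy HH:mm pattern corresponds to day, month, four-digit year, then 24-hour time. The sketch below parses such values with Python's standard library, assuming HH means a 24-hour clock. The constant name, function name, and sample dates are made up for the example.

```python
from datetime import datetime

# strptime equivalent of the dd.MM.yyyy HH:mm pattern (24-hour clock assumed).
FILTER_DATE_FORMAT = "%d.%m.%Y %H:%M"


def parse_filter_date(text):
    """Parse a date typed in the dd.MM.yyyy HH:mm format."""
    return datetime.strptime(text, FILTER_DATE_FORMAT)


# A value range is simply two such dates.
start = parse_filter_date("01.03.2023 09:30")
end = parse_filter_date("15.03.2023 18:00")
print(start.isoformat(), end.isoformat(), start < end)
```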

    Click the + button to attach a new cloud storage.

    5.1.23 - Attach cloud storage

    Instructions on how to attach cloud storage using UI

    In CVAT you can use AWS-S3, Azure Blob Container and Google cloud storages to store image datasets for your tasks.


    AWS S3

    Create a bucket

    To create a bucket, do the following:

    1. Create an AWS account.

    2. Go to the AWS S3 console, and click Create bucket.

    3. Specify the name and region of the bucket. You can also copy the settings of another bucket by clicking on the Choose bucket button.

    4. Enable Block all public access. For access, you will use access key ID and secret access key.

    5. Click Create bucket.

    A new bucket will appear on the list of buckets.

    Upload data

    You need to upload data for annotation and the manifest.jsonl file.

    1. Prepare data. For more information, see prepare the dataset.

    2. Open the bucket and click Upload.

    3. Drag the manifest file and image folder on the page and click Upload:

    Access permissions

    Authorized access

    To add access permissions, do the following:

    1. Go to IAM and click Add users.

    2. Set User name and enable Access key - programmatic access.

    3. Click Next: Permissions.

    4. Click Create group, enter the group name.

    5. Use search to find and select:

      • For read-only access: AmazonS3ReadOnlyAccess.
      • For full access: AmazonS3FullAccess.

    6. (Optional) Add tags for the user and go to the next page.

    7. Save Access key ID and Secret access key.

    For more information, see Creating an IAM user in your AWS account.

    Anonymous access

    To learn how to grant public access to the bucket, see Configuring block public access settings for your S3 buckets.

    Attach AWS S3 storage

    To attach storage, do the following:

    1. Log in to CVAT and, in a separate tab, open your bucket page.
    2. In CVAT, on the top menu select Cloud storages > on the opened page click +.

    Fill in the following fields:

    CVAT               | AWS S3
    Display name       | Preferred display name for your storage.
    Description        | (Optional) Add description of storage.
    Provider           | From drop-down list select AWS S3.
    Bucket name        | Name of the Bucket.
    Authorization type | Depends on the bucket setup:
                       |   • Key id and secret access key pair: available on IAM.
                       |   • Anonymous access: for anonymous access. Public access to the bucket must be enabled.
    Region             | (Optional) Choose a region from the list or add a new one. For more information, see Available locations.
    Manifests          | Click + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, click Submit.
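    Before submitting, it can help to check that the values you collected are consistent with the chosen authorization type. The sketch below is a hypothetical checklist in code form, not a CVAT API object. The class name, attribute names, authorization labels, and sample values are all assumptions made for the example.

```python
import os
from dataclasses import dataclass, field


@dataclass
class S3StorageForm:
    """Hypothetical mirror of the form fields listed above."""
    display_name: str
    bucket_name: str
    authorization_type: str  # "key_pair" or "anonymous" (labels invented here)
    key_id: str = ""
    secret_key: str = ""
    region: str = ""         # optional
    description: str = ""    # optional
    manifests: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable issues; empty means consistent."""
        issues = []
        if not self.display_name:
            issues.append("display name is empty")
        if not self.bucket_name:
            issues.append("bucket name is empty")
        if self.authorization_type == "key_pair":
            if not (self.key_id and self.secret_key):
                issues.append("key pair authorization needs both key id and secret access key")
        elif self.authorization_type != "anonymous":
            issues.append("unknown authorization type")
        for name in self.manifests:
            if not os.path.splitext(name)[1]:
                issues.append(f"manifest {name!r} has no extension")
        return issues


form = S3StorageForm("Pets", "my-bucket", "key_pair",
                     key_id="<access key id>", secret_key="<secret access key>",
                     manifests=["manifest.jsonl"])
print(form.problems())
```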

    AWS manifest file

    To prepare the manifest file, do the following:

    1. Go to AWS CLI and run the script to prepare the manifest file.
    2. Perform the installation, following the aws-shell manual.
      You can configure credentials by running aws configure.
      You will need to enter the Access Key ID and Secret Access Key, as well as the region.
    aws configure
    Access Key ID: <your Access Key ID>
    Secret Access Key: <your Secret Access Key>
    
    3. Copy the content of the bucket to a folder on your computer:
    aws s3 cp <s3://bucket-name> <yourfolder> --recursive
    
    4. After copying the files, create a manifest file as described in the prepare manifest file section:
    python <cvat repository>/utils/dataset_manifest/create.py --output-dir <yourfolder> <yourfolder>
    
    5. When the manifest file is ready, upload it to the AWS S3 bucket:
    • If you granted read and write permissions when you created the user, run:
    aws s3 cp <yourfolder>/manifest.jsonl <s3://bucket-name>
    
    • If you granted read-only permissions, upload the file through the browser: click Upload, drag the manifest file to the page, and click Upload.
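    Before uploading, you may want to confirm that the generated file is well formed. The sketch below assumes that the .jsonl extension means JSON Lines (one JSON value per line). It does not assume anything about the manifest schema, and the function name and demo records are placeholders.

```python
import json
import os
import tempfile


def count_jsonl_records(path):
    """Return the number of JSON records; raise ValueError on a malformed line."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(f"line {number} is not valid JSON") from error
            count += 1
    return count


# Demo on a throwaway file; the records are placeholders, not the real manifest schema.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "manifest.jsonl")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write('{"placeholder": 1}\n{"placeholder": 2}\n')
    print(count_jsonl_records(demo_path))
```

    In practice you would call count_jsonl_records on <yourfolder>/manifest.jsonl.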

    Google Cloud

    Create a bucket

    To create a bucket, do the following:

    1. Create a Google account and log in to it.
    2. On the Google Cloud page, click Start Free, then enter the required data and accept the terms of service.

      Note: Google requires a payment method, so you will need a bank card to complete step 2.

    3. Create a Bucket with the following parameters:
      • Name your bucket: Unique name.
      • Choose where to store your data: Set up a location nearest to you.
      • Choose a storage class for your data: Set a default class > Standard.
      • Choose how to control access to objects: Enforce public access prevention on this bucket > Uniform (default).
      • How to protect data: None


    You will be forwarded to the bucket.

    Upload data

    You need to upload data for annotation and the manifest.jsonl file.

    1. Prepare data. For more information, see prepare the dataset.
    2. Open the bucket and from the top menu select Upload files or Upload folder (depends on how your files are organized).

    Access permissions

    To access Google Cloud Storage, get a Project ID from the cloud resource manager page, then follow the instructions below for your preferred type of access.

    Authorized access

    For authorized access you need to create a service account and key file.

    To create a service account:

    1. In Google Cloud platform, go to IAM & Admin > Service Accounts and click +Create Service Account.
    2. Enter your account name and click Create And Continue.
    3. Select a role, for example Basic > Viewer, and click Continue.
    4. (Optional) Give access rights to the service account.
    5. Click Done.

    To create a key:

    1. Go to IAM & Admin > Service Accounts > click on the account name > Keys.
    2. Click Add key and select Create new key > JSON.
    3. Click Create. The key file will be downloaded automatically.

    For more information about keys, see Learn more about creating keys.
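    The downloaded key is a JSON file, and it can be handy to read the project ID from it rather than copy it by hand. The sketch below assumes the usual layout of a service account key, with a "type" field equal to "service_account" and a "project_id" field. The function name and the demo file contents are invented for the example.

```python
import json
import os
import tempfile


def project_id_from_key_file(path):
    """Read the project ID from a service account JSON key file."""
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    if key.get("type") != "service_account":
        raise ValueError("this does not look like a service account key file")
    return key["project_id"]


# Demo with a fake key file; real files contain more fields, including the private key.
with tempfile.TemporaryDirectory() as folder:
    key_path = os.path.join(folder, "key.json")
    with open(key_path, "w", encoding="utf-8") as handle:
        json.dump({"type": "service_account", "project_id": "my-demo-project"}, handle)
    print(project_id_from_key_file(key_path))
```

    Keep the real key file private: anyone who has it can act as the service account.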

    Anonymous access

    To configure anonymous access:

    1. Open the bucket and go to the Permissions tab.
    2. Click + Grant access to add new principals.
    3. In the New principals field specify allUsers, select roles: Cloud Storage Legacy > Storage Legacy Bucket Reader.
    4. Click Save.

    Now you can attach the new Google Cloud Storage bucket to CVAT.

    Attach Google Cloud storage

    To attach storage, do the following:

    1. Log in to CVAT and, in a separate tab, open your bucket page.
    2. In CVAT, on the top menu select Cloud storages > on the opened page click +.

    Fill in the following fields:

    CVAT               | Google Cloud
    Display name       | Preferred display name for your storage.
    Description        | (Optional) Add description of storage.
    Provider           | From drop-down list select Google Cloud Storage.
    Bucket name        | Name of the bucket. You can find it on the storage browser page.
    Authorization type | Depends on the bucket setup:
                       |   • Authorized access: Click on the Key file field and upload key file from computer.
                       |     Advanced: For self-hosted solution, if the key file was not attached, then environment variable GOOGLE_APPLICATION_CREDENTIALS that was specified for an environment will be used. For more information, see Authenticate to Cloud services using client libraries.
                       |   • Anonymous access: for anonymous access. Public access to the bucket must be enabled.
    Prefix             | (Optional) Used to filter data from the bucket.
    Project ID         | Project ID. For more information, see projects page and cloud resource manager page.
                       | Note: Project name does not match the project ID.
    Location           | (Optional) Choose a region from the list or add a new one. For more information, see Available locations.
    Manifests          | Click + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, click Submit.

    Microsoft Azure

    Create a bucket

    To create a bucket, do the following:

    1. Create a Microsoft Azure account and log in to it.

    2. Go to Azure portal, hover over the resource, and in the pop-up window click Create.

    3. Enter a name for the group and click Review + create, check the entered data and click Create.

    4. Go to the resource groups page, navigate to the group that you created and click Create resources.

    5. On the marketplace page, use search to find Storage account.

    6. Click on Storage account and on the next page click Create.

    7. On the Basics tab, fill in the following fields:

      • Storage account name: to access container from CVAT.
      • Select a region closest to you.
      • Select Performance > Standard.
      • Select Locally-redundant storage (LRS).
      • Click Next: Advanced >.

    8. On the Advanced page, fill in the following fields:

      • (Optional) Disable Allow enabling public access on containers to prohibit anonymous access to the container.
      • Click Next > Networking.

    9. On the Networking tab, fill in the following fields:

      • If you want to change public access, enable Public access from all networks.

      • Click Next > Data protection.

        You do not need to change anything in other tabs unless you need some specific setup.

    10. Click Review and wait for the data to load.

    11. Click Create. Deployment will start.

    12. After deployment is over, click Go to resource.

    Create a container

    To create a container, do the following:

    1. Go to the containers section and on the top menu click +Container.

    2. Enter the name of the container.
    3. (Optional) In the Public access level drop-down, select the type of access.
      Note: this field will be inactive if you disabled Allow enabling public access on containers.
    4. Click Create.

    Upload data

    You need to upload data for annotation and the manifest.jsonl file.

    1. Prepare data. For more information, see prepare the dataset.
    2. Go to the container and click Upload.
    3. Click Browse for files and select images.

      Note: If images are in folder, specify folder in the Advanced settings > Upload to folder.

    4. Click Upload.

    SAS token and connection string

    Use the SAS token or connection string to grant secure access to the container.

    To configure the credentials:

    1. Go to Home > Resource groups > Your resource name > Your storage account.
    2. On the left menu, click Shared access signature.
    3. Change the following fields:
      • Allowed services: Enable Blob. Disable all other fields.
      • Allowed resource types: Enable Container and Object. Disable all other fields.
      • Allowed permissions: Enable Read, Write, and List. Disable all other fields.
      • Start and expiry date: Set up start and expiry dates.
      • Allowed protocols: Select HTTPS and HTTP.
      • Leave all other fields with default parameters.
    4. Click Generate SAS and connection string and copy SAS token or Connection string.
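    A generated SAS token is a query string, so the choices made in step 3 can be checked by reading its parameters. The sketch below assumes the parameter names of an Azure account SAS: ss for services, srt for resource types, sp for permissions, and se for the expiry date. It also assumes single-letter flags for each option. The function name and the sample token, including its fake signature, are placeholders.

```python
from urllib.parse import parse_qs

# Expected flags for the choices described above (assumed encoding:
# b = Blob, c = Container, o = Object, r = Read, w = Write, l = List).
EXPECTED_FLAGS = {"ss": "b", "srt": "co", "sp": "rwl"}


def sas_token_problems(token):
    """Return a list of mismatches between a SAS token and the expected flags."""
    params = {key: values[0] for key, values in parse_qs(token.lstrip("?")).items()}
    problems = []
    for name, expected in EXPECTED_FLAGS.items():
        actual = params.get(name, "")
        if set(actual) != set(expected):
            problems.append(f"{name}: expected '{expected}', found '{actual}'")
    if "se" not in params:
        problems.append("se: expiry date is missing")
    return problems


# Placeholder token with a fake signature, for illustration only.
sample = "?sv=2022-11-02&ss=b&srt=co&sp=rwl&se=2030-01-01T00:00:00Z&spr=https,http&sig=FAKE"
print(sas_token_problems(sample))
```

    An empty list means the token matches the settings from step 3. A token with extra permissions, such as delete, would be reported under sp.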

    Personal use

    For personal use, you can use the Access Key from your storage account in the CVAT SAS Token field.

    To get the Access Key:

    1. In the Azure Portal, go to Security + networking > Access Keys.
    2. Click Show and copy the key.

    Attach Azure Blob Container

    To attach storage, do the following:

    1. Log in to CVAT and, in a separate tab, open your bucket page.
    2. In CVAT, on the top menu select Cloud storages > on the opened page click +.

    Fill in the following fields:

    CVAT               | Azure
    Display name       | Preferred display name for your storage.
    Description        | (Optional) Add description of storage.
    Provider           | From drop-down list select Azure Blob Container.
    Container name     | Name of the cloud storage container.
    Authorization type | Depends on the container setup:
                       |   • Account name and SAS token:
                       |       • Account name: enter the storage account name.
                       |       • SAS token: located in the Shared access signature section of your storage account.
                       |   • Anonymous access: for anonymous access, Allow enabling public access on containers must be enabled.
    Manifests          | Click + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, click Submit.

    Prepare the dataset

    For example, the dataset is The Oxford-IIIT Pet Dataset:

    1. Download the archive with images.
    2. Unpack the archive into the prepared folder.
    3. Create a manifest. For more information, see Dataset manifest:
    python <cvat repository>/utils/dataset_manifest/create.py --output-dir <your_folder> <your_folder>
    

    5.2 - Advanced

    This section contains advanced documents for CVAT users.

    5.2.1 - Projects page

    Creating and exporting projects in CVAT.

    Projects page

    On this page you can create a new project, create a project from a backup, and also see the created projects.

    In the upper left corner there is a search bar that lets you find a project by project name, assignee, etc. In the upper right corner there are sorting, quick filters, and filter controls.

    Filter

    Applying a filter disables the quick filter.

    The filter works similarly to the filters for annotation: you can create rules from properties, operators, and values, and combine rules into groups. For more details, see the filter section. Learn more about date and time selection.

    To clear all filters, press Clear filters.

    Supported properties for projects list

    Properties   | Supported values | Description
    Assignee     | username | Assignee is the user who is working on the project, task or job (is specified on task page)
    Owner        | username | The user who owns the project, task, or job
    Last updated | last modified date and time (or value range) | The date can be entered in the dd.MM.yyyy HH:mm format or by selecting the date in the window that appears when you click on the input field
    ID           | number or range of job ID |
    Name         | name | On the tasks page - name of the task, on the project page - name of the project
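    To illustrate how rules built from a property, an operator, and a value combine into groups, here is a small evaluator. It is a sketch only and does not reproduce the format CVAT uses internally. The dictionary layout, the operator set, and the sample projects are assumptions made for the example.

```python
import operator

OPERATORS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le, ">=": operator.ge}


def matches(item, node):
    """Evaluate a single rule or a group of rules against one item."""
    if "rules" in node:  # a group combines nested rules with "and" / "or"
        results = (matches(item, child) for child in node["rules"])
        return all(results) if node["combine"] == "and" else any(results)
    compare = OPERATORS[node["operator"]]
    return compare(item[node["property"]], node["value"])


projects = [
    {"id": 1, "name": "Pets", "owner": "alice"},
    {"id": 2, "name": "Roads", "owner": "bob"},
    {"id": 3, "name": "Traffic", "owner": "alice"},
    {"id": 4, "name": "Roads", "owner": "alice"},
]

# Owner is alice AND (ID <= 2 OR name is Roads).
query = {"combine": "and", "rules": [
    {"property": "owner", "operator": "==", "value": "alice"},
    {"combine": "or", "rules": [
        {"property": "id", "operator": "<=", "value": 2},
        {"property": "name", "operator": "==", "value": "Roads"},
    ]},
]}

print([project["name"] for project in projects if matches(project, query)])
```

    Nesting a group inside another group is what allows mixed conditions such as the one in the comment above.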

    Create a project

    In CVAT, you can create a project containing tasks of the same type. All tasks related to the project will inherit a list of labels.

    To create a project, go to the projects section by clicking on the Projects item in the top menu. On the projects page, you can see a list of projects, use a search, or create a new project by clicking the + button and selecting Create New Project.

    Note that the project will be created in the organization that you selected at the time of creation. Read more about organizations.

    You can change: the name of the project, the list of labels (which will be used for tasks created as parts of this project) and a skeleton if it’s necessary. In advanced configuration also you can specify: a link to the issue, source and target storages. Learn more about creating a label list, creating the skeleton and attach cloud storage.

    To save and open project click on Submit & Open button. Also you can click on Submit & Continue button for creating several projects in sequence

    Once created, the project will appear on the projects page. To open a project, just click on it.

    Here you can do the following:

    1. Change the project’s title.

    2. Open the Actions menu. Each item in the menu performs a specific function:

      • Export dataset/Import dataset - download/upload annotations or annotations and images in a specific format. More information is available in the export/import datasets section.
      • Backup project - make a backup of the project. Read more in the backup section.
      • Delete - remove the project and all related tasks.
    3. Change issue tracker or open issue tracker if it is specified.

    4. Change labels and skeleton. You can add new labels or add attributes for the existing labels in the Raw mode or the Constructor mode. You can also change the color for different labels. By clicking Setup skeleton you can create a skeleton for this project.

    5. Assigned to: assigns the project to a person. Start typing an assignee’s name and/or choose the right person from the drop-down list.

    6. Tasks: a list of all tasks for a particular project, with the ability to search, sort, and filter tasks in the project. Read more about search. Read more about sorting and filter. It is possible to choose a subset for tasks in the project. You can use the available options (Train, Test, Validation) or set your own.

    5.2.2 - Organization

    Using organization in CVAT.

    Organization is a feature for teams of several users who work together on projects and share tasks.

    Create an Organization, invite your team members, and assign roles to make the team work better on shared tasks.

    Personal workspace

    Personal workspace is the account’s default state, active when no Organization is selected.

    If you do not select an Organization, the system links all new resources directly to your personal account, which prevents sharing them with others.

    When Personal workspace is selected, it is marked with a tick in the menu.

    Create new organization

    To create an organization, do the following:

    1. Log in to CVAT.

    2. On the top menu, click your Username > Organization > + Create.

    3. Fill in the following fields and click Submit.

    Field | Description
    Short name | A name of the organization that will be displayed in the CVAT menu.
    Full Name | Optional. Full name of the organization.
    Description | Optional. Description of the organization.
    Email | Optional. Your email.
    Phone number | Optional. Your phone number.
    Location | Optional. Organization address.

    Upon creation, the organization page will open automatically.

    For future access to your organization, navigate to Username > Organization.

    Note that if you’ve created more than 10 organizations, a Switch organization line will appear in the drop-down menu.

    Switching between organizations

    If you have more than one Organization, it is possible to switch between these Organizations at any given time.

    Follow these steps:

    1. In the top menu, select your Username > Organization.
    2. From the drop-down menu, under the Personal space section, choose the desired Organization.

    Note that if you’ve created more than 10 organizations, a Switch organization line will appear in the drop-down menu.

    Click on it to open the Select organization dialog, and select an organization from the drop-down list.

    Organization page

    The Organization page is where you can edit the Organization information and manage Organization members.

    Note that in order to access the organization page, you must first activate the organization (see Switching between organizations). Without activation, the organization page will remain inaccessible.
    An organization is considered activated when it’s ticked in the drop-down menu and its name is visible in the top-right corner under the username.

    To go to the Organization page, do the following:

    1. On the top menu, click your Username > Organization.
    2. In the drop-down menu, select Organization.
    3. In the drop-down menu, click Settings.

    Invite members into organization

    To add members to an Organization, do the following:

    1. Go to the Organization page, and click Invite members.

    2. Fill in the form (see below).

    3. Click Ok.

    The Invite Members form has the following fields:

    Field | Description
    Email | Specifies the email address of the user who is being added to the Organization. Note that the user you’re inviting must already have a CVAT account (on the same instance) registered to the email address you’re sending the invitation to.
    Role drop-down list | Defines the role of the user, which sets the level of access within the Organization:
      • Worker: Has access only to the tasks, projects, and jobs assigned to them.
      • Supervisor: Can create and assign jobs, tasks, and projects to the Organization members.
      • Maintainer: Has the same capabilities as the Supervisor, but with additional visibility over all tasks and projects created by other members, complete access to Cloud Storages, and the ability to modify members and their roles.
      • Owner: The role assigned to the creator of the organization by default. Has maximum capabilities and cannot be changed or assigned to another user.
    Invite more | Button to add another user to the Organization.

    Members of Organization will appear on the Organization page.

    A member of the organization can leave it by going to Organization page > Leave organization.

    The organization owner can remove members by clicking on the Bin icon.

    Delete organization

    You can remove an organization that you created.

    Note: Removing an organization will delete all related resources (annotations, jobs, tasks, projects, cloud storage, and so on).

    To remove an organization, do the following:

    1. Go to the Organization page.
    2. In the top-right corner click Actions > Remove organization.
    3. Enter the short name of the organization in the dialog field.
    4. Click Remove.

    5.2.3 - Search

    Overview of available search options.

    There are several ways to use the search.

    • Search within all fields (owner, assignee, task name, task status, task mode). To do this, enter a search string in the search field.
    • Search within specific fields. For example:
      • owner: admin - all tasks created by a user whose name contains the substring “admin”
      • assignee: employee - all tasks assigned to a user whose name contains the substring “employee”
      • name: training - all tasks with the substring “training” in their names
      • mode: annotation or mode: interpolation - all tasks with images or videos, respectively
      • status: annotation or status: validation or status: completed - search by status
      • id: 5 - the task with id = 5
    • Multiple filters. Filters can be combined (except for the identifier) using the keyword AND:
      • mode: interpolation AND owner: admin
      • mode: annotation and status: annotation

    The search is case insensitive.
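
The query syntax above can be illustrated with a small matcher. This is a minimal sketch and not CVAT's actual search implementation: the functions `parse_query` and `matches`, the task dictionary layout, and the choice of substring matching for every field except `id` are assumptions made for the example.

```python
import re

# Fields that a free-text query (no "field:" prefix) is checked against.
SEARCHABLE_FIELDS = ("owner", "assignee", "name", "status", "mode")


def parse_query(query):
    """Split 'field: value AND field: value' into (field, value) pairs.

    A clause without a colon is a free-text search; its field is None.
    Everything is lower-cased because the search is case insensitive.
    """
    clauses = re.split(r"\s+and\s+", query.strip(), flags=re.IGNORECASE)
    pairs = []
    for clause in clauses:
        field, sep, value = clause.partition(":")
        if sep:
            pairs.append((field.strip().lower(), value.strip().lower()))
        else:
            pairs.append((None, clause.strip().lower()))
    return pairs


def matches(task, query):
    """Return True if the task dict satisfies every clause of the query."""
    for field, value in parse_query(query):
        if field is None:
            # Free text: the substring may appear in any searchable field.
            if not any(value in str(task.get(f, "")).lower() for f in SEARCHABLE_FIELDS):
                return False
        elif field == "id":
            # Assumption: identifiers are compared exactly.
            if str(task.get("id")) != value:
                return False
        elif value not in str(task.get(field, "")).lower():
            return False
    return True


task = {"id": 5, "name": "Training set", "owner": "admin1",
        "assignee": "employee7", "mode": "interpolation", "status": "annotation"}
print(matches(task, "mode: interpolation AND owner: admin"))  # True
print(matches(task, "status: completed"))  # False
```

One limitation of this sketch: a value that itself contains the word "and" surrounded by spaces would be split into two clauses.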

    5.2.4 - Shape mode (advanced)

    Advanced operations available during annotation in shape mode.

    Basic operations in the mode were described in section shape mode (basics).

    Occluded

    Occlusion is an attribute used if an object is occluded by another object or isn’t fully visible on the frame. Use the Q shortcut to set the property quickly.

    Example: the three cars on the figure below should be labeled as occluded.

    If a frame contains too many objects and it is difficult to annotate them due to many shapes placed mostly in the same place, it makes sense to lock them. Shapes for locked objects are transparent, and it is easy to annotate new objects. Besides, you can’t change previously annotated objects by accident. Shortcut: L.

    5.2.5 - Track mode (advanced)

    Advanced operations available during annotation in track mode.

    Basic operations in the mode were described in section track mode (basics).

    Shapes created in track mode have extra navigation buttons.

    • These buttons help to jump to the previous/next keyframe.

    • The button helps to jump to the initial frame and to the last keyframe.

    You can use the Split function to split one track into two tracks:

    5.2.6 - 3D Object annotation (advanced)

    Overview of advanced operations available when annotating 3D objects.

    Like 2D-task objects, 3D-task objects support changing appearance, attributes, and properties, and have an action menu. Read more in the objects sidebar section.

    Moving an object

    If you hover the cursor over a cuboid and press Shift+N, the cuboid will be cut, so you can paste it in another place (double-click to paste the cuboid).

    Copying

    As in 2D tasks, you can copy and paste objects with Ctrl+C and Ctrl+V, but unlike in 2D tasks, you have to place the copied object in 3D space (double-click to paste).

    Image of the projection window

    You can copy or save the projection-window image by left-clicking on it and selecting “save image as” or “copy image”.

    5.2.7 - Attribute annotation mode (advanced)

    Advanced operations available in attribute annotation mode.

    Basic operations in the mode were described in section attribute annotation mode (basics).

    It is possible to handle lots of objects on the same frame in this mode.

    It is more convenient to annotate objects of the same type. In this case you can apply the appropriate filter. For example, the following filter will hide all objects except person: label=="Person".

    To navigate between objects (person in this case), use the buttons for switching between objects in the frame on the special panel:

    or shortcuts:

    • Tab — go to the next object
    • Shift+Tab — go to the previous object.

    To change the zoom level, open the settings (press F3), go to the Workspace tab, and set the Attribute annotation mode (AAM) zoom margin value in px.

    5.2.8 - Annotation with rectangles

    To learn more about annotation using a rectangle, see the sections:

    Rotation rectangle

    To rotate the rectangle, pull on the rotation point. Rotation is done around the center of the rectangle. To rotate at a fixed angle (multiple of 15 degrees), hold Shift. During rotation, you can see the angle of rotation.
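
The geometry of rotating around the center and snapping to multiples of 15 degrees can be sketched as follows. The function names are hypothetical, and rounding to the nearest multiple is an assumption about how the snapping behaves, not a description of CVAT's code.

```python
import math


def snap_angle(angle_deg, step=15.0):
    """Round an angle to the nearest multiple of `step` degrees."""
    return round(angle_deg / step) * step


def rotated_corners(xtl, ytl, xbr, ybr, angle_deg):
    """Return the four rectangle corners after rotation around its center.

    In image coordinates (y grows downward) a positive angle looks
    clockwise on screen.
    """
    cx, cy = (xtl + xbr) / 2, (ytl + ybr) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    corners = [(xtl, ytl), (xbr, ytl), (xbr, ybr), (xtl, ybr)]
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in corners
    ]


print(snap_angle(37))  # 30.0
print(snap_angle(52))  # 45.0
```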

    Annotation with rectangle by 4 points

    It is an efficient method of bounding box annotation, proposed here. Before starting, you need to make sure that the drawing method by 4 points is selected.

    Click Shape or Track to enter drawing mode. Click on four extreme points: the top, bottom, left-most, and right-most physical points on the object. Drawing will be automatically completed right after clicking the fourth point. Press Esc to cancel editing.
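
Four extreme points determine the bounding box: its sides pass through the smallest and largest x and y coordinates among the clicks. A minimal sketch, with a hypothetical function name and an assumed (xtl, ytl, xbr, ybr) output layout:

```python
def box_from_extreme_points(points):
    """Return (xtl, ytl, xbr, ybr) enclosing the four clicked extreme points."""
    if len(points) != 4:
        raise ValueError("exactly four extreme points are expected")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Clicks on the top, bottom, left-most and right-most points of an object.
clicks = [(50, 10), (60, 90), (20, 40), (100, 55)]
print(box_from_extreme_points(clicks))  # (20, 10, 100, 90)
```

The order of the clicks does not matter in this sketch, since only the minima and maxima are used.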

    5.2.9 - Annotation with polygons

    Guide to creating and editing polygons.

    5.2.9.1 - Manual drawing

    It is used for semantic / instance segmentation.

    Before starting, you need to select Polygon on the controls sidebar and choose the correct Label.

    • Click Shape to enter drawing mode. There are two ways to draw a polygon: either create points by clicking or by dragging the mouse on the screen while holding Shift.
    Clicking points Holding Shift+Dragging
    • When Shift isn’t pressed, you can zoom in/out (by scrolling the mouse wheel) and move (by clicking the mouse wheel and moving the mouse). You can also delete the previous point by right-clicking on it.
    • You can use the Selected opacity slider in the Objects sidebar to change the opacity of the polygon. You can read more in the Objects sidebar section.
    • Press N again or click the Done button on the top panel to complete the shape.
    • After creating the polygon, you can move the points or delete them by right-clicking and selecting Delete point in the context menu, or by clicking with the Alt key pressed.

    5.2.9.2 - Drawing using automatic borders

    You can use auto borders when drawing a polygon. Using automatic borders allows you to automatically trace the outline of polygons existing in the annotation.

    • To do this, go to settings -> workspace tab and enable Automatic Bordering or press Ctrl while drawing a polygon.

    • Start drawing / editing a polygon.

    • Points of other shapes will be highlighted, which means that the polygon can be attached to them.

    • Define the part of the polygon path that you want to repeat.

    • Click on the first point of the contour part.

    • Then click on any point located on part of the path. The selected point will be highlighted in purple.

    • Click on the last point and the outline to this point will be built automatically.

    Besides, you can set a fixed number of points in the Number of points field, then drawing will be stopped automatically. To enable dragging you should right-click inside the polygon and choose Switch pinned property.

    Below you can see results with opacity and black stroke:

    If you need to annotate small objects, increase Image Quality to 95 in Create task dialog for your convenience.

    5.2.9.3 - Edit polygon

    To edit a polygon, click on it while holding Shift; this opens the polygon editor.

    • In the editor you can create new points or delete part of a polygon by closing the line on another point.

    • When the Intelligent polygon cropping option is activated in the settings, CVAT considers two criteria to decide which part of a polygon should be cut off during automatic editing.

      • The first criterion is the number of cut points.
      • The second criterion is the length of the cut curve.

      If both criteria recommend cutting the same part, the algorithm works automatically; if not, the user has to make the decision. If you want to choose manually which part of a polygon should be cut off, disable Intelligent polygon cropping in the settings. In this case, after closing the polygon, you can select the part of the polygon you want to keep.

    • You can press Esc to cancel editing.
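
The two-criteria decision described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the assumption that the part with fewer points and the shorter length is the one to cut is ours, since the text does not state the direction of either criterion.

```python
import math


def arc(points, start, end):
    """Return the vertices from index `start` to index `end`, walking forward."""
    n = len(points)
    out = [points[start]]
    k = start
    while k != end:
        k = (k + 1) % n
        out.append(points[k])
    return out


def path_length(path):
    """Total length of the open polyline through `path`."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def part_to_cut(points, i, j):
    """Pick the part of the polygon to cut when the edit closes on points i and j.

    Returns the chosen part, or None when the two criteria disagree
    (or tie) and the user has to decide.
    """
    a, b = arc(points, i, j), arc(points, j, i)
    by_count = "a" if len(a) < len(b) else "b" if len(b) < len(a) else None
    len_a, len_b = path_length(a), path_length(b)
    by_length = "a" if len_a < len_b else "b" if len_b < len_a else None
    if by_count is not None and by_count == by_length:
        return a if by_count == "a" else b
    return None


polygon = [(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)]
print(part_to_cut(polygon, 0, 2))  # [(0, 0), (1, 0), (2, 0)]
```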

    5.2.9.4 - Track mode with polygons

    Polygons in track mode allow you to mark moving objects more accurately than a rectangle does (Tracking mode (basic); Tracking mode (advanced)).

    1. To create a polygon in the track mode, click the Track button.

    2. Create a polygon the same way as in the case of Annotation with polygons. Press N or click the Done button on the top panel to complete the polygon.

    3. Pay attention to the fact that the created polygon has a starting point and a direction, these elements are important for annotation of the following frames.

    4. After going a few frames forward press Shift+N, the old polygon will disappear and you can create a new polygon. The new starting point should match the starting point of the previously created polygon (in this example, the top of the left mirror). The direction must also match (in this example, clockwise). After creating the polygon, press N and the intermediate frames will be interpolated automatically.

    5. If you need to change the starting point, right-click on the desired point and select Set starting point. To change the direction, right-click on the desired point and select switch orientation.

    There is no need to redraw the polygon every time using Shift+N. Instead, you can simply move the points or edit a part of the polygon by pressing Shift+Click.
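
To see why the starting point and direction matter, consider interpolation that pairs the points of two keyframe polygons by index. The sketch below is a simplification with hypothetical function names: it requires both polygons to have the same number of points and does not reproduce CVAT's own interpolation. The direction is read from the sign of the shoelace area.

```python
def signed_area(points):
    """Shoelace formula; the sign of the result encodes the drawing direction."""
    n = len(points)
    return 0.5 * sum(
        points[k][0] * points[(k + 1) % n][1] - points[(k + 1) % n][0] * points[k][1]
        for k in range(n)
    )


def is_clockwise(points):
    """In image coordinates (y grows downward) a positive area is clockwise on screen."""
    return signed_area(points) > 0


def set_starting_point(points, index):
    """Rotate the vertex list so that points[index] comes first."""
    return points[index:] + points[:index]


def switch_orientation(points):
    """Reverse the direction while keeping the same starting point."""
    return [points[0]] + points[:0:-1]


def interpolate(poly_a, poly_b, t):
    """Blend two polygons point by point; t=0 gives poly_a, t=1 gives poly_b."""
    if len(poly_a) != len(poly_b):
        raise ValueError("this sketch needs the same number of points")
    return [
        (xa + t * (xb - xa), ya + t * (yb - ya))
        for (xa, ya), (xb, yb) in zip(poly_a, poly_b)
    ]


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(x + 2, y) for x, y in square]
print(is_clockwise(square))  # True
print(interpolate(square, moved, 0.5))
```

If `moved` started at a different vertex or ran in the opposite direction, the index-wise pairing would join unrelated corners and the intermediate shapes would twist, which is the effect matching starting points and directions avoids.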

    5.2.9.5 - Creating masks

    Cutting holes in polygons

    Currently, CVAT does not support cutting transparent holes in polygons. However, it is possible to generate holes in exported instance and class masks. To do this, define a background class in the task and draw holes with it as additional shapes above the shapes that need holes:

    The editor window:

    Remember to use z-axis ordering for shapes by [-] and [+, =] keys.

    Exported masks:

    A class mask An instance mask

    Notice that it is currently impossible to have a single instance number for internal shapes (they will be merged into the largest one and then covered by “holes”).
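
The effect of drawing a background-class shape above another shape can be pictured as painting shapes into a class mask in z-order, so that later shapes overwrite earlier ones. The sketch below uses axis-aligned boxes and hypothetical names (`render_class_mask`, the shape dictionaries, the class ids); it illustrates the idea and is not the exporter's code.

```python
def render_class_mask(width, height, shapes, class_ids):
    """Paint shapes into a class mask in ascending z-order.

    Each shape is {"label": str, "z_order": int, "box": (x0, y0, x1, y1)},
    with x1 and y1 exclusive. Shapes painted later overwrite earlier ones.
    """
    mask = [[class_ids["background"]] * width for _ in range(height)]
    for shape in sorted(shapes, key=lambda s: s["z_order"]):
        x0, y0, x1, y1 = shape["box"]
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = class_ids[shape["label"]]
    return mask


class_ids = {"background": 0, "car": 1}
shapes = [
    {"label": "car", "z_order": 0, "box": (0, 0, 6, 6)},
    # A background shape drawn above the car becomes a hole in the mask.
    {"label": "background", "z_order": 1, "box": (2, 2, 4, 4)},
]
for row in render_class_mask(6, 6, shapes, class_ids):
    print("".join(str(v) for v in row))
```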

    Creating masks

    There are several formats in CVAT that can be used to export masks:

    • Segmentation Mask (PASCAL VOC masks)
    • CamVid
    • MOTS
    • ICDAR
    • COCO (RLE-encoded instance masks, guide)
    • TFRecord (over Datumaro, guide)
    • Datumaro

    An example of exported masks (in the Segmentation Mask format):

    A class mask An instance mask

    Important notices:

    • Both boxes and polygons are converted into masks
    • Grouped objects are considered as a single instance and exported as a single mask (label and attributes are taken from the largest object in the group)
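
For readers unfamiliar with the RLE encoding mentioned for COCO in the format list above, the sketch below shows the uncompressed COCO run-length layout: the mask is read column by column, and the counts alternate between runs of zeros and runs of ones, always starting with zeros. The function name is hypothetical, and whether a given export uses this uncompressed form or the compressed string form is not covered here.

```python
def encode_uncompressed_rle(mask):
    """Encode a binary mask (list of rows of 0/1) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    # COCO walks the mask in column-major order.
    flat = [mask[y][x] for x in range(width) for y in range(height)]
    counts = []
    current, run = 0, 0  # counts always begin with the number of leading zeros
    for value in flat:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current, run = value, 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


print(encode_uncompressed_rle([[0, 1],
                               [1, 1]]))  # {'size': [2, 2], 'counts': [1, 3]}
```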

    Class colors

    All the labels have associated colors, which are used in the generated masks. These colors can be changed in the task label properties:

    Label colors are also displayed in the annotation window on the right panel, where you can show or hide specific labels (only the presented labels are displayed):

    A background class can be:

    • A default class, which is implicitly added, of black color (RGB 0, 0, 0)
    • A class named background with any color (it takes priority; the name is case-insensitive)
    • Any class of black color (RGB 0, 0, 0)

    To change background color in generated masks (default is black), change background class color to the desired one.
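
The background rules above can be read as a lookup with a priority order. The sketch below is one possible reading: a class named background wins, then any black class, then the implicit black default. The function name, the label dictionary layout, and the relative order of the last two rules are assumptions.

```python
BLACK = (0, 0, 0)


def resolve_background(labels):
    """Return (name, color) of the class used as background.

    `labels` maps a label name to an (r, g, b) tuple.
    """
    # 1. A class named "background" (case-insensitive) takes priority.
    for name, color in labels.items():
        if name.lower() == "background":
            return name, color
    # 2. Otherwise, any class of black color.
    for name, color in labels.items():
        if color == BLACK:
            return name, color
    # 3. Otherwise, the implicitly added black default class.
    return "background", BLACK


print(resolve_background({"car": (255, 0, 0), "Background": (10, 10, 10)}))
# ('Background', (10, 10, 10))
```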

    5.2.10 - Annotation with polylines

    Guide to annotating tasks using polylines.

    It is used for road markup annotation etc.

    Before starting, you need to select the Polyline. You can set a fixed number of points in the Number of points field, then drawing will be stopped automatically.

    Click Shape to enter drawing mode. There are two ways to draw a polyline: either create points by clicking or by dragging the mouse on the screen while holding Shift. When Shift isn’t pressed, you can zoom in/out (by scrolling the mouse wheel) and move (by clicking the mouse wheel and moving the mouse). You can delete the previous point by right-clicking on it. Press N again or click the Done button on the top panel to complete the shape.

    You can delete a point by clicking on it with Ctrl pressed or by right-clicking on it and selecting Delete point. Clicking with Shift pressed opens the polyline editor. There you can create new points (by clicking or dragging) or delete part of the polyline by closing the red line on another point. Press Esc to cancel editing.

    5.2.11 - Annotation with points

    Guide to annotating tasks using single points or shapes containing multiple points.

    5.2.11.1 - Points in shape mode

    It is used for face, landmarks annotation etc.

    Before you start, you need to select Points. If necessary, you can set a fixed number of points in the Number of points field, then drawing will be stopped automatically.

    Click Shape to enter the drawing mode. Now you can start annotating the necessary area. Points are automatically grouped: all points placed between the start and the finish of drawing are considered linked. Press N again or click the Done button on the top panel to finish marking the area.

    You can delete a point by clicking on it with Ctrl pressed or by right-clicking on it and selecting Delete point. Clicking with Shift pressed opens the points shape editor, where you can add new points to an existing shape. You can zoom in/out (by scrolling the mouse wheel) and move (by clicking the mouse wheel and moving the mouse) while drawing. You can drag an object after it has been drawn and change the position of individual points after finishing an object.

    5.2.11.2 - Linear interpolation with one point

    You can use linear interpolation for points to annotate a moving object:

    1. Before you start, select the Points.

    2. Linear interpolation works only with one point, so you need to set Number of points to 1.

    3. After that, select Track.

    4. Click Track to enter the drawing mode, then left-click to create a point; the shape will be completed automatically.

    5. Move forward a few frames and move the point to the desired position. This creates a keyframe, and the intermediate frames are drawn automatically. You can work with this object as with an interpolated track: you can hide it using Outside, move around keyframes, etc.

    6. This way you’ll get linear interpolation using the Points.
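
The positions on intermediate frames follow from linear interpolation between the two nearest keyframes. A minimal sketch, with a hypothetical function name and keyframe layout; holding the last position after the final keyframe is an assumption of the example:

```python
def interpolate_point(keyframes, frame):
    """Return the (x, y) position of a tracked point on `frame`.

    `keyframes` maps a frame number to the (x, y) set on that keyframe.
    Returns None before the first keyframe, where the track does not exist yet.
    """
    frames = sorted(keyframes)
    if frame < frames[0]:
        return None
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    for left, right in zip(frames, frames[1:]):
        if left <= frame <= right:
            t = (frame - left) / (right - left)
            (x0, y0), (x1, y1) = keyframes[left], keyframes[right]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


keyframes = {0: (10, 10), 10: (30, 50)}
print(interpolate_point(keyframes, 5))  # (20.0, 30.0)
```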

    5.2.12 - Annotation with ellipses

    Guide to annotating tasks using ellipses.

    It is used for road sign annotation etc.

    First of all you need to select the ellipse on the controls sidebar.

    Choose a Label and click Shape or Track to start drawing. An ellipse can be created the same way as a rectangle: you specify two opposite points, and the ellipse is inscribed in the imaginary rectangle they define. Press N or click the Done button on the top panel to complete the shape.
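
The ellipse inscribed in the rectangle spanned by two opposite points has its center at the midpoint of those points and radii equal to half the rectangle's width and height. A short sketch with hypothetical function names:

```python
def ellipse_from_corners(p1, p2):
    """Return (cx, cy, rx, ry) of the ellipse inscribed in the rectangle p1-p2."""
    (x1, y1), (x2, y2) = p1, p2
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    rx, ry = abs(x2 - x1) / 2, abs(y2 - y1) / 2
    return cx, cy, rx, ry


def contains(ellipse, x, y):
    """Return True if (x, y) lies inside or on the (non-degenerate) ellipse."""
    cx, cy, rx, ry = ellipse
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1


ellipse = ellipse_from_corners((10, 20), (50, 40))
print(ellipse)  # (30.0, 30.0, 20.0, 10.0)
print(contains(ellipse, 10, 20))  # False: rectangle corners lie outside
```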

    You can rotate ellipses using a rotation point in the same way as rectangles.

    5.2.13 - Annotation with cuboids

    Guide to creating and editing cuboids.

    It is used to annotate three-dimensional objects such as cars, boxes, etc. Currently the feature supports one-point perspective and has the constraint that the vertical edges are exactly parallel to the sides.

    5.2.13.1 - Creating the cuboid

    Before you start, you have to make sure that Cuboid is selected and choose a drawing method: “from rectangle” or “by 4 points”.

    Drawing cuboid by 4 points

    Choose a drawing method “by 4 points” and click Shape to enter the drawing mode. There are many ways to draw a cuboid. You can draw the cuboid by placing 4 points, after that the drawing will be completed automatically. The first 3 points determine the plane of the cuboid while the last point determines the depth of that plane. For the first 3 points, it is recommended to only draw the 2 closest side faces, as well as the top and bottom face.

    A few examples:

    Drawing cuboid from rectangle

    Choose a drawing method “from rectangle” and click Shape to enter the drawing mode. When you draw using the rectangle method, you must select the frontal plane of the object using the bounding box. The depth and perspective of the resulting cuboid can be edited.

    Example:

    5.2.13.2 - Editing the cuboid

    The cuboid can be edited in multiple ways: by dragging points, by dragging certain faces or by dragging planes. First notice that there is a face that is painted with gray lines only, let us call it the front face.

    You can move the cuboid by simply dragging the shape behind the front face. The cuboid can be extended by dragging on the point in the middle of the edges. The cuboid can also be extended up and down by dragging the point at the vertices.

    To draw with perspective effects it should be assumed that the front face is the closest to the camera. To begin simply drag the points on the vertices that are not on the gray/front face while holding Shift. The cuboid can then be edited as usual.

    If you wish to reset perspective effects, you may right click on the cuboid, and select Reset perspective to return to a regular cuboid.

    The location of the gray face can be swapped with the adjacent visible side face. You can do it by right clicking on the cuboid and selecting Switch perspective orientation. Note that this will also reset the perspective effects.

    Certain faces of the cuboid can also be edited, these faces are: the left, right and dorsal faces, relative to the gray face. Simply drag the faces to move them independently from the rest of the cuboid.

    You can also use cuboids in track mode, similar to rectangles in track mode (basics and advanced) or Track mode with polygons.

    5.2.14 - Annotation with skeletons

    Guide to creating and editing skeletons.

    Skeletons should be used as annotation templates when you need to annotate complex objects sharing the same structure (e.g. human pose estimation, facial landmarks, etc.). A skeleton consists of any number of points (also called elements), which may or may not be joined by edges. Each point is treated as an individual object with its own attributes and properties (like color, occluded, outside, etc.). At the same time, a skeleton point can exist only within its parent skeleton.

    Any skeleton elements can be hidden (by marking them outside) if necessary (for example if a part is out of a frame). Currently there are two formats which support exporting skeletons: CVAT & COCO.
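
The structure described above (points with their own properties, optional edges, elements hidden by marking them outside) can be modeled with a small data structure. The class and field names below are hypothetical and chosen for illustration; they do not mirror CVAT's internal model or its export formats.

```python
from dataclasses import dataclass, field


@dataclass
class SkeletonPoint:
    """One skeleton element with its own properties."""
    name: str
    x: float
    y: float
    occluded: bool = False
    outside: bool = False  # hidden element, e.g. a part that is out of the frame


@dataclass
class Skeleton:
    label: str
    points: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # pairs of point names

    def add_edge(self, a, b):
        """Join two points of this skeleton; points cannot exist outside it."""
        names = {p.name for p in self.points}
        if a not in names or b not in names:
            raise ValueError("an edge can only join points of this skeleton")
        self.edges.append((a, b))

    def visible_points(self):
        """Points that are not marked as outside."""
        return [p for p in self.points if not p.outside]


star = Skeleton("star", [
    SkeletonPoint("top", 5, 0),
    SkeletonPoint("left", 0, 4),
    SkeletonPoint("right", 10, 4, outside=True),
])
star.add_edge("top", "left")
print([p.name for p in star.visible_points()])  # ['top', 'left']
```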

    5.2.14.1 - Creating the skeleton

    Initial skeleton setup

    Unlike other CVAT objects, skeletons require a setup step before you can start annotating. You can set up a skeleton in the label configurator when creating a task/project, or later in created instances.

    Start by clicking the Setup skeleton option:

    Below the regular label form, where you add a name and set up attributes if necessary, you will see a drawing area with several buttons beside it:

    • PUT AN IMAGE AS A BACKGROUND - a helpful feature that makes it easier to draw a skeleton template over an example of the object you will annotate.
    • PUT NEW SKELETON POINTS - activated by default. In this mode you can add new skeleton points by clicking the drawing area.
    • DRAW AN EDGE BETWEEN TWO POINTS - in this mode you can add an edge by clicking any two points that are not joined yet.
    • REMOVE A DRAWN SKELETON POINTS - in this mode clicking a point removes the point and all attached edges. You can also remove only an edge; it is highlighted in red on hover.
    • DOWNLOAD DRAWN TEMPLATE AS AN .SVG - you can download the setup configuration to use it in the future.
    • UPLOAD A TEMPLATE FROM AN .SVG FILE - you can upload a previously downloaded configuration.

    Let’s draw an example skeleton: a star. After the skeleton is drawn, you can set up each of its points. Hover over the point, right-click, and click Configure:

    Here you can set up the point name, its color, and attributes if necessary, as for a regular CVAT label:

    Press the Done button to finish editing the point. Press the Continue button to save the skeleton. Continue creating a task/project in the regular way.

    For an existing task/project, you are currently not allowed to change a skeleton configuration. You can copy/insert a skeleton configuration using the Raw tab of the label configurator.

    Drawing a skeleton from rectangle

    In an opened job, go to the left sidebar and hover over the Draw new skeleton control:

    If the control is absent, make sure you have set up at least one skeleton in the corresponding task/project. In the pop-up dropdown, you can select between a skeleton Shape and a skeleton Track, depending on your task. Draw a skeleton as a regular bounding box by clicking two points on the canvas:

    Well done, you’ve just created the first skeleton.

    5.2.14.2 - Editing the skeleton

    Editing skeletons on the canvas

    A drawn skeleton is wrapped in a bounding box for the user’s convenience. Using this wrapper, the user can edit the skeleton as a regular bounding box by dragging, resizing, or rotating:

    Moreover, each skeleton point can be dragged individually. After dragging, the wrapping bounding box is adjusted automatically; other points are not affected:

    You can use Shortcuts on both a skeleton itself and its elements.

    • Hover the mouse cursor over the bounding box to apply a shortcut on the whole skeleton (like lock, occluded, pinned, keyframe and outside for skeleton tracks)
    • Hover the mouse cursor over one of skeleton points to apply a shortcut to this point (the same shortcuts list, but outside is available also for a skeleton shape elements)

    Editing skeletons on the sidebar

    Using the sidebar is another way to set up skeleton properties and attributes. It works similarly to other kinds of objects supported by CVAT, with some differences:

    • A user is not allowed to switch a skeleton label
    • The Outside property is always available for skeleton elements (whether or not they are tracks)
    • An additional collapsible section lets the user see the list of skeleton parts

    5.2.15 - Annotation with brush tool

    Guide to annotating tasks using brush tools.

    With the brush tool, you can create masks for disjoint objects that have multiple parts, such as a house hiding behind trees, a car behind a pedestrian, or a pillar behind a traffic sign. The brush tool has several modes, for example: erase pixels, change brush shapes, and polygon-to-mask mode.

    Use brush tool for Semantic (Panoptic) and Instance Image Segmentation tasks.
    For more information about segmentation masks in CVAT, see Creating masks.

    Brush tool menu

    The brush tool menu appears on the top of the screen after you click Shape:

    BT Menu

    It has the following elements:

    Element | Description
    Tick icon | Save mask saves the created mask. The saved mask will appear on the objects sidebar.
    Save mask and continue | Save mask and continue adds a new mask to the objects sidebar and allows you to draw a new one immediately.
    Brush | Brush adds a new mask or new regions to the previously added mask.
    Eraser | Eraser removes part of the mask.
    Add poly | Polygon selection tool. The selection will become a mask.
    Remove poly | Remove polygon selection subtracts part of the polygon selection.
    Brush size | Brush size in pixels. Note: Visible only when Brush or Eraser is selected.
    Brush shape | Brush shape with two options: circle and square. Note: Visible only when Brush or Eraser is selected.
    Pixel remove | Remove underlying pixels. When you are drawing or editing a mask with this tool, pixels on other masks that are located at the same positions as the pixels of the current mask are deleted.
    Label | Label that will be assigned to the newly created mask.
    Move | Move. Click and hold to move the menu bar to another place on the screen.

    Annotation with brush

    To annotate with brush, do the following:

    1. From the controls sidebar, select Brush.

    2. In the Draw new mask menu, select a label for your mask, and click Shape.
      The Brush tool will be selected by default.

      BT context menu

    3. With the brush, draw a mask on the object you want to label.
      To erase part of the selection, use Eraser.

      Brushing

    4. After you have applied the mask, click Save mask on the top menu bar
      (or press N on the keyboard) to finish the process.

    5. The added object will appear on the objects sidebar.

    To add the next object, repeat steps 1 to 5. All added objects will be visible on the image and the objects sidebar.

    To save the job with all added objects, click Save on the top menu.

    Annotation with polygon-to-mask

    To annotat with polygon-to-mask, do the following:

    1. From the controls sidebar, select Brush Brush icon.

    2. In the Draw new mask menu, select label for your mask, and click Shape.

      BT context menu

    3. In the brush tool menu, select Polygon Add poly.

    4. With the PolygonAdd poly tool, draw a mask for the object you want to label.
      To correct selection, use Remove polygon selection Remove poly.

    5. Use Save mask Tick icon (or N on the keyboard)
      to switch between add/remove polygon tools:

      Brushing

    6. After you added the polygon selection, on the top menu bar click Save mask Tick icon
      to finish the process (or N on the keyboard).

    7. Click Save mask Tick icon again (or N on the keyboard).
      The added object will appear on the objects sidebar.

    To add the next object, repeat steps 1 to 7.

    All added objects will be visible on the image and the objects sidebar.

    To save the job with all added objects, on the top menu click Save Save.

    Remove underlying pixels

    Use the Remove underlying pixels tool when you want to add a mask and simultaneously delete the pixels of
    other masks that are located at the same positions. This saves you from meticulously drawing the same edge twice where two objects meet.
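
    As a rough illustration of the effect, the sketch below represents each mask as a set of (x, y) pixel coordinates and removes from the existing masks any pixel covered by the new mask. The function name and the set-based representation are assumptions made for illustration; this is not CVAT's implementation.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]


def add_mask_removing_underlying(
    masks: Dict[str, Set[Pixel]], name: str, new_mask: Set[Pixel]
) -> Dict[str, Set[Pixel]]:
    """Return a new mask collection where `new_mask` is stored under `name`
    and its pixels are removed from every other mask (hypothetical helper)."""
    result = {
        other: pixels - new_mask
        for other, pixels in masks.items()
        if other != name
    }
    result[name] = set(new_mask)
    return result


masks = {"road": {(0, 0), (1, 0), (2, 0)}}
updated = add_mask_removing_underlying(masks, "car", {(1, 0), (1, 1)})
print(sorted(updated["road"]))  # [(0, 0), (2, 0)]
```

    The pixel (1, 0) now belongs only to the new mask, so the shared edge between the two objects is drawn once.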

    Remove pixel

    AI Tools

    You can convert AI tool masks to polygons. To do this, use the following AI tool menu:

    Save

    1. Go to the Detectors tab.
    2. Switch toggle Masks to polygons to the right.
    3. Add source and destination labels from the drop-down lists.
    4. Click Annotate.

    Import and export

    For export, see Export dataset.

    Import follows the general import dataset procedure, with the additional option of converting masks to polygons.

    Note: This option is available for formats that work with masks only.

    To use it, when uploading the dataset, switch the Convert masks to polygon toggle to the right:

    Remove pixel

    5.2.16 - Annotation with tags

    Tag annotation is used to annotate frames; tags are not displayed in the workspace. Before you start, open the drop-down list in the top panel and select Tag annotation.

    The objects sidebar will be replaced with a special panel for working with tags. Here you can select a label for a tag and add it by clicking on the Plus button. You can also customize hotkeys for each label.

    If you need only one label per frame, enable the Automatically go to the next frame checkbox. After you add a tag, CVAT will then switch to the next frame automatically.

    Tags will be shown in the top left corner of the canvas. You can show/hide them in the settings.

    5.2.17 - Models

    To deploy the models, you will need to install the necessary components using the Semi-automatic and Automatic Annotation guide. To learn how to deploy a model, read the Serverless tutorial.

    The Models page contains a list of deep learning (DL) models deployed for semi-automatic and automatic annotation. To open the Models page, click the Models button on the navigation bar. The list of models is presented in the form of a table. The parameters indicated for each model are the following:

    • Framework - the framework the model is based on
    • Name - the model name
    • Type - the model type
    • Description - brief description of the model
    • Labels - list of the supported labels (only for the models of the detectors type)

    5.2.18 - CVAT Analytics and quality assessment in Cloud

    Analytics and quality assessment in CVAT Cloud

    5.2.18.1 - Annotation quality & Honeypot

    How to check the quality of annotation in CVAT

    In CVAT, it’s possible to evaluate the quality of annotation through the creation of a Ground truth job, referred to as a Honeypot. To estimate the task quality, CVAT compares all other jobs in the task against the established Ground truth job, and calculates annotation quality based on this comparison.

    Note that quality estimation only supports 2D tasks. It supports all the annotation types except 2D cuboids.

    Note that tracks are considered separate shapes and compared on a per-frame basis with other tracks and shapes.

    Ground truth job

    A Ground truth job is a way to tell CVAT where to store and get the “correct” annotations for task quality estimation.

    To estimate task quality, you need to create a Ground truth job in the task, and annotate it. You don’t need to annotate the whole dataset twice: the annotation quality of a small part of the data indicates the quality of annotation for the whole dataset.

    For the quality assurance to function correctly, the Ground truth job must have a small portion of the task frames and the frames must be chosen randomly. Depending on the dataset size and task complexity, 5-15% of the data is typically good enough for quality estimation, while keeping extra annotation overhead acceptable.

    For example, in a typical task with 2000 frames, selecting just 5%, which is 100 extra frames to annotate, is enough to estimate the annotation quality. If the task contains only 30 frames, it’s advisable to select 8-10 frames, which is about 30%.

    This is more than 15%, but smaller datasets need proportionally more samples to estimate quality reliably.
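
    The arithmetic above can be sketched in a few lines of Python. The function below is a hypothetical helper, not part of CVAT: it converts a percentage into a frame count and draws a random sample of frame indices, using a fixed seed to make the sample reproducible. CVAT's own sampling algorithm may differ, so the same seed will not necessarily select the same frames as the server.

```python
import math
import random
from typing import List, Optional


def select_ground_truth_frames(
    total_frames: int, quantity_percent: float, seed: Optional[int] = None
) -> List[int]:
    """Pick a random sample of frame indices (illustrative helper)."""
    # Round up so that small tasks still get at least one frame.
    count = max(1, math.ceil(total_frames * quantity_percent / 100))
    rng = random.Random(seed)  # same seed -> same selection
    return sorted(rng.sample(range(total_frames), count))


frames = select_ground_truth_frames(2000, 5, seed=42)
print(len(frames))  # 100
```

    For the 30-frame example, `select_ground_truth_frames(30, 30, seed=42)` returns 9 frame indices.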

    To create a Ground truth job, do the following:

    1. Create a task, and open the task page.

    2. Click +.

      Create job

    3. In the Add new job window, fill in the following fields:

      Add new job

      • Job type: Use the default parameter Ground truth.
      • Frame selection method: Use the default parameter Random.
      • Quantity %: Set the desired percentage of frames for the Ground truth job.
        Note that when you use Quantity %, the Frames field will be autofilled.
      • Frame count: Set the desired number of frames for the Ground truth job.
        Note that when you use Frames, the Quantity % field will be autofilled.
      • Seed: (Optional) If you need to make the random selection reproducible, specify this number. It can be any integer; the same value will yield the same random selection (provided that the number of frames is unchanged).
        Note that if you want to use a custom frame sequence, you can do this using the server API instead, see Jobs API #create.
    4. Click Submit.

    5. Annotate frames, save your work.

    6. Change the status of the job to Completed.

    7. Change Stage to Accepted.

    The Ground truth job will appear in the jobs list.

    Add new job

    Managing Ground Truth jobs: Import, Export, and Deletion

    Annotations from Ground truth jobs are not included in the dataset export. They also cannot be imported during task annotations import or with automatic annotation for the task.

    Import, export, and delete options are available from the job’s menu.

    Add new job

    Import

    If you want to import annotations into the Ground truth job, do the following.

    1. Open the task, and find the Ground truth job in the jobs list.
    2. Click on three dots to open the menu.
    3. From the menu, select Import annotations.
    4. Select import format, and select file.
    5. Click OK.

    Note that if there are imported annotations for the frames that exist in the task, but are not included in the Ground truth job, they will be ignored. This way, you don’t need to worry about “cleaning up” your Ground truth annotations for the whole dataset before importing them. Importing annotations for the frames that are not known in the task still raises errors.

    Export

    To export annotations from the Ground truth job, do the following.

    1. Open the task, and find the Ground truth job in the jobs list.
    2. Click on three dots to open the menu.
    3. From the menu, select Export annotations.

    Delete

    To delete the Ground truth job, do the following.

    1. Open the task, and find the Ground truth job in the jobs list.
    2. Click on three dots to open the menu.
    3. From the menu, select Delete.

    Assessing data quality with Ground truth jobs

    Once you’ve established the Ground truth job, proceed to annotate the dataset.

    CVAT will begin the quality comparison between the annotated task and the Ground truth job in this task once the Ground truth job is finished (on the acceptance stage and in the completed state).

    Note that the process of quality calculation may take up to several hours, depending on the amount of data and labeled objects, and is not updated immediately after task updates.

    To view results, go to Task > Actions > View analytics > Performance tab.

    Add new job

    Quality data

    The Analytics page has the following fields:

    Field - Description
    Mean annotation quality - Displays the average quality of annotations, which includes: the count of accurate annotations, total task annotations, ground truth annotations, accuracy rate, precision rate, and recall rate.
    GT Conflicts - Conflicts identified during quality assessment, including extra or missing annotations. Mouse over the ? icon for a detailed conflict report on your dataset.
    Issues - Number of opened issues. If no issues were reported, 0 is shown.
    Quality report - Quality report in JSON format.
    Ground truth job data - Information about the Ground truth job, including date, time, and number of issues.
    List of jobs - List of all the jobs in the task.
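
    To make the relationship between these numbers concrete, here is a minimal sketch that derives the three rates from the three counts. The formulas follow a common convention (precision = accurate / task annotations, recall = accurate / ground truth annotations, accuracy = accurate / all distinct annotations); they are an assumption for illustration and are not confirmed to be exactly what CVAT computes.

```python
from dataclasses import dataclass


@dataclass
class QualitySummary:
    accuracy: float
    precision: float
    recall: float


def summarize_quality(valid: int, task_total: int, gt_total: int) -> QualitySummary:
    """Derive rates from counts (assumed formulas, see the text above)."""
    # Annotations present in either set, counting matched pairs once.
    union = task_total + gt_total - valid
    return QualitySummary(
        accuracy=valid / union if union else 0.0,
        precision=valid / task_total if task_total else 0.0,
        recall=valid / gt_total if gt_total else 0.0,
    )


summary = summarize_quality(valid=80, task_total=100, gt_total=90)
print(f"{summary.precision:.2f} {summary.recall:.2f} {summary.accuracy:.2f}")
# 0.80 0.89 0.73
```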

    Annotation quality settings

    If you need to tweak some aspects of comparisons, you can do this from the Annotation Quality Settings menu.

    You can configure what overlap should be considered low or how annotations must be compared.

    The updated settings will take effect on the next quality update.

    To open Annotation Quality Settings, find Quality report and on the right side of it, click on three dots.

    The following window will open. Hover over the ? marks to understand what each field represents.

    Add new job

    Annotation quality settings have the following parameters:

    Field - Description
    Min overlap threshold - Min overlap threshold (IoU) is used for the distinction between matched and unmatched shapes.
    Low overlap threshold - Low overlap threshold is used for the distinction between strong and weak (low overlap) matches.
    OKS Sigma - IoU threshold for points. The percent of the box area, used as the radius of the circle around the GT point, where the checked point is expected to be.
    Relative thickness (frame side %) - Thickness of polylines, relative to the (image area) ^ 0.5. The distance to the boundary around the GT line inside of which the checked line points should be.
    Check orientation - Indicates that polylines have direction.
    Min similarity gain (%) - The minimal gain in the GT IoU between the given and reversed line directions to consider the line inverted. Only useful with the Check orientation parameter.
    Compare groups - Enables or disables annotation group checks.
    Min group match threshold - Minimal IoU for groups to be considered matching, used when Compare groups is enabled.
    Check object visibility - Check for partially-covered annotations. Masks and polygons will be compared to each other.
    Min visibility threshold - Minimal visible area percent of the spatial annotations (polygons, masks). For reporting covered annotations, useful with the Check object visibility option.
    Match only visible parts - Use only the visible part of the masks and polygons in comparisons.
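
    The role of the two overlap thresholds can be sketched for the simplest case, axis-aligned bounding boxes. The code below computes IoU (intersection over union) and classifies a pair of boxes as unmatched, a weak match, or a strong match. The function names and the default threshold values (0.4 and 0.7) are assumptions chosen for the example, not CVAT's internals or defaults.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def classify_match(
    gt: Box, checked: Box, min_overlap: float = 0.4, low_overlap: float = 0.7
) -> str:
    """Apply the two thresholds (example values) to one pair of boxes."""
    value = iou(gt, checked)
    if value < min_overlap:
        return "unmatched"
    if value < low_overlap:
        return "weak match"
    return "strong match"


print(classify_match((0, 0, 10, 10), (2, 0, 12, 10)))  # weak match
```

    In this example the boxes overlap with IoU 80 / 120, roughly 0.67, which is above the minimum overlap but below the low overlap threshold.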

    GT conflicts in the CVAT interface

    To see GT Conflicts in the CVAT interface, go to Review > Issues > Show ground truth annotations and conflicts.

    GT conflict

    The ground truth (GT) annotation is depicted as a dotted-line box with an associated label.

    When you hover over an issue on the right-side panel, the corresponding GT annotation is highlighted.

    Use arrows in the Issue toolbar to move between GT conflicts.

    To create an issue related to the conflict, right-click on the bounding box and from the menu select the type of issue you want to create.

    GT conflict

    Annotation quality & Honeypot video tutorial

    This video demonstrates the process:

    5.2.18.2 - CVAT Performance & Monitoring

    How to monitor team activity and performance in CVAT

    In CVAT Cloud, you can track a variety of metrics reflecting the team’s productivity and the pace of annotation with the Performance feature.

    Performance dashboard

    To open the Performance dashboard, do the following:

    1. In the top menu, click Projects, Tasks, or Jobs.
    2. Select an item from the list, and click on three dots (Open menu).
    3. From the menu, select View analytics > Performance tab.

    Open menu

    The following dashboard will open:

    Open menu

    The Performance dashboard has the following elements:

    Element - Description
    Analytics for - Object/Task/Job number.
    Created - Time when the dashboard was last updated.
    Objects - Graph showing the number of annotated, updated, and deleted objects by day.
    Annotation speed (objects per hour) - Number of objects annotated per hour.
    Time - A drop-down list with various periods for the graph. Currently affects only the histogram data.
    Annotation time (hours) - Shows how long the Project/Task/Job has been in the In progress state.
    Total objects count - Shows the total objects count in the task. Interpolated objects are counted.
    Total annotation speed (objects per hour) - Shows the annotation speed in the Project/Task/Job. Interpolated objects are counted.

    You can rearrange elements of the dashboard by dragging and dropping each of them.

    Performance video tutorial

    This video demonstrates the process:

    5.2.19 - OpenCV and AI Tools

    Overview of semi-automatic and automatic annotation tools available in CVAT.

    Label and annotate your data in semi-automatic and automatic mode with the help of AI and OpenCV tools.

    While interpolation is good for annotating videos made by security cameras, AI and OpenCV tools are good both for videos where the camera is stable and for videos where it moves together with the object or where the object's movements are chaotic.

    Interactors

    Interactors are a part of AI and OpenCV tools.

    Use interactors to label objects in images by creating a polygon semi-automatically.

    When creating a polygon, you can use positive points or negative points (for some models):

    • Positive points define the area in which the object is located.
    • Negative points define the area in which the object is not located.

    AI tools: annotate with interactors

    To annotate with interactors, do the following:

    1. Click Magic wand, and go to the Interactors tab.
    2. From the Label drop-down, select a label for the polygon.
    3. From the Interactor drop-down, select a model (see Interactors models).
      Click the Question mark to see information about each model:
    4. (Optional) If the model returns masks, and you need to convert masks to polygons, use the Convert masks to polygons toggle.
    5. Click Interact.
    6. Use the left click to add positive points and the right click to add negative points.
      The number of points you can add depends on the model.
    7. On the top menu, click Done (or Shift+N, N).

    AI tools: add extra points

    Note: More points improve outline accuracy, but make shape editing harder. Fewer points make shape editing easier, but reduce outline accuracy.

    Each model has a minimum required number of points for annotation. Once the required number of points is reached, the request is automatically sent to the server. The server processes the request and adds a polygon to the frame.

    For a more accurate outline, postpone the request until you have finished adding extra points:

    1. Hold down the Ctrl key.
      On the top panel, the Block button will turn blue.
    2. Add points to the image.
    3. Release the Ctrl key, when ready.

    If you used Mask to polygon, you can edit the finished object like a polygon.

    You can change the number of points in the polygon with the slider:

    AI tools: delete points


    To delete a point, do the following:

    1. With the cursor, hover over the point you want to delete.
    2. If the point can be deleted, it will enlarge and the cursor will turn into a cross.
    3. Left-click on the point.

    OpenCV: intelligent scissors

    To use Intelligent scissors, do the following:

    1. On the menu toolbar, click OpenCV and wait for the library to load.


    2. Go to the Drawing tab, select the label, and click on the Intelligent scissors button.

    3. Add the first point on the boundary of the object you want to outline.
      You will see a line repeating the outline of the object.

    4. Add the second point, so that the previous point is within the restrictive threshold.
      After that a line repeating the object boundary will be automatically created between the points.

    5. To finish placing points, on the top menu click Done (or N on the keyboard).

    As a result, a polygon will be created.

    You can change the number of points in the polygon with the slider:

    To increase or lower the action threshold, hold Ctrl and scroll the mouse wheel.

    During the drawing process, you can remove the last point by clicking on it with the left mouse button.

    Settings

    Interactors models

    Model: Segment Anything Model (SAM)
    Tool: AI Tools
    Description: The Segment Anything Model (SAM) produces high-quality object masks, and it can be used to generate masks for all objects in an image. It has been trained on a dataset of 11 million images and 1.1 billion masks, and has strong zero-shot performance on a variety of segmentation tasks.
    For more information, see:
    • GitHub: Segment Anything
    • Site: Segment Anything
    • Paper: Segment Anything

    Model: Deep extreme cut (DEXTR)
    Tool: AI Tools
    Description: This is an optimized version of the original model, introduced at the end of 2017. It uses the information about extreme points of an object to get its mask. The mask is then converted to a polygon. For now, this is the fastest interactor on the CPU.
    For more information, see:
    • GitHub: DEXTR-PyTorch
    • Site: DEXTR-PyTorch
    • Paper: DEXTR-PyTorch

    Model: Feature backpropagating refinement scheme (f-BRS)
    Tool: AI Tools
    Description: The model lets you get a mask for an object using positive points (left-click on the foreground) and negative points (right-click on the background, if necessary). It is recommended to run the model on GPU, if possible.
    For more information, see:
    • GitHub: f-BRS
    • Paper: f-BRS

    Model: High Resolution Net (HRNet)
    Tool: AI Tools
    Description: The model lets you get a mask for an object using positive points (left-click on the foreground) and negative points (right-click on the background, if necessary). It is recommended to run the model on GPU, if possible.
    For more information, see:
    • GitHub: HRNet
    • Paper: HRNet

    Model: Inside-Outside-Guidance (IOG)
    Tool: AI Tools
    Description: The model uses a bounding box and inside/outside points to create a mask. First, create a bounding box that wraps the object. Then use positive and negative points to tell the model where the foreground and the background are. Negative points are optional.
    For more information, see:
    • GitHub: IOG
    • Paper: IOG

    Model: Intelligent scissors
    Tool: OpenCV
    Description: Intelligent scissors is a CV method of creating a polygon by placing points with the automatic drawing of a line between them. The distance between the adjacent points is limited by the threshold of action, displayed as a red square that is tied to the cursor.
    For more information, see:
    • Site: Intelligent Scissors Specification

    Detectors

    Detectors are a part of AI tools.

    Use detectors to automatically identify and locate objects in images or videos.

    Labels matching

    Each model is trained on a dataset and supports only the dataset’s labels.

    For example:

    • The DL model has the label car.
    • Your task (or project) has the label vehicle.

    To annotate, you need to match these two labels to give the DL model a hint that, in this case, car = vehicle.

    If you have a label that is not on the list of DL labels, you will not be able to match it.

    For this reason, supported DL models are suitable only for certain labels.
    To check the list of labels for each model, see Detectors models.
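
    Conceptually, label matching is a mapping from model labels to task labels. The sketch below shows the idea on a hypothetical list of detections: detections whose model label has a match are renamed, and the rest are dropped. The dictionary layout and the function name are assumptions for illustration and do not reflect CVAT's data format.

```python
from typing import Dict, List


def remap_detections(detections: List[Dict], label_map: Dict[str, str]) -> List[Dict]:
    """Keep only detections whose model label is mapped, renaming the label."""
    return [
        {**det, "label": label_map[det["label"]]}
        for det in detections
        if det["label"] in label_map
    ]


label_map = {"car": "vehicle"}  # model label -> task label
detections = [
    {"label": "car", "box": (10, 20, 110, 80)},
    {"label": "dog", "box": (5, 5, 40, 40)},  # no matching task label
]
print(remap_detections(detections, label_map))
# [{'label': 'vehicle', 'box': (10, 20, 110, 80)}]
```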

    Annotate with detectors

    To annotate with detectors, do the following:

    1. Click Magic wand, and go to the Detectors tab.

    2. From the Model drop-down, select a model (see Detectors models).

    3. From the left drop-down, select the DL model label; from the right drop-down, select the matching label of your task.

    4. (Optional) If the model returns masks, and you need to convert masks to polygons, use the Convert masks to polygons toggle.

    5. Click Annotate.

    This action will automatically annotate one frame. For automatic annotation of multiple frames, see Automatic annotation.

    Detectors models

    Model: Mask RCNN
    Description: The model generates polygons for each instance of an object in the image.
    For more information, see:
    • GitHub: Mask RCNN
    • Paper: Mask RCNN

    Model: Faster RCNN
    Description: The model generates bounding boxes for each instance of an object in the image. In this model, RPN and Fast R-CNN are combined into a single network.
    For more information, see:
    • GitHub: Faster RCNN
    • Paper: Faster RCNN

    Model: YOLO v3
    Description: YOLO v3 is a family of object detection architectures and models pre-trained on the COCO dataset.
    For more information, see:
    • GitHub: YOLO v3
    • Site: YOLO v3
    • Paper: YOLO v3

    Model: Semantic segmentation for ADAS
    Description: This is a segmentation network to classify each pixel into 20 classes.
    For more information, see:
    • Site: ADAS

    Model: Mask RCNN with Tensorflow
    Description: Mask RCNN version with Tensorflow. The model generates polygons for each instance of an object in the image.
    For more information, see:
    • GitHub: Mask RCNN
    • Paper: Mask RCNN

    Model: Faster RCNN with Tensorflow
    Description: Faster RCNN version with Tensorflow. The model generates bounding boxes for each instance of an object in the image. In this model, RPN and Fast R-CNN are combined into a single network.
    For more information, see:
    • Site: Faster RCNN with Tensorflow
    • Paper: Faster RCNN

    Model: RetinaNet
    Description: PyTorch implementation of RetinaNet object detection.
    For more information, see:
    • Specification: RetinaNet
    • Paper: RetinaNet
    • Documentation: RetinaNet

    Model: Face Detection
    Description: Face detector based on MobileNetV2 as a backbone for indoor and outdoor scenes shot by a front-facing camera.
    For more information, see:
    • Site: Face Detection 0205

    Trackers

    Trackers are part of AI and OpenCV tools.

    Use trackers to identify and label objects in a video or image sequence that are moving or changing over time.

    AI tools: annotate with trackers

    To annotate with trackers, do the following:

    1. Click Magic wand, and go to the Trackers tab.


      Start tracking an object

    2. From the Label drop-down, select the label for the object.

    3. From the Tracker drop-down, select a tracker.

    4. Click Track, and annotate the objects with the bounding box in the first frame.

    5. Go to the top menu and click Next (or F on the keyboard) to move to the next frame.
      All annotated objects will be automatically tracked.

    OpenCV: annotate with trackers

    To annotate with trackers, do the following:

    1. On the menu toolbar, click OpenCV and wait for the library to load.


    2. Go to the Tracker tab, select the label, and click Tracking.


      Start tracking an object

    3. From the Label drop-down, select the label for the object.

    4. From the Tracker drop-down, select a tracker.

    5. Click Track.

    6. To move to the next frame, on the top menu click the Next button (or F on the keyboard).

    All annotated objects will be automatically tracked when you move to the next frame.

    When tracking

    • To enable/disable tracking, use Tracker switcher on the sidebar.

      Tracker switcher

    • Trackable objects have an indication on canvas with a model name.

      Tracker indication

    • You can follow the tracking by the messages appearing at the top.

      Tracker pop-up window

    Trackers models

    Model: TrackerMIL
    Tool: OpenCV
    Description: The TrackerMIL model is not bound to labels and can be used for any object. It is a fast client-side model designed to track simple non-overlapping objects.
    For more information, see:
    • Article: Object Tracking using OpenCV

    Model: SiamMask
    Tool: AI Tools
    Description: Fast online object tracking and segmentation. The trackable object will be tracked automatically if the previous frame was the latest keyframe for the object.
    For more information, see:
    • GitHub: SiamMask
    • Paper: SiamMask

    Model: Transformer Tracking (TransT)
    Tool: AI Tools
    Description: Simple and efficient online tool for object tracking and segmentation. If the previous frame was the latest keyframe for the object, the trackable object will be tracked automatically. This is a modified version of the PyTracking Python framework based on PyTorch.
    For more information, see:
    • GitHub: TransT
    • Paper: TransT

    OpenCV: histogram equalization

    Histogram equalization improves the contrast by stretching the intensity range.

    It increases the global contrast of images when its usable data is represented by close contrast values.

    It is useful in images where the background and foreground are both bright or both dark.
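
    For readers who want to see the idea behind the technique, here is a minimal pure-Python sketch of classic histogram equalization for a small grayscale image stored as rows of integer intensities. It illustrates the general method only; CVAT relies on OpenCV for this feature, and the function name and data layout here are assumptions made for the example.

```python
from typing import List


def equalize_histogram(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Equalize a grayscale image given as rows of integer intensities."""
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1

    # Cumulative distribution function of the intensities.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = cdf[-1]
    if total == 0:
        return [row[:] for row in image]  # no pixels to process
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # All pixels share one intensity: nothing to stretch.
        return [row[:] for row in image]

    # Map each intensity so that the CDF becomes approximately linear.
    lookup = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


low_contrast = [[100, 101], [102, 103]]
print(equalize_histogram(low_contrast))  # [[0, 85], [170, 255]]
```

    The four nearly identical input intensities are spread across the full 0 to 255 range, which is the "stretching" described above.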

    To improve the contrast of the image, do the following:

    1. In the OpenCV menu, go to the Image tab.
    2. Click the Histogram equalization button.

    Histogram equalization will improve contrast on the current and following frames.

    Example of the result:

    To disable Histogram equalization, click on the button again.

    5.2.20 - Automatic annotation

    Automatic annotation of tasks

    Automatic annotation in CVAT is a tool that you can use to automatically pre-annotate your data with pre-trained models.

    CVAT can use models from the following sources:

    The following table describes the available options:

    Self-hosted:
    • Price: Free
    • Models: You have to add models
    • Hugging Face & Roboflow integration: Not supported

    Cloud:
    • Price: See Pricing
    • Models: You can use pre-installed models
    • Hugging Face & Roboflow integration: Supported

    Running Automatic annotation

    To start automatic annotation, do the following:

    1. On the top menu, click Tasks.

    2. Find the task you want to annotate and click Action > Automatic annotation.

    3. In the Automatic annotation dialog, from the drop-down list, select a model.

    4. Match the labels of the model and the task.

    5. (Optional) If you need the model to return masks as polygons, switch on the Return masks as polygons toggle.

    6. (Optional) If you need to remove all previous annotations, switch on the Clean old annotations toggle.

    7. Click Annotate.

    CVAT will show the progress of annotation on the progress bar.

    Progress bar

    You can stop the automatic annotation at any moment by clicking Cancel.

    Labels matching

    Each model is trained on a dataset and supports only the dataset’s labels.

    For example:

    • The DL model has the label car.
    • Your task (or project) has the label vehicle.

    To annotate, you need to match these two labels to give CVAT a hint that, in this case, car = vehicle.

    If you have a label that is not on the list of DL labels, you will not be able to match it.

    For this reason, supported DL models are suitable only for certain labels.

    To check the list of labels for each model, see Models papers and official documentation.

    Models

    Automatic annotation uses pre-installed and added models.

    For self-hosted solutions, you need to install Automatic Annotation first and add models.

    List of pre-installed models:

    Model: Attributed face detection
    Description: Three OpenVINO models work together:
    • Face Detection 0205: face detector based on MobileNetV2 as a backbone with a FCOS head for indoor and outdoor scenes shot by a front-facing camera.
    • Emotions recognition retail 0003: fully convolutional network for recognition of five emotions (‘neutral’, ‘happy’, ‘sad’, ‘surprise’, ‘anger’).
    • Age gender recognition retail 0013: fully convolutional network for simultaneous Age/Gender recognition. The network can recognize the age of people in the [18 - 75] years old range; it is not applicable for children since their faces were not in the training set.

    Model: RetinaNet R101
    Description: RetinaNet is a one-stage object detection model that utilizes a focal loss function to address class imbalance during training. Focal loss applies a modulating term to the cross entropy loss to focus learning on hard negative examples. RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks.
    For more information, see:
    • Site: RetinaNET

    Model: Text detection
    Description: Text detector based on PixelLink architecture with MobileNetV2, depth_multiplier=1.4 as a backbone for indoor/outdoor scenes.
    For more information, see:
    • Site: OpenVINO Text detection 004

    Model: YOLO v3
    Description: YOLO v3 is a family of object detection architectures and models pre-trained on the COCO dataset.
    For more information, see:
    • Site: YOLO v3

    Model: YOLO v7
    Description: YOLOv7 is an advanced object detection model that outperforms other detectors in terms of both speed and accuracy. It can process frames at a rate ranging from 5 to 160 frames per second (FPS) and achieves the highest accuracy with 56.8% average precision (AP) among real-time object detectors running at 30 FPS or higher on the V100 graphics processing unit (GPU).
    For more information, see:
    • GitHub: YOLO v7
    • Paper: YOLO v7

    Adding models from Hugging Face and Roboflow

    In case you did not find the model you need, you can add a model of your choice from Hugging Face or Roboflow.

    Note that you cannot add models from Hugging Face and Roboflow to self-hosted CVAT.

    For more information, see Streamline annotation by integrating Hugging Face and Roboflow models.

    This video demonstrates the process:

    5.2.21 - Specification for annotators

    Learn how to easily create and add a specification for annotators using the Guide feature.

    The Guide feature provides a built-in markdown editor that allows you to create a specification for annotators.

    Once you create and submit the specification, it will be accessible from the annotation interface (see below).

    You can attach the specification to Projects or to Tasks.

    The attachment procedure is the same for individual users and organizations.

    Adding specification to Project

    To add specification to the Projects, do the following:

    1. Go to the Projects page and click on the project to which you want to add specification.
    2. Under the Project description, click Edit.

    Project specification

    3. Add instructions to the Markdown editor, and click Submit.

    Editing rights

    • For individual users: only the project owner and the project assignee can edit the specification.
    • For organizations: the specification can additionally be edited by the organization owner and maintainer.

    Editor rights

    Adding specification to Task

    To add specification to the Task, do the following:

    1. Go to the Tasks page and click on the task to which you want to add specification.

    2. Under the Task description, click Edit.

      Task specification

    3. Add instructions to the Markdown editor, and click Submit.

    Editing rights

    • For individual users: only the task owner and task assignee can edit the specification.
    • For organizations: only the task owner, maintainer, and task assignee can edit the specification.

    Access to specification for annotators

    To open the specification, do the following:

    1. Open the job to see the annotation interface.
    2. In the top right corner, click the Guide button.

    Markdown editor guide

    The markdown editor for Guide has two panes. Add instructions to the left pane, and the editor will immediately show the formatted result on the right.

    You can write in raw markdown or use the toolbar on the top of the editor.

    Element Description
    1 Text formatting: bold, italic, and strikethrough.
    2 Insert a horizontal rule (horizontal line).
    3 Add a title, heading, or subheading. It provides a drop-down list to select the title level (from 1 to 6).
    4 Add a link. Note: if you left-click the link, it opens in the same window.
    5 Add a quote.
    6 Add a single line of code.
    7 Add a block of code.
    8 Add a comment. The comment is only visible to Guide editors and remains invisible to annotators.
    9 Add a picture. To use this option, first, upload the picture to an external resource and then add the link in the editor. Alternatively, you can drag and drop a picture into the editor, which will upload it to the CVAT server and add it to the specification.
    10 Add a list: bullet list, numbered list, and checklist.
    11 Hide the editor pane: options to hide the right pane, show both panes or hide the left pane.
    12 Enable full-screen mode.

    Specification for annotators' video tutorial

    Video tutorial on how to use the Guide feature.

    5.2.22 - Backup Task and Project

    Overview

    In CVAT you can back up tasks and projects. Use a backup to keep a copy of a task or project on your PC or to transfer it to another server.

    Create backup

    To back up a task or project, open the action menu and select Backup Task or Backup Project.

    You can save the backup of a project or task locally on your PC or to an attached cloud storage.

    (Optional) Specify a name for the backup in the Custom name text field. Otherwise, the backup file is named by the mask project_<project_name>_backup_<date>_<time>.zip for projects and task_<task_name>_backup_<date>_<time>.zip for tasks.

    To save a backup to a specific attached cloud storage, additionally turn off the Use default settings switch, select Cloud storage in the Target storage field, and select the storage from the list of attached cloud storages.

    Create backup APIs

    • endpoints:
      • /tasks/{id}/backup
      • /projects/{id}/backup
    • method: GET
    • responses: 202, 201 with zip archive payload

    Upload backup APIs

    • endpoints:
      • /api/tasks/backup
      • /api/projects/backup
    • method: POST
    • Content-Type: multipart/form-data
    • responses: 202, 201 with json payload

    Create from backup

    To create a task or project from a backup, go to the tasks or projects page, click the Create from backup button and select the archive you need.

    As a result, you’ll get a task or project containing the data, parameters, and annotations of the previously backed-up task or project.

    Backup file structure

    A backup is a zip archive that contains the data, the task (or project and task) specification, and the annotations. A task backup has the first structure below, and a project backup has the second:

        .
        ├── data
        │   └── {user uploaded data}
        ├── task.json
        └── annotations.json
      
        .
        ├── task_{id}
        │   ├── data
        │   │   └── {user uploaded data}
        │   ├── task.json
        │   └── annotations.json
        └── project.json
      

    5.2.23 - Frame deleting

    This section explains how to delete and restore a frame from a task.

    Delete frame

    You can delete the current frame from a task. A deleted frame is shown neither in the UI nor in the exported annotations, so you can use this feature to mark corrupted frames that should not be annotated.

    1. Go to the Job annotation view and click on the Delete frame button (Alt+Del).

      Note: When you delete with the shortcut, the frame will be deleted immediately without additional confirmation.

    2. After that, you will be asked to confirm the frame deletion.

      Note: all annotations on that frame will be deleted, unsaved annotations will be saved, and the frame will be hidden in the annotation view (until you make it visible in the settings). If the task has overlapping jobs and the deleted frame falls within the overlap, the frame also becomes unavailable in the other job.

    3. When you delete a frame in a job with tracks, you may need to adjust some tracks manually. Common adjustments are:

      • Add keyframes at the edges of the deleted interval for the interpolation to look correct;
      • Move the start or end keyframe to the correct side of the deleted interval.

    Configure deleted frames visibility and navigation

    If you need to show deleted frames, you can enable this in the settings.

    1. Go to the settings and choose Player settings.

    2. Select the Show deleted frames checkbox and close the settings dialog.

    3. You will then be able to navigate through deleted frames, but annotation tools will be unavailable on them. Deleted frames are marked with a corresponding overlay.

    4. There are a few ways to navigate to deleted frames without enabling this option:

      • Go to the frame via direct navigation methods: navigation slider or frame input field,
      • Go to the frame via the direct link.
    5. Navigation with step will not count deleted frames.

    Restore deleted frame

    You can also restore deleted frames in the task.

    1. Turn on deleted frames visibility, as described in the previous section, and go to the deleted frame you want to restore.

    2. Click on the Restore icon. The frame will be restored immediately.

    5.2.24 - Export/import datasets and upload annotation

    This section explains how to download and upload datasets (including annotation, images, and metadata) of projects, tasks, and jobs.

    Export dataset

    You can export a dataset from a project, task, or job.

    1. To download the latest annotations, you have to save all changes first. Click the Save button. There is a Ctrl+S shortcut to save annotations quickly.

    2. After that, click the Menu button. Exporting and importing of task and project datasets takes place through the Action menu.

    3. Press the Export task dataset button.

    4. Choose the format for exporting the dataset. The formats available for export and import are listed in the Formats section.

    5. To download images with the dataset, enable the Save images option.

    6. (Optional) To name the resulting archive, use the Custom name field.

    7. You can choose where to export the dataset by selecting a target storage: Local or Cloud storage. The default settings are those selected when the project was created (for example, if you specified local storage when you created the project, then by default you will be prompted to export the dataset to your PC). You can find out the default value by hovering the mouse over the ? icon. Learn more about attaching cloud storage.

    Import dataset

    You can import a dataset only into a project. In this case, the data will be split into subsets. To import a dataset, do the following on the Project page:

    • Open the Actions menu.
    • Press the Import dataset button.
    • Select the dataset format (if you did not specify a custom name during export, the format will be in the archive name).
    • Drag the file to the file upload area or click on the upload area to select the file through the explorer.

    • You can also import a dataset from an attached cloud storage. To do so, select the annotation format, then either select a cloud storage from the list or use the default settings if you have already specified the required cloud storage for the task or project, and enter the name of the zip archive in the File name text field.

    During the import process, you will be able to track the progress of the import.

    Upload annotations

    You can upload annotations to a task or a job. To do so, select Upload annotation in the Actions menu of the task or in the job Menu on the top panel, select the format of the annotations, and then select the annotation file or archive in the file explorer.

    You can also use an attached cloud storage to upload the annotation file.

    5.2.25 - Formats

    List of annotation formats supported by CVAT.

    CVAT supports the following formats:

    5.2.25.1 -

    CVAT

    This is the native CVAT annotation format. It supports all CVAT annotation features, so it can be used to make data backups.

    • supported annotations for CVAT for Images: Rectangles, Polygons, Polylines, Points, Cuboids, Skeletons, Tags, Tracks

    • supported annotations for CVAT for Videos: Rectangles, Polygons, Polylines, Points, Cuboids, Skeletons, Tracks

    • attributes are supported

    CVAT for images export

    Downloaded file: a ZIP file of the following structure:

    taskname.zip/
    ├── images/
    |   ├── img1.png
    |   └── img2.jpg
    └── annotations.xml
    
    • tracks are split by frames

    CVAT for videos export

    Downloaded file: a ZIP file of the following structure:

    taskname.zip/
    ├── images/
    |   ├── frame_000000.png
    |   └── frame_000001.png
    └── annotations.xml
    
    • shapes are exported as single-frame tracks

    CVAT loader

    Uploaded file: an XML file or a ZIP file of the structures above

    5.2.25.2 -

    Datumaro format

    Datumaro is a tool that helps with complex dataset and annotation transformations, format conversions, dataset statistics, merging, custom formats, and so on. It is used as the provider of dataset support in CVAT, so everything that is possible in CVAT is also possible in Datumaro, and Datumaro additionally offers dataset operations.

    • supported annotations: any 2D shapes, labels
    • supported attributes: any

    Import annotations in Datumaro format

    Uploaded file: a zip archive of the following structure:

    <archive_name>.zip/
    └── annotations/
        ├── subset1.json # full description of classes and all dataset items
        └── subset2.json # full description of classes and all dataset items
    

    JSON annotation files in the annotations directory should have a similar structure:

    {
      "info": {},
      "categories": {
        "label": {
          "labels": [
            {
              "name": "label_0",
              "parent": "",
              "attributes": []
            },
            {
              "name": "label_1",
              "parent": "",
              "attributes": []
            }
          ],
          "attributes": []
        }
      },
      "items": [
        {
          "id": "img1",
          "annotations": [
            {
              "id": 0,
              "type": "polygon",
              "attributes": {},
              "group": 0,
              "label_id": 1,
              "points": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
              "z_order": 0
            },
            {
              "id": 1,
              "type": "bbox",
              "attributes": {},
              "group": 1,
              "label_id": 0,
              "z_order": 0,
              "bbox": [1.0, 2.0, 3.0, 4.0]
            },
            {
              "id": 2,
              "type": "mask",
              "attributes": {},
              "group": 1,
              "label_id": 0,
              "rle": {
                "counts": "d0d0:F\\0",
                "size": [10, 10]
              },
              "z_order": 0
            }
          ]
        }
      ]
    }
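
    To make the structure above more concrete, here is a minimal Python sketch that counts annotations per label and annotation type in such a file. The function name summarize_datumaro and the shortened SAMPLE document are illustrative only, and the sketch assumes that label_id is the zero-based position of the label in the categories.label.labels list.

```python
import json
from collections import Counter


def summarize_datumaro(doc: dict) -> dict:
    """Count annotations per (label name, annotation type).

    `doc` is a parsed Datumaro JSON document with the structure shown
    above. Assumption: label_id indexes into categories.label.labels.
    """
    labels = [entry["name"] for entry in doc["categories"]["label"]["labels"]]
    counts = Counter()
    for item in doc["items"]:
        for ann in item["annotations"]:
            counts[(labels[ann["label_id"]], ann["type"])] += 1
    return dict(counts)


# Shortened version of the example document above (illustrative data only).
SAMPLE = json.loads("""
{
  "info": {},
  "categories": {"label": {"labels": [
      {"name": "label_0", "parent": "", "attributes": []},
      {"name": "label_1", "parent": "", "attributes": []}
  ], "attributes": []}},
  "items": [{"id": "img1", "annotations": [
      {"id": 0, "type": "polygon", "label_id": 1,
       "points": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]},
      {"id": 1, "type": "bbox", "label_id": 0, "bbox": [1.0, 2.0, 3.0, 4.0]}
  ]}]
}
""")

print(summarize_datumaro(SAMPLE))
```

    For a real archive, load annotations/<subset>.json with json.load and pass the result to the function.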
    

    Export annotations in Datumaro format

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── annotations/
    │   └── default.json # full description of classes and all dataset items
    └── images/ # if the option `save images` was selected
        └── default
            ├── image1.jpg
            ├── image2.jpg
            ├── ...
    

    5.2.25.3 -

    LabelMe

    LabelMe export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── img1.jpg
    └── img1.xml
    
    • supported annotations: Rectangles, Polygons (with attributes)

    LabelMe import

    Uploaded file: a zip archive of the following structure:

    taskname.zip/
    ├── Masks/
    |   ├── img1_mask1.png
    |   └── img1_mask2.png
    ├── img1.xml
    ├── img2.xml
    └── img3.xml
    
    • supported annotations: Rectangles, Polygons, Masks (as polygons)

    5.2.25.4 -

    MOT sequence

    MOT export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── img1/
    |   ├── image1.jpg
    |   └── image2.jpg
    └── gt/
        ├── labels.txt
        └── gt.txt
    
    # labels.txt
    cat
    dog
    person
    ...
    
    # gt.txt
    # frame_id, track_id, x, y, w, h, "not ignored", class_id, visibility, <skipped>
    1,1,1363,569,103,241,1,1,0.86014
    ...
    
    
    • supported annotations: Rectangle shapes and tracks
    • supported attributes: visibility (number), ignored (checkbox)
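
    The gt.txt column order from the listing above can be read with a few lines of Python. This is only a sketch: MotBox and parse_gt_line are hypothetical names, and the column meaning is taken from the comment line in the listing, not from the CVAT source code.

```python
from dataclasses import dataclass


@dataclass
class MotBox:
    frame_id: int
    track_id: int
    x: float
    y: float
    w: float
    h: float
    not_ignored: bool
    class_id: int
    visibility: float


def parse_gt_line(line: str) -> MotBox:
    """Parse one gt.txt row using the column order from the listing above.

    Any columns after `visibility` are ignored.
    """
    f = [part.strip() for part in line.split(",")]
    return MotBox(
        frame_id=int(f[0]),
        track_id=int(f[1]),
        x=float(f[2]),
        y=float(f[3]),
        w=float(f[4]),
        h=float(f[5]),
        not_ignored=f[6] == "1",
        class_id=int(f[7]),
        visibility=float(f[8]),
    )


box = parse_gt_line("1,1,1363,569,103,241,1,1,0.86014")
print(box)
```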

    MOT import

    Uploaded file: a zip archive of the structure above or:

    taskname.zip/
    ├── labels.txt # optional, mandatory for non-official labels
    └── gt.txt
    
    • supported annotations: Rectangle tracks

    5.2.25.5 -

    MOTS PNG

    MOTS PNG export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    └── <any_subset_name>/
        ├── images/
        │   ├── image1.jpg
        │   └── image2.jpg
        └── instances/
            ├── labels.txt
            ├── image1.png
            └── image2.png
    
    # labels.txt
    cat
    dog
    person
    ...
    
    • supported annotations: Rectangle and Polygon tracks

    MOTS PNG import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Polygon tracks

    5.2.25.6 -

    MS COCO Object Detection

    COCO export

    Downloaded file: a zip archive with the structure described here

    archive.zip/
    ├── images/
    │   ├── train/
    │   │   ├── <image_name1.ext>
    │   │   ├── <image_name2.ext>
    │   │   └── ...
    │   └── val/
    │       ├── <image_name1.ext>
    │       ├── <image_name2.ext>
    │       └── ...
    └── annotations/
       ├── <task>_<subset_name>.json
       └── ...
    

    If the dataset is exported from a Project, the subsets are named the same way as they are named in the project. In other cases there will be a single default subset, containing all the data. The <task> part corresponds to one of the COCO tasks: instances, person_keypoints, panoptic, image_info, labels, captions, stuff. There can be several annotation files in the archive.

    • supported annotations: Polygons, Rectangles
    • supported attributes:
      • is_crowd (checkbox or integer with values 0 and 1) - specifies that the instance (an object group) should have an RLE-encoded mask in the segmentation field. All the grouped shapes are merged into a single mask, the largest one defines all the object properties
      • score (number) - the annotation score field
      • arbitrary attributes - will be stored in the attributes annotation section

    Support for COCO tasks via Datumaro is described here. For example, to get support for COCO keypoints through Datumaro:

    1. Install Datumaro: pip install datumaro
    2. Export the task in the Datumaro format and unzip the archive.
    3. Export the Datumaro project in the coco / coco_person_keypoints formats: datum export -f coco -p path/to/project [-- --save-images]

    This way, one can export CVAT points as single keypoints or keypoint lists (without the visibility COCO flag).

    COCO import

    Uploaded file: a single unpacked *.json or a zip archive with the structure described above or here (without images).

    • supported annotations: Polygons, Rectangles (if the segmentation field is empty)
    • supported tasks: instances, person_keypoints (only segmentations will be imported), panoptic

    MS COCO Keypoint Detection

    COCO export

    Downloaded file: a zip archive with the structure described here

    • supported annotations: Skeletons
    • supported attributes:
      • is_crowd (checkbox or integer with values 0 and 1) - specifies that the instance (an object group) should have an RLE-encoded mask in the segmentation field. All the grouped shapes are merged into a single mask, the largest one defines all the object properties
      • score (number) - the annotation score field
      • arbitrary attributes - will be stored in the attributes annotation section

    COCO import

    Uploaded file: a single unpacked *.json or a zip archive with the structure described here (without images).

    • supported annotations: Skeletons

    How to create a task from MS COCO dataset

    1. Download the MS COCO dataset.

      For example, the val images and instances annotations.

    2. Create a CVAT task with the following labels:

      person bicycle car motorcycle airplane bus train truck boat "traffic light" "fire hydrant" "stop sign" "parking meter" bench bird cat dog horse sheep cow elephant bear zebra giraffe backpack umbrella handbag tie suitcase frisbee skis snowboard "sports ball" kite "baseball bat" "baseball glove" skateboard surfboard "tennis racket" bottle "wine glass" cup fork knife spoon bowl banana apple sandwich orange broccoli carrot "hot dog" pizza donut cake chair couch "potted plant" bed "dining table" toilet tv laptop mouse remote keyboard "cell phone" microwave oven toaster sink refrigerator book clock vase scissors "teddy bear" "hair drier" toothbrush
      
    3. Select val2017.zip as data (see the Creating an annotation task guide for details).

    4. Unpack annotations_trainval2017.zip.

    5. Click the Upload annotation button, choose COCO 1.1, and select the instances_val2017.json annotation file. It can take some time.
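
    Instead of typing the label list from step 2 by hand, it can be derived from the categories section of the COCO annotation file. The sketch below is a minimal example under that assumption; cvat_label_line is a hypothetical helper name, and the sample data is a three-entry excerpt made up for the demonstration.

```python
import json


def cvat_label_line(coco: dict) -> str:
    """Build a space-separated label list from COCO `categories`.

    Names that contain spaces are wrapped in double quotes, as in the
    label list shown in step 2. Categories are ordered by their id.
    """
    ordered = sorted(coco["categories"], key=lambda c: c["id"])
    names = [c["name"] for c in ordered]
    return " ".join(f'"{n}"' if " " in n else n for n in names)


# Tiny excerpt for the demo; a real file has many more categories.
sample = json.loads("""
{"categories": [
  {"id": 10, "name": "traffic light", "supercategory": "outdoor"},
  {"id": 1, "name": "person", "supercategory": "person"},
  {"id": 2, "name": "bicycle", "supercategory": "vehicle"}
]}
""")

print(cvat_label_line(sample))
```

    With the real dataset, load instances_val2017.json with json.load and pass the parsed document to the function.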

    5.2.25.7 -

    Pascal VOC

    • Format specification

    • Dataset examples

    • supported annotations:

      • Rectangles (detection and layout tasks)
      • Tags (action and classification tasks)
      • Polygons (segmentation task)
    • supported attributes:

      • occluded (both UI option and a separate attribute)
      • truncated and difficult (should be defined for labels as checkboxes)
      • action attributes (import only, should be defined as checkboxes)
      • arbitrary attributes (in the attributes section of XML files)

    Pascal VOC export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── JPEGImages/
    │   ├── <image_name1>.jpg
    │   ├── <image_name2>.jpg
    │   └── <image_nameN>.jpg
    ├── Annotations/
    │   ├── <image_name1>.xml
    │   ├── <image_name2>.xml
    │   └── <image_nameN>.xml
    ├── ImageSets/
    │   └── Main/
    │       └── default.txt
    └── labelmap.txt
    
    # labelmap.txt
    # label : color_rgb : 'body' parts : actions
    background:::
    aeroplane:::
    bicycle:::
    bird:::
    

    Pascal VOC import

    Uploaded file: a zip archive of the structure declared above or the following:

    taskname.zip/
    ├── <image_name1>.xml
    ├── <image_name2>.xml
    └── <image_nameN>.xml
    

    CVAT must be able to match the frame name with the file name from the annotation .xml file (the filename tag, e.g. <filename>2008_004457.jpg</filename>).

    There are 2 options:

    1. Full match between the frame name and the file name from the annotation .xml file (for tasks created from images or an image archive).

    2. Match by frame number. The file name should be <number>.jpg or frame_000000.jpg. Use this option when the task was created from a video.
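
    The two matching options can be pictured with the following Python sketch. It is an illustration of the rules described above, not CVAT's actual implementation: match_frame and the regular expression are hypothetical, and the sketch assumes that the number in the file name is the zero-based frame index.

```python
import re
from pathlib import PurePosixPath

# Accepts the two file name patterns mentioned above:
# <number>.jpg and frame_000000.jpg (the extension is stripped first).
_FRAME_NUMBER = re.compile(r"^(?:frame_)?(\d+)$")


def match_frame(filename, frame_names):
    """Return the frame index for an annotation <filename>, or None.

    `frame_names` is the ordered list of frame names in the task.
    """
    # Option 1: full match between the frame name and the file name.
    if filename in frame_names:
        return frame_names.index(filename)
    # Option 2: match by frame number (assumed to be a zero-based index).
    m = _FRAME_NUMBER.match(PurePosixPath(filename).stem)
    if m:
        number = int(m.group(1))
        if 0 <= number < len(frame_names):
            return number
    return None


print(match_frame("2008_004457.jpg", ["2008_000001.jpg", "2008_004457.jpg"]))
print(match_frame("frame_000002.jpg", ["f0", "f1", "f2"]))
```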

    Segmentation mask export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── labelmap.txt # optional, required for non-VOC labels
    ├── ImageSets/
    │   └── Segmentation/
    │       └── default.txt # list of image names without extension
    ├── SegmentationClass/ # merged class masks
    │   ├── image1.png
    │   └── image2.png
    └── SegmentationObject/ # merged instance masks
        ├── image1.png
        └── image2.png
    
    # labelmap.txt
    # label : color (RGB) : 'body' parts : actions
    background:0,128,0::
    aeroplane:10,10,128::
    bicycle:10,128,0::
    bird:0,108,128::
    boat:108,0,100::
    bottle:18,0,8::
    bus:12,28,0::
    

    A mask is a PNG image with 1 or 3 channels where each pixel has its own color, which corresponds to a label. Colors are generated according to the Pascal VOC algorithm. (0, 0, 0) is used for the background by default.

    • supported shapes: Rectangles, Polygons
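
    For reference, the colormap commonly used with Pascal VOC segmentation masks can be computed as in the sketch below. It assumes that the "Pascal VOC algorithm" mentioned above is this standard bit-interleaving scheme; voc_color is a hypothetical function name.

```python
def voc_color(index: int) -> tuple:
    """Return the (R, G, B) color of the Pascal VOC colormap
    for a class index in the range 0..255."""
    r = g = b = 0
    c = index
    for shift in range(7, -1, -1):
        # Take three bits of the index at a time and spread them over
        # the R, G, and B channels, starting from the most significant bit.
        r |= (c & 1) << shift
        g |= ((c >> 1) & 1) << shift
        b |= ((c >> 2) & 1) << shift
        c >>= 3
    return (r, g, b)


for i in range(4):
    print(i, voc_color(i))
```

    With this scheme, index 0 maps to (0, 0, 0), which matches the default background color mentioned above.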

    Segmentation mask import

    Uploaded file: a zip archive of the following structure:

      taskname.zip/
      ├── labelmap.txt # optional, required for non-VOC labels
      ├── ImageSets/
      │   └── Segmentation/
      │       └── <any_subset_name>.txt
      ├── SegmentationClass/
      │   ├── image1.png
      │   └── image2.png
      └── SegmentationObject/
          ├── image1.png
          └── image2.png
    

    It is also possible to import grayscale (1-channel) PNG masks. For grayscale masks, provide a list of labels with one line per color index, from 0 up to the maximum color index used in the images. The lines must be in the right order so that the line index is equal to the color index. Lines can have arbitrary, but different, colors. If there are gaps in the color indices used in the annotations, they must be filled with arbitrary dummy labels. Example:

    q:0,128,0:: # color index 0
    aeroplane:10,10,128:: # color index 1
    _dummy2:2,2,2:: # filler for color index 2
    _dummy3:3,3,3:: # filler for color index 3
    boat:108,0,100:: # color index 4
    ...
    _dummy198:198,198,198:: # filler for color index 198
    _dummy199:199,199,199:: # filler for color index 199
    ...
    the last label:12,28,0:: # color index 200
    
    • supported shapes: Polygons
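
    Writing the filler lines by hand is tedious, so a labelmap for grayscale masks can be generated. The following Python sketch shows one way to do it; grayscale_labelmap is a hypothetical helper, and it assumes that the filler color (N, N, N) does not collide with the color of a real label.

```python
def grayscale_labelmap(labels: dict) -> list:
    """Build labelmap.txt lines for grayscale masks.

    `labels` maps a color index to a (name, (r, g, b)) pair. Missing
    indices up to the maximum are filled with dummy labels so that the
    line index equals the color index.
    """
    lines = []
    for index in range(max(labels) + 1):
        if index in labels:
            name, (r, g, b) = labels[index]
        else:
            # Filler label; the color only has to differ from the others.
            name, (r, g, b) = f"_dummy{index}", (index, index, index)
        lines.append(f"{name}:{r},{g},{b}::")
    return lines


demo = grayscale_labelmap({
    0: ("background", (0, 128, 0)),
    1: ("aeroplane", (10, 10, 128)),
    4: ("boat", (108, 0, 100)),
})
print("\n".join(demo))
```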

    How to create a task from Pascal VOC dataset

    1. Download the Pascal VOC dataset (it can be downloaded from the PASCAL VOC website).

    2. Create a CVAT task with the following labels:

      aeroplane bicycle bird boat bottle bus car cat chair cow diningtable
      dog horse motorbike person pottedplant sheep sofa train tvmonitor
      

      You can add ~checkbox=difficult:false ~checkbox=truncated:false attributes for each label if you want to use them.

      Select the image files you are interested in (see the Creating an annotation task guide for details).

    3. Zip the corresponding annotation files.

    4. Click the Upload annotation button, choose Pascal VOC ZIP 1.1, and select the zip file with annotations from the previous step. It may take some time.

    5.2.25.8 -

    YOLO

    YOLO export

    Downloaded file: a zip archive with the following structure:

    archive.zip/
    ├── obj.data
    ├── obj.names
    ├── obj_<subset>_data
    │   ├── image1.txt
    │   └── image2.txt
    └── train.txt # list of subset image paths
    
    # the only valid subsets are: train, valid
    # train.txt and valid.txt:
    obj_<subset>_data/image1.jpg
    obj_<subset>_data/image2.jpg
    
    # obj.data:
    classes = 3 # optional
    names = obj.names
    train = train.txt
    valid = valid.txt # optional
    backup = backup/ # optional
    
    # obj.names:
    cat
    dog
    airplane
    
    # image_name.txt:
    # label_id - id from obj.names
    # cx, cy - relative coordinates of the bbox center
    # rw, rh - relative size of the bbox
    # label_id cx cy rw rh
    1 0.3 0.8 0.1 0.3
    2 0.7 0.2 0.3 0.1
    

    Each annotation *.txt file has a name that corresponds to the name of the image file (e.g. frame_000001.txt is the annotation for the frame_000001.jpg image). The *.txt file structure: each line describes a label and a bounding box in the following format: label_id cx cy rw rh. obj.names contains the ordered list of label names.
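
    The relation between pixel coordinates and the relative values in an annotation line can be sketched in Python as follows. The helper names to_yolo and from_yolo are hypothetical, and the six-digit precision is an arbitrary choice for the example.

```python
def to_yolo(label_id, box, image_size):
    """Convert an absolute (x_min, y_min, x_max, y_max) box in pixels to
    an annotation line: label_id cx cy rw rh (relative to image size)."""
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    cx = (x_min + x_max) / 2 / width
    cy = (y_min + y_max) / 2 / height
    rw = (x_max - x_min) / width
    rh = (y_max - y_min) / height
    return f"{label_id} {cx:.6f} {cy:.6f} {rw:.6f} {rh:.6f}"


def from_yolo(line, image_size):
    """Inverse of to_yolo: return (label_id, (x_min, y_min, x_max, y_max))."""
    label_id, cx, cy, rw, rh = line.split()
    width, height = image_size
    cx, cy = float(cx) * width, float(cy) * height
    w, h = float(rw) * width, float(rh) * height
    return int(label_id), (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


print(to_yolo(1, (100, 200, 300, 400), (1000, 800)))
print(from_yolo("2 0.5 0.5 0.25 0.5", (200, 100)))
```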

    YOLO import

    Uploaded file: a zip archive of the same structure as above. CVAT must be able to match the frame (image name) with the annotation file name. There are 2 options:

    1. Full match between the image name and the name of the annotation *.txt file (for tasks created from images or an archive of images).

    2. Match by frame number (if CVAT cannot match by name). The file name should be in the following format: <number>.jpg. Use this option when the task was created from a video.

    How to create a task from YOLO formatted dataset (from VOC for example)

    1. Follow the official guide (see the Training YOLO on VOC section) and prepare the YOLO formatted annotation files.

    2. Zip the train images:

    zip images.zip -j -@ < train.txt
    
    3. Create a CVAT task with the following labels:

      aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog
      horse motorbike person pottedplant sheep sofa train tvmonitor

      Select images.zip as data. You will most likely need to use the share functionality, because images.zip is larger than 500 MB. See the Creating an annotation task guide for details.

    4. Create obj.names with the following content:

      aeroplane
      bicycle
      bird
      boat
      bottle
      bus
      car
      cat
      chair
      cow
      diningtable
      dog
      horse
      motorbike
      person
      pottedplant
      sheep
      sofa
      train
      tvmonitor

    5. Zip all label files together (only the label files that correspond to the train subset are needed):

      cat train.txt | while read p; do f=${p##*/}; echo "${p%/*/*}/labels/${f%%.*}.txt"; done | zip labels.zip -j -@ obj.names

    6. Click the Upload annotation button, choose YOLO 1.1, and select the zip file with labels from the previous step.

    5.2.25.9 -

    TFRecord

    TFRecord is a very flexible format, but we try to follow the format used in the TF object detection API with minimal modifications.

    Used feature description:

    import tensorflow as tf

    # Features stored for every image in the TFRecord file.
    image_feature_description = {
        'image/filename': tf.io.FixedLenFeature([], tf.string),
        'image/source_id': tf.io.FixedLenFeature([], tf.string),
        'image/height': tf.io.FixedLenFeature([], tf.int64),
        'image/width': tf.io.FixedLenFeature([], tf.int64),
        # Object boxes and classes.
        'image/object/bbox/xmin': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/xmax': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/ymin': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/ymax': tf.io.VarLenFeature(tf.float32),
        'image/object/class/label': tf.io.VarLenFeature(tf.int64),
        'image/object/class/text': tf.io.VarLenFeature(tf.string),
    }
    

    TFRecord export

    Downloaded file: a zip archive with the following structure:

    taskname.zip/
    ├── default.tfrecord
    └── label_map.pbtxt
    
    # label_map.pbtxt
    item {
    	id: 1
    	name: 'label_0'
    }
    item {
    	id: 2
    	name: 'label_1'
    }
    ...
    
    • supported annotations: Rectangles, Polygons (as masks, manually over Datumaro)

    How to export masks:

    1. Export annotations in the Datumaro format.
    2. Apply the polygons_to_masks and boxes_to_masks transforms:
    datum transform -t polygons_to_masks -p path/to/proj -o ptm
    datum transform -t boxes_to_masks -p ptm -o btm

    3. Export in the TF Detection API format:
    datum export -f tf_detection_api -p btm [-- --save-images]
    

    TFRecord import

    Uploaded file: a zip archive of the following structure:

    taskname.zip/
    └── <any name>.tfrecord
    
    • supported annotations: Rectangles

    How to create a task from TFRecord dataset (from VOC2007 for example)

    1. Create label_map.pbtxt file with the following content:
    item {
        id: 1
        name: 'aeroplane'
    }
    item {
        id: 2
        name: 'bicycle'
    }
    item {
        id: 3
        name: 'bird'
    }
    item {
        id: 4
        name: 'boat'
    }
    item {
        id: 5
        name: 'bottle'
    }
    item {
        id: 6
        name: 'bus'
    }
    item {
        id: 7
        name: 'car'
    }
    item {
        id: 8
        name: 'cat'
    }
    item {
        id: 9
        name: 'chair'
    }
    item {
        id: 10
        name: 'cow'
    }
    item {
        id: 11
        name: 'diningtable'
    }
    item {
        id: 12
        name: 'dog'
    }
    item {
        id: 13
        name: 'horse'
    }
    item {
        id: 14
        name: 'motorbike'
    }
    item {
        id: 15
        name: 'person'
    }
    item {
        id: 16
        name: 'pottedplant'
    }
    item {
        id: 17
        name: 'sheep'
    }
    item {
        id: 18
        name: 'sofa'
    }
    item {
        id: 19
        name: 'train'
    }
    item {
        id: 20
        name: 'tvmonitor'
    }
    
    2. Use create_pascal_tf_record.py to convert the VOC2007 dataset to the TFRecord format. For example:

    python create_pascal_tf_record.py --data_dir <path to VOCdevkit> --set train --year VOC2007 --output_path pascal.tfrecord --label_map_path label_map.pbtxt

    3. Zip the train images:

      cat <path to VOCdevkit>/VOC2007/ImageSets/Main/train.txt | while read p; do echo <path to VOCdevkit>/VOC2007/JPEGImages/${p}.jpg  ; done | zip images.zip -j -@

    4. Create a CVAT task with the following labels:

      aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog horse motorbike person pottedplant sheep sofa train tvmonitor

      Select images.zip as data. See the Creating an annotation task guide for details.

    5. Zip the pascal.tfrecord and label_map.pbtxt files together:

      zip anno.zip -j <path to pascal.tfrecord> <path to label_map.pbtxt>

    6. Click the Upload annotation button, choose TFRecord 1.0, and select the zip file with labels from the previous step. It may take some time.
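
    The label_map.pbtxt file from the first step does not have to be written by hand. As a small sketch, the Python snippet below renders the same item blocks from a list of label names; label_map_pbtxt and VOC_LABELS are hypothetical names introduced for this example.

```python
def label_map_pbtxt(labels):
    """Render label names as label_map.pbtxt text, with ids starting at 1."""
    items = []
    for item_id, name in enumerate(labels, start=1):
        items.append(f"item {{\n    id: {item_id}\n    name: '{name}'\n}}")
    return "\n".join(items) + "\n"


VOC_LABELS = (
    "aeroplane bicycle bird boat bottle bus car cat chair cow diningtable dog "
    "horse motorbike person pottedplant sheep sofa train tvmonitor"
).split()

# Print the first two items as a preview.
print(label_map_pbtxt(VOC_LABELS[:2]))
```

    To create the file, write the returned text for the full list to label_map.pbtxt.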

    5.2.25.10 -

    ImageNet

    ImageNet export

    Downloaded file: a zip archive of the following structure:

    # if we save images:
    taskname.zip/
    ├── label1/
    |   ├── label1_image1.jpg
    |   └── label1_image2.jpg
    └── label2/
        ├── label2_image1.jpg
        ├── label2_image3.jpg
        └── label2_image4.jpg
    
    # if we keep only annotation:
    taskname.zip/
    ├── <any_subset_name>.txt
    └── synsets.txt
    
    
    • supported annotations: Labels
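
    When images are saved, the label of each image is given by the directory it is stored in. The following Python sketch groups archive member names by label under that assumption; labels_from_names is a hypothetical helper, and the input is the kind of list returned by zipfile.ZipFile.namelist().

```python
from collections import defaultdict
from pathlib import PurePosixPath


def labels_from_names(names):
    """Group archive member names of the form <label>/<image> by label.

    Directory entries and files at other nesting levels are skipped.
    """
    groups = defaultdict(list)
    for name in names:
        path = PurePosixPath(name)
        if len(path.parts) == 2 and not name.endswith("/"):
            groups[path.parts[0]].append(path.name)
    return dict(groups)


members = [
    "label1/",
    "label1/label1_image1.jpg",
    "label1/label1_image2.jpg",
    "label2/label2_image1.jpg",
]
print(labels_from_names(members))
```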

    ImageNet import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Labels

    5.2.25.11 -

    WIDER Face

    WIDER Face export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── labels.txt # optional
    ├── wider_face_split/
    │   └── wider_face_<any_subset_name>_bbx_gt.txt
    └── WIDER_<any_subset_name>/
        └── images/
            ├── 0--label0/
            │   └── 0_label0_image1.jpg
            └── 1--label1/
                └── 1_label1_image2.jpg
    
    • supported annotations: Rectangles (with attributes), Labels
    • supported attributes:
      • blur, expression, illumination, pose, invalid
      • occluded (both the annotation property & an attribute)

    WIDER Face import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Rectangles (with attributes), Labels
    • supported attributes:
      • blur, expression, illumination, occluded, pose, invalid

    5.2.25.12 -

    CamVid

    CamVid export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── label_colors.txt # optional, required for non-CamVid labels
    ├── <any_subset_name>/
    |   ├── image1.png
    |   └── image2.png
    ├── <any_subset_name>annot/
    |   ├── image1.png
    |   └── image2.png
    └── <any_subset_name>.txt
    
    # label_colors.txt (with color value type)
    # if you want to manually set the color for labels, configure label_colors.txt as follows:
    # color (RGB) label
    0 0 0 Void
    64 128 64 Animal
    192 0 128 Archway
    0 128 192 Bicyclist
    0 128 64 Bridge
    
    # label_colors.txt (without color value type)
    # if you do not manually set the color for labels, it will be set automatically:
    # label
    Void
    Animal
    Archway
    Bicyclist
    Bridge
    

    A mask is a PNG image with 1 or 3 channels in which each pixel's color corresponds to a label. By default, (0, 0, 0) is used for the background.
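
    Both label_colors.txt layouts shown above can be read with a few lines of Python. This is a minimal sketch, not CVAT's own parser: the function name parse_label_colors is hypothetical, and it assumes that comment lines start with # and that a line carrying a color begins with three integer fields.

```python
from typing import Dict, Optional, Tuple

Color = Tuple[int, int, int]


def parse_label_colors(text: str) -> Dict[str, Optional[Color]]:
    """Map each label to its RGB color, or to None when no color is given."""
    labels: Dict[str, Optional[Color]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) >= 4 and all(p.isdigit() for p in parts[:3]):
            # "r g b label" layout
            r, g, b = (int(p) for p in parts[:3])
            labels[" ".join(parts[3:])] = (r, g, b)
        else:
            # "label" layout: the color is left to be assigned automatically
            labels[line] = None
    return labels


with_colors = parse_label_colors("0 0 0 Void\n64 128 64 Animal\n")
without_colors = parse_label_colors("Void\nAnimal\n")
```

    Labels mapped to None correspond to the second layout, where the color is set automatically.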

    • supported annotations: Rectangles, Polygons

    CamVid import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Polygons

    5.2.25.13 - VGGFace2

    VGGFace2 export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── labels.txt # optional
    ├── <any_subset_name>/
    |   ├── label0/
    |   |   └── image1.jpg
    |   └── label1/
    |       └── image2.jpg
    └── bb_landmark/
        ├── loose_bb_<any_subset_name>.csv
        └── loose_landmark_<any_subset_name>.csv
    # labels.txt
    # n000001 car
    label0 <class0>
    label1 <class1>
    
    • supported annotations: Rectangles, Points (landmarks - groups of 5 points)

    VGGFace2 import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Rectangles, Points (landmarks - groups of 5 points)

    5.2.25.14 - Market-1501

    Market-1501 export

    Downloaded file: a zip archive of the following structure:

    taskname.zip/
    ├── bounding_box_<any_subset_name>/
    │   └── image_name_1.jpg
    └── query
        ├── image_name_2.jpg
        └── image_name_3.jpg
    # if we keep only annotation:
    taskname.zip/
    └── images_<any_subset_name>.txt
    # images_<any_subset_name>.txt
    query/image_name_1.jpg
    bounding_box_<any_subset_name>/image_name_2.jpg
    bounding_box_<any_subset_name>/image_name_3.jpg
    # image_name = 0001_c1s1_000015_00.jpg
    0001 - person id
    c1 - camera id (there are 6 cameras in total)
    s1 - sequence
    000015 - frame number in sequence
    00 - means that this bounding box is the first of several
    
    • supported annotations: Label market-1501 with attributes (query, person_id, camera_id)
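
    The image naming scheme described above can be split into its fields with a regular expression. The sketch below is illustrative only: the function name parse_market1501_name and the field name box_index are hypothetical, and the pattern assumes non-negative numeric fields and a .jpg extension.

```python
import re
from typing import Dict, Optional

# person id, camera id, sequence, frame number, bounding box index
NAME_PATTERN = re.compile(
    r"^(?P<person_id>\d+)_c(?P<camera_id>\d+)s(?P<sequence>\d+)"
    r"_(?P<frame>\d+)_(?P<box_index>\d+)\.jpg$"
)


def parse_market1501_name(filename: str) -> Optional[Dict[str, int]]:
    """Return the numeric fields of a file name, or None if it does not match."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


info = parse_market1501_name("0001_c1s1_000015_00.jpg")
```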

    Market-1501 import

    Uploaded file: a zip archive of the structure above

    • supported annotations: Label market-1501 with attributes (query, person_id, camera_id)

    5.2.25.15 - ICDAR13/15

    ICDAR13/15 export

    Downloaded file: a zip archive of the following structure:

    # word recognition task
    taskname.zip/
    └── word_recognition/
        └── <any_subset_name>/
            ├── images
            |   ├── word1.png
            |   └── word2.png
            └── gt.txt
    # text localization task
    taskname.zip/
    └── text_localization/
        └── <any_subset_name>/
            ├── images
            |   ├── img_1.png
            |   └── img_2.png
            ├── gt_img_1.txt
            └── gt_img_2.txt
    # text segmentation task
    taskname.zip/
    └── text_segmentation/
        └── <any_subset_name>/
            ├── images
            |   ├── 1.png
            |   └── 2.png
            ├── 1_GT.bmp
            ├── 1_GT.txt
            ├── 2_GT.bmp
            └── 2_GT.txt
    

    Word recognition task:

    • supported annotations: Label icdar with attribute caption

    Text localization task:

    • supported annotations: Rectangles and Polygons with label icdar and attribute text

    Text segmentation task:

    • supported annotations: Rectangles and Polygons with label icdar and attributes index, text, color, center

    ICDAR13/15 import

    Uploaded file: a zip archive of the structure above

    Word recognition task:

    • supported annotations: Label icdar with attribute caption

    Text localization task:

    • supported annotations: Rectangles and Polygons with label icdar and attribute text

    Text segmentation task:

    • supported annotations: Rectangles and Polygons with label icdar and attributes index, text, color, center

    5.2.25.16 - Open Images

    • Format specification

    • Dataset examples

    • Supported annotations:

      • Rectangles (detection task)
      • Tags (classification task)
      • Polygons (segmentation task)
    • Supported attributes:

      • Labels

        • score (should be defined for labels as text or number). The confidence level from 0 to 1.
      • Bounding boxes

        • score (should be defined for labels as text or number). The confidence level from 0 to 1.
        • occluded (both UI option and a separate attribute). Whether the object is occluded by another object.
        • truncated (should be defined for labels as checkboxes). Whether the object extends beyond the boundary of the image.
        • is_group_of (should be defined for labels as checkboxes). Whether the object represents a group of objects of the same class.
        • is_depiction (should be defined for labels as checkboxes). Whether the object is a depiction (such as a drawing) rather than a real object.
        • is_inside (should be defined for labels as checkboxes). Whether the object is seen from the inside.
      • Masks

        • box_id (should be defined for labels as text). An identifier for the bounding box associated with the mask.
        • predicted_iou (should be defined for labels as text or number). Predicted IoU value with respect to the ground truth.

    Open Images export

    Downloaded file: a zip archive of the following structure:

    └─ taskname.zip/
        ├── annotations/
        │   ├── bbox_labels_600_hierarchy.json
        │   ├── class-descriptions.csv
        |   ├── images.meta  # additional file with information about image sizes
        │   ├── <subset_name>-image_ids_and_rotation.csv
        │   ├── <subset_name>-annotations-bbox.csv
        │   ├── <subset_name>-annotations-human-imagelabels.csv
        │   └── <subset_name>-annotations-object-segmentation.csv
        ├── images/
        │   ├── subset1/
        │   │   ├── <image_name101.jpg>
        │   │   ├── <image_name102.jpg>
        │   │   └── ...
        │   ├── subset2/
        │   │   ├── <image_name201.jpg>
        │   │   ├── <image_name202.jpg>
        │   │   └── ...
        |   ├── ...
        └── masks/
            ├── subset1/
            │   ├── <mask_name101.png>
            │   ├── <mask_name102.png>
            │   └── ...
            ├── subset2/
            │   ├── <mask_name201.png>
            │   ├── <mask_name202.png>
            │   └── ...
            ├── ...
    

    Open Images import

    Uploaded file: a zip archive of the following structure:

    └─ upload.zip/
        ├── annotations/
        │   ├── bbox_labels_600_hierarchy.json
        │   ├── class-descriptions.csv
        |   ├── images.meta  # optional, file with information about image sizes
        │   ├── <subset_name>-image_ids_and_rotation.csv
        │   ├── <subset_name>-annotations-bbox.csv
        │   ├── <subset_name>-annotations-human-imagelabels.csv
        │   └── <subset_name>-annotations-object-segmentation.csv
        └── masks/
            ├── subset1/
            │   ├── <mask_name101.png>
            │   ├── <mask_name102.png>
            │   └── ...
            ├── subset2/
            │   ├── <mask_name201.png>
            │   ├── <mask_name202.png>
            │   └── ...
            ├── ...
    

    Image IDs in <subset_name>-image_ids_and_rotation.csv should match the image names in the task.

    5.2.25.17 - Cityscapes

    • Format specification

    • Dataset examples

    • Supported annotations

      • Polygons (segmentation task)
    • Supported attributes

      • ‘is_crowd’ (boolean, should be defined for labels as checkboxes). Specifies whether the annotation label can distinguish between different instances. If False, the annotation id field encodes the instance id.

    Cityscapes export

    Downloaded file: a zip archive of the following structure:

    .
    ├── label_color.txt
    ├── gtFine
    │   ├── <subset_name>
    │   │   └── <city_name>
    │   │       ├── image_0_gtFine_instanceIds.png
    │   │       ├── image_0_gtFine_color.png
    │   │       ├── image_0_gtFine_labelIds.png
    │   │       ├── image_1_gtFine_instanceIds.png
    │   │       ├── image_1_gtFine_color.png
    │   │       ├── image_1_gtFine_labelIds.png
    │   │       ├── ...
    └── imgsFine  # if saving images was requested
        └── leftImg8bit
            ├── <subset_name>
            │   └── <city_name>
            │       ├── image_0_leftImg8bit.png
            │       ├── image_1_leftImg8bit.png
            │       ├── ...
    
    • label_color.txt: a file that describes the color of each label
    # label_color.txt example
    # r g b label_name
    0 0 0 background
    0 255 0 tree
    ...
    
    • *_gtFine_color.png: class labels encoded by color.
    • *_gtFine_labelIds.png: class labels encoded by index.
    • *_gtFine_instanceIds.png: class and instance labels encoded by an instance ID. Each pixel value encodes both the class and the individual instance: the integer part of the ID divided by 1000 gives the class ID, and the remainder is the instance ID. If an annotation describes multiple instances, its pixels carry the regular ID of that class.
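
    The instance ID arithmetic above can be written as a small helper. This is a sketch under one assumption that the text does not state explicitly: pixel values below 1000 are treated as plain class IDs (the multiple-instances case), which follows the usual Cityscapes convention. The function name decode_instance_pixel is hypothetical.

```python
from typing import Optional, Tuple


def decode_instance_pixel(value: int) -> Tuple[int, Optional[int]]:
    """Split an instanceIds pixel value into (class_id, instance_id)."""
    if value < 1000:
        # assumed: a regular class ID, no individual instance is encoded
        return value, None
    # integer part of the division is the class, the remainder is the instance
    return value // 1000, value % 1000


single = decode_instance_pixel(26002)  # class 26, instance 2
group = decode_instance_pixel(26)      # class 26, no instance
```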

    Cityscapes annotations import

    Uploaded file: a zip archive with the following structure:

    .
    ├── label_color.txt # optional
    └── gtFine
        └── <city_name>
            ├── image_0_gtFine_instanceIds.png
            ├── image_1_gtFine_instanceIds.png
            ├── ...
    

    Creating task with Cityscapes dataset

    Create a task with the labels you need, or use the labels and colors of the original dataset. To work with the Cityscapes format, the task must have a label with black color for the background.

    Original Cityscapes color map:

    [
        {"name": "unlabeled", "color": "#000000", "attributes": []},
        {"name": "egovehicle", "color": "#000000", "attributes": []},
        {"name": "rectificationborder", "color": "#000000", "attributes": []},
        {"name": "outofroi", "color": "#000000", "attributes": []},
        {"name": "static", "color": "#000000", "attributes": []},
        {"name": "dynamic", "color": "#6f4a00", "attributes": []},
        {"name": "ground", "color": "#510051", "attributes": []},
        {"name": "road", "color": "#804080", "attributes": []},
        {"name": "sidewalk", "color": "#f423e8", "attributes": []},
        {"name": "parking", "color": "#faaaa0", "attributes": []},
        {"name": "railtrack", "color": "#e6968c", "attributes": []},
        {"name": "building", "color": "#464646", "attributes": []},
        {"name": "wall", "color": "#66669c", "attributes": []},
        {"name": "fence", "color": "#be9999", "attributes": []},
        {"name": "guardrail", "color": "#b4a5b4", "attributes": []},
        {"name": "bridge", "color": "#966464", "attributes": []},
        {"name": "tunnel", "color": "#96785a", "attributes": []},
        {"name": "pole", "color": "#999999", "attributes": []},
        {"name": "polegroup", "color": "#999999", "attributes": []},
        {"name": "trafficlight", "color": "#faaa1e", "attributes": []},
        {"name": "trafficsign", "color": "#dcdc00", "attributes": []},
        {"name": "vegetation", "color": "#6b8e23", "attributes": []},
        {"name": "terrain", "color": "#98fb98", "attributes": []},
        {"name": "sky", "color": "#4682b4", "attributes": []},
        {"name": "person", "color": "#dc143c", "attributes": []},
        {"name": "rider", "color": "#ff0000", "attributes": []},
        {"name": "car", "color": "#00008e", "attributes": []},
        {"name": "truck", "color": "#000046", "attributes": []},
        {"name": "bus", "color": "#003c64", "attributes": []},
        {"name": "caravan", "color": "#00005a", "attributes": []},
        {"name": "trailer", "color": "#00006e", "attributes": []},
        {"name": "train", "color": "#005064", "attributes": []},
        {"name": "motorcycle", "color": "#0000e6", "attributes": []},
        {"name": "bicycle", "color": "#770b20", "attributes": []},
        {"name": "licenseplate", "color": "#00000e", "attributes": []}
    ]
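
    A label list in this JSON form can be checked for the required black background label and converted to the r g b label_name lines used by label_color.txt. The snippet below is a sketch that uses only three entries from the color map above; the helper names hex_to_rgb and has_black_label are hypothetical and are not part of CVAT.

```python
import json
from typing import List, Tuple


def hex_to_rgb(color: str) -> Tuple[int, int, int]:
    """Convert '#rrggbb' to an (r, g, b) tuple."""
    digits = color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return r, g, b


def has_black_label(labels: List[dict]) -> bool:
    """Check that at least one label uses black, as the format requires."""
    return any(hex_to_rgb(label["color"]) == (0, 0, 0) for label in labels)


labels = json.loads("""[
    {"name": "unlabeled", "color": "#000000", "attributes": []},
    {"name": "road", "color": "#804080", "attributes": []},
    {"name": "car", "color": "#00008e", "attributes": []}
]""")

# One "r g b label_name" line per label
color_lines = [
    "{} {} {} {}".format(*hex_to_rgb(label["color"]), label["name"])
    for label in labels
]
```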
    
    

    Upload images when creating a task:

    images.zip/
        ├── image_0.jpg
        ├── image_1.jpg
        ├── ...
    
    

    After creating the task, upload the Cityscapes annotations as described in the previous section.

    5.2.25.18 - KITTI

    • Format specification for KITTI detection

    • Format specification for KITTI segmentation

    • Dataset examples

    • supported annotations:

      • Rectangles (detection task)
      • Polygons (segmentation task)
    • supported attributes:

      • occluded (both UI option and a separate attribute). Indicates that a significant portion of the object within the bounding box is occluded by another object.
      • truncated, supported only for rectangles (should be defined for labels as checkboxes). Indicates that the bounding box specified for the object does not correspond to the full extent of the object.
      • ‘is_crowd’, supported only for polygons (should be defined for labels as checkboxes). Indicates that the annotation covers multiple instances of the same class.

    KITTI annotations export

    Downloaded file: a zip archive of the following structure:

    └─ annotations.zip/
        ├── label_colors.txt # list of pairs r g b label_name
        ├── labels.txt # list of labels
        └── default/
            ├── label_2/ # left color camera label files
            │   ├── <image_name_1>.txt
            │   ├── <image_name_2>.txt
            │   └── ...
            ├── instance/ # instance segmentation masks
            │   ├── <image_name_1>.png
            │   ├── <image_name_2>.png
            │   └── ...
            ├── semantic/ # semantic segmentation masks (labels are encoded by their IDs)
            │   ├── <image_name_1>.png
            │   ├── <image_name_2>.png
            │   └── ...
            └── semantic_rgb/ # semantic segmentation masks (labels are encoded by their colors)
                ├── <image_name_1>.png
                ├── <image_name_2>.png
                └── ...
    

    KITTI annotations import

    You can upload KITTI annotations in two ways: rectangles for the detection task and masks for the segmentation task.

    For detection tasks, the uploaded archive should have the following structure:

    └─ annotations.zip/
        ├── labels.txt # optional, labels list for non-original detection labels
        └── <subset_name>/
            ├── label_2/ # left color camera label files
            │   ├── <image_name_1>.txt
            │   ├── <image_name_2>.txt
            │   └── ...
    

    For segmentation tasks, the uploaded archive should have the following structure:

    └─ annotations.zip/
        ├── label_colors.txt # optional, color map for non-original segmentation labels
        └── <subset_name>/
            ├── instance/ # instance segmentation masks
            │   ├── <image_name_1>.png
            │   ├── <image_name_2>.png
            │   └── ...
            ├── semantic/ # optional, semantic segmentation masks (labels are encoded by their IDs)
            │   ├── <image_name_1>.png
            │   ├── <image_name_2>.png
            │   └── ...
            └── semantic_rgb/ # optional, semantic segmentation masks (labels are encoded by their colors)
                ├── <image_name_1>.png
                ├── <image_name_2>.png
                └── ...
    

    All annotation files and masks should follow the structure described in the original format specification.

    5.2.25.19 - LFW

    • Format specification

    • Dataset examples

    • Supported annotations: tags, points.

    • Supported attributes:

      • negative_pairs (should be defined for labels as text): list of image names with mismatched persons.
      • positive_pairs (should be defined for labels as text): list of image names with matched persons.

    Import LFW annotation

    The uploaded annotations file should be a zip file with the following structure:

    <archive_name>.zip/
        └── annotations/
            ├── landmarks.txt # list with landmark points for each image
            ├── pairs.txt # list of matched and mismatched pairs of persons
            └── people.txt # optional file with a list of person names
    

    Full information about the content of annotation files is available here

    Export LFW annotation

    Downloaded file: a zip archive of the following structure:

    <archive_name>.zip/
        ├── images/ # if the save images option was selected
        │    ├── name1/
        │    │   ├── name1_0001.jpg
        │    │   ├── name1_0002.jpg
        │    │   ├── ...
        │    ├── name2/
        │    │   ├── name2_0001.jpg
        │    │   ├── name2_0002.jpg
        │    │   ├── ...
        │    ├── ...
        ├── landmarks.txt
        ├── pairs.txt
        └── people.txt
    

    Example: create task with images and upload LFW annotations into it

    This is one of the possible ways to create a task and add LFW annotations for it.

    • On the task creation page:
      • Add labels that correspond to the names of the persons.
      • For each label, define text attributes named positive_pairs and negative_pairs.
      • Add images using a zip archive from the local repository:
    images.zip/
        ├── name1_0001.jpg
        ├── name1_0002.jpg
        ├── ...
        ├── name1_<N>.jpg
        ├── name2_0001.jpg
        ├── ...
    
    • On the annotation page: Upload annotation -> LFW 1.0 -> choose the archive with the structure described in the import section.
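
    When preparing the labels for such a task, the person names can be derived from the image file names. The sketch below assumes that every file is named <person>_<index>.jpg, as in the archive layout above, and that the person name is everything before the last underscore. The function name group_by_person is hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def group_by_person(filenames: List[str]) -> Dict[str, List[str]]:
    """Group image file names by the person name encoded in them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for filename in filenames:
        stem = PurePosixPath(filename).stem       # e.g. "name1_0001"
        person, _, _index = stem.rpartition("_")  # split at the last underscore
        groups[person].append(filename)
    return dict(groups)


groups = group_by_person(["name1_0001.jpg", "name1_0002.jpg", "name2_0001.jpg"])
labels = sorted(groups)  # candidate label names for the task
```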

    5.2.26 - Task synchronization with a repository

    Notice: this feature works only if a git repository was specified when the task was created.

    1. At the end of the annotation process, a task is synchronized by clicking Synchronize on the task page. If the synchronization is successful, the button will change to Synchronized in blue:

    2. The annotation is now in the repository in a temporary branch. The next step is to go to the repository and manually create a pull request to the main branch.

    3. After merging the PR, when the annotation is saved in the main branch, the button changes to Merged and is highlighted in green.

    If the annotation in the task does not match the annotation in the repository, the sync button turns red:

    5.2.27 - XML annotation format

    When you download annotations from Computer Vision Annotation Tool (CVAT), you can choose one of several data formats. This document describes the XML annotation format. Each format has an X.Y version (e.g. 1.0). In general, the major version (X) is incremented when the data format has incompatible changes, and the minor version (Y) is incremented when the data format is slightly modified (e.g. it has one or several extra fields inside the meta information). This document describes all changes for all versions of the XML annotation format.

    Version 1.1

    There are currently two different formats, one for image tasks and one for video tasks. Both formats have a common part, which is described below. Compared with the previous version, the flipped tag was added. The original_size tag was also added for interpolation mode to specify the frame size. In annotation mode, each image tag has width and height attributes for the same purpose.

    For an explanation of rle, see Run-length encoding.

    <?xml version="1.0" encoding="utf-8"?>
    <annotations>
      <version>1.1</version>
      <meta>
        <task>
          <id>Number: id of the task</id>
          <name>String: some task name</name>
          <size>Number: count of frames/images in the task</size>
          <mode>String: interpolation or annotation</mode>
          <overlap>Number: number of overlapped frames between segments</overlap>
          <bugtracker>String: URL of a page which describes the task</bugtracker>
          <flipped>Boolean: were images of the task flipped? (True/False)</flipped>
          <created>String: date when the task was created</created>
          <updated>String: date when the task was updated</updated>
          <labels>
            <label>
              <name>String: name of the label (e.g. car, person)</name>
              <type>String: any, bbox, cuboid, cuboid_3d, ellipse, mask, polygon, polyline, points, skeleton, tag</type>
              <attributes>
                <attribute>
                  <name>String: attribute name</name>
                  <mutable>Boolean: mutable (allow different values between frames)</mutable>
                  <input_type>String: select, checkbox, radio, number, text</input_type>
                  <default_value>String: default value</default_value>
                  <values>String: possible values, separated by newlines
    ex. value 2
    ex. value 3</values>
                </attribute>
              </attributes>
              <svg>String: label representation in svg, only for skeletons</svg>
              <parent>String: label parent name, only for skeletons</parent>
            </label>
          </labels>
          <segments>
            <segment>
              <id>Number: id of the segment</id>
              <start>Number: first frame</start>
              <stop>Number: last frame</stop>
              <url>String: URL (e.g. http://cvat.example.com/?id=213)</url>
            </segment>
          </segments>
          <owner>
            <username>String: the author of the task</username>
            <email>String: email of the author</email>
          </owner>
          <original_size>
            <width>Number: frame width</width>
            <height>Number: frame height</height>
          </original_size>
        </task>
        <dumped>String: date when the annotation was dumped</dumped>
      </meta>
      ...
    </annotations>
    

    Annotation

    Below is a description of the data format for image tasks. Each image can contain many different objects, and each object can have multiple attributes. If an annotation task is created with the z_order flag, each object has a z_order attribute, which is used to draw objects properly when they intersect (the bigger the z_order, the closer the object is to the camera). In previous versions of the format, only the box shape was available. In later releases, mask, polygon, polyline, points, skeletons, and tags were added. See below for more details:

    <?xml version="1.0" encoding="utf-8"?>
    <annotations>
      ...
      <image id="Number: id of the image (the index in lexical order of images)" name="String: path to the image"
        width="Number: image width" height="Number: image height">
        <box label="String: the associated label" xtl="Number: float" ytl="Number: float" xbr="Number: float" ybr="Number: float" occluded="Number: 0 - False, 1 - True" z_order="Number: z-order of the object">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </box>
        <polygon label="String: the associated label" points="x0,y0;x1,y1;..." occluded="Number: 0 - False, 1 - True"
        z_order="Number: z-order of the object">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </polygon>
        <polyline label="String: the associated label" points="x0,y0;x1,y1;..." occluded="Number: 0 - False, 1 - True"
        z_order="Number: z-order of the object">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </polyline>
        <points label="String: the associated label" points="x0,y0;x1,y1;..." occluded="Number: 0 - False, 1 - True"
        z_order="Number: z-order of the object">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </points>
        <tag label="String: the associated label" source="manual or auto">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </tag>
        <skeleton label="String: the associated label" z_order="Number: z-order of the object">
          <points label="String: the associated label" occluded="Number: 0 - False, 1 - True" outside="Number: 0 - False, 1 - True" points="x0,y0;x1,y1">
            <attribute name="String: an attribute name">String: the attribute value</attribute>
          </points>
          ...
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </skeleton>
        <mask label="String: the associated label" source="manual or auto" occluded="Number: 0 - False, 1 - True" rle="RLE mask" left="Number: left coordinate of the image where the mask begins" top="Number: top coordinate of the image where the mask begins" width="Number: width of the mask" height="Number: height of the mask" z_order="Number: z-order of the object">
        </mask>
        ...
      </image>
      ...
    </annotations>
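
    A file in this layout can be read with Python's standard xml.etree.ElementTree module. The following is a minimal sketch, not an official CVAT loader: the sample is a shortened copy of two shapes from this section's example, the XML declaration is left out for brevity, and the helper name parse_points is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SAMPLE = """<annotations>
  <image id="0" name="filename000.jpg" width="1600" height="1200">
    <box label="plate" xtl="797.33" ytl="870.92" xbr="965.52" ybr="928.94" occluded="0" z_order="4">
    </box>
    <polyline label="traffic_line" points="462.10,0.00;126.80,1200.00" occluded="0" z_order="3">
    </polyline>
  </image>
</annotations>
"""


def parse_points(points: str) -> List[Tuple[float, float]]:
    """Turn 'x0,y0;x1,y1;...' into a list of (x, y) pairs."""
    pairs = []
    for pair in points.split(";"):
        x, y = pair.split(",")
        pairs.append((float(x), float(y)))
    return pairs


root = ET.fromstring(SAMPLE)
image = root.find("image")

# Each box becomes (label, xtl, ytl, xbr, ybr)
boxes = [
    (box.get("label"), float(box.get("xtl")), float(box.get("ytl")),
     float(box.get("xbr")), float(box.get("ybr")))
    for box in image.iter("box")
]
# Each polyline becomes (label, [(x, y), ...])
polylines = [
    (line.get("label"), parse_points(line.get("points")))
    for line in image.iter("polyline")
]
```

    The same parse_points helper applies to polygon and points elements, since they use the same points attribute syntax.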
    

    Example:

    <?xml version="1.0" encoding="utf-8"?>
    <annotations>
      <version>1.1</version>
      <meta>
        <task>
          <id>4</id>
          <name>segmentation</name>
          <size>27</size>
          <mode>annotation</mode>
          <overlap>0</overlap>
          <bugtracker></bugtracker>
          <flipped>False</flipped>
          <created>2018-09-25 11:34:24.617558+03:00</created>
          <updated>2018-09-25 11:38:27.301183+03:00</updated>
          <labels>
            <label>
              <name>car</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>traffic_line</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>wheel</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>plate</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>s1</name>
              <type>skeleton</type>
              <attributes>
              </attributes>
              <svg>&lt;line x1="36.87290954589844" y1="47.732025146484375" x2="86.87290954589844" y2="10.775501251220703" stroke="black" data-type="edge" data-node-from="2" stroke-width="0.5" data-node-to="3"&gt;&lt;/line&gt;&lt;line x1="25.167224884033203" y1="22.64841079711914" x2="36.87290954589844" y2="47.732025146484375" stroke="black" data-type="edge" data-node-from="1" stroke-width="0.5" data-node-to="2"&gt;&lt;/line&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="25.167224884033203" cy="22.64841079711914" stroke-width="0.1" data-type="element node" data-element-id="1" data-node-id="1" data-label-name="1"&gt;&lt;/circle&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="36.87290954589844" cy="47.732025146484375" stroke-width="0.1" data-type="element node" data-element-id="2" data-node-id="2" data-label-name="2"&gt;&lt;/circle&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="86.87290954589844" cy="10.775501251220703" stroke-width="0.1" data-type="element node" data-element-id="3" data-node-id="3" data-label-name="3"&gt;&lt;/circle&gt;</svg>
            </label>
            <label>
              <name>1</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
            <label>
              <name>2</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
            <label>
              <name>3</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
          </labels>
          <segments>
            <segment>
              <id>4</id>
              <start>0</start>
              <stop>26</stop>
              <url>http://localhost:8080/?id=4</url>
            </segment>
          </segments>
          <owner>
            <username>admin</username>
            <email></email>
          </owner>
        </task>
        <dumped>2018-09-25 11:38:28.799808+03:00</dumped>
      </meta>
      <image id="0" name="filename000.jpg" width="1600" height="1200">
        <box label="plate" xtl="797.33" ytl="870.92" xbr="965.52" ybr="928.94" occluded="0" z_order="4">
        </box>
        <polygon label="car" points="561.30,916.23;561.30,842.77;554.72,761.63;553.62,716.67;565.68,677.20;577.74,566.45;547.04,559.87;536.08,542.33;528.40,520.40;541.56,512.72;559.10,509.43;582.13,506.14;588.71,464.48;583.23,448.03;587.61,434.87;594.19,431.58;609.54,399.78;633.66,369.08;676.43,294.52;695.07,279.17;703.84,279.17;735.64,268.20;817.88,264.91;923.14,266.01;997.70,274.78;1047.04,283.55;1063.49,289.04;1090.90,330.70;1111.74,371.27;1135.86,397.59;1147.92,428.29;1155.60,435.97;1157.79,451.32;1156.69,462.28;1159.98,491.89;1163.27,522.59;1173.14,513.82;1199.46,516.01;1224.68,521.49;1225.77,544.52;1207.13,568.64;1181.91,576.32;1178.62,582.90;1177.53,619.08;1186.30,680.48;1199.46,711.19;1206.03,733.12;1203.84,760.53;1197.26,818.64;1199.46,840.57;1203.84,908.56;1192.88,930.49;1184.10,939.26;1162.17,944.74;1139.15,960.09;1058.01,976.54;1028.40,969.96;1002.09,972.15;931.91,974.35;844.19,972.15;772.92,972.15;729.06,967.77;713.71,971.06;685.20,973.25;659.98,968.86;644.63,984.21;623.80,983.12;588.71,985.31;560.20,966.67" occluded="0" z_order="1">
        </polygon>
        <polyline label="traffic_line" points="462.10,0.00;126.80,1200.00" occluded="0" z_order="3">
        </polyline>
        <polyline label="traffic_line" points="1212.40,0.00;1568.66,1200.00" occluded="0" z_order="2">
        </polyline>
        <points label="wheel" points="574.90,939.48;1170.16,907.90;1130.69,445.26;600.16,459.48" occluded="0" z_order="5">
        </points>
        <tag label="good_frame" source="manual">
        </tag>
        <skeleton label="s1" source="manual" z_order="0">
          <points label="1" occluded="0" source="manual" outside="0" points="54.47,94.81">
          </points>
          <points label="2" occluded="0" source="manual" outside="0" points="68.02,162.34">
          </points>
          <points label="3" occluded="0" source="manual" outside="0" points="125.87,62.85">
          </points>
        </skeleton>
        <mask label="car" source="manual" occluded="0" rle="3, 5, 7, 7, 5, 9, 3, 11, 2, 11, 2, 12, 1, 12, 1, 26, 1, 12, 1, 12, 2, 11, 3, 9, 5, 7, 7, 5, 3" left="707" top="888" width="13" height="15" z_order="0">
        </mask>
      </image>
    </annotations>
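
    The rle attribute of the mask element in the example above can be expanded into a binary grid. The sketch below assumes that the counts alternate between background and foreground runs, start with background, and run in row-major order. The example mask is consistent with that reading (its counts sum to 13 x 15 = 195), but treat the assumption as unverified for other exports. The function name decode_rle is hypothetical.

```python
from typing import List


def decode_rle(rle: str, width: int, height: int) -> List[List[int]]:
    """Expand comma-separated run lengths into rows of 0/1 values."""
    counts = [int(c) for c in rle.split(",")]
    if sum(counts) != width * height:
        raise ValueError("RLE length does not match the mask size")
    flat: List[int] = []
    value = 0  # assumed: the first run is background
    for count in counts:
        flat.extend([value] * count)
        value = 1 - value  # runs alternate between 0 and 1
    return [flat[row * width:(row + 1) * width] for row in range(height)]


RLE = ("3, 5, 7, 7, 5, 9, 3, 11, 2, 11, 2, 12, 1, 12, 1, 26, 1, 12, 1, 12, "
       "2, 11, 3, 9, 5, 7, 7, 5, 3")
mask = decode_rle(RLE, width=13, height=15)
```

    The left and top attributes then give the position of this grid within the full image.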
    

    Interpolation

    Below is a description of the data format for video tasks. The annotation contains tracks. Each track corresponds to an object which can be present on multiple frames. The same object cannot be present on the same frame in multiple locations. Each location of the object can have multiple attributes; even if an attribute is immutable for the object, it is cloned for each location (a known redundancy).

    <?xml version="1.0" encoding="utf-8"?>
    <annotations>
      ...
      <track id="Number: id of the track (doesn't have any special meaning)" label="String: the associated label" source="manual or auto">
        <box frame="Number: frame" xtl="Number: float" ytl="Number: float" xbr="Number: float" ybr="Number: float" outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
          ...
        </box>
        <polygon frame="Number: frame" points="x0,y0;x1,y1;..." outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
        </polygon>
        <polyline frame="Number: frame" points="x0,y0;x1,y1;..." outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
        </polyline>
        <points frame="Number: frame" points="x0,y0;x1,y1;..." outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True">
          <attribute name="String: an attribute name">String: the attribute value</attribute>
        </points>
        <mask frame="Number: frame" outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" rle="RLE mask" left="Number: left coordinate of the image where the mask begins" top="Number: top coordinate of the image where the mask begins" width="Number: width of the mask" height="Number: height of the mask" z_order="Number: z-order of the object">
        </mask>
        ...
      </track>
      <track id="Number: id of the track (doesn't have any special meaning)" label="String: the associated label" source="manual or auto">
        <skeleton frame="Number: frame" keyframe="Number: 0 - False, 1 - True">
          <points label="String: the associated label" outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True" points="x0,y0;x1,y1">
          </points>
          ...
        </skeleton>
        ...
      </track>
      ...
    </annotations>
    

    Example:

    <?xml version="1.0" encoding="utf-8"?>
    <annotations>
      <version>1.1</version>
      <meta>
        <task>
          <id>5</id>
          <name>interpolation</name>
          <size>4620</size>
          <mode>interpolation</mode>
          <overlap>5</overlap>
          <bugtracker></bugtracker>
          <flipped>False</flipped>
          <created>2018-09-25 12:32:09.868194+03:00</created>
          <updated>2018-09-25 16:05:05.619841+03:00</updated>
          <labels>
            <label>
              <name>person</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>car</name>
              <attributes>
              </attributes>
            </label>
            <label>
              <name>s1</name>
              <type>skeleton</type>
              <attributes>
              </attributes>
              <svg>&lt;line x1="36.87290954589844" y1="47.732025146484375" x2="86.87290954589844" y2="10.775501251220703" stroke="black" data-type="edge" data-node-from="2" stroke-width="0.5" data-node-to="3"&gt;&lt;/line&gt;&lt;line x1="25.167224884033203" y1="22.64841079711914" x2="36.87290954589844" y2="47.732025146484375" stroke="black" data-type="edge" data-node-from="1" stroke-width="0.5" data-node-to="2"&gt;&lt;/line&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="25.167224884033203" cy="22.64841079711914" stroke-width="0.1" data-type="element node" data-element-id="1" data-node-id="1" data-label-name="1"&gt;&lt;/circle&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="36.87290954589844" cy="47.732025146484375" stroke-width="0.1" data-type="element node" data-element-id="2" data-node-id="2" data-label-name="2"&gt;&lt;/circle&gt;&lt;circle r="1.5" stroke="black" fill="#b3b3b3" cx="86.87290954589844" cy="10.775501251220703" stroke-width="0.1" data-type="element node" data-element-id="3" data-node-id="3" data-label-name="3"&gt;&lt;/circle&gt;</svg>
            </label>
            <label>
              <name>1</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
            <label>
              <name>2</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
            <label>
              <name>3</name>
              <type>points</type>
              <attributes>
              </attributes>
              <parent>s1</parent>
            </label>
          </labels>
          <segments>
            <segment>
              <id>5</id>
              <start>0</start>
              <stop>4619</stop>
              <url>http://localhost:8080/?id=5</url>
            </segment>
          </segments>
          <owner>
            <username>admin</username>
            <email></email>
          </owner>
          <original_size>
            <width>640</width>
            <height>480</height>
          </original_size>
        </task>
        <dumped>2018-09-25 16:05:07.134046+03:00</dumped>
      </meta>
      <track id="0" label="car">
        <polygon frame="0" points="324.79,213.16;323.74,227.90;347.42,237.37;371.11,217.37;350.05,190.00;318.47,191.58" outside="0" occluded="0" keyframe="1">
        </polygon>
        <polygon frame="1" points="324.79,213.16;323.74,227.90;347.42,237.37;371.11,217.37;350.05,190.00;318.47,191.58" outside="1" occluded="0" keyframe="1">
        </polygon>
        <polygon frame="6" points="305.32,237.90;312.16,207.90;352.69,206.32;355.32,233.16;331.11,254.74" outside="0" occluded="0" keyframe="1">
        </polygon>
        <polygon frame="7" points="305.32,237.90;312.16,207.90;352.69,206.32;355.32,233.16;331.11,254.74" outside="1" occluded="0" keyframe="1">
        </polygon>
        <polygon frame="13" points="313.74,233.16;331.11,220.00;359.53,243.16;333.21,283.16;287.95,274.74" outside="0" occluded="0" keyframe="1">
        </polygon>
        <polygon frame="14" points="313.74,233.16;331.11,220.00;359.53,243.16;333.21,283.16;287.95,274.74" outside="1" occluded="0" keyframe="1">
        </polygon>
      </track>
      <track id="1" label="s1" source="manual">
        <skeleton frame="0" keyframe="1" z_order="0">
          <points label="1" outside="0" occluded="0" keyframe="1" points="112.07,258.59">
          </points>
          <points label="2" outside="0" occluded="0" keyframe="1" points="127.87,333.23">
          </points>
          <points label="3" outside="0" occluded="0" keyframe="1" points="195.37,223.27">
          </points>
        </skeleton>
        <skeleton frame="1" keyframe="1" z_order="0">
          <points label="1" outside="1" occluded="0" keyframe="1" points="112.07,258.59">
          </points>
          <points label="2" outside="1" occluded="0" keyframe="1" points="127.87,333.23">
          </points>
          <points label="3" outside="1" occluded="0" keyframe="1" points="195.37,223.27">
          </points>
        </skeleton>
        <skeleton frame="6" keyframe="1" z_order="0">
          <points label="1" outside="0" occluded="0" keyframe="0" points="120.07,270.59">
          </points>
          <points label="2" outside="0" occluded="0" keyframe="0" points="140.87,350.23">
          </points>
          <points label="3" outside="0" occluded="0" keyframe="0" points="210.37,260.27">
          </points>
        </skeleton>
        <skeleton frame="7" keyframe="1" z_order="0">
          <points label="1" outside="1" occluded="0" keyframe="1" points="120.07,270.59">
          </points>
          <points label="2" outside="1" occluded="0" keyframe="1" points="140.87,350.23">
          </points>
          <points label="3" outside="1" occluded="0" keyframe="1" points="210.37,260.27">
          </points>
        </skeleton>
        <skeleton frame="13" keyframe="0" z_order="0">
          <points label="1" outside="0" occluded="0" keyframe="0" points="112.07,258.59">
          </points>
          <points label="2" outside="0" occluded="0" keyframe="0" points="127.87,333.23">
          </points>
          <points label="3" outside="0" occluded="0" keyframe="0" points="195.37,223.27">
          </points>
        </skeleton>
        <skeleton frame="14" keyframe="1" z_order="0">
          <points label="1" outside="1" occluded="0" keyframe="1" points="112.07,258.59">
          </points>
          <points label="2" outside="1" occluded="0" keyframe="1" points="127.87,333.23">
          </points>
          <points label="3" outside="1" occluded="0" keyframe="1" points="195.37,223.27">
          </points>
        </skeleton>
      </track>
    </annotations>
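
A dump in this format can be read with a standard XML parser. The sketch below uses Python's xml.etree.ElementTree on a shortened copy of track 0 from the example and lists the keyframes on which the object is visible (outside="0"). The function name visible_keyframes is illustrative, and skeleton tracks are skipped to keep the example short.

```python
import xml.etree.ElementTree as ET

# A shortened copy of track 0 from the example above.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <track id="0" label="car">
    <polygon frame="0" points="324.79,213.16;323.74,227.90;347.42,237.37" outside="0" occluded="0" keyframe="1">
    </polygon>
    <polygon frame="1" points="324.79,213.16;323.74,227.90;347.42,237.37" outside="1" occluded="0" keyframe="1">
    </polygon>
    <polygon frame="6" points="305.32,237.90;312.16,207.90;352.69,206.32" outside="0" occluded="0" keyframe="1">
    </polygon>
  </track>
</annotations>
"""

# Shape elements that may appear directly inside a <track>.
SHAPE_TAGS = {"box", "polygon", "polyline", "points", "mask"}


def visible_keyframes(xml_bytes):
    """Return (track id, label, frame, shape type) for visible keyframes."""
    root = ET.fromstring(xml_bytes)
    result = []
    for track in root.iter("track"):
        for shape in track:
            if shape.tag not in SHAPE_TAGS:
                continue  # skeletons nest their points one level deeper
            if shape.get("keyframe") == "1" and shape.get("outside") == "0":
                result.append((int(track.get("id")), track.get("label"),
                               int(shape.get("frame")), shape.tag))
    return result


print(visible_keyframes(SAMPLE))
```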
    

    5.2.28 - Shortcuts

    List of available mouse and keyboard shortcuts.

    Many UI elements have shortcut hints. Hover the pointer over an element to see its hint.

    Shortcut Common
    Main functions
    F1 Open/hide the list of available shortcuts
    F2 Go to the settings page or go back
    Ctrl+S Save the job
    Ctrl+Z Undo the latest action related to objects
    Ctrl+Shift+Z or Ctrl+Y Redo the undone action
    Hold Mouse Wheel To move an image frame (for example, while drawing)
    Player
    F Go to the next frame
    D Go to the previous frame
    V Go forward with a step
    C Go backward with a step
    Right Search for the next frame that satisfies the filters, or the next frame that contains any objects
    Left Search for the previous frame that satisfies the filters, or the previous frame that contains any objects
    Space Start/stop automatic frame changing
    ` or ~ Focus on the element to change the current frame
    Modes
    N Repeat the latest procedure of drawing with the same parameters
    M Activate or deactivate the shape merging mode
    Alt+M Activate or deactivate the shape splitting mode
    G Activate or deactivate the shape grouping mode
    Shift+G Reset group for selected shapes (in group mode)
    Esc Cancel any active canvas mode
    Image operations
    Ctrl+R Change image angle (add 90 degrees)
    Ctrl+Shift+R Change image angle (subtract 90 degrees)
    Operations with objects
    Ctrl Switch automatic bordering for polygons and polylines during drawing/editing
    Hold Ctrl Fix the active shape so that it stays active
    Alt+Click on point Delete a point (used when hovering over a point of a polygon, polyline, or points)
    Shift+Click on point Edit a shape (used when hovering over a point of a polygon, polyline, or points)
    Right-Click on shape Display the object element from the objects sidebar
    T+L Change locked state for all objects in the sidebar
    L Change locked state for an active object
    T+H Change hidden state for objects in the sidebar
    H Change hidden state for an active object
    Q or / Change occluded property for an active object
    Del or Shift+Del Delete an active object. Use shift to force delete of locked objects
    - or _ Put an active object “farther” from the user (decrease z axis value)
    + or = Put an active object “closer” to the user (increase z axis value)
    Ctrl+C Copy shape to CVAT internal clipboard
    Ctrl+V Paste a shape from internal CVAT clipboard
    Hold Ctrl while pasting Paste the shape from the buffer multiple times
    Ctrl+B Make a copy of the object on the following frames
    Ctrl+(0..9) Changes a label for an activated object or for the next drawn object if no objects are activated
    Operations available only for tracks
    K Change keyframe property for an active track
    O Change outside property for an active track
    R Go to the next keyframe of an active track
    E Go to the previous keyframe of an active track
    Attribute annotation mode
    Up Arrow Go to the next attribute (up)
    Down Arrow Go to the next attribute (down)
    Tab Go to the next annotated object in current frame
    Shift+Tab Go to the previous annotated object in current frame
    <number> Assign a corresponding value to the current attribute
    Standard 3d mode
    Shift+Up Arrow Increases camera roll angle
    Shift+Down Arrow Decreases camera roll angle
    Shift+Left Arrow Decreases camera pitch angle
    Shift+Right Arrow Increases camera pitch angle
    Alt+O Move the camera up
    Alt+U Move the camera down
    Alt+J Move the camera left
    Alt+L Move the camera right
    Alt+I Performs zoom in
    Alt+K Performs zoom out

    5.2.29 - Filter

    Guide to using the Filter feature in CVAT.

    There are two reasons to use this feature:

    1. When you use a filter, objects that don’t match the filter are hidden.
    2. Fast navigation between frames that contain an object of interest. Use the Left Arrow / Right Arrow keys for this purpose, or customize the UI buttons by right-clicking them and selecting switching by filter. If no objects correspond to the filter, you will go to the previous / next frame that contains any annotated objects.

    To apply filters you need to click on the button on the top panel.

    Create a filter

    It will open a window for filter input. Here you will find two buttons: Add rule and Add group.

    Rules

    The Add rule button adds a rule for objects display. A rule may use the following properties:

    Supported properties for annotation

    Properties | Supported values | Description
    Label | all the label names that are in the task | label name
    Type | shape, track or tag | type of object
    Shape | all shape types | type of shape
    Occluded | true or false | occluded (read more)
    Width | number of px or field | shape width
    Height | number of px or field | shape height
    ServerID | number or field | ID of the object on the server (you can find it by forming a link to the object through the Action menu)
    ObjectID | number or field | ID of the object in your client (indicated on the objects sidebar)
    Attributes | some other fields, including attributes with a similar type or a specific attribute value | any fields specified by a label

    Supported operators for properties

    == - Equal; != - Not equal; > - Greater than; >= - Greater than or equal; < - Less than; <= - Less than or equal;

    Any in; Not in - these operators allow you to set multiple values in one rule;

    Is empty; Is not empty - these operators don’t require a value.

    Between; Not between - these operators allow you to choose a range between two values.

    Like - this operator indicates that the property must contain the specified value.

    Starts with; Ends with - filter by the beginning or the end of the value.

    Some properties support two types of values to choose from: a specific value or a field (see the Supported values column above).

    You can add multiple rules. To do so, click the Add rule button and set another rule. Once you’ve set a new rule, you’ll be able to choose which operator connects the rules: And or Or.

    All subsequent rules will be joined by the chosen operator. Click Submit to apply the filter. If you want multiple rules to be connected by different operators, use groups.

    Groups

    To add a group, click the Add group button. Inside the group you can create rules or groups.

    If there is more than one rule in the group, they can be connected by And or Or operators. A group works like a separate rule outside the group and is joined by the operator outside the group. You can create groups within other groups; to do so, click the Add group button within the group.

    You can move rules and groups. To move the rule or group, drag it by the button. To remove the rule or group, click on the Delete button.

    If you activate the Not button, objects that don’t match the group will be filtered out. Click Submit to apply the filter. The Cancel button undoes the filter. The Clear filter button removes the filter.

    Once applied, a filter automatically appears in the Recent used list. The maximum length of the list is 10.
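
The way rules and nested groups combine can be sketched in a few lines of Python. This is only an illustration of the And / Or logic described above, not the way CVAT stores or evaluates filters. The dictionary layout (join, rules, property, op, value) and the function name matches are assumptions made for this example.

```python
import operator

# Comparison operators from the list above, mapped to Python functions.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(node, obj):
    """Evaluate a rule or a group against one annotation object (a dict)."""
    if "rules" in node:  # a group: combine its children with And / Or
        results = [matches(child, obj) for child in node["rules"]]
        return all(results) if node["join"] == "and" else any(results)
    # a single rule: property, operator, value
    return OPERATORS[node["op"]](obj[node["property"]], node["value"])


# label == "car" And (width > 100 Or occluded == True)
filter_tree = {
    "join": "and",
    "rules": [
        {"property": "label", "op": "==", "value": "car"},
        {"join": "or", "rules": [
            {"property": "width", "op": ">", "value": 100},
            {"property": "occluded", "op": "==", "value": True},
        ]},
    ],
}

objects = [
    {"label": "car", "width": 150, "occluded": False},
    {"label": "car", "width": 40, "occluded": False},
    {"label": "person", "width": 200, "occluded": True},
]

# Objects that do not match the filter are hidden.
visible = [obj for obj in objects if matches(filter_tree, obj)]
print(visible)
```

The inner group behaves like a single rule from the point of view of the outer And, which is why mixing operators requires a group.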


    Sort and filter lists

    On the projects, task list on the project page, tasks, jobs, and cloud storage pages, you can use sorting and filters.

    The applied filter and sorting will be displayed in the URL of your browser. Thus, you can share the page with the sorting and filter applied.

    Sort by

    You can sort by the following parameters:

    • Jobs list: ID, assignee, updated date, stage, state, task ID, project ID, task name, project name.
    • Tasks list or tasks list on project page: ID, owner, status, assignee, updated date, subset, mode, dimension, project ID, name, project name.
    • Projects list: ID, assignee, owner, status, name, updated date.
    • Cloud storages list: ID, provider type, updated date, display name, resource, credentials, owner, description.

    To apply sorting, drag the parameter to the top area above the horizontal bar. The parameters below the horizontal line will not be applied. By moving the parameters, you can change their priority: sorting is applied first according to the parameters that are higher in the list.

    Pressing the Sort button switches Ascending sort/Descending sort.
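
Sorting by several parameters with a priority order is equivalent to sorting by a composite key. The Python sketch below illustrates the idea on made-up job records; the field values and the sort_items helper are assumptions for this example and do not reflect how CVAT implements sorting.

```python
# Made-up job records for illustration only.
jobs = [
    {"id": 3, "stage": "annotation", "assignee": "bob"},
    {"id": 1, "stage": "validation", "assignee": "alice"},
    {"id": 2, "stage": "annotation", "assignee": "alice"},
]


def sort_items(items, parameters, descending=False):
    """Sort by the given parameters; earlier parameters have higher priority."""
    return sorted(
        items,
        key=lambda item: tuple(item[name] for name in parameters),
        reverse=descending,
    )


# "stage" is above "assignee", so it is compared first.
ascending_ids = [job["id"] for job in sort_items(jobs, ["stage", "assignee"])]
descending_ids = [job["id"] for job in sort_items(jobs, ["stage", "assignee"], descending=True)]
print(ascending_ids, descending_ids)
```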

    Quick filters

    Quick Filters contain several frequently used filters:

    • Assigned to me - show only those projects, tasks or jobs that are assigned to you.
    • Owned by me - show only those projects or tasks that are owned by you.
    • Not completed - show only those projects, tasks or jobs that have a status other than completed.
    • AWS storages - show only AWS cloud storages
    • Azure storages - show only Azure cloud storages
    • Google cloud storages - show only Google cloud storages

    Date and time selection

    When creating a Last updated rule, you can select the date and time by using the selection window.

    You can select the year and month using the arrows or by clicking on the year and month. To select a day, click on it in the calendar. To select the time, choose the hours and minutes in the scrolling list. You can also select the current date and time by clicking the Now button. To apply, click Ok.

    5.2.30 - Review

    Guide to using the Review mode for task validation.

    A special mode to check the annotation allows you to point to an object or area in the frame containing an error. Review mode is not available in 3D tasks.

    Review

    To conduct a review, change the stage to validation for the desired job on the task page and assign a user who will conduct the check. The job will now open in review mode. You can also switch to the Review mode using the UI switcher on the top panel.

    Review mode is a UI mode with a special Issue tool, which you can use to identify objects or areas in the frame and describe the issue.

    • To do this, first click Open an issue icon on the controls sidebar:

    • Then click on a place in the frame to highlight the place or highlight the area by holding the left mouse button and describe the issue. To select an object, right-click on it and select Open an issue or select one of several quick issues. The object or area will be shaded in red.

    • The created issue will appear in the workspace and in the Issues tab on the objects sidebar.

    • Once all the issues are marked, save the annotation, open the menu and select job state rejected or completed.

    After the review, other users will be able to see the issues, comment on each issue and change the status of the issue to Resolved.

    After the issues are fixed, select Finish the job from the menu to finish the task. Alternatively, you can switch the stage to acceptance on the task page.

    Resolve issues

    After review, you may see the issues in the Issues tab in the object sidebar.

    • You can use the arrows on the Issues tab to navigate the frames that contain issues.

    • In the workspace, you can click on an issue to send a comment on it or, if the issue is resolved, change its status to Resolved. You can remove the issue by clicking Remove (if your account has the appropriate permissions).

    • If several issues were created in one place, you can access them by hovering over an issue and scrolling the mouse wheel.

    If the issue is resolved, you can reopen the issue by clicking the Reopen button.

    5.2.31 - Contextual images

    Contextual images of the task

    Contextual images are additional images that provide context or additional information related to the primary image.

    Use them to add extra context about the object and improve the accuracy of annotation.

    Contextual images are available for 2D and 3D tasks.

    Folder structure

    To add contextual images to the task, you need to organize the images folder.

    Before uploading the archive to CVAT, do the following:

    1. In the folder with the images for annotation, create a folder: related_images.
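    2. In related_images, add a subfolder named after the primary image to which it should be linked, with the dot before the extension replaced by an underscore (for example, image_1_to_be_annotated_jpg for image_1_to_be_annotated.jpg).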
    3. Place the contextual image(s) within the subfolder created in step 2.
    4. Add folder to the archive.
    5. Create task.
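
The folder preparation steps above can be scripted. The following Python sketch derives the subfolder name the same way as the example structure in this section (every dot in the file name is replaced by an underscore, which is an assumption for names that contain more than one dot) and copies a context image into it. The helper names context_dir_for and link_context_image are hypothetical and are not part of CVAT.

```python
import shutil
import tempfile
from pathlib import Path


def context_dir_for(image_path):
    """Return the related_images subfolder linked to a primary image.

    image_1.jpg -> related_images/image_1_jpg (next to the primary image).
    """
    image_path = Path(image_path)
    folder_name = image_path.name.replace(".", "_")
    return image_path.parent / "related_images" / folder_name


def link_context_image(primary_image, context_image):
    """Copy a context image into the subfolder of the primary image."""
    target_dir = context_dir_for(primary_image)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(context_image, target_dir))


# Demonstration in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "root_directory"
    root.mkdir()
    primary = root / "image_1_to_be_annotated.jpg"
    primary.write_bytes(b"")
    context = Path(tmp) / "context_image_for_image_1.jpg"
    context.write_bytes(b"")
    copied = link_context_image(primary, context)
    print(copied.relative_to(root).as_posix())
```

After running such a script on real data, add the resulting folder to the archive and create the task as described in steps 4 and 5.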

    Data format

    Example file structure for 2D and 3D tasks:

    2D task:

      root_directory
        image_1_to_be_annotated.jpg
        image_2_to_be_annotated.jpg
        related_images/
          image_1_to_be_annotated_jpg/
            context_image_for_image_1.jpg
          image_2_to_be_annotated_jpg/
            context_image_for_image_2.jpg
        subdirectory_example/
          image_3_to_be_annotated.jpg
          related_images/
            image_3_to_be_annotated_jpg/
              context_image_for_image_3.jpg

    3D task, related_images folder:

      root_directory
        image_1_to_be_annotated.pcd
        image_2_to_be_annotated.pcd
        related_images/
          image_1_to_be_annotated_pcd/
            context_image_for_image_1.jpg
          image_2_to_be_annotated_pcd/
            context_image_for_image_2.jpg

    3D task, image placed next to the .pcd file (3D option 3):

      /any_directory
          pointcloud.pcd
          pointcloud.jpg
      /any_other_directory
          /any_subdirectory
              pointcloud.pcd
              pointcloud.png

    3D task, KITTI:

      /image_00
          /data
              /0000000000.png
              /0000000001.png
              /0000000002.png
              /0000000003.png
      /image_01
          /data
              /0000000000.png
              /0000000001.png
              /0000000002.png
              /0000000003.png
      /image_02
          /data
              /0000000000.png
              /0000000001.png
              /0000000002.png
              /0000000003.png
      /image_N
          /data
              /0000000000.png
              /0000000001.png
              /0000000002.png
              /0000000003.png
      /velodyne_points
          /data
              /0000000000.bin
              /0000000001.bin
              /0000000002.bin
              /0000000003.bin

    • For KITTI: image_00, image_01, image_02, image_N, (where N is any number <= 12) are context images.
    • For 3D option 3: a regular image file placed near a .pcd file with the same name is considered to be a context image.

    For more general information about 3D data formats, see 3D data formats.

    Contextual images

    The maximum number of contextual images is twelve.

    By default they will be positioned on the right side of the main image.

    Note: By default, only three contextual images will be visible.

    When you add contextual images to the set, a small toolbar appears at the top of the screen, with the following elements:

    Element Description
    Fit views. Click to restore the layout to its original appearance.

    If you’ve expanded any images in the layout, they will be returned to their original size.

    This won’t affect the number of context images on the screen.

    Add new image. Click to add a context image to the layout.
    Reload layout. Click to reload the layout to the default view.

    Note that this action can change the number of context images, resetting them back to three.

    Each context image has the following elements:

    Element Description
    1 Full screen. Click to expand the contextual image to full-screen mode.

    Click again to revert contextual image to windowed mode.

    2 Move contextual image. Hold and move contextual image to the other place on the screen.

    3 Name. Unique contextual image name.
    4 Select contextual image. Click to open a horizontal list view of all available contextual images.

    Click on one to select.

    5 Close. Click to remove the image from the contextual images menu.
    6 Extend. Hold and pull to extend the image.

    5.2.32 - Shape grouping

    Grouping multiple shapes during annotation.

    This feature allows us to group several shapes.

    You may use the Group Shapes button or shortcuts:

    • G — start selection / end selection in group mode
    • Esc — close group mode
    • Shift+G — reset group for selected shapes

    You may select shapes by clicking on them or by selecting an area.

    Grouped shapes will have a group_id field in the dumped annotation.

    You may also switch color distribution from an instance (default) to a group. To do so, select the Color By Group checkbox.

    Shapes that don’t have a group_id will be highlighted in white.
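
Once annotations are dumped, grouped shapes can be collected by their group_id. The Python sketch below works on a hypothetical fragment: it assumes that group_id is written as an attribute of each shape element and that ungrouped shapes have no such attribute. Check an actual dump before relying on either assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment written for this example; not copied from a real dump.
SAMPLE = """<annotations>
  <image id="0" name="frame_000000.jpg" width="640" height="480">
    <box label="car" xtl="10" ytl="10" xbr="50" ybr="40" occluded="0" group_id="1"></box>
    <box label="wheel" xtl="12" ytl="30" xbr="20" ybr="40" occluded="0" group_id="1"></box>
    <box label="person" xtl="80" ytl="5" xbr="95" ybr="60" occluded="0"></box>
  </image>
</annotations>"""


def labels_by_group(xml_text):
    """Map each group_id (None for ungrouped shapes) to the labels in it."""
    groups = defaultdict(list)
    for image in ET.fromstring(xml_text).iter("image"):
        for shape in image:
            groups[shape.get("group_id")].append(shape.get("label"))
    return dict(groups)


print(labels_by_group(SAMPLE))
```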

    5.2.33 - Dataset Manifest

    Overview

    When we create a new task in CVAT, we need to specify where to get the input data from. CVAT supports different data sources, including local file uploads, a mounted file share on the server, cloud storages and remote URLs. In some cases CVAT needs extra information about the input data. This information can be provided in dataset manifest files. They are mainly used when working with cloud storages to reduce the amount of network traffic used and speed up the task creation process. However, they can also be used in other cases, which are explained below.

    A dataset manifest file is a text file in the JSONL format. These files can be created automatically with the special command-line tool, or manually, following the manifest file format specification.

    How and when to use manifest files

    Manifest files can be used in the following cases:

    • A video file or a set of images is used as the data source and the caching mode is enabled. Read more
    • The data is located in a cloud storage. Read more
    • The predefined file sorting method is specified. Read more

    The predefined sorting method

    Independently of the file source being used, when the predefined sorting method is selected in the task configuration, the source files will be ordered according to the .jsonl manifest file, if it is found in the input list of files. If a manifest is not found, the order provided in the input file list is used.

    For image archives (e.g. .zip), a manifest file (*.jsonl) is required when using the predefined file ordering. The manifest file must be provided next to the archive in the input list of files; it must not be inside the archive.

    If there are multiple manifest files in the input file list, an error will be raised.
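
The ordering rules above can be summarized in a short Python sketch. It is an illustration of the described behavior, not CVAT's implementation: the function name order_files is hypothetical, manifest contents are passed in as text instead of being read from disk, and files that are listed in only one of the two places are dropped, which is an assumption.

```python
import json


def order_files(input_files, manifest_texts):
    """Order input files as the predefined sorting method is described above.

    input_files: file names in the order they were provided.
    manifest_texts: mapping of manifest file name -> its JSONL content.
    """
    manifests = [name for name in input_files if name.endswith(".jsonl")]
    data_files = [name for name in input_files if not name.endswith(".jsonl")]
    if len(manifests) > 1:
        raise ValueError("multiple manifest files in the input file list")
    if not manifests:
        return data_files  # no manifest: keep the provided order
    ordered = []
    for line in manifest_texts[manifests[0]].splitlines():
        entry = json.loads(line)
        if "name" in entry:  # skip the version and type lines
            ordered.append(entry["name"] + entry["extension"])
    return [name for name in ordered if name in data_files]


manifest = "\n".join([
    '{"version":"1.0"}',
    '{"type":"images"}',
    '{"name":"image2","extension":".jpg","width":183,"height":275}',
    '{"name":"image1","extension":".jpg","width":720,"height":405}',
])
result = order_files(["image1.jpg", "image2.jpg", "manifest.jsonl"],
                     {"manifest.jsonl": manifest})
print(result)
```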

    How to generate manifest files

    CVAT provides a dedicated Python tool to generate manifest files. The source code can be found here.

    Using the tool is the recommended way to create manifest files for your data. The data must be available locally to the tool to generate a manifest.

    Usage

    usage: create.py [-h] [--force] [--output-dir .] source
    
    positional arguments:
      source                Source paths
    
    optional arguments:
      -h, --help            show this help message and exit
      --force               Use this flag to prepare the manifest file for video data
                            if by default the video does not meet the requirements
                            and a manifest file is not prepared
      --output-dir OUTPUT_DIR
                            Directory where the manifest file will be saved
    

    Use the script from a Docker image

    This is the recommended way to use the tool.

    The script can be used from the cvat/server image:

    docker run -it --rm -u "$(id -u)":"$(id -g)" \
      -v "${PWD}":"/local" \
      --entrypoint python3 \
      cvat/server \
      utils/dataset_manifest/create.py --output-dir /local /local/<path/to/sources>
    

    Make sure to adapt the command to your file locations.

    Use the script directly

    Ubuntu 20.04

    Install dependencies:

    # General
    sudo apt-get update && sudo apt-get --no-install-recommends install -y \
        python3-dev python3-pip python3-venv pkg-config
    
    # Library components
    sudo apt-get install --no-install-recommends -y \
        libavformat-dev libavcodec-dev libavdevice-dev \
        libavutil-dev libswscale-dev libswresample-dev libavfilter-dev
    

    Create an environment and install the necessary python modules:

    python3 -m venv .env
    . .env/bin/activate
    pip install -U pip
    pip install -r utils/dataset_manifest/requirements.in
    

    Please note that when the tool is used with video this way, the results may differ from what the server would decode. This is related to the ffmpeg library version. For this reason, using the Docker-based version of the tool is recommended.

    Examples

    Create a dataset manifest in the current directory with video which contains enough keyframes:

    python utils/dataset_manifest/create.py ~/Documents/video.mp4
    

    Create a dataset manifest with video which does not contain enough keyframes:

    python utils/dataset_manifest/create.py --force --output-dir ~/Documents ~/Documents/video.mp4
    

    Create a dataset manifest with images:

    python utils/dataset_manifest/create.py --output-dir ~/Documents ~/Documents/images/
    

    Create a dataset manifest with a pattern (*, ?, and [] may be used):

    python utils/dataset_manifest/create.py --output-dir ~/Documents "/home/${USER}/Documents/**/image*.jpeg"
    

    Create a dataset manifest using Docker image:

    docker run -it --rm -u "$(id -u)":"$(id -g)" \
      -v ~/Documents/data/:${HOME}/manifest/:rw \
      --entrypoint python3 \
      cvat/server \
      utils/dataset_manifest/create.py --output-dir ~/manifest/ ~/manifest/images/
    

    File format

    The dataset manifest files are text files in JSONL format. These files have 2 sub-formats: one for video, and one for images and 3D data.

    Each top-level entry enclosed in curly braces must occupy exactly one line; empty lines are not allowed. The formatting in the descriptions below is only for demonstration.

    Dataset manifest for video

    The file describes a single video.

    • pts - time at which the frame should be shown to the user
    • checksum - md5 hash sum for the specific image/frame decoded

    { "version": <string, version id> }
    { "type": "video" }
    { "properties": {
      "name": <string, filename>,
      "resolution": [<int, width>, <int, height>],
      "length": <int, frame count>
    }}
    {
      "number": <int, frame number>,
      "pts": <int, frame pts>,
      "checksum": <string, md5 frame hash>
    } (repeatable)
    

    Dataset manifest for images and other data types

    The file describes an ordered set of images and 3d point clouds.

    • name - file basename and leading directories from the dataset root
    • checksum - md5 hash sum for the specific image/frame decoded

    { "version": <string, version id> }
    { "type": "images" }
    {
      "name": <string, image filename>,
      "extension": <string, . + file extension>,
      "width": <int, width>,
      "height": <int, height>,
      "meta": <dict, optional>,
      "checksum": <string, md5 hash, optional>
    } (repeatable)
    

    Example files

    Manifest for a video

    {"version":"1.0"}
    {"type":"video"}
    {"properties":{"name":"video.mp4","resolution":[1280,720],"length":778}}
    {"number":0,"pts":0,"checksum":"17bb40d76887b56fe8213c6fded3d540"}
    {"number":135,"pts":486000,"checksum":"9da9b4d42c1206d71bf17a7070a05847"}
    {"number":270,"pts":972000,"checksum":"a1c3a61814f9b58b00a795fa18bb6d3e"}
    {"number":405,"pts":1458000,"checksum":"18c0803b3cc1aa62ac75b112439d2b62"}
    {"number":540,"pts":1944000,"checksum":"4551ecea0f80e95a6c32c32e70cac59e"}
    {"number":675,"pts":2430000,"checksum":"0e72faf67e5218c70b506445ac91cdd7"}
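
    As a rough illustration, a video manifest like the one above can be read with a few lines of Python. This is a sketch only: the helper names are hypothetical, and looking up the closest listed frame at or before a requested frame number is just one assumed use of the frame entries.

```python
import bisect
import json

MANIFEST = """\
{"version":"1.0"}
{"type":"video"}
{"properties":{"name":"video.mp4","resolution":[1280,720],"length":778}}
{"number":0,"pts":0,"checksum":"17bb40d76887b56fe8213c6fded3d540"}
{"number":135,"pts":486000,"checksum":"9da9b4d42c1206d71bf17a7070a05847"}
{"number":270,"pts":972000,"checksum":"a1c3a61814f9b58b00a795fa18bb6d3e"}
"""


def load_video_manifest(text):
    """Split a video manifest into version, properties, and frame entries."""
    version, kind, props, *frames = [json.loads(line) for line in text.splitlines()]
    if kind["type"] != "video":
        raise ValueError("not a video manifest")
    return version["version"], props["properties"], frames


def nearest_listed_frame(frames, target):
    """Return the last listed entry whose frame number is <= target."""
    numbers = [frame["number"] for frame in frames]
    index = bisect.bisect_right(numbers, target) - 1
    if index < 0:
        raise ValueError("target precedes the first listed frame")
    return frames[index]


version, properties, frames = load_video_manifest(MANIFEST)
print(version, properties["name"], properties["length"])
print(nearest_listed_frame(frames, 200))
```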
    

    Manifest for a dataset with images

    {"version":"1.0"}
    {"type":"images"}
    {"name":"image1","extension":".jpg","width":720,"height":405,"meta":{"related_images":[]},"checksum":"548918ec4b56132a5cff1d4acabe9947"}
    {"name":"image2","extension":".jpg","width":183,"height":275,"meta":{"related_images":[]},"checksum":"4b4eefd03cc6a45c1c068b98477fb639"}
    {"name":"image3","extension":".jpg","width":301,"height":167,"meta":{"related_images":[]},"checksum":"0e454a6f4a13d56c82890c98be063663"}
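
    A manifest for images can be read in the same way. The sketch below assumes that the relative path of a file is the name followed by the extension; the function name list_images and the second entry (with a leading directory and without the optional fields) are hypothetical and only serve the illustration.

```python
import json

MANIFEST = """\
{"version":"1.0"}
{"type":"images"}
{"name":"image1","extension":".jpg","width":720,"height":405,"meta":{"related_images":[]},"checksum":"548918ec4b56132a5cff1d4acabe9947"}
{"name":"subdir/image2","extension":".jpg","width":183,"height":275}
"""


def list_images(text):
    """Return (relative path, width, height) for every image entry."""
    images = []
    for line in text.splitlines():
        entry = json.loads(line)
        if "name" not in entry:
            continue  # skip the version and type header lines
        path = entry["name"] + entry["extension"]
        images.append((path, entry["width"], entry["height"]))
    return images


print(list_images(MANIFEST))
```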
    

    5.2.34 - Data preparation on the fly

    Description

    On-the-fly data processing works as follows: when a task is created, only the minimum necessary meta information is collected. This meta information is later used to create the necessary chunks when a request arrives from a client.

    Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.

    When a request is received from a client, the required chunk is searched for in the cache. If the chunk does not exist yet, it is created using prepared meta information and then put into the cache.
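
    The lookup described above can be sketched as a small bounded cache. This is an illustration under stated assumptions, not CVAT's implementation: the class name ChunkCache and the make_chunk callback are hypothetical, and least-recently-used eviction stands in for the "less popular items" policy.

```python
from collections import OrderedDict


class ChunkCache:
    """Bounded cache that builds missing chunks on demand."""

    def __init__(self, max_items, make_chunk):
        self._max_items = max_items
        self._make_chunk = make_chunk  # callable: chunk_id -> chunk data
        self._items = OrderedDict()

    def get(self, chunk_id):
        if chunk_id in self._items:
            self._items.move_to_end(chunk_id)  # mark as recently used
            return self._items[chunk_id]
        # Cache miss: build the chunk from the prepared meta information.
        chunk = self._make_chunk(chunk_id)
        self._items[chunk_id] = chunk
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used chunk
        return chunk


created = []


def make_chunk(chunk_id):
    created.append(chunk_id)  # record every chunk that had to be built
    return f"chunk-{chunk_id}"


cache = ChunkCache(max_items=2, make_chunk=make_chunk)
for chunk_id in (0, 1, 0, 2, 0):
    cache.get(chunk_id)
print(created)
```

    Only the first access to a chunk pays the creation cost, which matches the trade-off described in this section: faster task creation in exchange for a slower first access.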

    This method of working with data makes it possible to:

    • reduce the task creation time.
    • store data in a cache of limited size with a policy of evicting less popular items.

    Unfortunately, this method has several drawbacks:

    • The first access to the data will take more time.
    • It will not work for some videos, even if they have a valid manifest file. If there are not enough keyframes in the video for smooth video decoding, the task data chunks will be created with the default method, i.e. during the task creation.
    • If the data has not been cached yet, and is not reachable during the access time, it cannot be retrieved.

    How to use

    To enable or disable this feature for a new task, use the Use Cache toggle in the task configuration.

    Uploading a manifest with data

    When creating a task, you can upload a manifest.jsonl file along with the video or dataset with images. You can see how to prepare it here.

    5.2.35 - Serverless tutorial

    Introduction

    Leveraging the power of computers to solve daily routine problems, fix mistakes, and find information has become second nature. It is therefore natural to use computing power in annotating datasets. There are multiple publicly available DL models for classification, object detection, and semantic segmentation which can be used for data annotation. Whilst some of these publicly available DL models can be found on CVAT, it is relatively simple to integrate your privately trained ML/DL model into CVAT.

    There is no silver bullet, however: publicly available DL models cannot be used to detect niche or specific objects on which they were not trained. Annotation requirements can also be strict, so automatically annotated objects sometimes cannot be accepted as is, and it is easier to annotate them from scratch. Even with these limitations in mind, a DL solution that can perfectly annotate 50% of your data cuts manual annotation in half.

    Since we know DL models can help us to annotate faster, how then do we use them? In CVAT all such DL models are implemented as serverless functions using the Nuclio serverless platform. There are multiple implemented functions that can be found in the serverless directory such as Mask RCNN, Faster RCNN, SiamMask, Inside Outside Guidance, Deep Extreme Cut, etc. Follow the installation guide to build and deploy these serverless functions. See the user guide to understand how to use these functions in the UI to automatically annotate data.

    What is a serverless function and why is it used for automatic annotation in CVAT? Let’s assume that you have a DL model and want to use it for AI-assisted annotation. The naive approach is to implement a Python script which uses the DL model to prepare a file with annotations in a public format like MS COCO or Pascal VOC. After that you can upload the annotation file into CVAT. This works, but it is not user-friendly. How can you make CVAT run the script for you?

    You can pack the script with your DL model into a container which provides a standard interface for interacting with it. One way to do that is to use the function as a service approach. Your script becomes a function inside cloud infrastructure which can be called over HTTP. The Nuclio serverless platform helps us to implement and manage such functions.
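
    To make "called over HTTP" concrete, the sketch below builds (but does not send) a POST request to a function. Everything specific in it is an assumption for illustration: the helper name, the localhost address, the placeholder port, and the JSON body carrying a base64-encoded image under an "image" key.

```python
import base64
import json
import urllib.request


def build_function_request(port, image_bytes, host="localhost"):
    """Build (but do not send) an HTTP POST request for a serverless function."""
    # Assumed payload shape: a JSON object with a base64-encoded image.
    body = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url=f"http://{host}:{port}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_function_request(49155, b"raw image bytes")
print(request.get_method(), request.full_url)

# To actually call a deployed function, send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```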

    CVAT supports Nuclio out of the box if it is built properly. See the installation guide for instructions. Thus, if you deploy a serverless function, the CVAT server can see it and call it with appropriate arguments. There are some tricks to creating serverless functions for CVAT, and we will discuss them in the next sections of the tutorial.

    Using builtin DL models in practice

    This tutorial assumes that you have already cloned the CVAT GitHub repo. To build CVAT with serverless support, run the docker compose command with specific configuration files. In this case it is docker-compose.serverless.yml. It contains the instructions needed to build and deploy the Nuclio platform as a docker container and to enable the corresponding support in CVAT.

    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f components/serverless/docker-compose.serverless.yml up -d --build
    
    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f components/serverless/docker-compose.serverless.yml ps
    
       Name                 Command                  State                            Ports
    -------------------------------------------------------------------------------------------------------------
    cvat         /usr/bin/supervisord             Up             8080/tcp
    cvat_db      docker-entrypoint.sh postgres    Up             5432/tcp
    cvat_proxy   /docker-entrypoint.sh /bin ...   Up             0.0.0.0:8080->80/tcp,:::8080->80/tcp
    cvat_redis   docker-entrypoint.sh redis ...   Up             6379/tcp
    cvat_ui      /docker-entrypoint.sh ngin ...   Up             80/tcp
    nuclio       /docker-entrypoint.sh sh - ...   Up (healthy)   80/tcp, 0.0.0.0:8070->8070/tcp,:::8070->8070/tcp
    

    The next step is to deploy the builtin serverless functions using the Nuclio command line tool (aka nuctl). It is assumed that you followed the installation guide and nuctl is already installed on your operating system. Run the following command to check that it works. At the beginning you should not have any deployed serverless functions.

    nuctl get functions
    
    No functions found
    

    Let’s look at examples of how to use DL models for annotation in different computer vision tasks.

    Tracking using SiamMask

    In this use case a user needs to annotate all individual objects in a video as tracks. That is, for every object we need to know its location on every frame.

    The first step is to deploy SiamMask. The deployment process can depend on your operating system. On Linux you can use the serverless/deploy_cpu.sh auxiliary script, but below we use nuctl directly.

    nuctl create project cvat
    
    nuctl deploy --project-name cvat --path "./serverless/pytorch/foolwood/siammask/nuclio" --platform local
    
    21.05.07 13:00:22.233                     nuctl (I) Deploying function {"name": ""}
    21.05.07 13:00:22.233                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.05.07 13:00:22.652                     nuctl (I) Cleaning up before deployment {"functionName": "pth-foolwood-siammask"}
    21.05.07 13:00:22.705                     nuctl (I) Staging files and preparing base images
    21.05.07 13:00:22.706                     nuctl (I) Building processor image {"imageName": "cvat/pth.foolwood.siammask:latest"}
    21.05.07 13:00:22.706     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.05.07 13:00:26.351     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.05.07 13:00:29.819            nuctl.platform (I) Building docker image {"image": "cvat/pth.foolwood.siammask:latest"}
    21.05.07 13:00:30.103            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/pth.foolwood.siammask:latest", "registry": ""}
    21.05.07 13:00:30.103            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/pth.foolwood.siammask:latest"}
    21.05.07 13:00:30.104                     nuctl (I) Build complete {"result": {"Image":"cvat/pth.foolwood.siammask:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-foolwood-siammask","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"pytorch","name":"SiamMask","spec":"","type":"tracker"}},"spec":{"description":"Fast Online Object Tracking and Segmentation","handler":"main:handler","runtime":"python:3.6","env":[{"name":"PYTHONPATH","value":"/opt/nuclio/SiamMask:/opt/nuclio/SiamMask/experiments/siammask_sharp"}],"resources":{},"image":"cvat/pth.foolwood.siammask:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"build":{"image":"cvat/pth.foolwood.siammask","baseImage":"continuumio/miniconda3","directives":{"preCopy":[{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"conda create -y -n siammask python=3.6"},{"kind":"SHELL","value":"[\"conda\", \"run\", \"-n\", \"siammask\", \"/bin/bash\", \"-c\"]"},{"kind":"RUN","value":"git clone https://github.com/foolwood/SiamMask.git"},{"kind":"RUN","value":"pip install -r SiamMask/requirements.txt jsonpickle"},{"kind":"RUN","value":"conda install -y gcc_linux-64"},{"kind":"RUN","value":"cd SiamMask \u0026\u0026 bash make.sh \u0026\u0026 cd -"},{"kind":"RUN","value":"wget -P SiamMask/experiments/siammask_sharp http://www.robots.ox.ac.uk/~qwang/SiamMask_DAVIS.pth"},{"kind":"ENTRYPOINT","value":"[\"conda\", \"run\", \"-n\", \"siammask\"]"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.05.07 13:00:31.387            nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
    21.05.07 13:00:32.796                     nuctl (I) Function deploy complete {"functionName": "pth-foolwood-siammask", "httpPort": 49155}
    
    nuctl get functions
    
      NAMESPACE |         NAME          | PROJECT | STATE | NODE PORT | REPLICAS
      nuclio    | pth-foolwood-siammask | cvat    | ready |     49155 | 1/1
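
    If you want to check the function state from a script, the table above can be parsed as text. The sketch below assumes the pipe-separated layout shown in this tutorial, which may differ between nuctl versions; the function name parse_nuctl_table is hypothetical.

```python
def parse_nuctl_table(output):
    """Parse the pipe-separated table printed by `nuctl get functions`."""
    rows = [line for line in output.splitlines() if "|" in line]
    if not rows:
        return []  # e.g. the "No functions found" message
    header = [cell.strip().lower() for cell in rows[0].split("|")]
    return [
        dict(zip(header, (cell.strip() for cell in row.split("|"))))
        for row in rows[1:]
    ]


sample = """\
  NAMESPACE |         NAME          | PROJECT | STATE | NODE PORT | REPLICAS
  nuclio    | pth-foolwood-siammask | cvat    | ready |     49155 | 1/1
"""
functions = parse_nuctl_table(sample)
ready = [function["name"] for function in functions if function["state"] == "ready"]
print(ready)
```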
    

    Let’s see how it works in the UI. Go to the models tab and check that you can see SiamMask in the list. If you cannot, it means that there are some problems. Go to one of our public channels and ask for help.

    Models list with SiamMask

    After that, go to the new task page and create a task with this video file. You can choose any task name, any labels, and even another video file if you like. In this case, the Remote sources option was used to specify the video file. Press the submit button at the end to finish the process.

    Create a video annotation task

    Open the task and use AI tools to start tracking an object. Draw a bounding box around an object, then step through the frames and correct the bounding box if necessary.

    Start tracking an object

    Finally you will get bounding boxes.

    SiamMask results

    The SiamMask model is better optimized for Nvidia GPUs. For more information about deploying the model for the GPU, read on.

    Object detection using YOLO-v3

    First of all, let’s deploy the DL model. The deployment process is similar for all serverless functions: you need to run the nuctl deploy command with appropriate arguments. To simplify the process, you can use the serverless/deploy_cpu.sh script. Inference of the serverless function is optimized for CPU using the Intel OpenVINO framework.

    serverless/deploy_cpu.sh serverless/openvino/omz/public/yolo-v3-tf/
    
    Deploying serverless/openvino/omz/public/yolo-v3-tf function...
    21.07.12 15:55:17.314                     nuctl (I) Deploying function {"name": ""}
    21.07.12 15:55:17.314                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.07.12 15:55:17.682                     nuctl (I) Cleaning up before deployment {"functionName": "openvino-omz-public-yolo-v3-tf"}
    21.07.12 15:55:17.739                     nuctl (I) Staging files and preparing base images
    21.07.12 15:55:17.743                     nuctl (I) Building processor image {"imageName": "cvat/openvino.omz.public.yolo-v3-tf:latest"}
    21.07.12 15:55:17.743     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.07.12 15:55:21.048     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.07.12 15:55:24.595            nuctl.platform (I) Building docker image {"image": "cvat/openvino.omz.public.yolo-v3-tf:latest"}
    21.07.12 15:55:30.359            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/openvino.omz.public.yolo-v3-tf:latest", "registry": ""}
    21.07.12 15:55:30.359            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/openvino.omz.public.yolo-v3-tf:latest"}
    21.07.12 15:55:30.359                     nuctl (I) Build complete {"result": {"Image":"cvat/openvino.omz.public.yolo-v3-tf:latest","UpdatedFunctionConfig":{"metadata":{"name":"openvino-omz-public-yolo-v3-tf","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"openvino","name":"YOLO v3","spec":"[\n  { \"id\": 0, \"name\": \"person\" },\n  { \"id\": 1, \"name\": \"bicycle\" },\n  { \"id\": 2, \"name\": \"car\" },\n  { \"id\": 3, \"name\": \"motorbike\" },\n  { \"id\": 4, \"name\": \"aeroplane\" },\n  { \"id\": 5, \"name\": \"bus\" },\n  { \"id\": 6, \"name\": \"train\" },\n  { \"id\": 7, \"name\": \"truck\" },\n  { \"id\": 8, \"name\": \"boat\" },\n  { \"id\": 9, \"name\": \"traffic light\" },\n  { \"id\": 10, \"name\": \"fire hydrant\" },\n  { \"id\": 11, \"name\": \"stop sign\" },\n  { \"id\": 12, \"name\": \"parking meter\" },\n  { \"id\": 13, \"name\": \"bench\" },\n  { \"id\": 14, \"name\": \"bird\" },\n  { \"id\": 15, \"name\": \"cat\" },\n  { \"id\": 16, \"name\": \"dog\" },\n  { \"id\": 17, \"name\": \"horse\" },\n  { \"id\": 18, \"name\": \"sheep\" },\n  { \"id\": 19, \"name\": \"cow\" },\n  { \"id\": 20, \"name\": \"elephant\" },\n  { \"id\": 21, \"name\": \"bear\" },\n  { \"id\": 22, \"name\": \"zebra\" },\n  { \"id\": 23, \"name\": \"giraffe\" },\n  { \"id\": 24, \"name\": \"backpack\" },\n  { \"id\": 25, \"name\": \"umbrella\" },\n  { \"id\": 26, \"name\": \"handbag\" },\n  { \"id\": 27, \"name\": \"tie\" },\n  { \"id\": 28, \"name\": \"suitcase\" },\n  { \"id\": 29, \"name\": \"frisbee\" },\n  { \"id\": 30, \"name\": \"skis\" },\n  { \"id\": 31, \"name\": \"snowboard\" },\n  { \"id\": 32, \"name\": \"sports ball\" },\n  { \"id\": 33, \"name\": \"kite\" },\n  { \"id\": 34, \"name\": \"baseball bat\" },\n  { \"id\": 35, \"name\": \"baseball glove\" },\n  { \"id\": 36, \"name\": \"skateboard\" },\n  { \"id\": 37, \"name\": \"surfboard\" },\n  { \"id\": 38, \"name\": \"tennis racket\" },\n  { \"id\": 39, 
\"name\": \"bottle\" },\n  { \"id\": 40, \"name\": \"wine glass\" },\n  { \"id\": 41, \"name\": \"cup\" },\n  { \"id\": 42, \"name\": \"fork\" },\n  { \"id\": 43, \"name\": \"knife\" },\n  { \"id\": 44, \"name\": \"spoon\" },\n  { \"id\": 45, \"name\": \"bowl\" },\n  { \"id\": 46, \"name\": \"banana\" },\n  { \"id\": 47, \"name\": \"apple\" },\n  { \"id\": 48, \"name\": \"sandwich\" },\n  { \"id\": 49, \"name\": \"orange\" },\n  { \"id\": 50, \"name\": \"broccoli\" },\n  { \"id\": 51, \"name\": \"carrot\" },\n  { \"id\": 52, \"name\": \"hot dog\" },\n  { \"id\": 53, \"name\": \"pizza\" },\n  { \"id\": 54, \"name\": \"donut\" },\n  { \"id\": 55, \"name\": \"cake\" },\n  { \"id\": 56, \"name\": \"chair\" },\n  { \"id\": 57, \"name\": \"sofa\" },\n  { \"id\": 58, \"name\": \"pottedplant\" },\n  { \"id\": 59, \"name\": \"bed\" },\n  { \"id\": 60, \"name\": \"diningtable\" },\n  { \"id\": 61, \"name\": \"toilet\" },\n  { \"id\": 62, \"name\": \"tvmonitor\" },\n  { \"id\": 63, \"name\": \"laptop\" },\n  { \"id\": 64, \"name\": \"mouse\" },\n  { \"id\": 65, \"name\": \"remote\" },\n  { \"id\": 66, \"name\": \"keyboard\" },\n  { \"id\": 67, \"name\": \"cell phone\" },\n  { \"id\": 68, \"name\": \"microwave\" },\n  { \"id\": 69, \"name\": \"oven\" },\n  { \"id\": 70, \"name\": \"toaster\" },\n  { \"id\": 71, \"name\": \"sink\" },\n  { \"id\": 72, \"name\": \"refrigerator\" },\n  { \"id\": 73, \"name\": \"book\" },\n  { \"id\": 74, \"name\": \"clock\" },\n  { \"id\": 75, \"name\": \"vase\" },\n  { \"id\": 76, \"name\": \"scissors\" },\n  { \"id\": 77, \"name\": \"teddy bear\" },\n  { \"id\": 78, \"name\": \"hair drier\" },\n  { \"id\": 79, \"name\": \"toothbrush\" }\n]\n","type":"detector"}},"spec":{"description":"YOLO v3 via Intel 
OpenVINO","handler":"main:handler","runtime":"python:3.6","env":[{"name":"NUCLIO_PYTHON_EXE_PATH","value":"/opt/nuclio/common/openvino/python3"}],"resources":{},"image":"cvat/openvino.omz.public.yolo-v3-tf:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/nmanovic/Workspace/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"image":"cvat/openvino.omz.public.yolo-v3-tf","baseImage":"openvino/ubuntu18_dev:2020.2","directives":{"preCopy":[{"kind":"USER","value":"root"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"ln -s /usr/bin/pip3 /usr/bin/pip"},{"kind":"RUN","value":"/opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name yolo-v3-tf -o /opt/nuclio/open_model_zoo"},{"kind":"RUN","value":"/opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/converter.py --name yolo-v3-tf --precisions FP32 -d /opt/nuclio/open_model_zoo -o /opt/nuclio/open_model_zoo"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.07.12 15:55:31.496            nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
    21.07.12 15:55:32.894                     nuctl (I) Function deploy complete {"functionName": "openvino-omz-public-yolo-v3-tf", "httpPort": 49156}
    

    Again, go to the models tab and check that you can see YOLO v3 in the list. If you cannot, it means that there are some problems. Go to one of our public channels and ask for help.

    Let us reuse the task which you created for testing the SiamMask serverless function above. Choose the magic wand tool, go to the Detectors tab, and select the YOLO v3 model. Press the Annotate button and after a couple of seconds you should see detection results. Do not forget to save annotations.

    YOLO v3 results

    It is also possible to run a detector for the whole annotation task. In that case CVAT will run the serverless function on every frame of the task and submit results directly into the database. For more details please read the guide.

    Objects segmentation using Mask-RCNN

    If you have a detector that returns polygons, you can segment objects. One such detector is Mask-RCNN. There are several implementations of the detector available out of the box:

    • serverless/openvino/omz/public/mask_rcnn_inception_resnet_v2_atrous_coco is optimized using Intel OpenVINO framework and works well if it is run on an Intel CPU.
    • serverless/tensorflow/matterport/mask_rcnn/ is optimized for GPU.

    The deployment process for a serverless function optimized for GPU is similar: just run the serverless/deploy_gpu.sh script. It runs mostly the same commands but internally uses the function-gpu.yaml configuration file instead of function.yaml. See the next sections if you want to understand the difference.

    Note: Please do not run several GPU functions at the same time. In many cases it will not work out of the box. For now you should manually schedule different functions on different GPUs, which requires source code modification. The Nuclio autoscaler does not support the local platform (docker).

    serverless/deploy_gpu.sh serverless/tensorflow/matterport/mask_rcnn
    
    Deploying serverless/tensorflow/matterport/mask_rcnn function...
    21.07.12 16:48:48.995                     nuctl (I) Deploying function {"name": ""}
    21.07.12 16:48:48.995                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.07.12 16:48:49.356                     nuctl (I) Cleaning up before deployment {"functionName": "tf-matterport-mask-rcnn"}
    21.07.12 16:48:49.470                     nuctl (I) Function already exists, deleting function containers {"functionName": "tf-matterport-mask-rcnn"}
    21.07.12 16:48:50.247                     nuctl (I) Staging files and preparing base images
    21.07.12 16:48:50.248                     nuctl (I) Building processor image {"imageName": "cvat/tf.matterport.mask_rcnn:latest"}
    21.07.12 16:48:50.249     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.07.12 16:48:53.674     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.07.12 16:48:57.424            nuctl.platform (I) Building docker image {"image": "cvat/tf.matterport.mask_rcnn:latest"}
    21.07.12 16:48:57.763            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/tf.matterport.mask_rcnn:latest", "registry": ""}
    21.07.12 16:48:57.764            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/tf.matterport.mask_rcnn:latest"}
    21.07.12 16:48:57.764                     nuctl (I) Build complete {"result": {"Image":"cvat/tf.matterport.mask_rcnn:latest","UpdatedFunctionConfig":{"metadata":{"name":"tf-matterport-mask-rcnn","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"tensorflow","name":"Mask RCNN via Tensorflow","spec":"[\n  { \"id\": 0, \"name\": \"BG\" },\n  { \"id\": 1, \"name\": \"person\" },\n  { \"id\": 2, \"name\": \"bicycle\" },\n  { \"id\": 3, \"name\": \"car\" },\n  { \"id\": 4, \"name\": \"motorcycle\" },\n  { \"id\": 5, \"name\": \"airplane\" },\n  { \"id\": 6, \"name\": \"bus\" },\n  { \"id\": 7, \"name\": \"train\" },\n  { \"id\": 8, \"name\": \"truck\" },\n  { \"id\": 9, \"name\": \"boat\" },\n  { \"id\": 10, \"name\": \"traffic_light\" },\n  { \"id\": 11, \"name\": \"fire_hydrant\" },\n  { \"id\": 12, \"name\": \"stop_sign\" },\n  { \"id\": 13, \"name\": \"parking_meter\" },\n  { \"id\": 14, \"name\": \"bench\" },\n  { \"id\": 15, \"name\": \"bird\" },\n  { \"id\": 16, \"name\": \"cat\" },\n  { \"id\": 17, \"name\": \"dog\" },\n  { \"id\": 18, \"name\": \"horse\" },\n  { \"id\": 19, \"name\": \"sheep\" },\n  { \"id\": 20, \"name\": \"cow\" },\n  { \"id\": 21, \"name\": \"elephant\" },\n  { \"id\": 22, \"name\": \"bear\" },\n  { \"id\": 23, \"name\": \"zebra\" },\n  { \"id\": 24, \"name\": \"giraffe\" },\n  { \"id\": 25, \"name\": \"backpack\" },\n  { \"id\": 26, \"name\": \"umbrella\" },\n  { \"id\": 27, \"name\": \"handbag\" },\n  { \"id\": 28, \"name\": \"tie\" },\n  { \"id\": 29, \"name\": \"suitcase\" },\n  { \"id\": 30, \"name\": \"frisbee\" },\n  { \"id\": 31, \"name\": \"skis\" },\n  { \"id\": 32, \"name\": \"snowboard\" },\n  { \"id\": 33, \"name\": \"sports_ball\" },\n  { \"id\": 34, \"name\": \"kite\" },\n  { \"id\": 35, \"name\": \"baseball_bat\" },\n  { \"id\": 36, \"name\": \"baseball_glove\" },\n  { \"id\": 37, \"name\": \"skateboard\" },\n  { \"id\": 38, \"name\": \"surfboard\" },\n  { \"id\": 39, \"name\": 
\"tennis_racket\" },\n  { \"id\": 40, \"name\": \"bottle\" },\n  { \"id\": 41, \"name\": \"wine_glass\" },\n  { \"id\": 42, \"name\": \"cup\" },\n  { \"id\": 43, \"name\": \"fork\" },\n  { \"id\": 44, \"name\": \"knife\" },\n  { \"id\": 45, \"name\": \"spoon\" },\n  { \"id\": 46, \"name\": \"bowl\" },\n  { \"id\": 47, \"name\": \"banana\" },\n  { \"id\": 48, \"name\": \"apple\" },\n  { \"id\": 49, \"name\": \"sandwich\" },\n  { \"id\": 50, \"name\": \"orange\" },\n  { \"id\": 51, \"name\": \"broccoli\" },\n  { \"id\": 52, \"name\": \"carrot\" },\n  { \"id\": 53, \"name\": \"hot_dog\" },\n  { \"id\": 54, \"name\": \"pizza\" },\n  { \"id\": 55, \"name\": \"donut\" },\n  { \"id\": 56, \"name\": \"cake\" },\n  { \"id\": 57, \"name\": \"chair\" },\n  { \"id\": 58, \"name\": \"couch\" },\n  { \"id\": 59, \"name\": \"potted_plant\" },\n  { \"id\": 60, \"name\": \"bed\" },\n  { \"id\": 61, \"name\": \"dining_table\" },\n  { \"id\": 62, \"name\": \"toilet\" },\n  { \"id\": 63, \"name\": \"tv\" },\n  { \"id\": 64, \"name\": \"laptop\" },\n  { \"id\": 65, \"name\": \"mouse\" },\n  { \"id\": 66, \"name\": \"remote\" },\n  { \"id\": 67, \"name\": \"keyboard\" },\n  { \"id\": 68, \"name\": \"cell_phone\" },\n  { \"id\": 69, \"name\": \"microwave\" },\n  { \"id\": 70, \"name\": \"oven\" },\n  { \"id\": 71, \"name\": \"toaster\" },\n  { \"id\": 72, \"name\": \"sink\" },\n  { \"id\": 73, \"name\": \"refrigerator\" },\n  { \"id\": 74, \"name\": \"book\" },\n  { \"id\": 75, \"name\": \"clock\" },\n  { \"id\": 76, \"name\": \"vase\" },\n  { \"id\": 77, \"name\": \"scissors\" },\n  { \"id\": 78, \"name\": \"teddy_bear\" },\n  { \"id\": 79, \"name\": \"hair_drier\" },\n  { \"id\": 80, \"name\": \"toothbrush\" }\n]\n","type":"detector"}},"spec":{"description":"Mask RCNN optimized for 
GPU","handler":"main:handler","runtime":"python:3.6","env":[{"name":"MASK_RCNN_DIR","value":"/opt/nuclio/Mask_RCNN"}],"resources":{"limits":{"nvidia.com/gpu":"1"}},"image":"cvat/tf.matterport.mask_rcnn:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":1,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/nmanovic/Workspace/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"functionConfigPath":"serverless/tensorflow/matterport/mask_rcnn/nuclio/function-gpu.yaml","image":"cvat/tf.matterport.mask_rcnn","baseImage":"tensorflow/tensorflow:1.15.5-gpu-py3","directives":{"postCopy":[{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"apt update \u0026\u0026 apt install --no-install-recommends -y git curl"},{"kind":"RUN","value":"git clone --depth 1 https://github.com/matterport/Mask_RCNN.git"},{"kind":"RUN","value":"curl -L https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5 -o Mask_RCNN/mask_rcnn_coco.h5"},{"kind":"RUN","value":"pip3 install numpy cython pyyaml keras==2.1.0 scikit-image Pillow"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.07.12 16:48:59.071            nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
    21.07.12 16:49:00.437                     nuctl (I) Function deploy complete {"functionName": "tf-matterport-mask-rcnn", "httpPort": 49155}
    

    Now you should be able to annotate objects using segmentation masks.

    Mask RCNN results

    Adding your own DL models

    Choose a DL model

    For the tutorial I will choose a popular AI library with a lot of models inside. In your case it can be your own model. If it is based on detectron2 it will be easy to integrate. Just follow the tutorial.

    Detectron2 is Facebook AI Research’s next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook.

    Clone the repository somewhere. I assume that all other experiments will be run from the cloned detectron2 directory.

    git clone https://github.com/facebookresearch/detectron2
    cd detectron2
    

    Run local experiments

    Let’s run a detection model locally. First of all, we need to install the requirements for the library.

    In my case I have Ubuntu 20.04 with Python 3.8.5. I installed the CPU build of PyTorch 1.8.1 for Linux with pip inside a virtual environment. Follow the opencv-python installation guide to get the library for demo and visualization.

    python3 -m venv .detectron2
    . .detectron2/bin/activate
    pip install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
    pip install opencv-python
    

    Install the detectron2 library from your local clone (you should be inside detectron2 directory).

    python -m pip install -e .
    

    After the library from Facebook AI Research is installed, we can run a couple of experiments. See the official tutorial for more examples. I decided to experiment with RetinaNet. The first step is to download the model weights.

    curl -O https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/model_final_971ab9.pkl
    

    To run experiments, let’s download an image with cats from Wikipedia.

    curl -O https://upload.wikimedia.org/wikipedia/commons/thumb/0/0b/Cat_poster_1.jpg/1920px-Cat_poster_1.jpg
    

    Finally let’s run the DL model inference on CPU. If all is fine, you will see a window with cats and bounding boxes around them with scores.

    python demo/demo.py --config-file configs/COCO-Detection/retinanet_R_101_FPN_3x.yaml \
      --input 1920px-Cat_poster_1.jpg --opts MODEL.WEIGHTS model_final_971ab9.pkl MODEL.DEVICE cpu
    

    Cats detected by RetinaNet R101

    The next step is to minimize the demo/demo.py script, keeping only the code necessary to load the model, run it, and interpret its output. Let’s hard-code the parameters and remove argparse. Keep only the code responsible for working with an image. There is no general recipe for minimizing code.

    In the end you should get something like the code below, which uses a fixed config, reads a predefined image, initializes the predictor, and runs inference. As the final step, it prints all detected bounding boxes with scores and labels.

    from detectron2.config import get_cfg
    from detectron2.data.detection_utils import read_image
    from detectron2.engine.defaults import DefaultPredictor
    from detectron2.data.datasets.builtin_meta import COCO_CATEGORIES
    
    CONFIG_FILE = "configs/COCO-Detection/retinanet_R_101_FPN_3x.yaml"
    CONFIG_OPTS = ["MODEL.WEIGHTS", "model_final_971ab9.pkl", "MODEL.DEVICE", "cpu"]
    CONFIDENCE_THRESHOLD = 0.5
    
    def setup_cfg():
        cfg = get_cfg()
        cfg.merge_from_file(CONFIG_FILE)
        cfg.merge_from_list(CONFIG_OPTS)
        cfg.MODEL.RETINANET.SCORE_THRESH_TEST = CONFIDENCE_THRESHOLD
        cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = CONFIDENCE_THRESHOLD
        cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = CONFIDENCE_THRESHOLD
        cfg.freeze()
        return cfg
    
    
    if __name__ == "__main__":
        cfg = setup_cfg()
        input = "1920px-Cat_poster_1.jpg"
        img = read_image(input, format="BGR")
        predictor = DefaultPredictor(cfg)
        predictions = predictor(img)
        instances = predictions['instances']
        pred_boxes = instances.pred_boxes
        scores = instances.scores
        pred_classes = instances.pred_classes
        for box, score, label in zip(pred_boxes, scores, pred_classes):
            label = COCO_CATEGORIES[int(label)]["name"]
            print(box.tolist(), float(score), label)
    

    DL model as a serverless function

    Now that we know how to run the DL model locally, we can prepare a serverless function which CVAT can use to annotate data. Let’s see what function.yaml will look like…

    Let’s look at the faster_rcnn_inception_v2_coco serverless function configuration as an example and adapt it to our case. First of all, let’s come up with a unique name for the new function: pth-facebookresearch-detectron2-retinanet-r101. The annotations section describes our function for the CVAT serverless subsystem:

    • annotations.name is a display name
    • annotations.type is the type of the serverless function. It can take several values and determines the input and output of the function. In our case the type is detector, which means that the integrated DL model can generate shapes with labels for an image.
    • annotations.framework is informational only and can have an arbitrary value. Typical values are OpenVINO, PyTorch, TensorFlow, etc.
    • annotations.spec describes the list of labels which the model supports. In our case the DL model was trained on the MS COCO dataset, so the list of labels corresponds to that dataset.
    • spec.description is used to provide basic information for the model.
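
    The value of annotations.spec is a JSON array of objects with id and name fields. A minimal sketch of reading it is shown below; the load_labels helper is a hypothetical name used only for this illustration, and the list is shortened to four entries. Note that the ids follow MS COCO numbering, so they are not guaranteed to be contiguous.

```python
import json

# Shortened stand-in for the annotations.spec string; the real list is longer.
SPEC = """
[
  { "id": 1, "name": "person" },
  { "id": 2, "name": "bicycle" },
  { "id": 89, "name": "hair_drier" },
  { "id": 90, "name": "toothbrush" }
]
"""


def load_labels(spec_text):
    """Return a dict mapping label id to label name (hypothetical helper)."""
    return {item["id"]: item["name"] for item in json.loads(spec_text)}


labels = load_labels(SPEC)
print(labels[1], labels[90])
```

    The label names in this list are the strings which the function is expected to return in the label field of each detection.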

    All other parameters are described in Nuclio documentation.

    • spec.handler is the entry point to your function.
    • spec.runtime is the name of the language runtime.
    • spec.eventTimeout is the global event timeout

    Next step is to describe how to build our serverless function:

    • spec.build.image is the name of your docker image
    • spec.build.baseImage is the name of a base container image from which to build the function
    • spec.build.directives are commands to build your docker image

    In our case we start from the Ubuntu 20.04 base image and install curl to download the weights for our model, git to clone the detectron2 project from GitHub, and python together with pip. Then we repeat the installation steps which we used to set up the DL model locally, with minor modifications.

    For the Nuclio platform we have to specify a couple more parameters:
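
    One way to read the build directives is as Dockerfile instructions: kind names the instruction and value is its argument. The sketch below only illustrates that reading under this assumption; it is not Nuclio's build code, and render_directives is a hypothetical helper.

```python
# A few directives in the kind/value form used by spec.build.directives.
PRE_COPY = [
    {"kind": "ENV", "value": "DEBIAN_FRONTEND=noninteractive"},
    {"kind": "RUN", "value": "apt-get update && apt-get -y install curl git python3 python3-pip"},
    {"kind": "WORKDIR", "value": "/opt/nuclio"},
]


def render_directives(directives):
    """Render each directive as one Dockerfile-style line (assumed mapping)."""
    return "\n".join("{} {}".format(d["kind"], d["value"]) for d in directives)


dockerfile_fragment = render_directives(PRE_COPY)
print(dockerfile_fragment)
```

    Reading the directives this way makes it easier to compare them with the commands used for the local installation.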

    • spec.triggers.myHttpTrigger describes HTTP trigger to handle incoming HTTP requests.
    • spec.platform describes some important parameters to run your functions like restartPolicy and mountMode. Read Nuclio documentation for more details.
    metadata:
      name: pth-facebookresearch-detectron2-retinanet-r101
      namespace: cvat
      annotations:
        name: RetinaNet R101
        type: detector
        framework: pytorch
        spec: |
          [
            { "id": 1, "name": "person" },
            { "id": 2, "name": "bicycle" },
    
            ...
    
            { "id":89, "name": "hair_drier" },
            { "id":90, "name": "toothbrush" }
          ]      
    
    spec:
      description: RetinaNet R101 from Detectron2
      runtime: 'python:3.8'
      handler: main:handler
      eventTimeout: 30s
    
      build:
        image: cvat/pth.facebookresearch.detectron2.retinanet_r101
        baseImage: ubuntu:20.04
    
        directives:
          preCopy:
            - kind: ENV
              value: DEBIAN_FRONTEND=noninteractive
            - kind: RUN
              value: apt-get update && apt-get -y install curl git python3 python3-pip
            - kind: WORKDIR
              value: /opt/nuclio
            - kind: RUN
              value: pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
            - kind: RUN
              value: pip3 install 'git+https://github.com/facebookresearch/detectron2@v0.4'
            - kind: RUN
              value: curl -O https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/model_final_971ab9.pkl
            - kind: RUN
              value: ln -s /usr/bin/pip3 /usr/local/bin/pip
    
      triggers:
        myHttpTrigger:
          maxWorkers: 2
          kind: 'http'
          workerAvailabilityTimeoutMilliseconds: 10000
          attributes:
            maxRequestBodySize: 33554432 # 32MB
    
      platform:
        attributes:
          restartPolicy:
            name: always
            maximumRetryCount: 3
          mountMode: volume
    

    Full code can be found here: detectron2/retinanet/nuclio/function.yaml
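
    The maxRequestBodySize value is 32 * 1024 * 1024 bytes. Because the image travels base64-encoded inside a JSON body, the largest raw image that fits is smaller than 32 MB. The arithmetic can be sketched as follows; the 64-byte allowance for JSON keys and quotes is a rough assumption, not a measured value.

```python
import math

MAX_REQUEST_BODY_SIZE = 32 * 1024 * 1024  # 33554432 bytes, as in function.yaml


def base64_length(raw_bytes):
    """Length of the base64 text produced for raw_bytes bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


def max_raw_image_bytes(body_limit, json_overhead=64):
    """Largest image size whose base64 form fits into the body limit.

    json_overhead is an assumed allowance for the JSON keys and quotes.
    """
    return (body_limit - json_overhead) // 4 * 3


limit = max_raw_image_bytes(MAX_REQUEST_BODY_SIZE)
print(MAX_REQUEST_BODY_SIZE, limit, base64_length(limit))
```

    In other words, base64 adds about one third to the payload, so images of roughly 24 MB and more would need a larger maxRequestBodySize.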

    The next step is to adapt the source code which we wrote to run the DL model locally to the requirements of the Nuclio platform. First, load the model into memory in the init_context(context) function. Read more about the function in Best Practices and Common Pitfalls.

    After that we need to accept incoming HTTP requests, run inference, and reply with the detection results. This is the job of the entry point which we declared in the function specification: handler(context, event). In accordance with the same specification (handler: main:handler), the entry point must be located in main.py.

    
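    import base64
    import io
    import json
    
    from PIL import Image
    from detectron2.model_zoo import get_config
    from detectron2.data.detection_utils import convert_PIL_to_numpy
    from detectron2.engine.defaults import DefaultPredictor
    from detectron2.data.datasets.builtin_meta import COCO_CATEGORIES
    
    # Same settings as in the local script above.
    CONFIG_OPTS = ["MODEL.WEIGHTS", "model_final_971ab9.pkl", "MODEL.DEVICE", "cpu"]
    CONFIDENCE_THRESHOLD = 0.5
    
    def init_context(context):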
        context.logger.info("Init context...  0%")
    
        cfg = get_config('COCO-Detection/retinanet_R_101_FPN_3x.yaml')
        cfg.merge_from_list(CONFIG_OPTS)
        cfg.MODEL.RETINANET.SCORE_THRESH_TEST = CONFIDENCE_THRESHOLD
        cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = CONFIDENCE_THRESHOLD
        cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = CONFIDENCE_THRESHOLD
        cfg.freeze()
        predictor = DefaultPredictor(cfg)
    
        context.user_data.model_handler = predictor
    
        context.logger.info("Init context...100%")
    
    def handler(context, event):
        context.logger.info("Run retinanet-R101 model")
        data = event.body
        buf = io.BytesIO(base64.b64decode(data["image"]))
        threshold = float(data.get("threshold", 0.5))
        image = convert_PIL_to_numpy(Image.open(buf), format="BGR")
    
        predictions = context.user_data.model_handler(image)
    
        instances = predictions['instances']
        pred_boxes = instances.pred_boxes
        scores = instances.scores
        pred_classes = instances.pred_classes
        results = []
        for box, score, label in zip(pred_boxes, scores, pred_classes):
            label = COCO_CATEGORIES[int(label)]["name"]
            if score >= threshold:
                results.append({
                    "confidence": str(float(score)),
                    "label": label,
                    "points": box.tolist(),
                    "type": "rectangle",
                })
    
        return context.Response(body=json.dumps(results), headers={},
            content_type='application/json', status_code=200)
    
    

    Full code can be found here: detectron2/retinanet/nuclio/main.py
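
    The request format which the handler expects can be checked without Nuclio or detectron2. The sketch below assumes that event.body arrives as a dict with a base64-encoded image and an optional threshold, as the data["image"] and data.get("threshold", 0.5) accesses in the handler imply. build_request_body and decode_request_body are hypothetical helpers, and the image bytes are a placeholder and not a real picture.

```python
import base64
import io
import json


def build_request_body(image_bytes, threshold=0.5):
    """Build the JSON text a client would send (hypothetical helper)."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "threshold": threshold,
    })


def decode_request_body(data):
    """Mirror the first lines of handler(): return (buffer, threshold)."""
    buf = io.BytesIO(base64.b64decode(data["image"]))
    threshold = float(data.get("threshold", 0.5))
    return buf, threshold


fake_image = b"placeholder bytes standing in for an encoded image"
data = json.loads(build_request_body(fake_image, threshold=0.7))
buf, threshold = decode_request_body(data)
print(buf.getvalue() == fake_image, threshold)
```

    The response goes the other way: a JSON list in which every item carries confidence, label, points, and type, exactly as built in the handler above.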

    Deploy RetinaNet serverless function

    To use the new serverless function you have to deploy it using the nuctl command. The actual deployment process is described in the automatic annotation guide.

    ./serverless/deploy_cpu.sh ./serverless/pytorch/facebookresearch/detectron2/retinanet/
    
    21.07.21 15:20:31.011                     nuctl (I) Deploying function {"name": ""}
    21.07.21 15:20:31.011                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.07.21 15:20:31.407                     nuctl (I) Cleaning up before deployment {"functionName": "pth-facebookresearch-detectron2-retinanet-r101"}
    21.07.21 15:20:31.497                     nuctl (I) Function already exists, deleting function containers {"functionName": "pth-facebookresearch-detectron2-retinanet-r101"}
    21.07.21 15:20:31.914                     nuctl (I) Staging files and preparing base images
    21.07.21 15:20:31.915                     nuctl (I) Building processor image {"imageName": "cvat/pth.facebookresearch.detectron2.retinanet_r101:latest"}
    21.07.21 15:20:31.916     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.07.21 15:20:34.495     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.07.21 15:20:37.524            nuctl.platform (I) Building docker image {"image": "cvat/pth.facebookresearch.detectron2.retinanet_r101:latest"}
    21.07.21 15:20:37.852            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/pth.facebookresearch.detectron2.retinanet_r101:latest", "registry": ""}
    21.07.21 15:20:37.853            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/pth.facebookresearch.detectron2.retinanet_r101:latest"}
    21.07.21 15:20:37.853                     nuctl (I) Build complete {"result": {"Image":"cvat/pth.facebookresearch.detectron2.retinanet_r101:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-facebookresearch-detectron2-retinanet-r101","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"pytorch","name":"RetinaNet R101","spec":"[\n  { \"id\": 1, \"name\": \"person\" },\n  { \"id\": 2, \"name\": \"bicycle\" },\n  { \"id\": 3, \"name\": \"car\" },\n  { \"id\": 4, \"name\": \"motorcycle\" },\n  { \"id\": 5, \"name\": \"airplane\" },\n  { \"id\": 6, \"name\": \"bus\" },\n  { \"id\": 7, \"name\": \"train\" },\n  { \"id\": 8, \"name\": \"truck\" },\n  { \"id\": 9, \"name\": \"boat\" },\n  { \"id\":10, \"name\": \"traffic_light\" },\n  { \"id\":11, \"name\": \"fire_hydrant\" },\n  { \"id\":13, \"name\": \"stop_sign\" },\n  { \"id\":14, \"name\": \"parking_meter\" },\n  { \"id\":15, \"name\": \"bench\" },\n  { \"id\":16, \"name\": \"bird\" },\n  { \"id\":17, \"name\": \"cat\" },\n  { \"id\":18, \"name\": \"dog\" },\n  { \"id\":19, \"name\": \"horse\" },\n  { \"id\":20, \"name\": \"sheep\" },\n  { \"id\":21, \"name\": \"cow\" },\n  { \"id\":22, \"name\": \"elephant\" },\n  { \"id\":23, \"name\": \"bear\" },\n  { \"id\":24, \"name\": \"zebra\" },\n  { \"id\":25, \"name\": \"giraffe\" },\n  { \"id\":27, \"name\": \"backpack\" },\n  { \"id\":28, \"name\": \"umbrella\" },\n  { \"id\":31, \"name\": \"handbag\" },\n  { \"id\":32, \"name\": \"tie\" },\n  { \"id\":33, \"name\": \"suitcase\" },\n  { \"id\":34, \"name\": \"frisbee\" },\n  { \"id\":35, \"name\": \"skis\" },\n  { \"id\":36, \"name\": \"snowboard\" },\n  { \"id\":37, \"name\": \"sports_ball\" },\n  { \"id\":38, \"name\": \"kite\" },\n  { \"id\":39, \"name\": \"baseball_bat\" },\n  { \"id\":40, \"name\": \"baseball_glove\" },\n  { \"id\":41, \"name\": \"skateboard\" },\n  { \"id\":42, \"name\": \"surfboard\" },\n  { \"id\":43, \"name\": \"tennis_racket\" },\n  { \"id\":44, 
\"name\": \"bottle\" },\n  { \"id\":46, \"name\": \"wine_glass\" },\n  { \"id\":47, \"name\": \"cup\" },\n  { \"id\":48, \"name\": \"fork\" },\n  { \"id\":49, \"name\": \"knife\" },\n  { \"id\":50, \"name\": \"spoon\" },\n  { \"id\":51, \"name\": \"bowl\" },\n  { \"id\":52, \"name\": \"banana\" },\n  { \"id\":53, \"name\": \"apple\" },\n  { \"id\":54, \"name\": \"sandwich\" },\n  { \"id\":55, \"name\": \"orange\" },\n  { \"id\":56, \"name\": \"broccoli\" },\n  { \"id\":57, \"name\": \"carrot\" },\n  { \"id\":58, \"name\": \"hot_dog\" },\n  { \"id\":59, \"name\": \"pizza\" },\n  { \"id\":60, \"name\": \"donut\" },\n  { \"id\":61, \"name\": \"cake\" },\n  { \"id\":62, \"name\": \"chair\" },\n  { \"id\":63, \"name\": \"couch\" },\n  { \"id\":64, \"name\": \"potted_plant\" },\n  { \"id\":65, \"name\": \"bed\" },\n  { \"id\":67, \"name\": \"dining_table\" },\n  { \"id\":70, \"name\": \"toilet\" },\n  { \"id\":72, \"name\": \"tv\" },\n  { \"id\":73, \"name\": \"laptop\" },\n  { \"id\":74, \"name\": \"mouse\" },\n  { \"id\":75, \"name\": \"remote\" },\n  { \"id\":76, \"name\": \"keyboard\" },\n  { \"id\":77, \"name\": \"cell_phone\" },\n  { \"id\":78, \"name\": \"microwave\" },\n  { \"id\":79, \"name\": \"oven\" },\n  { \"id\":80, \"name\": \"toaster\" },\n  { \"id\":81, \"name\": \"sink\" },\n  { \"id\":83, \"name\": \"refrigerator\" },\n  { \"id\":84, \"name\": \"book\" },\n  { \"id\":85, \"name\": \"clock\" },\n  { \"id\":86, \"name\": \"vase\" },\n  { \"id\":87, \"name\": \"scissors\" },\n  { \"id\":88, \"name\": \"teddy_bear\" },\n  { \"id\":89, \"name\": \"hair_drier\" },\n  { \"id\":90, \"name\": \"toothbrush\" }\n]\n","type":"detector"}},"spec":{"description":"RetinaNet R101 from 
Detectron2","handler":"main:handler","runtime":"python:3.8","resources":{},"image":"cvat/pth.facebookresearch.detectron2.retinanet_r101:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/nmanovic/Workspace/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"image":"cvat/pth.facebookresearch.detectron2.retinanet_r101","baseImage":"ubuntu:20.04","directives":{"preCopy":[{"kind":"ENV","value":"DEBIAN_FRONTEND=noninteractive"},{"kind":"RUN","value":"apt-get update \u0026\u0026 apt-get -y install curl git python3 python3-pip"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html"},{"kind":"RUN","value":"pip3 install 'git+https://github.com/facebookresearch/detectron2@v0.4'"},{"kind":"RUN","value":"curl -O https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/model_final_971ab9.pkl"},{"kind":"RUN","value":"ln -s /usr/bin/pip3 /usr/local/bin/pip"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.07.21 15:20:39.042            nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
    21.07.21 15:20:40.480                     nuctl (I) Function deploy complete {"functionName": "pth-facebookresearch-detectron2-retinanet-r101", "httpPort": 49153}
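
    The last line of the log carries the function name and the HTTP port as a JSON payload. If you want to pick the port up in a script, the line can be parsed as sketched below. parse_deploy_complete is a hypothetical helper, and the log layout is taken from the output above and may differ in other nuctl versions.

```python
import json
import re

LOG_LINE = (
    '21.07.21 15:20:40.480                     nuctl (I) Function deploy complete '
    '{"functionName": "pth-facebookresearch-detectron2-retinanet-r101", "httpPort": 49153}'
)


def parse_deploy_complete(line):
    """Return the JSON payload of a 'Function deploy complete' line, or None."""
    match = re.search(r"Function deploy complete (\{.*\})\s*$", line)
    return json.loads(match.group(1)) if match else None


info = parse_deploy_complete(LOG_LINE)
print(info["functionName"], info["httpPort"])
```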
    

    Advanced capabilities

    Optimize using GPU

    To optimize a function for a specific device (e.g. GPU), you only need to modify the instructions above so that the function runs on the target device. In most cases only the installation instructions have to change.

    For the RetinaNet R101 function added above, the modifications look like this:

    --- function.yaml	2021-06-25 21:06:51.603281723 +0300
    +++ function-gpu.yaml	2021-07-07 22:38:53.454202637 +0300
    @@ -90,7 +90,7 @@
           ]
    
     spec:
    -  description: RetinaNet R101 from Detectron2
    +  description: RetinaNet R101 from Detectron2 optimized for GPU
       runtime: 'python:3.8'
       handler: main:handler
       eventTimeout: 30s
    @@ -108,7 +108,7 @@
             - kind: WORKDIR
               value: /opt/nuclio
             - kind: RUN
    -          value: pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
    +          value: pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
             - kind: RUN
               value: git clone https://github.com/facebookresearch/detectron2
             - kind: RUN
    @@ -120,12 +120,16 @@
    
       triggers:
         myHttpTrigger:
    -      maxWorkers: 2
    +      maxWorkers: 1
           kind: 'http'
           workerAvailabilityTimeoutMilliseconds: 10000
           attributes:
             maxRequestBodySize: 33554432 # 32MB
    
    +  resources:
    +    limits:
    +      nvidia.com/gpu: 1
    +
       platform:
         attributes:
           restartPolicy:
    

    Note: A GPU has a very limited amount of memory, so for now it is not possible to run multiple serverless functions in parallel on it using the free open-source Nuclio version on the local platform, because the scale-to-zero feature is absent. Theoretically it is possible to run different functions on different GPUs, but that requires changing the source code of the corresponding serverless functions to choose a free GPU.
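
    Choosing a free GPU is not implemented in the functions shown here. One possible approach, sketched below under stated assumptions, is to pick the GPU with the least memory in use from text in the form printed by nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits. pick_free_gpu is a hypothetical helper, and the sample output is made up for a two-GPU host.

```python
def pick_free_gpu(query_output):
    """Return the index of the GPU with the least memory in use, or None.

    query_output holds one "index, memory.used" pair per line.
    """
    best_index, best_used = None, None
    for line in query_output.strip().splitlines():
        index, used = (int(part) for part in line.split(","))
        if best_used is None or used < best_used:
            best_index, best_used = index, used
    return best_index


# Made-up sample output for a host with two GPUs (memory in MiB).
SAMPLE = "0, 10240\n1, 312\n"

gpu = pick_free_gpu(SAMPLE)
device = "cuda:{}".format(gpu) if gpu is not None else "cpu"
print(device)
```

    The resulting string could then be passed as the MODEL.DEVICE option instead of the fixed cpu value. This is only an outline of the idea: memory usage can change between the query and the model load.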

    Debugging a serverless function

    Let’s say you have a problem with your serverless function and want to debug it. Of course you can use context.logger.info or similar methods to print the intermediate state of your function. Another way is to debug using Visual Studio Code. Please see the instructions below to set up your environment step by step.

    Let’s modify our function.yaml to include the debugpy package and set maxWorkers to 1. Otherwise both workers will try to use the same port, which leads to an exception in the Python code.

            - kind: RUN
              value: pip3 install debugpy
    
      triggers:
        myHttpTrigger:
          maxWorkers: 1
    

    Change main.py to listen on a port (e.g. 5678). Insert the code below at the beginning of the file that contains the entry point.

    import debugpy
    debugpy.listen(5678)
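
    If you prefer not to edit main.py every time you switch debugging on and off, the listener can be guarded by an environment variable. This is a sketch and not part of the tutorial's code: CVAT_FUNCTION_DEBUG is a made-up variable name, and debugpy is imported lazily so that the package is only required when the variable is set.

```python
import os


def maybe_enable_debugger(port=5678):
    """Start a debugpy listener only when CVAT_FUNCTION_DEBUG=1 is set.

    CVAT_FUNCTION_DEBUG is a made-up variable name used for this sketch.
    Returns True when the listener was started, False otherwise.
    """
    if os.environ.get("CVAT_FUNCTION_DEBUG") != "1":
        return False
    import debugpy  # lazy import: only needed when debugging is requested
    debugpy.listen(port)
    return True


enabled = maybe_enable_debugger()
print("debugger enabled:", enabled)
```

    The maxWorkers: 1 requirement still applies whenever the listener is enabled.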
    

    After these changes deploy the serverless function once again. For serverless/pytorch/facebookresearch/detectron2/retinanet/nuclio/ you should run the command below:

    serverless/deploy_cpu.sh serverless/pytorch/facebookresearch/detectron2/retinanet
    

    To debug Python code inside a container you have to publish the port (5678 in this tutorial). The Nuclio deploy command doesn’t support that, so we have to work around it using SSH port forwarding.

    • Install SSH server on your host machine using sudo apt install openssh-server
    • In /etc/ssh/sshd_config host file set GatewayPorts yes
    • Restart ssh service to apply changes using sudo systemctl restart ssh.service

    The next step is to install an SSH client inside the container and run port forwarding. In the snippet below, replace user and ipaddress with the username and IP address of your host (usually the IP address starts with 192.168.). You will need to confirm that you want to connect to your host computer and enter your password. Keep the terminal open after that.

    docker exec -it nuclio-nuclio-pth-facebookresearch-detectron2-retinanet-r101 /bin/bash
    apt update && apt install -y ssh
    ssh -R 5678:localhost:5678 user@ipaddress
    

    Here is what the last command looks like in my case:

    root@2d6cceec8f70:/opt/nuclio# ssh -R 5678:localhost:5678 nmanovic@192.168.50.188
    The authenticity of host '192.168.50.188 (192.168.50.188)' can't be established.
    ECDSA key fingerprint is SHA256:0sD6IWi+FKAhtUXr2TroHqyjcnYRIGLLx/wkGaZeRuo.
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
    Warning: Permanently added '192.168.50.188' (ECDSA) to the list of known hosts.
    nmanovic@192.168.50.188's password:
    Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.8.0-53-generic x86_64)
    
     * Documentation:  https://help.ubuntu.com
     * Management:     https://landscape.canonical.com
     * Support:        https://ubuntu.com/advantage
    
    223 updates can be applied immediately.
    132 of these updates are standard security updates.
    To see these additional updates run: apt list --upgradable
    
    Your Hardware Enablement Stack (HWE) is supported until April 2025.
    Last login: Fri Jun 25 16:39:04 2021 from 172.17.0.5
    [setupvars.sh] OpenVINO environment initialized
    nmanovic@nmanovic-dl-node:~$
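
    Before attaching the debugger it can help to check from the host that the forwarded port accepts connections. A minimal sketch using only the standard library is shown below; port_is_open is a hypothetical helper, and 5678 is the port used in this tutorial.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("debug port reachable:", port_is_open("localhost", 5678))
```

    If the check fails, verify that the SSH session from the container is still open and that GatewayPorts is enabled on the host.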
    

    Finally, add the configuration below into your launch.json. Open Visual Studio Code and run Serverless Debug configuration, set a breakpoint in main.py and try to call the serverless function from CVAT UI. The breakpoint should be triggered in Visual Studio Code and it should be possible to inspect variables and debug code.

    {
      "name": "Serverless Debug",
      "type": "python",
      "request": "attach",
      "connect": {
        "host": "localhost",
        "port": 5678
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}/serverless/pytorch/facebookresearch/detectron2/retinanet/nuclio",
          "remoteRoot": "/opt/nuclio"
        }
      ]
    }
    

    VS Code debug RetinaNet

    Note: If you change the source code, you need to re-deploy the function and start port forwarding again.

    Troubleshooting

    First of all, check that you are using the recommended version of the Nuclio framework. In my case it is 1.5.16, but check the installation manual for the version recommended for your CVAT release.

    nuctl version
    
    Client version:
    "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3"
    

    Check that the Nuclio dashboard is running and that its version matches the nuctl version.

    docker ps --filter NAME=^nuclio$
    
    CONTAINER ID   IMAGE                                   COMMAND                  CREATED       STATUS                    PORTS                                               NAMES
    7ab0c076c927   quay.io/nuclio/dashboard:1.5.16-amd64   "/docker-entrypoint.…"   6 weeks ago   Up 46 minutes (healthy)   80/tcp, 0.0.0.0:8070->8070/tcp, :::8070->8070/tcp   nuclio
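
    The comparison of the two versions can be scripted. The sketch below parses the Label field from the nuctl version output and the tag of the dashboard image, both copied from the outputs above. The helper names are hypothetical, and the parsing assumes the formats shown here.

```python
import re

NUCTL_VERSION_OUTPUT = (
    'Client version:\n'
    '"Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, '
    'OS: linux, Arch: amd64, Go version: go1.14.3"'
)
DASHBOARD_IMAGE = "quay.io/nuclio/dashboard:1.5.16-amd64"


def nuctl_label(text):
    """Return the value of the Label field, or None if it is missing."""
    match = re.search(r"Label: ([\w.\-]+)", text)
    return match.group(1) if match else None


def dashboard_version(image):
    """Return the version part of an image tag such as 1.5.16-amd64."""
    tag = image.rsplit(":", 1)[-1]
    return tag.rsplit("-", 1)[0]


versions_match = nuctl_label(NUCTL_VERSION_OUTPUT) == dashboard_version(DASHBOARD_IMAGE)
print(versions_match)
```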
    

    Make sure that the container of the model which doesn’t work is running and healthy. In my case Inside Outside Guidance is not running.

    docker ps --filter NAME=iog
    
    CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
    

    Let’s run it. Go to the root of the CVAT repository and run the deploy command.

    serverless/deploy_cpu.sh serverless/pytorch/shiyinzhang/iog
    
    Deploying serverless/pytorch/shiyinzhang/iog function...
    21.07.06 12:49:08.763                     nuctl (I) Deploying function {"name": ""}
    21.07.06 12:49:08.763                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.07.06 12:49:09.085                     nuctl (I) Cleaning up before deployment {"functionName": "pth-shiyinzhang-iog"}
    21.07.06 12:49:09.162                     nuctl (I) Function already exists, deleting function containers {"functionName": "pth-shiyinzhang-iog"}
    21.07.06 12:49:09.230                     nuctl (I) Staging files and preparing base images
    21.07.06 12:49:09.232                     nuctl (I) Building processor image {"imageName": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 12:49:09.232     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.07.06 12:49:12.525     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.07.06 12:49:16.222            nuctl.platform (I) Building docker image {"image": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 12:49:16.555            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/pth.shiyinzhang.iog:latest", "registry": ""}
    21.07.06 12:49:16.555            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 12:49:16.555                     nuctl (I) Build complete {"result": {"Image":"cvat/pth.shiyinzhang.iog:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-shiyinzhang-iog","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"pytorch","min_pos_points":"1","name":"IOG","spec":"","startswith_box":"true","type":"interactor"}},"spec":{"description":"Interactive Object Segmentation with Inside-Outside Guidance","handler":"main:handler","runtime":"python:3.6","env":[{"name":"PYTHONPATH","value":"/opt/nuclio/iog"}],"resources":{},"image":"cvat/pth.shiyinzhang.iog:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/nmanovic/Workspace/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"image":"cvat/pth.shiyinzhang.iog","baseImage":"continuumio/miniconda3","directives":{"preCopy":[{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"conda create -y -n iog python=3.6"},{"kind":"SHELL","value":"[\"conda\", \"run\", \"-n\", \"iog\", \"/bin/bash\", \"-c\"]"},{"kind":"RUN","value":"conda install -y -c anaconda curl"},{"kind":"RUN","value":"conda install -y pytorch=0.4 torchvision=0.2 -c pytorch"},{"kind":"RUN","value":"conda install -y -c conda-forge pycocotools opencv scipy"},{"kind":"RUN","value":"git clone https://github.com/shiyinzhang/Inside-Outside-Guidance.git iog"},{"kind":"WORKDIR","value":"/opt/nuclio/iog"},{"kind":"ENV","value":"fileid=1Lm1hhMhhjjnNwO4Pf7SC6tXLayH2iH0l"},{"kind":"ENV","value":"filename=IOG_PASCAL_SBD.pth"},{"kind":"RUN","value":"curl -c ./cookie -s -L \"https://drive.google.com/uc?export=download\u0026id=${fileid}\""},{"kind":"RUN","value":"echo \"/download/ {print \\$NF}\" \u003e 
confirm_code.awk"},{"kind":"RUN","value":"curl -Lb ./cookie \"https://drive.google.com/uc?export=download\u0026confirm=`awk -f confirm_code.awk ./cookie`\u0026id=${fileid}\" -o ${filename}"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"ENTRYPOINT","value":"[\"conda\", \"run\", \"-n\", \"iog\"]"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.07.06 12:49:17.422     nuctl.platform.docker (W) Failed to run container {"err": "stdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n", "errVerbose": "\nError - exit status 125\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\n\nCall stack:\nstdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\nstdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n", "errCauses": [{"error": "exit status 125"}], "stdout": "1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n", "stderr": ""}
    21.07.06 12:49:17.422                     nuctl (W) Failed to create a function; setting the function status {"err": "Failed to run a Docker container", "errVerbose": "\nError - exit status 125\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\n\nCall stack:\nstdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\nFailed to run a Docker container\n    /nuclio/pkg/platform/local/platform.go:653\nFailed to run a Docker container", "errCauses": [{"error": "stdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n", "errorVerbose": "\nError - exit status 125\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\n\nCall stack:\nstdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n\n    /nuclio/pkg/cmdrunner/shellrunner.go:96\nstdout:\n1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb\ndocker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.\n\nstderr:\n", "errorCauses": 
[{"error": "exit status 125"}]}]}
    
    Error - exit status 125
        /nuclio/pkg/cmdrunner/shellrunner.go:96
    
    Call stack:
    stdout:
    1373cb432a178a3606685b5975e40a0755bc7958786c182304f5d1bbc0873ceb
    docker: Error response from daemon: driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (df68e7b4a60e553ee3079f1f1622b050cc958bd50f2cd359a20164d8a417d0ea): Bind for 0.0.0.0:49154 failed: port is already allocated.
    
    stderr:
    
        /nuclio/pkg/cmdrunner/shellrunner.go:96
    Failed to run a Docker container
        /nuclio/pkg/platform/local/platform.go:653
    Failed to deploy function
        ...//nuclio/pkg/platform/abstract/platform.go:182
      NAMESPACE |                      NAME                      | PROJECT | STATE | NODE PORT | REPLICAS
      nuclio    | openvino-dextr                                 | cvat    | ready |     49154 | 1/1
      nuclio    | pth-foolwood-siammask                          | cvat    | ready |     49155 | 1/1
      nuclio    | pth-facebookresearch-detectron2-retinanet-r101 | cvat    | ready |     49155 | 1/1
      nuclio    | pth-shiyinzhang-iog                            | cvat    | error |         0 | 1/1
    

    In this case the container was built some time ago and port 49154 was assigned to it by Nuclio. Now the port is used by openvino-dextr, as we can see in the logs. To confirm this hypothesis we just need to run a couple of docker commands:

    docker container ls -a | grep iog
    
    eb0c1ee46630   cvat/pth.shiyinzhang.iog:latest                              "conda run -n iog pr…"   9 minutes ago       Created                                                                          nuclio-nuclio-pth-shiyinzhang-iog
    
    docker inspect eb0c1ee46630 | grep 49154
    
                "Error": "driver failed programming external connectivity on endpoint nuclio-nuclio-pth-shiyinzhang-iog (02384290f91b2216162b1603322dadee426afe7f439d3d090f598af5d4863b2d): Bind for 0.0.0.0:49154 failed: port is already allocated",
                            "HostPort": "49154"
    

    To solve the problem, remove the stale container for the function (eb0c1ee46630 in this case). After that, the deploy command works as expected.

    docker container rm eb0c1ee46630
    
    eb0c1ee46630
    
    serverless/deploy_cpu.sh serverless/pytorch/shiyinzhang/iog
    
    Deploying serverless/pytorch/shiyinzhang/iog function...
    21.07.06 13:09:52.934                     nuctl (I) Deploying function {"name": ""}
    21.07.06 13:09:52.934                     nuctl (I) Building {"versionInfo": "Label: 1.5.16, Git commit: ae43a6a560c2bec42d7ccfdf6e8e11a1e3cc3774, OS: linux, Arch: amd64, Go version: go1.14.3", "name": ""}
    21.07.06 13:09:53.282                     nuctl (I) Cleaning up before deployment {"functionName": "pth-shiyinzhang-iog"}
    21.07.06 13:09:53.341                     nuctl (I) Staging files and preparing base images
    21.07.06 13:09:53.342                     nuctl (I) Building processor image {"imageName": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 13:09:53.342     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.5.16-amd64"}
    21.07.06 13:09:56.633     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
    21.07.06 13:10:00.163            nuctl.platform (I) Building docker image {"image": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 13:10:00.452            nuctl.platform (I) Pushing docker image into registry {"image": "cvat/pth.shiyinzhang.iog:latest", "registry": ""}
    21.07.06 13:10:00.452            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat/pth.shiyinzhang.iog:latest"}
    21.07.06 13:10:00.452                     nuctl (I) Build complete {"result": {"Image":"cvat/pth.shiyinzhang.iog:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-shiyinzhang-iog","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"framework":"pytorch","min_pos_points":"1","name":"IOG","spec":"","startswith_box":"true","type":"interactor"}},"spec":{"description":"Interactive Object Segmentation with Inside-Outside Guidance","handler":"main:handler","runtime":"python:3.6","env":[{"name":"PYTHONPATH","value":"/opt/nuclio/iog"}],"resources":{},"image":"cvat/pth.shiyinzhang.iog:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":2,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/home/nmanovic/Workspace/cvat/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"image":"cvat/pth.shiyinzhang.iog","baseImage":"continuumio/miniconda3","directives":{"preCopy":[{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"conda create -y -n iog python=3.6"},{"kind":"SHELL","value":"[\"conda\", \"run\", \"-n\", \"iog\", \"/bin/bash\", \"-c\"]"},{"kind":"RUN","value":"conda install -y -c anaconda curl"},{"kind":"RUN","value":"conda install -y pytorch=0.4 torchvision=0.2 -c pytorch"},{"kind":"RUN","value":"conda install -y -c conda-forge pycocotools opencv scipy"},{"kind":"RUN","value":"git clone https://github.com/shiyinzhang/Inside-Outside-Guidance.git iog"},{"kind":"WORKDIR","value":"/opt/nuclio/iog"},{"kind":"ENV","value":"fileid=1Lm1hhMhhjjnNwO4Pf7SC6tXLayH2iH0l"},{"kind":"ENV","value":"filename=IOG_PASCAL_SBD.pth"},{"kind":"RUN","value":"curl -c ./cookie -s -L \"https://drive.google.com/uc?export=download\u0026id=${fileid}\""},{"kind":"RUN","value":"echo \"/download/ {print \\$NF}\" \u003e 
confirm_code.awk"},{"kind":"RUN","value":"curl -Lb ./cookie \"https://drive.google.com/uc?export=download\u0026confirm=`awk -f confirm_code.awk ./cookie`\u0026id=${fileid}\" -o ${filename}"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"ENTRYPOINT","value":"[\"conda\", \"run\", \"-n\", \"iog\"]"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":60,"securityContext":{},"eventTimeout":"30s"}}}}
    21.07.06 13:10:01.604            nuctl.platform (I) Waiting for function to be ready {"timeout": 60}
    21.07.06 13:10:02.976                     nuctl (I) Function deploy complete {"functionName": "pth-shiyinzhang-iog", "httpPort": 49159}
      NAMESPACE |                      NAME                      | PROJECT | STATE | NODE PORT | REPLICAS
      nuclio    | openvino-dextr                                 | cvat    | ready |     49154 | 1/1
      nuclio    | pth-foolwood-siammask                          | cvat    | ready |     49155 | 1/1
      nuclio    | pth-saic-vul-fbrs                              | cvat    | ready |     49156 | 1/1
      nuclio    | pth-facebookresearch-detectron2-retinanet-r101 | cvat    | ready |     49155 | 1/1
      nuclio    | pth-shiyinzhang-iog                            | cvat    | ready |     49159 | 1/1
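
    A port conflict like the one above can also be spotted by reading the NODE PORT column of the function table. The snippet below is a minimal sketch and not part of CVAT or Nuclio: it assumes the table text has the pipe-separated layout shown in the output above, and ports_in_use is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List


def ports_in_use(table_text: str) -> Dict[int, List[str]]:
    """Map each non-zero NODE PORT to the function names that report it."""
    owners: Dict[int, List[str]] = defaultdict(list)
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows have six cells; the header row has no numeric port.
        if len(cells) != 6 or not cells[4].isdigit():
            continue
        port = int(cells[4])
        if port:  # port 0 means no port is assigned to the function
            owners[port].append(cells[1])
    return dict(owners)


# Sample rows copied from the table layout shown in the log above.
SAMPLE = """\
  NAMESPACE |                      NAME                      | PROJECT | STATE | NODE PORT | REPLICAS
  nuclio    | openvino-dextr                                 | cvat    | ready |     49154 | 1/1
  nuclio    | pth-foolwood-siammask                          | cvat    | ready |     49155 | 1/1
  nuclio    | pth-shiyinzhang-iog                            | cvat    | error |         0 | 1/1
"""

print(ports_in_use(SAMPLE).get(49154))
```

    If the port from the "port is already allocated" error appears in this mapping under another function's name, a stale container for the failing function is a likely cause.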
    

    When you investigate an issue with a serverless function, the logs are the first place to look. Run docker logs <container> for the CVAT server and for the container of the function:

    docker logs cvat
    
    2021-07-06 13:44:54,699 DEBG 'runserver' stderr output:
    [Tue Jul 06 13:44:54.699431 2021] [wsgi:error] [pid 625:tid 140010969868032] [remote 172.28.0.3:40972] [2021-07-06 13:44:54,699] ERROR django.request: Internal Server Error: /api/lambda/functions/pth-shiyinzhang-iog
    
    2021-07-06 13:44:54,700 DEBG 'runserver' stderr output:
    [Tue Jul 06 13:44:54.699712 2021] [wsgi:error] [pid 625:tid 140010969868032] [remote 172.28.0.3:40972] ERROR - 2021-07-06 13:44:54,699 - log - Internal Server Error: /api/lambda/functions/pth-shiyinzhang-iog
    
    
    docker container ls --filter name=iog
    
    CONTAINER ID   IMAGE                             COMMAND                  CREATED       STATUS                 PORTS                                         NAMES
    3b6ef9a9f3e2   cvat/pth.shiyinzhang.iog:latest   "conda run -n iog pr…"   4 hours ago   Up 4 hours (healthy)   0.0.0.0:49159->8080/tcp, :::49159->8080/tcp   nuclio-nuclio-pth-shiyinzhang-iog
    
    
    docker logs nuclio-nuclio-pth-shiyinzhang-iog
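
    Container logs can be long, so it may help to filter them for error records first. The following is only a sketch: error_lines is a hypothetical helper, and it assumes that relevant records contain the upper-case word ERROR, as the Django records in the sample output above do.

```python
import re
from typing import List

# Matches the upper-case log level only, not tags such as "[wsgi:error]".
ERROR_MARK = re.compile(r"\bERROR\b")


def error_lines(log_text: str) -> List[str]:
    """Return the stripped log lines that carry an ERROR level marker."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if ERROR_MARK.search(line)
    ]


# Possible usage (requires docker on the host, so it is left as a comment):
#   import subprocess
#   result = subprocess.run(["docker", "logs", "cvat"],
#                           capture_output=True, text=True)
#   for line in error_lines(result.stdout + result.stderr):
#       print(line)
```

    Both output streams are combined in the usage comment because docker logs relays the container's stdout and stderr separately.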
    

    If the function list shows NODE PORT 0 for a model, assign the port manually. Before you deploy the model, add the port: 32001 attribute to its function.yaml file. Each model must use a different port.

    triggers:
      myHttpTrigger:
        maxWorkers: 1
        kind: 'http'
        workerAvailabilityTimeoutMilliseconds: 10000
        attributes:
    +     port: 32001
          maxRequestBodySize: 33554432 # 32MB
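
    Because every model needs its own port, it can be worth checking the function.yaml files for collisions before deploying. This is a minimal sketch under stated assumptions: duplicate_ports is a hypothetical helper, it receives the file contents as text, and it treats every line of the form "port: <number>" as a trigger port instead of parsing the YAML properly.

```python
import re
from typing import Dict, List

# A line that consists of optional indentation followed by "port: <number>".
PORT_LINE = re.compile(r"^[ \t]*port:[ \t]*(\d+)", re.MULTILINE)


def duplicate_ports(configs: Dict[str, str]) -> Dict[int, List[str]]:
    """Return the ports that are claimed by more than one config.

    `configs` maps a label (for example the model directory) to the
    text of its function.yaml file.
    """
    seen: Dict[int, List[str]] = {}
    for name, text in configs.items():
        for match in PORT_LINE.finditer(text):
            seen.setdefault(int(match.group(1)), []).append(name)
    return {port: names for port, names in seen.items() if len(names) > 1}


EXAMPLE = {
    "model-a": "attributes:\n  port: 32001\n  maxRequestBodySize: 33554432\n",
    "model-b": "attributes:\n  port: 32001\n",
    "model-c": "attributes:\n  port: 32002\n",
}
print(duplicate_ports(EXAMPLE))
```

    An empty result means that no two of the checked files request the same port.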
    

    Installing serverless functions on Windows 10 using the Ubuntu subsystem

    If you encounter problems running serverless functions on Windows 10, you can use the Ubuntu subsystem instead. To do so:

    1. Install WSL 2 and Docker Desktop as described in the installation manual.

    2. Install Ubuntu 18.04 from Microsoft store.

    3. Enable integration for Ubuntu-18.04 in the Docker Desktop settings, on the Resources > WSL Integration tab:

      Docker WSL integration Ubuntu 18.04

    4. Then you can download and install nuctl on Ubuntu, using the automatic annotation guide.

    5. Install git and clone the repository on Ubuntu, as described in the installation manual.

    6. After that, run the commands from this tutorial through Ubuntu.

    6 - Administration

    This section contains documents for system administrators.

    6.1 - Basics

    This section contains basic documents for system administrators.

    6.1.1 - Installation Guide

    A CVAT installation guide for different operating systems.

    Quick installation guide

    Before you can use CVAT, you’ll need to get it installed. The document below contains instructions for the most popular operating systems. If your system is not covered by the document it should be relatively straightforward to adapt the instructions below for other systems.

    If you are behind a proxy server, you will probably need to modify the instructions below. Proxy configuration is an advanced topic that this quick guide does not cover; see Deploying CVAT behind a proxy under Advanced Topics.

    If you are installing CVAT from China, read the Sources for users from China section.

    Ubuntu 18.04 (x86_64/amd64)

    • Open a terminal window. If you don’t know how to open a terminal window on Ubuntu please read the answer.

    • Type commands below into the terminal window to install Docker and Docker Compose. More instructions can be found here.

      sudo apt-get update
      sudo apt-get --no-install-recommends install -y \
        apt-transport-https \
        ca-certificates \
        curl \
        gnupg-agent \
        software-properties-common
      curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
      sudo add-apt-repository \
        "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
        $(lsb_release -cs) \
        stable"
      sudo apt-get update
      sudo apt-get --no-install-recommends install -y \
        docker-ce docker-ce-cli containerd.io docker-compose-plugin
      
    • Perform post-installation steps to run docker without root permissions.

      sudo groupadd docker
      sudo usermod -aG docker $USER
      

      Log out and log back in (or reboot) so that your group membership is re-evaluated. Afterwards, run the groups command in a terminal window and check that docker appears in its output.

    • Clone CVAT source code from the GitHub repository with Git.

      The following command clones the latest develop branch:

      git clone https://github.com/opencv/cvat
      cd cvat
      

      See alternatives if you want to download one of the release versions or use the wget or curl tools.

    • To access CVAT over a network or from a different system, export the CVAT_HOST environment variable:

      export CVAT_HOST=your-ip-address
      
    • Run docker containers. It will take some time to download the latest CVAT release and other required images like postgres, redis, etc. from DockerHub and create containers.

      docker compose up -d
      
    • (Optional) Use the CVAT_VERSION environment variable to install a specific version of CVAT (e.g. v2.1.0, dev). Default behavior: dev images are pulled for the develop branch, and the corresponding release images for release versions.

      CVAT_VERSION=dev docker compose up -d
      
    • Alternative: if you want to build the images locally with unreleased changes, see the How to pull/build/update CVAT images section.

    • You can register a user, but by default it will not have rights even to view the list of tasks. You should therefore create a superuser, who can use the admin panel to assign the correct groups to other users. Use the command below:

      docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
      

      Choose a username and a password for your admin account. For more information please read Django documentation.

    • Google Chrome is the only browser that is supported by CVAT. You need to install it as well. Type commands below in a terminal window:

      curl https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
      sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
      sudo apt-get update
      sudo apt-get --no-install-recommends install -y google-chrome-stable
      
    • Open the installed Google Chrome browser and go to localhost:8080. Type your login/password for the superuser on the login page and press the Login button. Now you should be able to create a new annotation task. Please read the CVAT manual for more details.

    Windows 10

    • Install WSL2 (Windows Subsystem for Linux) by following the official guide. WSL2 requires Windows 10, version 2004 or higher. After installing WSL2, install a Linux distribution of your choice.

    • Download and install Docker Desktop for Windows. Double-click Docker for Windows Installer to run the installer. More instructions can be found here. Official guide for docker WSL2 backend can be found here. Note: Check that you are specifically using WSL2 backend for Docker.

    • In Docker Desktop, go to Settings >> Resources >> WSL Integration, and enable integration with the Linux Distribution that you chose.

    • Download and install Git for Windows. When installing the package please keep all options by default. More information about the package can be found here.

    • Download and install Google Chrome. It is the only browser which is supported by CVAT.

    • Open the Windows Start menu, find the Linux distribution you installed, and run it. You should see a terminal window.

    • Clone CVAT source code from the GitHub repository.

      The following command will clone the latest develop branch:

      git clone https://github.com/opencv/cvat
      cd cvat
      

      See alternatives if you want to download one of the release versions.

    • Run docker containers. It will take some time to download the latest CVAT release and other required images like postgres, redis, etc. from DockerHub and create containers.

      docker compose up -d
      
    • (Optional) Use the CVAT_VERSION environment variable to install a specific version of CVAT (e.g. v2.1.0, dev). Default behavior: dev images are pulled for the develop branch, and the corresponding release images for release versions.

      CVAT_VERSION=dev docker compose up -d
      
    • Alternative: if you want to build the images locally with unreleased changes, see the How to pull/build/update CVAT images section.

    • You can register a user, but by default it will not have rights even to view the list of tasks. You should therefore create a superuser, who can use the admin panel to assign the correct groups to other users. Use the command below:

      sudo docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
      

      If you don’t have winpty installed or the above command does not work, you may also try the following:

      # enter the container first
      docker exec -it cvat_server /bin/bash
      # then run
      python3 ~/manage.py createsuperuser
      

      Choose a username and a password for your admin account. For more information please read Django documentation.

    • Open the installed Google Chrome browser and go to localhost:8080. Type your login/password for the superuser on the login page and press the Login button. Now you should be able to create a new annotation task. Please read the CVAT manual for more details.

    Mac OS Mojave

    • Download Docker for Mac. Double-click Docker.dmg to open the installer, then drag Moby the whale to the Applications folder. Double-click Docker.app in the Applications folder to start Docker. More instructions can be found here.

    • There are several ways to install Git on a Mac. The easiest is probably to install the Xcode Command Line Tools. On Mavericks (10.9) or above you can do this simply by trying to run git from the Terminal the very first time.

      git --version
      

      If you don’t have it installed already, it will prompt you to install it. More instructions can be found here.

    • Download and install Google Chrome. It is the only browser which is supported by CVAT.

    • Open a terminal window. The terminal app is in the Utilities folder in Applications. To open it, either open your Applications folder, then open Utilities and double-click on Terminal, or press Command - spacebar to launch Spotlight and type “Terminal,” then double-click the search result.

    • Clone CVAT source code from the GitHub repository with Git.

      The following command will clone the latest develop branch:

      git clone https://github.com/opencv/cvat
      cd cvat
      

      See alternatives if you want to download one of the release versions or use the wget or curl tools.

    • Run docker containers. It will take some time to download the latest CVAT release and other required images like postgres, redis, etc. from DockerHub and create containers.

      docker compose up -d
      
    • (Optional) Use the CVAT_VERSION environment variable to install a specific version of CVAT (e.g. v2.1.0, dev). Default behavior: dev images are pulled for the develop branch, and the corresponding release images for release versions.

      CVAT_VERSION=dev docker compose up -d
      
    • Alternative: if you want to build the images locally with unreleased changes, see the How to pull/build/update CVAT images section.

    • You can register a user, but by default it will not have rights even to view the list of tasks. You should therefore create a superuser, who can use the admin panel to assign the correct groups to other users. Use the command below:

      docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
      

      Choose a username and a password for your admin account. For more information please read Django documentation.

    • Open the installed Google Chrome browser and go to localhost:8080. Type your login/password for the superuser on the login page and press the Login button. Now you should be able to create a new annotation task. Please read the CVAT manual for more details.

    Advanced Topics

    How to get CVAT source code

    Git (Linux, Mac, Windows)

    1. Install Git on your system if it’s not already installed

      • Ubuntu:
      sudo apt-get --no-install-recommends install -y git
      
    2. Clone CVAT source code from the GitHub repository.

      The command below will clone the default branch (develop):

      git clone https://github.com/opencv/cvat
      cd cvat
      

      To clone a specific tag, e.g. v2.1.0:

      git clone -b v2.1.0 https://github.com/opencv/cvat
      cd cvat
      

    Wget (Linux, Mac)

    To download the latest develop branch:

    wget https://github.com/opencv/cvat/archive/refs/heads/develop.zip
    unzip develop.zip && mv cvat-develop cvat
    cd cvat
    

    To download a specific tag:

    wget https://github.com/opencv/cvat/archive/refs/tags/v1.7.0.zip
    unzip v1.7.0.zip && mv cvat-1.7.0 cvat
    cd cvat
    

    Curl (Linux, Mac)

    To download the latest develop branch:

    curl -LO https://github.com/opencv/cvat/archive/refs/heads/develop.zip
    unzip develop.zip && mv cvat-develop cvat
    cd cvat
    

    To download a specific tag:

    curl -LO https://github.com/opencv/cvat/archive/refs/tags/v1.7.0.zip
    unzip v1.7.0.zip && mv cvat-1.7.0 cvat
    cd cvat
    

    CVAT healthcheck command

    The following command allows testing the CVAT container to make sure it works.

    docker exec -t cvat_server python manage.py health_check
    

    The expected output of a healthy CVAT container:

    Cache backend: default   ... working
    DatabaseBackend          ... working
    DiskUsage                ... working
    MemoryUsage              ... working
    MigrationsHealthCheck    ... working
    OPAHealthCheck           ... working
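
    For scripted monitoring, the health check output can be scanned for any check that is not reported as working. The sketch below is illustrative only: failing_checks is a hypothetical helper, and it assumes every check is printed as "<name> ... <status>" as in the expected output above, treating any status other than working as a failure.

```python
import re
from typing import List

# "<name> ... <status>", with flexible spacing around the three dots.
CHECK_LINE = re.compile(r"^\s*(?P<name>.+?)\s+\.\.\.\s+(?P<status>.+?)\s*$")


def failing_checks(output: str) -> List[str]:
    """Return the names of checks whose status is not 'working'."""
    failing = []
    for line in output.splitlines():
        match = CHECK_LINE.match(line)
        if match and match.group("status") != "working":
            failing.append(match.group("name"))
    return failing


HEALTHY = """\
Cache backend: default   ... working
DatabaseBackend          ... working
DiskUsage                ... working
MemoryUsage              ... working
MigrationsHealthCheck    ... working
OPAHealthCheck           ... working
"""

print(failing_checks(HEALTHY))
```

    The output text could be captured, for example, with subprocess.run around the docker exec command shown above.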
    

    Deploying CVAT behind a proxy

    If you deploy CVAT behind a proxy and do not plan to use any serverless functions for automatic annotation, exporting the http_proxy, https_proxy and no_proxy environment variables should be enough to build the images. Otherwise, create or edit the file ~/.docker/config.json in the home directory of the user who starts the containers, and add JSON such as the following:

    {
      "proxies": {
        "default": {
          "httpProxy": "http://proxy_server:port",
          "httpsProxy": "http://proxy_server:port",
          "noProxy": "*.test.example.com,.example2.com"
        }
      }
    }
    

    Docker then sets the corresponding proxy environment variables automatically in every container it starts. See the Docker documentation for more details.
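
    If ~/.docker/config.json already exists, the proxy block should be merged into it instead of overwriting the file. The following is a minimal sketch of such a merge on the parsed JSON; with_default_proxy is a hypothetical helper, and the proxy addresses are placeholders taken from the example above.

```python
import json
from typing import Any, Dict


def with_default_proxy(
    config: Dict[str, Any],
    http_proxy: str,
    https_proxy: str,
    no_proxy: str,
) -> Dict[str, Any]:
    """Return a copy of `config` with proxies.default set; other keys are kept."""
    merged = dict(config)
    proxies = dict(merged.get("proxies", {}))
    proxies["default"] = {
        "httpProxy": http_proxy,
        "httpsProxy": https_proxy,
        "noProxy": no_proxy,
    }
    merged["proxies"] = proxies
    return merged


# Hypothetical existing content of the file, parsed from JSON.
existing = json.loads('{"auths": {"registry.example.com": {}}}')
updated = with_default_proxy(
    existing,
    "http://proxy_server:port",
    "http://proxy_server:port",
    "*.test.example.com,.example2.com",
)
print(json.dumps(updated, indent=2))
```

    The printed JSON can then be written back to the file; the input dictionary itself is left unchanged.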

    Using the Traefik dashboard

    If you customize the docker compose files and run into unexpected issues, the Traefik dashboard can help you determine whether the problem lies in the Traefik configuration or in one of the services.

    You can enable the Traefik dashboard by uncommenting the following lines in docker-compose.yml:

    services:
      traefik:
        # Uncomment to get Traefik dashboard
        #   - "--entryPoints.dashboard.address=:8090"
        #   - "--api.dashboard=true"
        # labels:
        #   - traefik.enable=true
        #   - traefik.http.routers.dashboard.entrypoints=dashboard
        #   - traefik.http.routers.dashboard.service=api@internal
        #   - traefik.http.routers.dashboard.rule=Host(`${CVAT_HOST:-localhost}`)
    

    If you are using docker-compose.https.yml, also uncomment these lines:

    services:
      traefik:
        command:
          # Uncomment to get Traefik dashboard
          # - "--entryPoints.dashboard.address=:8090"
          # - "--api.dashboard=true"
    

    Note that this “insecure” dashboard is not recommended in production, or on any instance that is publicly available. If you want to keep the dashboard in production, read Traefik’s documentation on how to secure it properly.

    Additional components

    Semi-automatic and automatic annotation

    Please follow this guide.

    Stop all containers

    The command below stops and removes containers and networks created by up.

    docker compose down
    

    Use your own domain

    If you want to access your instance of CVAT outside of your localhost (on another domain), you should specify the CVAT_HOST environment variable, like this:

    export CVAT_HOST=<YOUR_DOMAIN>
    

    Share path

    You can use shared storage for uploading data when you create a task. To do that, you must mount the shared storage to the CVAT docker container. Example of docker-compose.override.yml for this purpose:

    services:
      cvat_server:
        volumes:
          - cvat_share:/home/django/share:ro
      cvat_worker_import:
        volumes:
          - cvat_share:/home/django/share:ro
      cvat_worker_export:
        volumes:
          - cvat_share:/home/django/share:ro
      cvat_worker_annotation:
        volumes:
          - cvat_share:/home/django/share:ro
    
    volumes:
      cvat_share:
        driver_opts:
          type: none
          device: /mnt/share
          o: bind
    

    You can change the share device path to your actual share.

    You can mount your cloud storage as a FUSE and use it later as a share.

    Email verification

    You can enable email verification for newly registered users. To do so, set the options below in the settings file; they configure Django allauth so that access is denied until the user’s email address is verified (ACCOUNT_EMAIL_VERIFICATION = 'mandatory').

    ACCOUNT_AUTHENTICATION_METHOD = 'username_email'
    ACCOUNT_CONFIRM_EMAIL_ON_GET = True
    ACCOUNT_EMAIL_REQUIRED = True
    ACCOUNT_EMAIL_VERIFICATION = 'mandatory'
    
    # Email backend settings for Django
    EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
    

    You also need to configure the Django email backend to send emails. The configuration depends on the email server you use and is not covered in this tutorial; see the Django SMTP backend configuration for details.
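
    As a rough illustration of what an SMTP configuration might look like, the fragment below uses the standard Django email settings. All values are placeholders, and the CVAT_SMTP_* environment variable names are hypothetical; they are not defined by CVAT and only serve to keep credentials out of the settings file.

```python
import os

# Standard Django SMTP settings. The host, port and sender values are
# placeholders; the CVAT_SMTP_* variable names are made up for this sketch.
EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = os.environ.get("CVAT_SMTP_HOST", "smtp.example.com")
EMAIL_PORT = int(os.environ.get("CVAT_SMTP_PORT", "587"))
EMAIL_HOST_USER = os.environ.get("CVAT_SMTP_USER", "")
EMAIL_HOST_PASSWORD = os.environ.get("CVAT_SMTP_PASSWORD", "")
EMAIL_USE_TLS = True  # STARTTLS, commonly paired with port 587
DEFAULT_FROM_EMAIL = os.environ.get("CVAT_SMTP_FROM", "cvat@example.com")
```

    Which of these settings are needed, and with which values, depends on your mail provider.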

    Deploy CVAT on the Scaleway public cloud

    Please follow this tutorial to install and set up remote access to CVAT on a Scaleway cloud instance with data in a mounted object storage bucket.

    Deploy secure CVAT instance with HTTPS

    Using Traefik, you can automatically obtain a TLS certificate for your domain from Let’s Encrypt, enabling you to use HTTPS protocol to access your website.

    To enable this, first set the CVAT_HOST (the domain of your website) and ACME_EMAIL (contact email for Let’s Encrypt) environment variables:

    export CVAT_HOST=<YOUR_DOMAIN>
    export ACME_EMAIL=<YOUR_EMAIL>
    

    Then, use the docker-compose.https.yml file to override the base docker-compose.yml file:

    docker compose -f docker-compose.yml -f docker-compose.https.yml up -d
    

    In the firewall, ports 80 and 443 must be open for inbound connections from any source.

    Then, the CVAT instance will be available at your domain on ports 443 (HTTPS) and 80 (HTTP, redirects to 443).
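
    A quick way to check whether ports 80 and 443 accept connections is a plain TCP connect. The sketch below is an illustration, not a CVAT tool: is_port_open is a hypothetical helper. To test the firewall, run it from a machine outside your network, because a check from the server itself does not pass through the firewall.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Possible usage (replace the placeholder with your own domain):
#   for port in (80, 443):
#       print(port, is_port_open("<YOUR_DOMAIN>", port))
```

    A successful connect only shows that the port is reachable; certificate problems still have to be diagnosed from the Traefik logs.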

    How to pull/build/update CVAT images

    • For CVAT versions lower than or equal to 2.1.0, you need to pull the images with docker, because the compose configuration always points to the latest image tag, e.g.:

      docker pull cvat/server:v1.7.0
      docker tag cvat/server:v1.7.0 openvino/cvat_server:latest
      
      docker pull cvat/ui:v1.7.0
      docker tag cvat/ui:v1.7.0 openvino/cvat_ui:latest
      

      For CVAT versions higher than v2.1.0, you can pull a specific version of the prebuilt images from DockerHub by setting the CVAT_VERSION environment variable (e.g. dev):

      CVAT_VERSION=dev docker compose pull
      
    • To build the images yourself, add the docker-compose.dev.yml compose file to the docker compose command. This is useful if you want to build CVAT with source code changes.

      docker compose -f docker-compose.yml -f docker-compose.dev.yml build
      
    • To update local images to the latest or dev tags, run:

      CVAT_VERSION=dev docker compose pull
      

      or

      CVAT_VERSION=latest docker compose pull
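
    The version rule from the list above can be summarized in a small helper that prints the commands to run for a given tag. This is a sketch under stated assumptions: pull_commands is a hypothetical function, it only generates command strings without executing them, it assumes the image names from the v1.7.0 example apply to every release up to 2.1.0, and it treats non-numeric tags such as dev or latest like newer releases.

```python
import re
from typing import List, Optional, Tuple


def parse_version(tag: str) -> Optional[Tuple[int, ...]]:
    """Parse tags such as 'v1.7.0'; return None for names like 'dev'."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag)
    return tuple(int(part) for part in match.groups()) if match else None


def pull_commands(tag: str) -> List[str]:
    """Return the shell commands that pull CVAT images for `tag`."""
    version = parse_version(tag)
    if version is not None and version <= (2, 1, 0):
        # Older compose files reference the 'latest' tag, so retag after pulling.
        return [
            f"docker pull cvat/server:{tag}",
            f"docker tag cvat/server:{tag} openvino/cvat_server:latest",
            f"docker pull cvat/ui:{tag}",
            f"docker tag cvat/ui:{tag} openvino/cvat_ui:latest",
        ]
    return [f"CVAT_VERSION={tag} docker compose pull"]


for command in pull_commands("v1.7.0"):
    print(command)
```

    Review the printed commands before running them in the directory that contains the compose files.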
      

    Troubleshooting

    Sources for users from China

    If you are located in China, you need to override the following sources for installation.

    • For apt update, use:

      Ubuntu mirroring help

      Pre-compiled packages:

      deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
      deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
      deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
      deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
      

      Or source packages:

      deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
      deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
      deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
      deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
      
    • Docker mirror station

      Add registry mirrors into daemon.json file:

      {
          "registry-mirrors": [
              "http://f1361db2.m.daocloud.io",
              "https://docker.mirrors.ustc.edu.cn",
              "https://hub-mirror.c.163.com",
              "https://mirror.ccs.tencentyun.com"
          ]
      }
      
    • For pip:

      PyPI mirroring help

      pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
      
    • For npm:

      npm mirroring help

      npm config set registry https://registry.npm.taobao.org/
      
    • Instead of GitHub, use Gitee:

      CVAT repository on gitee.com

    • To replace the docker.com source with an accelerated one, run:

      curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
      sudo add-apt-repository \
        "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
        $(lsb_release -cs) \
        stable"
      
    • To replace the google.com source with an accelerated one, run:

      curl https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
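
    Docker reads the daemon.json file from the Docker mirror step as strict JSON, so a trailing comma after the last entry is a syntax error, and a mirror URL with a repeated scheme is easy to miss by eye. The check below is a minimal sketch, not a Docker tool; registry_mirrors is a hypothetical helper that takes the file content as text.

```python
import json
from typing import List


def registry_mirrors(daemon_json_text: str) -> List[str]:
    """Parse daemon.json text and return its registry mirrors.

    Raises json.JSONDecodeError (a ValueError) for invalid JSON and
    ValueError for entries that do not contain exactly one '://'.
    """
    config = json.loads(daemon_json_text)
    mirrors = config.get("registry-mirrors", [])
    bad = [m for m in mirrors if not isinstance(m, str) or m.count("://") != 1]
    if bad:
        raise ValueError(f"malformed mirror URL(s): {bad}")
    return mirrors


VALID = '{"registry-mirrors": ["https://mirror.ccs.tencentyun.com"]}'
print(registry_mirrors(VALID))
```

    On Linux the file is typically located at /etc/docker/daemon.json; pass its content to the helper before restarting the Docker daemon.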
      

    HTTPS is not working because of a certificate

    If you’re having trouble with an SSL connection, check the Traefik logs to find the cause:

    docker logs traefik
    

    The logs will help you identify the problem.

    If the error is related to a firewall, then:

    • Open ports 80 and 443 for inbound connections from any source.
    • Delete acme.json. The location should be something like: /var/lib/docker/volumes/cvat_cvat_letsencrypt/_data/acme.json.

    After acme.json is removed, stop all cvat docker containers:

    docker compose -f docker-compose.yml -f docker-compose.https.yml down
    

    Make sure the variables are set (with your values):

    export CVAT_HOST=<YOUR_DOMAIN>
    export ACME_EMAIL=<YOUR_EMAIL>
    

    Then restart the containers:

    docker compose -f docker-compose.yml -f docker-compose.https.yml up -d
    

    6.1.2 - Superuser registration

    A guide to creating a superuser for a CVAT installation.

    This section is for users who want to be a bit more flexible with CVAT use.

    The user you register by default does not have full permissions on the instance, so you must create a superuser. The superuser can use the Django administration panel to assign groups (roles) to other users.
    Available roles are: user (default), admin, business, worker.

    Prerequisites

    Before you register an admin account (superuser), you need to install CVAT locally, see Installation Guide.

    The installation steps differ in part depending on the operating system (OS).

    This section starts with the Create superuser step, which is common to all operating systems.

    Register as a superuser

    During installation, you need to create a superuser:

    1. In a terminal, run the following command:
      docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
    
    2. Set up a username, email address, and password.
    3. Go to localhost:8080 and log in with the credentials from step 2.
    4. (Optional) Go to the Django administration panel to:
      • Create/edit/delete users
      • Control permissions of users and access to the tool.

    Django panel

    To manage user permissions in the Django administration panel:

    1. On the left menu click Users.
    2. On the main pane click Admin and scroll down to Permissions section.
    3. Select user groups and add/remove permissions.

    6.1.3 - AWS-Deployment Guide

    Instructions for deploying CVAT on Nvidia GPU and other AWS machines.

    There are two ways of deploying CVAT on AWS.

    1. On an Nvidia GPU machine: the Tensorflow annotation feature depends on GPU hardware. One of the easiest ways to launch CVAT with the tf-annotation app is to use AWS P3 instances, which provide NVIDIA GPUs. Read more about P3 instances here. The overall setup is explained in the main readme file, except for installing the Nvidia drivers, which you need to download and install yourself. For Amazon P3 instances, download the drivers from the Nvidia website. For more details, see the Installing the NVIDIA Driver on Linux Instances link.

    2. On any other AWS machine: follow the same installation instructions. The additional step is to add a security group and rule to allow incoming connections.

    In either case, don’t forget to set the CVAT_HOST environment variable to the exposed AWS public IP address or hostname:

    export CVAT_HOST=your-instance.amazonaws.com
    

    If you have problems using the hostname, you can use the public IPv4 address instead. On AWS or any other cloud where instances are stopped or terminated, the public IPv4 address and hostname change each time the instance is stopped and started. To handle this efficiently, avoid spot instances that cannot be stopped, since copying the EBS volume to an AMI and restarting it causes problems. When a regular instance is stopped and restarted, set the CVAT_HOST environment variable to the new hostname or IPv4 address.

    6.1.4 - REST API guide

    Instructions on how to interact with the REST API and access the Swagger documentation.

    To access the Swagger documentation, you need to be authorized.

    Automatically generated Swagger documentation for the Django REST API is available at <cvat_origin>/api/swagger (default: localhost:8080/api/swagger).

    The Swagger documentation is visible only on allowed hosts. Update the ALLOWED_HOSTS environment variable in the docker-compose.yml file with the IP address or domain name of the machine that hosts CVAT. Example: ALLOWED_HOSTS: 'localhost, 127.0.0.1'.

    When you make a request to a resource stored on a server, the server responds with the requested information. The data is transported over HTTP. Requests are divided into groups:

    • auth - user authorization queries
    • comments - requests to post/delete comments to issues
    • issues - update, delete and view problem comments
    • jobs - requests to manage the job
    • lambda - requests to work with lambda function
    • projects - project management queries
    • reviews - adding and removing the review of the job
    • server - server information requests
    • tasks - requests to manage tasks
    • users - user management queries

    The documentation also contains Models, which describe each data type using a schema object.

    Each group contains queries for different HTTP methods, such as GET, POST, PATCH, and DELETE. Different methods are highlighted in different colors. Each item has a name and a description. Clicking an item opens a form with its name, description, and either input fields for parameters or an example of JSON values.

    To find out more, read swagger specification.

    To send a test request, click Try it out and then Execute. You’ll get a response in the form of Curl, Request URL and Server response.
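
    A request can also be sent from the command line instead of the Swagger form. The sketch below assumes that the instance accepts basic authentication and uses server/about as an example path from the server group; check the Swagger page of your instance for the exact paths. The helper name cvat_api_url is hypothetical.

```shell
# Build the full URL of an API endpoint from the CVAT origin and a path
# taken from the Swagger page. One trailing slash on the origin and one
# leading slash on the path are dropped so the result has no double slashes.
cvat_api_url() {
  origin="${1%/}"
  endpoint="${2#/}"
  printf '%s/api/%s\n' "$origin" "$endpoint"
}

# Usage (replace user:pass with real credentials):
#   curl --user 'user:pass' "$(cvat_api_url http://localhost:8080 server/about)"
```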

    6.2 - Advanced

    This section contains advanced documents for system administrators.

    6.2.1 - CVAT deployment on Kubernetes with Helm

    Instructions for deploying CVAT on a Kubernetes cluster.

    Prerequisites

    1. Installed and configured Kubernetes cluster. If you do not already have a cluster, you can create one by using Minikube. See How to setup Minikube.
    2. Installed kubectl
    3. Installed Helm.
    4. Installed dependencies

    Installing dependencies

    To install and/or update run:

    helm dependency update
    

    Optional steps

    1. Ingress configuration for the Traefik ingress controller is enabled by default.

      Note for Minikube use:

      • Because Traefik creates its main service with the LoadBalancer type, which relies on the cloud provider to assign an externalIP, and this never happens on Minikube, you need to explicitly set the externalIP address for the traefik service. Add the following to the values.override.yaml file:
        traefik:
          service:
            externalIPs:
              - "your minikube IP (can be obtained with `minikube ip` command)"
        
      • Also ensure that your CVAT ingress appears in your hosts file (/etc/hosts). cvat.local is the default domain name; you can override it via values.override.yaml. You can add the entry by running this command:
        echo "$(minikube ip) cvat.local" | sudo tee -a /etc/hosts
        

    Configuration

    1. Create values.override.yaml file inside helm-chart directory.
    2. Fill values.override.yaml with new parameters for chart.
    3. Override postgresql password

    Postgresql password?

    Put the following into your values.override.yaml:

    postgresql:
      secret:
        password: <insert_password>
        postgres_password: <insert_postgres_password>
        replication_password: <insert_replication_password>
    

    Or create your own secret and use it with:

    postgresql:
      global:
        postgresql:
          existingSecret: <secret>
    

    (Optional) Enable Auto annotation feature

    Before starting, ensure that the following prerequisites are met:

    • The Nuclio CLI (nuctl) is installed. To install the CLI, simply download the appropriate CLI version to your installation machine.
    1. Set nuclio.enabled: true in your values.override.yaml

    2. Run helm dependency update in helm-chart directory

    3. Because Nuclio functions are images that need to be pushed to and pulled from a registry, you need to configure credentials for your preferred registry. Options:

      • values.override.yaml file:

        registry:
          loginUrl: someurl
          credentials:
            username: someuser
            password: somepass
        
      • Or you can create a secret with credentials as described in the guide and set registry.secretName=your-registry-credentials-secret-name in the values.override.yaml file.

      • In the case of using Minikube, you can run a local unsecured registry with minikube add-ons:

        minikube addons enable registry
        minikube addons enable registry-aliases
        

        Before Docker container images can be pushed to your newly created insecure registry, you need to add its address ($(minikube ip):5000) to the list of insecure registries to instruct Docker to accept working against it: follow the instructions in the Docker documentation

      You might also need to log into your registry account (docker login) on the installation machine before running the deployment command.

    4. Create cvat project:

      nuctl --namespace <your cvat namespace> create project cvat
      
    5. Finally, deploy the function, e.g.:

      • using minikube registry:
        nuctl deploy --project-name cvat --path serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio --registry $(minikube ip):5000 --run-registry registry.minikube
        
      • using Docker hub:
        nuctl deploy --project-name cvat --path serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio --registry docker.io/your_username
        

    Analytics

    Analytics is enabled by default. To disable it, set analytics.enabled: false in your values.override.yaml.

    Deployment

    Make sure you are using the correct Kubernetes context. You can check it with kubectl config current-context.

    Warning: The k8s service name of Open Policy Agent is fixed to opa by default. This is done to be compatible with CVAT 2.0 but limits this helm chart to a single release per namespace. The OPA URL currently can't be set as an environment variable. As soon as this is possible, you can set cvat.opa.composeCompatibleServiceName to false in your values.override.yaml and configure the OPA URL as an additional env.

    Execute the following command from the repo root directory.

    With overrides:

    helm upgrade -n <desired_namespace> <release_name> -i --create-namespace ./helm-chart -f ./helm-chart/values.yaml -f ./helm-chart/values.override.yaml

    Without overrides:

    helm upgrade -n <desired_namespace> <release_name> -i --create-namespace ./helm-chart -f ./helm-chart/values.yaml
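
    The two variants differ only in the extra -f flag, so the choice can be made automatically. The sketch below is one way to do it, assuming the override file sits next to values.yaml in the chart directory; the helper name helm_upgrade_cmd is hypothetical. It only prints the command so that it can be reviewed before it is run.

```shell
# Print the helm upgrade command for a namespace, release name and chart
# directory. The override file is added only when it exists.
helm_upgrade_cmd() {
  namespace="$1"
  release="$2"
  chart_dir="$3"
  cmd="helm upgrade -n $namespace $release -i --create-namespace $chart_dir -f $chart_dir/values.yaml"
  if [ -f "$chart_dir/values.override.yaml" ]; then
    cmd="$cmd -f $chart_dir/values.override.yaml"
  fi
  printf '%s\n' "$cmd"
}

# Usage from the repo root directory:
#   helm_upgrade_cmd <desired_namespace> <release_name> ./helm-chart
```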

    Post-deployment configuration

    1. Create super user

    How to create superuser?

    HELM_RELEASE_NAMESPACE="<desired_namespace>" &&\
    HELM_RELEASE_NAME="<release_name>" &&\
    BACKEND_POD_NAME=$(kubectl get pod --namespace $HELM_RELEASE_NAMESPACE -l tier=backend,app.kubernetes.io/instance=$HELM_RELEASE_NAME -o jsonpath='{.items[0].metadata.name}') &&\
    kubectl exec -it --namespace $HELM_RELEASE_NAMESPACE $BACKEND_POD_NAME -c cvat-backend-app-container -- python manage.py createsuperuser
    

    FAQ

    What is Kubernetes and how does it work?

    See https://kubernetes.io/

    What is Helm and how does it work?

    See https://helm.sh/

    How to setup Minikube

    1. Please follow the official Minikube installation guide
    2. minikube start --addons registry,registry-aliases
      

    How to see what changes ‘helm upgrade’ will make?

    You can use https://github.com/databus23/helm-diff#install for that.

    I want to use my own postgresql with your chart.

    Just set postgresql.enabled to false in the override file, then put the parameters of your database instance in the external field. You may also need to configure username, database and password fields to connect to your own database:

    postgresql:
      enabled: false
      external:
        host: postgresql.default.svc.cluster.local
        port: 5432
      auth:
        username: cvat
        database: cvat
      secret:
        password: cvat_postgresql
    

    In the example above, the corresponding secret will be created automatically. If you want to use an existing secret, change secret.create to false and set the name of the existing secret:

    postgresql:
      enabled: false
      external:
        host: postgresql.default.svc.cluster.local
        port: 5432
      secret:
        create: false
        name: "my-postgresql-secret"
    

    The secret must contain the database, username and password keys required to access the database, like:

    apiVersion: v1
    kind: Secret
    metadata:
      name: "my-postgresql-secret"
      namespace: default
    type: generic
    stringData:
      database: cvat
      username: cvat
      password: secretpassword
    

    I want to use my own redis with your chart.

    Just set redis.enabled to false in the override file, then put the parameters of your Redis instance in the external field. You may also need to configure password field to connect to your own Redis:

    redis:
      enabled: false
      external:
        host: redis.hostname.local
      secret:
        password: cvat_redis
    

    In the above example the corresponding secret will be created automatically, but if you want to use an existing secret change secret.create to false and set name of the existing secret:

    redis:
      enabled: false
      external:
        host: redis.hostname.local
      secret:
        create: false
        name: "my-redis-secret"
    

    The secret must contain the redis-password key like:

    apiVersion: v1
    kind: Secret
    metadata:
      name: "my-redis-secret"
      namespace: default
    type: generic
    stringData:
      redis-password: secretpassword
    

    I want to override some settings in values.yaml.

    Just create a values.override.yaml file and place your changes there, using the same structure as in values.yaml. Then reference it in the helm upgrade/install command using the -f flag.

    Why did you use external charts to provide redis and postgres?

    Because their maintainers know these components better than we do, so we get higher quality with less support effort.

    How to use custom domain name with k8s deployment:

    The default value cvat.local may be overridden with --set ingress.hosts[0].host option like this:

    helm upgrade -n default cvat -i --create-namespace helm-chart -f helm-chart/values.yaml -f helm-chart/values.override.yaml --set ingress.hosts[0].host=YOUR_FQDN
    

    How to fix a helm upgrade failure caused by the “field is immutable” error?

    You may see an error message like this:

    Error: UPGRADE FAILED:cannot patch "cvat-backend-server" with kind Deployment: Deployment.apps "cvat-backend-server" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"cvat-app", "app.kubernetes.io/instance":"cvat", "app.kubernetes.io/managed-by":"Helm", "app.kubernetes.io/name":"cvat", "app.kubernetes.io/version":"latest", "component":"server", "helm.sh/chart":"cvat", "tier":"backend"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
    

    To fix that, delete the CVAT Deployments before upgrading:

    kubectl delete deployments --namespace=foo -l app=cvat-app
    

    How to use existing PersistentVolume to store CVAT data instead of default storage

    It is assumed that you have created a PersistentVolumeClaim named my-claim-name and a PersistentVolume that backs the claim. Claims must exist in the same namespace as the Pod using the claim. For details, see the Kubernetes documentation on persistent volumes. Add these values to values.override.yaml:

    cvat:
      backend:
        permissionFix:
          enabled: false
        defaultStorage:
          enabled: false
        server:
          additionalVolumes:
            - name: cvat-backend-data
              persistentVolumeClaim:
                claimName: my-claim-name
        worker:
          export:
            additionalVolumes:
              - name: cvat-backend-data
                persistentVolumeClaim:
                  claimName: my-claim-name
          import:
            additionalVolumes:
              - name: cvat-backend-data
                persistentVolumeClaim:
                  claimName: my-claim-name
          annotation:
            additionalVolumes:
              - name: cvat-backend-data
                persistentVolumeClaim:
                  claimName: my-claim-name
        utils:
          additionalVolumes:
            - name: cvat-backend-data
              persistentVolumeClaim:
                claimName: my-claim-name
    
    

    6.2.2 - Semi-automatic and Automatic Annotation

    Information about the installation of components needed for semi-automatic and automatic annotation.

    ⚠ WARNING: Do not use docker compose up. If you did, make sure all containers are stopped by running docker compose down.

    • To bring up CVAT with the auto annotation tool, run the following from the CVAT root directory:

      docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d
      

      If you made any changes to the Docker Compose files, make sure to add --build at the end.

      To stop the containers, simply run:

      docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml down
      
    • You have to install the nuctl command line tool to build and deploy serverless functions. Download version 1.11.24. It is important that the version you download matches the version in docker-compose.serverless.yml. For example, using wget:

      wget https://github.com/nuclio/nuclio/releases/download/<version>/nuctl-<version>-linux-amd64
      

      After downloading nuctl, make it executable and create a symbolic link:

      sudo chmod +x nuctl-<version>-linux-amd64
      sudo ln -sf $(pwd)/nuctl-<version>-linux-amd64 /usr/local/bin/nuctl
      
    • Deploy a couple of functions. This will automatically create a cvat Nuclio project to contain the functions. Run the commands below only after CVAT has been installed using docker compose, because it runs the nuclio dashboard, which manages all serverless functions.

      ./serverless/deploy_cpu.sh serverless/openvino/dextr
      ./serverless/deploy_cpu.sh serverless/openvino/omz/public/yolo-v3-tf
      

      GPU Support

      You will need to install Nvidia Container Toolkit. You will also need to add --resource-limit nvidia.com/gpu=1 --triggers '{"myHttpTrigger": {"maxWorkers": 1}}' to the nuclio deployment command. You can increase maxWorkers if you have enough GPU memory. As an example, the command below will run on the GPU:

      nuctl deploy --project-name cvat \
        --path serverless/tensorflow/matterport/mask_rcnn/nuclio \
        --platform local --base-image tensorflow/tensorflow:1.15.5-gpu-py3 \
        --desc "GPU based implementation of Mask RCNN on Python 3, Keras, and TensorFlow." \
        --image cvat/tf.matterport.mask_rcnn_gpu \
        --triggers '{"myHttpTrigger": {"maxWorkers": 1}}' \
        --resource-limit nvidia.com/gpu=1
      

      Note:

      • The number of functions deployed on the GPU is limited by your GPU memory.
      • See deploy_gpu.sh script for more examples.
      • For some models (namely SiamMask) you need an Nvidia driver version greater than or equal to 450.80.02.

      Note for Windows users:

      If you want to use nuclio under Windows CVAT installation you should install Nvidia drivers for WSL according to this instruction and follow the steps up to “2.3 Installing Nvidia drivers”. Important requirement: you should have the latest versions of Docker Desktop, Nvidia drivers for WSL, and the latest updates from the Windows Insider Preview Dev channel.

    Troubleshooting Nuclio Functions:

    • You can open the nuclio dashboard at localhost:8070. Make sure your functions are up and running without any errors.

    • Test your deployed DL model as a serverless function. The command below should work on Linux and Mac OS.

      image=$(curl https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png --output - | base64 | tr -d '\n')
      cat << EOF > /tmp/input.json
      {"image": "$image"}
      EOF
      cat /tmp/input.json | nuctl invoke openvino-omz-public-yolo-v3-tf -c 'application/json'
      
      20.07.17 12:07:44.519    nuctl.platform.invoker (I) Executing function {"method": "POST", "url": "http://:57308", "headers": {"Content-Type":["application/json"],"X-Nuclio-Log-Level":["info"],"X-Nuclio-Target":["openvino-omz-public-yolo-v3-tf"]}}
      20.07.17 12:07:45.275    nuctl.platform.invoker (I) Got response {"status": "200 OK"}
      20.07.17 12:07:45.275                     nuctl (I) >>> Start of function logs
      20.07.17 12:07:45.275 ino-omz-public-yolo-v3-tf (I) Run yolo-v3-tf model {"worker_id": "0", "time": 1594976864570.9353}
      20.07.17 12:07:45.275                     nuctl (I) <<< End of function logs
      
      > Response headers:
      Date = Fri, 17 Jul 2020 09:07:45 GMT
      Content-Type = application/json
      Content-Length = 100
      Server = nuclio
      
      > Response body:
      [
          {
              "confidence": "0.9992254",
              "label": "person",
              "points": [
                  39,
                  124,
                  408,
                  512
              ],
              "type": "rectangle"
          }
      ]
      
    • To check for internal server errors, run docker ps -a to see the list of containers. Find the container that you are interested in, e.g., nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu. Then check its logs with docker logs <name of your container>, e.g.,

      docker logs nuclio-nuclio-tf-faster-rcnn-inception-v2-coco-gpu
      
    • To debug code inside a container, you can use vscode to attach to the container (see instructions). To apply your changes, make sure to restart the container.

      docker restart <name_of_the_container>
      

    6.2.3 - CVAT Analytics and monitoring

    Instructions for deployment and customization of analytics and monitoring.

    The CVAT Analytics suite of tools is designed to track and understand user behavior and system performance, and to identify potential issues in your application.

    You can also visualize user activity through Grafana, and aggregate user working time by the jobs.

    Gathered logs can be additionally filtered for efficient debugging.

    By using analytics, you’ll gain valuable insights to optimize your system and enhance user satisfaction.

    CVAT analytics are available from the top menu.

    CVAT Analytics

    Note: CVAT analytics and monitoring are available only for on-prem solution.

    High-level architecture

    The CVAT analytics is based on Vector, ClickHouse, and Grafana.

    CVAT Analytics

    CVAT and its analytics module can be set up locally. For self-hosted solutions, analytics are enabled by default.

    For detailed instructions for CVAT installation, see Installation Guide or refer to the CVAT Course for installation videos.

    All analytics-related features will be launched when you start CVAT containers with the following command:

    docker compose up -d
    

    Ports settings

    If you cannot access analytics in the development environment, see Analytics Ports.

    Events log structure

    Relational database schema with the following fields:

    Field        Description
    scope        Scope of the event (e.g., zoomin:image, add:annotations, delete:image, update:assignee).
    obj_name     Object name or None (e.g., task, job, cloudstorage, model, organization).
    obj_id       Object identifier as in DB or None.
    obj_val      Value for the event as string or None (e.g., frame number, number of added annotations).
    source       Who generates the log event (e.g., server, ui).
    timestamp    Local event time (in general for UI and server, the time is different).
    count        How many times in a row it occurs.
    duration     How much time it takes (it can be 0 for events without duration).
    project_id   Project ID or None.
    task_id      Task ID or None.
    job_id       Job ID or None.
    user_id      User ID or None.
    user_name    User name or None.
    user_email   User email or None.
    org_id       Organization ID or None.
    org_slug     Organization slug or None.
    payload      JSON payload or None. Extra fields can be added to the JSON blob.

    Types of supported events

    Supported events change the scope of information displayed in Grafana.

    Supported Events

    Server events:

    • create:project, update:project, delete:project

    • create:task, update:task, delete:task

    • create:job, update:job, delete:job

    • create:organization, update:organization, delete:organization

    • create:user, update:user, delete:user

    • create:cloudstorage, update:cloudstorage, delete:cloudstorage

    • create:issue, update:issue, delete:issue

    • create:comment, update:comment, delete:comment

    • create:annotations, update:annotations, delete:annotations

    • create:label, update:label, delete:label

    Client events:

    • load:cvat

    • load:job, save:job, restore:job

    • upload:annotations

    • send:exception

    • send:task_info

    • draw:object, paste:object, copy:object, propagate:object, drag:object, resize:object, delete:object, lock:object, merge:objects

    • change:attribute

    • change:label

    • change:frame

    • move:image, zoom:image, fit:image, rotate:image

    • action:undo, action:redo

    • press:shortcut

    • send:debug_info

    • click:element

    Request id for tracking

    Note that every response to an API request made to the server includes a header named X-Request-Id, for example: X-Request-Id: 6a2b7102-c4b9-4d57-8754-5658132ba37d.

    This identifier is also recorded in all server events that occur as a result of the respective request.

    For example, when an operation to create a task is performed, other related entities such as labels and attributes are generated on the server in addition to the Task object.

    All events associated with this operation will have the same request_id in the payload field.
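
    To correlate a request with its server events, the identifier can be read from the response headers. The sketch below is a minimal example that assumes the headers come from curl (for example, curl -sI or curl -s -D -); the helper name extract_request_id is hypothetical.

```shell
# Read HTTP response headers from standard input and print the value of the
# X-Request-Id header. Header names are matched case-insensitively, and the
# carriage returns that end HTTP header lines are removed first.
extract_request_id() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-request-id" { print $2; exit }'
}

# Usage (replace user:pass and the URL with real values):
#   curl -sI --user 'user:pass' 'https://app.cvat.ai/api/events?job_id=123' | extract_request_id
```

    The printed value can then be compared with the request_id stored in the payload field of the events.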

    Fetching event data as CSV from the /api/events endpoint

    The /api/events endpoint allows the fetching of event data with filtering parameters such as org_id, project_id, task_id, job_id, and user_id.

    For more details, see Swagger API Documentation.

    For example, to fetch all events associated with a specific job, the following curl command can be used:

    curl --user 'user:pass' 'https://app.cvat.ai/api/events?job_id=123'
    

    In the response, you will receive a query ID:

    { "query_id": "150cac1f-09f1-4d73-b6a5-5f47aa5d0031" }
    

    As this process may take some time to complete, the status of the request can be checked by adding the query parameter query_id to the request:

    curl -I --user 'user:pass' 'https://app.cvat.ai/api/events?job_id=123&query_id=150cac1f-09f1-4d73-b6a5-5f47aa5d0031'
    

    Upon successful creation, the server will return a 201 Created status:

    HTTP/2 201
    allow: GET, POST, HEAD, OPTIONS
    date: Tue, 16 May 2023 13:38:42 GMT
    referrer-policy: same-origin
    server: Apache
    vary: Accept,Origin,Cookie
    x-content-type-options: nosniff
    x-frame-options: DENY
    x-request-id: 4631f5fa-a4f0-42a8-b77b-7426fc298a85
    

    The CSV file can be downloaded by adding the action=download query parameter to the request:

    curl --user 'user:pass' 'https://app.cvat.ai/api/events?job_id=123&query_id=150cac1f-09f1-4d73-b6a5-5f47aa5d0031&action=download' > /tmp/events.csv
    

    This will download and save the file to /tmp/events.csv on your local machine.
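
    The three requests can be chained in a small script. The sketch below is one possible way to do it, not an official client: the helper names, the CVAT_AUTH variable (holding user:pass), the 5-second polling interval and the limit of 60 attempts are all assumptions. Each URL is passed to curl in quotes, because an unquoted & would be interpreted by the shell.

```shell
# Extract the query_id value from the JSON response of the first request.
parse_query_id() {
  sed -n 's/.*"query_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Build the events URL: base URL, job ID, optional query ID, optional action.
events_url() {
  url="$1/api/events?job_id=$2"
  if [ -n "${3:-}" ]; then url="$url&query_id=$3"; fi
  if [ -n "${4:-}" ]; then url="$url&action=$4"; fi
  printf '%s\n' "$url"
}

# Request the export, wait for the 201 Created status, then download the CSV.
fetch_events_csv() {
  base="$1"; job="$2"; out="$3"
  query_id="$(curl -s --user "$CVAT_AUTH" "$(events_url "$base" "$job")" | parse_query_id)"
  tries=0
  code=""
  while [ "$tries" -lt 60 ]; do
    code="$(curl -s -o /dev/null -I -w '%{http_code}' --user "$CVAT_AUTH" \
      "$(events_url "$base" "$job" "$query_id")")"
    if [ "$code" = "201" ]; then break; fi
    tries=$((tries + 1))
    sleep 5
  done
  if [ "$code" != "201" ]; then return 1; fi
  curl -s --user "$CVAT_AUTH" "$(events_url "$base" "$job" "$query_id" download)" -o "$out"
}

# Usage:
#   CVAT_AUTH='user:pass' fetch_events_csv https://app.cvat.ai 123 /tmp/events.csv
```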

    Dashboards

    By default, three dashboards are available in CVAT.

    To access them, click General. You will be taken to the page with the available dashboards.

    List of dashboards

    Dashboard    Description
    All Events   Dashboard that shows all event logs, timestamps, and source.
    Management   Dashboard with information about user activities such as working time by job and so on.
    Monitoring   Dashboard showing server logs, including errors.

    Dashboard: All Events

    The dashboard shows all events, their timestamps, and their source.

    Dashboard: All Events

    Element           Description
    Filters           Can be used as drop-down lists or search fields. Click on the arrow to activate.
    Overall activity  Graph that shows the overall activity by the selected filters.
    Scope             Users' activity, see Types of supported events.
    obj_name          Object or item related to the Scope.
    obj_id            Object’s ID. Might be empty.
    source            Source of the event, can be client or server.
    timestamp         Time when the event happened.
    count             Common field for all events, not null where it makes sense, for example, the number of saved objects in an annotation.
    duration          Duration in milliseconds.
    project_id        ID of the project.
    task_id           ID of the task.
    job_id            ID of the job.

    At the bottom of the dashboard, there are two fields with statistics about the browsers and operating systems that users use.

    Click on the column name to enable a filter.

    If you want to inspect the value, hover over it and click on the eye icon.

    Dashboard: Management

    The dashboard shows user activity.

    Dashboard: Management

    Element          Description
    Filters          Can be used as drop-down lists or search fields. Click on the arrow to activate.
    User activity    Graph that shows when the user was active (date and time). Click on the user ID below to see the graph for that user.
    Overall activity Graph that shows common activity for all users.
    User             User ID.
    Project          Project ID. Might be empty.
    Task             Task ID. Might be empty.
    Job              Job ID. Might be empty.
    Working time(h)  Time spent on task in hours.
    Activity         Number of events for each user.

    Click on the column name to enable a filter.

    If you want to inspect the value, hover over it and click on the eye icon.

    Dashboard: Monitoring

    The dashboard shows server logs, helps handle errors, and shows user activity.

    Dashboard: Monitoring

    Element             Description
    Filters             Can be used as drop-down lists or search fields. Click on the arrow to activate.
    Active users (now)  Number of active users on an instance.
    Overall activity    Graph that shows the number of active users.
    Exceptions          Graph that shows the number of errors that happened in the instance.
    timestamp           Time when the error happened.
    user_id             User ID.
    user_name           User nickname.
    project_id          ID of the project. Might be empty.
    task_id             Task ID. Might be empty.
    job_id              Job ID. Might be empty.
    error               Error description.
    stack               Stack trace, which is a report of the active stack frames at a certain point in time during the execution. This information is typically used for debugging purposes to locate where an issue occurred.
    payload             JSON that describes the entire object, which contains several properties. This data in the payload is related to an event that was created as a result of a failed API request. The payload contains information about this event.

    Click on the column name to enable a filter.

    If you want to inspect the value, hover over it and click on the eye icon.

    Dashboards setup

    You can adjust the dashboards. To do this, click on the graph or table name and from the drop-down menu select Edit.

    Adjust the query in the editor.

    Dashboard: look and feel

    Example of query:

    SELECT
        time,
        uniqExact(user_id) Users
    FROM
    (
        SELECT
          user_id,
          toStartOfInterval(timestamp, INTERVAL 15 minute) as time
        FROM cvat.events
        WHERE
          user_id IS NOT NULL
        GROUP BY
          user_id,
          time
        ORDER BY time ASC WITH FILL STEP toIntervalMinute(15)
    )
    GROUP BY time
    ORDER BY time
    

    Note that by default, the updated configuration will not be saved and will be reset to the default parameters after you restart the container.

    To save the updated configuration, do the following:

    1. Update Configuration: Start by making your desired changes in the query.

    2. Apply Changes: Once you’ve made your changes, click the Apply button to ensure the changes are implemented.

      Apply changes

    3. Save Configuration: To save your applied changes, on the top of the dashboard, click the Save button.

    4. Replace Configuration File: After saving, replace the existing Grafana dashboard configuration file, located at components/analytics/grafana/dashboards, with the new JSON configuration file.

    5. Restart Grafana Service: To ensure that all changes take effect, restart the Grafana service. If you’re using Docker Compose, execute the following command: docker compose restart cvat_grafana.

    For more information, see Grafana Dashboards.

    Example of use

    This video demonstrates the CVAT analytics features that are available by default.

    6.2.4 - Mounting cloud storage

    Instructions on how to mount AWS S3 bucket, Microsoft Azure container or Google Drive as a filesystem.

    AWS S3 bucket as filesystem

    Ubuntu 20.04

    Mount

    1. Install s3fs:

      sudo apt install s3fs
      
    2. Enter your credentials in a file ${HOME}/.passwd-s3fs and set owner-only permissions:

      echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs
      chmod 600 ${HOME}/.passwd-s3fs
      
    3. Uncomment user_allow_other in the /etc/fuse.conf file: sudo nano /etc/fuse.conf

    4. Run s3fs, replace bucket_name, mount_point:

      s3fs <bucket_name> <mount_point> -o allow_other -o passwd_file=${HOME}/.passwd-s3fs
      

    For more details see here.
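
    Step 3 can be scripted instead of edited by hand. The sketch below assumes that the option is present in the file as a commented-out line (#user_allow_other), which is the usual default; the helper name enable_user_allow_other is hypothetical. It rewrites the file through a temporary copy, so it does not depend on the in-place option of a particular sed implementation.

```shell
# Uncomment the user_allow_other line in a FUSE configuration file.
# Lines that do not match are left unchanged, so running it twice is harmless.
enable_user_allow_other() {
  conf="$1"
  tmp="$(mktemp)"
  sed 's/^#[[:space:]]*user_allow_other[[:space:]]*$/user_allow_other/' "$conf" > "$tmp" \
    && cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Usage (must be run as root to modify the system file):
#   enable_user_allow_other /etc/fuse.conf
```

    Writing the result back with cat keeps the ownership and permissions of the original file.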

    Automatically mount

    Follow the first 3 mounting steps above.

    Using fstab
    1. Create a bash script named aws_s3_fuse (e.g. in /usr/bin, as root) with this content (replace user_name with the user on whose behalf the disk will be mounted, bucket_name, mount_point, /path/to/.passwd-s3fs):

      #!/bin/bash
      sudo -u <user_name> s3fs <bucket_name> <mount_point> -o passwd_file=/path/to/.passwd-s3fs -o allow_other
      exit 0
      
    2. Give it the execution permission:

      sudo chmod +x /usr/bin/aws_s3_fuse
      
    3. Edit /etc/fstab, adding a line like this (replace mount_point):

      /absolute/path/to/aws_s3_fuse  <mount_point>     fuse    allow_other,user,_netdev     0       0
      
    Using systemd
    1. Create unit file sudo nano /etc/systemd/system/s3fs.service (replace user_name, bucket_name, mount_point, /path/to/.passwd-s3fs):

      [Unit]
      Description=FUSE filesystem over AWS S3 bucket
      After=network.target
      
      [Service]
      Environment="MOUNT_POINT=<mount_point>"
      User=<user_name>
      Group=<user_name>
      ExecStart=s3fs <bucket_name> ${MOUNT_POINT} -o passwd_file=/path/to/.passwd-s3fs -o allow_other
      ExecStop=fusermount -u ${MOUNT_POINT}
      Restart=always
      Type=forking
      
      [Install]
      WantedBy=multi-user.target
      
    2. Update the system configurations, enable unit autorun when the system boots, mount the bucket:

      sudo systemctl daemon-reload
      sudo systemctl enable s3fs.service
      sudo systemctl start s3fs.service
      

    Check

    The /etc/mtab file contains records of currently mounted filesystems.

    cat /etc/mtab | grep 's3fs'
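
    In a script, it can be more convenient to check one specific mount point than to read the grep output. The sketch below assumes the usual mtab layout, in which the second whitespace-separated field of each record is the mount point; the helper name is_mounted is hypothetical.

```shell
# Succeed (exit status 0) when the given mount point appears in the mount
# table, fail otherwise. The second argument is optional and defaults to
# /etc/mtab.
is_mounted() {
  mount_point="$1"
  mtab="${2:-/etc/mtab}"
  awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$mtab"
}

# Usage:
#   if is_mounted <mount_point>; then echo "bucket is mounted"; fi
```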
    

    Unmount filesystem

    fusermount -u <mount_point>
    

    If you used systemd to mount a bucket:

    sudo systemctl stop s3fs.service
    sudo systemctl disable s3fs.service
    

    Microsoft Azure container as filesystem

    Ubuntu 20.04

    Mount

    1. Set up the Microsoft package repository. (More here)

      wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb
      sudo dpkg -i packages-microsoft-prod.deb
      sudo apt-get update
      
    2. Install blobfuse and fuse:

      sudo apt-get install blobfuse fuse
      

      For more details see here

    3. Set the environment variables (replace account_name, account_key, mount_point):

      export AZURE_STORAGE_ACCOUNT=<account_name>
      export AZURE_STORAGE_ACCESS_KEY=<account_key>
      MOUNT_POINT=<mount_point>
      
    4. Create a folder for cache:

      sudo mkdir -p /mnt/blobfusetmp
      
    5. Make sure the cache folder is owned by the user who mounts the container:

      sudo chown <user> /mnt/blobfusetmp
      
    6. Create the mount point, if it doesn’t exist:

      mkdir -p ${MOUNT_POINT}
      
    7. Uncomment user_allow_other in the /etc/fuse.conf file: sudo nano /etc/fuse.conf

    8. Mount the container (replace your_container):

      blobfuse ${MOUNT_POINT} --container-name=<your_container> --tmp-path=/mnt/blobfusetmp -o allow_other
      

    Automatically mount

    Follow the first 7 mounting steps above.

    Using fstab
    1. Create a configuration file connection.cfg with the content below. Change accountName, keep either accountKey or sasToken, and replace the placeholders with your values:

      accountName <account-name-here>
      # Please provide either an account key or a SAS token, and delete the other line.
      accountKey <account-key-here-delete-next-line>
      #change authType to specify only 1
      sasToken <shared-access-token-here-delete-previous-line>
      authType <MSI/SAS/SPN/Key/empty>
      containerName <insert-container-name-here>
      
    2. Create a bash script named azure_fuse (e.g. in /usr/bin, as root) with the content below (replace user_name with the user on whose behalf the disk will be mounted, mount_point, /path/to/blobfusetmp, /path/to/connection.cfg):

      #!/bin/bash
      sudo -u <user_name> blobfuse <mount_point> --tmp-path=/path/to/blobfusetmp  --config-file=/path/to/connection.cfg -o allow_other
      exit 0
      
    3. Give it the execution permission:

      sudo chmod +x /usr/bin/azure_fuse
      
    4. Edit /etc/fstab with the blobfuse script. Add the following line (replace paths):

      /absolute/path/to/azure_fuse </path/to/desired/mountpoint> fuse allow_other,user,_netdev
      
    Using systemd
    1. Create a unit file with sudo nano /etc/systemd/system/blobfuse.service (replace user_name, mount_point, container_name, /path/to/connection.cfg):

      [Unit]
      Description=FUSE filesystem over Azure container
      After=network.target
      
      [Service]
      Environment="MOUNT_POINT=<mount_point>"
      User=<user_name>
      Group=<user_name>
      ExecStart=blobfuse ${MOUNT_POINT} --container-name=<container_name> --tmp-path=/mnt/blobfusetmp --config-file=/path/to/connection.cfg -o allow_other
      ExecStop=fusermount -u ${MOUNT_POINT}
      Restart=always
      Type=forking
      
      [Install]
      WantedBy=multi-user.target
      
    2. Update the system configurations, enable unit autorun when the system boots, mount the container:

      sudo systemctl daemon-reload
      sudo systemctl enable blobfuse.service
      sudo systemctl start blobfuse.service
      

      For more details see here

    Check

    The /etc/mtab file contains records of currently mounted filesystems.

    cat /etc/mtab | grep 'blobfuse'
    

    Unmount filesystem

    fusermount -u <mount_point>
    

    If you used systemd to mount a container:

    sudo systemctl stop blobfuse.service
    sudo systemctl disable blobfuse.service
    

    If you have any mounting problems, check out the answers to common problems

    Google Drive as filesystem

    Ubuntu 20.04

    Mount

    To mount Google Drive as a filesystem in user space (FUSE), you can use google-drive-ocamlfuse. To do this, follow the instructions below:

    1. Install google-drive-ocamlfuse:

      sudo add-apt-repository ppa:alessandro-strada/ppa
      sudo apt-get update
      sudo apt-get install google-drive-ocamlfuse
      
    2. Run google-drive-ocamlfuse without parameters:

      google-drive-ocamlfuse
      

      This command will create the default application directory (~/.gdfuse/default), containing the configuration file config (see the wiki page for more details about configuration). It will also start a web browser to obtain authorization to access your Google Drive. This will let you modify the default configuration before mounting the filesystem.

      Then you can choose a local directory to mount your Google Drive (e.g.: ~/GoogleDrive).

    3. Create the mount point, if it doesn’t exist (replace mount_point):

      mountpoint="<mount_point>"
      mkdir -p $mountpoint
      
    4. Uncomment user_allow_other in the /etc/fuse.conf file: sudo nano /etc/fuse.conf

    5. Mount the filesystem:

      google-drive-ocamlfuse -o allow_other $mountpoint
      

    Automatically mount

    Follow the first 4 mounting steps above.

    Using fstab
    1. Create a bash script named gdfuse (e.g. in /usr/bin, as root) with this content (replace user_name with the user on whose behalf the disk will be mounted, label, mount_point):

      #!/bin/bash
      sudo -u <user_name> google-drive-ocamlfuse -o allow_other -label <label> <mount_point>
      exit 0
      
    2. Give it the execution permission:

      sudo chmod +x /usr/bin/gdfuse
      
    3. Edit /etc/fstab, adding a line like this (replace mount_point):

      /absolute/path/to/gdfuse  <mount_point>     fuse    allow_other,user,_netdev     0       0
      

      For more details see here

    Using systemd
    1. Create a unit file with sudo nano /etc/systemd/system/google-drive-ocamlfuse.service (replace user_name, label (the default label is default), mount_point):

      [Unit]
      Description=FUSE filesystem over Google Drive
      After=network.target
      
      [Service]
      Environment="MOUNT_POINT=<mount_point>"
      User=<user_name>
      Group=<user_name>
      ExecStart=google-drive-ocamlfuse -label <label> ${MOUNT_POINT}
      ExecStop=fusermount -u ${MOUNT_POINT}
      Restart=always
      Type=forking
      
      [Install]
      WantedBy=multi-user.target
      
    2. Update the system configurations, enable unit autorun when the system boots, mount the drive:

      sudo systemctl daemon-reload
      sudo systemctl enable google-drive-ocamlfuse.service
      sudo systemctl start google-drive-ocamlfuse.service
      

      For more details see here

    Check

    The /etc/mtab file contains records of currently mounted filesystems.

    cat /etc/mtab | grep 'google-drive-ocamlfuse'
    

    Unmount filesystem

    fusermount -u <mount_point>
    

    If you used systemd to mount a drive:

    sudo systemctl stop google-drive-ocamlfuse.service
    sudo systemctl disable google-drive-ocamlfuse.service
    

    6.2.5 - LDAP Backed Authentication

    Allow users to log in with credentials from a central source

    The creation of settings.py

    When integrating LDAP login, we need to create an overlay to the default CVAT settings located in cvat/settings/production.py. This overlay is where we will configure Django to connect to the LDAP server.

    The main issue with using LDAP is that different LDAP implementations have different parameters. The options used for Active Directory backed authentication will therefore differ from those used for FreeIPA.

    Update docker-compose.override.yml

    In your override config you need to pass through your settings file and tell CVAT to use it by setting the DJANGO_SETTINGS_MODULE variable.

    services:
      cvat_server:
        environment:
          DJANGO_SETTINGS_MODULE: settings
        volumes:
          - ./settings.py:/home/django/settings.py:ro
    

    Active Directory Example

    The following example should allow users to authenticate against Active Directory. It requires a dummy user named cvat_bind. The bind account does not need any special permissions.

    When updating AUTH_LDAP_BIND_DN, you can write out the account info in two ways. Both are documented in the config below.

    This config is known to work with Windows Server 2022, but should work for older versions and Samba’s implementation of Active Directory.

    # We are overlaying production
    from cvat.settings.production import *
    
    # Custom code below
    import ldap
    from django_auth_ldap.config import LDAPSearch
    from django_auth_ldap.config import NestedActiveDirectoryGroupType
    
    # Notify CVAT that we are using LDAP authentication
    IAM_TYPE = 'LDAP'
    
    # Talking to the LDAP server
    AUTH_LDAP_SERVER_URI = "ldap://ad.example.com" # IP Addresses also work
    ldap.set_option(ldap.OPT_REFERRALS, 0)
    
    _BASE_DN = "CN=Users,DC=ad,DC=example,DC=com"
    
    # Authenticating with the LDAP server
    AUTH_LDAP_BIND_DN = "CN=cvat_bind,%s" % _BASE_DN
    # AUTH_LDAP_BIND_DN = "cvat_bind@ad.example.com"
    AUTH_LDAP_BIND_PASSWORD = "SuperSecurePassword^21"
    
    AUTH_LDAP_USER_SEARCH = LDAPSearch(
        _BASE_DN,
        ldap.SCOPE_SUBTREE,
        "(sAMAccountName=%(user)s)"
    )
    
    AUTH_LDAP_GROUP_SEARCH = LDAPSearch(
        _BASE_DN,
        ldap.SCOPE_SUBTREE,
        "(objectClass=group)"
    )
    
    # Mapping Django field names to Active Directory attributes
    AUTH_LDAP_USER_ATTR_MAP = {
        "user_name": "sAMAccountName",
        "first_name": "givenName",
        "last_name": "sn",
        "email": "mail",
    }
    
    # Group Management
    AUTH_LDAP_GROUP_TYPE = NestedActiveDirectoryGroupType()
    
    # Register Django LDAP backend
    AUTHENTICATION_BACKENDS += ['django_auth_ldap.backend.LDAPBackend']
    
    # Map Active Directory groups to Django/CVAT groups.
    AUTH_LDAP_ADMIN_GROUPS = [
        'CN=CVAT Admins,%s' % _BASE_DN,
    ]
    AUTH_LDAP_BUSINESS_GROUPS = [
        'CN=CVAT Managers,%s' % _BASE_DN,
    ]
    AUTH_LDAP_WORKER_GROUPS = [
        'CN=CVAT Workers,%s' % _BASE_DN,
    ]
    AUTH_LDAP_USER_GROUPS = [
        'CN=CVAT Users,%s' % _BASE_DN,
    ]
    
    DJANGO_AUTH_LDAP_GROUPS = {
        "admin": AUTH_LDAP_ADMIN_GROUPS,
        "business": AUTH_LDAP_BUSINESS_GROUPS,
        "user": AUTH_LDAP_USER_GROUPS,
        "worker": AUTH_LDAP_WORKER_GROUPS,
    }
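
To illustrate how a role-to-groups dictionary like DJANGO_AUTH_LDAP_GROUPS can be read, the sketch below resolves a user's group DNs to role names. It is not CVAT's implementation: roles_for_user is a hypothetical helper, and the case-insensitive DN comparison is an assumption made for the example.

```python
def roles_for_user(user_group_dns, role_groups):
    """Return sorted role names whose group lists contain any of the user's DNs.

    DNs are compared case-insensitively (an assumption for this sketch).
    """
    member_of = {dn.lower() for dn in user_group_dns}
    return sorted(
        role
        for role, group_dns in role_groups.items()
        if any(dn.lower() in member_of for dn in group_dns)
    )


base_dn = "CN=Users,DC=ad,DC=example,DC=com"
role_groups = {
    "admin": ["CN=CVAT Admins,%s" % base_dn],
    "business": ["CN=CVAT Managers,%s" % base_dn],
    "user": ["CN=CVAT Users,%s" % base_dn],
    "worker": ["CN=CVAT Workers,%s" % base_dn],
}
print(roles_for_user(["cn=cvat workers,cn=users,dc=ad,dc=example,dc=com"], role_groups))
```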
    

    FreeIPA Example

    The following example should allow users to authenticate against FreeIPA. It requires a dummy user named cvat_bind. The bind account does not need any special permissions.

    When updating AUTH_LDAP_BIND_DN, you can only write the user info in one way, unlike with Active Directory.

    This config is known to work with AlmaLinux 8, but may work for other versions and flavors of Enterprise Linux.

    # We are overlaying production
    from cvat.settings.production import *
    
    # Custom code below
    import ldap
    from django_auth_ldap.config import LDAPSearch
    from django_auth_ldap.config import GroupOfNamesType
    
    # Notify CVAT that we are using LDAP authentication
    IAM_TYPE = 'LDAP'
    
    _BASE_DN = "CN=Accounts,DC=ipa,DC=example,DC=com"
    
    # Talking to the LDAP server
    AUTH_LDAP_SERVER_URI = "ldap://ipa.example.com" # IP Addresses also work
    ldap.set_option(ldap.OPT_REFERRALS, 0)
    
    # Authenticating with the LDAP server
    AUTH_LDAP_BIND_DN = "UID=cvat_bind,CN=Users,%s" % _BASE_DN
    AUTH_LDAP_BIND_PASSWORD = "SuperSecurePassword^21"
    
    AUTH_LDAP_USER_SEARCH = LDAPSearch(
        "CN=Users,%s" % _BASE_DN,
        ldap.SCOPE_SUBTREE,
        "(uid=%(user)s)"
    )
    
    AUTH_LDAP_GROUP_SEARCH = LDAPSearch(
        "CN=Groups,%s" % _BASE_DN,
        ldap.SCOPE_SUBTREE,
        "(objectClass=groupOfNames)"
    )
    
    # Mapping Django field names to FreeIPA attributes
    AUTH_LDAP_USER_ATTR_MAP = {
        "user_name": "uid",
        "first_name": "givenName",
        "last_name": "sn",
        "email": "mail",
    }
    
    # Group Management
    AUTH_LDAP_GROUP_TYPE = GroupOfNamesType()
    
    # Register Django LDAP backend
    AUTHENTICATION_BACKENDS += ['django_auth_ldap.backend.LDAPBackend']
    
    # Map FreeIPA groups to Django/CVAT groups.
    AUTH_LDAP_ADMIN_GROUPS = [
        'CN=cvat_admins,CN=Groups,%s' % _BASE_DN,
    ]
    AUTH_LDAP_BUSINESS_GROUPS = [
        'CN=cvat_managers,CN=Groups,%s' % _BASE_DN,
    ]
    AUTH_LDAP_WORKER_GROUPS = [
        'CN=cvat_workers,CN=Groups,%s' % _BASE_DN,
    ]
    AUTH_LDAP_USER_GROUPS = [
        'CN=cvat_users,CN=Groups,%s' % _BASE_DN,
    ]
    
    DJANGO_AUTH_LDAP_GROUPS = {
        "admin": AUTH_LDAP_ADMIN_GROUPS,
        "business": AUTH_LDAP_BUSINESS_GROUPS,
        "user": AUTH_LDAP_USER_GROUPS,
        "worker": AUTH_LDAP_WORKER_GROUPS,
    }
    

    Resources

    6.2.6 - Backup guide

    Instructions on how to back up CVAT data with Docker.

    About CVAT data volumes

    Docker volumes are used to store all CVAT data:

    • cvat_db: PostgreSQL database files, used to store information about users, tasks, projects, annotations, etc. Mounted into the cvat_db container at the /var/lib/postgresql/data path.

    • cvat_data: used to store uploaded and prepared media data. Mounted into the cvat_server container at the /home/django/data path.

    • cvat_keys: used to store user SSH keys needed for synchronization with a remote Git repository. Mounted into the cvat_server container at the /home/django/keys path.

    • cvat_logs: used to store logs of CVAT backend processes managed by supervisord. Mounted into the cvat_server container at the /home/django/logs path.

    • cvat_events: an optional volume that is used only when the Analytics component is enabled; it stores Elasticsearch database files. Mounted into the cvat_elasticsearch container at the /usr/share/elasticsearch/data path.

    How to back up all CVAT data

    All CVAT containers should be stopped before backup:

    docker compose stop
    

    Please don’t forget to include all the compose config files that were used in the docker compose command using the -f parameter.

    Backup data:

    mkdir backup
    docker run --rm --name temp_backup --volumes-from cvat_db -v $(pwd)/backup:/backup ubuntu tar -czvf /backup/cvat_db.tar.gz /var/lib/postgresql/data
    docker run --rm --name temp_backup --volumes-from cvat_server -v $(pwd)/backup:/backup ubuntu tar -czvf /backup/cvat_data.tar.gz /home/django/data
    # [optional]
    docker run --rm --name temp_backup --volumes-from cvat_elasticsearch -v $(pwd)/backup:/backup ubuntu tar -czvf /backup/cvat_events.tar.gz /usr/share/elasticsearch/data
    

    Make sure the backup archives have been created. The output of the ls backup command should look like this:

    ls backup
    cvat_data.tar.gz  cvat_db.tar.gz  cvat_events.tar.gz
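
Beyond listing the files, it can be worth confirming that each archive actually opens. The following is a minimal sketch using Python's standard tarfile module; check_backup is a hypothetical helper, and the optional cvat_events.tar.gz is left out of the default list.

```python
import tarfile
from pathlib import Path


def check_backup(backup_dir, names=("cvat_db.tar.gz", "cvat_data.tar.gz")):
    """Map each expected archive to its member count, or None if missing/unreadable."""
    report = {}
    for name in names:
        path = Path(backup_dir) / name
        try:
            with tarfile.open(path, "r:gz") as archive:
                report[name] = len(archive.getnames())
        except (OSError, EOFError, tarfile.TarError):
            report[name] = None
    return report


if __name__ == "__main__":
    for name, count in check_backup("backup").items():
        print(name, "missing or unreadable" if count is None else f"{count} entries")
```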
    

    How to restore CVAT from backup

    Warning: use exactly the same CVAT version to restore the DB. Otherwise it will not work, because the DB layout can change between CVAT releases. You can always upgrade CVAT later; the upgrade will migrate your data properly.

    Note: CVAT containers must exist (if not, please follow the installation guide). Stop all CVAT containers:

    docker compose stop
    

    Restore data:

    cd <path_to_backup_folder>
    docker run --rm --name temp_backup --volumes-from cvat_db -v $(pwd):/backup ubuntu bash -c "cd /var/lib/postgresql/data && tar -xvf /backup/cvat_db.tar.gz --strip 4"
    docker run --rm --name temp_backup --volumes-from cvat_server -v $(pwd):/backup ubuntu bash -c "cd /home/django/data && tar -xvf /backup/cvat_data.tar.gz --strip 3"
    # [optional]
    docker run --rm --name temp_backup --volumes-from cvat_elasticsearch -v $(pwd):/backup ubuntu bash -c "cd /usr/share/elasticsearch/data && tar -xvf /backup/cvat_events.tar.gz --strip 4"
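
The --strip values in the restore commands correspond to the number of directory names in each archived path: tar drops the leading slash when creating the archive, so members of the database archive start with var/lib/postgresql/data/. A small sketch of that arithmetic (strip_count is a hypothetical helper, not part of CVAT or tar):

```python
from pathlib import PurePosixPath


def strip_count(archived_path: str) -> int:
    """Count the path components that precede the archived files.

    /var/lib/postgresql/data is stored as var/lib/postgresql/data, so four
    leading components must be stripped to extract into the current directory.
    """
    return len([part for part in PurePosixPath(archived_path).parts if part != "/"])


for path in (
    "/var/lib/postgresql/data",
    "/home/django/data",
    "/usr/share/elasticsearch/data",
):
    print(path, strip_count(path))
```

If you back up a different directory, the strip value in the restore command needs to change accordingly.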
    

    After that run CVAT as usual:

    docker compose up -d
    

    Additional resources

    Docker guide about volume backups

    6.2.7 - Upgrade guide

    Instructions for upgrading CVAT deployed with docker compose

    Upgrade guide

    Note: updating CVAT from version 2.2.0 to version 2.3.0 requires additional manual actions with database data due to upgrading PostgreSQL base image major version. See details here

    To upgrade CVAT, follow these steps:

    • It is highly recommended to back up all CVAT data before updating. Follow the backup guide and back up all CVAT volumes.

    • Go to the previously cloned CVAT directory and stop all CVAT containers with:

      docker compose down
      

      If you have included additional components, include all compose configuration files that are used, e.g.:

      docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml down
      
    • Update the CVAT source code in any way you prefer: clone with git or download the zip file from GitHub. Note that you need to download the entire source code, not just the Docker Compose configuration file. Check the installation guide for details.

    • Verify settings: the installation process changes from version to version, and you may need to export some environment variables, for example CVAT_HOST.

    • Update local CVAT images. Pull or build new CVAT images, see How to pull/build/update CVAT images section for details.

    • Start CVAT with:

      docker compose up -d
      

      When CVAT starts, it will upgrade its DB in accordance with the latest schema. It can take time especially if you have a lot of data. Please do not terminate the migration and wait till the process is complete. You can monitor the startup process with the following command:

      docker logs cvat_server -f
      

    How to upgrade CVAT from v2.2.0 to v2.3.0.

    Step-by-step commands for upgrading CVAT from v2.2.0 to v2.3.0. Let’s assume that you have CVAT v2.2.0 working.

    docker exec -it cvat_db pg_dumpall > cvat.db.dump
    cd cvat
    docker compose down
    docker volume rm cvat_cvat_db
    export CVAT_VERSION="v2.3.0"
    cd ..
    mv cvat cvat_220
    wget https://github.com/opencv/cvat/archive/refs/tags/${CVAT_VERSION}.zip
    unzip ${CVAT_VERSION}.zip && mv cvat-${CVAT_VERSION:1} cvat
    unset CVAT_VERSION
    cd cvat
    export CVAT_HOST=cvat.example.com
    export ACME_EMAIL=example@example.com
    docker compose pull
    docker compose up -d cvat_db
    docker exec -i cvat_db psql -q -d postgres < ../cvat.db.dump
    docker compose -f docker-compose.yml -f docker-compose.dev.yml -f docker-compose.https.yml up -d
    

    How to upgrade CVAT from v1.7.0 to v2.2.0.

    Step-by-step commands for upgrading CVAT from v1.7.0 to v2.2.0. Let’s assume that you have CVAT v1.7.0 working.

    export CVAT_VERSION="v2.2.0"
    cd cvat
    docker compose down
    cd ..
    mv cvat cvat_170
    wget https://github.com/opencv/cvat/archive/refs/tags/${CVAT_VERSION}.zip
    unzip ${CVAT_VERSION}.zip && mv cvat-${CVAT_VERSION:1} cvat
    cd cvat
    docker pull cvat/server:${CVAT_VERSION}
    docker tag cvat/server:${CVAT_VERSION} openvino/cvat_server:latest
    docker pull cvat/ui:${CVAT_VERSION}
    docker tag cvat/ui:${CVAT_VERSION} openvino/cvat_ui:latest
    docker compose up -d
    

    How to upgrade PostgreSQL database base image

    1. It is highly recommended to back up all CVAT data before updating. Follow the backup guide and back up the CVAT database volume.

    2. Run the previously used CVAT version as usual.

    3. Back up the current database with the pg_dumpall tool:

      docker exec -it cvat_db pg_dumpall > cvat.db.dump
      
    4. Stop CVAT:

      docker compose down
      
    5. Delete the current PostgreSQL volume. This is why it is important to have a backup:

      docker volume rm cvat_cvat_db
      
    6. Update the CVAT source code in any way you prefer: clone with git or download the zip file from GitHub. Check the installation guide for details.

    7. Start only the database container:

      docker compose up -d cvat_db
      
    8. Import the PostgreSQL dump into the new DB container:

      docker exec -i cvat_db psql -q -d postgres < cvat.db.dump
      
    9. Start CVAT:

      docker compose up -d
      

    6.2.8 - IAM: system roles

    System roles

    By default CVAT users can be assigned to one of the following groups: admin, business, user and worker.

    Each of these groups gives a set of permissions. TBD

    Changing permissions

    System permissions are defined using .rego files stored in cvat/apps/iam/rules/. Rego is a declarative language used for defining OPA policies. Its syntax is defined in the OPA docs.

    After changing the .rego files, you need to rebuild and restart the Docker Compose deployment for the changes to take effect. In this case you need to include the docker-compose.dev.yml compose config file in the docker compose command.

    6.2.9 - Webhooks

    CVAT Webhooks: set up and use

    Webhooks are user-defined HTTP callbacks that are triggered by specific events. When an event that triggers a webhook occurs, CVAT makes an HTTP request to the URL configured for the webhook. The request will include a payload with information about the event.

    In CVAT, webhooks can be triggered by a variety of events, such as the creation, deletion, or modification of tasks, jobs, and so on. This makes it easy to set up automated processes that respond to changes made in CVAT.

    For example, you can set up webhooks to alert you when a job’s assignee is changed or when a job/task’s status is updated, for instance, when a job is completed and ready for review or has been reviewed. New task creation can also trigger notifications.

    These capabilities allow you to keep track of progress and changes in your CVAT workflow instantly.

    In CVAT you can create a webhook for a project or organization. You can use CVAT GUI or direct API calls.

    Create Webhook

    For project

    To create a webhook for a project, do the following:

    1. Create a Project.

    2. Go to Projects and click on the project’s widget.

    3. In the top right corner, click Actions > Setup Webhooks.

    4. In the top right corner, click +.

      Create Project Webhook

    5. Fill in the Setup webhook form and click Submit.

    For organization

    To create a webhook for an organization, do the following:

    1. Create an Organization.
    2. Go to Organization > Settings > Actions > Setup Webhooks.
    3. In the top right corner, click +.
    4. Fill in the Setup webhook form and click Submit.

    Webhooks forms

    The Setup webhook forms look like the following.

    Create Project And Org Webhook Forms

    Forms have the following fields:

    Field Description
    Target URL The URL where the event data will be sent.
    Description Provides a brief summary of the webhook’s purpose.
    Project A drop-down list that lets you select from available projects.
    Content type Defines the data type for the payload in the webhook request via the HTTP Content-Type field.
    Secret A unique key for verifying the webhook’s origin, ensuring it’s genuinely from CVAT.
    For more information, see Webhook secret
    Enable SSL A checkbox for enabling or disabling SSL verification.
    Active Uncheck this box if you want to stop the delivery of specific webhook payloads.
    Send everything Check this box to send all event types through the webhook.
    Specify individual events Choose this option to send only certain event types.
    Refer to the List of available events for more information on event types.

    List of events

    The following events are available for webhook alerts.

    Resource Create Update Delete Description
    Organization Alerts for changes made to an Organization.
    Membership Alerts when a member is added to or removed from an organization.
    Invitation Alerts when an invitation to an Organization is issued or revoked.
    Project Alerts for any actions taken within a project.
    Task Alerts for actions related to a task, such as status changes, assignments, etc.
    Job Alerts for any updates made to a job.
    Issue Alerts for any activities involving issues.
    Comment Alerts for actions involving comments, such as creation, deletion, or modification.

    Payloads

    Create event

    Webhook payload object for create:<resource> events:

    Key Type Description
    event string Identifies the event that triggered the webhook, following the create:<resource> pattern.
    <resource> object Complete information about the created resource. Refer to the Swagger docs for individual resource details.
    webhook_id integer The identifier for the webhook that sends the payload.
    sender object Details about the user that triggered the webhook.

    An example of payload for the create:task event:

    
    {
        "event": "create:task",
        "task": {
            "url": "http://localhost:8080/api/tasks/15",
            "id": 15,
            "name": "task",
            "project_id": 7,
            "mode": "",
            "owner": {
                "url": "http://localhost:8080/api/users/1",
                "id": 1,
                "username": "admin1",
                "first_name": "Admin",
                "last_name": "First"
            },
            "assignee": null,
            "bug_tracker": "",
            "created_date": "2022-10-04T08:05:50.419259Z",
            "updated_date": "2022-10-04T08:05:50.422917Z",
            "overlap": null,
            "segment_size": 0,
            "status": "annotation",
            "labels": [
                {
                    "id": 28,
                    "name": "label_0",
                    "color": "#bde94a",
                    "attributes": [],
                    "type": "any",
                    "sublabels": [],
                    "has_parent": false
                }
            ],
            "segments": [],
            "dimension": "2d",
            "subset": "",
            "organization": null,
            "target_storage": {
                "id": 14,
                "location": "local",
                "cloud_storage_id": null
            },
            "source_storage": {
                "id": 13,
                "location": "local",
                "cloud_storage_id": null
            }
        },
        "webhook_id": 7,
        "sender": {
            "url": "http://localhost:8080/api/users/1",
            "id": 1,
            "username": "admin1",
            "first_name": "Admin",
            "last_name": "First"
        }
    }
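
A receiver typically needs to split the event name and pick out the resource object named after it. The sketch below does that for a payload shaped like the tables and example in this section; describe_event is a hypothetical helper and the sample body is trimmed to a few fields.

```python
import json


def describe_event(raw_body: bytes) -> dict:
    """Split the event name into action and resource, and summarize the payload."""
    payload = json.loads(raw_body)
    action, resource = payload["event"].split(":", 1)
    return {
        "action": action,      # "create", "update" or "delete"
        "resource": resource,  # e.g. "task"; also the key of the resource object
        "resource_id": payload[resource]["id"],
        "sender": payload["sender"]["username"],
        "webhook_id": payload["webhook_id"],
    }


body = json.dumps({
    "event": "create:task",
    "task": {"id": 15, "name": "task"},
    "webhook_id": 7,
    "sender": {"id": 1, "username": "admin1"},
}).encode("utf-8")
print(describe_event(body))
```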
    

    Update event

    Webhook payload object for update:<resource> events:

    Key Type Description
    event string Identifies the event that triggered the webhook, following the update:<resource> pattern.
    <resource> object Provides complete information about the updated resource. See the Swagger docs for resource details.
    before_update object Contains keys of <resource> that were updated, along with their old values.
    webhook_id integer The identifier for the webhook that dispatched the payload.
    sender object Details about the user that triggered the webhook.

    An example of payload for the update:task event:

    
    {
        "event": "update:task",
        "task": {
            "url": "http://localhost:8080/api/tasks/15",
            "id": 15,
            "name": "new task name",
            "project_id": 7,
            "mode": "annotation",
            "owner": {
                "url": "http://localhost:8080/api/users/1",
                "id": 1,
                "username": "admin1",
                "first_name": "Admin",
                "last_name": "First"
            },
            "assignee": null,
            "bug_tracker": "",
            "created_date": "2022-10-04T08:05:50.419259Z",
            "updated_date": "2022-10-04T11:04:51.451681Z",
            "overlap": 0,
            "segment_size": 1,
            "status": "annotation",
            "labels": [
                {
                    "id": 28,
                    "name": "label_0",
                    "color": "#bde94a",
                    "attributes": [],
                    "type": "any",
                    "sublabels": [],
                    "has_parent": false
                }
            ],
            "segments": [
                {
                    "start_frame": 0,
                    "stop_frame": 0,
                    "jobs": [
                        {
                            "url": "http://localhost:8080/api/jobs/19",
                            "id": 19,
                            "assignee": null,
                            "status": "annotation",
                            "stage": "annotation",
                            "state": "new"
                        }
                    ]
                }
            ],
            "data_chunk_size": 14,
            "data_compressed_chunk_type": "imageset",
            "data_original_chunk_type": "imageset",
            "size": 1,
            "image_quality": 70,
            "data": 14,
            "dimension": "2d",
            "subset": "",
            "organization": null,
            "target_storage": {
                "id": 14,
                "location": "local",
                "cloud_storage_id": null
            },
            "source_storage": {
                "id": 13,
                "location": "local",
                "cloud_storage_id": null
            }
        },
        "before_update": {
            "name": "task"
        },
        "webhook_id": 7,
        "sender": {
            "url": "http://localhost:8080/api/users/1",
            "id": 1,
            "username": "admin1",
            "first_name": "Admin",
            "last_name": "First"
        }
    }
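
Because before_update holds the old values of the changed keys and the resource object holds the new ones, a receiver can pair them up to see what changed. A minimal sketch, with changed_fields as a hypothetical helper and a payload trimmed to the relevant keys:

```python
def changed_fields(payload: dict) -> dict:
    """Return {field: (old_value, new_value)} for an update:<resource> payload."""
    resource = payload["event"].split(":", 1)[1]
    current = payload[resource]
    return {
        key: (old_value, current.get(key))
        for key, old_value in payload["before_update"].items()
    }


payload = {
    "event": "update:task",
    "task": {"id": 15, "name": "new task name"},
    "before_update": {"name": "task"},
}
print(changed_fields(payload))
```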
    

    Delete event

    Webhook payload object for delete:<resource> events:

    Key Type Description
    event string Identifies the event that triggered the webhook, following the delete:<resource> pattern.
    <resource> object Provides complete information about the deleted resource. See the Swagger docs for resource details.
    webhook_id integer The identifier for the webhook that dispatched the payload.
    sender object Details about the user that triggered the webhook.

    Here is an example of the payload for the delete:task event:

    
    {
        "event": "delete:task",
        "task": {
            "url": "http://localhost:8080/api/tasks/15",
            "id": 15,
            "name": "task",
            "project_id": 7,
            "mode": "",
            "owner": {
                "url": "http://localhost:8080/api/users/1",
                "id": 1,
                "username": "admin1",
                "first_name": "Admin",
                "last_name": "First"
            },
            "assignee": null,
            "bug_tracker": "",
            "created_date": "2022-10-04T08:05:50.419259Z",
            "updated_date": "2022-10-04T08:05:50.422917Z",
            "overlap": null,
            "segment_size": 0,
            "status": "annotation",
            "labels": [
                {
                    "id": 28,
                    "name": "label_0",
                    "color": "#bde94a",
                    "attributes": [],
                    "type": "any",
                    "sublabels": [],
                    "has_parent": false
                }
            ],
            "segments": [],
            "dimension": "2d",
            "subset": "",
            "organization": null,
            "target_storage": {
                "id": 14,
                "location": "local",
                "cloud_storage_id": null
            },
            "source_storage": {
                "id": 13,
                "location": "local",
                "cloud_storage_id": null
            }
        },
        "webhook_id": 7,
        "sender": {
            "url": "<http://localhost:8080/api/users/1>",
            "id": 1,
            "username": "admin1",
            "first_name": "Admin",
            "last_name": "First"
        }
    }
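
    A receiver usually needs to pull the deleted object out of the payload. The following is a minimal sketch of one way to do that, relying only on the rule from the table above that the object is stored under a key named after the resource. The function names split_event and deleted_resource are illustrative and not part of CVAT.

```python
from typing import Any, Dict, Tuple


def split_event(event: str) -> Tuple[str, str]:
    """Split an event name such as "delete:task" into (action, resource)."""
    action, _, resource = event.partition(":")
    if not action or not resource:
        raise ValueError(f"unexpected event name: {event!r}")
    return action, resource


def deleted_resource(payload: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return the resource name and the deleted object from a delete payload."""
    action, resource = split_event(payload["event"])
    if action != "delete":
        raise ValueError(f"not a delete event: {payload['event']!r}")
    # The deleted object is stored under a key named after the resource.
    return resource, payload[resource]


# Trimmed-down version of the delete:task example above.
payload = {
    "event": "delete:task",
    "task": {"id": 15, "name": "task", "project_id": 7},
    "webhook_id": 7,
    "sender": {"id": 1, "username": "admin1"},
}

resource, obj = deleted_resource(payload)
print(resource, obj["id"])  # task 15
```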
    

    Webhook secret

    To validate that the webhook requests originate from CVAT, include a secret during the webhook creation process.

    When a secret is provided for the webhook, CVAT includes an X-Signature-256 header in each webhook request.

    CVAT computes an HMAC of the request body using the SHA256 hash function, with the secret as the key, and places the resulting hex digest in the header.

    The webhook recipient can verify the source of the request by comparing the received X-Signature-256 value with the expected value.

    Here’s an example of a header value for a request with an empty body and secret = mykey:

    X-Signature-256: e1b24265bf2e0b20c81837993b4f1415f7b68c503114d100a40601eca6a2745f
    

    Here is an example of how you can verify a webhook signature in your webhook receiver service:

    # webhook_receiver.py
    
    import hmac
    from hashlib import sha256
    from flask import Flask, request
    
    app = Flask(__name__)
    
    @app.route("/webhook", methods=["POST"])
    def webhook():
        signature = (
            "sha256="
            + hmac.new("mykey".encode("utf-8"), request.data, digestmod=sha256).hexdigest()
        )
    
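        # Compare in constant time; a missing header is treated as a mismatch.
        if hmac.compare_digest(request.headers.get("X-Signature-256", ""), signature):
            return app.response_class(status=200)
    
        # A response object must be returned, not raised.
        return app.response_class(status=500, response="Signatures didn't match!")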
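
    To unit-test a receiver without running Flask, the signature check can be factored into plain functions. This is a sketch only: it mirrors the computation in the receiver above (including the "sha256=" prefix used there), and the names expected_signature and is_valid are hypothetical helpers, not part of CVAT.

```python
import hmac
from hashlib import sha256


def expected_signature(secret: str, body: bytes) -> str:
    """Compute the value the receiver above compares against X-Signature-256."""
    digest = hmac.new(secret.encode("utf-8"), body, digestmod=sha256).hexdigest()
    return "sha256=" + digest


def is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Check a received header value against the expected signature."""
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(header_value, expected_signature(secret, body))


body = b'{"event": "ping"}'
good = expected_signature("mykey", body)
print(is_valid("mykey", body, good))      # True
print(is_valid("wrong-key", body, good))  # False
```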
    

    Ping Webhook

    To confirm the proper configuration of your webhook and ensure that CVAT can establish a connection with the target URL, use the Ping webhook feature.

    1. Click the Ping button in the user interface (or send a POST /webhooks/{id}/ping request through the API).
    2. CVAT will send a webhook alert to the specified target URL with basic information about the webhook.
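
    The API variant of step 1 can be sketched with the Python standard library. The /api prefix follows the webhook URL shown in the payload examples; the use of HTTP Basic authentication, the credentials, and the helper name build_ping_request are assumptions made for illustration, so check your server's schema for the authentication methods it accepts.

```python
import base64
import urllib.request


def build_ping_request(
    server: str, webhook_id: int, username: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the ping endpoint of a webhook."""
    url = f"{server.rstrip('/')}/api/webhooks/{webhook_id}/ping"
    # Assumption: the server accepts HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"",  # the ping request carries no body
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


req = build_ping_request("http://localhost:8080", 7, "admin1", "password")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```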

    Ping webhook payload:

    Key Type Description
    event string The value is always ping.
    webhook object Complete information about the webhook. See the Swagger docs for a detailed description of fields.
    sender object Information about the user who initiated the ping on the webhook.

    Here is an example of a payload for the ping event:

    
    {
        "event": "ping",
        "webhook": {
            "id": 7,
            "url": "http://localhost:8080/api/webhooks/7",
            "target_url": "https://example.com",
            "description": "",
            "type": "project",
            "content_type": "application/json",
            "is_active": true,
            "enable_ssl": true,
            "created_date": "2022-10-04T08:05:23.007381Z",
            "updated_date": "2022-10-04T08:05:23.007395Z",
            "owner": {
                "url": "http://localhost:8080/api/users/1",
                "id": 1,
                "username": "admin1",
                "first_name": "Admin",
                "last_name": "First"
            },
            "project": 7,
            "organization": null,
            "events": [
                "create:comment",
                "create:issue",
                "create:task",
                "delete:comment",
                "delete:issue",
                "delete:task",
                "update:comment",
                "update:issue",
                "update:job",
                "update:project",
                "update:task"
            ],
            "last_status": 200,
            "last_delivery_date": "2022-10-04T11:04:52.538638Z"
        },
        "sender": {
            "url": "http://localhost:8080/api/users/1",
            "id": 1,
            "username": "admin1",
            "first_name": "Admin",
            "last_name": "First"
        }
    }
    

    Webhooks with API calls

    To create a webhook via an API call, see the Swagger documentation.

    For examples, see REST API tests.

    Example of setup and use

    This video demonstrates setting up email alerts for a project using Zapier and Gmail.

    7 - API & SDK

    How to interact with CVAT

    Overview

    In the modern world, it is often necessary to integrate different tools to work together. CVAT provides the following integration layers:

    • Server REST API + Swagger schema
    • Python client library (SDK)
      • REST API client
      • High-level wrappers
    • Command-line tool (CLI)

    In this section, you can find documentation about each separate layer.

    Component compatibility

    Currently, the only supported configuration is one where the server API major and minor versions match the SDK and CLI major and minor versions; for example, server v2.1.* is supported by SDK and CLI v2.1.*. Other combinations may be incompatible, in which case some SDK or CLI functions may not work properly.
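
    The rule above amounts to comparing the first two version components. Here is a minimal sketch of such a check; the helper names major_minor and is_supported are illustrative and are not SDK functions, and the version strings are assumed to look like "v2.1.3" or "2.1.3".

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as "v2.1.3" or "2.1.0"."""
    parts = version.lstrip("vV").split(".")
    return int(parts[0]), int(parts[1])


def is_supported(server_version: str, client_version: str) -> bool:
    """True when the server and the SDK/CLI share major and minor versions."""
    return major_minor(server_version) == major_minor(client_version)


print(is_supported("v2.1.0", "2.1.7"))  # True
print(is_supported("2.2.0", "2.1.7"))   # False
```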

    7.1 - Server API

    Overview

    The CVAT server provides an HTTP REST API for interaction. Every client application, whether a command-line tool, a browser, or a script, interacts with CVAT via HTTP requests and responses:

    CVAT server interaction image

    API schema

    You can obtain the schema for your server at <yourserver>/api/docs. For example, the official CVAT.ai application has API documentation here.

    Examples

    Here you can see how a task is created in CVAT:

    Task creation example

    1. First, we log in.
    2. Then we create a task from its configuration.
    3. Then we send the task data (images, videos, etc.).
    4. Finally, we wait for data processing to finish.

    Design principles

    The common pattern for our REST API is <VERB> [namespace] <objects> <id> <action>.

    • VERB can be POST, GET, PATCH, PUT, DELETE.
    • namespace should scope some specific functionality like auth, lambda. It is optional in the scheme.
    • Typical objects are tasks, projects, jobs.
    • When you want to extract a specific object from a collection, just specify its id.
    • An action can be used to simplify REST API or provide an endpoint for entities without objects endpoint like annotations, data, data/meta. Note: action should not duplicate other endpoints without a reason.
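
    The pattern above can be illustrated with a small helper that assembles an endpoint from its parts. This is a sketch for illustration only: the function name endpoint is hypothetical, and the /api prefix is taken from the endpoint paths listed in the SDK API reference.

```python
from typing import Optional, Union

VERBS = {"POST", "GET", "PATCH", "PUT", "DELETE"}


def endpoint(
    verb: str,
    objects: str,
    obj_id: Optional[Union[int, str]] = None,
    action: Optional[str] = None,
    namespace: Optional[str] = None,
    prefix: str = "/api",
) -> str:
    """Compose an endpoint following the verb/namespace/objects/id/action pattern."""
    if verb not in VERBS:
        raise ValueError(f"unsupported verb: {verb}")
    parts = [prefix.rstrip("/")]
    if namespace:  # optional scope such as "auth" or "lambda"
        parts.append(namespace)
    parts.append(objects)
    if obj_id is not None:  # selects one object from the collection
        parts.append(str(obj_id))
    if action:  # e.g. "annotations", "data", "data/meta"
        parts.append(action)
    return f"{verb} {'/'.join(parts)}"


print(endpoint("POST", "tasks"))                       # POST /api/tasks
print(endpoint("GET", "jobs", 15, "data/meta"))        # GET /api/jobs/15/data/meta
print(endpoint("GET", "functions", namespace="lambda"))  # GET /api/lambda/functions
```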

    When you’re developing new endpoints, follow these guidelines:

    • Use nouns instead of verbs in endpoint paths. For example, POST /api/tasks instead of POST /api/tasks/create.
    • Accept and respond with JSON whenever possible.
    • Name collections with plural nouns (e.g. /tasks, /projects).
    • Try to keep the API structure flat. Prefer two separate endpoints for /projects and /tasks instead of /projects/:id1/tasks/:id2. Use filters to extract necessary information like /tasks/:id2?project=:id1. In some cases it is useful to get all tasks. If the structure is hierarchical, it cannot be done easily. Also you have to know both :id1 and :id2 to get information about the task. Note: for now we accept GET /tasks/:id2/jobs but it should be replaced by /jobs?task=:id2 in the future.
    • Handle errors gracefully and return standard HTTP status codes (e.g. 201, 400).
    • Allow filtering, sorting, and pagination.
    • Maintain good security practices.
    • Cache data to improve performance.
    • Version the API. Do this when you delete an endpoint or modify its behavior. Versioning uses a scheme based on the Accept header with a vendor media type.
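
    The flat-structure guideline above means that related objects are selected with query filters rather than nested paths. The sketch below builds such URLs; the helper name collection_url is hypothetical, and the filter names project and task are copied from the guideline text, so the names a particular server accepts should be checked against its schema.

```python
from urllib.parse import urlencode


def collection_url(server: str, objects: str, **filters: object) -> str:
    """Build a flat collection URL, expressing relationships as query filters."""
    url = f"{server.rstrip('/')}/api/{objects}"
    # Filters set to None are dropped so callers can pass optional values through.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{url}?{query}" if query else url


print(collection_url("http://localhost:8080", "tasks", project=7))
# http://localhost:8080/api/tasks?project=7
print(collection_url("http://localhost:8080", "jobs", task=15))
# http://localhost:8080/api/jobs?task=15
```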

    7.2 - CVAT Python SDK

    Overview

    CVAT SDK is a Python library. It provides you access to Python functions and objects that simplify server interaction and provide additional functionality like data validation and serialization.

    SDK API includes several layers:

    • Low-level API with REST API wrappers. Located at cvat_sdk.api_client. Read more
    • High-level API. Located at cvat_sdk.core. Read more
    • PyTorch adapter. Located at cvat_sdk.pytorch. Read more
    • Auto-annotation API. Located at cvat_sdk.auto_annotation. Read more

    In general, the low-level API provides single-request operations, while the high-level one implements composite, multi-request operations, and provides local proxies for server objects. For most uses, the high-level API should be good enough, and it should be the right point to start your integration with CVAT.

    The PyTorch adapter is a specialized layer that represents datasets stored in CVAT as PyTorch Dataset objects. This enables direct use of such datasets in PyTorch-based machine learning pipelines.

    The auto-annotation API is a specialized layer that lets you automatically annotate CVAT datasets by running a custom function on the local machine. See also the auto-annotate command in the CLI.

    Installation

    To install an official release of the CVAT SDK, use this command:

    pip install cvat-sdk
    

    To use the PyTorch adapter, request the pytorch extra:

    pip install "cvat-sdk[pytorch]"
    

    We support Python versions 3.8 and higher.

    Usage

    To import package components, use the following code:

    For the high-level API:

    import cvat_sdk
    # or
    import cvat_sdk.core
    

    For the low-level API:

    import cvat_sdk.api_client
    

    For the PyTorch adapter:

    import cvat_sdk.pytorch
    

    7.2.1 - SDK API Reference

    7.2.1.1 - APIs

    All URIs are relative to http://localhost

    Class Method HTTP request Description
    AnalyticsApi create_report POST /api/analytics/reports Creates an analytics report asynchronously and allows checking the request status
    AnalyticsApi get_reports GET /api/analytics/reports Method returns analytics report
    AssetsApi create POST /api/assets Method saves new asset on the server and attaches it to a corresponding guide
    AssetsApi destroy DELETE /api/assets/{uuid} Method deletes a specific asset from the server
    AssetsApi retrieve GET /api/assets/{uuid} Method returns an asset file
    AssetsApi retrieve_public GET /api/assets/{uuid}/public
    AuthApi create_login POST /api/auth/login
    AuthApi create_logout POST /api/auth/logout
    AuthApi create_password_change POST /api/auth/password/change
    AuthApi create_password_reset POST /api/auth/password/reset
    AuthApi create_password_reset_confirm POST /api/auth/password/reset/confirm
    AuthApi create_register POST /api/auth/register
    AuthApi create_signing POST /api/auth/signing This method signs URL for access to the server
    AuthApi retrieve_rules GET /api/auth/rules
    CloudstoragesApi create POST /api/cloudstorages Method creates a cloud storage with the specified characteristics
    CloudstoragesApi destroy DELETE /api/cloudstorages/{id} Method deletes a specific cloud storage
    CloudstoragesApi list GET /api/cloudstorages Returns a paginated list of storages
    CloudstoragesApi partial_update PATCH /api/cloudstorages/{id} Method does a partial update of chosen fields in a cloud storage instance
    CloudstoragesApi retrieve GET /api/cloudstorages/{id} Method returns details of a specific cloud storage
    CloudstoragesApi retrieve_actions GET /api/cloudstorages/{id}/actions Method returns allowed actions for the cloud storage
    CloudstoragesApi retrieve_content GET /api/cloudstorages/{id}/content Method returns a manifest content
    CloudstoragesApi retrieve_content_v2 GET /api/cloudstorages/{id}/content-v2 Method returns the content of the cloud storage
    CloudstoragesApi retrieve_preview GET /api/cloudstorages/{id}/preview Method returns a preview image from a cloud storage
    CloudstoragesApi retrieve_status GET /api/cloudstorages/{id}/status Method returns a cloud storage status
    CommentsApi create POST /api/comments Method creates a comment
    CommentsApi destroy DELETE /api/comments/{id} Method deletes a comment
    CommentsApi list GET /api/comments Method returns a paginated list of comments
    CommentsApi partial_update PATCH /api/comments/{id} Method does a partial update of chosen fields in a comment
    CommentsApi retrieve GET /api/comments/{id} Method returns details of a comment
    EventsApi create POST /api/events Method saves logs from a client on the server
    EventsApi list GET /api/events Method returns csv log file
    GuidesApi create POST /api/guides Method creates a new annotation guide bound to a project or to a task
    GuidesApi destroy DELETE /api/guides/{id} Method deletes a specific annotation guide and all attached assets
    GuidesApi partial_update PATCH /api/guides/{id} Method does a partial update of chosen fields in an annotation guide
    GuidesApi retrieve GET /api/guides/{id} Method returns details of a specific annotation guide
    InvitationsApi create POST /api/invitations Method creates an invitation
    InvitationsApi destroy DELETE /api/invitations/{key} Method deletes an invitation
    InvitationsApi list GET /api/invitations Method returns a paginated list of invitations
    InvitationsApi partial_update PATCH /api/invitations/{key} Method does a partial update of chosen fields in an invitation
    InvitationsApi retrieve GET /api/invitations/{key} Method returns details of an invitation
    IssuesApi create POST /api/issues Method creates an issue
    IssuesApi destroy DELETE /api/issues/{id} Method deletes an issue
    IssuesApi list GET /api/issues Method returns a paginated list of issues
    IssuesApi partial_update PATCH /api/issues/{id} Method does a partial update of chosen fields in an issue
    IssuesApi retrieve GET /api/issues/{id} Method returns details of an issue
    JobsApi create POST /api/jobs Method creates a new job in the task
    JobsApi create_annotations POST /api/jobs/{id}/annotations/ Method initiates the upload of job annotations from a local file or a cloud storage
    JobsApi destroy DELETE /api/jobs/{id} Method deletes a job and its related annotations
    JobsApi destroy_annotations DELETE /api/jobs/{id}/annotations/ Method deletes all annotations for a specific job
    JobsApi list GET /api/jobs Method returns a paginated list of jobs
    JobsApi partial_update PATCH /api/jobs/{id} Method does a partial update of chosen fields in a job
    JobsApi partial_update_annotations PATCH /api/jobs/{id}/annotations/ Method performs a partial update of annotations in a specific job
    JobsApi retrieve GET /api/jobs/{id} Method returns details of a job
    JobsApi retrieve_annotations GET /api/jobs/{id}/annotations/ Method returns annotations for a specific job as a JSON document. If format is specified, a zip archive is returned.
    JobsApi retrieve_data GET /api/jobs/{id}/data Method returns data for a specific job
    JobsApi retrieve_data_meta GET /api/jobs/{id}/data/meta Method provides meta information about the media files related to the job
    JobsApi retrieve_dataset GET /api/jobs/{id}/dataset Export job as a dataset in a specific format
    JobsApi retrieve_preview GET /api/jobs/{id}/preview Method returns a preview image for the job
    JobsApi update_annotations PUT /api/jobs/{id}/annotations/ Method updates all annotations in a specific job or uploads annotations from a file
    LabelsApi destroy DELETE /api/labels/{id} Method deletes a label. To delete a sublabel, please use the PATCH method of the parent label
    LabelsApi list GET /api/labels Method returns a paginated list of labels
    LabelsApi partial_update PATCH /api/labels/{id} Method does a partial update of chosen fields in a label. To modify a sublabel, please use the PATCH method of the parent label
    LabelsApi retrieve GET /api/labels/{id} Method returns details of a label
    LambdaApi create_functions POST /api/lambda/functions/{func_id}
    LambdaApi create_requests POST /api/lambda/requests Method calls the function
    LambdaApi delete_requests DELETE /api/lambda/requests/{id} Method cancels the request
    LambdaApi list_functions GET /api/lambda/functions Method returns a list of functions
    LambdaApi list_requests GET /api/lambda/requests Method returns a list of requests
    LambdaApi retrieve_functions GET /api/lambda/functions/{func_id} Method returns the information about the function
    LambdaApi retrieve_requests GET /api/lambda/requests/{id} Method returns the status of the request
    MembershipsApi destroy DELETE /api/memberships/{id} Method deletes a membership
    MembershipsApi list GET /api/memberships Method returns a paginated list of memberships
    MembershipsApi partial_update PATCH /api/memberships/{id} Method does a partial update of chosen fields in a membership
    MembershipsApi retrieve GET /api/memberships/{id} Method returns details of a membership
    OrganizationsApi create POST /api/organizations Method creates an organization
    OrganizationsApi destroy DELETE /api/organizations/{id} Method deletes an organization
    OrganizationsApi list GET /api/organizations Method returns a paginated list of organizations
    OrganizationsApi partial_update PATCH /api/organizations/{id} Method does a partial update of chosen fields in an organization
    OrganizationsApi retrieve GET /api/organizations/{id} Method returns details of an organization
    ProjectsApi create POST /api/projects Method creates a new project
    ProjectsApi create_backup POST /api/projects/backup/ Method creates a project from a backup
    ProjectsApi create_dataset POST /api/projects/{id}/dataset/ Import dataset in specific format as a project or check status of dataset import process
    ProjectsApi destroy DELETE /api/projects/{id} Method deletes a specific project
    ProjectsApi list GET /api/projects Returns a paginated list of projects
    ProjectsApi partial_update PATCH /a