Batch Delete

The CLI lets you delete comments that fall within a time period, for example all comments older than two years, which is useful for cleaning up historical data. The time period is matched against each comment's timestamp field, not the date and time the comment was uploaded to Re:infer.

Note

This section assumes you have already installed and configured the CLI.

Backing Up Annotated Data

Before deleting or modifying your comments, you may want to back up annotated comments so that the model trainers' manual work is not accidentally lost:

re get comments \
<project_name/source_name> \
--dataset <project_name/dataset_name> \
--reviewed-only true \
--file <output_file_name.jsonl>

If the source has been added to multiple datasets, run the above command once for each of those datasets.
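When a source belongs to several datasets, the backup command above can be repeated in a loop. The following is a minimal sketch, not an official recipe: the project, source, and dataset names and the backup file naming scheme are hypothetical placeholders, and the loop only prints each command so it can be reviewed first.

```shell
# Hypothetical names: replace with your own project, source, and datasets.
SOURCE="my-project/my-source"
DATASETS="my-project/dataset-a my-project/dataset-b"

for dataset in $DATASETS; do
  # Derive one backup file per dataset, e.g. backup_my-project_dataset-a.jsonl
  out="backup_$(echo "$dataset" | tr '/' '_').jsonl"

  # Dry run: print the command instead of executing it.
  echo re get comments "$SOURCE" \
    --dataset "$dataset" \
    --reviewed-only true \
    --file "$out"
done
```

Once the printed commands look right, removing the leading echo runs them for real.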

Deleting Data

Deleting annotations changes model performance

If the comments you are deleting belong to one or more datasets in which they may have been annotated, deleting the annotated comments will change model performance in those datasets going forward (pinned models are unaffected). You can tell the CLI to skip annotated comments.

The command below deletes all comments in a source between FROM_TIMESTAMP and TO_TIMESTAMP, excluding annotated comments. Both timestamps must be in RFC 3339 format, e.g. 1970-01-02T03:04:05Z.

re delete bulk \
--source <project_name/source_name> \
--include-annotated=false \
--from-timestamp FROM_TIMESTAMP \
--to-timestamp TO_TIMESTAMP

If you are sure you want to delete annotated comments, you can set --include-annotated=true.
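For the "older than two years" case mentioned at the top of this page, the two timestamps can be generated rather than typed by hand. The sketch below assumes GNU date (as found on most Linux systems; the -d option behaves differently on macOS and BSD) and uses an arbitrary early date as the lower bound, which is an illustrative choice rather than something the CLI requires. It prints the delete command for review instead of running it.

```shell
# Assumption: an arbitrary lower bound that predates all of your comments.
FROM_TIMESTAMP="1970-01-01T00:00:00Z"

# Assumption: GNU date. Produces the UTC time two years ago in RFC 3339 form.
TO_TIMESTAMP="$(date -u -d '2 years ago' +%Y-%m-%dT%H:%M:%SZ)"

# Dry run: print the command instead of executing it.
echo re delete bulk \
  --source "<project_name/source_name>" \
  --include-annotated=false \
  --from-timestamp "$FROM_TIMESTAMP" \
  --to-timestamp "$TO_TIMESTAMP"
```

Check the printed timestamps before removing the leading echo, since deleted comments can only be restored from a backup you made yourself.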