Batch Delete
The CLI allows you to delete comments based on a time period, for example all
comments older than two years. This is useful for cleaning up historical data.
Note that the time period is based on the comment's timestamp field, rather
than the datetime the comment was uploaded to Re:infer.
This section assumes you have already installed and configured the CLI.
Backing Up Annotated Data
Before deleting or modifying comments, you may want to back up the annotated ones so that the manual work of the model trainers is not accidentally lost:
re get comments \
<project_name/source_name> \
--dataset <project_name/dataset_name> \
--reviewed-only true \
--file <output_file_name.jsonl>
If the source was added to multiple datasets, you should run the above command for each of those datasets.
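When a source belongs to several datasets, the backup command can be repeated in a loop. The following is a minimal sketch, not an official recipe: the project, source, and dataset names are hypothetical placeholders, and the output file naming scheme is just one possible choice. It only prints the commands (a dry run), so that they can be reviewed before anything is executed.

```shell
# Hypothetical names: replace with your own project, source, and datasets.
SOURCE="my-project/my-source"
DATASETS="my-project/dataset-a my-project/dataset-b"

# $DATASETS is intentionally unquoted so the shell splits it on spaces.
for dataset in $DATASETS; do
  # Derive an output file name from the dataset name,
  # e.g. my-project/dataset-a becomes my-project_dataset-a.jsonl
  out_file="$(printf '%s' "$dataset" | tr '/' '_').jsonl"

  # Dry run: prints the command. Remove `echo` to actually run the export.
  echo re get comments "$SOURCE" \
    --dataset "$dataset" \
    --reviewed-only true \
    --file "$out_file"
done
```

Writing one file per dataset keeps the annotations of each dataset separate, which matters because the same comment can be annotated differently in different datasets.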
Deleting Data
If the comments you are deleting were added to one or more datasets, some of them may have been annotated. Deleting annotated comments will change model performance in those datasets going forward (pinned models will be unaffected). You can optionally tell the CLI to skip annotated comments.
The command below will delete all comments in a source between FROM_TIMESTAMP
and TO_TIMESTAMP, excluding annotated comments. The timestamp should be in
RFC 3339 format, e.g. 1970-01-02T03:04:05Z.
re delete bulk \
--source <project_name/source_name> \
--include-annotated=false \
--from-timestamp FROM_TIMESTAMP \
--to-timestamp TO_TIMESTAMP
If you are sure you want to delete annotated comments, you can set
--include-annotated=true.
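For the "older than two years" case mentioned at the start of this section, the timestamps can be computed in the shell instead of being typed by hand. The sketch below assumes GNU date (the -d option is not available in the BSD/macOS version of date). The source name and the lower bound are hypothetical placeholders. As a precaution it only prints the delete command (a dry run) and keeps annotated comments excluded.

```shell
# Assumes GNU date; BSD/macOS date does not support the -d option.
# Upper bound: two years before now, in UTC, in RFC 3339 format.
TO_TIMESTAMP="$(date -u -d '2 years ago' '+%Y-%m-%dT%H:%M:%SZ')"

# Hypothetical lower bound, chosen to be earlier than any comment in the source.
FROM_TIMESTAMP="1970-01-01T00:00:00Z"

# Dry run: prints the command for review. Remove `echo` to perform the deletion.
echo re delete bulk \
  --source "my-project/my-source" \
  --include-annotated=false \
  --from-timestamp "$FROM_TIMESTAMP" \
  --to-timestamp "$TO_TIMESTAMP"
```

Printing the command first makes it easy to check the computed time range before running an operation that cannot be undone.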