Batch Upload

Billable Operation

You will be charged 1 AI unit for each comment created, and 1 AI unit for each existing comment (matched by its unique ID) whose text is modified by an update.

The CLI lets you upload comments, including pre-labelled comments, in batch. Besides importing data into Re:infer when a live connection is not required, you can use it to upload pre-existing training data or to overwrite existing comments or labels in Re:infer.

Note

This section assumes you have already installed and configured the CLI.

Preparing Data

The CLI expects data in JSONL format (also called newline-delimited JSON), where each line is a JSON value. Many tools can export JSONL files out of the box. Please contact support if you have any questions.
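If your tool cannot export JSONL directly, the format is simple to produce yourself. The following is a minimal Python sketch, not part of the CLI: the helper names write_jsonl and read_jsonl and the sample records are hypothetical and only illustrate the "one JSON value per line" rule.

```python
import io
import json


def write_jsonl(records, stream):
    """Write each record as one compact JSON value per line."""
    for record in records:
        # json.dumps escapes newlines inside strings, so a record never spans lines.
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(stream):
    """Parse one JSON value per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


# Hypothetical sample data, only to show the line-per-record layout.
records = [{"id": 1, "text": "first\nsecond"}, {"id": 2, "text": "héllo"}]
buffer = io.StringIO()
write_jsonl(records, buffer)
jsonl_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with open(path, "w", encoding="utf-8") as the stream.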

Each line in the JSONL file represents a comment object. Each comment object should have at least a unique ID, a timestamp, and a piece of text, but can have other fields such as metadata. Please see the Comment reference to learn which fields to set for your data.

Each line in the JSONL file should have the following format (only required fields are shown). The example is indented for readability; in your file, each comment must be on a single line.

{
  "comment": {
    "id": "<unique id>",
    "timestamp": "<timestamp>",
    "messages": [
      {
        "body": {
          "text": "<text of the comment>"
        }
      }
    ]
  }
}
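Before uploading, it can help to check that every line carries the required fields in the nesting shown above. The following Python sketch is one way to do that, under stated assumptions: the function names are hypothetical, and it assumes that id and timestamp are non-empty strings. It does not check the timestamp format; see the Comment reference for that.

```python
import json


def check_comment_line(line):
    """Return a list of problems found in one JSONL line (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as error:
        return [f"invalid JSON: {error.msg}"]
    comment = record.get("comment") if isinstance(record, dict) else None
    if not isinstance(comment, dict):
        return ["missing 'comment' object"]
    problems = []
    # Assumption: id and timestamp are expected to be non-empty strings.
    for field in ("id", "timestamp"):
        if not isinstance(comment.get(field), str) or not comment[field]:
            problems.append(f"'comment.{field}' is missing or not a non-empty string")
    messages = comment.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'comment.messages' is missing or empty")
    else:
        for index, message in enumerate(messages):
            body = message.get("body") if isinstance(message, dict) else None
            text = body.get("text") if isinstance(body, dict) else None
            if not isinstance(text, str):
                problems.append(f"'comment.messages[{index}].body.text' is missing")
    return problems


def check_jsonl(lines):
    """Check every line and flag repeated IDs. Returns {line_number: [problems]}."""
    report = {}
    seen_ids = {}
    for number, line in enumerate(lines, start=1):
        problems = check_comment_line(line)
        if not problems:
            comment_id = json.loads(line)["comment"]["id"]
            if comment_id in seen_ids:
                problems.append(f"duplicate id, first seen on line {seen_ids[comment_id]}")
            else:
                seen_ids[comment_id] = number
        if problems:
            report[number] = problems
    return report
```

An empty report from check_jsonl(open(path, encoding="utf-8")) means no structural problems were found; it does not guarantee that the upload will be accepted.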

To upload labels alongside comments, add a labelling object as follows. As above, the example is indented for readability; in your file, each comment must be on a single line.

{
  "comment": {
    "id": "<unique id>",
    "timestamp": "<timestamp>",
    "messages": [
      {
        "body": {
          "text": "<text of the comment>"
        }
      }
    ]
  },
  "labelling": {
    "assigned": [
      {
        "name": "<Your Label Name>",
        "sentiment": "<positive|negative>"
      },
      {
        "name": "<Another Label Name>",
        "sentiment": "<positive|negative>"
      }
    ]
  }
}
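The format above can be generated from your own records with a small helper. The following Python sketch is illustrative only: make_comment_line is a hypothetical name, and it assumes that an ISO 8601 UTC timestamp is acceptable for the timestamp field (check the Comment reference for the exact format expected).

```python
import json
from datetime import datetime, timezone

ALLOWED_SENTIMENTS = ("positive", "negative")


def make_comment_line(comment_id, timestamp, text, labels=None):
    """Build one JSONL line; labels is an optional list of (name, sentiment) pairs."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    record = {
        "comment": {
            "id": comment_id,
            # Assumption: an ISO 8601 timestamp in UTC is accepted.
            "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
            "messages": [{"body": {"text": text}}],
        }
    }
    if labels:
        assigned = []
        for name, sentiment in labels:
            if sentiment not in ALLOWED_SENTIMENTS:
                raise ValueError(f"unexpected sentiment: {sentiment!r}")
            assigned.append({"name": name, "sentiment": sentiment})
        record["labelling"] = {"assigned": assigned}
    # json.dumps without indent keeps the whole record on one line.
    return json.dumps(record, ensure_ascii=False)


# Hypothetical example values.
line = make_comment_line(
    "c-1",
    datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "Hello\nworld",
    labels=[("Billing", "positive")],
)
```

Omitting the labels argument yields a line with only the comment object, which matches the unlabelled format.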

Uploading Data

Uploading Comments

The command below uploads comments to the specified source. We recommend uploading comments into a new, empty source, because this makes rolling back easier: if something goes wrong, you can simply delete the source.

re create comments \
  --source <project_name/source_name> \
  --file <file_name.jsonl>

To update existing comments, specify the --overwrite flag. Comments are matched and overwritten based on the comment.id field. We recommend making a backup copy of the source before updating comments so that you can recover the original comments if something goes wrong.

Uploading Comments with Labels

To upload labels together with your comments, specify the dataset into which the labels should be uploaded. The dataset should be connected to the source before you start uploading.

re create comments \
  --source <project_name/source_name> \
  --dataset <project_name/dataset_name> \
  --file <file_name.jsonl>

You can overwrite labels on existing comments by specifying the --overwrite flag. Note that this replaces the existing labels with the new labels; it does not add the new labels to the existing ones. We recommend making a backup copy of the dataset before overwriting labels so that you can recover the original labels if something goes wrong.
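If you run uploads from a script, the commands in this section can be assembled programmatically. The following Python sketch uses only the flags shown on this page (--source, --dataset, --file, --overwrite). The function names and the example project, source, dataset and file names are hypothetical, and it assumes the re executable is on your PATH.

```python
import shlex
import subprocess


def build_upload_command(source, file_name, dataset=None, overwrite=False):
    """Return the argument list for an upload, using only flags shown on this page."""
    args = ["re", "create", "comments", "--source", source]
    if dataset is not None:
        # Only needed when the file contains labels.
        args += ["--dataset", dataset]
    args += ["--file", file_name]
    if overwrite:
        # Replaces existing comments (and labels) that share a comment.id.
        args.append("--overwrite")
    return args


def run_upload(args):
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(args, check=True)


# Hypothetical names, printed rather than executed.
command = build_upload_command(
    "my-project/my-source",
    "comments.jsonl",
    dataset="my-project/my-dataset",
    overwrite=True,
)
print(shlex.join(command))
```

Keeping overwrite off by default means a script has to opt in explicitly before replacing existing comments or labels.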