Get all datasets

GET /api/v1/datasets
Permissions required: View labels
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets' \
-H "Authorization: Bearer $REINFER_TOKEN"

Get datasets by project

GET /api/v1/datasets/<project>
Permissions required: View labels

Get a dataset by name

GET /api/v1/datasets/<project>/<dataset_name>
Permissions required: View labels
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
-H "Authorization: Bearer $REINFER_TOKEN"

Get model tags for a dataset

GET /api/v1/datasets/<project>/<dataset>/model-tags
Permissions required: Model Admin
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/example/model-tags' \
-H "Authorization: Bearer $REINFER_TOKEN"

Create a dataset

PUT /api/v1/datasets/<project>/<dataset>
Permissions required: Datasets admin
curl -X PUT 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
-H "Authorization: Bearer $REINFER_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "dataset": {
    "description": "An optional long form description.",
    "model_family": "english",
    "source_ids": [
      "18ba5ce699f8da1f"
    ],
    "title": "An Example Dataset"
  }
}'

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | no | One-line human-readable title for the dataset. |
| description | string | no | A longer description of the dataset. |
| source_ids | array<string> | no | An array of source IDs to be included in this dataset. |
| model_family | string | no | Model family of the dataset. Can be english or german. Defaults to english. |
| has_sentiment | boolean | no | Whether labels in the dataset should be applied with sentiment. Defaults to true. |
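
The create call can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name `build_create_dataset_request` and the host `example.invalid` are hypothetical, and the request is only built, never sent. It wraps the documented optional fields in the `dataset` object used by the curl example and rejects field names that are not listed in the table.

```python
import json
import urllib.request

# Optional fields documented for dataset creation.
ALLOWED_FIELDS = {"title", "description", "source_ids", "model_family", "has_sentiment"}


def build_create_dataset_request(endpoint, project, dataset, token, **fields):
    """Build (but do not send) the PUT request that creates a dataset.

    `build_create_dataset_request` is a hypothetical helper name.
    `fields` may contain any of the documented optional keys.
    """
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown dataset fields: {sorted(unknown)}")

    # The API expects the fields nested under a top-level "dataset" key.
    body = json.dumps({"dataset": fields}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{endpoint}/api/v1/datasets/{project}/{dataset}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host, project, and token; nothing is sent over the network here.
request = build_create_dataset_request(
    "example.invalid", "my-project", "example", "dummy-token",
    title="An Example Dataset",
    source_ids=["18ba5ce699f8da1f"],
    has_sentiment=False,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually create the dataset, the resulting object could be passed to `urllib.request.urlopen` with a real endpoint and token.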

Update a dataset

POST /api/v1/datasets/<project>/<dataset>
Permissions required: Datasets admin
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
-H "Authorization: Bearer $REINFER_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "dataset": {
    "title": "An Alternative Title"
  }
}'

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | no | One-line human-readable title for the dataset. |
| description | string | no | A longer description of the dataset. |
| source_ids | array<string> | no | An array of source IDs to be included in this dataset. |

Delete a dataset

DELETE /api/v1/datasets/<project>/<dataset_name>
Permissions required: Datasets admin
curl -X DELETE 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
-H "Authorization: Bearer $REINFER_TOKEN"

Export a dataset

POST /api/v1/datasets/<project>/<dataset_name>/export
Permissions required: Export datasets
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/example/export' \
-H "Authorization: Bearer $REINFER_TOKEN" \
-H "Content-Type: application/json" \
-d '{
  "limit": 1
}'

This route lets you export a dataset. It returns a list of comments with assigned labels and latest available predictions. Other ways to export a dataset are CSV download in the browser and JSONL download via the CLI. For a detailed comparison, see the comparison table.

Request Format

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| comment_uids | array<string> | no | A list of at most 256 comment UIDs (in the format source_id.comment_id). If provided, only these comments will be included in the response. No other filters may be passed with comment_uids. |
| source_ids | array<string> | no | A list of at most 1024 source IDs. If provided, only comments from these sources will be included in the response. |
| order_by | string | no | One of created_at or timestamp. If provided, returns the comments sorted by either the API creation date of the comments (created_at) or the user-defined comment timestamp (timestamp). The default is timestamp. |
| from | string | no | An ISO-8601 timestamp. If provided, returns comments only from this timestamp onwards. The related order_by field controls which timestamp is used for filtering. |
| to | string | no | An ISO-8601 timestamp. If provided, returns comments only until this timestamp (inclusive). The related order_by field controls which timestamp is used for filtering. |
| continuation | string | no | Pagination token (provided in the response). Use it to fetch the next page of up to limit comments. |
| limit | number | no | Number of comments returned per response, up to a maximum of 256. Default: 64. |
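
The constraints in the table above can be checked on the client before a request is sent. The sketch below is illustrative only: `build_export_body` is a hypothetical helper, and it assumes that "other filters" means source_ids, from, and to, which the table does not spell out. It returns a dictionary containing only the parameters that were supplied.

```python
MAX_COMMENT_UIDS = 256
MAX_SOURCE_IDS = 1024
MAX_LIMIT = 256
ORDER_BY_VALUES = ("created_at", "timestamp")


def build_export_body(comment_uids=None, source_ids=None, order_by=None,
                      from_=None, to=None, continuation=None, limit=None):
    """Return a JSON-serializable export request body, or raise ValueError.

    Hypothetical helper; `from_` maps to the documented `from` field
    because `from` is a reserved word in Python.
    """
    if comment_uids is not None:
        if len(comment_uids) > MAX_COMMENT_UIDS:
            raise ValueError("comment_uids accepts at most 256 UIDs")
        # Assumption: "other filters" covers source_ids, from, and to.
        if source_ids is not None or from_ is not None or to is not None:
            raise ValueError("comment_uids cannot be combined with other filters")
    if source_ids is not None and len(source_ids) > MAX_SOURCE_IDS:
        raise ValueError("source_ids accepts at most 1024 IDs")
    if order_by is not None and order_by not in ORDER_BY_VALUES:
        raise ValueError("order_by must be created_at or timestamp")
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError("limit cannot exceed 256")

    candidates = {
        "comment_uids": comment_uids,
        "source_ids": source_ids,
        "order_by": order_by,
        "from": from_,
        "to": to,
        "continuation": continuation,
        "limit": limit,
    }
    # Omit anything not supplied so the server-side defaults apply.
    return {key: value for key, value in candidates.items() if value is not None}


body = build_export_body(
    source_ids=["18ba5ce699f8da1f"],
    order_by="created_at",
    from_="2023-01-01T00:00:00Z",
    limit=100,
)
print(body)
```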

Response Format

| Name | Type | Description |
| --- | --- | --- |
| comments | array<Comment> | A list of comments with their assigned and predicted labels. |
| continuation | string | Pagination token to fetch the next limit number of comments. If there are no further comments, this field will not be present in the response. |
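
Because continuation is absent on the last page, a client can loop until the field disappears. The following sketch shows one way to do that, under two assumptions that the text above does not confirm: that the original filters are resent together with the continuation token, and that the transport is supplied by the caller. `iter_export` and `fake_post` are hypothetical names, and `fake_post` stands in for a real HTTP call so the example runs offline.

```python
def iter_export(post, body=None):
    """Yield every comment from the export route, following continuation tokens.

    `post` is any callable that takes the request body (a dict) and returns
    the parsed JSON response (a dict). Hypothetical helper, not an official client.
    """
    request_body = dict(body or {})
    while True:
        response = post(request_body)
        yield from response.get("comments", [])
        token = response.get("continuation")
        if token is None:
            # No continuation field means there are no further comments.
            return
        # Assumption: filters are resent alongside the continuation token.
        request_body["continuation"] = token


# Offline stand-in for the HTTP call: two pages keyed by continuation token.
PAGES = {
    None: {"comments": [{"n": 1}, {"n": 2}], "continuation": "token-1"},
    "token-1": {"comments": [{"n": 3}]},
}


def fake_post(request_body):
    return PAGES[request_body.get("continuation")]


for comment in iter_export(fake_post, {"limit": 2}):
    print(comment)
```

Writing the loop as a generator keeps only one page in memory at a time, which matters when a dataset holds far more than the 256 comments a single response can return.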

Where Comment has the following format:

| Name | Type | Description |
| --- | --- | --- |
| comment | object | Comment object. The format is described in the Comment Reference. |
| annotations | object | An object containing a single field labels.assigned which is a list of labels assigned to this comment. The format is described in the Label Reference - note that it won't include predictions as these labels are assigned, not predicted. |
| predictions | object | An object containing a single field labels which is a list of labels predicted for this comment. The format is described in the Label Reference. |
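
As a rough illustration of reading one exported entry, the sketch below separates assigned labels from predicted ones. It rests on an assumption: that labels.assigned denotes an `assigned` list nested inside a `labels` object, which should be checked against a real response. The helper name `split_labels` is hypothetical, and the label objects in the sample are placeholders, since their actual format is defined in the Label Reference and not reproduced here.

```python
def split_labels(entry):
    """Return (assigned, predicted) label lists for one exported entry.

    Hypothetical helper. Assumes annotations look like
    {"labels": {"assigned": [...]}} and predictions like {"labels": [...]}.
    Missing fields are treated as empty lists.
    """
    annotations = entry.get("annotations") or {}
    predictions = entry.get("predictions") or {}
    assigned = (annotations.get("labels") or {}).get("assigned") or []
    predicted = predictions.get("labels") or []
    return assigned, predicted


# Placeholder entry: the label dictionaries are illustrative, not the real format.
sample = {
    "comment": {},
    "annotations": {"labels": {"assigned": [{"name": "Hypothetical Label"}]}},
    "predictions": {
        "labels": [
            {"name": "Hypothetical Label"},
            {"name": "Another Hypothetical Label"},
        ]
    },
}

assigned, predicted = split_labels(sample)
print(len(assigned), "assigned,", len(predicted), "predicted")
```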