
Get predictions for a pinned model

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict
Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/labellers/1/predict' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "documents": [
    {
      "messages": [
        {
          "body": {
            "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
          },
          "from": "alice@company.com",
          "sent_at": "2011-12-11T11:02:03.000000+00:00",
          "to": [
            "bob@organisation.org"
          ]
        },
        {
          "body": {
            "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-11T11:05:10.000000+00:00",
          "to": [
            "alice@company.com"
          ]
        },
        {
          "body": {
            "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
          },
          "from": "alice@company.com",
          "sent_at": "2011-12-11T11:18:43.000000+00:00",
          "to": [
            "bob@organisation.org"
          ]
        }
      ],
      "timestamp": "2013-09-12T20:01:20.000000+00:00",
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    },
    {
      "messages": [
        {
          "body": {
            "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-12T10:04:30.000000+00:00",
          "to": [
            "alice@company.com",
            "carol@company.com"
          ]
        },
        {
          "body": {
            "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol"
          },
          "from": "carol@company.com",
          "sent_at": "2011-12-12T10:06:22.000000+00:00",
          "to": [
            "alice@company.com",
            "bob@organisation.org"
          ]
        },
        {
          "body": {
            "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-11T10:09:40.000000+00:00",
          "to": [
            "alice@company.com",
            "carol@company.com"
          ]
        }
      ],
      "timestamp": "2013-09-13T18:03:56.000000+00:00",
      "user_properties": {
        "number:Deal Value": 4.9,
        "string:City": "Luton"
      }
    }
  ],
  "threshold": 0.25
}'

Request Format

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| documents | array<Comment> | yes | A batch of at most 4096 documents, in the format described in the Comment Reference. Larger batches are faster (per document) than smaller ones. |
| threshold | number | no | The confidence threshold to filter the label results by: a number between 0.0 and 1.0, where 0.0 includes all results. Set to "auto" to use auto-thresholds. If not set, the default threshold of 0.25 is used. |
| labels | array<Label> | no | A list of labels to return, optionally with label-specific thresholds. |

Where Label has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | array<string> | yes | The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"]. |
| threshold | number | no | The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top level. |
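
As an illustration of the Label format, here is a minimal Python sketch that turns a display name such as "Parent Label > Child Label" into the list form the request expects. The helper name `make_label` is hypothetical, and the 0.0 to 1.0 range check on the label-specific threshold is an assumption carried over from the top-level threshold.

```python
from typing import Optional


def make_label(display_name: str, threshold: Optional[float] = None) -> dict:
    """Build one entry for the `labels` request field (hypothetical helper)."""
    # "Parent Label > Child Label" -> ["Parent Label", "Child Label"]
    label = {"name": [part.strip() for part in display_name.split(">")]}
    if threshold is not None:
        # Assumption: label thresholds share the top-level 0.0-1.0 range.
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0")
        label["threshold"] = threshold
    return label


labels = [
    make_label("Parent Label > Child Label", threshold=0.5),
    make_label("Urgent"),  # no threshold: falls back to the top-level value
]
```

The resulting list can be passed as the `labels` field of the request body.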

Response Format

| Name | Type | Description |
| --- | --- | --- |
| status | string | ok if the request is successful, or error in case of an error. See the Overview to learn more about error responses. |
| predictions | array<array<Label>> | A list of array<Label> in the same order as the comments in the request, where each Label has the format described here. |
| entities | array<array<Entity>> | A list of array<Entity> in the same order as the comments in the request, where each Entity has a format described here. |
| model | Model | Information about the model that was used to make the predictions, in the format described here. |
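
Because predictions and entities come back in the same order as the documents in the request, a client can pair them by position. The following Python sketch uses only the standard library to build the request and pair up a decoded response. The function names are hypothetical, and the sketch does not inspect individual Label or Entity objects, since their format is defined elsewhere.

```python
import json
import urllib.request

API_BASE = "https://reinfer.io/api/v1"
MAX_BATCH_SIZE = 4096  # documented per-request limit


def build_predict_request(organisation, dataset_name, version, documents,
                          token, threshold=0.25):
    """Build (but do not send) a POST request for the predict route."""
    if len(documents) > MAX_BATCH_SIZE:
        raise ValueError(f"at most {MAX_BATCH_SIZE} documents per request")
    url = (f"{API_BASE}/datasets/{organisation}/{dataset_name}"
           f"/labellers/{version}/predict")
    body = json.dumps({"documents": documents, "threshold": threshold})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def pair_with_documents(documents, response):
    """Match each document with its labels and entities by position."""
    if response.get("status") != "ok":
        raise RuntimeError(f"request failed: {response}")
    return list(zip(documents, response["predictions"], response["entities"]))


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

A typical call would be `pair_with_documents(docs, send(build_predict_request("org1", "collateral", 1, docs, token)))`, with the token read from the `REINFER_TOKEN` environment variable as in the curl example.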

Get predictions for a pinned model for raw emails

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-raw-emails
Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/labellers/1/predict-raw-emails' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "documents": [
    {
      "raw_email": {
        "body": {
          "plain": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
        },
        "headers": {
          "parsed": {
            "Date": "Thu, 09 Jan 2020 16:34:45 +0000",
            "From": "alice@company.com",
            "Message-ID": "abcdef@company.com",
            "Subject": "Figures Request",
            "To": "bob@organisation.org"
          }
        }
      },
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    },
    {
      "raw_email": {
        "body": {
          "html": "<p>Alice,</p><p>Here are the figures for today.</p><p>Regards,<br/>Bob</p>"
        },
        "headers": {
          "raw": "Message-ID: 012345@company.com\nDate: Thu, 09 Jan 2020 16:44:45 +0000\nSubject: Re: Figures Request\nFrom: bob@organisation.org\nTo: alice@company.com"
        }
      }
    }
  ],
  "include_comments": true,
  "threshold": 0.25,
  "transform_tag": "name.0.ABCD1234"
}'

Request Format

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| transform_tag | string | yes | A tag identifying the email integration sending the data. You should have received this tag during integration configuration setup. |
| documents | array<Document> | yes | A batch of at most 4096 documents in the format described below. Larger batches are faster (per document) than smaller ones. |
| threshold | number | no | The confidence threshold to filter the label results by: a number between 0.0 and 1.0, where 0.0 includes all results. Set to "auto" to use auto-thresholds. If not set, the default threshold of 0.25 is used. |
| labels | array<Label> | no | A list of labels to return, optionally with label-specific thresholds. |
| include_comments | boolean | no | If set to true, the comments parsed from the emails will be returned in the response body. |

Where Document has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| raw_email | RawEmail | yes | Email data, in the format described below. |
| user_properties | map<string, string \| number> | no | Any user-defined metadata that applies to the comment. The format is described in the Comment Reference. |

Note: Some user properties are generated based on the email content. If these conflict with uploaded user properties, the request will fail with 422 Unprocessable Entity.

Where RawEmail has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| headers | Headers | yes | An object containing the headers of the email. |
| body | Body | yes | An object containing the main body of the email. |

Where Headers has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| raw | string | no | One of raw and parsed is required. The raw email headers, given as a single string, with each header on its own line. |
| parsed | map<string, string \| array<string>> | no | One of raw and parsed is required. The parsed email headers, given as an object with string keys and string or array<string> values. Each key represents one email header. Lists of values will be concatenated with a comma (,) before being set as a single header value. |

If you require duplicate header keys, please use raw instead.
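
Since an object cannot repeat a key, duplicate headers have to go through `raw`. The sketch below builds a raw header string from a list of (name, value) pairs, one header per line, following the newline-separated layout of the `raw` value in the request example. The helper name `build_raw_headers` is hypothetical, and rejecting embedded line breaks is a defensive assumption rather than a documented rule.

```python
from typing import Iterable, Tuple


def build_raw_headers(headers: Iterable[Tuple[str, str]]) -> str:
    """Join (name, value) pairs into a raw header string, one per line."""
    lines = []
    for name, value in headers:
        # Assumption: a header must stay on its own line, so reject
        # values that would introduce extra line breaks.
        if any(ch in name + value for ch in "\r\n"):
            raise ValueError(f"line break in header {name!r}")
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


raw = build_raw_headers([
    ("Message-ID", "012345@company.com"),
    ("Received", "from mail1.company.com"),
    ("Received", "from mail2.company.com"),  # duplicate key is preserved
    ("Subject", "Re: Figures Request"),
])
```

The result can be sent as `{"headers": {"raw": raw}}` inside a `raw_email` object.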

Where Body has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| plain | string | no | At least one of plain and html is required. The plaintext content of the email. |
| html | string | no | At least one of plain and html is required. The HTML content of the email. |
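
The Document, RawEmail, Headers and Body formats can be assembled with a small builder that checks the stated requirements before the request is sent. This is a sketch only: the function name is hypothetical, and it reads "one of raw and parsed is required" as "exactly one", which is an interpretation rather than something the reference states.

```python
from typing import Optional


def make_raw_email_document(plain: Optional[str] = None,
                            html: Optional[str] = None,
                            raw_headers: Optional[str] = None,
                            parsed_headers: Optional[dict] = None,
                            user_properties: Optional[dict] = None) -> dict:
    """Build one entry for the `documents` field of predict-raw-emails."""
    if plain is None and html is None:
        raise ValueError("at least one of plain and html is required")
    # Assumption: exactly one of raw and parsed may be given.
    if (raw_headers is None) == (parsed_headers is None):
        raise ValueError("provide exactly one of raw_headers and parsed_headers")

    body = {}
    if plain is not None:
        body["plain"] = plain
    if html is not None:
        body["html"] = html

    if raw_headers is not None:
        headers = {"raw": raw_headers}
    else:
        headers = {"parsed": parsed_headers}

    document = {"raw_email": {"headers": headers, "body": body}}
    if user_properties:
        document["user_properties"] = user_properties
    return document
```

For example, `make_raw_email_document(plain="Hi Bob", parsed_headers={"Subject": "Figures Request"})` produces a document shaped like the first entry in the request example.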

Where Label has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | array<string> | yes | The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"]. |
| threshold | number | no | The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top level. |

Response Format

| Name | Type | Description |
| --- | --- | --- |
| status | string | ok if the request is successful, or error in case of an error. See the Overview to learn more about error responses. |
| comments | array<Comment> | A list of comments parsed from the uploaded raw emails, in the format described in the Comment Reference. Only returned if you set include_comments in the request. |
| predictions | array<array<Label>> | A list of array<Label> in the same order as the comments in the request, where each Label has the format described here. |
| entities | array<array<Entity>> | A list of array<Entity> in the same order as the comments in the request, where each Entity has a format described here. |
| model | Model | Information about the model that was used to make the predictions, in the format described here. |

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.

Get predictions for a pinned model by comment id

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-comments
Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/<organisation>/example/labellers/0/predict-comments' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "threshold": 0.25,
  "uids": [
    "18ba5ce699f8da1f.0001",
    "18ba5ce699f8da1f.0002",
    "b84d8e2641f36bf5.abc001"
  ]
}'

Request Format

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| uids | array<string> | yes | A list of at most 4096 comment UIDs, each combining a source_id and a comment_id in the format source_id.comment_id. Sources don't need to belong to the current dataset, so you can request predictions for comments from a source in a different dataset (or in no dataset). Larger lists are faster (per comment) than smaller ones. |
| threshold | number | no | The confidence threshold to filter the label results by: a number between 0.0 and 1.0, where 0.0 includes all results. Set to "auto" to use auto-thresholds. If not set, the default threshold of 0.25 is used. |
| labels | array<Label> | no | A list of labels to return, optionally with label-specific thresholds. |

Where Label has the following format:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | array<string> | yes | The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"]. |
| threshold | number | no | The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top level. |

Response Format

| Name | Type | Description |
| --- | --- | --- |
| status | string | ok if the request is successful, or error in case of an error. See the Overview to learn more about error responses. |
| predictions | array<Prediction> | A list of predictions. |
| model | Model | Information about the model that was used to make the predictions, in the format described here. |

Where Prediction has the following format:

| Name | Type | Description |
| --- | --- | --- |
| uid | string | A combined source_id and comment_id in the format of source_id.comment_id. |
| labels | array<array<Label>> | A list of array<Label> in the same order as the comments in the request, where each Label has the format described here. |
| entities | array<array<Entity>> | A list of array<Entity> in the same order as the comments in the request, where each Entity has a format described here. |

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.
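
When there are more than 4096 comments to score, the list of UIDs has to be split across several requests. The sketch below shows one way to form UIDs and cut them into request bodies of the maximum size. The helper names are hypothetical, and the bodies carry only the `uids` and `threshold` fields.

```python
from typing import Iterator, List

MAX_UIDS_PER_REQUEST = 4096  # documented per-request limit


def make_uid(source_id: str, comment_id: str) -> str:
    """Combine a source id and a comment id as source_id.comment_id."""
    return f"{source_id}.{comment_id}"


def batch_payloads(uids: List[str], threshold: float = 0.25) -> Iterator[dict]:
    """Yield predict-comments request bodies of at most 4096 UIDs each."""
    for start in range(0, len(uids), MAX_UIDS_PER_REQUEST):
        yield {
            "uids": uids[start:start + MAX_UIDS_PER_REQUEST],
            "threshold": threshold,
        }


# Example: 5000 comments from one source need two requests.
uids = [make_uid("18ba5ce699f8da1f", f"{i:04d}") for i in range(5000)]
payloads = list(batch_payloads(uids))
```

Filling each batch to the limit follows the note that larger lists are faster per comment. Full batches are also the ones most likely to need a longer client timeout.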

Get model validation statistics

GET /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/validation
Permissions required: View labels, View sources
curl -X GET 'https://reinfer.io/api/v1/datasets/org1/collateral/labellers/latest/validation' \
    -H "Authorization: Bearer $REINFER_TOKEN"

This route returns statistics on how well a model is performing. The same statistics can be viewed on the Validation page. A model's statistics can be requested by its integer version number. You can also use the special value latest to retrieve the most recently available validation scores.

Although this endpoint accepts both pinned and unpinned model versions, we recommend querying either pinned model versions or the special value latest, as statistics are not guaranteed to be available for unpinned model versions.

The response validation object contains the following fields:

| Name | Type | Description |
| --- | --- | --- |
| mean_average_precision_safe | float | Mean Average Precision score (between 0.0 and 1.0). This field will be null if MAP is unavailable. |
| num_labels | number | Number of labels in the taxonomy (at the time the model version was pinned). |
| num_reviewed_comments | number | Number of reviewed comments in the dataset (at the time the model version was pinned). |
| version | number | Model version. |
| num_amber_labels | number | Number of labels in amber warning state. |
| num_red_labels | number | Number of labels in red warning state. |
| dataset_quality | string | One of "poor", "average", "good", "excellent", representing the overall dataset quality rank. Can be null if there is not enough data. |
| coverage | float | A fractional value of label coverage in the dataset, between 0.0 and 1.0. Can be null if there is not enough data. |
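
Several of these fields can be null, so code that consumes the validation object should handle missing values explicitly. The sketch below shows a simple quality gate. The function name, the default thresholds, and the ranking of the `dataset_quality` values in the order they are listed are all assumptions made for illustration, not part of the API.

```python
# Assumed ranking, taken from the order in which the values are listed.
QUALITY_ORDER = ["poor", "average", "good", "excellent"]


def passes_quality_gate(validation: dict,
                        min_quality: str = "good",
                        min_map: float = 0.5) -> bool:
    """Return True if the model's validation statistics meet the gate."""
    quality = validation.get("dataset_quality")
    mean_ap = validation.get("mean_average_precision_safe")
    # Both fields may be null when there is not enough data.
    if quality is None or mean_ap is None:
        return False
    return (
        QUALITY_ORDER.index(quality) >= QUALITY_ORDER.index(min_quality)
        and mean_ap >= min_map
        and validation.get("num_red_labels", 0) == 0
    )
```

Treating null statistics as a failed gate is a conservative choice. A caller that prefers to retry later could check for `None` separately before applying the thresholds.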