Comments
Data in Re:infer is represented as comment objects. When uploading data to Re:infer or fetching data from Re:infer it is therefore important to understand how different types of data (such as emails, support tickets, or chats) should be represented as comments. This document explains how to model your data as Re:infer comments to prepare it for upload to Re:infer, and how to understand data fetched from Re:infer.
The Overview section describes the overall structure of a comment object. If you would like to upload data to Re:infer via the API, or want to understand how to process data uploaded to Re:infer via the API, please refer to the Comments created via the API section which provides detailed descriptions for each of the commonly used types of comments (emails, support tickets, and chats). If you would like to understand how to process data uploaded to Re:infer via an integration, please refer to the Comments created by integrations section. Finally, for a full list of available comment object fields, please see the Reference.
Overview
Re:infer works with various types of text data such as emails, survey responses, support tickets, or customer reviews. What these types of data have in common is that they all consist of units of communication (an email, a survey response, a support ticket, a customer review). In Re:infer, we call any such unit of communication a comment.
Regardless of the type of unit of communication a comment represents, it will always have this basic structure:
{
  "id": <UNIQUE ID>,
  "timestamp": <TIMESTAMP>,
  "messages": [
    {
      "body": { "text": <TEXT> },
      ...
    }
  ],
  "user_properties": { ... }
}
As shown in the code snippet above, in addition to the actual piece of text, a comment will always have an ID and a timestamp. The ID needs to be unique within the source containing the comment. The timestamp is used in the platform UI to filter and sort by date, and to generate date-based analytics.
In addition to these required fields, other fields should be set depending on the type of the comment. If your data has been uploaded to Re:infer via an integration, Re:infer will automatically populate all necessary fields. The following sections describe this in more detail.
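The basic structure described above can be assembled with a few lines of Python. This is a minimal sketch rather than an official client: the helper names `make_comment_id` and `make_comment` are hypothetical, and hex-encoding an external identifier is only one assumed way of producing an ID that satisfies the hexadecimal format listed in the Reference.

```python
from datetime import datetime, timezone


def make_comment_id(external_id: str) -> str:
    """Hex-encode an arbitrary identifier so it only contains 0-9 and a-f."""
    return external_id.encode("utf-8").hex()


def make_comment(external_id: str, created: datetime, text: str) -> dict:
    """Build the minimal comment structure: ID, timestamp, one message."""
    if created.tzinfo is None:
        # Naive datetimes are treated as UTC, matching the documented default.
        created = created.replace(tzinfo=timezone.utc)
    timestamp = created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "id": make_comment_id(external_id),
        "timestamp": timestamp,
        "messages": [{"body": {"text": text}}],
    }


comment = make_comment(
    "ticket-42", datetime(2020, 2, 26, 16, 9), "No signal since yesterday."
)
```

Any scheme that yields a unique hexadecimal string per source would do equally well. As an observation, the id in the processed email example later in this document happens to be the hex encoding of that email's Message-ID.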
Comments created via the API
Emails
The easiest way to sync emails into Re:infer is via the Exchange integration. If you do your own email extraction, you can sync emails via the API instead: use the sync-raw-emails endpoint for raw emails, and the sync endpoint for processed emails.
When syncing raw emails, you should provide the extracted MIME email headers and email body as-is (see the Reference for a description of the raw email format). Re:infer will parse the headers and clean the email body. Note that the raw email example below shows only a few headers for brevity. You should send all extracted headers to Re:infer; these are likely to be much longer than in the example.
Re:infer will set the email-specific fields in the message object messages[0], set the thread_id field and thread_properties object, clean up the email body by stripping quoted emails and putting the signature into a separate signature field, and populate the user_properties object with metadata extracted from email headers.
Note that if a field is not present in the email, it won't be set in the comment at all (rather than being set to a null or empty value). For example, the comment in the example below does not contain a BCC: field.
If you enrich emails with other data prior to uploading to Re:infer, you can provide this additional data in the user properties of the comment.
Once processed, the raw email will look like the processed email example below; note the number of additional fields created by Re:infer. If you would like to upload processed emails, you should structure them as in the processed email example.
Raw Email
{
  "raw_email": {
    "body": {
      "plain": "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
    },
    "headers": {
      "raw": "From: Alice Smith <alice@example.com>\nDate: Tue, 3 Aug 2021 10:57:42 +0100\nMessage-ID: <e7784b5b@mail.example.com>\nSubject: Figures for today\nTo: Bob <bob@company.com>\nCc: Joe <joe@company.com>"
    }
  },
  "user_properties": {
    "string:Team": "Team XYZ"
  }
}
Processed Email
{
  "comment": {
    "id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e",
    "timestamp": "2021-08-03T09:57:42Z",
    "user_properties": {
      "string:Has Signature": "Yes",
      "string:Sender": "alice@example.com",
      "string:Thread": "<e7784b5b@mail.example.com>",
      "string:Message ID": "<e7784b5b@mail.example.com>",
      "number:Recipient Count": 2,
      "number:Participant Count": 3,
      "number:Position in Thread": 1,
      "string:Sender Domain": "example.com",
      "string:Team": "Team XYZ"
    },
    "messages": [
      {
        "body": {
          "text": "Hi Bob,\n\nCould you send me the figures for today?"
        },
        "signature": {
          "text": "Thanks,\nAlice"
        },
        "subject": {
          "text": "Figures for today"
        },
        "to": ["\"Bob\" <bob@company.com>"],
        "cc": ["\"Joe\" <joe@company.com>"],
        "sent_at": "2021-08-03T09:57:42Z",
        "from": "\"Alice Smith\" <alice@example.com>"
      }
    ],
    "thread_id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e"
  },
  "thread_properties": {
    "duration": null,
    "response_time": null,
    "num_messages": 1,
    "num_participants": 3,
    "first_sender": "alice@example.com",
    "thread_position": 0
  }
}
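If your extraction step gives you a full RFC 822 message as text, a raw email payload shaped like the raw email example above could be assembled roughly as follows. This is a sketch under assumptions: the function name `to_raw_email` is hypothetical, the header block is taken to be everything before the first blank line, and Python's standard `email` package is used to locate the plain-text body. Sending the payload to the sync-raw-emails endpoint is not shown.

```python
import re
from email import message_from_string, policy
from typing import Optional

RAW_MESSAGE = (
    "From: Alice Smith <alice@example.com>\n"
    "Date: Tue, 3 Aug 2021 10:57:42 +0100\n"
    "Message-ID: <e7784b5b@mail.example.com>\n"
    "Subject: Figures for today\n"
    "To: Bob <bob@company.com>\n"
    "Cc: Joe <joe@company.com>\n"
    "\n"
    "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
)


def to_raw_email(rfc822_text: str, user_properties: Optional[dict] = None) -> dict:
    """Split an RFC 822 message into headers and body for upload."""
    # The header block ends at the first blank line; it is passed on untouched.
    header_block = re.split(r"\r?\n\r?\n", rfc822_text, maxsplit=1)[0]

    # Let the standard library find the plain-text part of the message.
    message = message_from_string(rfc822_text, policy=policy.default)
    plain_part = message.get_body(preferencelist=("plain",))
    plain_text = plain_part.get_content() if plain_part is not None else ""

    payload = {
        "raw_email": {
            "headers": {"raw": header_block},
            "body": {"plain": plain_text},
        }
    }
    if user_properties:
        # Enrichment data travels alongside the raw email as user properties.
        payload["user_properties"] = user_properties
    return payload


payload = to_raw_email(RAW_MESSAGE, {"string:Team": "Team XYZ"})
```

A message without a plain-text part would produce an empty plain body here; in that case you would need to supply the HTML body instead, since at least one of the two is required.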
Thread Properties
The following thread properties are available.
Name | Description |
---|---|
thread_position | Position of comment in thread, calculated by ordering the comments by timestamp. Starts at 0. |
num_messages | Number of comments in thread. |
num_participants | Total number of unique participants (From, To, CC, BCC) in thread. |
first_sender | Sender of the first comment in thread. |
duration | Difference (in seconds) between the timestamps of the first and last comments in thread. Will be set to null if num_messages is 1 (i.e. the thread contains only 1 comment). (Note: the timestamp of a comment corresponds to the sent_at field of the corresponding raw email.) |
response_time | Difference (in seconds) between the first comment in thread and the first response in thread. The first response in thread is the oldest comment where sender is not first_sender. Will be set to null if there are no responses in thread (i.e. if all emails in thread are from the same sender). |
Each time a new comment is added to the platform, the thread properties of the corresponding thread are updated.
Note that, apart from thread_position, all properties are the same for each comment in the thread.
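To make the definitions in the table concrete, the sketch below recomputes the thread properties for a small thread. It only mirrors the descriptions above and is not the platform's actual implementation. The input records, with their `timestamp`, `sender`, and `participants` keys, are a simplified, hypothetical representation of the comments in one thread.

```python
from datetime import datetime


def thread_properties(comments: list) -> list:
    """Return one thread_properties dict per comment, in timestamp order."""
    ordered = sorted(comments, key=lambda c: c["timestamp"])
    first, last = ordered[0], ordered[-1]
    first_sender = first["sender"]

    # duration stays None (null) when the thread contains a single comment.
    duration = None
    if len(ordered) > 1:
        duration = int((last["timestamp"] - first["timestamp"]).total_seconds())

    # The first response is the oldest comment not sent by first_sender.
    response_time = None
    for comment in ordered:
        if comment["sender"] != first_sender:
            delta = comment["timestamp"] - first["timestamp"]
            response_time = int(delta.total_seconds())
            break

    participants = set().union(*(c["participants"] for c in ordered))
    shared = {
        "duration": duration,
        "response_time": response_time,
        "num_messages": len(ordered),
        "num_participants": len(participants),
        "first_sender": first_sender,
    }
    # Only thread_position differs between comments of the same thread.
    return [dict(shared, thread_position=i) for i in range(len(ordered))]


thread = [
    {
        "timestamp": datetime(2021, 8, 3, 9, 57, 42),
        "sender": "alice@example.com",
        "participants": {"alice@example.com", "bob@company.com", "joe@company.com"},
    },
    {
        "timestamp": datetime(2021, 8, 3, 10, 2, 42),
        "sender": "bob@company.com",
        "participants": {"alice@example.com", "bob@company.com"},
    },
]
props = thread_properties(thread)
```

Recomputing everything from the full list of comments reflects the statement that the properties of a thread are updated each time a comment is added to it.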
Support Tickets
In addition to the main text, a typical support ticket submitted via a form may have a subject, information about the sender (such as name or email address), and additional structured data (such as the topic of the ticket) which can be uploaded as part of the user properties of the comment.
The example below shows how to format a support ticket as a Re:infer comment and how that comment will be displayed in the platform UI. Of course, your user properties may be different depending on the data you collect.
{
  "id": "dbcb03ad",
  "timestamp": "2020-02-26T16:09:00Z",
  "messages": [
    {
      "body": {
        "text": "Hi Support Team\n\nPlease could you look into my broadband service network status. I don't have any signal."
      },
      "subject": {
        "text": "Network Outage for over 24 hours - Customer account number 1234567"
      },
      "from": "alice.smith@example.com"
    }
  ],
  "user_properties": {
    "string:Customer Name": "Alice Smith",
    "string:Source": "Support Form",
    "string:Topic": "Broadband"
  }
}
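A form submission could be mapped onto this structure along the following lines. The shape of the `form` dictionary and the function name `ticket_to_comment` are assumptions made for illustration; your own ticketing system will expose different field names.

```python
def ticket_to_comment(form: dict) -> dict:
    """Map a hypothetical support-form submission onto a comment."""
    return {
        "id": form["ticket_id"],  # assumed to already be a hexadecimal string
        "timestamp": form["submitted_at"],
        "messages": [
            {
                "body": {"text": form["message"]},
                "subject": {"text": form["subject"]},
                "from": form["email"],
            }
        ],
        # Structured form data becomes typed user properties.
        "user_properties": {
            "string:Customer Name": form["name"],
            "string:Source": "Support Form",
            "string:Topic": form["topic"],
        },
    }


ticket = ticket_to_comment(
    {
        "ticket_id": "dbcb03ad",
        "submitted_at": "2020-02-26T16:09:00Z",
        "email": "alice.smith@example.com",
        "name": "Alice Smith",
        "topic": "Broadband",
        "subject": "Network Outage for over 24 hours - Customer account number 1234567",
        "message": "Hi Support Team\n\nPlease could you look into my broadband service network status. I don't have any signal.",
    }
)
```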
Chat
Of all comment types, chat is the only one where you should set multiple messages per comment in the messages array (all other types of comments have one message per comment). Additionally, for each message you can set the sender and the time the message was sent. As with support tickets, you will typically have structured data (such as chat duration or resolution status) that you can upload as part of the comment's user properties.
The example below shows how to format a chat conversation as a Re:infer comment and how that comment will be displayed in the platform UI. Of course, your user properties may be different depending on the data you collect.
{
  "id": "5be6a3e4",
  "timestamp": "2020-02-28T19:20:22Z",
  "user_properties": {
    "number:Duration": 542,
    "string:Close Reason": "Complete Acknowledged",
    "string:Resolution": "Unknown"
  },
  "messages": [
    {
      "body": {
        "text": "Hi, my name is Alice 👋 How can I help?"
      },
      "sent_at": "2020-02-28T19:20:01Z",
      "from": "Agent"
    },
    {
      "body": {
        "text": "Hi. I would like to close my account"
      },
      "sent_at": "2020-02-28T19:22:39Z",
      "from": "Customer"
    },
    {
      "body": {
        "text": "Thanks for waiting. Please call our account team at this number: 012-3456-7890."
      },
      "sent_at": "2020-02-28T19:27:50Z",
      "from": "Agent"
    },
    {
      "body": {
        "text": "Ok, thanks, I will follow up with them"
      },
      "sent_at": "2020-02-28T19:28:31Z",
      "from": "Customer"
    },
    {
      "body": {
        "text": "Sure thing! Anything else I can help you with today?"
      },
      "sent_at": "2020-02-28T19:28:42Z",
      "from": "Agent"
    },
    {
      "body": {
        "text": "No. Thanks."
      },
      "sent_at": "2020-02-28T19:29:03Z",
      "from": "Customer"
    }
  ]
}
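A chat transcript could be turned into a multi-message comment as sketched below. The input format, a list of `(sent_at, sender, text)` tuples, and the function name `chat_to_comment` are hypothetical. The turns are sorted by their sent_at value because the Reference describes conversations as a chronological series of messages; this relies on all timestamps being ISO-8601 strings in the same timezone, which sort chronologically as plain text.

```python
def chat_to_comment(chat_id: str, timestamp: str, turns: list, user_properties: dict) -> dict:
    """Build one comment whose messages array holds every chat turn."""
    messages = [
        {"body": {"text": text}, "sent_at": sent_at, "from": sender}
        for sent_at, sender, text in sorted(turns, key=lambda turn: turn[0])
    ]
    return {
        "id": chat_id,
        "timestamp": timestamp,
        "user_properties": user_properties,
        "messages": messages,
    }


# Turns deliberately given out of order to show the sorting step.
turns = [
    ("2020-02-28T19:22:39Z", "Customer", "Hi. I would like to close my account"),
    ("2020-02-28T19:20:01Z", "Agent", "Hi, my name is Alice 👋 How can I help?"),
]
chat = chat_to_comment(
    "5be6a3e4",
    "2020-02-28T19:20:22Z",
    turns,
    {"number:Duration": 542, "string:Resolution": "Unknown"},
)
```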
Comments created by integrations
Emails (Microsoft Exchange)
Microsoft Exchange emails ingested into Re:infer via the Exchange integration are automatically converted into comment objects in the same way as raw emails.
Reference
Comments
See the table below for a list of available comment fields. If you are unfamiliar with Re:infer comment objects, please refer to the Overview.
Name | Type | Required | Description |
---|---|---|---|
id | string | yes | Identifies a comment uniquely within a source. Any hexadecimal string of up to 1024 characters is valid (conforms to /[0-9a-f]{1,1024}/). |
timestamp | string | yes | An ISO-8601 timestamp indicating when the comment was created. If the timestamp does not specify a timezone, UTC will be assumed. The timestamp must be in the range 1950-01-01T00:00:00Z to 2049-12-31T23:59:59Z inclusive. |
messages | array<Message> | yes | An array of zero or more messages. Conversations are represented as a chronological series of messages, whilst a single piece of text should be a single-element array. |
user_properties | map<string, string | number> | no | Any user-defined metadata that applies to the comment. There are two possible types: string and number. The key of a user property has the format "type:name", e.g. "string:Domain Name" or "number:Star Rating". The user property name may consist of letters, numbers, spaces, and underscores, and may contain up to 32 characters (conforms to /\w([\w ]{0,30}\w)?/). The value must be a string or a number depending on the type of the user property. |
thread_id | string | no | An ID uniquely identifying an email thread. Any hexadecimal string of up to 1024 characters is valid (conforms to /[0-9a-f]{1,1024}/). |
uid | string | set by Re:infer | A combined source and comment ID in the form of source_id.comment_id. Do not set this field directly; it is generated automatically by Re:infer for uploaded comments. |
created_at | string | set by Re:infer | An ISO-8601 timestamp with the same constraints as the timestamp field. Do not set this field directly; it is generated automatically by Re:infer when the comment is created. |
updated_at | string | set by Re:infer | An ISO-8601 timestamp with the same constraints as the timestamp field. Do not set this field directly; it is generated automatically by Re:infer when the comment is updated. |
Where Message has the following format:
Name | Type | Required | Description |
---|---|---|---|
body | Content | yes | An object containing the main body text of the message. |
subject | Content | no | An object containing the message's subject. |
signature | Content | no | An object containing the message's signature. |
from | string | no | The message sender. |
to | array<string> | no | An array of primary recipients. |
cc | array<string> | no | An array of carbon-copy recipients. |
bcc | array<string> | no | An array of blind carbon-copy recipients. |
sent_at | string | no | An ISO-8601 timestamp indicating when the message was created. If the timestamp does not specify a timezone, UTC will be assumed. |
language | string | no | The original language of the message. If this is supplied, both text and translated_from should be supplied for the Content fields. |
Where Content has the following format:
Name | Type | Required | Description |
---|---|---|---|
text | string | yes | If language (other than the source's language) has been supplied, this should be the translated text of the content. Otherwise, it should be in the original language in which it was collected; it will be translated if it is not in the source's language and the source has should_translate set to true. At most 65536 characters. |
translated_from | string | no | If language (other than the source's language) has been supplied, this should be the original text of the content. Supplying this field without having supplied a language will result in an error. At most 65536 characters. |
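Several of the constraints in the tables above can be checked locally before uploading. The following is a rough sketch of such a pre-upload check, not a replacement for the validation Re:infer itself performs. The function names and error messages are hypothetical, and only the ID format, timestamp range, user property keys and value types, and body text length are covered.

```python
import re
from datetime import datetime, timezone

ID_PATTERN = re.compile(r"[0-9a-f]{1,1024}")
# Python's \w also matches non-ASCII letters; whether Re:infer accepts
# those is not stated in the table, so treat this as an approximation.
PROPERTY_NAME_PATTERN = re.compile(r"\w([\w ]{0,30}\w)?")
PROPERTY_TYPES = {"string": str, "number": (int, float)}
MIN_TIMESTAMP = datetime(1950, 1, 1, tzinfo=timezone.utc)
MAX_TIMESTAMP = datetime(2049, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
MAX_TEXT_CHARS = 65536


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, assuming UTC when no timezone is given."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def validate_comment(comment: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    comment_id = comment.get("id")
    if not (isinstance(comment_id, str) and ID_PATTERN.fullmatch(comment_id)):
        problems.append("id must be 1 to 1024 lowercase hexadecimal characters")
    try:
        if not MIN_TIMESTAMP <= parse_timestamp(comment["timestamp"]) <= MAX_TIMESTAMP:
            problems.append("timestamp is outside the supported range")
    except (KeyError, ValueError, AttributeError):
        problems.append("timestamp is missing or not ISO-8601")
    for key, value in comment.get("user_properties", {}).items():
        kind, _, name = key.partition(":")
        expected = PROPERTY_TYPES.get(kind)
        if expected is None or not PROPERTY_NAME_PATTERN.fullmatch(name):
            problems.append(f"invalid user property key: {key}")
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"value of {key} does not match its declared type")
    for message in comment.get("messages", []):
        if len(message["body"]["text"]) > MAX_TEXT_CHARS:
            problems.append("message body text exceeds 65536 characters")
    return problems


good = {
    "id": "dbcb03ad",
    "timestamp": "2020-02-26T16:09:00Z",
    "messages": [{"body": {"text": "I don't have any signal."}}],
    "user_properties": {"string:Topic": "Broadband", "number:Star Rating": 4},
}
bad = {
    "id": "XYZ",
    "timestamp": "1900-01-01T00:00:00Z",
    "messages": [{"body": {"text": "Hello"}}],
    "user_properties": {"bool:Flag": "x", "number:Stars": "five"},
}
good_problems = validate_comment(good)
bad_problems = validate_comment(bad)
```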
Raw Emails
See the table below for a list of available raw email fields.
Name | Type | Required | Description |
---|---|---|---|
headers | Headers | yes | An object containing the headers of the email. |
body | Body | yes | An object containing the main body of the email. |
Where Headers has the following format:
Name | Type | Required | Description |
---|---|---|---|
raw | string | no | One of raw and parsed is required. The raw email headers, given as a single string, with each header on its own line. |
parsed | map<string, string | array<string>> | no | One of raw and parsed is required. The parsed email headers, given as an object with string keys and string or array<string> values. Each key must be ASCII, and represents one email header. Value strings may be any valid UTF-8. Lists of values will be concatenated with , before being set as a single header value. If you require duplicate header keys, please use raw instead. |
Where Body has the following format:
Name | Type | Required | Description |
---|---|---|---|
plain | string | no | At least one of plain and html is required. The plaintext content of the email. At most 65536 characters. |
html | string | no | At least one of plain and html is required. The HTML content of the email. |
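The choice between raw and parsed headers can be automated. The sketch below, which assumes the email is available as RFC 822 text and uses a hypothetical function name `headers_payload`, produces a parsed headers object when every header name occurs once and falls back to the raw header block when a name is repeated (as is common for Received headers), following the guidance in the Headers table.

```python
import re
from collections import Counter
from email import message_from_string


def headers_payload(rfc822_text: str) -> dict:
    """Return a Headers object using "parsed" where possible, else "raw"."""
    message = message_from_string(rfc822_text)
    # Header names are case-insensitive, so count them in lowercase.
    name_counts = Counter(name.lower() for name in message.keys())
    if any(count > 1 for count in name_counts.values()):
        # Duplicate keys cannot be expressed as a map, so send the raw block.
        header_block = re.split(r"\r?\n\r?\n", rfc822_text, maxsplit=1)[0]
        return {"raw": header_block}
    return {"parsed": {name: str(value) for name, value in message.items()}}


unique_headers = headers_payload(
    "From: a@example.com\nTo: b@example.com\nSubject: Hi\n\nBody"
)
repeated_headers = headers_payload(
    "Received: by mx1\nReceived: by mx2\nFrom: a@example.com\n\nBody"
)
```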