Communication Mining - Threads issue

Hi @Ramya_K,

There are a couple of options here, but it basically boils down to what Anil G wrote: find a common identifier. Depending on how and from where you get your emails in the first place, you might look for a Thread ID / index, or similarly named properties, e.g. ‘conversation ID / index’. Just be careful and test in which scenarios that identifier changes and in which it doesn't (e.g. moving mail between folders, domain name changes in addresses, etc.).

If you know upfront that two or more emails are related, you can also generate an ID of your own, like a hash, and put it into a user property.
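A rough sketch of the "own ID" idea, in Python. It assumes each email already has some stable identifier of its own; the `id` field, the `make_thread_id` helper and the `thread_id` property name are all made up for illustration, so adapt them to whatever your mail source and dataset actually use.

```python
import hashlib


def make_thread_id(message_ids):
    """Derive a stable thread id from the ids of emails known to belong together."""
    # Sort and de-duplicate so the same set of emails always yields the same id,
    # regardless of the order in which they were collected.
    canonical = "\n".join(sorted(set(message_ids)))
    # A shortened SHA-256 hex digest is enough as a grouping key.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


emails = [
    {"id": "msg-002", "subject": "RE: Invoice 123"},
    {"id": "msg-001", "subject": "Invoice 123"},
]

thread_id = make_thread_id(e["id"] for e in emails)
for e in emails:
    # "thread_id" is a hypothetical property name, not a built-in CM field.
    e["user_properties"] = {"thread_id": thread_id}
```

Because the IDs are sorted before hashing, you get the same value no matter in which order the emails were read.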

Another way to tackle this would be to upload to CM with the API rather than Studio Activities. When uploading with Activities, the body of your emails is automatically parsed by the platform, and some generic parts, like signatures, get removed or wrapped into predefined elements. With the API you have full control over the body, which you can build as a concatenation of 2 email bodies that you know are a thread. Just mind the 62k-something character limit for a single CM message.
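Building the combined body could look something like the sketch below. It only covers the concatenation and the length check, not the upload call itself. `build_thread_body` and the separator are my own invention, and the limit is passed in as a parameter on purpose: check the exact value in the docs, the `60_000` in the usage line is just a cautious placeholder.

```python
def build_thread_body(bodies, max_chars, separator="\n\n-----\n\n"):
    """Join email bodies (oldest first) into one text and enforce a length limit."""
    # Skip empty bodies and trim surrounding whitespace before joining.
    combined = separator.join(b.strip() for b in bodies if b.strip())
    if len(combined) > max_chars:
        # Fail loudly instead of truncating, so no part of a thread is lost silently.
        raise ValueError(
            f"Combined body is {len(combined)} characters, limit is {max_chars}"
        )
    return combined


body = build_thread_body(
    ["Hi, where is my invoice?", "Attached, sorry for the delay."],
    max_chars=60_000,  # placeholder below the real limit; verify the exact number
)
```

Raising an error rather than cutting the text is a deliberate choice here: you probably want to decide yourself what to do with an oversized thread (split it, drop quoted replies, etc.).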

Lastly, you can use your model without uploading messages into CM at all (it still costs 1 AI unit per message). For that you use the Predict endpoint or Activity, which gives you a response straight away for you to parse. In this scenario you would call that endpoint or activity for each email in your thread and combine the data from the responses accordingly.
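How you combine the responses is up to you. As one possible approach, the sketch below keeps the highest confidence seen for each label across the thread. Note the assumptions: `predict` is a hypothetical wrapper you would write around the Predict endpoint or activity, and I pretend it returns a simple label-to-confidence mapping. The real response has a different shape that you need to parse first.

```python
def combine_thread_predictions(bodies, predict):
    """Call `predict` once per email and keep the highest confidence per label."""
    combined = {}
    for body in bodies:
        # `predict` is assumed to return {label_name: confidence} for one email.
        for label, confidence in predict(body).items():
            combined[label] = max(confidence, combined.get(label, 0.0))
    return combined


def fake_predict(body):
    # Stand-in for a real call, only here so the sketch runs offline.
    if "invoice" in body.lower():
        return {"Invoice query": 0.9}
    return {"Chaser": 0.7}


thread_labels = combine_thread_predictions(
    ["Where is my invoice?", "Any update on this?"],
    fake_predict,
)
```

Taking the maximum is just one rule. Depending on your use case, taking only the latest email's labels or averaging might make more sense.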

Cheers,
Tom