Having defined issue taxonomy and a comprehensive set of labels, the next step builds an ML training dataset.
As mentioned in the , Dell supports agents use CRM to record customer issues, queries, requests, and other important diagnostic/troubleshooting notes, which are known as “Case Logs.” Case logs are free text entered by agents that contain information about customer-facing issues, diagnostic steps, and resolution recommendations. Therefore, we decided to use these logs for training our ICC model. The logs are saved in a database along with other key information related to products and troubleshooting activities.
To build the training dataset, two approaches were used:
The results are multiclass datasets for consumer issues with approximately 24 K records in English, Mandarin, and Portuguese collected during 2019 to 2020. Each record primarily consists of two textual fields, which make the source: log title, log description and the target into one or more standardized T1-T2-T3 labels. In addition, a primary key was also stored for traceability – linking to each diagnostic/communication activity.