This task involved constructing a hierarchical taxonomy to standardize technical issue descriptions. As a starting point, we used several existing technical issue taxonomies maintained by different business analytics applications and teams. In parallel, we performed an Exploratory Data Analysis (EDA) of case logs to understand the detailed requirements of the taxonomy, including capturing technical issue semantics and nomenclature while minimizing duplication and the number of categories (see Exploratory data analysis). We opted for a three-tier label hierarchy for describing each technical issue, where the:
For example, consider the following T1-T2-T3 label:
Here, T1 = [Power] is the least granular level of information; under Power, T2 = [AC Adapter] narrows the issue to a component, and T3 = [AC Adapter Noise Issue] is the most granular level of information provided.
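To make the three tiers concrete, the example label can be held in a small immutable record. This is a minimal sketch, not the representation used in the original work: the class name `IssueLabel`, the field names, and the `" > "` separator are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueLabel:
    """One T1-T2-T3 label; tiers run from least to most granular.

    Hypothetical structure for illustration only.
    """
    t1: str  # least granular tier, e.g. "Power"
    t2: str  # intermediate tier, e.g. "AC Adapter"
    t3: str  # most granular tier, e.g. "AC Adapter Noise Issue"

    def path(self, sep: str = " > ") -> str:
        # The separator is an assumption; the source does not specify one.
        return sep.join((self.t1, self.t2, self.t3))


label = IssueLabel(t1="Power", t2="AC Adapter", t3="AC Adapter Noise Issue")
print(label.path())
```

Freezing the dataclass makes labels hashable, so they can serve as dictionary keys or set members when counting or de-duplicating labels.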
The T1-T2-T3 label hierarchy can succinctly describe over 85 percent of frequently occurring technical issues. In addition to deriving labels with the aid of EDA, we liaised with domain experts to consolidate several existing taxonomies and curate a set of all-encompassing T1-T2-T3 labels.
Furthermore, a set of syntactic rules was established for formatting each T1-T2-T3 label. These rules ensure conformity and standardization as new labels are added. Example formatting rules include:
Table 1. Example T1-T2-T3 labels for describing consumer technical issues
Tier 1 | Tier 2 | Tier 3 |
Bluescreen | After OSRI | On Boot |
Backup & Data Management | Backup Products & Services | Drivers |
Chassis | Damage | Hinges |
Display (internal) | Damage | LCD bezel |
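The rows of Table 1 can be folded into a nested mapping that mirrors the tier hierarchy. The sketch below is one possible in-memory form, assuming a plain dictionary of dictionaries; the names `ROWS`, `build_tree`, and `taxonomy` are hypothetical and do not come from the original work.

```python
from collections import defaultdict

# The four example rows from Table 1, as (Tier 1, Tier 2, Tier 3) tuples.
ROWS = [
    ("Bluescreen", "After OSRI", "On Boot"),
    ("Backup & Data Management", "Backup Products & Services", "Drivers"),
    ("Chassis", "Damage", "Hinges"),
    ("Display (internal)", "Damage", "LCD bezel"),
]


def build_tree(rows):
    """Group flat T1-T2-T3 rows into {T1: {T2: [T3, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for t1, t2, t3 in rows:
        tree[t1][t2].append(t3)
    # Convert to plain dicts so lookups of unknown keys raise KeyError
    # instead of silently creating empty entries.
    return {t1: dict(children) for t1, children in tree.items()}


taxonomy = build_tree(ROWS)
print(taxonomy["Chassis"])
```

Note that "Damage" appears under both Chassis and Display (internal), so in these examples a Tier 2 name identifies a category only in combination with its Tier 1 parent.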
A major concern with having a fixed set of T1-T2-T3 labels for describing technical issues is that novel, previously unseen issues may not be represented in the taxonomy and therefore may not be captured by downstream ML models. To address this concern, a label governance process has been established that ensures new standardized labels are introduced periodically while minimizing duplication with existing labels. Another concern is that the number of labels grows over time with these additions; the governance process therefore also includes steps to invalidate or remove infrequent or outdated labels.
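The two governance checks described above, rejecting near-duplicate additions and flagging rarely used labels, could be sketched as follows. This is not the governance process from the original work, whose steps are not detailed here. The duplicate test (case and whitespace normalization), the usage threshold `min_count`, and every function name are assumptions, and the usage counts are invented purely for the demonstration.

```python
from collections import Counter


def normalize(label):
    """Lowercase each tier and collapse whitespace (assumed duplicate criterion)."""
    return tuple(" ".join(tier.lower().split()) for tier in label)


def is_duplicate(candidate, existing):
    """True if the candidate matches an existing label after normalization."""
    seen = {normalize(label) for label in existing}
    return normalize(candidate) in seen


def labels_to_retire(existing, usage, min_count):
    """Return labels used fewer than min_count times (hypothetical threshold)."""
    return [label for label in existing if usage[label] < min_count]


existing = [
    ("Chassis", "Damage", "Hinges"),
    ("Display (internal)", "Damage", "LCD bezel"),
]
# Illustrative usage counts only; these are not figures from the study.
usage = Counter({existing[0]: 120})

duplicate = is_duplicate(("chassis", "damage ", "hinges"), existing)
retire = labels_to_retire(existing, usage, min_count=10)
print(duplicate, retire)
```

A `Counter` returns zero for labels it has never seen, so labels with no recorded usage are flagged without special handling. In practice a semantic similarity check would likely be needed as well, since string normalization only catches formatting variants of the same label.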