Detecting problem calls

A Spescom DataVoice technology overview
Johannesburg, 08 Oct 2004

In a contact centre environment it is often crucial to observe client interactions that may have a harmful effect on business. Scrutinising these interactions is relatively simple for a small contact centre. However, as the number of customer service agents (CSAs) increases, monitoring all exchanges, or even a meaningful percentage of them, becomes increasingly difficult.

Customer dissatisfaction is more prevalent than reported. CSAs cannot be expected to report on client discontent and supervisors cannot be expected to effectively manage more than 10 operators if no means of separating good and bad interactions exists.

Measuring productive output as an indicator of successful interaction is too indirect and removed from the cause. The random monitoring of calls does not yield enough data to spot trends. Effectively, supervisors spend more time monitoring and listening to recordings than analysing and coaching CSAs.

A tool that retrieves records of both good and poor customer interactions would be invaluable. This tool could assist in recovering the relationship or training the agent to employ effective communication.

Finding problem calls

Problem calls have distinct characteristics that can be analysed by identifying the call profile.

The call profile is characterised by the following:

* Emotions, the most obvious feature
* Speech content, i.e. the words spoken
* The rhythm of the conversation
* Frequency of speaker switch-over
* Duration of CSA speech and absence of pauses
* Telephony call-control information, such as on-hold duration, number of call transfers and call duration
* Customer data, call classification, and customer contact frequency
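
As an illustration only, the call profile listed above could be held as one record per call. The sketch below is a minimal Python example; the class name `CallProfile` and every field name are assumptions made for this illustration and are not taken from any Spescom DataVoice product.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class CallProfile:
    """One record per call; all field names are hypothetical."""
    emotion_score: float        # e.g. detector confidence that the caller is angry
    keyword_hits: int           # occurrences of flagged words in the speech content
    switchovers_per_min: float  # frequency of speaker switch-over
    csa_talk_ratio: float       # share of the call during which the CSA speaks
    pause_ratio: float          # share of the call with no speech
    hold_seconds: float         # telephony: total on-hold duration
    transfers: int              # telephony: number of call transfers
    duration_seconds: float     # telephony: call duration
    contacts_last_30_days: int  # customer data: contact frequency

    def as_vector(self) -> List[float]:
        """Flatten the profile into a numeric feature vector (field order)."""
        return [float(value) for value in asdict(self).values()]
```

Keeping the profile as a flat numeric vector makes it straightforward to pass the same record to filters, statistical models and ranking steps.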

Once the problem call has been detected, voice and screen recording become indispensable tools that allow supervisors to establish the actual conversation profile including the actions or circumstances of the situation. They can then use the information to coach and improve the quality of service offered by CSAs.

What do we listen for?

Emotions expressed by the client are probably the strongest indication of how a conversation is progressing. Emotions like fear and anger are regarded as counter-productive while joy contributes to the success of a conversation.

Some emotions may be neutral or dependent on context. Absence of emotion may indicate either success or failure, depending on the situation. Unfortunately, emotion is also one of the most difficult parameters to detect, whether by machine or by a human listener.

Studies have shown that humans are able to distinguish anger best (73% accuracy) and fear worst (50% accuracy) [1]. Listeners most often confuse sadness with fear, sadness with no emotion, and happiness with fear.

Emotions are best identified in speech by pitch (maximum, range and mean), speech rate, intensity or volume and speech quality. Often, the rate of change of these parameters gives the best indication.
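
The pitch statistics named above can be sketched in a few lines. This minimal example assumes a pitch contour has already been extracted by some other tool as one value in Hz per voiced frame; the function name `pitch_features` and the 10 ms default frame length are assumptions made for illustration.

```python
from statistics import mean


def pitch_features(pitch_hz, frame_seconds=0.01):
    """Summarise a pitch contour (one value in Hz per voiced frame)."""
    if len(pitch_hz) < 2:
        raise ValueError("need at least two voiced frames")
    # Frame-to-frame change, converted to Hz per second.
    deltas = [(b - a) / frame_seconds for a, b in zip(pitch_hz, pitch_hz[1:])]
    return {
        "max": max(pitch_hz),
        "range": max(pitch_hz) - min(pitch_hz),
        "mean": mean(pitch_hz),
        # Average speed at which pitch moves, regardless of direction.
        "mean_abs_rate_of_change": mean(abs(d) for d in deltas),
    }
```

The same pattern (level statistics plus a rate-of-change statistic) could be applied to intensity or speech rate.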

Other indications of conversation quality include aspects like rhythm of the conversation, rate of switchover from one speaker to another and presence of over-speak.

Frequent switch-over could denote a lively discussion, but it more probably indicates a problem, as CSAs are expected to direct and control the conversation to provide optimum customer satisfaction.

The rhythm of the conversation reflects how regularly both caller and CSA are able to contribute without interruption. It is also representative of client participation and interest. The measurement of pauses and the total non-speech duration in a call falls into the same category.

Over-speak, which occurs when one speaker interrupts the other, suggests a problem on either the CSA or the client side.
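
The switch-over, pause and over-speak measurements described above can be sketched as follows. The example assumes that speech has already been segmented into (speaker, start, end) tuples with times in seconds from the start of the call; the function name and the output keys are hypothetical.

```python
def conversation_metrics(segments, call_seconds):
    """segments: list of (speaker, start_s, end_s) tuples for one call."""
    ordered = sorted(segments, key=lambda s: s[1])

    # Speaker switch-overs: consecutive segments with different speakers.
    switches = sum(1 for a, b in zip(ordered, ordered[1:]) if a[0] != b[0])

    # Over-speak: time during which segments of different speakers overlap.
    overspeak = 0.0
    for i, (spk_a, start_a, end_a) in enumerate(ordered):
        for spk_b, start_b, end_b in ordered[i + 1:]:
            if spk_a != spk_b:
                overspeak += max(0.0, min(end_a, end_b) - max(start_a, start_b))

    # Non-speech: call length minus the union of all speech intervals.
    speech, covered_until = 0.0, 0.0
    for _, start, end in ordered:
        if end > covered_until:
            speech += end - max(start, covered_until)
            covered_until = end

    return {
        "switchovers_per_min": switches / (call_seconds / 60.0),
        "overspeak_seconds": overspeak,
        "non_speech_ratio": (call_seconds - speech) / call_seconds,
    }
```

For a 60-second call with segments `("csa", 0, 10)`, `("client", 8, 20)` and `("csa", 25, 30)`, this gives two switch-overs, two seconds of over-speak and 35 seconds of non-speech.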

All these parameters are simple to measure and use in speech analysis and problem call detection.

Spescom DataVoice speech analysis

For the past 15 years Spescom DataVoice has been actively involved in speech analysis research and development. It has developed world-leading technology in this field, such as Recall+ [2]. This application allows users to search recorded calls for specific speakers, speech content, or a combination of both. Mining the content of recorded calls is useful for a variety of reasons and the same technology can be used to find problem calls.

The ability to combine speech analysis results from a variety of sources allows the broader concept of problem call detection to become a reality. This concept is referred to as fusion and enables the combined use of speech and speaker recognition, emotion detection and any of the other indicators to find problem calls.

Fusion is not accomplished easily. Explicit search criteria such as "all calls with more than 20% silence, plus where the customer was transferred more than 3 times" produce measurable results. However, emotion detection and speech and speaker recognition are complex, and their search results are presented as confidence scores. Scores from different sources can only be compared once they have been aligned to a common scale, so that a given value means the same thing for each source.
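
The difference between an explicit criterion and a confidence score that needs aligning can be illustrated with a short sketch. Z-normalisation is used here purely as one common way of putting scores on a shared scale; it is an assumption of this example, not a description of the method used by Spescom DataVoice. The dictionary keys and function names are likewise hypothetical.

```python
from statistics import mean, pstdev


def explicit_match(call):
    """The explicit criterion quoted above: >20% silence and >3 transfers."""
    return call["silence_ratio"] > 0.20 and call["transfers"] > 3


def align_scores(raw_scores):
    """Z-normalise one detector's scores: 0 means 'average for this
    detector', 1 means 'one standard deviation above average'."""
    mu, sigma = mean(raw_scores), pstdev(raw_scores)
    if sigma == 0:
        # All scores identical: nothing distinguishes one call from another.
        return [0.0 for _ in raw_scores]
    return [(score - mu) / sigma for score in raw_scores]
```

After alignment, a detector that reports values between 0 and 1 and another that reports values between 0 and 100 yield directly comparable numbers for the same set of calls.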

Spescom DataVoice has been developing these applications using the N-Best processing approach. Rather than returning a single, best-match result, the system returns a number of results, effectively sorting the data to provide the most likely result at the top of the list.
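
In general terms, an N-Best result is a ranked shortlist rather than a single answer. The sketch below illustrates the idea only; the function name and the (label, confidence) tuple layout are assumptions for this example.

```python
import heapq


def n_best(candidates, n=3):
    """candidates: iterable of (label, confidence) pairs.
    Returns the n most likely candidates, highest confidence first."""
    return heapq.nlargest(n, candidates, key=lambda c: c[1])
```

Keeping several candidates allows a later stage, such as fusion with other indicators, to promote a result that was not ranked first by any single detector.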

The latest research and development is focused on detecting problem calls by creating statistical models of the call profile. Users simply select a sample set of calls that represent a specific problem scenario. A statistical model is then built from these samples and used to find similar conversations.
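
A deliberately simple sketch of this model-from-samples idea follows. It assumes each call has already been reduced to a numeric feature vector and fits an independent Gaussian to each feature; this choice of model, and all function names, are assumptions of the example rather than the statistical model used in the product.

```python
import math
from statistics import mean, pstdev


def fit_profile_model(sample_vectors, min_std=1e-3):
    """Fit an independent Gaussian per feature to the user-selected calls."""
    columns = list(zip(*sample_vectors))
    # min_std avoids division by zero when a feature never varies.
    return [(mean(col), max(pstdev(col), min_std)) for col in columns]


def log_likelihood(model, vector):
    """Higher means the call looks more like the sample set."""
    total = 0.0
    for (mu, sigma), x in zip(model, vector):
        total -= 0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (x - mu) ** 2 / (2 * sigma ** 2)
    return total


def most_similar(model, calls, n=5):
    """calls: dict of call_id -> feature vector. Returns the n best call ids."""
    ranked = sorted(calls, key=lambda cid: log_likelihood(model, calls[cid]),
                    reverse=True)
    return ranked[:n]
```

A user would call `fit_profile_model` on the feature vectors of the selected sample calls, then `most_similar` on the archive to retrieve the recordings that best resemble the scenario.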

These results are combined with word and speaker recognition results using fusion. Consequently a powerful search tool is created, capable of finding information using many different dimensions.
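
One simple way to picture fusion is as a weighted average of scores that are already on a common scale. The sketch below makes that assumption explicit: the weighted average, the detector names and the weights are all hypothetical and are not taken from the Spescom DataVoice implementation.

```python
def fuse(aligned_scores, weights):
    """aligned_scores and weights: dicts keyed by detector name.
    Returns the weighted average of the scores that have a weight."""
    used = [name for name in aligned_scores if name in weights]
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        raise ValueError("no weighted detector scores to fuse")
    return sum(weights[name] * aligned_scores[name] for name in used) / total_weight
```

For example, `fuse({"profile": 1.0, "words": 0.0, "speaker": 2.0}, {"profile": 2.0, "words": 1.0, "speaker": 1.0})` weights the call-profile score twice as heavily as the other two and returns a single fused score for the call.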

Conclusion

As research continues, more accurate statistical mechanisms will evolve, ultimately allowing users to fine-tune solutions to meet individual needs.

Combining the methods discussed, using the DataVoice Speech Analysis tools, will lead to an improved success rate in detecting problem calls.

[1] Emotion in Speech: Recognition and Application to Call Centers, VA Petrushin, Proceedings of ANNIE'99
[2] Statement supported by the results of the 2004 NIST Speaker Recognition Workshop evaluation

Editorial contacts

Susan Richter
Blain Communications
(011) 462 4974
blaincomms@iafrica.com