Basic methods of Text Analysis

May 20, 2020 at 7:00 AM

There are some ‘BASIC’ methods in text analysis and then there are some ‘ADVANCED’ ones!

But first, “Get your BASICs right” :)

There are three basic methods.

1. Word frequency

Word frequency is the number of times certain words or concepts occur in a text. Simply put, it helps you identify the popular words or expressions used in any sort of conversation.

For example: for a retail clothing brand, if most customers often use the word ‘expensive’, this suggests that the pricing may need a review or that the brand needs to be marketed to a more suitable audience.
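To make the idea concrete, here is a minimal sketch of word counting in Python. The sample reviews, the tiny stop-word list and the function name `word_frequencies` are all made up for illustration; a real project would use a fuller stop-word list and a proper tokenizer.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "is", "but", "for", "i", "too"}

def word_frequencies(texts):
    """Count how often each non-stop word occurs across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Invented customer reviews for a clothing brand.
reviews = [
    "The jacket is nice but expensive.",
    "Too expensive for the quality.",
    "Expensive, but I love the fit.",
]

freq = word_frequencies(reviews)
print(freq.most_common(1))  # the single most frequent word and its count
```

Here ‘expensive’ comes out on top because it appears in every review, which is exactly the kind of signal described above.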

2. Collocation (co-location)

Some words often appear together, and this technique helps to identify them. For example: in the customer reviews of Booking.com, words like ‘swimming’ and ‘pool’ tend to appear together rather than individually.

The common types of collocation are:

Bigrams: Two adjacent co-existing words such as ‘time table’, ‘air conditioner’ or ‘ice cream’.

Trigrams: Three adjacent co-existing words in the same order such as ‘to be honest’, ‘for your information’ or ‘out of network’.

Fourgrams: Four adjacent words such as ‘once upon a time’ also exist. However, bigrams and trigrams are relatively more common.

This technique helps to identify semantic structures (that is, words connected by meaning) by counting each bigram or trigram as a single unit rather than as separate words. This improves the granularity of the insights.

Example: For an automotive servicing agency, if most customers talk about ‘delay in delivery’, this suggests that there is a problem with the timely delivery of the serviced automobile.
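A rough sketch of how bigrams and trigrams can be counted is shown below. The feedback sentences and the helper names `ngrams` and `count_ngrams` are invented for this example. Note that this only counts adjacent word groups; dedicated collocation tools usually go further and score how strongly the words are associated, which is not attempted here.

```python
import re
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))

def count_ngrams(texts, n):
    """Count n-grams across all texts (n=2 for bigrams, n=3 for trigrams)."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(ngrams(tokens, n))
    return counts

# Invented feedback for an automotive servicing agency.
feedback = [
    "There was a delay in delivery of my car.",
    "Good service, but delay in delivery again.",
    "Delay in delivery for the second time.",
]

trigram_counts = count_ngrams(feedback, 3)
top_trigram, top_count = trigram_counts.most_common(1)[0]
print(" ".join(top_trigram), top_count)
```

With this sample data the most frequent trigram is ‘delay in delivery’, which is treated as one unit instead of three unrelated words.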

3. Concordance

Concordance helps you understand the context in which a word is generally used across all the texts. This removes some of the ambiguity of human language.

Concordance pulls in all texts that contain the queried word and breaks the texts into 3 parts.

  1. Target: The word you are looking for.
  2. Preceding context: The chunk of words in the text ‘BEFORE’ this target word.
  3. Following context: The chunk of words in the text remaining ‘AFTER’ this target word.

For example: If you notice your customers using the word ‘good’ very often and you want to understand the context of this word, concordance will help you.

Simple, right!?

Concordance normally finds only exact matches of the word. This means that if you want to search for the word ‘application’, simply typing ‘app’ won’t bring up the concordance of ‘application’.
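Here is a small sketch of a concordance lookup that splits each match into the three parts described above. The comments, the function name `concordance` and the `width` parameter (how many words of context to keep on each side) are assumptions made for this example, not part of any particular tool.

```python
import re

def concordance(texts, target, width=3):
    """Return (preceding, target, following) for each exact word match.

    `width` is an illustrative choice: the number of context words
    kept on each side of the target.
    """
    rows = []
    for text in texts:
        tokens = re.findall(r"[A-Za-z']+", text)
        for i, token in enumerate(tokens):
            # Exact (case-insensitive) match on the whole word only.
            if token.lower() == target.lower():
                before = " ".join(tokens[max(0, i - width):i])
                after = " ".join(tokens[i + 1:i + 1 + width])
                rows.append((before, token, after))
    return rows

# Invented customer comments.
comments = [
    "The app is good for quick bookings.",
    "Not good at all, the application keeps crashing.",
]

for before, word, after in concordance(comments, "good"):
    print(f"{before:>15} | {word} | {after}")
```

Searching for ‘good’ shows it used both positively and after ‘Not’, which is the ambiguity concordance helps to uncover. Searching for ‘app’ with this sketch matches only the first comment, since ‘application’ is a different word.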

Continue reading about the advanced techniques used in text analysis in the next blog, as they are core to understanding text analysis as a whole.

Written by: Aishwarya Prasad

At: Medium