Overview:
Text analysis can be used to understand documents (e.g., annual reports, communication products, training manuals, websites, online message boards). It is also useful for interview, focus group, or open-ended survey question data once the data has been transcribed.
Most relevant to our analytical needs in a public consultation situation are content analysis and thematic analysis:
- Content analysis refers to an analysis of the content of the text. Put simply, it means counting words, phrases, or categories.
- Thematic analysis refers to an analysis of broader themes in the text. Put simply, it means identifying key concepts, metaphors, ideas, or trends in the body of text under study.
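As a rough illustration of the counting side of content analysis, the sketch below tallies how often keywords from each category appear in a set of responses. The category names, keyword lists, and sample responses are all hypothetical and only show the mechanics; a real coding scheme would come from the research question and the data.

```python
import re
from collections import Counter

# Hypothetical coding dictionary: category name -> keywords (illustrative only).
CATEGORIES = {
    "cost": {"cost", "price", "expensive", "fee"},
    "access": {"access", "wait", "distance", "hours"},
}

def count_categories(responses, categories=CATEGORIES):
    """Count keyword hits per category across all responses."""
    counts = Counter({name: 0 for name in categories})
    for response in responses:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", response.lower())
        for name, keywords in categories.items():
            counts[name] += sum(1 for word in words if word in keywords)
    return counts

# Hypothetical consultation responses.
responses = [
    "The fee is too expensive for most families.",
    "Long wait times and limited hours make access hard.",
]
category_counts = count_categories(responses)
print(dict(category_counts))
```

Keyword matching like this only supports the counting step. Thematic analysis still depends on a person reading and interpreting the text.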
Divide and Conquer:
When text is analyzed, the body of data is divided into segments.
- Segments are parts of text – “chunks” or “quotations” – that contain the categories or themes being studied.
- The smallest possible unit of analysis is a single word. For example, sometimes the occurrence of specific words will be counted for the purposes of content analysis.
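The segmenting and word-counting ideas above can be sketched as follows. This is a minimal example that assumes segments are paragraphs separated by blank lines, which is only one possible choice of unit; the helper names and sample text are hypothetical.

```python
import re

def split_into_segments(text):
    """Split text into paragraph segments separated by blank lines."""
    return [seg.strip() for seg in re.split(r"\n\s*\n", text) if seg.strip()]

def count_word(segment, word):
    """Count whole-word, case-insensitive occurrences of `word` in a segment."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, segment, flags=re.IGNORECASE))

# Hypothetical transcribed feedback, one paragraph per comment.
text = """Parking downtown is hard to find.

Transit is fine, but parking near the station fills up early. Parking fees are high.

More bike lanes would help."""

segments = split_into_segments(text)
per_segment = [count_word(seg, "parking") for seg in segments]
for seg, n in zip(segments, per_segment):
    print(n, "|", seg)
```

Counting per segment rather than across the whole text keeps each count tied to the passage it came from, which makes it easier to return to the original wording when interpreting results.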
Intercoder Agreement:
To support the reliability of the analysis, it’s best to strive for intercoder agreement:
- Intercoder agreement is a measure of how closely different coders agree on the categories or themes for a text. You can improve intercoder agreement by having two or more coders code portions of the text separately, then come together to discuss and agree on coding approaches.
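The text above does not name a specific agreement statistic. As an assumption, the sketch below uses two common ones for two coders who each assign one code to each segment: simple percent agreement, and Cohen's kappa, which adjusts for agreement expected by chance. The codes and coder data are hypothetical.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of segments to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of segments.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability both coders pick the same code independently.
    p_expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # Both coders used a single identical code throughout.
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes assigned by two coders to the same eight segments.
coder_a = ["cost", "cost", "access", "access", "cost", "access", "cost", "access"]
coder_b = ["cost", "cost", "access", "cost", "cost", "access", "access", "access"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(agreement, kappa)
```

In this example the coders disagree on two of the eight segments. Those are the segments to bring to the discussion when agreeing on a coding approach.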
Process:
The best process for qualitative text analysis can be summed up in the following diagram:

- Importantly, in qualitative coding, this process needs to start with a research question, but the question may change as the research progresses.
- Furthermore, the process is iterative. It usually begins with reading and interpreting the text, and it can be repeated as many times as necessary until the researcher feels they have gleaned all possible insights from the data.