Topics by Alpha (α)

The α value of a topic indicates how prevalent that topic is across the corpus. Topics with a high α appear in many documents with a high weight; as such, they often refer to very general themes, such as the genre of the document collection itself. Topics with a low α are outliers and may appear in only one or two documents.

Documents by Topic Entropy (H)

Topic entropy measures how evenly topic weights are distributed within a document. In a high-entropy document, no individual topic stands out, and the weights tend toward an equiprobable distribution; in a low-entropy document, a few topics carry a disproportionate share of the weight. Documents with very high entropy may be empty or nearly so, leaving the model little evidence to favor any one topic. Documents with very low entropy tend to have high thematic specificity.
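
As a rough illustration of the measure described above, the following sketch computes the Shannon entropy of one document's topic weights. The function name topic_entropy and the sample weights are hypothetical, and the choice of base-2 logarithms is an assumption; the tool that produced these values may use a different base or normalization.

```python
import math

def topic_entropy(weights):
    """Shannon entropy (in bits) of one document's topic weights.

    Assumption: weights are non-negative; they are normalized here
    so that they sum to 1 before the entropy is computed.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")
    probs = [w / total for w in weights]
    # Zero-weight topics contribute nothing (the limit of p*log p is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical documents over four topics.
even = [0.25, 0.25, 0.25, 0.25]      # equiprobable: maximum entropy
focused = [0.97, 0.01, 0.01, 0.01]   # one dominant topic: low entropy

print(topic_entropy(even))
print(topic_entropy(focused))
```

With four topics the maximum possible entropy under this definition is log2(4) = 2 bits, which the evenly weighted document reaches; the focused document scores well below that.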