

Mark A. Greenwood edited this page Nov 26, 2025 · 1 revision

Abuse Overview

Like the general overview tab, this tab summarises the entire dataset (or the filtered subset, if filters have been selected), but it includes only abusive messages. It contains the same visualisations as the general overview tab, plus the additional ones described below.

Abusive words and phrases

Two visualisations depict the most frequent abusive words and phrases. The first is a bar chart showing the most frequent abusive words or phrases found in abusive messages, colour-coded by type. Hovering the mouse over a bar displays the exact frequency of that word or phrase in the abusive messages. Note that this is not the same as its frequency in the dataset overall, because some potentially abusive terms also appear in non-relevant messages (abusive towards someone other than the targets) or in a non-abusive context. Compare, for example, “You are an idiot” (abusive if “you” refers to a targeted journalist) with “the government are idiots” (not abusive towards a journalist).

Abusive phrases categorised by topic
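
The difference between the two frequencies can be illustrated with a minimal Python sketch. It is not the tool's implementation: the record fields (`text`, `abusive`), the term list and the simple substring matching are all assumptions made for illustration, and frequency is assumed to mean the number of messages containing a term.

```python
from collections import Counter

# Hypothetical records: field names and contents are illustrative only.
messages = [
    {"text": "You are an idiot", "abusive": True},
    {"text": "The government are idiots", "abusive": False},
    {"text": "What an idiot, a total liar", "abusive": True},
]
terms = ["idiot", "liar"]


def term_frequencies(records, terms, abusive_only):
    """Count the messages containing each term (case-insensitive substring match)."""
    counts = Counter()
    for record in records:
        if abusive_only and not record["abusive"]:
            continue  # skip messages not labelled as abusive towards a target
        text = record["text"].lower()
        for term in terms:
            if term in text:
                counts[term] += 1
    return counts


in_abusive = term_frequencies(messages, terms, abusive_only=True)
in_dataset = term_frequencies(messages, terms, abusive_only=False)
```

With this made-up data, "idiot" is counted in fewer messages when only abusive messages are considered than across the whole dataset, which is the distinction the bar chart's hover value reflects.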

The second visualisation is a word cloud depicting the top 100 (or as many as exist) abusive words or phrases in abusive messages. As before, word size reflects frequency. The words are not colour-coded. Hovering the mouse over a word or phrase shows its exact frequency, while clicking on it adds it to the search filter.

Abusive phrase word cloud

Abuse Types

Next is a pie chart showing the breakdown of abuse types. The segments are colour-coded, using the same key as the abuse bar chart. Hovering over a segment displays its statistics: frequency, proportion of all abuse and, for the outer ring, proportion of the parent category. For example, sexist and explicit abuse is found in 1655 messages, representing 18.65% of personal abuse and 13.04% of all abuse.

Types of abuse shown in a hierarchy
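
The three hover statistics for an outer-ring segment can be sketched as follows. This is a sketch under stated assumptions rather than the tool's own calculation: the function name `segment_stats` is hypothetical, the counts are made up, and the denominators (parent category total and overall abuse total) are assumed to be supplied by the caller.

```python
def segment_stats(count, parent_total, overall_total):
    """Return frequency, share of the parent category and share of all abuse."""
    return {
        "frequency": count,
        "pct_of_parent": round(100 * count / parent_total, 2),
        "pct_of_all": round(100 * count / overall_total, 2),
    }


# Made-up counts for illustration; they are not taken from the tool.
stats = segment_stats(count=30, parent_total=120, overall_total=200)
```

Because a subtype is always part of its parent category, its share of the parent category is never smaller than its share of all abuse, as in the example quoted above.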

Relation Between Abuse and Topic

The next visualisation shows the relationship between the most frequent abuse terms (y-axis) and the topics associated with the messages containing each term, colour-coded by topic. In the example below, the green topic (Politics) is the most prevalent for all abusive terms, though other topics are also relevant. These topics are the same as the ones in the dataset overview tab.

Abusive phrases and topic breakdown
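
One way to picture the data behind this chart is a per-term count of topics. The sketch below assumes each abusive message carries a list of matched terms and a list of topics; the field names and the "Health" topic are hypothetical, and the tool may compute this differently.

```python
from collections import Counter, defaultdict

# Hypothetical abusive messages with their matched terms and assigned topics.
messages = [
    {"terms": ["idiot"], "topics": ["Politics"]},
    {"terms": ["idiot", "liar"], "topics": ["Politics", "Health"]},
    {"terms": ["liar"], "topics": ["Health"]},
]


def topics_per_term(records):
    """For each term, count the messages per topic in which the term appears."""
    breakdown = defaultdict(Counter)
    for record in records:
        # Sets ensure a message is counted once per (term, topic) pair.
        for term in set(record["terms"]):
            for topic in set(record["topics"]):
                breakdown[term][topic] += 1
    return breakdown


breakdown = topics_per_term(messages)
```

Each entry of `breakdown` corresponds to one bar on the y-axis, and its per-topic counts correspond to the coloured segments of that bar.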

Intersectionality

Intersectionality refers to messages that contain more than one type of abuse. This is depicted using two visualisations, a heat map and a chord diagram, which display the same information in different ways. In the heat map, the darker the colour, the greater the number of overlapping messages, and hovering the mouse over a cell shows frequency information. In the examples below, we see that racist and reputational abuse are strongly linked.

Intersectionality viewed as a heatmap

Intersectionality viewed as a chord diagram
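
The overlap counts that both visualisations display can be thought of as a pairwise co-occurrence count. The following is a minimal sketch, assuming each abusive message is represented by the set of abuse types it contains; the data is made up and the function name `co_occurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical abuse types per message; a single-type message adds no overlap.
messages = [
    {"racist", "reputational"},
    {"racist", "reputational", "sexist"},
    {"sexist"},
]


def co_occurrence(type_sets):
    """Count how many messages contain each unordered pair of abuse types."""
    pairs = Counter()
    for types in type_sets:
        # Sorting gives every pair a single canonical ordering.
        for first, second in combinations(sorted(types), 2):
            pairs[(first, second)] += 1
    return pairs


pairs = co_occurrence(messages)
```

In heat map terms, each pair count would set the shade of one cell. In chord diagram terms, it would set the width of the link between two abuse types.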
