AbuseOverview
This tab provides an overview of the entire dataset (or of the current selection, if filters have been applied), in the same way as the general overview tab, but restricted to abusive messages. It contains the same visualisations as that tab, plus the additional ones described below.
Two visualisations depict the most frequent abusive words and phrases. The first is a bar chart showing the most frequent abusive words or phrases found in abusive messages, colour-coded by type. Hovering the mouse over one of the bars displays the exact frequency of that word or phrase in the abusive messages. Note that this is not the same as its frequency in the dataset overall, as some potentially abusive terms might appear in non-relevant messages (abusive towards someone other than the targets) or in a non-abusive context. Compare, for example, “You are an idiot” (abusive if “you” refers to a targeted journalist) with “the government are idiots” (not abusive towards a journalist).
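
The difference between the two frequencies can be illustrated with a minimal Python sketch. It is not the dashboard's implementation: the sample messages, the `is_abusive` labels and the `term_counts` helper are hypothetical, and it assumes that frequency means the number of messages containing the term.

```python
from collections import Counter

# Hypothetical sample: (text, is_abusive), where is_abusive means
# "abusive towards a target". The labels are assumed to be given.
messages = [
    ("you are an idiot", True),
    ("the government are idiots", False),
    ("what an idiot you are", True),
    ("idiot drivers everywhere", False),
]
terms = ["idiot"]

def term_counts(msgs, terms):
    """Count how many messages contain each term as a whole token."""
    counts = Counter()
    for text, _ in msgs:
        tokens = text.lower().split()
        for term in terms:
            if term in tokens:
                counts[term] += 1
    return counts

# Frequency over the whole sample versus over abusive messages only.
overall = term_counts(messages, terms)
abusive_only = term_counts([m for m in messages if m[1]], terms)

print(overall["idiot"], abusive_only["idiot"])
```

In this sample the term occurs in three messages overall but in only two abusive ones, which corresponds to the smaller figure a bar would report.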

The second visualisation is a word cloud depicting the top 100 abusive words or phrases in abusive messages (or all of them, if fewer than 100 exist). As before, word size reflects frequency. The words are not colour-coded. Hovering the mouse over a word or phrase shows its exact frequency, while clicking on it adds it to the search filter.

Next is a pie chart showing the breakdown of abuse by type. The segments are colour-coded, using the same key as the abuse bar chart. Hovering over a segment displays its statistics: frequency, proportion of all abuse, and (for the outer ring) proportion of the parent category. For example, sexist and explicit abuse is found in 1655 messages, which represents 18.65% of personal abuse and 13.04% of all abuse.
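
As an illustration of how the three hover statistics relate to each other, here is a small Python sketch. The counts, the category names other than personal abuse, and the `segment_stats` helper are hypothetical; it also assumes each message is counted under exactly one subtype, which may not hold in the real data.

```python
# Hypothetical two-level breakdown: parent category -> subtype -> message count.
# These numbers are made up for illustration and are not taken from the dataset.
breakdown = {
    "personal": {"sexist_explicit": 30, "other_personal": 50},
    "other_parent": {"subtype_a": 20},
}

def segment_stats(breakdown, parent, subtype):
    """Return (frequency, % of parent category, % of all abuse) for a subtype."""
    count = breakdown[parent][subtype]
    parent_total = sum(breakdown[parent].values())
    overall_total = sum(sum(subs.values()) for subs in breakdown.values())
    return count, 100 * count / parent_total, 100 * count / overall_total

freq, pct_parent, pct_all = segment_stats(breakdown, "personal", "sexist_explicit")
print(freq, round(pct_parent, 2), round(pct_all, 2))
```

The proportion of the parent category is never smaller than the proportion of all abuse, because the parent total is at most the overall total.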

The next visualisation shows the relationship between the most frequent abuse terms (y-axis) and the topics associated with the messages containing each term, colour-coded by topic. In the example below, we see that the green topic (Politics) is the most prevalent for all abusive terms, though other topics are also relevant. These topics are the same as the ones in the dataset overview tab.
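
A term-by-topic breakdown of this kind can be sketched in Python as a simple cross-tabulation. The records, the "Health" topic and the `term_topic_counts` helper are hypothetical, and the sketch assumes each message comes with a set of abusive terms and a set of topics already attached.

```python
from collections import Counter, defaultdict

# Hypothetical records: (abusive terms found in the message, topics of the message).
records = [
    ({"idiot"}, {"Politics"}),
    ({"idiot", "liar"}, {"Politics", "Health"}),
    ({"liar"}, {"Politics"}),
]

def term_topic_counts(records):
    """For each term, count the messages containing it, split by topic."""
    table = defaultdict(Counter)
    for terms, topics in records:
        for term in terms:
            for topic in topics:
                table[term][topic] += 1
    return table

table = term_topic_counts(records)
for term in sorted(table):
    print(term, dict(table[term]))
```

Each term's counter corresponds to one bar on the y-axis, with one coloured portion per topic.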

Intersectionality refers to messages that contain more than one abuse type. It is depicted using two visualisations: a heat map and a chord diagram. They display the same information in different ways. In the heat map, the darker the colour, the greater the number of messages in which the two abuse types overlap. In the examples below, we see that racist and reputational abuse are strongly linked. Hovering the mouse over a box in the heat map provides frequency information.
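
The overlap counts behind both visualisations can be thought of as a symmetric co-occurrence matrix. The following Python sketch shows one way to build such a matrix; the labelled messages and the `cooccurrence` helper are hypothetical, and this is not necessarily how the dashboard computes its figures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical messages, each represented by the set of abuse types it contains.
labelled = [
    {"racist", "reputational"},
    {"racist", "reputational"},
    {"sexist"},
    {"racist", "sexist"},
    {"reputational"},
]

def cooccurrence(label_sets):
    """Count, for each unordered pair of abuse types, the messages containing both."""
    pairs = Counter()
    for labels in label_sets:
        for a, b in combinations(sorted(labels), 2):
            pairs[(a, b)] += 1
    return pairs

pairs = cooccurrence(labelled)
types = sorted(set().union(*labelled))

# Symmetric matrix: one cell per pair of types, diagonal left at zero.
matrix = [
    [0 if r == c else pairs[tuple(sorted((r, c)))] for c in types]
    for r in types
]

for name, row in zip(types, matrix):
    print(name, row)
```

A heat map would shade each cell of `matrix` by its value, while a chord diagram would draw a ribbon between two types whose width depends on the same count.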


This wiki is a work in progress, and parts of it may still be under construction.