How to Build a Word Collocation Network Graph via Tableau & R? Part 1

DigNo Ape
May 11, 2023

In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation is a type of compositional phraseme, meaning that it can be understood from the words that make it up.

This article shows how to build a tool that presents word collocations as a network graph, with the aim of making language learning more effective and efficient.

This article will focus on the first part of the following two main parts:

  1. Excel and R: Set the coordinates (X1, X2) for each word.
  2. Visualize it in Tableau.

Tools: MS Excel, R

Steps:

1. In Excel, add all Verb-Noun combinations to Tab “edge”.

2. Count each word, verbs and nouns alike, and save the counts in Tab “nodes”.
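If the pairs are already loaded in R, the counts from step 2 can also be derived there rather than by hand in Excel. The following is a minimal sketch, assuming Tab “edge” has columns named Verb and Noun and that a word's count is the number of pairs it appears in. The column names word and count and the toy rows are illustrative and not taken from the workbook.

```r
# Toy verb-noun pairs standing in for Tab "edge" (column names are assumed).
edge <- data.frame(
  Verb = c("grab", "hold", "pay", "hold"),
  Noun = c("attention", "attention", "attention", "meeting"),
  stringsAsFactors = FALSE
)

# Count how often each word appears, whether as a verb or as a noun.
counts <- table(c(edge$Verb, edge$Noun))

# One row per word, mirroring what Tab "nodes" is meant to hold.
# "word" and "count" are hypothetical column names.
nodes <- data.frame(
  word  = names(counts),
  count = as.integer(counts),
  stringsAsFactors = FALSE
)

print(nodes)
```

Keeping the word in the first column matters later, because igraph reads the first column of the vertex data frame as the vertex name.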

3. Run the R script below to compute the layout coordinates X1 and X2 for the network, and export them as “export.xlsx”.

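> library(igraph)    # graph construction and layout
> library(openxlsx)  # read.xlsx() with a sheet argument (assumed to be the openxlsx version)
> library(writexl)   # write_xlsx()
> # Read the Verb-Noun pairs and the word list from the workbook.
> edges <- read.xlsx('Collocation.xlsx', sheet = 'edge')
> nodes <- read.xlsx('Collocation.xlsx', sheet = 'nodes')
> # The first column of nodes must hold the words used in the first two columns of edges.
> net <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)
> # Fruchterman-Reingold layout: one row per word, in the same order as nodes.
> coords <- layout_with_fr(net)
> # data.frame() names the two coordinate columns X1 and X2.
> df <- data.frame(coords)
> write_xlsx(df, 'export.xlsx')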

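4. Paste the X1 and X2 columns from “export.xlsx” into Tab “nodes”. Both must have the same number of rows, because each row of the export belongs to the word in the same row of Tab “nodes”. Also add a group column, which will be used in Tableau.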
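Step 4 can be sketched in R as well, which makes the row-count check explicit. This is an illustration under assumptions: the coordinates are made-up numbers, the column names word and count are hypothetical, and the verb/noun grouping rule is only one possible choice, since the grouping itself is left open here.

```r
# Toy stand-ins for Tab "nodes" and the exported layout (values are made up).
nodes <- data.frame(
  word  = c("grab", "hold", "attention"),
  count = c(1L, 1L, 2L),
  stringsAsFactors = FALSE
)
coords <- data.frame(
  X1 = c(-1.2, 1.1, 0.0),
  X2 = c(0.4, 0.5, -0.3)
)

# The coordinates are matched to words by position, so the row counts must agree.
stopifnot(nrow(coords) == nrow(nodes))
nodes <- cbind(nodes, coords)

# Hypothetical grouping rule: label each word as a verb or a noun.
verbs <- c("grab", "hold")
nodes$group <- ifelse(nodes$word %in% verbs, "verb", "noun")

print(nodes)
```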

5. In Tab “edge”, look up the X1 and X2 values from Tab “nodes” for both Verb (VX1, VX2) and Noun (NX1, NX2).
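The lookup in step 5 corresponds to a keyed match in R. A minimal sketch, assuming the word column in Tab “nodes” is called word (a hypothetical name) and using made-up coordinates:

```r
# Toy stand-ins for Tab "nodes" (with coordinates) and Tab "edge".
nodes <- data.frame(
  word = c("grab", "hold", "attention"),
  X1   = c(-1.2, 1.1, 0.0),
  X2   = c(0.4, 0.5, -0.3),
  stringsAsFactors = FALSE
)
edge <- data.frame(
  Verb = c("grab", "hold"),
  Noun = c("attention", "attention"),
  stringsAsFactors = FALSE
)

# Find the row in nodes that belongs to each verb and each noun.
v <- match(edge$Verb, nodes$word)
n <- match(edge$Noun, nodes$word)

# Every word in an edge must exist in nodes, otherwise the lookup is incomplete.
stopifnot(!anyNA(v), !anyNA(n))

# Copy the coordinates over under the VX*/NX* names used in Tab "edge".
edge$VX1 <- nodes$X1[v]
edge$VX2 <- nodes$X2[v]
edge$NX1 <- nodes$X1[n]
edge$NX2 <- nodes$X2[n]

print(edge)
```

The check on missing matches plays the same role as spotting #N/A results from a lookup formula in Excel.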

6. Create a new Tab “edges” that lists every combination twice: once with its VX1/VX2 values and once with its NX1/NX2 values, both stored in the columns X1 and X2. In the following example, the first X1/X2 row for grab-attention comes from its VX1/VX2 and the second from its NX1/NX2. The same holds for hold-attention.

[Figure: Tab “edges”]
[Figure: Tab “edge”]
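The reshaping in step 6 can be sketched in R by stacking the verb end and the noun end of each pair. The columns pair and point are hypothetical helper names, and the coordinates are made up; presumably each pair of rows describes the two endpoints of one line to be drawn later.

```r
# Toy stand-in for Tab "edge" after the coordinate lookup (values are made up).
edge <- data.frame(
  Verb = c("grab", "hold"),
  Noun = c("attention", "attention"),
  VX1  = c(-1.2, 1.1),
  VX2  = c(0.4, 0.5),
  NX1  = c(0.0, 0.0),
  NX2  = c(-0.3, -0.3),
  stringsAsFactors = FALSE
)

# Label each combination, e.g. "grab-attention".
pair <- paste(edge$Verb, edge$Noun, sep = "-")

# First copy of every pair carries the verb's coordinates, second copy the noun's.
verb_end <- data.frame(pair = pair, point = 1L, X1 = edge$VX1, X2 = edge$VX2,
                       stringsAsFactors = FALSE)
noun_end <- data.frame(pair = pair, point = 2L, X1 = edge$NX1, X2 = edge$NX2,
                       stringsAsFactors = FALSE)

# Stack both copies and keep the two rows of each pair next to each other.
edges <- rbind(verb_end, noun_end)
edges <- edges[order(edges$pair, edges$point), ]
rownames(edges) <- NULL

print(edges)
```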

The next article covers how to visualize the data in Tab “edges” and Tab “nodes” in Tableau. Thank you!
