Generate word correlation network to analyze given data with filters for all highly correlated words excluding stop words.

Arguments

data

the sentiment140 train or test data containing variables user for username, date for date and text for text of the tweet.

user_list

a vector of users for which to filter the dataset

start_date_time

input start_date_time in POSIXct format on which to filter the dataset

end_date_time

input end_date_time in POSIXct format on which to filter the dataset

keyword_list

a list of string keywords on which to filter the dataset

correlation_threshold

threshold beyond which to plot the network

Value

a list object with raw filtered dataframe, word_cors aggregated dataframe that holds the correlated words and a plot representing the network.

Examples

data(sentiment140_test) word_cor_network()
#> $raw #> # A tibble: 354 x 14 #> polarity id date query user text nouns adjectives #> <chr> <int> <dttm> <chr> <chr> <chr> <int> <int> #> 1 Positive 217 2009-05-25 17:29:39 mcdo… Mami… mgg … 7 2 #> 2 Positive 2140 2009-05-20 02:38:17 nike Chet… ew n… 4 2 #> 3 Negative 224 2009-05-25 17:34:51 chen… QCWo… ife?… 9 3 #> 4 Positive 569 2009-06-07 21:38:16 kind… rach… @lon… 8 2 #> 5 Positive 2546 2009-06-08 00:13:48 kind… k8tb… " lo… 7 1 #> 6 Positive 1019 2009-05-11 05:21:25 lebr… unde… atch… 3 1 #> 7 Negative 2110 2009-05-18 01:14:35 Malc… blin… @por… 7 3 #> 8 Positive 256 2009-05-27 23:59:18 goog… maex… " am… 3 1 #> 9 Negative 413 2009-06-02 03:17:04 time… Jaso… " ha… 11 4 #> 10 Positive 1003 2009-05-11 03:18:59 kind… Happ… y Ki… 1 0 #> # … with 344 more rows, and 6 more variables: prepositions <int>, #> # articles <int>, pronouns <int>, verbs <int>, adverbs <int>, #> # interjections <int> #> #> $word_cors #> # A tibble: 5,256 x 3 #> item1 item2 correlation #> <chr> <chr> <dbl> #> 1 flay bobby 1 #> 2 bobby flay 1 #> 3 korea north 0.911 #> 4 north korea 0.911 #> 5 star trek 0.893 #> 6 trek star 0.893 #> 7 api twitter 0.879 #> 8 twitter api 0.879 #> 9 gladwell malcolm 0.820 #> 10 malcolm gladwell 0.820 #> # … with 5,246 more rows #> #> $plot
#>