Generate bigram network to analyze given data with filters for all most common bigrams excluding stop words.

Arguments

data

the sentiment140 train or test data containing variables user for username, date for date and text for text of the tweet.

user_list

a vector of users for which to filter the dataset

start_date_time

input start_date_time in POSIXct format on which to filter the dataset

end_date_time

input end_date_time in POSIXct format on which to filter the dataset

keyword_list

a list of string keywords on which to filter the dataset

counts_quantile

the quantile beyond which to visualize the bigrams

Value

a list object with raw filtered dataframe, bigram_counts aggregated dataframe that holds the frequency counts of bigrams and a plot representing the network.

Examples

bigram_network()
#> $raw #> # A tibble: 354 x 14 #> polarity id date query user text nouns adjectives #> <chr> <int> <dttm> <chr> <chr> <chr> <int> <int> #> 1 Positive 217 2009-05-25 17:29:39 mcdo… Mami… mgg … 7 2 #> 2 Positive 2140 2009-05-20 02:38:17 nike Chet… ew n… 4 2 #> 3 Negative 224 2009-05-25 17:34:51 chen… QCWo… ife?… 9 3 #> 4 Positive 569 2009-06-07 21:38:16 kind… rach… @lon… 8 2 #> 5 Positive 2546 2009-06-08 00:13:48 kind… k8tb… " lo… 7 1 #> 6 Positive 1019 2009-05-11 05:21:25 lebr… unde… atch… 3 1 #> 7 Negative 2110 2009-05-18 01:14:35 Malc… blin… @por… 7 3 #> 8 Positive 256 2009-05-27 23:59:18 goog… maex… " am… 3 1 #> 9 Negative 413 2009-06-02 03:17:04 time… Jaso… " ha… 11 4 #> 10 Positive 1003 2009-05-11 03:18:59 kind… Happ… y Ki… 1 0 #> # … with 344 more rows, and 6 more variables: prepositions <int>, #> # articles <int>, pronouns <int>, verbs <int>, adverbs <int>, #> # interjections <int> #> #> $bigram_counts #> # A tibble: 982 x 3 #> word1 word2 counts #> <chr> <chr> <int> #> 1 time warner 25 #> 2 http bit.ly 24 #> 3 ime warner 7 #> 4 malcolm gladwell 7 #> 5 bobby flay 6 #> 6 http tinyurl.com 6 #> 7 twitter api 6 #> 8 warner cable 6 #> 9 love love 5 #> 10 museum 2 5 #> # … with 972 more rows #> #> $plot
#>