Generate bigram network

Generate bigram network to analyze given data with filters for all most common bigrams excluding stop words.

Arguments

data	the sentiment140 train or test data containing variables `user` for username, `date` for date and `text` for text of the tweet.
user_list	a vector of users for which to filter the dataset
start_date_time	input start_date_time in POSIXct format on which to filter the dataset
end_date_time	input end_date_time in POSIXct format on which to filter the dataset
keyword_list	a list of string keywords on which to filter the dataset
counts_quantile	the quantile beyond which to visualize the bigrams

Value

a list object with raw filtered dataframe, bigram_counts aggregated dataframe that holds the frequency counts of bigrams and a plot representing the network.

Examples

bigram_network()
#> $raw
#> # A tibble: 354 x 14
#>    polarity    id date                query user  text  nouns adjectives
#>    <chr>    <int> <dttm>              <chr> <chr> <chr> <int>      <int>
#>  1 Positive   217 2009-05-25 17:29:39 mcdo… Mami… mgg …     7          2
#>  2 Positive  2140 2009-05-20 02:38:17 nike  Chet… ew n…     4          2
#>  3 Negative   224 2009-05-25 17:34:51 chen… QCWo… ife?…     9          3
#>  4 Positive   569 2009-06-07 21:38:16 kind… rach… @lon…     8          2
#>  5 Positive  2546 2009-06-08 00:13:48 kind… k8tb… " lo…     7          1
#>  6 Positive  1019 2009-05-11 05:21:25 lebr… unde… atch…     3          1
#>  7 Negative  2110 2009-05-18 01:14:35 Malc… blin… @por…     7          3
#>  8 Positive   256 2009-05-27 23:59:18 goog… maex… " am…     3          1
#>  9 Negative   413 2009-06-02 03:17:04 time… Jaso… " ha…    11          4
#> 10 Positive  1003 2009-05-11 03:18:59 kind… Happ… y Ki…     1          0
#> # … with 344 more rows, and 6 more variables: prepositions <int>,
#> #   articles <int>, pronouns <int>, verbs <int>, adverbs <int>,
#> #   interjections <int>
#> 
#> $bigram_counts
#> # A tibble: 982 x 3
#>    word1   word2       counts
#>    <chr>   <chr>        <int>
#>  1 time    warner          25
#>  2 http    bit.ly          24
#>  3 ime     warner           7
#>  4 malcolm gladwell         7
#>  5 bobby   flay             6
#>  6 http    tinyurl.com      6
#>  7 twitter api              6
#>  8 warner  cable            6
#>  9 love    love             5
#> 10 museum  2                5
#> # … with 972 more rows
#> 
#> $plot
#>

Arguments

Value

Examples

Contents