ngrams that most frequently contain a keyword in a Gallicagram corpus over a period
gallicagram_with.Rd
Returns the most frequent ngrams containing a keyword over a given period.
Usage
gallicagram_with(
  keyword,
  corpus = "lemonde",
  from = "earliest",
  to = "latest",
  n_results = 20,
  after = FALSE,
  length = 2
)
Arguments
- keyword
A character string. Keyword to search. The string cannot contain more words than the max_length for this corpus, as indicated in the list_corpora dataset.
- corpus
A character string. The corpus to search. The list of available corpora can be found in the list_corpora dataset.
- from
An integer or "earliest". The starting year. If set to "earliest", it uses the earliest date at which the data is reliable for this corpus, as described in list_corpora.
- to
An integer or "latest". The end year. If set to "latest", it uses the latest date at which the data is reliable for this corpus, as described in list_corpora.
- n_results
An integer. The number of most frequently associated words to return. n_results can also be set to "all" to return all the available results.
- after
A boolean. Whether to consider only words following the keyword and not those preceding. Set to FALSE by default.
- length
An integer. The length of the ngrams considered. Can be up to 3 in the "books" and "press" corpora and 4 in the "lemonde" corpus.
Value
A tibble with the n_results most frequent ngrams containing the keyword searched (ngram) and the number of occurrences over the period (n_occur).
It also returns the input parameters keyword, corpus, from and to.
Details
This function is only available for the three main corpora (historical press, Gallica books, Le Monde newspaper).
This function corresponds to the Joker route of the API, accessed through the 'Joker' function on the Gallicagram app.
When length = 1, it is analogous to the 'Joker' function on Ngram Viewer.
It is analogous to gallicagram_with_month but for a period instead of a given month.
For instance, "camarade" is often followed by "staline" or "khrouchtchev" in Le Monde. With after = TRUE, the function returns the most frequent ngrams of the form "camarade *". With after = FALSE, it also includes the most frequent ngrams of the form "* camarade".
Searching the "presse" corpus can require a long running time.
Examples
gallicagram_with("camarade", from = 1960, to = 1970)
#> # A tibble: 20 × 6
#>    n_occur ngram                 keyword  corpus   from    to
#>      <int> <chr>                 <chr>    <chr>   <dbl> <dbl>
#>  1     452 le camarade           camarade lemonde  1960  1970
#>  2     404 son camarade          camarade lemonde  1960  1970
#>  3     256 camarade de           camarade lemonde  1960  1970
#>  4     235 un camarade           camarade lemonde  1960  1970
#>  5     198 du camarade           camarade lemonde  1960  1970
#>  6     124 camarade khrouchtchev camarade lemonde  1960  1970
#>  7     113 leur camarade         camarade lemonde  1960  1970
#>  8      91 notre camarade        camarade lemonde  1960  1970
#>  9      68 d'un camarade         camarade lemonde  1960  1970
#> 10      62 au camarade           camarade lemonde  1960  1970
#> 11      52 mon camarade          camarade lemonde  1960  1970
#> 12      48 camarade dubcek       camarade lemonde  1960  1970
#> 13      44 camarade mao          camarade lemonde  1960  1970
#> 14      44 ancien camarade       camarade lemonde  1960  1970
#> 15      41 camarade et           camarade lemonde  1960  1970
#> 16      31 camarade waldeck      camarade lemonde  1960  1970
#> 17      27 camarade qui          camarade lemonde  1960  1970
#> 18      24 sa camarade           camarade lemonde  1960  1970
#> 19      24 camarade du           camarade lemonde  1960  1970
#> 20      23 camarade togliatti    camarade lemonde  1960  1970
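
The difference between the "camarade *" and "* camarade" forms described in Details can be illustrated offline on a few rows copied from the output above. This is only a sketch of the two patterns in base R, not how the function or the API works internally; the names res, following and preceding are hypothetical, and filtering an after = FALSE result this way is not guaranteed to reproduce what a call with after = TRUE would return.

```r
# A few rows copied from the example output above (illustration only).
res <- data.frame(
  n_occur = c(452L, 404L, 256L, 124L, 48L),
  ngram = c("le camarade", "son camarade", "camarade de",
            "camarade khrouchtchev", "camarade dubcek"),
  stringsAsFactors = FALSE
)
keyword <- "camarade"

# ngrams of the form "camarade *": the keyword comes first
following <- res[startsWith(res$ngram, paste0(keyword, " ")), ]

# ngrams of the form "* camarade": the keyword comes last
preceding <- res[endsWith(res$ngram, paste0(" ", keyword)), ]

following$ngram
preceding$ngram
```

Here following keeps "camarade de", "camarade khrouchtchev" and "camarade dubcek", while preceding keeps "le camarade" and "son camarade".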