# load the corpus
brown = Corpus('brown').load(path_to_brown_corpus)
concordance
Functionality for concordance analysis.
Using the Concordance class
The examples below show how to use the Concordance class directly to output concordances or concordance plots. The recommended way to use this functionality, however, is through the Conc class, which provides an interface to create frequency lists, concordances, collocation tables, keyword tables and more.
Concordance class API reference
Concordance
Concordance (corpus:conc.corpus.Corpus)
Class for concordancing.
Parameter | Type | Details |
---|---|---|
corpus | Corpus | Corpus instance |
# instantiate the Concordance class
report_brown = Concordance(brown)
Concordance.concordance
Concordance.concordance (token_str:str, context_length:int=5, order:str='1R2R3R', page_size:int=20, page_current:int=1, show_all_columns:bool=False, use_cache:bool=True, ignore_punctuation:bool=True, filter_context_str:str|None=None, filter_context_length:int|tuple[int,int]=5)
Report concordance for a token string.
Parameter | Type | Default | Details |
---|---|---|---|
token_str | str | | token string to get concordance for |
context_length | int | 5 | number of words to show on left and right of token string |
order | str | 1R2R3R | order of sort columns - one of 1L2L3L, 3L2L1L, 2L1L1R, 1L1R2R, 1R2R3R, LEFT, RIGHT |
page_size | int | 20 | number of results to display per results page |
page_current | int | 1 | current page of results |
show_all_columns | bool | False | return a df with all columns (True) or just the essentials (False) |
use_cache | bool | True | retrieve the results from cache if available (currently ignored) |
ignore_punctuation | bool | True | whether to ignore punctuation in the concordance sort |
filter_context_str | str \| None | None | if a string is provided, the concordance lines are filtered to show only lines whose context contains this string |
filter_context_length | int \| tuple[int, int] | 5 | ignored if filter_context_str is None; otherwise the context window size per side in tokens. If an int (e.g. 5), the left and right context lengths are the same; for independent control of left and right context length, pass a tuple (context_length_left, context_length_right) |
Returns | Result | | concordance report results |
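The int-or-tuple behaviour of filter_context_length can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper names `normalize_context_window` and `context_contains` are hypothetical, and the filter is assumed to be a single token matched exactly.

```python
def normalize_context_window(filter_context_length):
    """Return a (left, right) pair of window sizes in tokens.

    Hypothetical helper: an int applies to both sides,
    a tuple gives (left, right) independently.
    """
    if isinstance(filter_context_length, int):
        return (filter_context_length, filter_context_length)
    left, right = filter_context_length
    return (left, right)


def context_contains(left_tokens, right_tokens, filter_str, filter_context_length=5):
    """Check whether filter_str occurs within the window around the node.

    Assumption: filter_str is a single token and must match exactly.
    """
    left_size, right_size = normalize_context_window(filter_context_length)
    # Guard against left_size == 0, because list[-0:] would return the whole list.
    window = left_tokens[-left_size:] if left_size > 0 else []
    window = window + right_tokens[:right_size]
    return filter_str in window


left = ['he', 'was', 'also', 'pretty']
right = ['anything', 'in', 'the', 'carpentry', 'line']
print(context_contains(left, right, 'carpentry', 5))       # window of 5 per side
print(context_contains(left, right, 'carpentry', (4, 2)))  # only 2 tokens on the right
```

With the symmetric window of 5 the token is found; with (4, 2) the right-hand window stops before it.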
When configuring order, you can use LEFT as a shortcut for 1L2L3L and RIGHT as a shortcut for 1R2R3R.
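As a rough illustration of how such order strings can be read, the sketch below expands the LEFT and RIGHT shortcuts and splits an order into (distance, side) pairs. The names `resolve_order` and `parse_order` are hypothetical and not part of the Concordance API. The reading of 1L as "first token to the left of the node" follows the usual concordancer convention and is an assumption here.

```python
# Shortcuts and valid values as listed in the parameter table above.
ORDER_SHORTCUTS = {'LEFT': '1L2L3L', 'RIGHT': '1R2R3R'}
VALID_ORDERS = {'1L2L3L', '3L2L1L', '2L1L1R', '1L1R2R', '1R2R3R'}


def resolve_order(order):
    """Expand LEFT/RIGHT to the full order string and validate it."""
    resolved = ORDER_SHORTCUTS.get(order, order)
    if resolved not in VALID_ORDERS:
        raise ValueError(f'Unsupported order: {order!r}')
    return resolved


def parse_order(order):
    """Split an order such as '2L1L1R' into [(2, 'L'), (1, 'L'), (1, 'R')]."""
    resolved = resolve_order(order)
    # Each sort column is two characters: a distance digit and a side letter.
    return [(int(resolved[i]), resolved[i + 1]) for i in range(0, len(resolved), 2)]


print(resolve_order('LEFT'))   # 1L2L3L
print(parse_order('2L1L1R'))   # [(2, 'L'), (1, 'L'), (1, 'R')]
```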
Examples
See the note above about accessing this functionality through the Conc class.
report_brown.concordance('good at', context_length=10, order='1R2R3R').display()
Concordance for "good at"
Brown Corpus, Context tokens: 10, Order: 1R2R3R
Doc Id | Left | Node | Right |
---|---|---|---|
484 | about twenty miles away , and he was also pretty | good at | anything in the carpentry line . was a vivid , |
263 | he says , ' as a storyteller and was precociously | good at | description , dialogue , and most of the other staples |
479 | and not a method of passing the day . was | good at | his job . probably was n't hard for him to |
474 | trying to flatter her vanity . You must have been | good at | history at school . did you go to school '' |
82 | enough of unequal merit , but all of them pretty | good at | that . consisted of a new arrangement of ` ` |
474 | Why not '' ? ? said . I 'm not | good at | that kind of thing '' . This afternoon let 's |
Total Concordance Lines: 6
Total Documents: 5
Showing 6 lines
Page 1 of 1
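To make the Left, Node and Right columns concrete, here is a minimal keyword-in-context sketch in plain Python. It is not how the Concordance class works internally. The functions `kwic` and `sort_right` are hypothetical, and the whitespace tokenization, lowercased sort keys and lack of punctuation handling are simplifying assumptions.

```python
def kwic(tokens, node, context_length=5):
    """Return (left, node, right) token lists for each occurrence of a node phrase."""
    node_tokens = node.split()
    n = len(node_tokens)
    lines = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == node_tokens:
            left = tokens[max(0, i - context_length):i]
            right = tokens[i + n:i + n + context_length]
            lines.append((left, node_tokens, right))
    return lines


def sort_right(lines):
    """Sort in the spirit of '1R2R3R': by the first, then second, then third token on the right."""
    return sorted(lines, key=lambda line: [t.lower() for t in line[2][:3]])


text = 'she is good at chess and he is good at art but not good at all'
for left, node, right in sort_right(kwic(text.split(), 'good at', context_length=2)):
    print(' '.join(left), '|', ' '.join(node), '|', ' '.join(right))
```

The three lines come out ordered by their right-hand context (all, art, chess), which mirrors the way the right-hand column is ordered in the report above.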
Concordance.concordance_plot
Concordance.concordance_plot (token_str:str, page_size:int=10, append_info:bool=True)
Create a concordance plot.
Parameter | Type | Default | Details |
---|---|---|---|
token_str | str | | token string for concordance plot |
page_size | int | 10 | number of plots per page |
append_info | bool | True | append token position info to the concordance line previews shown when hovering over the plot lines |
Returns | Plot | | concordance plot object, add .display() to view in notebook |
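A concordance plot conventionally marks where in each document the token occurs. The sketch below shows one way such positions could be computed. It is an assumption-laden illustration rather than the method used by concordance_plot: `plot_positions` is a hypothetical helper, documents are assumed to be pre-tokenized lists, and only single-token matches are handled.

```python
def plot_positions(documents, token):
    """Map each document id to the relative positions (0.0 to 1.0) of a token.

    documents: dict of doc_id -> list of tokens (assumed input shape).
    Documents without any occurrence are left out.
    """
    positions = {}
    for doc_id, tokens in documents.items():
        hits = [i / len(tokens) for i, t in enumerate(tokens) if t == token]
        if hits:
            positions[doc_id] = hits
    return positions


docs = {
    1: ['no', 'cause', 'for', 'alarm'],
    2: ['nothing', 'here'],
}
print(plot_positions(docs, 'cause'))  # {1: [0.25]}
```

Each relative position could then be drawn as a vertical line on a horizontal bar representing the document.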
Examples
See the note above about accessing this functionality through the Conc class.
conc_reuters.concordance_plot(token_str='cause', page_size=10, append_info=True).display()
Concordance Plot for "cause"
Reuters Corpus
Total Documents: 121
Total Concordance Lines: 135