Wordlists

class cltoolkit.wordlist.Wordlist(datasets, ts=None, concept_id_factory=<function Wordlist.<lambda>>)

A collection of one or more lexibank datasets, aligned by concept.

Parameters

datasets (typing.List[pycldf.dataset.Dataset]) – The datasets you want to load, provided as list of pycldf.Dataset.
ts (typing.Optional[pyclts.transcriptionsystem.TranscriptionSystem]) – A TranscriptionSystem (as provided by pyclts), if you want to work with phonological features from CLTS.
concept_id_factory (typing.Callable[[dict], str]) – A callable that receives a dict and returns a string, used to derive concept identifiers.

Variables

datasets –
languages – DictTuple
senses – DictTuple
concepts – DictTuple
forms – DictTuple
graphemes – DictTuple
sounds – DictTuple

iter_forms_by_concepts(concepts=None, languages=None, aspect=None, filter_by=None, flat=False)

Iterate over the concepts in the data and return the forms for the selected languages.

Parameters

concepts – List of concept identifiers, all concepts if not specified.
languages – List of language identifiers, all languages if not specified.
aspect – Attribute of the Form object to return instead of the Form object itself.
filter_by – Use a function to filter the data to be output.
flat – Return a one-dimensional array of the data.

Note

For each concept (selected by ID), the function returns the form for each language, or the requested aspect (attribute) of that form, provided it exists.
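The selection logic described in the note can be illustrated with a small stand-in that does not use cltoolkit at all. Everything here is an assumption made for illustration: the `Form` namedtuple, the helper name `iter_by_concept`, and above all the nested `{language: [items]}` shape of the yielded values, which is not taken from the library. The `flat` option is left out on purpose.

```python
from collections import namedtuple

# Hypothetical, simplified form record; real cltoolkit Form objects carry more data.
Form = namedtuple("Form", ["concept", "language", "value"])


def iter_by_concept(forms, concepts=None, languages=None, aspect=None, filter_by=None):
    """Illustrative stand-in, NOT cltoolkit's implementation.

    Yields (concept_id, {language_id: [form or aspect, ...]}) pairs.
    The yielded structure is an assumption chosen for readability.
    """
    grouped = {}
    for form in forms:
        # Restrict to the requested concepts and languages, if any were given.
        if concepts is not None and form.concept not in concepts:
            continue
        if languages is not None and form.language not in languages:
            continue
        # Apply the user-supplied filter function, if any.
        if filter_by is not None and not filter_by(form):
            continue
        # Return a single attribute instead of the whole form when requested.
        item = getattr(form, aspect) if aspect else form
        grouped.setdefault(form.concept, {}).setdefault(form.language, []).append(item)
    yield from grouped.items()


forms = [
    Form("HAND", "deu", "Hand"),
    Form("HAND", "fra", "main"),
    Form("ARM", "deu", "Arm"),
]
result = dict(iter_by_concept(forms, languages=["deu"], aspect="value"))
print(result)
```

Concepts with no surviving form are simply absent from the result, which matches the "provided this exists" wording of the note.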

class cltoolkit.util.DictTuple(items, **kw)

An object allowing access to items of a tuple as if it were a dict keyed with the id attribute of the contained objects.
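The idea of a tuple that also answers dict-style lookups by `id` can be sketched in a few lines of plain Python. This is a minimal stand-in under stated assumptions, not the DictTuple implementation: the class name `IdKeyedTuple` and the `Language` record are hypothetical, and the sketch assumes that ids are strings and unique.

```python
from collections import namedtuple


class IdKeyedTuple(tuple):
    """Hypothetical stand-in: a tuple whose items can also be looked up by their `id`."""

    def __new__(cls, items):
        self = super().__new__(cls, items)
        # Map each item's id to its position; assumes ids are unique strings.
        self._by_id = {item.id: index for index, item in enumerate(self)}
        return self

    def __getitem__(self, key):
        # String keys are treated as ids; integers and slices behave as in a tuple.
        if isinstance(key, str):
            return super().__getitem__(self._by_id[key])
        return super().__getitem__(key)

    def __contains__(self, key):
        # Membership by id for strings, by equality for anything else.
        if isinstance(key, str):
            return key in self._by_id
        return super().__contains__(key)


# Hypothetical record type standing in for the objects a wordlist would hold.
Language = namedtuple("Language", ["id", "name"])
languages = IdKeyedTuple([Language("deu", "German"), Language("fra", "French")])

print(languages["fra"].name)  # lookup by id
print(languages[0].name)      # lookup by position
```

Keeping the positional index in a separate dict preserves ordinary tuple behaviour (ordering, iteration, `len`) while making id lookups constant time.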