Wordlists

class cltoolkit.wordlist.Wordlist(datasets, ts=None, concept_id_factory=<function Wordlist.<lambda>>)

A collection of one or more lexibank datasets, aligned by concept.

Parameters
  • datasets (typing.List[pycldf.dataset.Dataset]) – The datasets you want to load, provided as a list of pycldf.Dataset objects.

  • ts (typing.Optional[pyclts.transcriptionsystem.TranscriptionSystem]) – A TranscriptionSystem (as provided by pyclts), if you want to work with phonological features from CLTS.

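  • concept_id_factory (typing.Callable[[dict], str]) – A function that takes a concept record (a dict) and returns the string used as the concept identifier.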
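
As a rough illustration of the concept_id_factory signature (a callable that takes a dict and returns a str), the sketch below defines two candidate factories. The sample record, its keys (ID, Name, Concepticon_Gloss), and the function names by_gloss and by_name are assumptions made for illustration; check the concept columns of your own datasets before relying on them.

```python
from typing import Callable

# Type alias matching the documented signature of concept_id_factory.
ConceptIdFactory = Callable[[dict], str]

# Hypothetical concept record; the keys are assumed for illustration only.
concept_row = {"ID": "1_hand", "Name": "the hand", "Concepticon_Gloss": "HAND"}


def by_gloss(row: dict) -> str:
    """Use the (assumed) Concepticon_Gloss column as the concept identifier."""
    return row["Concepticon_Gloss"]


def by_name(row: dict) -> str:
    """Use a normalized form of the (assumed) Name column instead."""
    return row["Name"].strip().lower().replace(" ", "_")


# Apply both factories to the same record to compare the resulting identifiers.
concept_ids = [factory(concept_row) for factory in (by_gloss, by_name)]
print(concept_ids)
```

A factory of this kind would be passed as Wordlist(datasets, concept_id_factory=by_gloss).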

Variables
  • datasets

  • languages (DictTuple)

  • senses (DictTuple)

  • concepts (DictTuple)

  • forms (DictTuple)

  • graphemes (DictTuple)

  • sounds (DictTuple)

iter_forms_by_concepts(concepts=None, languages=None, aspect=None, filter_by=None, flat=False)

Iterate over the concepts in the data and yield the forms for the selected languages.

Parameters
  • concepts – List of concept identifiers, all concepts if not specified.

  • languages – List of language identifiers, all languages if not specified.

  • aspect – Name of a Form attribute to return in place of the Form object itself.

  • filter_by – A function used to filter the data before it is output.

  • flat – If True, return a one-dimensional array of the data.

Note

For each concept (selected by ID), the function returns the form for each language or, if aspect is given, that attribute of the form, provided it exists.
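
The behaviour described above can be approximated in plain Python. The following is a stand-in sketch, not the cltoolkit implementation: the function iter_forms_by_concepts_sketch, the SimpleNamespace records and their attribute names (concept, language, value), the (concept, data) return shape, and the order in which aspect and filter_by are applied are all assumptions.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for Form objects; attribute names are assumptions.
forms = [
    SimpleNamespace(concept="HAND", language="Spanish", value="mano"),
    SimpleNamespace(concept="HAND", language="English", value="hand"),
    SimpleNamespace(concept="FOOT", language="Spanish", value="pie"),
]


def iter_forms_by_concepts_sketch(forms, concepts=None, languages=None,
                                  aspect=None, filter_by=None, flat=False):
    """Yield (concept, data) pairs, one per concept that has any data."""
    # Default to every concept and language found in the data.
    concepts = concepts or sorted({f.concept for f in forms})
    languages = languages or sorted({f.language for f in forms})
    for concept in concepts:
        per_language = []
        for language in languages:
            selected = [f for f in forms
                        if f.concept == concept and f.language == language]
            if aspect:
                # Replace each form by the requested attribute.
                selected = [getattr(f, aspect) for f in selected]
            if filter_by:
                # Keep only the items accepted by the filter function.
                selected = [item for item in selected if filter_by(item)]
            per_language.append(selected)
        if any(per_language):
            if flat:
                # Collapse the per-language lists into a single list.
                per_language = [x for sub in per_language for x in sub]
            yield concept, per_language


nested = dict(iter_forms_by_concepts_sketch(forms, aspect="value"))
flattened = dict(iter_forms_by_concepts_sketch(forms, aspect="value", flat=True))
print(nested)
print(flattened)
```

In this sketch, the nested result keeps one inner list per language (empty where a language has no form for the concept), while flat=True merges those lists.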

class cltoolkit.util.DictTuple(items, **kw)

An object allowing access to items of a tuple as if it were a dict keyed with the id attribute of the contained objects.
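
To make the idea concrete, here is a minimal re-creation of a tuple that can also be indexed by the id attribute of its items. It is a sketch of the concept only, not the cltoolkit class: DictTupleSketch, its get method, and the assumption that ids are strings are all hypothetical, and the **kw options of the real class are not modelled.

```python
from types import SimpleNamespace


class DictTupleSketch(tuple):
    """A tuple whose items can also be looked up by their `id` attribute."""

    def __new__(cls, items):
        self = super().__new__(cls, items)
        # Map each item's id to its position in the tuple.
        self._index = {item.id: i for i, item in enumerate(self)}
        return self

    def __getitem__(self, key):
        # String keys are treated as ids; anything else is a normal index.
        if isinstance(key, str):
            return super().__getitem__(self._index[key])
        return super().__getitem__(key)

    def __contains__(self, key):
        if isinstance(key, str):
            return key in self._index
        return super().__contains__(key)

    def get(self, key, default=None):
        """Return the item with the given id, or `default` if it is absent."""
        return self[key] if key in self else default


# Hypothetical language records with an `id` attribute.
languages = DictTupleSketch([
    SimpleNamespace(id="deu", name="German"),
    SimpleNamespace(id="eng", name="English"),
])
print(languages[0].name, languages["eng"].name, len(languages))
```

Positional access (languages[0]) and id-based access (languages["eng"]) both work on the same object, which is the access pattern the description above refers to.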