speechbrain.utils.dictionaries module

Dictionary utilities, e.g. synonym dictionaries.

Authors
  • Sylvain de Langen 2024

Summary

Classes:

SynonymDictionary

Loads sets of synonym words and lets you look up if two words are synonyms.

Reference

class speechbrain.utils.dictionaries.SynonymDictionary

Bases: object

Loads sets of synonym words and lets you look up if two words are synonyms.

This could, for instance, be used to check for equality in the case of two spellings of the same word when normalization might be unsuitable.

Synonyms are not considered to be transitive: If A is a synonym of B and B is a synonym of C, then A is NOT considered a synonym of C unless they are added in the same synonym set.
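The non-transitive behaviour described above can be illustrated with a minimal sketch. The `TinySynonyms` class below is a hypothetical stand-in written for this example, not the SpeechBrain implementation; it only assumes that each word is linked to the other members of the sets it was added in.

```python
from collections import defaultdict
from typing import Iterable


class TinySynonyms:
    """Hypothetical stand-in that mimics the documented semantics."""

    def __init__(self) -> None:
        # Maps each word to the words declared as its synonyms.
        self.word_map = defaultdict(set)

    def add_synonym_set(self, words: Iterable[str]) -> None:
        word_set = set(words)
        for word in word_set:
            # Link each word only to the other members of this set.
            self.word_map[word] |= word_set - {word}

    def __call__(self, a: str, b: str) -> bool:
        # .get() avoids inserting unknown words into the map.
        return a == b or b in self.word_map.get(a, set())


syn = TinySynonyms()
syn.add_synonym_set(["colour", "color"])
syn.add_synonym_set(["color", "hue"])

direct = syn("colour", "color")  # both words share a synonym set
chained = syn("colour", "hue")   # only linked through "color"
```

In this sketch `direct` is true and `chained` is false, because "colour" and "hue" never appear in the same synonym set.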

static from_json_file(file) → SynonymDictionary

Parses an opened file as JSON, where the top-level structure is a list of synonym sets (each one a list of words that are all synonyms of each other), e.g. [ ["hello", "hi"], ["say", "speak", "talk"] ].

Parameters:

file – File object that supports reading (e.g. an `open`ed file)

Returns:

Synonym dictionary from the parsed JSON file with all synonym sets added.

Return type:

SynonymDictionary
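As a rough illustration of the expected input, the sketch below parses the example structure from an in-memory file using only the standard library. The shape check is an assumption made for this example and is not taken from the SpeechBrain source.

```python
import io
import json

# An in-memory file standing in for an `open`ed JSON file.
fake_file = io.StringIO('[["hello", "hi"], ["say", "speak", "talk"]]')

synonym_sets = json.load(fake_file)

# Shape check (an assumption of this sketch): a list whose entries
# are lists of strings.
is_valid = isinstance(synonym_sets, list) and all(
    isinstance(entry, list) and all(isinstance(w, str) for w in entry)
    for entry in synonym_sets
)
```

A readable file object with this content is the kind of argument `from_json_file()` is documented to accept.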

static from_json_path(path) → SynonymDictionary

Opens the file at the given path and parses it as JSON, with the same semantics as from_json_file(), which takes an already opened file instead.

Parameters:

path (str) – Path to the JSON file

Returns:

Synonym dictionary from the parsed JSON file with all synonym sets added.

Return type:

SynonymDictionary
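The relationship between the path-based and file-based loaders can be sketched as follows. `sets_from_file` and `sets_from_path` are hypothetical helpers that return the raw parsed lists, and the UTF-8 encoding is an assumption of this example.

```python
import json
import os
import tempfile


def sets_from_file(file):
    # Hypothetical helper: parse an already opened file as JSON.
    return json.load(file)


def sets_from_path(path):
    # Hypothetical helper: open the path, then delegate to the
    # file-based variant. UTF-8 is an assumption of this sketch.
    with open(path, encoding="utf-8") as f:
        return sets_from_file(f)


with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "synonyms.json")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump([["hello", "hi"], ["say", "speak", "talk"]], f)
    loaded = sets_from_path(json_path)
```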

add_synonym_set(words: Iterable[str]) → None

Add a set of words that are all synonyms with each other.

Parameters:

words (Iterable[str]) – Iterable of words that should be defined as synonyms of each other

__call__(a: str, b: str) → bool

Check for the equality or synonym equality of two words.

Parameters:
  • a (str) – First word to compare. May be outside of the known dictionary.

  • b (str) – Second word to compare. May be outside of the known dictionary. The order of arguments does not matter.

Returns:

Whether a and b should be considered synonyms. This relation is not transitive; see the main class documentation.

Return type:

bool
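The comparison rules listed above (plain equality, symmetry, and words outside the dictionary) can be sketched with a plain mapping. `word_map` and `are_synonyms` are hypothetical names for this example and do not describe the library's internals.

```python
# Hypothetical mapping from each word to its declared synonyms.
word_map = {"hello": {"hi"}, "hi": {"hello"}}


def are_synonyms(a: str, b: str) -> bool:
    # Identical words always match, even if neither is in the mapping.
    return a == b or b in word_map.get(a, set())


same_word = are_synonyms("zebra", "zebra")   # unknown but identical
forward = are_synonyms("hello", "hi")
backward = are_synonyms("hi", "hello")       # argument order swapped
unrelated = are_synonyms("hello", "zebra")   # "zebra" is unknown
```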

get_synonyms_for(word: str) → set

Returns the set of synonyms for a given word.

Parameters:

word (str) – The word to look up the synonyms of. May be outside of the known dictionary.

Returns:

Set of known synonyms for this word. Do not mutate the returned set; copy it first if you need to modify it. May be empty if the word has no known synonyms.

Return type:

set of str
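The advice about not mutating the returned set can be illustrated with a small sketch. The `known` mapping and the `lookup` function are hypothetical stand-ins; the sketch assumes the returned set may be shared with internal state, which is why a copy is taken before modifying it.

```python
# Hypothetical internal mapping from each word to its synonyms.
known = {"say": {"speak", "talk"}}


def lookup(word: str) -> set:
    # Returns the stored set itself (or an empty set for unknown words).
    return known.get(word, set())


result = lookup("say")

# Copy before modifying so the stored synonyms stay untouched.
editable = set(result)
editable.add("utter")

missing = lookup("zebra")  # a word with no known synonyms
```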