cc.journeyman.speechio.canonicalise

Replace words and phrases in provided input with preferred equivalents drawn from a thesaurus.

Broadly, the problem with allowing unconstrained input is that the words users choose for things we actually know about may not be our preferred words. This namespace provides a mechanism to canonicalise words and phrases in the input.

The input is expected to be a lispified Stanford NLP parse tree; the output is expected to be a new parse tree, which may contain nodes marked :AMBIG whose children are the alternative canonical interpretations.
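
To make the shape concrete, here is a minimal sketch of what an input tree and a canonicalised output tree might look like. It assumes nodes are maps with :label and :children keys (the keys named in the lookup documentation); the string leaves, the part-of-speech labels, and the example vocabulary are illustrative assumptions, not taken from the library. The helper ambiguous-nodes is hypothetical and simply collects every node marked :AMBIG.

```clojure
;; Sketch only: string leaves and the example words are assumptions.
(def input-tree
  {:label :NP
   :children [{:label :DT :children ["the"]}
              {:label :NN :children ["bank"]}]})

;; A possible result: "bank" has two candidate canonical forms, so the
;; node is replaced by an :AMBIG node holding both alternatives.
(def output-tree
  {:label :NP
   :children [{:label :DT :children ["the"]}
              {:label :AMBIG
               :children [{:label :NN :children ["riverbank"]}
                          {:label :NN :children ["financial-institution"]}]}]})

(defn ambiguous-nodes
  "Return a sequence of all nodes in `tree` whose :label is :AMBIG.
  Hypothetical helper, not part of the namespace."
  [tree]
  (filter #(and (map? %) (= :AMBIG (:label %)))
          (tree-seq map? :children tree)))
```

A caller could use a helper of this kind to decide whether a canonicalised tree still needs disambiguation before further processing.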

default-thesaurus-content

For testing and development only!

test-thesaurus

TODO: write docs

Thesaurus

protocol

members

lookup

(lookup this speech-part)

Return the canonical expression equivalent to this speech-part, or speech-part itself if no equivalent is found. speech-part is expected to be a lispified (see as-lisp) Stanford NLP parse tree.

If multiple canonical equivalents are found, return a new node whose :label value is :AMBIG and whose :children value is a list of the possible equivalents.
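
The three cases described above (no equivalent, exactly one, several) can be sketched with a map-backed implementation. This is an illustration under stated assumptions rather than the library's implementation: the protocol is redeclared under the hypothetical name ThesaurusSketch so the example is self-contained, MapThesaurus and its entries field are invented for the example, and entries is assumed to map a speech-part to a vector of canonical equivalents.

```clojure
;; Hypothetical stand-in for the Thesaurus protocol, redeclared here only
;; so that the sketch can be evaluated on its own.
(defprotocol ThesaurusSketch
  (lookup-sketch [this speech-part]))

;; `entries` is assumed to be a map from speech-part to a vector of
;; canonical equivalents.
(defrecord MapThesaurus [entries]
  ThesaurusSketch
  (lookup-sketch [_ speech-part]
    (let [equivalents (get entries speech-part)]
      (case (count equivalents)
        ;; nothing known: hand back the input unchanged
        0 speech-part
        ;; exactly one equivalent: return it directly
        1 (first equivalents)
        ;; several equivalents: wrap them in an :AMBIG node
        {:label :AMBIG :children (apply list equivalents)}))))

(def car   {:label :NN :children ["car"]})
(def auto  {:label :NN :children ["automobile"]})
(def bank  {:label :NN :children ["bank"]})
(def shore {:label :NN :children ["riverbank"]})
(def fin   {:label :NN :children ["financial-institution"]})

(def sketch-thesaurus
  (->MapThesaurus {auto [car]
                   bank [shore fin]}))
```

With these definitions, looking up auto yields car, looking up bank yields an :AMBIG node with two children, and looking up car (which has no entry) returns car unchanged.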