Word embeddings as Natural Semantic Metalanguages

If we constrain a word/token embedding’s features to represent semantic primes in sequence, how much information could it capture compared to an unconstrained word embedding of equivalent dimension?  What could such a mathematical metalanguage tell us about NSM theory?

There are roughly 150,000 English words in current use – far more than the 20,000-30,000 in a typical adult’s vocabulary.  Thankfully, rare and complex words can be explained using simple, common ones.  Building on this principle, Natural Semantic Metalanguage (NSM) theory postulates that the meanings of all words and phrases in human languages can ultimately be decomposed into some small set of simple words whose meanings can’t be further explained (i.e. “semantic primes”).  An NSM is the smallest set of such semantic primes that is capable of expressing any meaning in any language.  The most prominent list of semantic primes currently stands at 65 words.

Word embeddings like word2vec or the input embedding layers of BERT, RoBERTa, and GPT-3 also decompose natural-language words into a small set of universal components or features.  But unlike the semantic primes of an NSM, a word embedding’s features are continuous real values (rather than the binary presence or absence of a semantic prime), and no individual feature aligns with a single human-comprehensible meaning.

In this project, we aim to learn a word embedding whose features represent the interpretable semantic primes of an NSM.  In theory this can be accomplished by modifying the learning algorithm’s loss function so that each of the 65 primes must be represented by a one-hot vector (a single 1 and 64 0s), along with a few other tricks*.
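
As a rough illustration, the constrained objective might look something like the following PyTorch sketch.  The function and parameter names (`prime_anchor_loss`, `prime_ids`, `anchor_weight`, `l1_weight`) are hypothetical, the L1 term anticipates the sparsity discussed in the first footnote, and the per-prime ordering feature mentioned there is omitted for brevity.

```python
import torch
import torch.nn.functional as F

NUM_PRIMES = 65  # one embedding dimension per semantic prime

def prime_anchor_loss(embedding: torch.nn.Embedding, prime_ids: torch.Tensor) -> torch.Tensor:
    """Pull each prime's embedding toward its designated one-hot vector (one 1, 64 0s)."""
    prime_vectors = embedding(prime_ids)                               # shape (65, 65)
    one_hot_targets = torch.eye(NUM_PRIMES, device=prime_vectors.device)
    return F.mse_loss(prime_vectors, one_hot_targets)

def total_loss(distributional_loss: torch.Tensor,
               embedding: torch.nn.Embedding,
               prime_ids: torch.Tensor,
               anchor_weight: float = 1.0,
               l1_weight: float = 1e-4) -> torch.Tensor:
    """Ordinary skip-gram / language-modeling loss plus the prime constraints."""
    anchor = prime_anchor_loss(embedding, prime_ids)
    sparsity = embedding.weight.abs().mean()   # L1 regularization (see footnote * below)
    return distributional_loss + anchor_weight * anchor + l1_weight * sparsity
```

Alternatively, the anchoring could be enforced exactly by freezing the primes’ rows at their one-hot values rather than penalizing deviation from them.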

If successful, the NSM embedding produced will have a number of unique uses as a research tool, including:

  • It can quantify the expressiveness of any proposed NSM by measuring the performance of its embedding on downstream NLU tasks

  • If we allow N unconstrained features (not aligned to any existing prime), we can learn what expressiveness is missing from a given NSM by seeing what information the embedding captures in those features**

  • Because every embedding produced by this method shares the same prime-aligned feature basis, it would make alignment between embeddings, and even basic translation between human languages, much simpler (see the sketch just after this list)

  • Used as the embedding layer for an LLM, this approach could render a notoriously opaque family of models significantly more interpretable
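
To make the pre-alignment point above concrete, here is a minimal sketch assuming two prime-aligned embedding matrices trained independently on, say, English and French corpora (the variable names and the `translate` helper are hypothetical).  Because both matrices express words in the same 65-dimensional prime basis, a crude word-level translation reduces to a cosine nearest-neighbor lookup, with no mapping matrix to learn:

```python
import torch
import torch.nn.functional as F

def translate(word: str,
              src_vocab: dict, src_matrix: torch.Tensor,
              tgt_vocab: dict, tgt_matrix: torch.Tensor) -> str:
    """Nearest-neighbor 'translation' between two prime-aligned embeddings.

    src_matrix and tgt_matrix are (vocab_size, 65) tensors whose dimensions refer
    to the same semantic primes, so no learned mapping (e.g. Procrustes) is needed.
    """
    src_vector = src_matrix[src_vocab[word]]                           # shape (65,)
    similarities = F.cosine_similarity(src_vector.unsqueeze(0), tgt_matrix, dim=-1)
    best_index = int(similarities.argmax())
    index_to_word = {index: w for w, index in tgt_vocab.items()}
    return index_to_word[best_index]
```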

Obviously, the viability of this project hangs on an intuitive but unproven hypothesis about the relationship between distributional word embedding algorithms and human speech – a stronger version or a specific corollary of the distributional hypothesis.  This means, among other things, that a poorly-performing NSM embedding wouldn’t say much against NSM theory even if perfectly executed.  But positive results would constitute an encouraging quantitative corroboration.

Note: this project is pending discussion with several potential collaborators.

*most notably, we’ll need to induce sparsity via L1 regularization or other means, and include one additional feature per prime that indicates that prime’s position in the NSM decomposition of a word.

** n.b. these unconstrained features must be added only after the constrained features have been trained, so that the embedding cannot lean on them to express meaning that the primes could have captured.
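
A minimal sketch of the two-stage scheme described in this footnote, assuming the constrained prime-aligned embedding has already been trained (the `AugmentedEmbedding` class and its argument names are hypothetical): the 65 prime features are frozen, and N unconstrained features are appended and trained on the same distributional objective to absorb whatever meaning the primes could not express.

```python
import torch

class AugmentedEmbedding(torch.nn.Module):
    """Frozen prime-aligned features plus N trainable unconstrained features."""

    def __init__(self, trained_prime_embedding: torch.nn.Embedding, n_free: int):
        super().__init__()
        vocab_size, _ = trained_prime_embedding.weight.shape
        self.prime = trained_prime_embedding
        self.prime.weight.requires_grad_(False)              # stage 1: constrained features stay fixed
        self.free = torch.nn.Embedding(vocab_size, n_free)   # stage 2: the N unconstrained features

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Concatenate the fixed prime features with the trainable free features.
        return torch.cat([self.prime(token_ids), self.free(token_ids)], dim=-1)
```

Inspecting what the free features encode after training would then be one candidate answer to what a given NSM is missing.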