Toward Interpretable Word Embeddings
Brash means bold, audacious, brazen. It can also have secondary meanings with a negative hue, like impertinent, impudent, insolent, rude. And mix in a bit of pushy, reckless, rash, impetuous (in the sense that it is quick, not slow; reckless, not cautious; rash, not considered; pushy, not shy).

One way to look at this is to say: which of the following two words could be an atom making up the molecule of brashness? quick, or slow? reckless, or cautious?
Another color band: impulsive, hasty, foolhardy, incautious, indiscreet.
And then: It’s also self-confident, cocky (which is like self-confident, only more so), leaning more toward aggressive than passive or shy; outspoken, assertive, assured.
How would we want word embeddings to reflect this relationship? We want brash to be closest to bold, audacious, and brazen. But we also want to recognize the elemental makeup (the atomic makeup) of words and their senses. The secondary senses matter less, but they fill out the body of brash, illuminating the contours of its meaning. What are the elemental meanings inherent in words? Can percentages be assigned to them?
Instead of the clear but crude King – Man + Woman = Queen, we want something like brash – impulsive = assertive. Or, depending on strength, something like 100 * brash – 5 * impulsive = assertive. Or would it make sense to take the square root of excellence? What would we hope to arrive at?
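To make the weighted arithmetic concrete, here is a minimal sketch. The vectors and the 0.5 weight are invented purely for illustration (real embeddings would come from a trained model), and the cosine-similarity lookup is the standard way such analogies are resolved:

```python
import numpy as np

# Toy 3-dimensional vectors, invented for this example only.
vectors = {
    "brash":     np.array([0.9, 0.8, 0.3]),
    "impulsive": np.array([0.1, 0.9, 0.0]),
    "assertive": np.array([0.85, 0.35, 0.3]),
    "timid":     np.array([0.1, 0.1, 0.9]),
}

def nearest(query, vectors, exclude=()):
    """Return the word whose vector is most cosine-similar to query."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cos(query, vectors[w]))

# "brash minus a dose of impulsiveness" -- does it land near assertive?
query = vectors["brash"] - 0.5 * vectors["impulsive"]
print(nearest(query, vectors, exclude=("brash",)))
```

With these toy numbers the query lands on assertive, but the point is the mechanism, not the result: the weight (here 0.5) is exactly the "strength" knob the arithmetic above asks for.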
The way to do this is probably to manually create a series of atomized decompositions of some words, as above, and see whether a recognizable pattern emerges.
Here is some code as a start:
import numpy as np

def build_semantic_embeddings(vocabulary, definitions, initial_embeddings, iterations=100):
    vocab_size = len(vocabulary)
    embedding_size = initial_embeddings.shape[1]
    embeddings = initial_embeddings.copy()
    # Precompute indices once instead of calling list.index inside the loop
    word_to_idx = {w: i for i, w in enumerate(vocabulary)}
    for _ in range(iterations):
        new_embeddings = np.zeros((vocab_size, embedding_size))
        for word_idx, word in enumerate(vocabulary):
            defining_words = definitions.get(word, [])
            defining_idxs = [word_to_idx[w] for w in defining_words if w in word_to_idx]
            if defining_idxs:
                # Average the vectors of the defining words
                defining_vectors = [embeddings[idx] for idx in defining_idxs]
                new_embeddings[word_idx] = np.mean(defining_vectors, axis=0)
            else:
                # Keep the old vector for words with no usable definition,
                # otherwise they would collapse to zero
                new_embeddings[word_idx] = embeddings[word_idx]
        embeddings = new_embeddings
    return embeddings
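As a sanity check on that averaging idea, here is a self-contained toy run (the loop mirrors the function above; the words, random seed, and dimensions are invented for illustration). A word defined by two fixed words should converge to their midpoint:

```python
import numpy as np

# Toy setup: "brash" is defined by "bold" and "audacious",
# which themselves have no definitions and so stay fixed.
vocabulary = ["brash", "bold", "audacious"]
definitions = {"brash": ["bold", "audacious"], "bold": [], "audacious": []}
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 4))

word_to_idx = {w: i for i, w in enumerate(vocabulary)}
for _ in range(5):
    new = embeddings.copy()  # words without definitions keep their vector
    for word, idx in word_to_idx.items():
        idxs = [word_to_idx[w] for w in definitions[word] if w in word_to_idx]
        if idxs:
            new[idx] = embeddings[idxs].mean(axis=0)
    embeddings = new

# "brash" now sits at the midpoint of its defining words.
print(np.allclose(embeddings[0], (embeddings[1] + embeddings[2]) / 2))
```

In a real dictionary the definition graph has cycles, so repeated averaging smears meaning through the whole vocabulary rather than converging in one step; that smearing is exactly what the iteration count controls.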
And if we also include a word’s opposites (antonyms), here is some pseudocode sketching out how that might work:
def update_embedding(word_embedding, related_words, relationship_type, embeddings, update_weight=1.0):
    for related_word in related_words:
        if relationship_type == 'synonym':
            # Pull the word toward its synonyms
            word_embedding = word_embedding + update_weight * embeddings[related_word]
        elif relationship_type == 'antonym':
            # Push the word away from its antonyms
            word_embedding = word_embedding - update_weight * embeddings[related_word]
    return word_embedding

updated_embeddings = {}
for word in vocabulary:
    synonyms = get_synonyms(word)
    antonyms = get_antonyms(word)
    word_embedding = initial_embeddings[word].copy()
    word_embedding = update_embedding(word_embedding, synonyms, 'synonym', initial_embeddings)
    word_embedding = update_embedding(word_embedding, antonyms, 'antonym', initial_embeddings)
    updated_embeddings[word] = word_embedding
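To see the push–pull effect in isolation, here is a runnable toy version of that update. The words, 2-d vectors, and the 0.1 weight are all invented for the example; the first axis stands in for a boldness direction:

```python
import numpy as np

# Invented 2-d vectors for illustration only.
initial = {
    "brash": np.array([0.5, 0.5]),
    "bold":  np.array([0.9, 0.2]),
    "timid": np.array([-0.8, 0.1]),
}

def update(vec, related, kind, embeddings, weight=0.1):
    for w in related:
        if kind == "synonym":
            vec = vec + weight * embeddings[w]  # pull toward synonyms
        elif kind == "antonym":
            vec = vec - weight * embeddings[w]  # push away from antonyms
    return vec

v = update(initial["brash"], ["bold"], "synonym", initial)
v = update(v, ["timid"], "antonym", initial)
# "brash" has moved toward "bold" and away from "timid" on the first axis.
print(v[0] > initial["brash"][0])
```

One caveat worth noting: purely additive updates grow vector norms with every pass, so in practice each vector would be renormalized to unit length after updating.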
To make this work, we may need to rethink vector space itself. Rather than one vast space into which all words are fitted, we may need clusters of words arranged along axes.
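One way to make such an axis concrete is to build it from an antonym pair and project words onto it, so each word gets an interpretable coordinate. A minimal sketch, with all vectors invented for illustration:

```python
import numpy as np

# Toy vectors, invented for this example.
vecs = {
    "bold":  np.array([0.9, 0.1, 0.2]),
    "timid": np.array([-0.7, 0.2, 0.1]),
    "brash": np.array([0.8, 0.5, 0.0]),
    "table": np.array([0.0, -0.1, 0.9]),
}

# A "boldness axis": the direction from timid to bold, normalized.
axis = vecs["bold"] - vecs["timid"]
axis = axis / np.linalg.norm(axis)

# Each word's coordinate along that axis is a plain dot product.
for word in ("brash", "table"):
    coord = float(np.dot(vecs[word], axis))
    print(word, coord)
```

On these toy numbers, brash scores high on the boldness axis while table sits near zero, which is the shape of interpretability the clusters-along-axes idea is after: a handful of named axes per cluster instead of one undifferentiated space.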