Word2Vec Deciphers Crypto Lingo...

ICO, FUD, FOMO, HODL!! Say what, now?

Posted by Mitchell Eccles on May 24, 2018

Thanks to the work of many great researchers, a whole load of machine learning techniques are readily accessible. One technique that consistently blows my mind is Word2Vec. It’s stupidly simple to use and the results absolutely fascinate me. Briefly, if you provide Word2Vec with enough examples of human text, it is able to learn relationships between words and concepts. Word2Vec is a shallow neural network that learns word embeddings in one of two ways: by predicting a word's context from the word itself (skip-gram), or by predicting the word from its context (CBOW). Check Google for the in-depth detail. The resulting word embeddings then sit in an n-dimensional vector space, which allows us to do some cool stuff, like vector maths.
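To make that a bit more concrete, here is a rough sketch in plain Python. It shows the (word, context) pairs a skip-gram style model would be trained on, and the kind of vector maths you can do once you have embeddings. The function names and the tiny 3-dimensional vectors are made up by me for illustration; a real model learns vectors with hundreds of dimensions from data, and this is not the code behind the results in this post.

```python
import math

def skipgram_pairs(tokens, window=2):
    """Build (centre, context) pairs: the training examples for skip-gram."""
    pairs = []
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (an assumption for illustration, not learned values).
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.2, 0.5, 0.4],
}

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

if __name__ == "__main__":
    print(skipgram_pairs(["buy", "the", "dip"], window=1))
    print(analogy("king", "man", "woman", toy_vectors))
```

With these toy vectors, "king" minus "man" plus "woman" lands closest to "queen". The same arithmetic works on learned embeddings, which is where it gets interesting.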

I lose hours playing with Word2Vec models and exploring their output. So this time, I’ve decided to share those hours with you. I've trained Word2Vec on a set of Reddit conversations about one of my favourite subjects: cryptocurrency. The results are astonishing and can help anyone pick up the crypto lingo.

Word2Vec, I love you! You know the answer to shit that I’ve never explicitly told you. Imagine going into a new field or domain where the lingo is foreign to you... Surely you'd love to have a Word2Vec model to help you understand it? I believe ML algorithms like Word2Vec are going to enable our symbiotic relationship with machines, "AI" if you like, allowing us to grow and develop as a species.

And why stop with domain translation? Word2Vec models are just the starting point for a lot of cool Natural Language Processing (NLP) tasks. The word embeddings from a Word2Vec model can be inputs to other machine learning algorithms - like a domain name evaluation classifier. Or how about a product categorizer? Or the next cool project you're working on?
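As a rough idea of how embeddings can feed another algorithm, here is a minimal sketch of a product categorizer. It averages the word vectors in a product title to get one feature vector, then picks the category whose centroid is most similar. Everything here is an assumption for illustration: the 2-dimensional vectors are hand-made rather than learned, the labels and function names are hypothetical, and nearest-centroid is just one simple classifier you could swap for anything else.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def average_vector(tokens, vectors):
    """Mean of the vectors of known tokens; None if no token is in the vocabulary."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(col) / len(known) for col in zip(*known)]

def train_centroids(examples, vectors):
    """One centroid per label: the mean of that label's document vectors."""
    grouped = {}
    for text, label in examples:
        doc = average_vector(text.lower().split(), vectors)
        if doc is not None:
            grouped.setdefault(label, []).append(doc)
    return {
        label: [sum(col) / len(docs) for col in zip(*docs)]
        for label, docs in grouped.items()
    }

def classify(text, centroids, vectors):
    """Return the label whose centroid is most similar to the text, or None."""
    doc = average_vector(text.lower().split(), vectors)
    if doc is None:
        return None
    return max(centroids, key=lambda label: cosine(doc, centroids[label]))

# Hand-made toy embeddings and labels (assumptions, not output of a real model).
toy_vectors = {
    "laptop": [0.9, 0.1],
    "phone":  [0.8, 0.2],
    "shirt":  [0.1, 0.9],
    "jeans":  [0.2, 0.8],
}
examples = [("laptop phone", "electronics"), ("shirt jeans", "clothing")]
centroids = train_centroids(examples, toy_vectors)

if __name__ == "__main__":
    print(classify("phone case", centroids, toy_vectors))
    print(classify("blue jeans", centroids, toy_vectors))
```

Words the model has never seen ("case", "blue") are simply skipped, which is a common but lossy shortcut. With real Word2Vec embeddings you would load the learned vectors in place of `toy_vectors`.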

If you want help with where to start on Word2Vec and NLP, or would like to explore ideas, then feel free to get in touch.