HistWords, from William Leif, is an NLP project that provides diachronic word embeddings: embeddings that are consistent across decades, which makes it possible to compare the meaning of a word over time.
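
To make "comparing meaning over time" concrete, here is a minimal sketch of the idea. It does not use the HistWords code or data: the three-dimensional vectors, the decades, and the helper names (`cosine_similarity`, `drift_from_first_decade`) are all made up for illustration. The only assumption carried over from the real thing is that the vectors for different decades live in the same space, so a cosine similarity between them is meaningful.

```python
import math

# Toy vectors standing in for one word's embedding in each decade.
# These numbers are invented; real embeddings have hundreds of dimensions.
toy_vectors_by_decade = {
    1900: [1.0, 0.0, 0.0],
    1950: [1.0, 1.0, 0.0],
    2000: [0.0, 1.0, 0.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def drift_from_first_decade(vectors_by_decade):
    """Similarity of each decade's vector to the earliest decade's vector."""
    decades = sorted(vectors_by_decade)
    base = vectors_by_decade[decades[0]]
    return {d: cosine_similarity(base, vectors_by_decade[d]) for d in decades}


drift = drift_from_first_decade(toy_vectors_by_decade)
for decade, similarity in drift.items():
    print(decade, round(similarity, 3))
```

A similarity that falls as the decades go by suggests the word has moved away from its earlier usage. With embeddings that were trained separately per decade and never aligned, the same number would not mean much.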

I think HistWords is really cool, as are the visualizations on its GitHub page. Wanting to explore how words change over time for myself, I worked with Jun on adding an interactive visualization that lets you enter multiple words and compare how they changed over time.

The visualization styles were:

A table view:

[Image: table view]

t-SNE embeddings:

[Image: t-SNE embedding]

Timeline view:

[Image: timeline view]

One particularly cool feature is the ability to plot two or more words on the same graph to see how they change, although the graph can get a bit cluttered. The following image compares how "awful" and "terrible" changed over time.


The code is now included in the histwords repository and I’m hoping other people are getting some interesting use out of it.

I’ve been wanting to set up a web server so that others can explore interactively, but the embeddings are multiple GB, which is too large for me to hold in RAM on my own server. When the cost of RAM comes down to the point where I can rent an 8 GB instance for $20/mo, I will do so.
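
For a rough sense of where "multiple GB" comes from, here is a back-of-the-envelope sketch. The vocabulary size, dimensionality, number of decades, and 4-byte floats are assumptions picked for illustration, not figures measured from the HistWords downloads, and `embedding_bytes` is just a helper name I made up.

```python
def embedding_bytes(vocab_size, dimensions, decades, bytes_per_value=4):
    """Raw size of dense embedding matrices, one matrix per decade."""
    return vocab_size * dimensions * decades * bytes_per_value


# Assumed sizes for illustration only.
total = embedding_bytes(vocab_size=100_000, dimensions=300, decades=20)
print(f"{total / 1e9:.1f} GB")
```

This counts only the raw numbers. A running server would also need room for the vocabulary, the Python process, and anything it caches, so the real footprint would be somewhat higher.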