Other articles


  1. Word Mover’s Distance in Python

    Word mover’s distance classification in Python

    A guide to scikit-learn-compatible nearest neighbors classification using the recently introduced word mover’s distance (WMD). Joint post with the awesome Matt Kusner!

    Source of this Jupyter notebook.

    In document classification and other natural language processing applications, a good measure of the similarity of two texts can be a valuable building block. Ideally, such a measure would capture semantic information. Cosine similarity on bag-of-words vectors is known to do well in practice, but it inherently cannot recognize that two documents say the same thing when they use completely different words.
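
    As a rough illustration of that limitation, the sketch below computes bag-of-words cosine similarity in plain Python. The example sentences, the tiny stop-word list, and the helper names `bag_of_words` and `cosine_similarity` are all assumptions made up for this illustration, not taken from the post. Two sentences with the same meaning but no shared content words score zero, while a sentence that merely reuses some of the words scores higher.

```python
import math
from collections import Counter

# Tiny hand-picked stop-word list, for illustration only (an assumption).
STOP_WORDS = {"the", "to", "in"}


def bag_of_words(text):
    """Lowercase, split on whitespace, drop stop words, count the rest."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative sentences (assumed): a and b mean roughly the same thing
# but share no content words; a and c share two of four content words.
doc_a = bag_of_words("Obama speaks to the media in Illinois")
doc_b = bag_of_words("The President greets the press in Chicago")
doc_c = bag_of_words("Obama speaks to the press in Chicago")

sim_ab = cosine_similarity(doc_a, doc_b)  # no overlapping words
sim_ac = cosine_similarity(doc_a, doc_c)  # partial word overlap

print(sim_ab, sim_ac)
```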


  2. Kemeny-Young Optimal Rank Aggregation in Python

    Rank aggregation is a problem with many important applications, and naive approaches to it go wrong in subtle ways.

    Let’s say that your national Quidditch league is dominated by five major wizard sports newspapers. Yes, the ones with moving images and everything. Every week after the games, each of them publishes a ranking of the star players. For now, let’s suppose that the set of players under investigation is always the same, as the problem becomes a bit more complicated otherwise.


  3. Site move

    I finally got around to moving my entire website, including the blog, to Pelican. I probably would have gotten away with it too if it weren’t for those meddling kids who hacked my friend’s server and convinced me that it’s worth the effort to go static.

    It ...

