Other Resources
---------------

Blog posts, tutorial videos, hackathons and other useful Gensim resources from around the internet.

- *Use FastText or Word2Vec?* Comparison of embedding quality and performance. `Jupyter notebook <https://github.com/RaRe-Technologies/gensim/blob/ba1ce894a5192fc493a865c535202695bb3c0424/docs/notebooks/Word2Vec_FastText_Comparison.ipynb>`__
- Multiword phrases extracted from *How I Met Your Mother*. `Blog post by Mark Needham <http://www.markhneedham.com/blog/2015/02/12/pythongensim-creating-bigrams-over-how-i-met-your-mother-transcripts/>`__
- *Using Gensim LDA for hierarchical document clustering*. `Jupyter notebook by Brandon Rose <http://brandonrose.org/clustering#Latent-Dirichlet-Allocation>`__
- *Evolution of Voldemort topic through the 7 Harry Potter books*. `Blog post <http://rare-technologies.com/understanding-and-coding-dynamic-topic-models/>`__
- *Movie plots by genre*: Document classification using various techniques: TF-IDF, word2vec averaging, Deep IR, Word Mover's Distance and doc2vec. `GitHub repo <https://github.com/RaRe-Technologies/movie-plots-by-genre>`__
- *Word2vec: Faster than Google? Optimization lessons in Python*, talk by Radim Řehůřek at PyData Berlin 2014. `YouTube video <https://www.youtube.com/watch?v=vU4TlwZzTfU>`__
- *Word2vec & friends*, talk by Radim Řehůřek at MLMU.cz 7.1.2015. `YouTube video <https://www.youtube.com/watch?v=wTp3P2UnTfQ>`__

..
   - ? `Making an Impact with NLP <https://www.youtube.com/watch?v=oSSnDeOXTZQ>`__ -- PyCon 2016 tutorial by Hobson Lane
   - ? `NLP with NLTK and Gensim <https://www.youtube.com/watch?v=itKNpCPHq3I>`__ -- PyCon 2016 tutorial by Tony Ojeda, Benjamin Bengfort, Laura Lorenz from District Data Labs
   - ? `Word Embeddings for Fun and Profit <https://www.youtube.com/watch?v=lfqW46u0UKc>`__ -- Talk at PyData London 2016 by Lev Konstantinovskiy. See accompanying `repo <https://github.com/RaRe-Technologies/movie-plots-by-genre>`__
   - ? English Wikipedia; TODO: convert to proper .py format
   - ? `Colouring words by topic in a document, printing words in a
     topic <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/topic_methods.ipynb>`__
   - ? `Topic Coherence, a metric that correlates with human judgement on topic quality. <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/topic_coherence_tutorial.ipynb>`__
   - ? `America's Next Topic Model slides <https://speakerdeck.com/tmylk/americas-next-topic-model-at-pydata-berlin-august-2016?slide=7>`__
      - How to choose your next topic model, presented at PyData Berlin 10 August 2016 by Lev Konstantinovskiy
   - ?  `Dynamic Topic Modeling and Dynamic Influence Model Tutorial <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/dtm_example.ipynb>`__
   - ?  `Python Dynamic Topic Modelling Theory and Tutorial <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/ldaseqmodel.ipynb>`__
   - ? `Word Mover's Distance for Yelp Reviews tutorial <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/WMD_tutorial.ipynb>`__
     - FIXME WMD superseded by soft cosine similarity = faster and better? any numbers / tutorials for that?
   - ? `Great illustration of corpus preparation <https://linanqiu.github.io/2015/10/07/word2vec-sentiment/>`__, `Code <https://github.com/linanqiu/word2vec-sentiments>`__
     - ? `Alternative <https://medium.com/@klintcho/doc2vec-tutorial-using-gensim-ab3ac03d3a1#.nv2lxvbj1>`__
     - ? `Alternative 2 <https://districtdatalabs.silvrback.com/modern-methods-for-sentiment-analysis>`__
   - ? `Doc2Vec on customer reviews <http://multithreaded.stitchfix.com/blog/2015/03/11/word-is-worth-a-thousand-vectors/>`__
   - ? `Doc2Vec on Airline Tweets Sentiment Analysis <https://www.zybuluo.com/HaomingJiang/note/462804>`__
   - ? `Deep Inverse Regression with Yelp Reviews <https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/deepir.ipynb>`__ (document classification using Bayesian inversion and several word2vec models, one for each class)
