How Dracula Works

Inspired by CharSCNN [1,3], Dracula combines character-level embeddings with two levels of deep LSTM representation, achieving competitive performance with small model sizes.
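
The two levels can be pictured as a character-level LSTM that builds a fixed-size vector for each word from its character embeddings, followed by a word-level LSTM that reads those vectors and emits one tag per word. The numpy sketch below illustrates this shape only; every name, dimension, and design detail in it (using the final hidden state as the word vector, single-direction LSTMs, a plain linear tag scorer) is an assumption for illustration, not Dracula's actual code.

```python
# A minimal sketch of the two-level idea, not Dracula's actual code:
# a character-level LSTM turns each word's character embeddings into a
# single word vector, then a word-level LSTM reads those vectors and
# scores one part-of-speech tag per word. All names, sizes, and design
# details below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def lstm_params(input_dim, hidden_dim):
    # One stacked weight block per gate: [input, forget, cell, output].
    return {
        "W": rng.normal(0, 0.1, (4 * hidden_dim, input_dim)),
        "U": rng.normal(0, 0.1, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_run(params, inputs, hidden_dim):
    # Run a single-direction LSTM over a sequence; return every hidden state.
    h, c, states = np.zeros(hidden_dim), np.zeros(hidden_dim), []
    for x in inputs:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return states

# Illustrative sizes: 64-dim character embeddings, 128-dim hidden layers.
CHAR_DIM, CHAR_HID, WORD_HID, N_TAGS = 64, 128, 128, 17
char_emb = rng.normal(0, 0.1, (256, CHAR_DIM))     # one row per byte value
char_lstm = lstm_params(CHAR_DIM, CHAR_HID)
word_lstm = lstm_params(CHAR_HID, WORD_HID)
tag_proj = rng.normal(0, 0.1, (N_TAGS, WORD_HID))  # linear tag scorer

def tag_sentence(words):
    # Level 1: compress each word's characters into one vector
    # (here, the char-LSTM's final hidden state).
    word_vecs = [
        lstm_run(char_lstm, [char_emb[ord(ch) % 256] for ch in w], CHAR_HID)[-1]
        for w in words
    ]
    # Level 2: read the word vectors in order, scoring a tag for each word.
    return [int(np.argmax(tag_proj @ h))
            for h in lstm_run(word_lstm, word_vecs, WORD_HID)]

# With untrained random weights the tags are meaningless; this only
# demonstrates the data flow.
print(tag_sentence(["lol", "this", "tagger", "rocks"]))
```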

Compared with GATE's TwitIE tagger, Dracula obtains similar performance (with 128-dimensional word embeddings) from a model that's a fraction of the size. Its accuracy on the TwitIE T-Eval dataset is competitive with Derczynski et al. [2], and Dracula achieves this without word lists, frequency filtering, or any regular expressions.

Technical details

Dracula is implemented in Theano and originates from Pierre Luc Carrier and Kyunghyun Cho's "LSTM Networks for Sentiment Analysis" tutorial. The architecture is also readily portable to other frameworks, such as Google's TensorFlow; a sketch of one possible port appears below.
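
As one illustration of that portability, here is a hedged sketch of how the same two-level shape might be expressed in tf.keras. The padding scheme, layer sizes, and tag count are assumptions made for illustration; this is not the project's actual port.

```python
# Hedged sketch: the char-LSTM -> word-LSTM tagger expressed in tf.keras.
# All dimensions and the padded-grid input format are assumptions.
import tensorflow as tf

N_CHARS, CHAR_DIM, HID, N_TAGS = 256, 64, 128, 17
MAX_SENT_LEN, MAX_WORD_LEN = 40, 20

# Input: each sentence arrives as a padded grid of character ids,
# one row of character ids per word.
chars = tf.keras.Input(shape=(MAX_SENT_LEN, MAX_WORD_LEN), dtype="int32")
x = tf.keras.layers.Embedding(N_CHARS, CHAR_DIM)(chars)
# Level 1: a character LSTM, applied independently to every word.
word_vecs = tf.keras.layers.TimeDistributed(tf.keras.layers.LSTM(HID))(x)
# Level 2: a word LSTM over the sentence, one tag distribution per word.
h = tf.keras.layers.LSTM(HID, return_sequences=True)(word_vecs)
tags = tf.keras.layers.Dense(N_TAGS, activation="softmax")(h)
model = tf.keras.Model(chars, tags)
model.summary()
```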

Further reading

  1. C. D. Santos and B. Zadrozny, "Learning character-level representations for part-of-speech tagging", Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 1818--1826, 2014.
  2. L. Derczynski, A. Ritter, S. Clark and K. Bontcheva, "Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data", RANLP, pp. 198--206, 2013.
  3. C. D. Santos and M. Gatti, "Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts", COLING, pp. 69--78, 2014.
