Sentiment Classifier


About

The goal of this project was to create a sentiment classifier API that could use various models and datasets.

It is written in Python and uses the following libraries:

  • Flask: for the API
  • TensorFlow & Keras: for machine learning

For more details about the project, you can refer to these slides.

So far we only use the IMDB large movie review dataset, but we plan to add more datasets later on.

Installation

Here are the required steps to get started with the API:

  • Clone the repository
  • Download the IMDB dataset and place it in the data folder. We use pre-trained word embeddings from FastText, so you might want to download them to the data folder as well.
  • Create a virtual environment, and install the requirements from the requirements.txt file
  • Add the repository root to your PYTHONPATH so that the “sentiment_classifier” package can be imported (run this from the repository root):
export PYTHONPATH=.:$PYTHONPATH
  • Train the models by running:
python sentiment_classifier/scripts/train.py
  • Run the API:
python sentiment_classifier/api/wsgi.py
  • Test the API:
import requests

r = requests.post(
  "http://localhost:8000/api/classify",
  json={"text": "I love it"}
)
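The request above does not show how to read the reply. The sketch below wraps the same call in a small helper that fails loudly on HTTP errors and returns the decoded body. It assumes the API answers with JSON, which the text above does not state, and makes no assumption about which fields the body contains. The helper name classify, the timeout value and the post parameter are hypothetical and only there for illustration.

```python
from typing import Any, Callable, Optional

# Endpoint taken from the example above.
API_URL = "http://localhost:8000/api/classify"


def classify(
    text: str,
    url: str = API_URL,
    post: Optional[Callable[..., Any]] = None,
    timeout: float = 10.0,
) -> Any:
    """POST `text` to the API and return the decoded JSON body.

    `post` defaults to `requests.post`; passing another callable with the
    same keyword arguments makes the helper usable without a running server.
    """
    if post is None:
        import requests  # imported lazily so a stub can replace it

        post = requests.post

    response = post(url, json={"text": text}, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of silently returning an error page.
    response.raise_for_status()
    # Assumption: the API replies with a JSON document.
    return response.json()
```

With the API running locally, print(classify("I love it")) should show whatever the service returns for that sentence.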

Getting Started

Make sure to check out this notebook to better understand how the code works: Example Model Notebook.

To train the classifiers, run the train.py script located in sentiment_classifier/scripts.
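If you would rather not change PYTHONPATH in your shell, the training step can also be launched from Python. This is only a sketch: it uses the script path given in this README, and the helper names training_command, training_env and run_training are hypothetical. It does not pass any command-line options to train.py, since none are documented here.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Mapping, Optional

# Script location as given in this README, relative to the repository root.
TRAIN_SCRIPT = Path("sentiment_classifier") / "scripts" / "train.py"


def training_command(python: str = sys.executable) -> List[str]:
    """Command line equivalent to `python sentiment_classifier/scripts/train.py`."""
    return [python, str(TRAIN_SCRIPT)]


def training_env(
    repo_root: str = ".", base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy of the environment with the repository root prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    root = os.path.abspath(repo_root)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = root if not existing else root + os.pathsep + existing
    return env


def run_training(repo_root: str = ".") -> int:
    """Run the training script from the repository root and return its exit code."""
    completed = subprocess.run(
        training_command(),
        cwd=repo_root,
        env=training_env(repo_root),
        check=False,
    )
    return completed.returncode
```

Calling run_training() from the repository root inside the virtual environment returns the exit code of the script, where 0 conventionally means success.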

You can also refer to the documentation.
