Deploying the web application to a public server

Now that we have tested the web application locally, we are ready to deploy it to a public web server. For this tutorial, we will use the PythonAnywhere web hosting service, which specializes in hosting Python web applications and makes deployment simple and hassle-free. Furthermore, PythonAnywhere offers a beginner account option that lets us run a single web application free of charge.

To create a new PythonAnywhere account, we visit the PythonAnywhere website and click on the Pricing & signup link located in the top-right corner. Next, we click on the Create a Beginner account button and provide a username, password, and a valid e-mail address. After we have read and agreed to the terms and conditions, we should have a new account.

Unfortunately, the free beginner account doesn't allow us to access the remote server via the SSH protocol from our command-line terminal. Thus, we need to use the PythonAnywhere web interface to manage our web application. But before we can upload our local application files to the server, we need to create a new web application for our PythonAnywhere account. After clicking on the Dashboard button in the top-right corner, we have access to the control panel shown at the top of the page. Next, we click on the Web tab that is now visible at the top of the page. We proceed by clicking on the Add a new web app button on the left, which lets us create a new Python 3.4 Flask web application that we name movieclassifier.

After creating a new application for our PythonAnywhere account, we head over to the Files tab to upload the files from our local movieclassifier directory using the PythonAnywhere web interface. Once the upload is complete, we should have a movieclassifier directory in our PythonAnywhere account that contains the same directories and files as our local movieclassifier directory, as shown in the following screenshot:

Lastly, we head over to the Web tab one more time and click on the Reload <username> button to propagate the changes and refresh our web application. Our web app should now be up and running and publicly available via the address <username>


Unfortunately, web servers can be quite sensitive to the tiniest problems in our web app. If you are experiencing problems with running the web application on PythonAnywhere and are receiving error messages in your browser, you can check the server and error logs, which can be accessed from the Web tab in your PythonAnywhere account, to better diagnose the problem.

Updating the movie review classifier

While our predictive model is updated on the fly whenever a user provides feedback about the classification, the updates to the clf object will be lost if the web server crashes or restarts. If we reload the web application, the clf object will be reinitialized from the classifier.pkl pickle file. One option to apply the updates permanently would be to pickle the clf object once again after each update. However, this would become computationally very inefficient with a growing number of users and could corrupt the pickle file if users provide feedback simultaneously.

An alternative solution is to update the predictive model from the feedback data that is being collected in the SQLite database. One option would be to download the SQLite database from the PythonAnywhere server, update the clf object locally on our computer, and upload the new pickle file to PythonAnywhere. To update the classifier locally on our computer, we create a script file in the movieclassifier directory with the following contents:

import pickle
import sqlite3
import numpy as np
import os

# import HashingVectorizer from local dir
from vectorizer import vect

def update_model(db_path, model, batch_size=10000):
    # update the model in place from the reviews stored in the database

    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    c.execute('SELECT * from review_db')

    results = c.fetchmany(batch_size)
    while results:
        data = np.array(results)
        X = data[:, 0]
        y = data[:, 1].astype(int)

        classes = np.array([0, 1])
        X_train = vect.transform(X)
        # use the model argument rather than the global clf object
        model.partial_fit(X_train, y, classes=classes)
        results = c.fetchmany(batch_size)

    conn.close()
    return None

cur_dir = os.path.dirname(__file__)

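# load the classifier from the same pkl_objects path that the
# commented-out pickle.dump call at the end of this script writes to
with open(os.path.join(cur_dir, 'pkl_objects',
                       'classifier.pkl'), 'rb') as f:
    clf = pickle.load(f)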
db = os.path.join(cur_dir, 'reviews.sqlite')

update_model(db_path=db, model=clf, batch_size=10000)

# Uncomment the following lines if you are sure that
# you want to update your classifier.pkl file
# permanently.

# pickle.dump(clf, open(os.path.join(cur_dir,
#             'pkl_objects', 'classifier.pkl'), 'wb')
#             , protocol=4)
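
If we do decide to overwrite classifier.pkl, a crash in the middle of pickle.dump could leave a truncated file behind. One cautious way to reduce that risk is sketched below: write to a temporary file first and then rename it over the target. The helper name save_pickle_atomically is hypothetical and not part of the original script; the sketch only protects against half-written files and does not coordinate several processes that write at the same time.

```python
import os
import pickle
import tempfile


def save_pickle_atomically(obj, dest_path, protocol=4):
    """Pickle obj to a temporary file, then rename it over dest_path."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # create the temporary file next to the target so that the final
    # rename stays on the same filesystem
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            pickle.dump(obj, f, protocol=protocol)
        # replace the old file in a single step
        os.replace(tmp_path, dest_path)
    except BaseException:
        # clean up the partial temporary file before re-raising
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == '__main__':
    # a plain dictionary stands in for the clf object in this demo
    demo_dir = tempfile.mkdtemp()
    target = os.path.join(demo_dir, 'classifier.pkl')
    save_pickle_atomically({'demo': True}, target)
    with open(target, 'rb') as f:
        print(pickle.load(f))
```

In the update script, the same helper could be called with the clf object and the path to classifier.pkl in place of the commented-out pickle.dump call.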

The update_model function will fetch entries from the SQLite database in batches of 10,000 entries at a time unless the database contains fewer entries. Alternatively, we could also fetch one entry at a time by using fetchone instead of fetchmany, which would be computationally very inefficient. Using the alternative fetchall method could be a problem if we are working with large datasets that exceed the memory capacity of the computer or server.
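
To see the batching behavior in isolation, the following minimal sketch fills an in-memory SQLite database with a few dummy rows and reports the batch sizes that fetchmany returns. The helper names iter_batches and demo_batch_sizes, the column names, and the dummy reviews are assumptions made for illustration only; the table name review_db is taken from the script above.

```python
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of rows from cursor, at most batch_size rows each."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # fetchmany returns an empty list once the result set is exhausted
            break
        yield rows


def demo_batch_sizes(n_rows=25, batch_size=10):
    """Return the size of every batch read from a throwaway database."""
    conn = sqlite3.connect(':memory:')
    c = conn.cursor()
    # hypothetical schema: a text column followed by an integer label
    c.execute('CREATE TABLE review_db'
              ' (review TEXT, sentiment INTEGER)')
    c.executemany('INSERT INTO review_db (review, sentiment)'
                  ' VALUES (?, ?)',
                  [('dummy review %d' % i, i % 2) for i in range(n_rows)])
    conn.commit()

    c.execute('SELECT * from review_db')
    sizes = [len(rows) for rows in iter_batches(c, batch_size)]
    conn.close()
    return sizes


if __name__ == '__main__':
    # 25 rows read in batches of 10: two full batches and one partial batch
    print(demo_batch_sizes())  # [10, 10, 5]
```

Only one batch of rows is held in memory at a time, which is why fetchmany scales to tables that fetchall could not load at once.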

Now that we have created the script, we could also upload it to the movieclassifier directory on PythonAnywhere and import the update_model function in the main application script to update the classifier from the SQLite database every time we restart the web application. In order to do so, we just need to add a line of code to the top of the main application script that imports the update_model function:

# import update function from local dir
from update import update_model

We then need to call the update_model function in the main application body:

if __name__ == '__main__':
    update_model(db_path=db, model=clf, batch_size=10000)
