After we have tested the web application locally, we are now ready to deploy our web application onto a public web server. For this tutorial, we will be using the PythonAnywhere web hosting service, which specializes in the hosting of Python web applications and makes it extremely simple and hassle-free. Furthermore, PythonAnywhere offers a beginner account option that lets us run a single web application free of charge.
To create a new PythonAnywhere account, we visit the website at https://www.pythonanywhere.com and click on the Pricing & signup link that is located in the top-right corner. Next, we click on the Create a Beginner account button where we need to provide a username, password, and a valid e-mail address. After we have read and agreed to the terms and conditions, we should have a new account.
Unfortunately, the free beginner account doesn't allow us to access the remote server via the SSH protocol from our command-line terminal. Thus, we need to use the PythonAnywhere web interface to manage our web application. But before we can upload our local application files to the server, need to create a new web application for our PythonAnywhere account. After we clicking on the Dashboard button in the top-right corner, we have access to the control panel shown at the top of the page. Next, we click on the Web tab that is now visible at the top of the page. We proceed by clicking on the Add a new web app button on the left, which lets us create a new Python 3.4 Flask web application that we name movieclassifier
.
After creating a new application for our PythonAnywhere account, we head over to the Files tab to upload the files from our local movieclassifier
directory using the PythonAnywhere web interface. After uploading the web application files that we created locally on our computer, we should have a movieclassifier
directory in our PythonAnywhere account. It contains the same directories and files as our local movieclassifier
directory has, as shown in the following screenshot:
Lastly, we head over to the Web tab one more time and click on the Reload <username>.pythonanywhere.com button to propagate the changes and refresh our web application. Finally, our web app should now be up and running and publicly available via the address <username>.pythonanywhere.com
.
Unfortunately, web servers can be quite sensitive to the tiniest problems in our web app. If you are experiencing problems with running the web application on PythonAnywhere and are receiving error messages in your browser, you can check the server and error logs which can be accessed from the Web tab in your PythonAnywhere account to better diagnose the problem.
While our predictive model is updated on-the-fly whenever a user provides feedback about the classification, the updates to the clf
object will be reset if the web server crashes or restarts. If we reload the web application, the clf
object will be reinitialized from the classifier.pkl
pickle file. One option to apply the updates permanently would be to pickle the clf
object once again after each update. However, this would become computationally very inefficient with a growing number of users and could corrupt the pickle file if users provide feedback simultaneously. An alternative solution is to update the predictive model from the feedback data that is being collected in the SQLite database. One option would be to download the SQLite database from the PythonAnywhere server, update the clf
object locally on our computer, and upload the new pickle file to PythonAnywhere. To update the classifier locally on our computer, we create an update.py
script file in the movieclassifier
directory with the following contents:
import pickle import sqlite3 import numpy as np import os # import HashingVectorizer from local dir from vectorizer import vect def update_model(db_path, model, batch_size=10000): conn = sqlite3.connect(db_path) c = conn.cursor() c.execute('SELECT * from review_db') results = c.fetchmany(batch_size) while results: data = np.array(results) X = data[:, 0] y = data[:, 1].astype(int) classes = np.array([0, 1]) X_train = vect.transform(X) clf.partial_fit(X_train, y, classes=classes) results = c.fetchmany(batch_size) conn.close() return None cur_dir = os.path.dirname(__file__) clf = pickle.load(open(os.path.join(cur_dir, 'pkl_objects', 'classifier.pkl'), 'rb')) db = os.path.join(cur_dir, 'reviews.sqlite') update_model(db_path=db, model=clf, batch_size=10000) # Uncomment the following lines if you are sure that # you want to update your classifier.pkl file # permanently. # pickle.dump(clf, open(os.path.join(cur_dir, # 'pkl_objects', 'classifier.pkl'), 'wb') # , protocol=4)
The update_model
function will fetch entries from the SQLite database in batches of 10,000 entries at a time unless the database contains fewer entries. Alternatively, we could also fetch one entry at a time by using fetchone
instead of fetchmany
, which would be computationally very inefficient. Using the alternative fetchall
method could be a problem if we are working with large datasets that exceed the computer or server's memory capacity.
Now that we have created the update.py
script, we could also upload it to the movieclassifier
directory on PythonAnywhere and import the update_model
function in the main application script app.py
to update the classifier from the SQLite database every time we restart the web application. In order to do so, we just need to add a line of code to import the update_model
function from the update.py
script at the top of app.py
:
# import update function from local dir from update import update_model
We then need to call the update_model
function in the main application body:
… if __name__ == '__main__': update_model(filepath=db, model=clf, batch_size=10000) …