Creating a prediction model as a service using Flask and Docker
Introduction¶
The aim of this notebook is to create a machine learning model and transform it into an API which, when given some novel input parameters, returns the model’s prediction.
The model¶
The model that is going to be used is a Random Forest, built on a data set of Titanic passengers (data/train.csv). It predicts the probability that a given passenger would have survived.
The goal¶
Input is an API call such as
/predict?class=2&sex=male&age=22&sibsp=2&parch=0&title=mr
with a response in the form of
{ "probabilityOfSurvival": 0.95 }
Creating the model¶
import pandas as pd
Import the data from the CSV file train.csv.
train = pd.read_csv('data/train.csv')
Explore the data.
train.head()
train.dtypes
Get the mapping for the gender by creating a function that retrieves the categories of a column of type category. This returns a dictionary that maps each category value to its numeric code.
def _get_category_mapping(column):
""" Return the mapping of a category """
    return {cat: code for code, cat in enumerate(column.cat.categories)}
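For example, applied to a small toy column (a minimal illustration, not part of the Titanic data), pandas orders the categories alphabetically:
colors = pd.Series(['red', 'green', 'red']).astype('category')
_get_category_mapping(colors)  # {'green': 0, 'red': 1}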
The unique values before converting to a category:
train['Sex'].unique()
train['Sex'] = train['Sex'].astype('category')
sex_mapping = _get_category_mapping(train['Sex'])
train['Sex'] = train['Sex'].cat.codes
and the values after converting:
sex_mapping
We will keep the mapping so we can use it later once we deploy the Model as a Service.
train['Sex'].unique()
Create categories for the titles by extracting them from the names.
train['Name'].head(10)
# Keys and values are lowercase and dot-free so that they match the
# titles produced by _extract_title below
FRENCH_MAPPING = {
    'mme': 'mrs',    # Madame
    'mlle': 'miss',  # Mademoiselle
    'm': 'mr',       # Monsieur
}
FINAL_TITLES = [
'master',
'miss',
'mr',
'mrs'
]
import re
def _extract_title(column):
""" Extract the title """
    # Keep only the title: strip the surname before the comma and everything from the dot on, then lowercase
title_column = column.apply(lambda x: re.sub(r'(.*, )|(\..*)', '', x).lower()).astype(str)
# Map the French to English titles
title_column = title_column.replace(FRENCH_MAPPING)
# Create the categories based on the final titles and the rare title
title_column = title_column.apply(lambda x: 'rare title' if x not in FINAL_TITLES else x)
return title_column
train['Title'] = _extract_title(train['Name'])
train[['Name', 'Title']].head()
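As a quick sanity check of the extraction (including the French-to-English mapping), the helper can be applied to a couple of names in the data set's format:
_extract_title(pd.Series(['Braund, Mr. Owen Harris', 'Sagesser, Mlle. Emma']))
# 0      mr
# 1    miss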
import matplotlib.pyplot as plt
%matplotlib inline
train.groupby('Title')['Name'].count().plot.pie(title="Distribution of titles")
Let's now convert the titles to a category as we did for the gender:
train['Title'] = train['Title'].astype("category")
title_mapping = _get_category_mapping(train['Title'])
train['Title'] = train['Title'].cat.codes
title_mapping
Let us now investigate the ages of the people aboard the Titanic.
train.groupby('Age')['Name'].count().nlargest(20).plot.bar(title="Top 20 ages")
train[train['Age'].isnull()]
In the data set the age is missing for 177 passengers. Use a linear model, fitted on the rows that do have an age, to estimate the age from the class, gender and number of siblings/spouses.
LIN_MOD_FEATURES = [
'Pclass',
'Sex',
'SibSp'
]
LIN_MOD_TARGET = [
'Age'
]
from sklearn import linear_model
def _create_linear_model(frame):
""" Create linear model """
imput = frame[frame.Age.notnull()]
features = imput[LIN_MOD_FEATURES]
target = imput[LIN_MOD_TARGET]
model = linear_model.LinearRegression()
model.fit(features, target)
return model
linear_mod = _create_linear_model(train)
Calculate the predicted age for all the rows:
train['PredictedAge'] = linear_mod.predict(train[LIN_MOD_FEATURES])
Merge the predicted age into the dataframe where there is no age yet:
train['Age'] = train.apply(lambda x: x.Age if pd.notnull(x.Age) else x.PredictedAge, axis=1)
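An equivalent, vectorized alternative to the apply above would be:
train['Age'] = train['Age'].fillna(train['PredictedAge'])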
Drop the prediction column since it is not needed anymore:
train.drop(['PredictedAge'], axis=1, inplace=True)
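A quick check confirms that no ages are missing anymore:
assert train['Age'].isnull().sum() == 0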
To estimate the chance of survival we will use a Random Forest Classifier.
FEATURES = [
'Pclass',
'Sex',
'Age',
'SibSp',
'Parch',
'Title',
]
TARGET = 'Survived'
NUM_TREES = 500
MAX_FEATURES = 2
from sklearn.ensemble import RandomForestClassifier
def _create_random_forest_classifier(frame):
""" Build a random forest classifier """
features = frame[FEATURES]
target = frame[TARGET]
model = RandomForestClassifier(n_estimators=NUM_TREES,
max_features=MAX_FEATURES,
random_state=754)
model.fit(features, target)
return model
random_forest_classifier = _create_random_forest_classifier(train)
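The original notebook does not evaluate the model, but a quick cross-validation gives a rough sense of its quality (an optional sanity check, not required for the service):
from sklearn.model_selection import cross_val_score
scores = cross_val_score(random_forest_classifier, train[FEATURES], train[TARGET], cv=5)
print("Mean CV accuracy: {:.3f}".format(scores.mean()))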
As a final step for the MaaS we will save the model and the two mappings to disk. This way we can ship them with the microservice in the next step.
from sklearn.externals import joblib
def _save_variable(variable, filename):
""" Save a variable to a file """
joblib.dump(variable, filename)
_save_variable(random_forest_classifier, 'random_forest.mdl')
_save_variable(title_mapping, 'title_mapping.pkl')
_save_variable(sex_mapping, 'sex_mapping.pkl')
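To make sure the artifacts round-trip cleanly, they can be loaded straight back (an optional check):
assert joblib.load('sex_mapping.pkl') == sex_mapping
assert joblib.load('title_mapping.pkl') == title_mapping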
Deploying the model as a service¶
To deploy the model as a service, I am going to use the web framework Flask. It makes it easy to interact with the variables we saved in the previous step, and it is straightforward to create a simple web app with only a few routes. The app.py contains all the magic and will be used in the Dockerfile to get the container with the model online. This is the code for the main application:
%%file app.py
#!/usr/bin/env python
# # -*- coding: utf-8 -*-
""" Flask API for predicting probability of survival """
import sys
from flask import Flask, jsonify, request
from sklearn.externals import joblib
import numpy as np
try:
saved_model = joblib.load('random_forest.mdl')
sex_mapping = joblib.load('sex_mapping.pkl')
title_mapping = joblib.load('title_mapping.pkl')
except Exception:
    print("Error loading application. Please run `python create_random_forest.py` first!")
    sys.exit(1)
app = Flask(__name__)
@app.route('/')
def main():
""" Main page of the API """
return "This is the main page"
@app.route('/predict', methods=['GET'])
def predict():
""" Predict the probability of survival """
args = request.args
required_args = ['class', 'sex', 'age', 'sibsp', 'parch', 'title']
# Simple error handling for the arguments
diff = set(required_args).difference(set(args.keys()))
    if diff:
        return "Error: missing arguments {}".format(diff)
person_features = np.array([args['class'],
sex_mapping[args['sex']],
args['age'],
args['sibsp'],
args['parch'],
title_mapping[args['title'].lower()]
]).reshape(1, -1)
probability = saved_model.predict_proba(person_features)[:, 1][0]
return jsonify({'probabilityOfSurvival': probability})
if __name__ == '__main__':
app.run(host='0.0.0.0')
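Before containerizing, the service can be smoke-tested locally: run python app.py in one terminal (with the three saved files in the same folder) and query it from another, for example:
import requests
params = {'class': 2, 'sex': 'male', 'age': 22, 'sibsp': 2, 'parch': 0, 'title': 'mr'}
print(requests.get('http://localhost:5000/predict', params).json())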
Breakdown¶
The first bit
try:
saved_model = joblib.load('random_forest.mdl')
sex_mapping = joblib.load('sex_mapping.pkl')
title_mapping = joblib.load('title_mapping.pkl')
except Exception:
    print("Error loading application. Please run `python create_random_forest.py` first!")
    sys.exit(1)
will load the three variables that we saved while training the model. We can use the mappings to map the input arguments of the API to the proper fields of the model.
The next part is default for a Flask application:
app = Flask(__name__)
@app.route('/')
def main():
""" Main page of the API """
return "This is the main page"
The predict route is slightly more advanced. It takes parameters via the GET method and verifies that all six required arguments are present.
args = request.args
required_args = ['class', 'sex', 'age', 'sibsp', 'parch', 'title']
# Simple error handling for the arguments
diff = set(required_args).difference(set(args.keys()))
if diff:
    return "Error: missing arguments {}".format(diff)
If all arguments are present, it builds a numpy feature array to feed to the prediction model. This is where the mappings come in: the sex and title arguments arrive as strings, and each is mapped to its numeric code so that it matches the corresponding feature the model was trained on.
person_features = np.array([args['class'],
sex_mapping[args['sex']],
args['age'],
args['sibsp'],
args['parch'],
title_mapping[args['title'].lower()]
]).reshape(1, -1)
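For the example request from the goal section, the assembled feature vector conceptually looks as follows (the codes follow pandas' alphabetical category ordering in this data set, so 'male' maps to 1 and 'mr' maps to 2):
# /predict?class=2&sex=male&age=22&sibsp=2&parch=0&title=mr
# columns: Pclass, Sex, Age, SibSp, Parch, Title
np.array([2, 1, 22, 2, 0, 2]).reshape(1, -1)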
Finally, the probability is calculated from the features and returned as a JSON object; the [:, 1] index selects the probability of the positive class (Survived == 1).
probability = saved_model.predict_proba(person_features)[:, 1][0]
return jsonify({'probabilityOfSurvival': probability})
The last bit of app.py is the default way of starting the Flask server. I have changed the host from localhost to 0.0.0.0 so that the API is reachable from outside the container, i.e. from any IP address.
app.run(host='0.0.0.0')
Docker¶
To run the model as a service, I will use Docker to create a container where the server is running and the endpoint for the prediction is exposed. The Dockerfile
is very basic. It will use Python 3, copy the contents to the container, install the requirements and start the server.
# Base image
FROM python:3
# Copy contents
COPY . /app
# Change work directory
WORKDIR /app
# Install the requirements
RUN pip install -r requirements.txt
# Start the application
CMD ["python", "app.py"]
where requirements.txt
contains the following:
certifi==2018.1.18
click==6.7
Flask==0.12.2
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
numpy==1.14.0
pandas==0.22.0
python-dateutil==2.6.1
pytz==2017.3
scikit-learn==0.19.1
scipy==1.0.0
six==1.11.0
Werkzeug==0.14.1
The docker-compose.yml
will simply build the current Dockerfile
and expose port 5000.
version: '2'
services:
flask:
build: .
ports:
- "5000:5000"
Running docker-compose up -d in this folder will start the server; the endpoint should then be reachable at the machine's IP on port 5000 under the /predict route.
Execution¶
Let's put it to the test: create the files, spin up Docker and check the API response.
%%file Dockerfile
# Base image
FROM python:3
# Copy contents
COPY . /app
# Change work directory
WORKDIR /app
# Install the requirements
RUN pip install -r requirements.txt
# Start the application
CMD ["python", "app.py"]
%%file requirements.txt
certifi==2018.1.18
click==6.7
Flask==0.12.2
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
numpy==1.14.0
pandas==0.22.0
python-dateutil==2.6.1
pytz==2017.3
scikit-learn==0.19.1
scipy==1.0.0
six==1.11.0
Werkzeug==0.14.1
%%file docker-compose.yml
version: '2'
services:
flask:
build: .
ports:
- "5000:5000"
!docker-compose up -d
!docker ps | grep flask
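Once the container is running, a quick request to the root route confirms the server is reachable (this assumes Docker runs on the local machine; substitute the host otherwise):
import requests
print(requests.get('http://localhost:5000/').text)  # expect "This is the main page"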
Verification¶
The following code will call the API with some parameters and print the result.
import requests
params = {
'class': 2,
'age': 22,
'sibsp': 2,
'parch': 0,
'title': 'mr',
'sex': 'male',
}
url = 'http://hub.jitsejan.com:5000/predict'
r = requests.get(url, params)
print(r.url)
print(r.json())
And that is it! The JSON object can now be returned to the front-end of the web application and be displayed in a fancy way, but that is outside the scope of this notebook.
Let's clean up by retrieving the Docker ID and removing the container.
!docker stop $(docker ps -aqf "name=flask")
!docker rm $(docker ps -aqf "name=flask")
Check my GitHub for the original notebook.