Distributed Machine Learning Predictions Using TensorFlow

Achieve faster ML model serving for your users at the edge by running a distributed ML model server. This tutorial will use CloudFlow to deploy TensorFlow Serving with an example pretrained model.

The TensorFlow container image we will use is available on Docker Hub.

note

Before starting, create a new CloudFlow Project and then delete the default Deployment and ingress-upstream Service to prepare the project for your new deployment.

Prerequisites

  • You need an account on Docker Hub.
  • You need Docker installed so that you can build a Docker image.

Pull Down the Pretrained Model

Pull down the TensorFlow example models from GitHub so that we can build one of them into the container image.

mkdir my-tensorflow-example
cd my-tensorflow-example
git clone https://github.com/tensorflow/serving
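
Before building, you can confirm the pretrained model came down with the checkout; the path below is the same one the Dockerfile in the next step copies from:

ls serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu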

Create a Dockerfile for Your Container Image

The container image you'll build uses the TensorFlow Serving image from Docker Hub as its base. We'll use one of the pretrained models you just downloaded, saved_model_half_plus_two_cpu, which halves each input value and then adds 2.

Dockerfile
# Start from the official TensorFlow Serving image on Docker Hub
FROM tensorflow/serving

# Copy the pretrained model into the container's model directory
COPY serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu /models/half_plus_two

# Launch TensorFlow Serving (the same entrypoint the base image defines)
ENTRYPOINT ["/usr/bin/tf_serving_entrypoint.sh"]

Build and Publish the Image

Build the image and push it to Docker Hub, substituting YOUR_DOCKERHUB_ACCOUNT accordingly.

docker build -t my-tensorflow-image .
docker tag my-tensorflow-image YOUR_DOCKERHUB_ACCOUNT/tensorflow:latest
docker push YOUR_DOCKERHUB_ACCOUNT/tensorflow:latest
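
If the push is rejected with an authentication error, run docker login first. You can also smoke-test the image locally before deploying it; this sketch assumes TensorFlow Serving's default REST port, 8501 (the same port the Service below targets), and sets the MODEL_NAME environment variable just as the Deployment will:

docker run --rm -p 8501:8501 -e MODEL_NAME=half_plus_two my-tensorflow-image
# then, from a second terminal:
curl -d '{"instances": [1.0]}' -X POST http://localhost:8501/v1/models/half_plus_two:predict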

Create a Kubernetes Deployment for TensorFlow

Next, create the deployment for TensorFlow as tensorflow-deployment.yaml, substituting YOUR_DOCKERHUB_ACCOUNT accordingly. This directs CloudFlow to distribute the container image you pushed to Docker Hub.

tensorflow-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tensorflow
  name: tensorflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tensorflow
  template:
    metadata:
      labels:
        app: tensorflow
    spec:
      containers:
        - image: YOUR_DOCKERHUB_ACCOUNT/tensorflow:latest
          imagePullPolicy: Always
          name: tensorflow
          resources:
            requests:
              memory: ".5Gi"
              cpu: "500m"
            limits:
              memory: ".5Gi"
              cpu: "500m"
          env:
            - name: MODEL_NAME
              value: half_plus_two

Apply this deployment resource to your Project with either the Kubernetes dashboard or kubectl apply -f tensorflow-deployment.yaml.
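
To verify the rollout, you can use standard kubectl checks; the Deployment name and label come from the manifest above:

kubectl rollout status deployment/tensorflow
kubectl get pods -l app=tensorflow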

Expose the Service on the Internet

We want to expose the TensorFlow service on the Internet. Create ingress-upstream.yaml as defined below.

ingress-upstream.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ingress-upstream
  name: ingress-upstream
spec:
  ports:
    - name: 80-80
      port: 80
      protocol: TCP
      targetPort: 8501
  selector:
    app: tensorflow
  sessionAffinity: None
  type: ClusterIP

Apply this service resource to your Project with either the Kubernetes dashboard or kubectl apply -f ingress-upstream.yaml.
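
To confirm the Service is selecting your TensorFlow pods, you can check that it has endpoints; this is a standard kubectl command using the Service name from the manifest above:

kubectl get endpoints ingress-upstream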

You can see the pods running on CloudFlow's network with kubectl get pods -o wide.

The -o wide switch shows where your containers are running. Under the default AEE location optimization strategy, CloudFlow places them optimally according to your traffic.

Start Making Predictions at the Edge

Exercise the ML prediction service, substituting YOUR_ENVIRONMENT_HOSTNAME accordingly.

curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://YOUR_ENVIRONMENT_HOSTNAME/v1/models/half_plus_two:predict

The result reflects the half-plus-two formula applied to each input:

{
  "predictions": [2.5, 3.0, 4.5]
}
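
You can also confirm that the model is loaded and serving by querying TensorFlow Serving's model status endpoint, a standard part of its REST API:

curl http://YOUR_ENVIRONMENT_HOSTNAME/v1/models/half_plus_two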