December, 2022

Deploying a prediction model and automating its lifecycle

In this project I deployed a neural network that predicts taxi prices. This consisted of training and storing it in the cloud, automating and monitoring its lifecycle, and making it accessible via Docker and FastAPI.

The goal of this project was to bring a provided neural network that predicts the price of a taxi fare into production, to be able to train it on a large dataset (55M rows) and to automate its lifecycle. The project consisted of the following steps:

  1. Package the code - Implementing the code of a provided notebook as a Python package that can be run from the command line and deployed.
  2. Implement incremental processing and incremental learning - To circumvent memory and time constraints, I implemented chunk-by-chunk preprocessing and partial fitting of the model.
  3. Cloud training - Sourcing data from a warehouse using Google BigQuery, then training, evaluating and using the model on a virtual machine using Google Compute Engine.
  4. Automating the model lifecycle - To ensure reliability over time, I used MLflow to store the trained models and monitor their performance in the cloud. I then used Prefect to automate a workflow that preprocesses new data, evaluates the performance of the current model on the new data and trains it on the new data to see how the performance changes.
  5. Deployment - Lastly, I built a prediction API using FastAPI, created a Docker image and deployed it to Google Cloud Run to enable the use of the model via API requests.
    Example API request:
    ".../predict?pickup_datetime=2013-07-06%2017:18:00
    &pickup_longitude=-73.950655
    &pickup_latitude=40.783282
    &dropoff_longitude=-73.984365
    &dropoff_latitude=40.769802
    &passenger_count=2"

    Example response:
As this project is part of an official cooperation with an external party, I cannot make the whole repository publicly available. However, I am happy to share with you some excerpts of the code I wrote for it.
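
The example request from step 5 can be assembled with Python's standard library. This is a minimal sketch, not part of the project code: the base URL is a placeholder (the real host is elided in the example), and `build_predict_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode

# Placeholder host: the real service address is not given in the example.
BASE_URL = "https://example.invalid/predict"


def build_predict_url(base_url: str, params: dict) -> str:
    """Return base_url followed by a percent-encoded query string."""
    # quote (instead of the default quote_plus) encodes spaces as %20;
    # safe=":" leaves the colons of the timestamp unescaped.
    query = urlencode(params, quote_via=quote, safe=":")
    return f"{base_url}?{query}"


# Parameter names and values are taken from the example request.
params = {
    "pickup_datetime": "2013-07-06 17:18:00",
    "pickup_longitude": -73.950655,
    "pickup_latitude": 40.783282,
    "dropoff_longitude": -73.984365,
    "dropoff_latitude": 40.769802,
    "passenger_count": 2,
}

url = build_predict_url(BASE_URL, params)
print(url)
```

The resulting URL can then be passed to any HTTP client.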

    Tech stack

  • Python
  • MLflow
  • Google Cloud Run
  • Google BigQuery
  • Google Compute Engine
  • Prefect
  • Docker
  • FastAPI


Code excerpts


Incremental processing and training:
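A minimal sketch of the idea, not the project code. It assumes a CSV-like source with a `fare_amount` column (an assumed name), uses a few synthetic rows, and swaps the neural network for a one-feature linear model updated by stochastic gradient descent so that the example runs with the standard library alone. The point is the loop: read a chunk, preprocess it, fit on it, discard it.

```python
import csv
import io
from itertools import islice


def read_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows, so only one chunk is in memory."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def preprocess(chunk):
    """Turn raw rows into (distance, fare) pairs and drop non-positive fares."""
    pairs = []
    for row in chunk:
        fare = float(row["fare_amount"])
        # Crude distance feature: Manhattan distance in degrees, scaled by 100.
        dist = 100 * (
            abs(float(row["dropoff_longitude"]) - float(row["pickup_longitude"]))
            + abs(float(row["dropoff_latitude"]) - float(row["pickup_latitude"]))
        )
        if fare > 0:
            pairs.append((dist, fare))
    return pairs


class LinearModel:
    """fare ~ w * distance + b, trained one chunk at a time."""

    def __init__(self, lr=0.005):
        self.w, self.b, self.lr, self.n_seen = 0.0, 0.0, lr, 0

    def partial_fit(self, pairs):
        for x, y in pairs:
            err = self.w * x + self.b - y
            self.w -= self.lr * err * x
            self.b -= self.lr * err
            self.n_seen += 1


# Synthetic data for illustration only: 10 rows, one with an invalid fare.
lines = ["pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude,fare_amount"]
for i in range(10):
    d = 0.01 * (i + 1)
    fare = 0.0 if i == 3 else 2.5 + 100 * d
    lines.append(f"-73.95,40.78,{-73.95 - d},40.78,{fare}")

model = LinearModel()
chunks_done = 0
reader = csv.DictReader(io.StringIO("\n".join(lines)))
for chunk in read_chunks(reader, chunk_size=4):
    model.partial_fit(preprocess(chunk))
    chunks_done += 1

print(chunks_done, model.n_seen)
```

With a real dataset the `io.StringIO` source would be replaced by a file or a query result iterator. The loop itself stays the same.
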
MLflow and Prefect:
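A minimal sketch of the workflow order described in step 4, in plain Python and not the project code. In the project the steps were orchestrated with Prefect and the metrics were tracked with MLflow. Here an in-memory list stands in for the tracking server, a running-mean predictor stands in for the neural network, and all function names and fare values are made up for illustration.

```python
from statistics import mean


def preprocess(raw_fares):
    """Drop non-positive fares (stand-in for the real preprocessing)."""
    return [f for f in raw_fares if f > 0]


def evaluate(model, fares):
    """Mean absolute error of a model that always predicts the same fare."""
    return mean(abs(f - model["prediction"]) for f in fares)


def train(model, fares):
    """Fold the new fares into the running mean that the model predicts."""
    n = model["n_seen"] + len(fares)
    total = model["prediction"] * model["n_seen"] + sum(fares)
    return {"prediction": total / n, "n_seen": n}


def lifecycle_run(model, raw_fares, run_log):
    """Preprocess new data, evaluate the current model, retrain, re-evaluate."""
    fares = preprocess(raw_fares)
    mae_before = evaluate(model, fares)
    new_model = train(model, fares)
    mae_after = evaluate(new_model, fares)
    # Stand-in for logging both metrics to a tracking server.
    run_log.append({"mae_before": mae_before, "mae_after": mae_after})
    return new_model


run_log = []
model = {"prediction": 10.0, "n_seen": 4}  # made-up starting state
model = lifecycle_run(model, [12.0, 0.0, 16.0, 14.0, -1.0], run_log)
print(run_log[0], model["n_seen"])
```

Recording the error before and after retraining is what makes it possible to see how the performance changes with each batch of new data.
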
FastAPI: