
PROJECT : CITIZEN

Machine learning for public safety

Infrastructure and Machine Learning for a Real-Time Public Safety App

TL;DR:

Contributed to ML and infrastructure development at Citizen, a real-time public safety app. Built a Feature Store and supporting data pipelines, fine-tuned OpenAI models via API, and trained a machine learning model to intelligently distribute mobile push notifications while reducing user noise.

I contributed to infrastructure and machine learning development for Citizen, a real-time public safety application used to inform users about incidents as they happen. The work focused on building reliable data foundations and enabling machine learning workflows that could operate effectively in a time-sensitive, production environment.

A key part of the project was designing and building a Feature Store to support multiple machine learning use cases. This involved creating data pipelines that transformed raw event and user data into consistent, reusable features, allowing models to be trained and served more reliably across teams.
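To make the idea concrete, here is a minimal sketch of the kind of transformation such a pipeline performs: raw app events aggregated into per-user feature rows. The event schema, feature names, and window are assumptions for illustration, not the real Citizen schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

def build_user_features(raw_events, as_of):
    """Aggregate raw app events into per-user feature rows.

    Each raw event is assumed to look like
    {"user_id": str, "ts": datetime, "incident_category": str}.
    Only events strictly before `as_of` are counted, so the same
    function can backfill training features without leaking future data.
    """
    window_start = as_of - timedelta(days=7)
    per_user = defaultdict(lambda: {"views_7d": 0, "categories": set()})
    for ev in raw_events:
        if window_start <= ev["ts"] < as_of:
            row = per_user[ev["user_id"]]
            row["views_7d"] += 1
            row["categories"].add(ev["incident_category"])
    # Emit one consistent, reusable feature row per user.
    return {
        uid: {
            "views_7d": row["views_7d"],
            "distinct_categories_7d": len(row["categories"]),
            "feature_ts": as_of,
        }
        for uid, row in per_user.items()
    }
```

Because the same function produces features for both training backfills and online serving, models see identical definitions in both paths.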

In addition, I worked on improving system outputs by fine-tuning OpenAI models through their API. These models were used to enhance downstream functionality, with an emphasis on quality, consistency, and alignment with product requirements rather than experimentation alone.
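As a rough sketch of that workflow, the snippet below prepares chat-formatted JSONL training data, which is what the OpenAI fine-tuning API expects. The example pairs and model name are invented for illustration; the upload and job-creation calls are shown as comments since they need an API key.

```python
import json

# Hypothetical training pairs: (raw incident text, desired normalized alert).
examples = [
    ("MVA w/ injuries, 3rd & Main", "Car crash with injuries at 3rd St and Main St."),
    ("Structure fire rptd, 12 Oak Ave", "Reported building fire at 12 Oak Ave."),
]

# Fine-tuning expects chat-formatted JSONL, one example per line.
with open("train.jsonl", "w") as f:
    for raw, clean in examples:
        record = {
            "messages": [
                {"role": "system", "content": "Rewrite incident text as a clear alert."},
                {"role": "user", "content": raw},
                {"role": "assistant", "content": clean},
            ]
        }
        f.write(json.dumps(record) + "\n")

# Uploading the file and launching the job (requires the `openai` package):
#   from openai import OpenAI
#   client = OpenAI()
#   file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
#   job = client.fine_tuning.jobs.create(training_file=file.id,
#                                        model="gpt-4o-mini-2024-07-18")
```

Keeping the dataset-building step separate from the API calls makes it easy to validate the training file before spending on a fine-tuning run.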

I also trained and integrated a machine learning model responsible for the intelligent distribution of mobile push notifications. The goal was to deliver relevant alerts to users while reducing noise, balancing timeliness, user context, and system load. This required close attention to data quality, model performance, and operational constraints typical of real-time consumer applications.
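A minimal sketch of such a model in PyTorch, assuming the inputs are numeric features about the incident and the user's context (the architecture, feature count, and threshold here are illustrative, not the production model):

```python
import torch
import torch.nn as nn

class NotificationScorer(nn.Module):
    """Toy relevance scorer: maps incident/user features to a
    probability that a push notification is worth sending."""

    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16),
            nn.ReLU(),
            nn.Linear(16, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid keeps scores in [0, 1] so a single threshold applies.
        return torch.sigmoid(self.net(x)).squeeze(-1)

def should_notify(scores: torch.Tensor, threshold: float = 0.7) -> torch.Tensor:
    """Suppress low-relevance alerts to cut notification noise;
    raising the threshold trades reach for less user fatigue."""
    return scores > threshold
```

In a real system the threshold itself becomes a tunable knob balancing timeliness and system load against user fatigue.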

Overall, the work helped strengthen Citizen’s ML infrastructure and enabled more targeted, scalable, and maintainable machine learning features within the app.

My role in this project

I worked as an MLOps engineer, providing the architecture that lets ML engineers train their models. I created a Feature Store from which ML engineers can easily fetch the features they need, and built a monitoring bot that tracks ML performance and posts updates to Slack.
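The monitoring bot can be sketched as two small steps: format a metrics summary, then post it to a Slack incoming webhook. The webhook URL and metric names below are placeholders, not real values.

```python
import json
import urllib.request

# Placeholder: a real Slack incoming-webhook URL would go here.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE"

def format_report(metrics: dict) -> dict:
    """Format model-performance metrics as a Slack message payload."""
    lines = ["*Daily ML model report*"]
    for name, value in sorted(metrics.items()):
        lines.append(f"- {name}: {value:.3f}")
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return {"text": "\n".join(lines)}

def post_to_slack(payload: dict, url: str = SLACK_WEBHOOK_URL) -> int:
    """POST the payload to the webhook and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Separating formatting from delivery keeps the report logic testable without touching the network.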

Challenges and trade-offs

The initial training pipeline required a Dataproc (Spark) cluster to preprocess raw data at training time: one-hot encoding categorical variables, assembling feature arrays, and repartitioning large Parquet datasets. This was costly, slow to iterate on, and meant feature logic lived in PySpark code that was disconnected from the serving path.

We migrated the feature computation upstream into BigQuery by writing SQL scripts. These BigQuery tables were then registered as Feature Groups in Vertex AI Feature Store via Terraform, with point-in-time lookup queries replacing the Spark transformations. This unified training and serving behind the same feature definitions, eliminated the Dataproc cluster entirely, and made the training pipeline a straightforward BigQuery export, reducing infrastructure cost and ensuring feature consistency between training and inference.
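To illustrate what "point-in-time lookup" means here, the helper below builds the style of BigQuery join that replaced the Spark transforms: for each labeled training example, take the latest feature row computed at or before the label timestamp, so training never sees future data. Table and column names are illustrative, not the real schema.

```python
def point_in_time_query(feature_table: str, label_table: str) -> str:
    """Build a BigQuery point-in-time join between a feature table
    (one row per user per feature_ts) and a labeled-examples table.
    Schema names here are assumptions for illustration."""
    return f"""
    SELECT l.user_id, l.label, f.* EXCEPT (user_id, feature_ts)
    FROM `{label_table}` AS l
    LEFT JOIN `{feature_table}` AS f
      ON f.user_id = l.user_id
     AND f.feature_ts = (
       SELECT MAX(f2.feature_ts)
       FROM `{feature_table}` AS f2
       WHERE f2.user_id = l.user_id
         AND f2.feature_ts <= l.label_ts
     )
    """
```

Because the same feature tables back both this training export and online serving, the join guards against train/serve skew as well as label leakage.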

Tech stack used in this project

Python, GCP, Vertex AI, Vertex Pipelines, Terraform, PyTorch, PySpark