Matthew Bain

Contact

Core tools

Platforms

Key competencies

Last updated on 2025-12-04.

Matthew Bain

Data Engineer at Madiba Inc.

Data professional with 5+ years of experience building statistical solutions at scale. My work spans neuroscience, assistive tech, and housing, with a focus on statistically sound, interpretable machine learning. I help organizations understand their data and automate the hard stuff.

LeetCode Kaggle Tableau Streamlit

Professional Experience

Data Engineer

Madiba Inc.

Irvine, California

Present - Sep 2025

  • Build statistical tools and machine learning models to identify customer segmentation patterns, forecast demand, and predict churn within housing datasets (Python, Scikit-learn)
  • Tune statistical and machine learning models and visualize their outputs with plotting libraries, interactive apps, and reports to convey key insights to executives (Snowflake, Streamlit)

Open Source Developer

Independent

Toronto, ON

Aug 2025 - Jan 2024

  • Developed a software package [datopy] simplifying data model design and ETL workflows (Python, Pandas)
  • Built a data management package [autocv] automating job application workflows (R, Excel)

Data Analyst

VocaLinks Inc.

Toronto, ON

Oct 2023 - Jun 2023

  • Created and communicated market breakdowns and technical tutorials (R Markdown, Python) on 4 AI-powered assistive technology products, facilitating data-driven discussions among industry leaders
  • Distilled key statistical insights from hearing research into visually compelling reports (ggplot2 (R), Seaborn), driving a 100% increase in sales of an assistive listening product
  • Designed the data layer of an Azure speech recognition data center, enabling rapid delivery of meaningful analytics to 200+ customers via the cloud

Research Analyst

University of Western Ontario

London, ON

Aug 2022 - Sep 2021

  • Designed scalable online experiments (AWS, Amazon MTurk, JavaScript) to expedite data collection for 3 auditory neuroscience projects, yielding 2 publications and a robust cloud infrastructure
  • Provided multivariate statistical modeling support for research on neural signatures of epilepsy, laying the groundwork for automated labeling of 10+ hours of multimedia stimuli (Python, GCP)

Cognitive Neuroscientist

University of Western Ontario

London, ON

Aug 2021 - Jul 2018

  • Designed and deployed 11 experiments assessing neural speech processing in 300+ research subjects
  • Developed a time series analysis toolbox (R, MATLAB) to measure hearing impairment through brain imaging, enhancing diagnostic sensitivity by a factor of 2

Graduate Teaching Assistant

University of Western Ontario

London, ON

Apr 2021 - Sep 2018

  • Directed 4 weekly research methods and statistics labs for 100+ undergraduate neuroscience students, leading to a nomination recognizing excellence in teaching and mentorship

Education

BSc (Honours), Mathematics and Statistics

McMaster University

Hamilton, ON

Dec 2023 - Sep 2021

MSc, Neuroscience [Thesis]

University of Western Ontario

London, ON

Aug 2020 - Sep 2018

BSc (Honours), Neuroscience

McMaster University

Hamilton, ON

Apr 2018 - Sep 2014

Certifications

Databricks Certified Data Engineer Associate [Certificate]

Databricks

Mar 2027 - Mar 2025

Tableau Certified Data Analyst [Certificate]

Tableau

Feb 2027 - Feb 2025

Professional Machine Learning Engineer

Google Cloud Platform (GCP)

Expected Feb 2026

Machine Learning Specialization [Certificate]

Stanford University

Dec 2024

Selected Projects

datopy (Data Management Python Package) [Documentation]

Present - Jan 2024

  • Implemented and maintain a package for handling unstructured data, providing a simple interface for data modeling, extraction, validation, and building ETL pipelines (Pandas, pytest, Pydantic, GitHub Actions)

mlvizz (Machine Learning Data Visualization Package)

Present - Apr 2024

  • Build Python interfaces for efficient, intuitive model selection, tuning, and inspection, and for assembling ML pipelines, using modular, object-oriented designs with built-in data validation (Scikit-learn, TensorFlow, ArviZ, PyMC, SciPy)

statvizz (Statistical Data Visualization Package)

Present - Apr 2024

  • Build Python extensions combining Pandas data summarization with Seaborn statistical plotting in a single intuitive, fully transparent plotting interface (Pandas, Seaborn, SciPy, Matplotlib)

mathvizz (Mathematical Visualization Package)

Present - Jul 2024

  • Build Python interfaces for exploring the geometry of machine learning math, including linear maps, derivatives, statistical distributions, and series approximations (SymPy, SciPy, Seaborn, Matplotlib)

One-stop Superstore Dashboard [Tableau]

Feb 2025 - Feb 2025

  • Designed an interactive dashboard displaying sales in a convenient, unified view, with KPIs and sales logs visualized across both space and time (SQL, Excel)

autocv (Automated CV-generation R Package) [Documentation]

Jul 2024 - May 2024

  • Built a data management package automating job applications using validated data stores and macro-integrated spreadsheets, resulting in an efficient, extensible workflow (dplyr (R), pkgdown, Excel)