Matthew Bain
Data Engineer at Madiba Inc.
Data professional with 5+ years of experience building statistical solutions at scale. My work spans neuroscience, assistive tech, and housing, with a focus on statistically sound, interpretable machine learning. I help organizations understand their data and automate the hard stuff.
Professional Experience
Data Engineer
Madiba Inc.
Irvine, California
Present - Sep 2025
- Build statistical tools and machine learning models to identify customer segmentation patterns, forecast demand, and predict churn within housing datasets (Python, Scikit-learn)
- Tune statistical and machine learning models and visualize their outputs using plotting libraries, interactive application frameworks, and reporting tools to convey key insights to executives (Snowflake, Streamlit)
Open Source Developer
Independent
Toronto, ON
Aug 2025 - Jan 2024
Data Analyst
VocaLinks Inc.
Toronto, ON
Oct 2023 - Jun 2023
- Created and communicated market breakdowns and technical tutorials (R Markdown, Python) on 4 AI-powered assistive technology products, facilitating data-driven discussions among industry leaders
- Distilled key statistical insights from hearing research into visually compelling reports (ggplot2 (R), Seaborn), driving a 100% increase in sales of an assistive listening product
- Designed the data layer of an Azure speech recognition data center, enabling rapid delivery of meaningful analytics to 200+ customers via the cloud
Research Analyst
University of Western Ontario
London, ON
Aug 2022 - Sep 2021
- Designed scalable online experiments (AWS, Amazon MTurk, JavaScript) to expedite data collection for 3 auditory neuroscience projects, yielding 2 publications and a robust cloud infrastructure
- Provided multivariate statistical modeling support for research on neural signatures of epilepsy, laying the groundwork for automated labeling of 10+ hours of multimedia stimuli (Python, GCP)
Cognitive Neuroscientist
University of Western Ontario
London, ON
Aug 2021 - Jul 2018
- Designed and deployed 11 experiments assessing neural speech processing in 300+ research subjects
- Developed a time series analysis toolbox (R, MATLAB) to measure hearing impairment through brain imaging, enhancing diagnostic sensitivity by a factor of 2
Graduate Teaching Assistant
University of Western Ontario
London, ON
Apr 2021 - Sep 2018
- Directed 4 weekly research methods and statistics labs for 100+ undergraduate neuroscience students, leading to a nomination recognizing excellence in teaching and mentorship
Education
BSc (Honours), Mathematics and Statistics
McMaster University
Hamilton, ON
Dec 2023 - Sep 2021
BSc (Honours), Neuroscience
McMaster University
Hamilton, ON
Apr 2018 - Sep 2014
Certifications
Professional Machine Learning Engineer
Google Cloud Platform (GCP)
N/A
Expected Feb 2026
Selected Projects
datopy (Data Management Python Package) [Documentation]
N/A
N/A
Present - Jan 2024
- Implemented and maintain a package for handling unstructured data, providing a simple interface for data modeling, extraction, validation, and building ETL pipelines (Pandas, PyTest, Pydantic, GitHub Actions)
mlvizz (Machine Learning Data Visualization Package)
N/A
N/A
Present - Apr 2024
- Build Python interfaces for efficient, intuitive model selection, tuning, inspection, and pipeline building through modular, object-oriented designs with built-in data validation (Scikit-learn, TensorFlow, ArviZ, PyMC, SciPy)
statvizz (Statistical Data Visualization Package)
N/A
N/A
Present - Apr 2024
- Build Python extensions combining Pandas data summarization with Seaborn statistical plotting in a unified, intuitive, fully transparent plotting interface (Pandas, Seaborn, SciPy, Matplotlib)
mathvizz (Mathematical Visualization Package)
N/A
N/A
Present - Jul 2024
- Build Python interfaces for exploring the geometry of machine learning math, including linear maps, derivatives, statistical distributions, and series approximations (SymPy, SciPy, Seaborn, Matplotlib)
One-stop Superstore Dashboard [Tableau]
N/A
N/A
Feb 2025 - Feb 2025
- Designed an interactive dashboard displaying sales in a convenient, unified view, with KPIs and sales logs visualized in both space and time (SQL, Excel)
autocv (Automated CV-generation R Package) [Documentation]
N/A
N/A
Jul 2024 - May 2024
- Built a data management package that automates job application workflows using validated data stores and macro-integrated spreadsheets, making the process efficient and extensible (dplyr (R), pkgdown, Excel)