Matthew Bain
Data Scientist | Tableau, Databricks | Neuroscience Researcher Turned AI Developer
Data Scientist with 5+ years of experience and certifications in Tableau and Databricks. My work spans neuroscience and assistive tech, with a focus on robust, interpretable ML. I help organizations understand their data and automate the hard stuff. I am seeking a role involving analytics and pipeline development.
Professional Experience
Open Source Developer
Independent
Toronto, ON
Present - Jan 2024
- Developed a software package [datopy] simplifying data model design and ETL workflows (Python, Pandas)
- Built a data management package [autocv] automating job application workflows (R, Excel)
- Implemented a data app [zip-explorer] displaying geospatial KPIs in space and time (Streamlit, Plotly)
Data Scientist
VocaLinks Inc.
Toronto, ON
Oct 2023 - Jun 2023
- Created and communicated market breakdowns and technical tutorials (R Markdown, Python) on 4 AI-powered assistive technology products, facilitating data-driven discussions among industry leaders
- Distilled key statistical insights in hearing research into visually compelling reports (ggplot2 (R), Seaborn), driving a 100% increase in sales of an assistive listening product
- Designed the data layer of an Azure speech recognition data center, enabling rapid delivery of meaningful analytics to 200+ customers via the cloud
Research Analyst
University of Western Ontario
London, ON
Aug 2022 - Sep 2021
- Designed scalable online experiments (AWS, Amazon MTurk, JavaScript) to expedite data collection for 3 auditory neuroscience projects, yielding 2 publications and a robust cloud infrastructure
- Provided multivariate statistical modeling support for research on neural signatures of epilepsy, laying the groundwork for automated labeling of 10+ hours of multimedia stimuli (Python, GCP)
Computational Cognitive Neuroscientist
University of Western Ontario
London, ON
Aug 2021 - Jul 2018
- Designed and deployed 11 experiments assessing neural speech processing in 300+ research subjects
- Developed a time series analysis toolbox (R, MATLAB) to measure hearing impairment through brain imaging, enhancing diagnostic sensitivity by a factor of 2
Graduate Teaching Assistant
University of Western Ontario
London, ON
Apr 2021 - Sep 2018
- Directed 4 weekly research methods and statistics labs for 100+ undergraduate neuroscience students, leading to a nomination recognizing excellence in teaching and mentorship
Head of Finance
Lorne Park Landscaping
Mississauga, ON
Apr 2018 - Apr 2015
- Cofounded a successful property maintenance partnership servicing 80+ clients; managed payroll, client invoicing, and government remittances, and led a team of 5 employees
Education
BSc (Honours), Mathematics and Statistics
McMaster University
Hamilton, ON
Dec 2023 - Sep 2021
BSc (Honours), Neuroscience
McMaster University
Hamilton, ON
Apr 2018 - Sep 2014
Certifications
Professional Machine Learning Engineer
Google Cloud Platform (GCP)
N/A
Expected May 2025
Selected Projects
datopy (Data Management Python Package) [Documentation]
N/A
N/A
Present - Jan 2024
- Implemented and maintain a package for handling unstructured data, providing a simple interface for data modeling, extraction, validation, and building ETL pipelines (Pandas, PyTest, Pydantic, GitHub Actions)
mlvizz (Machine Learning Data Visualization Package)
N/A
N/A
Present - Apr 2024
- Build Python interfaces for efficient, intuitive model selection, tuning, inspection, and ML pipelines through modular, object-oriented designs with built-in data validation (Scikit-learn, TensorFlow, ArviZ, PyMC, SciPy)
statvizz (Statistical Data Visualization Package)
N/A
N/A
Present - Apr 2024
- Build Python extensions combining Pandas data summarization with Seaborn statistical plotting functionality into a unified, intuitive, fully transparent statistical plotting interface (Pandas, Seaborn, SciPy, Matplotlib)
mathvizz (Mathematical Visualization Package)
N/A
N/A
Present - Jul 2024
- Build Python interfaces for exploring the geometry of machine learning math, including linear maps, derivatives, statistical distributions, and series approximations (SymPy, SciPy, Seaborn, Matplotlib)
One-stop Superstore Dashboard [Tableau]
N/A
N/A
Feb 2025 - Feb 2025
- Designed an interactive dashboard displaying sales in a convenient, unified view, with KPIs and sales logs visualized in both space and time (SQL, Excel)
Zip Explorer (Geospatial Data Analysis Streamlit App) [Web App]
N/A
N/A
Dec 2024 - Nov 2024
- Implemented a data application with interactive visualizations of geospatial data in space and time, across various KPIs and at state and county levels (Streamlit, Pandas, Plotly, Altair, GeoPandas)
autocv (Automated CV-generation R Package) [Documentation]
N/A
N/A
Jul 2024 - May 2024
- Built a data management package automating job application workflows using validated data stores and macro-integrated spreadsheets, resulting in an efficient, extensible workflow (dplyr (R), pkgdown, Excel)