Here are some of the more interesting machine learning and data science projects I’ve pursued.
As part of a project on Data Science for Social Good, I ran text clustering algorithms on well-known hateful and toxic subreddits, and collaborated with a cross-disciplinary team of artists, architects and engineers to present the findings at The Cooper Union 2018 End of Year Show. I also wrote a blog post on my results and gave a talk on the data science that went into the project.
Here are some open source software libraries that I’ve made substantial contributions to.
PyMC3 is a popular Python framework for Bayesian modeling and probabilistic machine learning, focusing on Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. I’m a member of the core development team and contribute to the PyMC3 internals and documentation. I also wrote a blog post on tips and tricks for Bayesian modeling using PyMC3.
Pyfolio is a Python library for analyzing the performance and risk of financial portfolios. It is integrated into the Quantopian platform. I developed the risk and performance attribution capabilities of pyfolio, which I describe in more detail in a blog post.
Alphalens is a Python library for analyzing the performance of predictive alpha factors for algorithmic trading. It is integrated into the Quantopian platform. I help develop new features and troubleshoot issues.