# Archive

## Cookbook - Managing the Bias-Variance Tradeoff

The bias-variance tradeoff is a unique result in machine learning: it sits on extremely solid theoretical foundations, and has a ludicrously far-reaching scope of applicability.

## Understanding Hate Speech on Reddit through Text Clustering

Have you heard of /r/TheRedPill? It’s an online forum (a subreddit, but I’ll explain that later) where people (usually men) espouse an ideology predicated entirely on gender. ‘Swallowers of the red pill’, as they call themselve...

## Why Latent Dirichlet Allocation Sucks

As I learn more and more about data science and machine learning, I’ve noticed that a lot of resources out there go something like this…

## Fruit Loops and Learning - The LUPI Paradigm and SVM+

Here’s a short story you might know: you have a black box, whose name is Machine Learning Algorithm. It’s got two modes: training mode and testing mode.

## Linear Discriminant Analysis for Starters

Linear discriminant analysis (commonly abbreviated to LDA, and not to be confused with latent Dirichlet allocation) is a very common dimensionality reduction technique for classification problems.

## Linear Regression for Starters

I was recently inspired by the following PyData London talk by Vincent Warmerdam. It’s a great talk: he has a lot of great tricks to make simple, small-brain models really work wonders, and he emphasizes thinking about your pr...

## Cookbook - Bayesian Modelling with PyMC3

Recently I’ve started using PyMC3 for Bayesian modelling, and it’s an amazing piece of software! The API only exposes as much of the heavy machinery of MCMC as you need - by which I mean, just the pm.sample() method.

## Portfolio Risk Analytics and Performance Attribution with Pyfolio

I was lucky enough to have the chance to intern at Quantopian this summer. During that time I contributed some exciting stuff to their open-source portfolio analytics engine, pyfolio, and learnt a truckload of stuff while doing...
