If You Order Chipotle Online, You Are Probably Getting Less Food
Comparing weights of orders
R
Miscellaneous
Here’s a quick one. The question posed here is “do you get less food if you order Chipotle online versus in person?” There are plenty of posts going back years claiming that their orders are…
Don’t Evaluate Your Model On a SMOTE Dataset
or: try this one weird trick to increase your AUC
R
Prediction
I recently found a paper called “Advancing Recidivism Prediction for Male Juvenile Offenders: A Machine Learning Approach Applied to Prisoners in Hunan Province”. In it, the authors make use of a very small recidivism data set focusing on youth in Hunan province, which originally appears in a 2017 PLOS ONE article…
The Great American Coffee Taste Test
A deeper dive with Bayes
R
Bayesian Statistics
In October I was lucky enough to participate in popular coffee YouTuber James Hoffmann’s Great American Coffee Taste Test. In short, participants got 4 samples of coffee and were able to brew, taste, and rate them live. One of the interesting parts of this was that the data was freely shared…
Synthetic Controls and Small Areas
A short discussion on ‘microsynthetic’ controls
R
Causal Inference
Andrew Gelman recently covered a mildly controversial paper in criminology that suggested that a policy of “de-prosecution” by the Philadelphia District Attorney’s office resulted in an increase in homicides. This has sparked a lot…
Building an Outlier Ensemble from ‘Scratch’
Part 3: Histogram-based anomaly detector
Anomaly Detection
This is the third part of a 3-part series. In the first two posts I described how I built a principal components analysis anomaly detector and a k-nearest neighbors anomaly detector as components for an ensemble model. This third post will discuss the last piece, which…
Anomaly Detection for Time Series
Applying a PCA anomaly detector
Anomaly Detection
PCA
Time Series
Identifying outliers in time series is one of the more common applications…