Unemployment benefits during COVID

Written on August 7, 2020

Notes from two articles about unemployment insurance (UI) benefits during COVID-19 in the US. Both articles center on the fact that Congress paid an extra flat $600/week in UI benefits on top of state UI benefits. As a result, some workers received more in total UI benefits than their previous salary.

How frequent is this phenomenon? Does it discourage returning to work? Is there a better way?

Future data jobs

Written on January 30, 2019

This is a list of data-related jobs I think are likely to develop in the next 5 to 10 years.

Hay Day

Written on November 30, 2018

Hay Day is a mobile free-to-play farming game released in summer 2012 by Supercell. These are notes from 2015, when I played the game for a couple of months. I first go over the core mechanic of the game: fulfilling orders. Then I cover two of its very constrained, yet elegantly designed, multiplayer interactions: the Daily Dirt newspaper and neighborhoods. Finally, I mention some less elegant parts of the game: manipulative monetization and grinding.

PPS sampling in Redshift SQL

Written on October 20, 2018

Sampling is often necessary to run computationally heavy algorithms on large populations. Sometimes, we want to draw some elements more frequently than others. For example, we could draw US states proportionally to their population. In this case, we could use probability-proportional-to-size (PPS) sampling. This post provides an efficient SQL script to sample any number of weighted elements via PPS sampling.
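
To make the idea concrete, here is a minimal Python sketch of PPS sampling with replacement, using cumulative weights and a binary search. It is not the SQL script from the post: the function name `pps_sample` and the weights are hypothetical, chosen only for illustration.

```python
import bisect
import itertools
import random


def pps_sample(weights, k, rng=None):
    """Draw k keys with replacement; each draw picks a key with
    probability proportional to its weight (illustrative sketch)."""
    rng = rng or random.Random()
    keys = list(weights)
    # Running totals: key i owns the interval [cumulative[i-1], cumulative[i]).
    cumulative = list(itertools.accumulate(weights[key] for key in keys))
    total = cumulative[-1]
    draws = []
    for _ in range(k):
        u = rng.random() * total
        # Find the first interval whose upper bound is strictly above u.
        idx = bisect.bisect_right(cumulative, u)
        # Guard against floating-point rounding pushing u up to total.
        draws.append(keys[min(idx, len(keys) - 1)])
    return draws


# Illustrative weights only, not real population figures.
example_weights = {"CA": 39, "TX": 29, "WY": 1}
sample = pps_sample(example_weights, k=5, rng=random.Random(42))
print(sample)
```

Heavier keys own wider intervals of the cumulative range, so a uniform draw lands on them more often. Sampling without replacement needs a different scheme and is not covered by this sketch.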

Redshift joins and distkeys

Written on August 2, 2018

Redshift is a distributed columnar database. Such databases scale easily by distributing data between nodes in very specific ways. This typically comes with a cost: joins are expensive, if not impossible. Redshift is no exception, although it lets you declare a distribution key (distkey) that determines how records are spread across nodes. This way, joins on distkeys do not require records to be shuffled between nodes, and are relatively cheap. Surprisingly, however, joins on the distkey plus another column currently require a shuffle.
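
As a rough illustration of why a join on the distribution key avoids a shuffle, the following Python sketch assigns rows to nodes by hashing the key, then joins node by node. Everything here is an assumption made for illustration: the hash function, the node count, and the names `node_for`, `distribute`, and `colocated_join` are hypothetical and do not reflect Redshift internals.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(key):
    """Map a distribution key to a node with a stable hash
    (illustrative only, not Redshift's actual hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_NODES


def distribute(rows, key_column):
    """Place each row (a dict) on the node chosen by its key value."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column])].append(row)
    return nodes


def colocated_join(left_nodes, right_nodes, key_column):
    """Join node by node: rows sharing a key already sit on the same
    node, so no row has to move between nodes."""
    joined = []
    for node, left_rows in left_nodes.items():
        right_by_key = defaultdict(list)
        for row in right_nodes.get(node, []):
            right_by_key[row[key_column]].append(row)
        for left in left_rows:
            for right in right_by_key[left[key_column]]:
                joined.append({**left, **right})
    return joined


users = [{"user_id": i, "name": f"user{i}"} for i in range(10)]
orders = [{"user_id": i % 10, "order_id": i} for i in range(30)]
result = colocated_join(
    distribute(users, "user_id"), distribute(orders, "user_id"), "user_id"
)
print(len(result))
```

If the two tables were distributed on different columns, matching rows could sit on different nodes, and one side would have to be redistributed before joining.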

Tracking Newsfeed activity

Written on August 1, 2018

Let’s imagine a user browsing Facebook’s Newsfeed. This post describes how I would track basic Newsfeed user activity, and design supporting schemas in Redshift.

Summarizing funnel duration via percentiles

Written on July 19, 2018

App users often go through processes involving several steps, like browsing > adding to cart > checking out. We want to measure the typical duration of each step and of the whole process. A naive average (dividing sum by count) is skewed by users who take an abnormally long time to go through a step. What about using percentiles instead?
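
A small Python sketch shows how a single slow user distorts the average while the median stays put. The durations are made-up numbers, and the helper name `summarize` is hypothetical.

```python
import statistics


def summarize(durations):
    """Return the mean, median (p50), and 90th percentile of step durations."""
    # quantiles(n=100) returns 99 cut points; index 89 is the 90th percentile.
    cut_points = statistics.quantiles(durations, n=100, method="inclusive")
    return {
        "mean": statistics.mean(durations),
        "p50": statistics.median(durations),
        "p90": cut_points[89],
    }


# Hypothetical step durations in seconds; one user left the app open for an hour.
step_durations = [30, 35, 40, 42, 45, 50, 55, 60, 65, 3600]
summary = summarize(step_durations)
print(summary)
```

Here the mean lands far above what nine of the ten users experienced, while the median stays close to the typical duration. High percentiles such as p90 remain sensitive to the tail, which can be useful when that tail is what we want to monitor.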

Opaque mechanics in Shop Heroes

Written on February 10, 2016

Shop Heroes is an item-shop game released on mobile and Kongregate in mid-2014 by CloudCade. The gameplay consists of crafting equipment for heroes, who then go on quests to find materials, which are used to craft more equipment. Players can gather in Towns to share the costs incurred by leveling up buildings, which unlock more heroes and quests.
