It’s been a productive summer/fall, and as a result we have three new papers published/in submission.

Buffer Sizing and Video QoE Measurements at Netflix

I spent the summer working at Netflix, measuring how the sizes of router buffers affect video performance. We learned a whole bunch about how routers behave in practice, and this paper is the result.

This is a classic networking area that my advisor Nick has done a lot of work in. When too many packets arrive at a router at once, the router puts them in a buffer so it can send them later when it gets a chance. If the buffer is full, it throws packets away. The question is: what size should the buffer be? Too large, and packets might wait around a long time to be sent. Too small, and too many packets get thrown away.
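
To make the tradeoff concrete, here's a toy single-FIFO-queue simulation (my own sketch with made-up packets-per-tick arrivals, not anything from the paper): a slightly overloaded link drops more packets when the buffer is small, and builds up long queueing delays when the buffer is large.

```python
from collections import deque
import random

def simulate_fifo(buffer_pkts, ticks=100_000, burst_prob=0.52, seed=0):
    """Toy single FIFO queue: up to 2 packets arrive per tick (~1.04 on
    average), and the link sends 1 packet per tick. Returns the fraction
    of packets dropped and the mean queueing delay in ticks."""
    rng = random.Random(seed)
    queue = deque()              # arrival tick of each buffered packet
    arrivals = drops = 0
    delays = []
    for t in range(ticks):
        for _ in range(2):       # bursty, slightly overloaded arrivals
            if rng.random() < burst_prob:
                arrivals += 1
                if len(queue) < buffer_pkts:
                    queue.append(t)
                else:
                    drops += 1   # buffer full: packet is thrown away
        if queue:                # the link sends one packet this tick
            delays.append(t - queue.popleft())
    return drops / arrivals, sum(delays) / len(delays)

for b in (4, 16, 64, 256):
    loss, delay = simulate_fifo(b)
    print(f"buffer={b:4d} pkts  loss={loss:.3f}  mean delay={delay:6.1f} ticks")
```

Real routers, and real TCP reacting to the loss and delay, are far messier than this toy, which is what the measurements in the paper get at.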

In this work we set up a nice, controlled experiment, tried a bunch of different router buffer sizes, and observed what happened. We found all kinds of weird things: queues that never drained, TCP going truly wild under large queueing delay, and buggy router scheduling algorithms. Check it out!

Coded trace reconstruction in a constant number of traces

This paper was a fun side project with Ray and Josh. We have a coding theory reading group, and last fall I presented some papers on trace reconstruction. Trace reconstruction is this cool problem where you get independent, random subsequences of some initial string (each trace is the string with every character independently deleted with some probability), and the goal is to recover the initial string. For instance, maybe the string is “wow cool paper” and the random traces you get are “ww cool ppr,” “wow cool,” and “cool paper,” which together let you figure out the initial string.
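
If it helps to see the model in code, here's a tiny sketch of the usual deletion-channel setup (my illustration with an assumed deletion probability q, not anything from the paper): each character of the string is deleted independently with probability q, and whatever survives, in order, is one trace.

```python
import random

def trace(s, q=0.3, rng=random):
    """One pass through a deletion channel: each character of s is kept
    independently with probability 1 - q; the survivors, in order, form
    a single trace (a random subsequence of s)."""
    return "".join(c for c in s if rng.random() >= q)

random.seed(7)
original = "wow cool paper"
print([trace(original) for _ in range(3)])
# prints three random subsequences, e.g. something like
# ['ww ool papr', 'wow col aper', 'w coolpaper']
```

The reconstruction question is how many independent traces like these you need before you can pin down the original string.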

In the spring we read this very interesting paper on Coded Trace Reconstruction. In regular trace reconstruction, the string is either (1) an arbitrary, worst-case string or (2) a uniformly random string, and there’s this huge exponential gap between the best known upper and lower bounds for both. The exciting thing to us about the Coded Trace Reconstruction paper is that, by adding a little redundancy to the string first, it beat the number of traces required for a uniformly random string. We were curious how far we could push this idea.

We wound up showing a few very cool results:

  1. We give a construction of high-rate binary codes that can recover the initial string from a constant number of traces. It’s kinda surprising that this is possible, since even the uniformly random string case requires a polylogarithmic number of traces.
  2. We show theorems that take upper and lower bounds for trace reconstruction and turn them, in a black-box way, into corresponding results for coded trace reconstruction.
  3. We show that if you are a bit generous and give us a constant-sized alphabet instead of a binary alphabet, we can get matching upper and lower bounds for coded trace reconstruction. I found this particularly exciting because usually there’s this horrible exponential gap in any trace reconstruction problem, and here we were able to avoid that.

The paper is on arXiv.

On estimating the number of flows

This is a little paper we wrote for the Stanford Buffer Sizing Workshop. There’s a bunch of existing work saying that the buffer a router needs depends on the number of flows using that buffer. Unfortunately, none of that work says how to figure out the number of flows using a router. It turns out not to be easy, and we spent some time thinking about possible definitions, how to estimate them, and what this all means in practice. I think the conclusion is that if you design a new congestion control algorithm, it would save everyone a lot of trouble if you could make it so the buffer requirements didn’t depend on the number of flows, whatever that means.
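
For a sense of why the number of flows matters so much, the best-known rule of thumb from that line of work is B ≈ C · RTT / √n. Here's a toy calculation (my own illustration of that classic rule, not something from our paper) for a hypothetical link:

```python
def buffer_bytes(capacity_bps, rtt_s, n_flows):
    """Classic buffer-sizing rule of thumb: B ≈ C * RTT / sqrt(n), where C is
    the link capacity (bits/s), RTT the round-trip time (s), and n the number
    of long-lived flows sharing the link. Returns the buffer size in bytes."""
    return capacity_bps * rtt_s / (8 * n_flows ** 0.5)

# A hypothetical 10 Gb/s link with a 100 ms round-trip time:
for n in (1, 100, 10_000):
    print(f"n = {n:6d} flows  ->  buffer ≈ {buffer_bytes(10e9, 0.1, n) / 1e6:8.1f} MB")
```

The answers range from 125 MB down to just over 1 MB, so getting n wrong by a couple orders of magnitude really matters; and, of course, the snippet just takes n_flows as an input, which is exactly the number the paper argues is hard to pin down.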