Research Interests

I am developing efficient techniques for using redundancy to reduce delay in cloud storage and computing. My toolbox includes probability and stochastic processes, queueing, and coding theory. More broadly, I am interested in using stochastic modeling and analysis to provide insights into the design of efficient systems.

Selected Projects

Queueing Redundant Tasks to Reduce Delay in Cloud Systems

We model job service in a distributed system with n servers by n identical queues, one at each of the servers. Each incoming job is forked to all or a subset of the n servers, and the job is complete when any k of its tasks finish. The case k=1 corresponds to replication of the job at the servers. This is one of the first works to analyze the response time of queues with redundancy.
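
As a rough illustration of why waiting for only k of n copies helps, the sketch below ignores queueing entirely and looks at a single job in an otherwise empty system. It assumes i.i.d. exponential service times with rate mu, which is an assumption made here for illustration and not a statement of the model analyzed in the project. Under that assumption the expected time until k of the n copies finish is (H_n - H_{n-k})/mu, where H_m is the m-th harmonic number. The function names are hypothetical.

```python
import random

def expected_kth_of_n(n, k, mu=1.0):
    """Mean of the k-th smallest of n i.i.d. Exp(mu) service times."""
    # Standard order-statistics result: sum of 1/(i*mu) for i = n-k+1, ..., n.
    return sum(1.0 / (i * mu) for i in range(n - k + 1, n + 1))

def simulate_kth_of_n(n, k, mu=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity (no queueing, illustrative only)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        times = sorted(rng.expovariate(mu) for _ in range(n))
        total += times[k - 1]  # the job completes when the k-th copy finishes
    return total / trials
```

With mu = 1, expected_kth_of_n(10, 1) is 0.1 while expected_kth_of_n(10, 10) is about 2.93, so in this toy setting waiting for the fastest of ten copies is far quicker than waiting for all ten. The sketch leaves out the extra load that redundancy places on the queues.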

Straggler Replication in Parallel Computing

In large-scale computing where a job has hundreds of parallel tasks, the slowest task becomes the bottleneck. Frameworks such as MapReduce relaunch replicas of straggling tasks to cut the tail latency. We develop insights into the best time to relaunch and the number of replicas to launch, so as to reduce latency without a significant increase in computing cost. For heavy-tailed task execution times, redundancy can reduce latency and cost simultaneously!
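
The latency and cost trade-off can be explored with a small simulation. The sketch below is a toy model under assumptions chosen here, not the policy or analysis from the project: task times are Pareto with shape alpha (drawn with Python's random.paretovariate), a task still running at time tau gets r fresh replicas while the original keeps running, and all copies of a task are cancelled as soon as one finishes. Latency is the finish time of the slowest task and cost is the total machine time. The name run_job and its parameters are hypothetical.

```python
import random

def run_job(task_times, tau, r, alpha, rng):
    """Return (latency, cost) for one job under the toy relaunch policy."""
    latency = 0.0
    cost = 0.0
    for x in task_times:
        if r == 0 or x <= tau:
            # Task finished before the relaunch time, or replication is disabled.
            finish = x
            machine_time = x
        else:
            # Straggler: launch r replicas at time tau, keep the original running.
            fastest_replica = min(rng.paretovariate(alpha) for _ in range(r))
            finish = min(x, tau + fastest_replica)
            # One machine is busy until tau, then r + 1 machines until the first finish.
            machine_time = tau + (r + 1) * (finish - tau)
        latency = max(latency, finish)
        cost += machine_time
    return latency, cost

if __name__ == "__main__":
    rng = random.Random(1)
    alpha, n_tasks = 1.5, 200
    tasks = [rng.paretovariate(alpha) for _ in range(n_tasks)]
    print("no replication:", run_job(tasks, tau=3.0, r=0, alpha=alpha, rng=rng))
    print("one replica at tau=3:", run_job(tasks, tau=3.0, r=1, alpha=alpha, rng=rng))
```

Averaging run_job over many jobs while sweeping tau and r gives an empirical latency versus cost curve for this toy model. Whether cost also drops depends on alpha, tau, and r.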

Streaming Communication

Unlike traditional file transfer, where only the total delay matters, streaming requires fast and in-order delivery of individual packets to the user. We analyze the trade-off between throughput and in-order delivery delay, and in particular how it is affected by the frequency of feedback to the source. We propose a simple combination of repetition and greedy linear coding that achieves a close-to-optimal throughput-delay trade-off.
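
To make the in-order delivery constraint concrete, the following minimal sketch computes when each packet can be handed to the user, given the time slot in which each packet becomes decodable at the receiver. It only illustrates the delay metric and does not implement the proposed coding scheme. The function name and the input format are assumptions made here.

```python
from itertools import accumulate

def in_order_delivery_times(decode_times):
    """decode_times[i] is the slot in which packet i becomes decodable.

    Packet i can be delivered only once packets 0..i are all decodable,
    so its delivery time is the running maximum of the decode times.
    """
    return list(accumulate(decode_times, max))
```

For decode times [1, 4, 3, 5, 2], the in-order delivery times are [1, 4, 4, 5, 5]: packets 2 and 4 are decodable early but must wait for their predecessors.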