Researchers often use simulations when designing new algorithms, since testing ideas in the real world can be both costly and risky. But because it's impossible to capture every detail of a complex system in a simulation, they typically collect a small amount of real data that they replay while simulating the components they want to study.
Known as trace-driven simulation (the small pieces of real data are called traces), this method can yield biased results. This means researchers might unknowingly choose an algorithm that is not the best one they evaluated, and which will perform worse on real data than the simulation predicted.
MIT researchers have developed a new method that eliminates this source of bias in trace-driven simulation. By enabling unbiased trace-driven simulations, the new technique could help researchers design better algorithms for a variety of applications, including improving video quality on the internet and increasing the performance of data processing systems.
The researchers' machine-learning algorithm draws on the principles of causality to learn how the data traces were affected by the behavior of the system. In this way, they can replay the correct, unbiased version of the trace during the simulation.
When compared to a previously developed trace-driven simulator, the researchers' simulation method accurately predicted which newly designed algorithm would be best for video streaming, meaning the one that led to less rebuffering and higher visual quality. Existing simulators that don't account for bias would have pointed researchers to a worse-performing algorithm.
"Data are not the only thing that matters. The story behind how the data are generated and collected is also important. If you want to answer a counterfactual question, you need to know the underlying data generation story so you only intervene on those things that you actually want to simulate," says Arash Nasr-Esfahany, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper on this new technique.
He is joined on the paper by co-lead authors and fellow EECS graduate students Abdullah Alomar and Pouya Hamadanian; recent graduate Anish Agarwal PhD '21; and senior authors Mohammad Alizadeh, an associate professor of electrical engineering and computer science; and Devavrat Shah, the Andrew and Erna Viterbi Professor in EECS and a member of the Institute for Data, Systems, and Society and of the Laboratory for Information and Decision Systems. The research was recently presented at the USENIX Symposium on Networked Systems Design and Implementation.
Specious simulations
The MIT researchers studied trace-driven simulation in the context of video streaming applications.
In video streaming, an adaptive bitrate algorithm continually decides the video quality, or bitrate, to transfer to a device based on real-time data on the user's bandwidth. To test how different adaptive bitrate algorithms affect network performance, researchers can collect real data from users during a video stream for a trace-driven simulation.
They use these traces to simulate what would have happened to network performance had the platform used a different adaptive bitrate algorithm under the same underlying conditions.
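To make this concrete, the sketch below shows roughly how a conventional trace-driven replay might look for adaptive bitrate selection. It is purely illustrative, not the researchers' code: the chunk length, bitrate ladder, and toy policy are hypothetical. What it captures is the key point that the recorded throughput trace is replayed unchanged, no matter what the candidate algorithm decides at each step.

```python
# Minimal sketch of a conventional trace-driven ABR replay (illustrative only).
# The recorded throughput trace is treated as fixed: it is replayed unchanged
# regardless of which bitrate the candidate algorithm picks at each step.

CHUNK_SECONDS = 4            # hypothetical video chunk duration
BITRATES_MBPS = [1, 2.5, 5]  # hypothetical bitrate ladder

def simple_abr(last_throughput_mbps):
    """Toy adaptive bitrate rule: highest bitrate below recently observed throughput."""
    feasible = [b for b in BITRATES_MBPS if b <= last_throughput_mbps]
    return max(feasible) if feasible else BITRATES_MBPS[0]

def replay_trace(throughput_trace_mbps, abr_policy):
    """Replay a recorded throughput trace against a candidate ABR policy."""
    total_stall, last_tp = 0.0, throughput_trace_mbps[0]
    for tp in throughput_trace_mbps:      # the trace itself is never modified
        bitrate = abr_policy(last_tp)     # the candidate algorithm's choice
        download_time = bitrate * CHUNK_SECONDS / tp
        total_stall += max(0.0, download_time - CHUNK_SECONDS)
        last_tp = tp
    return total_stall

# Example: a short recorded trace (Mbps per chunk)
print(replay_trace([3.0, 1.2, 0.8, 4.0], simple_abr))
```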
Researchers have traditionally assumed that trace data are exogenous, meaning they are not affected by factors that are changed during the simulation. They would assume that, during the period when they collected the network performance data, the choices the bitrate adaptation algorithm made did not affect those data.
But this is often a false assumption that results in biases about the behavior of new algorithms, making the simulation invalid, Alizadeh explains.
"We recognized, and others have recognized, that this way of doing simulation can induce errors. But I don't think people necessarily knew how significant those errors could be," he says.
To develop a solution, Alizadeh and his collaborators framed the issue as a causal inference problem. To collect an unbiased trace, one must understand the different causes that affect the observed data. Some causes are intrinsic to a system, while others are affected by the actions being taken.
In the video streaming example, network performance is affected by the choices the bitrate adaptation algorithm made, but it is also affected by intrinsic factors, like network capacity.
"Our task is to disentangle these two effects, to try to understand what aspects of the behavior we are seeing are intrinsic to the system and how much of what we are observing is based on the actions that were taken. If we can disentangle these two effects, then we can do unbiased simulations," he says.
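A toy example of that disentangling idea is sketched below; it is not CausalSim itself. If an observation can be factored into an intrinsic part and an action-dependent part, the intrinsic part can be recovered and recombined with a different action to answer the counterfactual question. Here the action-effect function is simply assumed known; in practice, learning it from data is the hard part.

```python
# Toy illustration (not CausalSim) of splitting an observation into an
# intrinsic factor and an action effect, then replaying the intrinsic part
# under a different action.

def action_effect(bitrate_mbps):
    """Hypothetical effect of the chosen bitrate on measured throughput."""
    return 1.0 / (1.0 + 0.1 * bitrate_mbps)

def recover_intrinsic(observed_mbps, bitrate_mbps):
    """Strip out the action's effect to estimate intrinsic network capacity."""
    return observed_mbps / action_effect(bitrate_mbps)

def counterfactual(observed_mbps, old_bitrate, new_bitrate):
    """What would have been measured had a different bitrate been chosen."""
    return recover_intrinsic(observed_mbps, old_bitrate) * action_effect(new_bitrate)

# A biased replay would reuse the 2.4 Mbps measurement unchanged;
# the counterfactual estimate below differs because the action differs.
print(counterfactual(observed_mbps=2.4, old_bitrate=5, new_bitrate=1))
```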
Learning from data
But researchers often cannot directly observe intrinsic properties. That is where the new tool, called CausalSim, comes in. The algorithm can learn the underlying characteristics of a system using only the trace data.
CausalSim takes trace data that were collected through a randomized control trial, and estimates the underlying functions that produced those data. The model tells the researchers, under the very same underlying conditions that a user experienced, how a new algorithm would change the outcome.
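The sketch below gives a toy intuition for why randomized data helps; it is not the actual CausalSim estimator. Because actions are assigned at random, they are independent of each session's intrinsic conditions, so comparing averages across actions isolates the actions' relative effects, which can then be divided out to estimate each session's intrinsic latent under an assumed multiplicative model.

```python
# Toy sketch of why randomization helps (assumed multiplicative model,
# not the actual CausalSim estimator).
from collections import defaultdict

def estimate_action_effects(sessions):
    """sessions: list of (action, observed_value) pairs from randomized assignment."""
    totals, counts = defaultdict(float), defaultdict(int)
    for action, observed in sessions:
        totals[action] += observed
        counts[action] += 1
    means = {a: totals[a] / counts[a] for a in totals}
    baseline = sum(means.values()) / len(means)
    return {a: m / baseline for a, m in means.items()}  # relative effect per action

# Hypothetical randomized sessions: (algorithm, observed metric)
sessions = [("A", 3.0), ("B", 2.0), ("A", 3.4), ("B", 2.2), ("A", 2.8), ("B", 1.9)]
effects = estimate_action_effects(sessions)
latents = [obs / effects[a] for a, obs in sessions]  # per-session intrinsic estimate
print(effects, latents)
```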
Using a typical trace-driven simulator, bias might lead a researcher to select a worse-performing algorithm, even though the simulation indicates it should be better. CausalSim helps researchers select the best algorithm that was tested.
The MIT researchers observed this in practice. When they used CausalSim to design an improved bitrate adaptation algorithm, it led them to select a new variant that had a stall rate nearly 1.4 times lower than a well-accepted competing algorithm, while achieving the same video quality. The stall rate is the amount of time a user spent rebuffering the video.
By contrast, an expert-designed trace-driven simulator predicted the opposite. It indicated that this new variant should cause a stall rate that was nearly 1.3 times higher. The researchers tested the algorithm on real-world video streaming and confirmed that CausalSim was correct.
"The gains we were getting in the new variant were very close to CausalSim's prediction, while the expert simulator was way off. This is really exciting because this expert-designed simulator has been used in research for the past decade. If CausalSim can so clearly be better than this, who knows what we can do with it?" says Hamadanian.
During a 10-month experiment, CausalSim consistently improved simulation accuracy, leading to algorithms that made about half as many errors as those designed using baseline methods.
In the future, the researchers want to apply CausalSim to situations where randomized control trial data are not available or where it is especially difficult to recover the causal dynamics of the system. They also want to explore how to design and monitor systems to make them more amenable to causal analysis.