While Santa Claus may have a magical sleigh and nine plucky reindeer to help him deliver presents, for companies like FedEx, the optimization problem of efficiently routing holiday packages is so complicated that they often rely on specialized software to find a solution.
This software, called a mixed-integer linear programming (MILP) solver, splits a massive optimization problem into smaller pieces and uses generic algorithms to try to find the best solution. However, the solver could take hours, or even days, to arrive at a solution.
The process is so onerous that a company often must stop the software partway through, accepting a solution that is not ideal but the best that could be generated in a set amount of time.
Researchers from MIT and ETH Zurich used machine learning to speed things up.
They identified a key intermediate step in MILP solvers that has so many potential solutions it takes an enormous amount of time to solve, which slows the entire process. The researchers employed a filtering technique to simplify this step, then used machine learning to find the optimal solution for a specific type of problem.
Their data-driven approach enables a company to use its own data to tailor a general-purpose MILP solver to the problem at hand.
This new technique sped up MILP solvers by between 30 and 70 percent, without any drop in accuracy. One could use this technique to obtain an optimal solution more quickly or, for especially complex problems, a better solution in a tractable amount of time.
This approach could be used wherever MILP solvers are employed, such as by ride-hailing services, electric grid operators, vaccination distributors, or any other entity faced with a thorny resource-allocation problem.
“Sometimes, in a field like optimization, it is very common for people to think of solutions as either purely machine learning or purely classical. I am a firm believer that we want to get the best of both worlds, and this is a really strong instantiation of that hybrid approach,” says senior author Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering (CEE), and a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS).
Wu wrote the paper with co-lead authors Sirui Li, an IDSS graduate student, and Wenbin Ouyang, a CEE graduate student; as well as Max Paulus, a graduate student at ETH Zurich. The research will be presented at the Conference on Neural Information Processing Systems.
Tough to solve
MILP problems have an exponential number of potential solutions. For example, say a traveling salesperson wants to find the shortest path to visit several cities and then return to their city of origin. If there are many cities that could be visited in any order, the number of potential solutions might be greater than the number of atoms in the universe.
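The blowup is easy to quantify. Fixing a home city, each round trip is an ordering of the remaining cities, so the count grows factorially with the number of cities; a short sketch (the function name is illustrative, not from the paper):

```python
import math

def tsp_tour_count(n_cities: int) -> int:
    # Number of distinct round trips from a fixed home city:
    # (n - 1)! orderings of the other cities, halved because a
    # tour and its reverse cover the same distance.
    return math.factorial(n_cities - 1) // 2

print(tsp_tour_count(5))   # 12
print(tsp_tour_count(25))  # already about 3.1e23 candidate tours
```

Even at 25 cities, checking every tour one by one is hopeless, which is why solvers prune the search space instead of enumerating it.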
“These problems are called NP-hard, which means it is very unlikely there is an efficient algorithm to solve them. When the problem is big enough, we can only hope to achieve some suboptimal performance,” Wu explains.
An MILP solver employs an array of techniques and practical tricks that can achieve reasonable solutions in a tractable amount of time.
A typical solver uses a divide-and-conquer approach, first splitting the space of potential solutions into smaller pieces with a technique called branching. Then, the solver employs a technique called cutting to tighten up these smaller pieces so they can be searched faster.
Cutting uses a set of rules that tighten the search space without removing any feasible solutions. These rules are generated by a few dozen algorithms, known as separators, that have been created for different kinds of MILP problems.
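To give a flavor of the divide-and-conquer loop, here is a generic textbook branch-and-bound sketch for a 0/1 knapsack, one of the simplest MILPs; it shows branching and relaxation-based pruning, with cutting planes omitted for brevity. This is an illustration of the general technique, not the researchers’ solver:

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Toy branch-and-bound for 0/1 knapsack. The LP relaxation
    (fractional knapsack) gives the upper bound used to prune
    subtrees, mirroring how MILP solvers bound search-tree nodes."""
    n = len(values)
    # Sort items by value density so the greedy fractional fill
    # is an optimal bound for the LP relaxation.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]

    def bound(i, value, room):
        # Fractional relaxation: fill the remaining room greedily,
        # allowing a fraction of the first item that doesn't fit.
        for j in range(i, n):
            if w[j] <= room:
                room -= w[j]
                value += v[j]
            else:
                return value + v[j] * room / w[j]
        return value

    best = 0
    stack = [(0, 0, capacity)]  # (next item, value so far, remaining room)
    while stack:
        i, value, room = stack.pop()
        best = max(best, value)
        if i == n or bound(i, value, room) <= best:
            continue  # prune: this subtree cannot beat the incumbent
        stack.append((i + 1, value, room))  # branch: skip item i
        if w[i] <= room:
            stack.append((i + 1, value + v[i], room - w[i]))  # branch: take item i
    return best

print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # 220
```

Cutting planes would tighten the `bound` function further, shrinking each subtree before it is searched.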
Wu and her team found that the process of identifying the ideal combination of separator algorithms to use is, in itself, a problem with an exponential number of solutions.
“Separator management is a core part of every solver, but this is an underappreciated aspect of the problem space. One of the contributions of this work is identifying the problem of separator management as a machine learning task in the first place,” she says.
Shrinking the solution space
She and her collaborators devised a filtering mechanism that reduces this separator search space from more than 130,000 potential combinations to around 20 options. This filtering mechanism draws on the principle of diminishing marginal returns, which says that most of the benefit would come from a small set of algorithms, and adding more algorithms won’t bring much extra improvement.
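The paper’s exact filtering procedure isn’t detailed here, but greedy subset selection under diminishing marginal returns can be sketched as follows; the separator names, speedup numbers, and `marginal_gain` scoring function are all hypothetical placeholders:

```python
def filter_separators(candidates, marginal_gain, k=3, min_gain=0.01):
    """Greedy selection under diminishing marginal returns: repeatedly
    add the separator with the largest marginal speedup, stopping once
    k are chosen or the best remaining gain falls below min_gain."""
    chosen = []
    while len(chosen) < k:
        best_sep, best_gain = None, min_gain
        for sep in candidates:
            if sep in chosen:
                continue
            gain = marginal_gain(chosen, sep)
            if gain > best_gain:
                best_sep, best_gain = sep, gain
        if best_sep is None:
            break  # no remaining separator is worth adding
        chosen.append(best_sep)
    return chosen

# Hypothetical per-separator speedups; each added separator overlaps
# with those already chosen, so its marginal contribution shrinks.
base = {"gomory": 0.30, "clique": 0.20, "knapsack": 0.05, "flowcover": 0.02}
gain = lambda chosen, sep: base[sep] * (0.5 ** len(chosen))
print(filter_separators(base, gain, k=3, min_gain=0.02))  # ['gomory', 'clique']
```

The same idea, applied over combinations rather than individual separators, is what lets a six-figure search space collapse to a shortlist of around 20 options.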
Then they use a machine-learning model to pick the best combination of algorithms from among the 20 remaining options.
This model is trained with a dataset specific to the user’s optimization problem, so it learns to choose algorithms that best suit the user’s particular task. Since a company like FedEx has solved routing problems many times before, using real data gleaned from past experience should lead to better solutions than starting from scratch each time.
The model’s iterative learning process, known as contextual bandits, a form of reinforcement learning, involves picking a potential solution, getting feedback on how good it was, and then trying again to find a better solution.
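A minimal epsilon-greedy sketch conveys that loop: pick a configuration, observe a reward (here, a simulated solver speedup), and update the running estimate. The arm names, context label, and reward values are invented for illustration and stand in for the paper’s learned policy:

```python
import random

class EpsilonGreedyBandit:
    """Minimal contextual-bandit sketch (epsilon-greedy). Contexts label
    problem instances; arms are candidate separator combinations."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = arms
        self.epsilon = epsilon
        self.counts = {}  # (context, arm) -> number of pulls
        self.means = {}   # (context, arm) -> running mean reward

    def choose(self, context):
        if random.random() < self.epsilon:
            return random.choice(self.arms)  # explore a random arm
        # Exploit: arm with the best observed mean reward in this context.
        return max(self.arms, key=lambda a: self.means.get((context, a), 0.0))

    def update(self, context, arm, reward):
        key = (context, arm)
        n = self.counts.get(key, 0) + 1
        self.counts[key] = n
        mean = self.means.get(key, 0.0)
        self.means[key] = mean + (reward - mean) / n  # incremental mean

# Toy loop: the reward is a noisy, simulated speedup for the chosen arm.
random.seed(0)
bandit = EpsilonGreedyBandit(arms=["combo_A", "combo_B", "combo_C"])
true_speedup = {"combo_A": 0.3, "combo_B": 0.7, "combo_C": 0.5}
for _ in range(500):
    arm = bandit.choose("routing")
    bandit.update("routing", arm, true_speedup[arm] + random.gauss(0, 0.05))

# With enough trials, the best empirical mean should land on combo_B.
best = max(bandit.arms, key=lambda a: bandit.means.get(("routing", a), 0.0))
print(best)
```

The appeal of the bandit framing is that feedback is cheap: every solve the company runs anyway doubles as a training signal for the next one.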
This data-driven approach accelerated MILP solvers by between 30 and 70 percent without any drop in accuracy. Moreover, the speedup was similar when they applied it to a simpler, open-source solver and a more powerful, commercial solver.
In the future, Wu and her collaborators want to apply this approach to even more complex MILP problems, where gathering labeled data to train the model could be especially difficult. Perhaps they can train the model on a smaller dataset and then tweak it to tackle a much larger optimization problem, she says. The researchers are also interested in interpreting the learned model to better understand the effectiveness of different separator algorithms.
This research is supported, in part, by Mathworks, the National Science Foundation (NSF), the MIT Amazon Science Hub, and MIT’s Research Support Committee.