Abstract
We propose and test two variations of the Adaptive Large Neighborhood Search (ALNS) meta-heuristic. First, we add time sensitivity to the operator selection scheme to optimize ALNS for both solution quality and runtime: comparatively slow operators receive reduced rewards for finding improvements, so the meta-heuristic is slowed down less by operators that consistently find good solutions but take a long time to do so. Second, we replace the adaptive layer with a learned operator selection policy trained via Deep Q-Learning; the training takes both solution quality and operator runtime into account. We test our algorithms against classic ALNS as well as random operator selection, and we analyze how operator portfolios affect performance. Our chosen problem domain is the Capacitated Vehicle Routing Problem with 100 to 400 customer nodes.
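The time-sensitive reward scheme described above could be sketched as follows. This is a minimal illustration under assumed names and parameters (`Operator`, `SIGMA_IMPROVE`, `DECAY`, and the runtime-discount rule are all illustrative, not taken from the paper):

```python
# Hypothetical sketch of a time-sensitive ALNS reward update.
# SIGMA_IMPROVE, DECAY, and the discount rule are assumptions for illustration.
from dataclasses import dataclass

SIGMA_IMPROVE = 10.0  # base reward for finding an improving solution
DECAY = 0.8           # exponential smoothing factor for operator weights


@dataclass
class Operator:
    weight: float = 1.0
    mean_runtime: float = 1.0  # running average of observed runtimes (s)


def update_weight(op: Operator, improved: bool, runtime: float,
                  avg_runtime_all: float) -> None:
    """Reward an operator, discounting comparatively slow operators.

    An operator whose runtime exceeds the portfolio average receives a
    proportionally reduced reward, so operators that find good solutions
    quickly gain weight faster than equally good but slower ones.
    """
    reward = SIGMA_IMPROVE if improved else 0.0
    # Runtime discount: scale reward by avg/observed runtime, capped at 1.
    discount = min(1.0, avg_runtime_all / max(runtime, 1e-9))
    op.weight = DECAY * op.weight + (1 - DECAY) * reward * discount
    # Track this operator's own runtime for diagnostics.
    op.mean_runtime = 0.9 * op.mean_runtime + 0.1 * runtime
```

Under this rule, two operators that find improvements equally often end up with different weights if one is consistently slower, which is the effect the time-sensitive variant aims for.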
Keywords: adaptive large neighborhood search; vehicle routing; optimization; logistics; deep learning

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2026 Christopher Dudel
