Airlifts demand the delivery of large sets of cargo into areas of need under tight deadlines. Yet, there are many obstacles preventing timely delivery. Airports have limited capacity to process airplanes, thus limiting throughput and potentially creating bottlenecks. Weather disruptions can cause delays or force airplanes to re-route. Unexpected cargo may be staged for an urgent delivery.

This competition challenges participants to design agents that can plan and execute an airlift operation. Quick decision-making is needed to rapidly adjust plans in the face of disruptions along the delivery routes. The decision-maker will also need to incorporate new cargo delivery requests that appear during the episode. The primary objective is to meet the specified deadlines, with a secondary goal of minimizing cost. Solutions can incorporate machine learning, optimization, path planning heuristics, or any other technique.

This page summarizes key aspects of the competition. To view more details, select a section from the menu on the left.


The air network consists of a graph, where nodes are capacity-constrained airports, and edges are routes with an associated cost and time-of-flight. Each cargo item is stored at a node, and must be picked up by agents (airplanes) and delivered to a destination node. Different airplane models can have different route networks. In fact, the network for a specific model may be disconnected, meaning that some airplanes may not be able to reach all airports. Time is needed after an airplane lands to refuel and to load/unload cargo, taking up precious processing capacity at the airport. There are two delivery deadlines: a soft deadline by which the cargo is desired, and a hard deadline after which the delivery is considered missed (with a heavy penalty).
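The per-model route networks described above can be illustrated with a small sketch. The airport names, flight times, and the `fastest_route` helper are hypothetical and are not part of the simulator. The sketch uses only the Python standard library and shows how one airplane model's network may leave some airports unreachable.

```python
import heapq

# Hypothetical route networks, one per airplane model.
# Each entry maps an airport to {neighbor: time_of_flight}.
ROUTES = {
    "large": {"A": {"B": 4}, "B": {"A": 4}, "C": {}},
    "light": {"A": {}, "B": {"C": 2}, "C": {"B": 2}},
}

def fastest_route(network, source, target):
    """Return (total_time, path) using Dijkstra's algorithm, or None if unreachable."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        elapsed, airport, path = heapq.heappop(queue)
        if airport == target:
            return elapsed, path
        if airport in visited:
            continue
        visited.add(airport)
        for neighbor, flight_time in network[airport].items():
            if neighbor not in visited:
                heapq.heappush(queue, (elapsed + flight_time, neighbor, path + [neighbor]))
    return None

print(fastest_route(ROUTES["large"], "A", "B"))  # (4, ['A', 'B'])
print(fastest_route(ROUTES["large"], "A", "C"))  # None: C is not on the large model's network
```

The sketch ignores processing time and airport capacity, both of which the real environment also accounts for.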

A small example scenario is shown below. Airports (small squares) are shown with connecting routes (white lines). Cargo is staged at three airports in the pickup area (green rectangle). Each is designated for delivery at a specific airport in the area of need (yellow circle). The agent algorithm guides four airplanes through the network to pick up and deliver the cargo. Routes undergo random disruptions, requiring airplanes to either wait for the disruption to clear, or follow a different route.

[Figure: Example scenario]

For more details, see the Model documentation.


The simulation environment is written in Python and follows the PettingZoo multi-agent reinforcement learning interface. An agent issues an action for each airplane, indicating which cargo to load/unload at an airport and which airport to fly to next. The agents observe a number of state variables, including airplane status, cargo locations, and route availability. A NetworkX object is provided for each airplane, allowing the agent to easily plan paths using existing library methods. A reward signal generated by the environment penalizes late deliveries, missed deliveries, and movement.

A minimal agent code example follows.

from airlift.envs import AirliftEnv, AirliftWorldGenerator, ActionHelper

# Agent algorithm goes here
def policy(obs):
    actions = ActionHelper.sample_valid_actions(obs)
    return actions

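# Build the environment from a default world generator and start an episode
env = AirliftEnv(AirliftWorldGenerator())
obs = env.reset()
while True:
    actions = policy(obs)
    obs, rewards, dones, infos = env.step(actions)
    # Stop once every agent reports that it is done
    if all(dones.values()):
        break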

For more details, see the Interface documentation.

Scoring and Evaluation

Each episode is assigned a score based on missed deliveries, late deliveries, and total flight cost. This score is normalized against baseline algorithms: participants will receive a score of 0 if they only perform as well as a random agent, and will receive a score of 1 if they perform as well as a simple “shortest path” baseline algorithm. Scores greater than 1 indicate that the algorithm is exceeding the performance of the baselines.
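One way to read the normalization above is as a linear rescaling between the two baselines. The formula and the function name `normalized_score` are assumptions made for illustration, and the exact definition is given in the Evaluation documentation. The raw scores are made-up numbers, with lower values taken to be better.

```python
def normalized_score(raw, random_raw, baseline_raw):
    """Linearly rescale a raw episode score (lower raw score is better).

    Assumed convention: 0 matches the random agent and 1 matches the
    shortest-path baseline.
    """
    if random_raw == baseline_raw:
        raise ValueError("baselines must differ to define a scale")
    return (random_raw - raw) / (random_raw - baseline_raw)

# Made-up raw scores (for example, accumulated penalties plus flight cost).
print(normalized_score(raw=1000.0, random_raw=1000.0, baseline_raw=400.0))  # 0.0
print(normalized_score(raw=400.0, random_raw=1000.0, baseline_raw=400.0))   # 1.0
print(normalized_score(raw=250.0, random_raw=1000.0, baseline_raw=400.0))   # 1.25
```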

An algorithm will be evaluated over a number of episode scenarios. Scenarios are generated according to a random generative model, with scenarios becoming progressively more difficult. In the beginning stages, there will be one type of airplane which can reach all airports in the air network. Later stages will have specialized airplane types: large aircraft can carry large cargo loads over long distances, but cannot land at small airports located in the drop-off area. Instead, they will need to leave cargo at intermediate airports where light aircraft can retrieve the cargo and complete the delivery.
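The handoff between large and light aircraft can be sketched as choosing a transfer airport that both types can reach. The airport names, flight times, and the `best_transfer_airport` helper are hypothetical. A real agent would also need to account for processing time, airport capacity, and deadlines.

```python
# Hypothetical flight times (hours). Large aircraft fly from the pickup airport
# to hubs; light aircraft fly from hubs to the small destination airport.
large_time_from_pickup = {"HUB1": 6.0, "HUB2": 5.0}
light_time_to_destination = {"HUB2": 3.5, "HUB1": 2.0, "HUB3": 1.0}

def best_transfer_airport(first_leg, second_leg):
    """Pick the airport usable by both aircraft types with the least total flight time."""
    shared = first_leg.keys() & second_leg.keys()
    if not shared:
        return None  # no airport where the cargo can change airplanes
    return min(shared, key=lambda airport: first_leg[airport] + second_leg[airport])

print(best_transfer_airport(large_time_from_pickup, light_time_to_destination))  # HUB1
```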

The evaluation will proceed until either:

  1. the percentage of missed deliveries exceeds a preset threshold, or

  2. a time limit is reached.

The overall score will be the sum of the normalized scores over all episodes. In addition to performing well on individual episodes, algorithms can also increase their score by completing more episodes.
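The stopping rules and the summed score can be sketched as a simple loop. This is an illustrative reading, not the actual evaluator: the function name, the tuple layout, and the choices to check the missed-delivery threshold per episode and to count the episode that trips it are all assumptions.

```python
def overall_score(episodes, max_missed_fraction, time_limit):
    """Sum normalized scores until a stopping condition is met.

    `episodes` is an iterable of (normalized_score, missed_fraction, seconds)
    tuples, a hypothetical layout used only for this sketch.
    """
    total = 0.0
    elapsed = 0.0
    completed = 0
    for score, missed_fraction, seconds in episodes:
        if elapsed + seconds > time_limit:
            break  # time limit reached before this episode could finish
        elapsed += seconds
        total += score
        completed += 1
        if missed_fraction > max_missed_fraction:
            break  # too many missed deliveries: evaluation stops here
    return total, completed

episodes = [(1.5, 0.0, 100), (1.0, 0.1, 150), (0.5, 0.6, 200), (1.0, 0.0, 100)]
print(overall_score(episodes, max_missed_fraction=0.5, time_limit=1000))  # (3.0, 3)
print(overall_score(episodes, max_missed_fraction=0.5, time_limit=300))   # (2.5, 2)
```

Under these assumptions, a faster algorithm completes more episodes before the time limit and can accumulate a higher total.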

For more details, see the Evaluation documentation.


The winners of Phase 1 will be recognized, but the winners of the overall competition will be determined by Phase 2. Participants do not have to submit to Phase 1 in order to participate in Phase 2.

  • Warm Up Phase: November 9th, 2023.

  • Competition Phase 1 Begins: November 29th, 2023 (participants score themselves against the public test scenarios). One submission allowed per day, seven per week. A set of test scenarios is available here.

  • Competition Phase 2 Begins: January 12th, 2024 (participants score themselves against the public test scenarios). One submission allowed per day, seven per week. A set of Round 2 test scenarios is available here.

  • Competition Phase Ends: March 1st, 2024

  • Results Announcement: March 16th, 2024

The competition phase will consist of two rounds. Round 1 will only have one airplane type, while Round 2 will have multiple airplane types. This means that some agents may not be able to make the complete delivery on their own (there may not be a path in their network), and will have to work with other agents to complete it.

Student Recognition

Students are encouraged to participate in this challenge. Top student submissions will be recognized. If you are a student, please use your university e-mail address when registering at CodaLab.



  1. Carmen Chiu, Adis Delanovic, Jill Platts, Alexa Loy, and Andre Beckus. A methodology for flattening the command and control problem space. In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications IV. International Society for Optics and Photonics, SPIE, 2022. URL: https://doi.org/10.1117/12.2615180.

  2. Steven F. Baker, David P. Morton, Richard E. Rosenthal, and Laura Melody Williams. Optimizing military airlift. Operations Research, 50(4):582–602, 2002. URL: https://doi.org/10.1287/opre.50.4.582.2864.

  3. Dimitris Bertsimas, Allison Chang, Velibor V. Mišić, and Nishanth Mundru. The airlift planning problem. Transportation Science, 53(3):773–795, 2019. URL: https://doi.org/10.1287/trsc.2018.0847.

  4. Gerald G. Brown, W. Matthew Carlyle, Robert F. Dell, and John W. Brau. Optimizing intratheater military airlift in Iraq and Afghanistan. Military Operations Research, 18(3):35–52, 2013. URL: http://hdl.handle.net/10945/38129.