Simulator Overview
This simulator is designed to explore strategies and tactics for large-scale swarm-on-swarm combat, particularly in the area of missile and UAV defense, and with a bias toward small smart munitions.
In the initial phase of the project, we have implemented a simple aerodynamic model that roughly estimates the capabilities of different interceptors and threats, including (augmented) proportional navigation for the interceptors and the ability to evade for the threats. Both interceptors and threats have perfect knowledge of their opponents at a configurable sensor update rate. While the initial positions and velocities of the threats are hard-coded for each engagement scenario, our simulator automatically clusters the threats, predicts their future positions, plans when to automatically launch the interceptors, and assigns a threat to each interceptor to pursue.
Table of Contents
Introduction
To minimize the engagement cost, maximize the terminal speed and agility, and simultaneously defend against multiple threats, we propose using a hierarchical air defense system (ADS), where cheap, possibly unguided "carrier interceptors" carry multiple smaller, intelligent missile interceptors as submunitions. Once the carrier interceptors are close to their intended cluster of targets, they each release the missile interceptors that light their rocket motor and accelerate up to speeds on the order of 1 km/s. The missile interceptors rely on the carrier interceptors’ track data as well as their own radio frequency and/or optical sensors to acquire their targets. Then, they distribute the task of engaging the many threats among themselves.
The command structure is hierarchical for better autonomy and tractability because to successfully defend against hundreds of threats, we intend to recursively cluster and assign them to groups of interceptors. At the bottom of the hierarchy, a single missile interceptor is assigned to puruse a single threat. At the next level of the hierarchy, a carrier interceptor is assigned to defend against a cluster of threats. Above that, a salvo of interceptors is assigned to defend against a swarm of threats, and so on.
The simulator implements the following architecture to tractably defend against a swarm of threats. This hierarchical command structure naturally leads to a recursive architecture. Details for each block are provided below in this page.
The simulator architecture was designed to be as modular as possible with interfaces for each component to facilitate the development of and comparison between new algorithms. Future versions will model the non-idealities of the sensors, considering the sensor range and resolution limits, and implement a realistic communication model between the interceptors. We also plan to explore optimal control and machine learning approaches to launch sequencing, target assignment, trajectory generation, and control.
Agents
The simulator performs a multi-agent simulation between two types of agents: interceptors and threats. The threats will target the static asset, located at the origin of the coordinate system, and the interceptors will defend the asset from the incoming threats. Currently, all interceptors are launched from the origin of the coordinate system as well.
There are two types of interceptors:
- Carrier interceptors: interceptors that carry and dispense other interceptors (e.g., Hydra-70 rockets).
- Missile interceptors: interceptors that pursue threats (e.g., micromissiles).
There are also two types of threats:
- Fixed-wing threats: Pursue their targets using proportional navigation.
- Rotary-wing threats: Pursue their targets using direct linear guidance.
Physics
Agent Model
Each agent is modeled as a point mass, i.e., a 3-DOF body without rotational dynamics. It has instantaneous acceleration in all directions, subject to constraints, because we do not model any sensing, actuation, or airframe delays. As a point mass, each agent is represented by a six-dimensional state vector consisting of the agent's three-dimensional position and three-dimensional velocity. The input to the system is a three-dimensional acceleration vector.
The state vector is given by:
where
The input vector is given by:
where
The nonlinear state evolution equation is given by:
where
Agent Acceleration
The agent acceleration is given by:
Unlike interceptors, threats are not subject to drag or gravity.
The air drag is given by:
where
We do impose some constraints on the acceleration:
- Interceptors can only accelerate normal to their velocity (no thrust during the midcourse phase), i.e.,
. Therefore, for interceptors. - Threats may have some forward acceleration, which is bounded by the maximum forward acceleration specified for each threat type.
- The normal acceleration is constrained by the maximum number of g's that the agent's airframe can pull:
denotes the maximum normal acceleration that the airframe can pull at the reference speed .
Perception
Currently, all agents are equipped with an ideal sensor, one that can peek through the fog of war with no noise and no delay. Sensing is performed within the agent's frame of reference using spherical coordinates, so each sensor output
where
Interceptors are constrained in their sensor update frequency, which is configurable for each interceptor type. As a result, interceptors can change their actuation input at a rate faster than the sensor update frequency. The simulator currently uses a naive guidance filter that simply performs a zero-order hold interpolation on the latest sensor output and applies the latest acceleration to a model of the sensing target until the next sensor output arrives and updates the model's position, velocity, and acceleration. In other words, for
where the initial conditions
Threats are assumed to be omniscient, so they have no frequency constraint on their sensor output and know the positions, velocities, and accelerations of all interceptors at all times.
Clustering
To cluster the threats into more manageable clusters, we use a size and radius-constrained clustering algorithm.
- Size constraint: Maximum 7 threats per cluster
- Radius constraint: Cluster radius must be less than or equal to 1 km The size constraint is necessary, especially at the bottommost level of the hierarchy, because each carrier interceptor only carries up to seven missile interceptors. The radius constraint ensures that when the submunitions are released, each missile interceptor will have sufficient terminal speed to reach all of the threats within the cluster. As each cluster represents an additional launched interceptor, the algorithm should minimize the number of clusters
to minimize the engagement cost.
Agglomerative Clustering
A simple greedy algorithm to satisfy both the size and radius constraints is agglomerative clustering. Each agent to be clustered starts in its own cluster, and while the distance between the two closest clusters is less than the radius constraint, the two closest clusters are merged together as long as the resulting cluster satisfies the size constraint. If the two closest clusters cannot be merged due to the size constraint, the algorithm proceeds to the next two closest clusters. This clustering algorithm continues until no more clusters can be merged.
Constrained k-means Clustering
The simulator also implements a constrained
Minimum Cost Flow Clustering
Instead of modifying the standard
This algorithm has yet to be implemented in the Unity simulator.
Prediction
The simulator currently uses a simple predictor, where the future position of the threat is extrapolated based on the threat's current velocity and acceleration. Future work involves using a probabilistic predictor that accounts for the threat's maneuverability.
Predictor-Planner Convergence
The predictor estimates the position of the threat at some given time-to-intercept while the planner estimates the time-to-intercept given the threat position. In order to find the time-to-intercept and the intercept position, the predictor and the planner must converge to the same time-to-intercept and intercept position.
The simulator achieves predictor-planner convergence by performing a 2-step iterative process that repeatedly updates the time-to-intercept estimate using the planner and then updates the intercept position using the predictor.
The iterative launch planner also performs a few sanity checks to check for corner cases. For example, this algorithm may not always converge to a solution, so to prevent an infinite loop, the algorithm should declare no launch if the distance between the intercept position and the estimated threat position exceed some threshold.
// Check that the intercept position and the predicted position are within some threshold
// distance of each other.
if (Vector3.Distance(interceptPosition, targetPosition) >= InterceptPositionThreshold) {
return LaunchPlan.NoLaunch;
}
Furthermore, the algorithm checks that the threat is indeed heading towards the intercept position.
// Check that the target is moving towards the intercept position. Otherwise, the interceptor
// should wait to be launched.
Vector3 targetToInterceptPosition = interceptPosition - initialState.Position;
Vector3 targetToPredictedPosition = targetPosition - initialState.Position;
if (Vector3.Dot(targetToInterceptPosition, targetToPredictedPosition) < 0) {
return LaunchPlan.NoLaunch;
}
Finally, the interceptor might be launched into the posterior hemisphere away from the threat if the estimated intercept position is behind the launch position. The solution is to delay the launch until the intercept position is in the anterior hemisphere.
// Check that the interceptor is moving towards the target. If the target is moving too fast,
// the interceptor might be launched backwards because the intercept position and the predicted
// position are behind the asset. In this case, the interceptor should wait to be launched.
Vector3 interceptorToInterceptPosition = interceptPosition;
Vector3 threatToPredictedPosition = targetPosition - initialState.Position;
if (Vector3.Dot(interceptorToInterceptPosition, threatToPredictedPosition) > 0) {
return LaunchPlan.NoLaunch;
}
Planning
Launch Angle Planner Lookup Table
Since the dynamics of the interceptors is nonlinear, the simulator implements a lookup table to determine the launch angle at which the carrier interceptor should be launched given the range and altitude of the threat. The planner attempts to maximize the terminal speed to ensure the maximum kill probability.
To generate the lookup table, the nonlinear dynamics of the carrier interceptors along with those of the released missile interceptors were simulated while sweeping through the launch angle, the submunition dispense time, and the submunition motor light time. The maximum speed is achieved if the missile interceptors light their motors directly prior to reaching the intercept position.
The resulting trajectory points were interpolated to generate the lookup table shown below:
In the current version of the simulator, only the un-interpolated samples of the lookup table are used, and the launch angle planner performs a nearest-neighbor interpolation to determine the optimal launch angle given the range and altitude of the threat.
Assignment
Maximum Speed Assignment
Interceptors should strive to maximize their terminal speed to maximize their agility, and the assignment algorithm should optimize for this objective.
Fractional Speed
Interceptors lose speed through the air drag and through the lift-induced drag. The air drag force is proportional to the squared speed of the interceptor, and the fractional speed of the interceptors decreases exponentially as a function of the distance traveled. The lift-induced drag causes the fractional speed to decrease exponentially as a function of the bearing change. During the bearing change, additional air drag is incurred due to the minimum turning radius.
where
The simulator uses a cost-based assignment scheme, where the cost represents the speed loss, i.e., is inversely related to the fractional speed.
Vector3 directionToThreat = threat.GetPosition() - interceptor.GetPosition();
float distanceToThreat = directionToThreat.magnitude;
float angleToThreat =
Vector3.Angle(interceptor.GetVelocity(), directionToThreat) * Mathf.Deg2Rad;
// The fractional speed is the product of the fractional speed after traveling the distance
// and of the fractional speed after turning.
float fractionalSpeed = Mathf.Exp(
-((distanceToThreat + angleToThreat * minTurningRadius) / distanceTimeConstant +
angleToThreat / angleTimeConstant));
float cost = (float)interceptor.GetSpeed() / fractionalSpeed;
Cost-based assignment schemes can be formulated as integer linear programming (ILP) problems that optimize over boolean variables
Cover Assignment
In the cover assignment scheme, the constraints ensure that at least one interceptor is assigned to every threat to guarantee coverage over all threats.
The objective is as follows:
The following constraint ensures that each interceptor only pursues one threat.
The following constraint ensures that at least one interceptor is assigned to every threat.
One issue with the cover assignment is that the resulting distribution of interceptors may be uneven.
Even Assignment
The even assignment guarantees that the interceptors are evenly distributed among the threats.
The following constraint is added to ensure even coverage.
The even assignment is used by the current version of the simulator.
Weighted Even Assignment
Instead of evenly distributing the interceptors among all threats, a more optimal assignment should take into consideration the threat level of each threat. We propose a weighted even assignment, where
The following constraint is modified to reflect the mitigation factors of different threats.
Controls
Proportional Navigation
Using the fact that constant bearing decreasing range (CBDR) leads to a collision, we apply an acceleration normal to the velocity vector to correct for any bearing drift. In the simulator, proportional navigation follows the simple control law:
where
Proportional navigation is effective for non-accelerating targets and guarantees a collision. However, simply using true proportional navigation as a guidance law leads to some undesired behavior when the rate of change of the bearing
- Increasing range: The closing velocity may be negative, i.e., the distance between the agent and its target may actually be increasing. In this case, the agent should apply a maximum normal acceleration in any direction to turn around, but since
, the normal acceleration is minimal. To overcome this issue, the navigation gain is increased significantly if the closing velocity is negative. csharp// Handle the case where the closing velocity is negative. if (closingVelocity < 0) { navigationGain = Mathf.Max(1f, Mathf.Abs(closingVelocity) * 100f); }
- Spiral behavior: If the target is at a near-constant
bearing from the agent, the agent may end up in a spiral encircling the target because is roughly constant and so . To overcome this limitation, the agent will apply a higher navigation gain if the target bearing is within of . csharp// Handle the case of the spiral behavior if the target is at a bearing of 90 degrees +- 10 degrees. if (Mathf.Abs(Mathf.Abs(sensorOutput.position.azimuth) - Mathf.PI / 2) < 0.2f || Mathf.Abs(Mathf.Abs(sensorOutput.position.elevation) - Mathf.PI / 2) < 0.2f) { // Clamp the line-of-sight rate at 0.2 rad/s. float minLosRate = 0.2f; losRateAz = Mathf.Sign(losRateAz) * Mathf.Max(Mathf.Abs(losRateAz), minLosRate); losRateEl = Mathf.Sign(losRateEl) * Mathf.Max(Mathf.Abs(losRateEl), minLosRate); navigationGain = Mathf.Abs(closingVelocity) * 100f; }
Augmented Proportional Navigation
Augmented proportional navigation adds a feedthrough term proportional to the agent’s acceleration:
where
Augmented proportional navigation is equivalent to true proportional navigation if the target is not accelerating.
Ground Avoidance
Gravity only acts on interceptors as the simulator assumes that the threats are able to compensate for gravity. Currently, interceptors do not consider gravity when determining their acceleration input for the next simulation step. As a result, gravity acts as a disturbance to each interceptor's dynamics system and may cause the interceptor to collide with the ground if not accounted for.
Threats may also collide with the ground, especially after having performed an evasion maneuver from pursuing interceptors. The simulator implements a basic ground avoidance algorithm for the threats: If the threat's vertical speed will cause the threat to collide with the ground before the threat will hit the asset, the threat will adjust its vertical velocity to be a linear combination of navigating towards the asset and pulling up away from the ground.
// Calculate the time to ground.
float altitude = transformPosition.y;
float sinkRate = -transformVelocity.y;
float timeToGround = altitude / sinkRate;
// Calculate the time to target.
float distanceToTarget = sensorOutput.position.range;
float timeToTarget = distanceToTarget / transformSpeed;
float groundProximityThreshold = Mathf.Abs(transformVelocity.y) * 5f;
if (sinkRate > 0 && timeToGround < timeToTarget) {
// Evade upwards normal to the velocity.
Vector3 upwardsDirection = Vector3.Cross(transformForward, transformRight);
// For the y-component, interpolate between the calculated acceleration input and the upward acceleration.
float blendFactor = 1 - (altitude / groundProximityThreshold);
accelerationInput.y = Vector3.Lerp(accelerationInput, upwardsDirection * CalculateMaxNormalAcceleration(), blendFactor).y;
}
Threat
Kill Probability
To model non-idealities and noise in the sensing and guidance systems, each threat is assigned a hit radius and a kill probability. Whenever an interceptor is within the hit radius of the threat, the simulator treats it as an intercept, where the interceptor has collided with the threat. However, the simulator does not automatically count the threat as destroyed; instead, the kill probability determines whether the threat is destroyed. Unlike threats, interceptors are always destroyed during an intercept.
Intercept Evasion
When interceptors get too close to their intended target, the threat performs an evasive maneuver to expend the interceptor's speed and remaining energy. During the evasive maneuver, the threat performs the following:
- The threat accelerates to its maximum speed.
- The threat turns away from the incoming interceptor at its maximum normal acceleration and tries to align its velocity vector to be normal to the interceptor's velocity vector. Since the threat applies a normal acceleration, the interceptor must turn too and thus sacrifice speed due to the lift-induced drag.
// Evade the interceptor by aligning the threat's velocity vector to be normal to the interceptor's velocity vector.
Vector3 normalVelocity = Vector3.ProjectOnPlane(transformVelocity, interceptorVelocity);
Vector3 normalAccelerationDirection = Vector3.ProjectOnPlane(normalVelocity, transformVelocity).normalized;
// Turn away from the interceptor.
Vector3 relativePosition = interceptorPosition - transformPosition;
if (Vector3.Dot(relativePosition, normalAccelerationDirection) > 0) {
normalAccelerationDirection *= -1;
}
If the threat is too close to the ground, however, it must ensure that it does not collide with the ground. Therefore, as it approaches the ground, the threat instead performs a linear combination of:
- turning to evade the interceptor, as described above, and
- turning parallel to the ground.
// Avoid evading straight down when near the ground.
float altitude = transformPosition.y;
float sinkRate = -transformVelocity.y;
float groundProximityThreshold = Mathf.Abs(transformVelocity.y) * 5f;
if (sinkRate > 0 && altitude < groundProximityThreshold) {
// Determine evasion direction based on the bearing to the interceptor.
Vector3 toInterceptor = interceptorPosition - transformPosition;
Vector3 rightDirection = Vector3.Cross(Vector3.up, transform.forward);
float angle = Vector3.SignedAngle(transform.forward, toInterceptor, Vector3.up);
// Choose the direction that turns away from the interceptor.
Vector3 bestHorizontalDirection = angle > 0 ? -rightDirection : rightDirection;
// Interpolate between horizontal evasion and upward ground avoidance.
float blendFactor = 1 - (altitude / groundProximityThreshold);
normalAccelerationDirection = Vector3.Lerp(normalAccelerationDirection, bestHorizontalDirection + transform.up * 5f, blendFactor).normalized;
}
Automatic Re-Clustering and Re-Assignment
Since threats can sometimes successfully evade the interceptors, the IADS must be prepared to handle cases of missed intercepts. The simulator currently implements automatic re-clustering of missed or escaping threats as well as automatic re-assignment of missed interceptors.
- Re-clustering: Since interceptors are unpropelled in their midcourse phase, once the threat speed is greater than the interceptor speed, the threat can evade the assigned interceptor. Every 5 seconds, the IADS checks whether any threat has escaped the assigned interceptors and places the escaping threats back into the clustering queue. Every 2 seconds, the IADS will process the clustering queue and, if necessary, launch additional interceptors.
- Re-assignment: Re-assigning a new threat to an interceptor can happen in two ways:
- The assigned threat has already been intercepted by another interceptor.
- The assigned threat has escaped the interceptor and is located behind the interceptor, such that it would be ineffective for the interceptor to perform a 180 degree turn. In these cases, the interceptor will be placed onto a queue containing all interceptors to be assigned a threat. At the end of the update, all interceptors in the queue will be assigned according to the assignment scheme, possibly doubling up on some threats.