Motivation
Unmanned aerial vehicles (UAVs) have gained significant roles and aided
the military on the battlefield over the last decade by performing missions
such as reconnaissance, surveillance, and target tracking under human
supervision. These vehicles are now being considered for more complex
missions that require increased decision making to operate in cluttered
environments. This approach attempts to alleviate the human workload by
designing a more autonomous vehicle. One avenue being explored is the use
of vision to sense the environment: decisions are made through image
processing according to the type of mission, and the resulting commands
are passed to the control system to navigate the aircraft.
With UAVs becoming more prevalent in the aerospace community, researchers
are striving to extend their capabilities while making them more reliable.
The ability to operate in varied surroundings and at long range is
important for military applications. For example, imagine a scenario where
a UAV is tasked with a surveillance mission that requires a long-range flight
followed by passage through a city where targets of interest are located. The
first step in this mission is completing the long-range flight to the desired
location, which may require an aerial refueling maneuver. The UAV uses sensor
fusion of both INS and vision measurements to complete this task autonomously.
The mission continues as the UAV proceeds toward the desired city. Ultimately,
the UAV arrives at the desired location and maneuvers through the city to track
and estimate the target's position and orientation using visual measurements.
This type of mission requires both accurate knowledge of the aircraft states
and reliable image processing for state estimation.
Problem Statement
The problem addressed in this work is the estimation of the states of a target
undergoing unknown stochastic motion, for autonomous systems using a moving
monocular camera. The estimation of 3-dimensional points in space given two
perspective views relies heavily on camera configuration, accurate camera
calibration, and perfect image processing; however, practical camera systems
involve limitations on configuration, correspondence issues, and significant
uncertainties and noise. Therefore, in order to estimate the states of a moving
object in the presence of uncertainty, several key issues have to be addressed.
These technical challenges include:
- segmenting moving targets from stationary targets within the scene
- classifying moving targets into deterministic and stochastic motions
- coupling the vehicle dynamics into the sensor observations (i.e., images)
- formulating the homography equations between a moving camera and the
  viewable targets
- propagating the effects of uncertainty through the state estimation
  equations
- establishing confidence bounds on target state estimation
State Estimation
An image processing technique used to estimate the relative translation and
rotation in 3D space between two consecutive image frames is called a homography.
The 3D reconstruction of a moving target's motion is determined using the known
motion of the camera and a moving reference frame. Therefore, combining vision
with traditional sensors such as a global positioning system (GPS) and an
inertial measurement unit (IMU) facilitates the problem of estimating the states
of a moving target with a single-camera configuration.
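As a minimal illustration of this homography relationship (not the estimation scheme developed in this work), the sketch below, written in Python rather than the simulation's Matlab, builds the Euclidean homography H = R + (t/d) n^T from an assumed, known camera motion and verifies that it transfers the normalized image coordinates of a point on the plane between the two views. All numeric values are hypothetical.

```python
import numpy as np

def euclidean_homography(R, t, n, d):
    """Euclidean homography H = R + (t / d) n^T mapping normalized
    coordinates of points on the plane n^T X = d from frame 1 to frame 2.
    Inputs are hypothetical, assumed-known camera motion and plane."""
    return R + np.outer(t / d, n)

# Example: camera rotates 10 deg about its y-axis and translates along x.
theta = np.radians(10.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.0])
n = np.array([0.0, 0.0, 1.0])   # unit plane normal, frame 1
d = 5.0                          # distance from frame-1 origin to plane

H = euclidean_homography(R, t, n, d)

# A point on the plane, expressed in frame 1 (satisfies n^T X1 = d).
X1 = np.array([1.0, -0.5, 5.0])
X2 = R @ X1 + t                  # same point in frame 2
m1 = X1 / X1[2]                  # normalized coordinates, frame 1
m2 = X2 / X2[2]                  # normalized coordinates, frame 2

# H transfers m1 to m2 up to scale.
m2_est = H @ m1
m2_est /= m2_est[2]
print(np.allclose(m2, m2_est))   # True
```

The transfer holds because n^T X1 = d implies t = (t/d) n^T X1, so X2 = R X1 + t = H X1, and normalized coordinates are defined only up to scale.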
In general, a single moving camera alone is unable to reconstruct a 3D scene
containing moving objects. This restriction is due to the loss of the epipolar
constraint: the plane formed by the position vectors of the target and
the translation vector is no longer valid. The contribution of this work establishes
the Euclidean homography between the target and the reference object from a single
image through transformations that keep the reference object stationary in the
image across two frames. By relating this information to known measurements from
GPS and the IMU, the target's motion can be reconstructed regardless of its dynamics.
Several assumptions are required for this approach to work: the objects
must remain in the image at all times, the feature point distances must be known,
and the motion of both the camera and the reference object must be known. These
relationships can then be related back to the vehicle's frame through a known
transformation and used in control strategies that perform either homing or
docking maneuvers.
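One way to see how the IMU measurements complement the homography is that, for a unit plane normal n, the Euclidean homography H = R + (t/d) n^T gives (H - R) n = t/d, so the translation can be isolated once the rotation R is supplied by the IMU and the plane parameters (n, d) are assumed known. The sketch below demonstrates this identity numerically; the values and the use of Python are illustrative assumptions, not the method's actual implementation.

```python
import numpy as np

# Hypothetical "true" motion and plane used to construct H.
theta = np.radians(5.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t_true = np.array([0.2, -0.1, 0.05])
n = np.array([0.0, 0.0, 1.0])    # unit plane normal
d = 4.0                          # plane distance

H = R + np.outer(t_true / d, n)  # Euclidean homography

# With R known (e.g., from the IMU) and (n, d) assumed known:
# (H - R) n = (t / d) n^T n = t / d, so the translation follows directly.
t_est = d * (H - R) @ n
print(np.allclose(t_est, t_true))  # True
```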
Simulation and Results
A simulation was executed in Matlab and replayed in a virtual environment to
test the state estimation algorithm. The setup consisted of three vehicles:
a UAV flying above with a mounted camera, a reference ground vehicle, and
a target vehicle. The camera setup considered in this problem consists of a
single camera attached to the UAV with fixed position and orientation.
While in flight, the camera measures and tracks feature points on both the
target vehicle and the reference vehicle. This simulation assumes perfect
camera calibration, feature point extraction, and tracking so that the
state estimation algorithm can be verified in isolation. More realistic
aspects of the camera system will later be included in the simulation to
create a practical scenario.
The motion of the vehicles was generated to cover a wide range of situations
to test the algorithm. The UAV's motion was generated from a nonlinear aircraft
model, while the reference vehicle and the target vehicle followed a
standard car model with similar velocities. Sinusoidal disturbances were added
to the target's position and heading to add complexity to its motion. The
three trajectories are plotted below for illustration.
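The ground-vehicle trajectory generation described above can be sketched as follows. This is a minimal kinematic car (unicycle) model with optional sinusoidal disturbances on position and heading, written in Python for illustration; the model structure, parameter values, and function name are assumptions, not the simulation's actual Matlab code.

```python
import numpy as np

def car_trajectory(v, omega, dt, steps, x0=0.0, y0=0.0, psi0=0.0,
                   pos_amp=0.0, psi_amp=0.0, freq=0.1):
    """Kinematic car (unicycle) trajectory with optional sinusoidal
    disturbances on position and heading. All parameters hypothetical."""
    x, y, psi = x0, y0, psi0
    traj = []
    for k in range(steps):
        t = k * dt
        # Recorded pose = nominal pose + sinusoidal disturbance.
        xk = x + pos_amp * np.sin(2.0 * np.pi * freq * t)
        yk = y + pos_amp * np.cos(2.0 * np.pi * freq * t)
        psik = psi + psi_amp * np.sin(2.0 * np.pi * freq * t)
        traj.append((xk, yk, psik))
        # Nominal unicycle kinematics (forward Euler).
        x += v * np.cos(psi) * dt
        y += v * np.sin(psi) * dt
        psi += omega * dt
    return np.array(traj)

# Reference vehicle: smooth path; target: same speed plus disturbances.
reference = car_trajectory(v=5.0, omega=0.02, dt=0.1, steps=600)
target = car_trajectory(v=5.0, omega=0.02, dt=0.1, steps=600,
                        pos_amp=1.0, psi_amp=0.1)
print(reference.shape, target.shape)  # (600, 3) (600, 3)
```

Each row of the returned array is an (x, y, heading) sample, so the two paths can be plotted directly against the UAV's aircraft-model trajectory.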