Skip to content

Sensor Integration and Signal processing Part II

Average filter

Average is sum of a given data set divided by the number of data in it.

For example,If the given data set contains k datum (x1,x2,,x4) the average is obtained by the following expression:

x¯k=x1+x2++xkk.

The expression above treats whole data in a single batch and thus called 'batch expression'.

The recursive expression boosts computational efficiency by reusing prior results (notably for large datasets) and cuts memory usage, needing only the previous average, new data, and total data count, unlike Average Filter which requires the full dataset.

For the averaging filter, we can get the following recursive expression.

x¯k=k1kx¯k1+1kxk

If we use the Equation above to get average, only the average computed in previous step(x¯k1), total number of datum in the data set(k), and freshly added datum(x¯k) are all we need. Here, we do not need whole datum in the data set. This is why a recursive expression is useful for computing average of a data set with a datum being added sequentially.

This equation could be simplified even further. If we define αk/(k1),following relationship between α and k stands:

αk1k=11k,1k=1α.

And we can get a simplified form of the recursive expression:

x¯k=k1kx¯k1+1kxk=αx¯k1+(1α)xk.

The name of this Equation is 'average filter'.

The average filter is applicable to both average calculation and sensor initialization, as perfectly exemplified by a digital weight scale.

  • The zero point of a digital weight scale changes continuously for various reasons, so an initialization process is required to determine the zero point by collecting measurements over a certain period after power-on. If all measurements have to be stored to calculate the average, an expensive microprocessor will be needed, which is obviously an unfavorable option. This can be avoided by adopting an average filter.
Average filter
Figure 1: Average filter.

The plot of the average filter's voltage results shows violent fluctuations in raw measurements versus a stable filtered output. As data accumulates, noise fades and the output approaches the true voltage average; this noise-filtering property of averaging is why it is named a "filter".

Moving average filter

5-day moving average curve plots the moving average of stock prices over a 5-day window, which is used to track the medium-to-long-term trends of stock prices and eliminate confusion caused by daily fluctuations. This chapter will elaborate on the concept of the moving average.

Moving average is an average of not all the data in the data set, but of certain number of recent measurements. This method discards the oldest data whenever a new datum comes in and in that way it maintains the number of datum to get the average.

Moving average of n datum is expressed as following:

x¯k=xkn+1+xkn+2++xkn.

Beware that x¯k used here for moving average filter has different meaning from that used to describe average filter. In the average filter, x¯k meant the average of k datum. In contrast, x¯k for moving average filter is the average of n datum from (kn+1)th to kth datum but not the average of all k datum.

Leaving x¯k on the left hand side we get recursive expression:

x¯k=x¯k1+xkxknn

This is 'moving average filter'. It is not as neat as average filter but still is in recursive expression.

Load required for computation of an average is not increased even after a new datum is fed in. This inherent characteristic of low computing load gives little room for recursive expression to provide any advantage unlike average filter.

The moving average filter involves a direct trade-off between noise reduction and phase lag. While increasing the sample size (n) enhances noise suppression, it introduces delays that prevent the output from reflecting real-time variations. Conversely, reducing n improves responsiveness but sacrifices filtering performance.

In practice, the value of n must be tailored to the signal's dynamics:

  • For rapid variations: Use a smaller n to minimize lag and ensure quick tracking.
  • For slow variations: Use a larger n to maximize noise reduction and achieve a stable average.

The Moving Average Filter involves a fundamental trade-off between noise suppression and response time:

  • Higher n: Enhances noise reduction but introduces latency, hindering real-time tracking.
  • Lower n: Enables faster response to data changes but sacrifices filtering quality.

Conclusion: The optimal value for n must be chosen based on the signal's characteristics. Furthermore, unlike the standard average filter, its recursive implementation offers limited algorithmic advantages.

1st order Low-pass filter

When moving average filter is brought into real world, you will notice that reducing noise and tracking variation of the given data at the same time is not an easy task. Quite frequently, even if you change the number of datum to get average, the result is not so satisfactory.

Breaking the definition of moving average in Equation into each term.

x¯k=xkn+1+xkn+2++xkn=1nxkn+1+1nxkn+2++1nxk

From this expression, we see that each term in the moving average has same weighting (1/n) to the other. Most recent datum (xk) and the oldest datum (xkn+1) have same weight in the average.

An aircraft has an attitude sensor which returns 50 measurements in 1 second. Moving average filter of the flight management computer computes an average with 50 collected datum(n=50). In this case, the oldest datum used to compute the average is the one measured 1 second ago. When the aircraft is in flight, which one, between the one measured 1 second ago and the one measured most recent, would be closer to current attitude?

The answer would most possibly be the one measured most recent. However, moving average filter processes all measurements with equal weighting.

Unlike we have done so far, recursive expression will be directly given for low-pass filter. The accurate name for this filter is '1st order low-pass filter'.

x¯k=αx¯k1+(1α)xk.

Where α is a constant in the range of 0<α<1.

From now on, we will use the term 'estimate' instead of 'average' or 'mean' because x¯k in a low-pass filter has nothing to do with average. According to the Equation, previous estimate x¯k1 is computed as following:

x¯k1=αx¯k2+(1α)xk1.

Thus:

x¯k=αx¯k1+(1α)xk=α[αx¯k2+(1α)xk1]+(1α)xk=α2x¯k2+α(1α)xk1+(1α)xk=α2[αx¯k3+(1α)xk2]+α(1α)xk1+(1α)xk=α3x¯k3+α2(1α)xk2+α(1α)xk1+(1α)xk

Therefore, there are following relationships among measurements:

<α2(1α)<α(1α)<1α.

Equation confirms that weighting coefficients decrease as data ages. Consequently, older data points exert less influence on the final estimate, ensuring the filter prioritizes recent information.

The estimate of a 1st-order low-pass filter is highly sensitive to the coefficient α:

  • Small α (e.g., 0.4): Higher noise levels but faster response with minimal lag.
  • Large α (e.g., 0.9): Superior noise reduction at the cost of significant time delay.

1st order low-pass-filter is also known as 'exponentially weighted moving average filter'.

Kalman filter algorithm

It receives only one input (measurement, zk) and returns one output (estimate, x^k).

Internal process is done through a four-step computation.

  1. Step I (Prediction): Computes predicted variables (denoted by a - superscript) based on the system model.
  2. Step II (Kalman Gain): Calculates the Kalman gain using the output from Step I and external presets/noise characteristics (H and R).
  3. Step III (Estimation): Computes the current estimate using the measurement input and Step I's prediction (conceptually similar to a low-pass filter).
  4. Step IV (Error Covariance): Computes the error covariance to evaluate the estimate's accuracy and determine its reliability.

The Kalman filter can be distilled into a continuous recursive loop comprising two main phases: Prediction and Estimation.

  • Prediction Process (Step I): Utilizes the previous state and system parameters (A and Q) to compute the predicted state (x^k and Pk).
  • Estimation Process (Steps II-IV): Incorporates the current measurement (zk) alongside parameters (H and R) to compensate for prediction errors, yielding the final optimal estimate (x^k,Pk).

Summary: The algorithm operates as a simple closed loop: Predict state → Compensate with measurement → Repeat.

There was a reason why low-pass filter has been mentioned so frequently whenever Kalman filter was brought up. Let us look at the expression for computing an estimate in Step III for examination.

x^k=x^k+Kk(zkHx^k)

where zk means measurement and x^k means prediction.

Simply change:

x^k=x^k+Kk(zkHx^k)=x^k+KkzkKkHx^k=(IKkH)x^k+Kkzk=(IKk)x^k+Kkzk If H=Ix^k=αx¯k1+(1α)xk=(1K)x¯k1+(1(1K))xk=(1K)x¯k1+Kxk

Compare this expression with Equation, which is for computation of an estimate in Kalman filter.

Kalman filter is almost the same. The way it assigns weighting is even same with the only difference lying in the fact that Kalman filter is not using the previous estimate but prediction.

For Kalman filter algorithm:

x^k=x^k+Kk(zkHx^k)

First of all, a prediction(x) and a new measurement(z) are required.

Now the only variable left is Kk. This is called 'Kalman gain' and if the value of this variable is known, the new estimate could be computed.

Kk=PkH(HPkH+R)1

We will take a look into an error covariance which is the last step of the estimation process

Pk=PkKkHPk

Error covariance indicates the difference between the estimate from Kalman filter and the true but unknown value. In other words, error covariance is a degree of accuracy of the estimate.

If Pk is large, the error of the estimate is large and if Pk is small, the error of the estimate is small.

There is a following relationship between Pk and its estimate(x^k)and error covariance(Pk).

xkN(x^k,Pk).

This basically means the variable xk follows a normal distribution with mean x^k , and covariance Pk, but there is a lot more deep meaning in here.

That is, Kalman filter computes probability distribution of the estimate of the variable xk and selects the one with highest probability as the estimate.

This means that if we draw the probability of the values that xk could have, it comes out as a bellshaped distribution. And the width of this bell-shaped curve is determined by Pk.

Error covariance is defined as the following.

Pk=E[(xkx^k)(xkx^k)].

where E[] is an operator to compute the mean of the variable inside the curly brackets. xkx^k in the right hand side means the difference between the true but unknown and the estimate, i.e.the error of the estimate. In other words, the error covariance is the mean of the square of the error of the estimate. This is why the size of the error covariance is proportional to the error of the estimate.

In the prediction procedure, how the estimate x^k will vary when time changes from tk to tk+1 is predicted. In other words, it predicts what value will the estimate of current time point have at the next time point tk+1.

x^k+1=Ax^kPk+1=APkA+Q

Here, x^k and Pk are the values calculated from Steps III and IV, respectively. And A and Q are already defined by the system model. The other variables in the system model such as H and R are not used.

Please pay attention to the notation for the prediction. The subscript ‘k+1'attached to indicate that it is the value at the time point t_{k+1}$ and the superscript '-' for indicating that it is a prediction.

Now let us examine the expression to compute the estimate in Kalman filter.

Unlike a 1st-order low-pass filter, the Kalman filter does not directly reuse the previous estimate ( x^k1 ) Instead, it incorporates an intermediate prediction step, leading to specific terminology:

  • A Priori Estimate: The prediction (x^k), derived from the previous estimate before the current measurement is considered.
x^k=Ax^k1.
  • A Posteriori Estimate: The final estimate ( x^k ), computed after updating the a priori estimate with the new measurement.
x^k=x^k+Kk(zkHx^k)=Ax^k1+Kk(zkHAx^k1)

In the expression above, Hx^k has the meaning of 'the measurement computed with the prediction.' In other words, it is the prediction of the measurement.

The quality of prediction is determined by how close the system model is to the actual system and the performance of estimation is determined by the prediction.

It is not an exaggeration to say that the performance of Kalman filter is determined by system model. As we have seen so far system model is the one that is always critical.

Kalman filer deals with a linear state model like the one described in the following.

xk+1=Axk+wkzk=Hxk+vk.
  • xk: state variable,(n×1) column vector
  • zk: measurement,(m×1) column vector
  • A: state transubstantiation,(n×n) matrix
  • H: state-to-measurement matrix,(m×n)matrix
  • wk: state transitionnnoise,(n×1) column vector
  • vk: measurement noise,(m×1) column vector

The Kalman filter models physical quantities using state-space equations and assumes all disturbances are white noise.

  • State Variables (x): The physical quantities being estimated (e.g., position, velocity).
  • Matrix A (System Model): Describes the system's kinematics, defining how the state evolves over time.
  • Process Noise (wk): Unpredictable physical disturbances affecting the actual system.
  • Matrix H (Observation Model): Maps the internal state variables to the external sensor measurements.
  • Measurement Noise (vk): Inherent inaccuracies and errors in the sensor readings.

Step 1: The Blueprint — System Model (Matrix A)

  • Concept: Imagine a car driving at a constant speed of 10 m/s on a straight road. The state variable x consists of the car's position and velocity. Matrix A contains the fundamental equations of motion from physics: Positionnew=Positionold+Velocity×Time.

Step 2: Reality Hits — Process Noise (wk)

  • Concept: In reality, constant speed is impossible. A sudden gust of headwind, a pothole, or the driver lightly tapping the brakes will alter the momentum. These real-world physical disturbances affecting the system are called process noise wk.
  • Because of the process noise wk, the actual physical state xk deviates from our perfect blueprint. The car speeds up and slows down slightly, making the true path drift.

Step 3: Flawed Perception — Observation Matrix (H) & Measurement Noise (vk)

  • Concept: We want to track the car, so we install a GPS. The observation matrix H defines what the sensor can actually "see" (e.g., the GPS maps the state by only reading the 'position' and ignoring the 'velocity'). However, GPS signals suffer from atmospheric interference. The inherent inaccuracies of the sensor itself are called measurement noise vk.
  • Even though the car is truly driving along the path, the GPS reports coordinates as scattered crosses. They bounce around the true position due to the measurement noise.
  • The Ultimate Goal of the Kalman Filter: To look at only the scattered crosses (with vk), acknowledge that the world is imperfect (with wk), and use the laws of physics (Matrix A) to guess exactly where that true line is.

Noise

  • Q: covariance matrix of wk, (n×n) diagonal matrix
  • R: covariance matrix of vk, (m×m) diagonal matrix

Covariance matrix is defined as a matrix consisting of the variance of the variable. For example, if there are n noises w1,w2,,wn and the variance of each variable is σ12,σ22,,σn2 respectively, the covariance matrix could be written as the following. The covariance matrix R of the measurement vk is expressed in the same way

Q=[σ12000σ22000σn2]

It is desirable to define the matrices Q and R accurately but there is a limit for doing this analytically because there are various sources of errors.

First, the matrix R appears in the formula for Kalman gain.

Kk=PkH(HPkH+R)1.

In this expression, Kalman gain decreases as R increases. How would the decreased Kalman gain affect the estimate? Let us see Equation below:

x^k=(IKk)x^k+Kkzk.

Balancing Kalman Gain (K) and Measurement Noise (R):

  • Mechanism: A lower Kalman gain reduces the weight given to the measurement and increases reliance on the system's prediction.
  • Effect: The final estimate becomes smoother, with reduced variation, as it is less sensitive to external sensor noise.
  • Tuning Tip: If you want the filter to trust its internal model more and ignore noisy measurements, increase the value of R.

Now let us take a look at an expression with the matrix Q. This is used to get the prediction of the error covariance(Pk+1)

Pk+1=APkA+Q.

In this expression, the prediction of the error covariance increases as the matrix Q increases.

Summing up, when Q increases, Kalman gain also increases.

Compared with 1st order low-pass filter

Kalman filter computes the estimate with the following equation. When the equation is expanded, it reveals its form close to that of the 1st order low-pass filter.

In other words, Kalman filter is similar to the 1st order low-pass filer in a sense that the estimate is obtained from the sum of the prediction (x^k)and the current measurement (zk) with appropriate weightings.

However, in Kalman filter, the weighting (Kk) applied to get the estimate is not constant but different at each data point. That is, the weighting is not maintained constant but updated for each data point through the formula. This is a significant difference between Kalman filter and the 1st order low-pass filter.

Kk=PkH(HPkH+R)1

Error covariance(Pk)is a measure of representing the error of the estimate. If Pk is large,the estimation error is also large and if Pk is small, the estimation error is also small.

Pk=PkKkHPk.

There is a relationship between the estimate (x^k) and the error covariance (Pk) as following:

xkN(x^k,Pk).

In Kalman filter algorithm, prediction process is also included. Prediction is a totally separate process from estimation and the system model is crucial for this. All it predicts are the state variables and the error covariance.

x^k+1=Ax^kPk+1=APkA+Q.

To design Kalman filter, a system model in the following form must be established first.The performance of Kalman filter varies greatly depending on how close the system model is to the actual system.

xk+1=Axk+wkzk=Hxk+vk.

When deriving the system model, the characteristics of the associated noise must also be sought thoroughly. Noise also plays important role in Kalman filter. The noise in Kalman filter is a white noise that follows normal distribution.

Example1: train with constant velocity, position is measured

The Flaw in Basic Velocity Calculation (ΔPosition/ΔTime):

  • Noise Amplification: Calculating velocity by directly dividing a noisy position difference by a very small time step (Δt) drastically amplifies the error, resulting in massive, jagged spikes in velocity (a classic problem with numerical differentiation).
  • Clunky Alternatives: Traditional workarounds like moving averages or polynomial approximations are inelegant, and often yield poor fits.
  • The Solution: The Kalman filter proves its true value in resolving exactly this type of noise challenge.

Designing the Kalman Filter (Train Example): The mandatory first step is deriving the system model. Since our targets are the train's dynamics, we define position and velocity as our primary state variables:

x={positionvelocity}

Let us set the system model for the example as following.

xk+1=Axk+wkzk=Hxk+vkA=[1Δt01],H=[10]

Where Δt is the interval of time measuring the position.

xk+1=Axk+wk=[1Δt01]xk+wk

If the definitions of the state variables are applied, the meaning becomes clear,

[positionvelocity]k+1=[1Δt01][positionvelocity]k+[0wk]=[position+velocityΔtvelocity+w]k

Let us first take a look at the expression for the position.

positionk+1=positionk+velocitykΔt.

This expression is a mathematical description of a physical principle stating that 'current position = previous position + displacement'. This is why the system noise is not included in the expression. Now let us take a look at the expression for the velocity.

velocityk+1=velocityk+wk.

This means that the velocity of the train is affected only by the system noise (wk) and no external force is acting on the train.

Let us look into the measurement equation of the system model. Apply the matrix H to the measurement equation.

zk=Hxk+vk=[10][positionvelocity]k+vk=positionk+vk.

The last remaining task for the derivation of the system model is the decision of the error covariance matrices (Q and R). In most cases, the error characteristics of the measurement noise (vk) are provided by the sensor manufacturer.

If not, they should be determined based on experiment and experience. The modeling of the system noise(wk)is more difficult and this has no other basis than knowledge and experience about the system.

If both covariance matrices are difficult to be obtained in analytical manner, they should be determined by trial and error.

Example2: train with constant velocity, velocity is measured

The previous example will be reversed. That is, estimating the position with the measured velocity.

H=[01]

Let us apply this to the system model and simplify. Now you will understand why the matrix H has to be changed like this:

zk=Hxk+vk=[01][positionvelocity]k+vk=velocityk+vk.

Conclusion

We have learned that Kalman filter not only removes noise but also has a capability to estimate velocity with position only. How could a Kalman filter estimate the velocity which has not been measured at all? The source of this amazing capability will be discussed here.

First, let us think about a low-pass filter. A low-pass filter does not have any assumption about the measured signal. It just applies some weights to the new measurement and the old estimate and then sums them. Therefore, estimating a physical quantity that has not been measured is basically impossible. In a low-pass filter, there is no estimate if there is no measurement.

But Kalman filter is different. It assumes that the mathematical model of the system is known. In other words, the principle the system relies on to return the estimate as an output is known. The secret of estimating the physical quantity that has not even been measured is lying here. By the system model which describes the relationship between the state variables, the state variable that has-not-been measured is estimated indirectly.

The description seems quite difficult to understand, so let us use an example.

If the velocity at present is 80 [m/s], the distance covered in 1 [s] should be 80 [m], Because the result of the integration of velocity with respect to time is distance and this physical law is true. Even if there is a noise in the position data or some error in the system model, the train should be at the position around 80 [m] and being at 800 [m] is impossible. The reverse is also true. If the distance covered in 1 [s] is 80 [m], the velocity could not be 800 [m/s] but it makes sense when it is about 80 [m/s].

Likewise, Kalman filter estimates the state variable that has not been measured by utilizing the information from the system model. The Kalman filter in the example used this feature to estimate the velocity from the position. Since the velocity estimate must obey the law of physics, it is obvious that the trend of change in position is included in the estimate.

Summarizing, the capability of Kalman filter to estimate the physical quantity that has not been measured comes from the 'system model'. However, using the system model for the estimation is like using a doubleedged sword. If the system model is very different from the actual system, not only the estimate gets messed up but the Kalman filter algorithm itself could diverge and cause damage to the whole system in the extreme case.

Tacking an object in an image

A method to track an object on a two-dimensional plane with Kalman filter will be discussed. This technique is required when tracking a target on a radar scope or developing a system to track certain object through a camera in real time.

In other words, the derivation of the system model should be based on the state variables set as horizontal(x-axis) and vertical (y-axis) positions and velocities

x={positionxvelocityxpositionyvelocityy}xk+1=Axk+wkzk=Hxk+vkA=[1Δt000100001Δt0001],H=[10000010]

Applying the matrix A into the Equation, we could realize that the following relationship in x-and y-directions is described in a matrix-form.

{positionk+1velocityk+1}={positionk+velocitykΔtvelocityk}+wk

Attitude reference system ~

Kalman Filter: Sensor Fusion & Inertial Navigation Beyond "noise removal" and "state estimation," the Kalman filter's most powerful feature is Sensor Fusion. This algorithm brilliantly combines data from multiple cheap, flawed sensors to achieve the precision of a single, high-end sensor.

  • Aerospace Roots: Debuting as a star in the Apollo lunar program, the Kalman filter is now the absolute backbone of modern navigation systems for aircraft and satellites.
  • Practical Example: We will tackle a fundamental navigation problem: estimating the horizontal attitude (orientation) of an object using data from both accelerometers and gyroscopes.

(Note: These devices are known as inertial sensors, and using them for tracking is called Inertial Navigation. While it encompasses position, velocity, and attitude, we will strictly focus on attitude estimation for simplicity.)

Here, our interest is limited to the horizontal attitude so the roll and pitch angles (Φ and θ,respectively) are the only consideration among all three angles.

Rate of change in the Euler angles (Φ˙,θ˙,ψ˙) but the angular rates (p,q,r) of the airplane. The relationship between the Euler angles and angular velocities is well known in kinematics.

[Φ˙θ˙ψ˙]=[1sinΦtanθcosΦtanθ0cosΦsinΦ0sinΦcosθcosΦcosθ][pqr]

Current attitude could be obtained by applying the angular velocities measured from the gyros (p,q,r).

In the accelerations measured by the accelerometers ( fx,fy,fz )there are various accelerations mixed in including the gravitational acceleration, those caused by the change of velocity in magnitude and/or direction. This could be expressed as following.

[fxfyfz]=[u˙v˙w˙]+[0wvw0uvu0][pqr]+g[sinθcosθsinΦcosθcosΦ]

where u,v, and w are the velocities along each axis in body frame and p,q, and r are the angular velocities about each axis in body frame, respectively, and g is the gravitational acceleration.

Now let us look into detail one by one. The accelerations ( fx,fy,fz ) and angular rates( p,q,r ) are from measurement. The gravitational acceleration(g) is also known. The remaining values are the velocities( u, v,w) and accelerations( u˙,v˙,w˙ ) along each axis in body frame.

  • First, let us consider the case where the system is stationary. In this case,both velocities and accelerations along each axis in body frame are all 0 .
u˙=v˙=w˙=0u=v=w=0
  • For the case when the system is moving with constant velocity, the accelerations along each axis in body frame is 0 and since there is no change in the attitude, the angular velocities are also 0 .
u˙=v˙=w˙=0p=q=r=0
  • For both cases, the first two terms in the right hand side of Equation become 0 . Then, the equation becomes as simple as following.
[fxfyfz]=g[sinθcosθsinϕcosθcosϕ]

From this expression, the formulae for roll and pitch angles could be derived.

Φ=sin1(fygcosθ)θ=sin1(fxg).

The Golden Rule of Sensor Fusion: Complementarity Using a cheap gyro or accelerometer independently is flawed, but their characteristics are perfectly complementary:

  • Accelerometer: Its error is bounded and doesn't grow over time (ideal for mid-to-long-term reference), but it's highly noisy in the short term.
  • Gyroscope: Highly sensitive to quick changes (excellent for short-term tracking), but suffers from unbounded error accumulation (drift) over time.

The Core Idea: Use the accelerometer's long-term stability to fix the gyroscope's long-term drift! This is the essence of Sensor Fusion: strategically combining sensors to achieve performance impossible by any single unit alone.

Extended Kalman Filter (EKF)

  • Most realistic robotic problems involve nonlinear functions:
Xt+1=ft(Xt,ut)+εtεtN(0,Qt)Zt=ht(Xt)+δtδtN(0,Rt)
  • Versus linear setting:
Xt+1=AtXt+Btut+εtεtN(0,Qt)Zt=CtXt+dt+δtδtN(0,Rt)
  • Dynamics model: for xt "close to" μt we have:
ft(xt,ut)ft(μt,ut)+ft(μt,ut)xt(xtμt)=ft(μt,ut)+Ft(xtμt)
  • Measurement model: for xt "close to" μt we have:
ht(xt)ht(μt)+ht(μt)xt(xtμt)=ht(μt)+Ht(xtμt)

In a linearized Kalman filter, this problem was solved by linearization. Likewise, an extended Kalman filter derives a linear model by linearizing the nonlinear model. The technique for linearization is classic.

A=fx|x^k,H=hx|x^k.

First extended Kalman fiter uses nonlinear system model equations f(xk) and h(xk) in place of Axk, and Hxk, in linear model. Second, the matrices A and H are the Jacobians of the nonlinear model. Here, the values in the Jacobian matrices are obtained by applying the previous estimate.

An EKF example

The system state variables are defined as following.

x={horizontal distancevelocityaltitude}={x1x2x3}

Since the velocity and the altitude are assumed to be constant, the equation of motion of the system could be described as following:

{x˙1x˙2x˙3}=[010000000]{x1x2x3}+{0w1w2}=Ax+w

The meaning of this particular system model is simple. The first formula represents the relationship between horizontal distance and velocity. As you know, it means the rate of change in horizontal distance equals velocity:

x˙1=x2.

The second and third formulae describe the assumption that both the velocity and the altitude are constant. In an ideal system, the rate of change (which is derivative) should be 0, but in real world the value has a slight variation due to various error sources.This is mathematically modeled as the system noise w1 and w2:

x˙2=w1x˙3=w2

Now let us look into the measurement model. The physical quantity measured by the radar is slant range.From the figure presented just before, we could see the slant range is defined as following.

r=x12+x32+v=h(x)+v.

On the other hand, the measurement model is nonlinear, so the matrix H should be obtained by calculating the Jacobian of Equation

H=[hx1hx2hx3]=[x1x12+x320x3x12+x32]

All we need to express the tilted angle of a horizontal plane are the roll and pitch angles (Φ and θ respectively) but to give clear description of the expression, the yaw angle (ψ) is also included as a state variable.

The system model is the one already introduced as Equation:

[Φ˙θ˙ψ˙]=[1sinΦtanθcosΦtanθ0cosΦsinΦ0sinΦcosθcosΦcosθ][pqr]=[p+qsinΦtanθ+rcosΦtanθqcosΦrsinΦqsinΦsecθ+rcosΦsecθ]+w=f(x)+w.

To implement an extended Kalman filter, the Jacobian of Equation must be known. This is defined as following:

A=[f1Φf1θf1ψf2Φf2θf2ψf3Φf3θf3ψ]

Let us look into the result.

  • Convergence: Despite initial errors or noise, the EKF estimations successfully converge to the true values over time.
  • Tracking Performance: The algorithm effectively filters out the nonlinear radar measurement noise to track the constant velocity and linear distance.

Unscented Kalman Filter (UKF)

  • Problem: The Extended Kalman Filter (EKF) handles nonlinear models by linearizing them using Jacobian matrices (first-order Taylor expansion).
  • Limitation: For highly nonlinear systems, this linear approximation introduces significant truncation errors, which can lead to poor tracking performance or even filter divergence.
  • Solution: The Unscented Kalman Filter (UKF) was designed to overcome this exact flaw by eliminating the need for explicit linearization.

"It is easier to approximate a probability distribution than it is to approximate an arbitrary nonlinear function."

Instead of trying to linearize the complex nonlinear equation y=f(x), the UKF approximates the probability distribution of the state variable x.

Picks a minimal set of sample points that match 1st, 2nd and 3rd moments of a Gaussian:

χ0=x¯,W0=κ/(n+κ)χi=x¯+((n+κ)Pxx),W0=1/2(n+κ)χi+n=x¯((n+κ)Pxx),Wi+n=1/2(n+κ)
  • x¯: mean; Pxx: covariance; i: i-th column; xRn
  • k: extra degree of freedom to fine-tune the higher order moments of the approximation; when x is Gaussian, n+k=3 is a suggested heuristic
  • EKF approximates f to first order and ignores higher-order terms
  • UKF uses f exactly, but approximates p(x).

Simple Steps:

  • Dynamics update: Can simply use unscented transform and estimate the mean and variance at the next time from the sample points
  • Observation update: Use sigma-points from unscented transform to compute the covariance matrix between Xt and Zt. Then can do the standard update.

UKF summary:

  • Highly efficient: Same complexity as EKF, with a constant factor slower in typical practical applications
  • Better linearization than EKF: Accurate in first two terms of Taylor expansion (EKF only first term) + capturing more aspects of the higher order terms
  • Derivative-free: No Jacobians needed
  • Still not optimal!

Quiz

Tutorial Q1

Multi-Sensor Fusion for a Ground Vehicle A ground vehicle moves in a plane and is equipped with three sensors:

  • IMU: provides high-frequency measurements of acceleration and angular velocity
  • Wheel odometer: provides medium-frequency measurements of forward velocity
  • GPS: provides low-frequency measurements of position

The IMU has the highest update rate but suffers from drift.

The wheel odometer is accurate in the short term.

The GPS updates slowly but provides globally referenced position information. A filter needs to be designed to fuse these sensors for vehicle state estimation.

Questions:

1. Using an Extended Kalman Filter (EKF), how should the state variables be selected? Write the state equation and measurement equations, and derive the state matrix. (IMU bias does not need to be considered.)

(1) State Variable Selection

We can choose:

xk=[pxpyvθ]

Where px,py represent position, v is velocity, and θ is the heading angle. Other states like IMU biases can also be added.

State Equation:

The IMU measures the longitudinal acceleration ak and angular velocity ωk. Treating these as system inputs, the vehicle kinematic model can be written as:

px,k+1=px,k+vkcosθkΔtpy,k+1=py,k+vksinθkΔtvk+1=vk+akΔtθk+1=θk+ωkΔt

Written in vector form:

xk+1=f(xk,uk)+wk

Where:

uk=[akωk]

The nonlinear state function is:

f(xk,uk)=[px,k+vkcosθkΔtpy,k+vksinθkΔtvk+akΔtθk+ωkΔt]

Because the state equation is nonlinear, the EKF requires computing the Jacobian matrix of f(xk,uk) :

Ak=fx

Which yields:

Ak=[10cosθkΔtvksinθkΔt01sinθkΔtvkcosθkΔt00100001]

For the measurement equation, this system has two types of external measurements: wheel odometry and GPS. The IMU primarily acts as an input for state prediction here.

The wheel odometer measures the forward velocity of the vehicle, therefore:

zkwheel =vk+vkwheel 

Written in matrix form:

zkwheel =[0010]xk+vkwheel 

vkwheel  represents the wheel odometer measurement noise. Thus, the measurement function for the wheel odometer is:

hwheel (xk)=[0010]xk

The corresponding measurement matrix is: Hkwheel =[0010]. The GPS measures the vehicle's position, therefore:

zkgps=[px,kpy,k]+vkgps

Written in matrix form:

zkgps=[10000100]xk+vkgps

So the measurement function for the GPS is:

hgps(xk)=[10000100]xk

The corresponding measurement matrix is:

Hkgps=[10000100]
2. Why cannot the measurements from the three sensors simply be averaged?

(2) Why can't we just average them directly?

Because the three sensors measure different physical quantities, and they have different precisions, frequencies, and error characteristics. Simple averaging lacks physical meaning and fails to reflect the difference in reliability among the sensors. The purpose of the filter is to rationally weight and fuse this information based on the dynamic model and noise levels.

3. How should the filter be designed to handle different sampling rates?

(3) How to handle different sampling frequencies?

A "high-frequency prediction, low-frequency update" approach is typically adopted: the state prediction is continuously performed using the IMU frequency as the main loop; when wheel odometer data arrives, a single update is performed using the odometer; when GPS data arrives, another update is performed using the GPS. In other words, updates are applied asynchronously as soon as sensor data arrives, without requiring them to be perfectly synchronized.

Tutorial Q2

You are designing an altitude tracking algorithm for an unmanned helicopter executing a hovering mission. To measure the vertical distance to the ground in real-time, a sonar sensor is installed at the bottom of the helicopter. A certain 1D discrete system satisfies xk+1=xk+wk, and the measurement equation is zk=xk+vk. It is given that the previous estimate and error covariance at time k1 are:

x^k1=6Pk1=2

and the process noise variance is Q=1. At time k, the measurement value provided by the sensor is zk=9, and the measurement noise variance is R. You need to provide detailed steps to explain your calculation.

Please calculate sequentially:

(1) Prediction (A priori estimate) x^k (2) Prediction of the error covariance Pk (3) Kalman gain Kk (4) Estimate (A posteriori estimate) x^k (5) Error covariance Pk

Because the system is 1D (meaning state transition F=1 and measurement matrix H=1 ), the calculations are as follows: (1)

x^k=Ax^k1x^k=1×6=6

(2)

Pk=APk1AT+QPk=1×2×1+1=3

(3)

Kk=PkHT(HPkHT+R)1Kk=3×1×(1×3×1+R)1=33+R=0.5

(4)

x^k=x^k+Kk(zkHx^k)x^k=6+(33+R)(91×6)x^k=6+(33+R)×3=6+93+R=7.5

(5)

Pk=PkKkHPkPk=3(33+R)×1×3Pk=393+R=3(3+R)93+R=3R3+R=1.5

Tutorial Q3

For a one-dimensional system, suppose that at time k the prior estimate (prediction) is:

x^k=10Pk=4

Two sensors measure the same state simultaneously: Sensor 1 yields z1=12,R1=4, and Sensor 2 yields z2=11,R2=1. Assume that the measurement matrix is H=1. The sequential update method is adopted: first update using Sensor 1, then update using Sensor 2. Please calculate:

1. The state estimate and covariance after the first update.
2. The final estimate and covariance after the second update.

First update: using Sensor 1 The initial prior prediction is: x^k=10,Pk=4. The Kalman gain is:

Kk,1=PkHT(HPkHT+R1)1=44+4=0.5

The state estimate after the first update is:

x^k,1=x^k+Kk,1(z1Hx^k)=10+0.5(1210)=11

The covariance after the first update is:

Pk,1=(IKk,1H)Pk=(10.5)×4=2

Second update: using Sensor 2 Now, treat the result of the first update as the new prior prediction for the second update: meaning x^k,2=x^k,1=11, and Pk,2=Pk,1=2. The Kalman gain is:

Kk,2=Pk,2HT(HPk,2HT+R2)1=22+10.667

The state estimate after the second update (the final estimate) is:

x^k=x^k,2+Kk,2(z2Hx^k,2)=11+23(1111)=11

The final covariance is:

Pk=(IKk,2H)Pk,2=(123)×20.667

Released under the MIT License.