Dynamic Bayesian networks

In the examples we have seen so far, we have mainly focused on variable-based models, in which the representation centers on the variables of the model. As in the case of our restaurant example, we can use the same network structure for multiple restaurants because they share the same variables; the only difference between these networks would be the states the variables take for each restaurant.

Let's take a more complex example. Say we want to model the state of a robot traveling over some trajectory. In this case, the states of the variables change with time, and the states of some variables at an instance $t$ might depend on the state of the robot at the previous instance, $t - 1$. Clearly, we can't model such a situation with a variable-based model, so for such problems we generally use dynamic Bayesian networks (DBNs).

Assumptions

Before discussing the simplifying assumptions that DBNs make, let's first see the notation that we are going to use. As DBNs are defined over a range of time, with each time instance having the same variables, we will use $X^{(t)}$ to represent the instantiation of a random variable $X$ at a time instance $t$. The variable $X$ is now known as a template variable, as it can't take any values itself. This template variable is instantiated at various time instances, and at each instance $t$, the variable $X^{(t)}$ can take values from $Val(X)$. Also, for a set of random variables $X \subseteq \mathcal{X}$, we use $X^{(t_1:t_2)}$, where $t_1 < t_2$, to denote the set of variables $\{X^{(t)} : t \in [t_1, t_2]\}$. Similarly, we use the notation $x^{(t_1:t_2)}$ to denote the assignments to this set of variables.

As we can see, the number of variables will be huge between any considerable time difference and hence, our joint distribution over such trajectories will be very complex. Therefore, we make some assumptions to simplify our distribution.

Discrete timeline assumption

The first simplifying assumption that we make is to have a discrete timeline rather than a continuous one. So, the measurements of the states of the random variables are taken at some predetermined time interval $\Delta$. With this assumption, the random variable $X^{(t)}$ represents the value of the variable at the time instance $t\Delta$.

Using this assumption, we can now write the distribution over the variables for a time period $0$ to $T$ as follows:

$$P\left(X^{(0:T)}\right) = P\left(X^{(0)}\right) \prod_{t=0}^{T-1} P\left(X^{(t+1)} \mid X^{(0:t)}\right)$$

Therefore, the distribution over trajectories is the product of the conditional distributions of the variables at each time instance, given all the variables at the preceding instances.
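To make this factorization concrete, here is a minimal sketch that evaluates the probability of a short trajectory of a two-state weather variable (all states, rules, and numbers here are invented for illustration). Note that, without further assumptions, each conditional factor must be indexed by the entire history, which is exactly what makes this distribution so complex:

```python
# P(X^(0)): initial distribution over a two-state weather variable
# (all numbers here are invented for illustration).
p0 = {"sunny": 0.7, "rainy": 0.3}

# P(X^(t+1) | X^(0:t)): with no assumptions, the conditional at time t
# depends on the entire history, so its table grows exponentially with t.
def p_next(history):
    # Toy rule: rain becomes more likely the more rainy days we have seen.
    p_rain = min(0.2 + 0.3 * history.count("rainy"), 0.9)
    return {"sunny": 1.0 - p_rain, "rainy": p_rain}

def trajectory_prob(traj):
    """P(X^(0:T)) = P(X^(0)) * prod_t P(X^(t+1) | X^(0:t))."""
    prob = p0[traj[0]]
    for t in range(len(traj) - 1):
        prob *= p_next(traj[: t + 1])[traj[t + 1]]
    return prob

print(trajectory_prob(("sunny", "rainy", "rainy")))  # 0.7 * 0.2 * 0.5
```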

The Markov assumption

The second assumption that we make is as follows:

$$\left(X^{(t+1)} \perp X^{(0:t-1)} \mid X^{(t)}\right)$$

Putting this in simple words, the variables at time $t + 1$ can directly depend only on the variables at time $t$ and are thus independent of all the variables $X^{(t')}$ for $t' < t$. Any system that satisfies this condition is known as Markovian. This assumption reduces the earlier joint distribution equation to the following:

$$P\left(X^{(0:T)}\right) = P\left(X^{(0)}\right) \prod_{t=0}^{T-1} P\left(X^{(t+1)} \mid X^{(t)}\right)$$

In other words, this assumption also constrains our network: the variables in $X^{(t+1)}$ can't have incoming edges from any variable in $X^{(0:t-1)}$.
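Under the Markov assumption, the history-indexed factors in the earlier sketch collapse to a single transition table. A minimal sketch, reusing the same made-up weather variable and invented numbers:

```python
import numpy as np

states = ["sunny", "rainy"]
idx = {s: i for i, s in enumerate(states)}

p0 = np.array([0.7, 0.3])             # P(X^(0)), invented numbers
A = np.array([[0.8, 0.2],             # P(X^(t+1) | X^(t)): rows index the
              [0.4, 0.6]])            # current state, columns the next state

def trajectory_prob(traj):
    """P(X^(0:T)) = P(X^(0)) * prod_t P(X^(t+1) | X^(t))."""
    prob = p0[idx[traj[0]]]
    for cur, nxt in zip(traj, traj[1:]):
        prob *= A[idx[cur], idx[nxt]]
    return prob

print(trajectory_prob(("sunny", "rainy", "rainy")))  # 0.7 * 0.2 * 0.6
```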

However, the problem with this assumption is that it may not hold in all cases. Let's take an example to show this. Suppose we want to model the location of a car; we would expect to be able to predict its future location given observations about the past. Also, let's assume that we have only two random variables, {L, O}, with L representing the actual location of the car and O representing the observed location. Here, we might think that our model satisfies the Markov assumption, as the location at time $t + 1$ should only depend on the location at time $t$ and be independent of the locations $L^{(t')}$ for $t' < t$. However, this intuition turns out to be wrong, because we don't know the velocity or the direction of travel of the car; had we known the previous locations of the car, we could have easily estimated both. So, in such cases, to bring our model closer to satisfying the Markov assumption, we can add the variables velocity and direction to our model. Now, at each instance of time, if we know the location, velocity, and direction of motion of the car, we can predict the next instance using just the values of the previous instance. To account for changes in the velocity and direction, we could also add variables such as weather conditions and road conditions. With the addition of these extra variables, our model is now close to being Markovian.
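A minimal simulation sketch of this idea, with invented dynamics and noise levels: once velocity is part of the state, the next location is computed from the current (location, velocity) pair alone, with no reference to older locations:

```python
import numpy as np

rng = np.random.default_rng(0)

def step(state, dt=1.0, noise=0.1):
    """One Markovian transition over the augmented state (location, velocity)."""
    loc, vel = state
    new_loc = loc + vel * dt + rng.normal(0.0, noise)  # kinematics + noise
    new_vel = vel + rng.normal(0.0, noise)             # velocity drifts slowly
    return (new_loc, new_vel)

state = (0.0, 1.0)   # start at location 0 with unit velocity
for t in range(1, 4):
    state = step(state)
    print(f"t={t}: location={state[0]:.2f}, velocity={state[1]:.2f}")
```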

Model representation

The Markov assumption that we saw in the previous section, together with the assumption that the same transition model holds at every time step, allows us to represent the joint distribution very compactly, even over infinite trajectories. All we need to define is the distribution over the initial state, $P(X^{(0)})$, and a transition model, $P(X^{(t+1)} \mid X^{(t)})$. We can represent the preceding car example using the networks shown in Fig 7.4, Fig 7.5, and Fig 7.6.

Fig 7.4: The 2-TBN network for the car example

The following figure depicts the network structure at time t = 0:

Fig 7.5: The network structure at time t = 0

The following figure shows the DBN unrolled over two time slices:

Fig 7.6: The DBN unrolled over two time slices

Also, we define the interface variables $X_I$ as the variables whose values at time $t$ have a direct effect on the variables at time $t + 1$. Therefore, only the variables in $X_I^{(t)}$ can be parents of the variables in $X^{(t+1)}$. The preceding car example is an example of a two-time slice Bayesian network (2-TBN). We define a 2-TBN for a process over $\mathcal{X}$ as a conditional Bayesian network over $\mathcal{X}'$, given $\mathcal{X}_I$, where $\mathcal{X}_I \subseteq \mathcal{X}$ is a set of interface variables. In our example, all the variables are interface variables, except for O.
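As a structural sketch, a 2-TBN can be written down as plain edge lists over (variable, slice) pairs, from which the interface variables can be read off. The variables here follow the running car example (L for location, V for velocity, D for direction, O for the observed location), but the exact edge set is an assumption made for illustration:

```python
# Nodes are (variable, slice) pairs; slice 0 is time t, slice 1 is time t+1.
intra_edges = [(("L", 1), ("O", 1))]         # edges within slice t+1
inter_edges = [(("L", 0), ("L", 1)),         # edges from slice t to t+1
               (("V", 0), ("L", 1)),
               (("D", 0), ("L", 1)),
               (("V", 0), ("V", 1)),
               (("D", 0), ("D", 1))]

# Interface variables: variables at time t with an edge into slice t+1.
interface = {name for ((name, _), _) in inter_edges}
print(sorted(interface))   # ['D', 'L', 'V'] -- O is not an interface variable
```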

Overall, this 2-TBN represents the following conditional distribution:

$$P\left(\mathcal{X}' \mid \mathcal{X}\right) = P\left(\mathcal{X}' \mid \mathcal{X}_I\right) = \prod_{i=1}^{n} P\left(X_i' \mid Pa_{X_i'}\right)$$

For each template variable $X_i$, the CPD $P(X_i' \mid Pa_{X_i'})$ is known as the template factor. This template factor is instantiated multiple times in the network, once for each time slice.

Currently, none of the Python libraries for PGMs has a concrete implementation for working with DBNs. However, the pgmpy developers are currently working on it, so it should soon be available in pgmpy.
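In the meantime, a DBN can be unrolled by hand into an ordinary Bayesian network. The following is a minimal sketch, under the assumption of a pgmpy release that provides the BayesianNetwork and TabularCPD classes (older versions name the model class BayesianModel); it instantiates the same template factor once per slice of a binary chain with invented probabilities:

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD

T = 3  # number of transitions to unroll

# Unrolled chain X_0 -> X_1 -> ... -> X_T over a binary template variable X.
model = BayesianNetwork([(f"X_{t}", f"X_{t + 1}") for t in range(T)])

cpds = [TabularCPD("X_0", 2, [[0.7], [0.3]])]   # P(X^(0)), invented numbers
for t in range(1, T + 1):
    # The same template factor P(X^(t+1) | X^(t)) is instantiated per slice.
    cpds.append(TabularCPD(f"X_{t}", 2,
                           [[0.8, 0.4],
                            [0.2, 0.6]],
                           evidence=[f"X_{t - 1}"],
                           evidence_card=[2]))

model.add_cpds(*cpds)
print(model.check_model())   # True if the unrolled network is consistent
```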
