For MAP queries in graphical models, we introduce another operation on factors called maximization.
Let X be a set of variables, a variable, and a factor. We define factor maximization of Y in to be a factor over the variables X such that the following occurs:
Let's take an example of factor maximization to make this clearer:
Therefore, in the preceding example of the A -> B network, we had .Also, another important property of maximization is that it can be inserted in equations if some of the factors don't involve the variable over which the maximization is being performed. More formally, for a variable :
This is a very important property of maximization as it allows us to push the maximization operation inside equations, as we used to push summation in the case of the variable elimination operation. This avoids the full joint distribution and allows us to operate on much smaller factors.
Let's now try a sample run of the algorithm on the late-for-school model:
Step |
Variable eliminated |
Factors used |
Intermediate factor |
New factor |
---|---|---|---|---|
1 |
A |
|
|
|
2 |
J |
|
|
|
3 |
R |
|
|
|
4 |
Q |
|
|
|
5 |
G |
|
|
|
6 |
L |
|
|
|
We can clearly see that the max-marginal operation is very similar to the variable elimination we performed. The only difference is that rather than marginalizing the intermediate factor over the variable to be eliminated, we maximize over the variable to be eliminated.
We can compute the max-marginal over networks using pgmpy
:
In [1]: from pgmpy.models import BayesianModel In [2]: from pgmpy.factors import TabularCPD In [3]: from pgmpy.inference import VariableElimination # Constructing the model In [4]: model = BayesianModel( [('rain', 'traffic_jam'), ('accident', 'traffic_jam'), ('traffic_jam', 'long_queues'), ('traffic_jam', 'late_for_school'), ('getting_up_late', 'late_for_school')]) In [5]: cpd_rain = TabularCPD('rain', 2, [[0.4], [0.6]]) In [6]: cpd_accident = TabularCPD('accident', 2, [[0.2], [0.8]]) In [7]: cpd_traffic_jam = TabularCPD( 'traffic_jam', 2, [[0.9, 0.6, 0.7, 0.1], [0.1, 0.4, 0.3, 0.9]], evidence=['rain', 'accident'], evidence_card=[2, 2]) In [8]: cpd_getting_up_late = TabularCPD('getting_up_late', 2, [[0.6], [0.4]]) In [9]: cpd_late_for_school = TabularCPD( 'late_for_school', 2, [[0.9, 0.45, 0.8, 0.1], [0.1, 0.55, 0.2, 0.9]], evidence=['getting_up_late', 'traffic_jam'], evidence_card=[2, 2]) In [10]: cpd_long_queues = TabularCPD('long_queues', 2, [[0.9, 0.2], [0.1, 0.8]], evidence=['traffic_jam'], evidence_card=[2]) In [11]: model.add_cpds(cpd_rain, cpd_accident, cpd_traffic_jam, cpd_getting_up_late, cpd_late_for_school, cpd_long_queues) # Calculating max marginals In [12]: model_inference = VariableElimination(model) In [13]: model_inference.max_marginal( variables=['late_for_school']) Out[13]: 0.5714285714285714 In [14]: model_inference.max_marginal( variables=['late_for_school', 'traffic_jam']) Out[14]: 0.40547815820543098 # For any evidence in the network we can simply pass the evidence # argument which is a dict of the form of {variable: state} In [15]: model_inference.max_marginal( variables=['late_for_school'], evidence={'traffic_jam': 1}) Out[15]: 0.5714285714285714 In [16]: model_inference.max_marginal( variables=['late_for_school'], evidence={'traffic_jam': 1, 'getting_up_late': 0}) Out[16]: 0.80000000000000004 In [17]: model_inference.max_marginal( variables=['late_for_school','long_queues'], evidence={'traffic_jam': 1, 'getting_up_late': 0} Out[17]: 0.6399999999999999 # Again as in the case of VariableEliminaion we can also pass the # elimination order of variables for MAP queries. If not specified # pgmpy automatically computes the best elimination order for the # query. In [18]: model_inference.m_marginal( variables=['late_for_school'], elimination_order=['traffic_jam', 'getting_up_late', 'rain', 'long_queues', 'accident']) Out[18]: 0.5714285714285714 In [19]: model_inference.max_marginal( variables=['late_for_school'], evidence={'accident': 1}, elimination_order=['traffic_jam', 'getting_up_late', 'rain', 'long_queues']) Out[19]: 0.57142857142857129