Predicting Population Growth: Modeling with Projection Matrices
Janet Steven* and James Kirkwood†, *Department of Biology, Sweet Briar College, Sweet Briar, VA 24595, USA, †Department of Mathematical Sciences, Sweet Briar College, Sweet Briar, VA 24595, USA
Biologists studying an organism in its natural environment often start with a description of a population’s size and demographic structure. How many individuals are in the population? What major life stages does each individual pass through? How many offspring are generated? Once this initial description is complete, biologists frequently move on to examine the causes and consequences of the observed size and structure. Is the population shrinking or growing? Which demographic stage experiences the greatest mortality? How does survival at a particular stage affect overall growth? Matrix population models that utilize projection matrices function to bridge the gap between population structure and population dynamics by providing a mathematical framework that both summarizes and models a population [1].
A matrix-based model can incorporate the different stages an organism goes through during its life, and then predict both the overall growth of the population and the distribution of individuals across these life stages. A projection matrix generated from data collected in a natural population models transitions between stages for a given time interval and allows us to predict how many individuals will be in each stage at any point in the future, assuming that transition probabilities and reproduction rates do not change.
Projection matrices are widely used in ecology and conservation biology to address both theoretical and practical questions. They can reveal the life stages that have the greatest influence on population growth and are therefore potentially shaped most strongly by natural selection [2]. They have been used to examine the impact of herbivory on plant populations [3,4] and the relative importance of alternative mating systems [5]. But perhaps their greatest impact is in providing quantitative predictions for conservation management decisions; because projection matrices can use the current structure and size of a population to predict future growth, they are often used to assess extinction risk of the population [6,7]. Moreover, because the models can inform the relative importance of each life stage to growth, they are used in guiding decisions about hunting limits and choosing key life stages to target during conservation [8]. Projection matrices in combination with some knowledge of available habitat area can also help conservation biologists determine the size and number of reserves needed to protect a species from extinction [9].
In this chapter, we review the basics of stage-structured population growth models and explore the underlying matrix algebra that makes them possible. We provide code for MATLAB and R that assumes only a basic understanding of the software interface and work through an example from a population of wild ginseng threatened by overharvesting. We close with a summary of additional analytical techniques involving projection matrices.
When studying wild populations, and especially species of conservation interest, biologists frequently want to predict whether the number of individuals in a population is growing, stable, or shrinking. The simplest models of population growth involve data on the number of births and deaths in a population within a given time frame. Frequently, biologists also collect additional data on the life cycle of an organism, such as how often it reproduces, how many offspring it makes, and how long it lives. Because each of these factors is often dependent on the age or size of an organism, biologists also include data that permits the separation of the life cycle into ages or stages. For example, the chance that an individual plant will survive and reproduce varies considerably with the age and size of the plant; the chance that a seed will survive and germinate, and then that the seedling will establish and grow, is typically very small. Many plants make thousands of seeds with only a few surviving to become new plants. But once a plant becomes established, often its chance of surviving to the next year is relatively high. Expanding a population growth model to include stages in the life cycle allows us to estimate the influence of each stage on population growth, and provides additional information for conservation management decisions.
Therefore, we need a mathematical model that describes a population in the following ways: (1) divides the life cycle of an individual into multiple stages, (2) keeps track of transitions, or how individuals move through those stages, (3) estimates the probability that an individual will die during a stage, and (4) keeps track of reproduction, both in terms of how many new individuals are made for each individual in a given stage and in terms of the stage in which new individuals appear.
First, we will construct a mathematical framework that does these four things. No model reflects the natural world with 100% accuracy, but describing population structure with a mathematical model will allow us to make observations and predictions that can guide our understanding of the dynamics in wild populations. Because the model is essentially a mathematical description of the population, we can use it to explore how the population might grow under certain conditions or how a single stage is affecting overall population growth. We will explain the mathematical theory that allows us to conduct these analyses, and then we will work through examples that use the models to predict the future of the population and identify the life stages that are critical for population growth.
In many models of population growth, life stages are defined based on morphological changes during growth, or changes in size. In some organisms, development leads to natural categories; seeds, seedlings, and reproductive plants, for example, or egg, larva, pupa, and adult in butterflies. In other organisms, sometimes it makes more sense to categorize individuals on the basis of age. Here we will only discuss models that use discrete categories based on life stage, but see [8] for an introduction to age-structured models (also called Leslie matrices). In the mathematical model we are building, the number of life stages must be finite (fixed). To simplify the mathematics, we impose a discrete time structure, where the time can only take the values If is any non-negative integer value, any changes that occur in the interval between time and time are treated as occurring simultaneously at time .
In between time and time , an individual could die, reproduce, move to another life stage, or stay unchanged. In the most general version of the model, all four of these events are possible for every life stage. In addition, the new individuals that appear could start life in any life stage, and individuals could switch between any two life stages in one time interval. Biologically, this model seems unlikely; an adult chicken can’t go back to being a chick, and a mature tree can’t develop in a year. But for the purposes of building a general model, we are going to allow for all possible transitions and the possibility that new individuals appear in every life stage. When we develop models specific to a particular organism, this general framework can be restricted by setting biologically impossible transitions to zero.
If we know how many members there are in each stage at time and we know something about birth rates, death rates, and transitions to other stages, we can predict the number of members in each stage at time . This is the first step in understanding whether the population is growing or shrinking and how each life stage contributes to that change. Mathematically, we divide the life cycle of the organism into stages, which we designate . These stages could be based on changes in development or size; models often begin with a juvenile stage. The number of stages is dependent on the biology of the system and the nature of the data. We designate the number of individuals in stage at time as . Once we have defined the stages and number of individuals in each, we want to find , the expected number of individuals in stage at time . Note that we are considering to be a known number and to be a prediction. Finding this predicted number of individuals in a stage at time depends on knowing the number of individuals in all stages at time because each stage could potentially contribute to . Knowing for all stages , we want to find the expected value of . To do this, we must consider the ways an individual of stage at time can produce an individual in stage at time . They are:
1. Movement between stages. An individual moves from stage to stage . If the probability for this transition in one unit of time is , then the expected number of members produced in stage by a single member in stage in one unit of time is . Note that and may be the same and is the probability that an individual in stage stays in stage after one time interval. In many models death is not considered to be a stage, and if this is the case, then the probabilities across all stages will not necessarily sum to 1. The probabilities are called transition probabilities.
2. Reproduction. The individual in stage produces new offspring, and the average number of offspring generated in one unit of time by one individual of stage that begin life in stage is denoted .
Therefore, if we know the following, we can determine the total number of members expected in stage at time :
1. the number of individuals in each stage at time ,
2. the probability that an individual in each of those stages will become an individual in stage , after one time interval,
3. and the average number of offspring in stage generated by a single individual in stage , denoted .
Using the terms we defined above, we can express our ideas more precisely. Each member that is in stage at time will produce an average of members in stage at time . Since there are members in stage at time , the stage will be expected to result in members in stage at time .
To get , that is the total number of members expected in stage at time , we sum over all stages to get
(7.1)
This formula will tell us how many individuals we expect in a single life stage after one time interval has passed. To keep track of the number of individuals in all life stages simultaneously, we turn to matrix algebra.
The framework provided by linear algebra allows us to build a population growth model that includes all the life stages we have defined and then use that model to gather information about the possible fate of the population. We can build a matrix that encompasses transitions among life stages, reproduction, and mortality for a specified time interval. We can also use the tools of linear algebra to describe properties of the projection matrix that are analogous to the biological characteristics of the population.
To generate a projection matrix for a population, we construct a square matrix that has an entry for each possible transition between the life stages. The entry in the th row and th column of is given by
Estimates for the probabilities and the average number of offspring are based on the relative number of individuals and offspring production by those individuals in each stage in natural populations, typically over multiple years (see [8] for more details on how this is done).
To illustrate the creation of a matrix from transition probabilities, we will work through an example from the biological literature on American ginseng (Panax quinquefolius). Ginseng is a plant native to the eastern United States and grows in forest understories. It is a perennial, and new leaves come up from an underground stem every year. This stem is enlarged and stores nutrients and starch. It also produces a variety of compounds that give ginseng a distinctive odor and taste, and give it value as an herbal medicine. The plants are often harvested from wild populations and the underground stems sold, sometimes for a considerable sum. Ginseng is becoming increasingly rare in many places, and its decline is likely a result of habitat loss and overharvesting. Therefore, it has been the subject of studies that use projection matrices to model the future viability of populations. The data we use for the examples below are from Charron and Gagnon [10].
The life cycle of ginseng lends itself easily to classification into life stages. Seeds spend a year and a half in the soil before germinating, and when they do germinate, they produce a seedling that becomes established in its first year and stores resources in its underground stem for the next year. At the end of the summer, it dies back above ground but the stem stays alive below ground. The next spring, the underground stem sends up a shoot with one compound leaf on it (compound leaves typically have five leaflets). The following year, the plant may make one leaf again, or it may make two leaves. In subsequent years, plants may make three or four leaves; rarely, plants produce five or six leaves. Plants sometimes flower when they have two leaves but more commonly wait until they have three or four leaves. The flowers will develop into fruits with seeds in them, and then these seeds will disperse away from the parent plant and become a new plant. The interval between time and time is a year, and because the data we are using were collected from the wild populations in the fall when the seeds were ripe, we can imagine the year starting at this time in the fall. In our model, we have six life stages: seed, seedling, one-leaved plant, two-leaved plant, three-leaved plant, and four-leaved plant.
An individual in each life stage has a certain probability of dying, staying in its stage, or moving into another stage. For example, we expect that some seedlings will become one-leaved plants in the next year, and some will die (the probability that a seedling will stay a seedling is 0). All nonzero transitions among life stages are illustrated in Figure 7.1.
Figure 7.1 Life stages, transition probabilities, and reproduction (in italics) for a population of American ginseng. Values represent the average number of plants for every one plant that transition to another stage (or remain in the same stage) from one year to the next. In other words, for every 1 seed, 0.15 seedlings will be present in the next year, and for every 1 three-leaved plant 3.90 seeds will be present in the next year. Data taken from Population 2, 1986–1987 in [10].
With six life stages all possibly transitioning into all other stages, we have 36 different elements to keep track of, and we use a matrix:
Each row and column in the matrix represents a life stage; the columns are for the life stage in a given year (time ), and the rows represent the life stages in the next year (time ). Each element in the matrix represents the average number of new plants in the next year that are produced by a single plant in the current year. In other words, if the value in the second column (representing seedlings) and third row (representing one-leaved plants) is , we expect for every one seedling at time there will be 0.21 one-leaved plants in the population at time . To put it another way, 21% of the seedlings will survive to be one-leaved plants in the next year.
The first column in the matrix represents the transition from the stage of being a seed to all other possible stages. Seeds can only become seedlings in the first year; therefore all but one of the values in the first column will be zero. More formally, and drawing from the data presented in Figure 7.1:
The probability that a seed at time :
remains a seed is 0, therefore ,
becomes a seedling at time is 0.15, therefore ,
becomes a one-leaved plant at time is 0, therefore ,
becomes a two-leaved plant at time is 0, therefore ,
Note the probabilities do not sum to 1. This is because not all seeds (in fact only 15%) survive to become seedlings in the following year.
From these probabilities, we know that the first column of the projection matrix is as shown in Table 7.1.
Table 7.1
The first column of the projection matrix for a population of American ginseng. The values in this column are the probability of transitioning from the seed stage to another stage (data from [10]).
To generate the second column of the matrix, we must consider what happens to the seedlings between the first year (time ) and the second year (time ). Seedlings can either die or become plants with one leaf; they do not flower, so they do not make seeds, and they cannot become bigger than a one-leaved plant in just a year. Therefore:
The probability that a seedling at time :
makes seeds at time is 0, therefore ,
stays a seedling at time is 0, therefore ,
becomes a one-leaved plant at time is 0.21, therefore ,
becomes a two-leaved plant at time is 0, therefore ,
From these probabilities, we know that the second column of the projection matrix is as shown in Table 7.2.
Table 7.2
The second column of the projection matrix for a population of American ginseng. The values in this column are the probability of transitioning from the seedling stage to another stage (data from [10]).
To generate the third column of the matrix, we consider what happens to plants with one leaf between time and time . We know from the diagram in Figure 7.1 that one-leaved plants have a probability of staying a one-leaved plant in the next year of 0.55, and a 0.35 probability of becoming a two-leaved plant in the next year. They don’t flower, so they can’t make seeds, and they can’t revert back to being seedlings. Therefore:
The probability that a one-leaved plant at time :
makes seeds at time is 0, therefore ,
becomes a seedling at time is 0, therefore ,
stays a one-leaved plant at time is 0.55, therefore ,
makes a two-leaved plant at time is 0.35, therefore ,
The fourth column of the matrix represents transitions from the two-leaved stage to other stages. There are several possible fates for a two-leaved plant; they occasionally transition back into one-leaved plants between one year and the next, they often stay two-leaved plants in the next year, they sometimes become three-leaved plants in the next year, and they rarely become four-leaved plants in the next year. Occasionally, a two-leaved plant flowers, which means it is generating seeds that are present in the following year. Therefore, based on the data from Figure 7.1:
The average number of seeds per plant generated by two-leaved plants between time and time is 0.27, therefore .
The probability that a two-leaved plant is a seedling at time is 0, therefore .
The probability that a two-leaved plant is a one-leaved plant at time is 0.05, therefore .
The probability that a two-leaved plant is a two-leaved plant at time is 0.45, therefore .
The probability that a two-leaved plant is a three-leaved plant at time is 0.41, therefore .
The probability that a two-leaved plant is a four-leaved plant at time is 0.05, therefore .
If we continue to build the matrix in the same fashion from Figure 7.1, we get the complete projection matrix in tabular form, presented in Table 7.3. The same information can be described by the following projection matrix:
(7.2)
Table 7.3
The projection matrix for a population of American ginseng (data from [10]).
While the general model allows for individuals in all life stages to potentially transfer to any other life stage within one time interval and reproduce offspring that appear in any other life stage, the biological reality of our study species means several transitions do not actually occur. Therefore, we have a number of zeroes in our matrix, and only the first row reflects the creation of offspring. In addition, the probability that a four-leaved plant stays a four-leaved plant is an interesting case; because no four-leaved plants died or reverted to a smaller stage over the course of the study, the probability is one, and it appears that the four-leaved plants are immortal. This cannot happen in reality, but because these probabilities are our best estimates based on data and we have no additional information from which to estimate the “true” number, we will leave it at one. Gathering additional years of data from this population could generate a more biologically realistic estimate.
It is also useful to point out that all matrix elements except those involving the production of seeds are transition probabilities and are thus less than or equal to one. However, for the first row, the values represent the average number of seeds made by plants in the other stages; therefore they may be greater than 1 and they are values that represent reproduction.1 As previously mentioned, if we sum the transition probabilities for a particular (non-seed) stage at time , they do not necessarily total to 1; that is because the probabilities only reflect the fraction of individuals that survive from one stage to the next.
We have constructed a projection matrix to summarize how plants growing in a wild population are changing from year to year; the matrix can now serve as a model for the wild population, and we can manipulate the model to learn more about the population. First, we will use this matrix to predict the number of individuals we expect in each stage after one year, given a starting number of individuals in each stage chosen by us. The number of individuals in each stage at time can be expressed as a vector; for example, if we have a population of ginseng with 800 seeds, 90 seedlings, 56 one-leaved plants, 23 two-leaved plants, 31 three-leaved plants and 11 four-leaved plants, the vector would look like this:
(7.3)
Here we go through the steps of conducting the calculations for predicting the number of two-leaved plants in the population at time , and then we present the matrix algebra to calculate the number of individuals in all life stages simultaneously.
Using the projection matrix (7.2) that we constructed above, we can predict the number of two-leaved plants at time (the next year) by applying Eq. (7.1). Recall that is the number of members in stage j at time , the entry is the average number of members that a single member in stage j produces in stage i between and , and that two-leaved plants are in stage 4. So
Written out in words, this becomes:
(number of seeds at time ) (probability that a seed becomes a two-leaved plant between and )
(number of seedlings at time ) (probability that a seedling becomes a two-leaved plant between and )
(number of one-leaved plants at time ) (the probability that a one-leaved plant becomes a two-leaved plant between and )
(number of two-leaved plants at time ) (the probability that a two-leaved plant stays a two-leaved plant between and )
(number of three-leaved plants at time ) (the probability that a three-leaved plant becomes a two-leaved plant between and )
(number of four-leaved plants at time ) (the probability that a four-leaved plant becomes a two-leaved plant between and ) = 800 × 0 + 90 × 0 + 56 × 0.35 + 23 × 0.45 + 31 × 0 + 11 × 0 = 29.95.
Therefore, we predict that in the next year the population will have 29.95 two-leaved individuals.
To expand our predictions to include every life stage in the following year, we will make full use of matrix algebra. Note that what we have done in the previous paragraph is multiply the vector (7.3) by the fourth row of the projection matrix. To determine the number of individuals in every stage at time , we can multiply the entire matrix by the vector. We can represent this group of equations as a matrix equation
(7.4)
If we let
and we indicate vectors and matrices in bold, then Eq. (7.4) can be written in matrix form as
(7.5)
To determine the number of individuals in every stage after a year for our example, we use the following equation, which multiplies the matrix (7.2) by the vector (7.3):
Matrix multiplication by hand is tedious and there are many software systems that can be used to carry out matrix computations. Here we give the commands for loading the matrix A and the vector n into MATLAB or R and performing the multiplication. In MATLAB, the projection matrix A can be defined with the following command:
A = [0 0 0 0.27 3.90 40.00; 0.15 0 0 0 0 0; 0 0.21 0.55 0.05 0 0; 0 0 0.35 0.45 0 0; 0 0 0 0.41 0.78 0; 0 0 0 0.05 0.19 1.0]
Similarly, we can define the vector n with the following command:
To multiply the matrix A by the vector n, we use the command:
To conduct the same analysis in R, first define matrix (7.2) as A using:
A <- matrix(c(0, 0, 0, 0.27, 3.90, 40.00, 0.15, 0, 0, 0, 0, 0, 0, 0.21, 0.55, 0.05, 0, 0, 0, 0, 0.35, 0.45, 0, 0, 0, 0, 0, 0.41, 0.78, 0, 0, 0, 0, 0.05, 0.19, 1.0), nrow = 6, byrow = T)
Vector (7.3)can be entered with the command:
To multiply A by n, use:
The resulting vector is then
(7.6)
The interpretation of the vector components is presented in Table 7.4. Note that while we expect fewer seeds and seedlings in the next year, the number of three- and four-leaved plants has increased.
Table 7.4
Predicted number of individuals at time for a population of American ginseng based on the computation from Eq. (7.5).
In some cases, it might be helpful to predict the number of individuals in each stage one time interval into the future, but most of the time we want to make predictions further into the future. From the projection matrix we can compute the predicted average number of members at any later time. For example, if we wanted to predict the number of individuals in each stage after two time intervals, we could calculate the number in each stage after one interval and then multiply the resulting vector by the transition matrix again. Mathematically, we can write
The vector of the number of individuals in each stage at time is multiplied by the projection matrix once for each time interval; in this case, since there are two time intervals, the projection matrix is raised to the second power.
To predict the distribution of individuals over stages at any time, we multiply the initial vector of the number of individuals in each stage class by the projection matrix to a power equal to the number of time intervals. Therefore, for any positive integer
(7.7)
After each generation, the population will grow or shrink in overall size, but the proportion of individuals in each stage at time becomes very similar to the proportion at time for large values of k. For example, if we predict the number of individuals in each stage class for a population with projection matrix (7.2) and the initial distribution of vector (7.3) after 10 years, we get the following distribution:
For the same population and initial distribution, we predict the following distribution after 11 years:
To compare these two vectors in terms of the relative distribution of individuals across stages, we can standardize by dividing each value in the vector by the total number of individuals in all stages. If we do so, we get:
While the total number of individuals in the population is changing, the relative proportion of individuals in each stage changes very little between years after 10 years have passed. After a long period of time (as , this distribution approaches a stable state. The stable distribution provides useful information about a population, and we next investigate the method by which it is calculated.
We begin with reviewing some key concepts from linear algebra and explore the theoretical structure that makes the calculation of the stable distribution possible. Readers without a linear algebra background can find the details on linear algebra basics in any linear algebra text, including matrix algebra, determinants, bases of linear spaces, and linear combinations of vectors (see, e.g., [11]). We will return to the application of the model to biological data in Section 7.9.
If A is an matrix, the nonzero vector v is an eigenvector of A with eigenvalue if
(7.8)
In other words, multiplying an eigenvector by its associated eigenvalue gives the same answer as multiplying the eigenvector by the entire matrix. This characteristic of eigenvalues and eigenvectors makes them handy tools for exploring the properties of matrices.
To find the eigenvalues and eigenvectors of a matrix we use the following equation:
The expression is a polynomial with being the variable. This polynomial is called the characteristic polynomial for A. When we set the characteristic polynomial equal to 0, we obtain the eigenvalues of A. An matrix has at most n eigenvalues, each with its own eigenvectors. After each eigenvalue has been determined, then its corresponding eigenvector(s) is (are) found by solving the equation . For example, suppose that a certain matrix A is defined as:
Then
and
Therefore, the eigenvalues of A are and .
Each eigenvalue has at least one eigenvector. The total possible number of linearly independent (that is, different no matter how they are scaled) eigenvectors for an eigenvalue is the algebraic dimension of the eigenvalue. In our example above, the eigenvalue 1 has only one linearly independent eigenvector, and the eigenvalue −4 has a maximum of two linearly independent eigenvectors. However, there could be fewer such eigenvectors. The actual number of eigenvectors for each eigenvalue can be determined only by more calculations (see [11]). The geometric dimension of the eigenvalue is the number of linearly independent eigenvectors of an eigenvalue that actually exist. Thus, the geometric dimension is less than or equal to the algebraic dimension. The set of linear combinations of the eigenvectors for an eigenvalue is its eigenspace. Fortunately for us, if the matrix we are using to describe the population meets certain characteristics, we can isolate a single eigenvalue with one associated eigenvector (and therefore an eigenspace of 1) that describes the growth and stable distribution of our population. The Perron-Frobenius theorem makes this possible.
In calculating the stable distribution that is generated by a particular projection matrix, we will take advantage of a theorem in matrix algebra; the Perron-Frobenius theorem, named after the German mathematicians who proved it in the early century. The theorem describes the properties of a matrix that satisfies a certain set of conditions (see [12] for a proof). If the projection matrices generated from biological data meet these conditions, the theorem allows us to use eigenvalues and eigenvectors to predict population growth and the stable distribution of individuals across stages. First, we will examine the theorem and its assumptions, and then we will explain how to apply it to calculating the stable distribution. The Perron-Frobenius theorem states: If A is a positive matrix, that is if for all and , then
a. There is an eigenvalue r of A for which r is positive and for any other eigenvalue of A. (Note that A may have complex eigenvalues.)
b. The algebraic dimension, and thus the geometric dimension, of the eigenspace of r is 1.
c. There is an expression of the eigenvector of r in which all of the entries are positive.
In summary, the theorem states that positive, square matrices have a single positive eigenvalue that is larger than the others, and associated with that eigenvalue is a single eigenvector with only positive entries. We can use this eigenvalue and its associated eigenvector to calculate the stable distribution of individuals across stages for a projection matrix after an infinite number of time intervals. This distribution and its eigenvalue provide information about the possible future growth and structure of a population.
The assumptions of the theorem are not always met by models built from biological examples, but the theorem still applies if certain additional assumptions are met. Projection matrices are structured to have the same number of columns and rows, because we allow for the possibility that every stage could transition to every other stage. Therefore, the assumption of a square matrix is met. A matrix A is positive when for all i and j. (In other words, a matrix is positive when every element in the matrix is greater than 0.) Our example matrix (7.2) is clearly not a positive matrix; it has a number of elements that are 0. Fortunately, the Perron-Frobenius theorem still holds if the matrix is nonnegative, irreducible, and primitive.
A nonnegative matrix may contain zeroes but has no negative elements; that is for all i and j. Models constructed from biological data do not contain negative values because organisms can only contribute neutrally or positively to growth; therefore, we expect all the matrices we construct to be nonnegative.
A matrix is irreducible if there is a path through the life cycle that connects every stage to every other stage, or more generally, beginning at any stage i it is possible to get to the stage j in a finite number of steps. If a matrix contains a stage that acts only as a sink and does not contribute to reproduction or any other life stage, the matrix is reducible. Reducible matrices are uncommon but they can occur when a post-reproductive life stage is included in the model (Figure 7.3). (See [1] for more information on working with reducible models.) Our ginseng example is irreducible because every individual has the potential to contribute to reproduction, either directly or by passing through other life stages first.
Figure 7.3 Life stages, transitions between stages, and reproduction for a pod of killer whales. Adults are divided into two stages; reproductive adults, which contribute to the “yearling” stage by having offspring, and “post-reproductive adults,” which do not reproduce or revert to other stages, making the matrix reducible. Modified from [13].
A nonnegative, irreducible matrix must also be primitive for the Perron-Frobenius theorem to still hold true. A primitive matrix is a matrix that, when raised to a sufficiently high power, contains only positive elements. In other words, there is a positive integer m for which for every entry in the matrix A. Again, it is uncommon for a projection matrix constructed from biological data to fail to be primitive, but it can occur if organisms only reproduce after they reach a certain stage and then die before reproducing again (Figure 7.4). If a matrix contains a self-loop (that is, individuals in a stage can stay in that stage in the following time step), it is primitive. We can also raise the matrix to a higher power and see whether the elements all become positive to determine whether it is primitive. Reducible matrices will also fail this test; at least one element will always remain 0 because it is not connected back into to the rest of the matrix through any path.
Figure 7.4 Life stages, transitions between stages, and reproduction for Chironomus riparius, a midge species. Adults have wings, and lay their eggs in water. The eggs hatch into larvae, which pass through four larval stages before pupating. Only adults reproduce, and adults reproduce only once, resulting in a transition matrix that is not primitive. Modified from [14].
If we raise matrix (7.2) to the fifth power, we get:
Thus, the matrix in our ginseng example is primitive. It is also nonnegative and irreducible; therefore we can continue using the Perron-Frobenius theorem to determine the stable distribution of individuals across stages.
After you have entered a matrix and named it A, the MATLAB command for this calculation is:
In R, you can multiply the matrix A by itself five times using the following commands:
To determine the stable state of a population, we want to know the vector that describes the relative distribution of individuals across stages as the number of time intervals goes to infinity. While in reality no population will exist an infinite amount of time, this distribution can act as a descriptor of the population and gives us the ability to make comparisons among life stages within the distribution and with stable distributions from other populations.
Here, we will work through the steps to explain why the eigenvector associated with the largest eigenvalue allows us to determine the stable distribution of individuals across stage classes. Suppose that A is a projection matrix that meets the assumptions of the Perron-Frobenius theorem and that is any vector. We can describe the vector using the eigenvectors that form a basis for this vector space. A set of vectors is a basis for a vector space V provided any vector in V can be written as a linear combination of in exactly one way. Suppose that is a basis of eigenvectors and let denote the eigenvalue of . Thus,
where are real-valued constants.
Multiplying by the matrix A from the left, we get:
Because is an eigenvector of A and , we have:
Recall from Eq. (7.7) that we can raise A to the power to determine the distribution of individuals across stages after units of time. Similarly,
(7.9)
Now we need to determine what happens when k goes to infinity. Since the dominant eigenvalue is greater than the absolute value of each of the other eigenvalues, dividing another eigenvalue by the dominant eigenvalue will always generate a value less than one. If we raise that value to the power, we expect it to approach zero as k approaches infinity. Expressed mathematically, we mean: Since for , we have for and so as . Therefore, if we divide both sides of Eq. (7.9) by , we get
When we take the limit as , all terms except the one containing the dominant eigenvector will give a limit of zero. Thus,
(7.10)
Applied to the vector of the number of individuals at each of the stages at time , Eq. (7.10) shows that the eigenvector associated with the dominant eigenvalue describes a long-range, and stable, distribution of individuals across stages. More specifically, if is the dominant eigenvalue for , combining Eqs. (7.7) and (7.10) yields
(7.11)
As before, it is important to note that this distribution does not give us the proportion (relative distribution) of individuals at each of the life stages. To obtain the relative distribution at the steady state, each of the elements of the vector will have to be divided by the sum of all the elements.
From a practical perspective, Eq. (7.11) states that to find the relative distribution at the steady state, one needs to compute the dominant eigenvalue of the projection matrix, calculate an eigenvector for the dominant eigenvalue, and then normalize it by dividing each of its components by the sum of its elements.
To illustrate the convergence from Eq. (7.11), we turn again to the ginseng example with projection matrix A from Eq. (7.2) and distribution vector from Eq. (7.3). Based on Eq. (7.10), we expect that if we calculate the dominant eigenvalue for the matrix and evaluate the expression for large values of k, we will obtain approximations for both the steady state distribution of individuals across stages and of the eigenvector associated with the dominant eigenvalue.
Using software for the computations (see next section for the appropriate MATLAB and R commands), we can find that the dominant eigenvalue for matrix A is . We can then compute the proportions of individuals across stages k years into the future from:
If we let k =100, we find
In order to compare this vector with others, we must normalize it to a sum of 1. The sum of the entries of the vector above is 1051.59, so therefore
Letting k =200 gives
The sum of the entries of the vector above is 1060.81, and
The entries for the normalized vectors for and are actually equal to 6 decimal places, illustrating the convergence established by Eqs. (7.10) and (7.11). Furthermore, based on Eq. (7.8), if this distribution is the same as the eigenvector associated with the dominant eigenvalue, we should be able to multiply it by either A or and obtain the same vector. When we do this, we find
so the equilibrium state is the normalized eigenvector for the dominant eigenvalue.
Now that we have explained the mathematics behind determining the stable distribution of individuals across stages, we can begin to interpret the biological meaning of eigenvalues and eigenvectors. As shown above, the normalized long-range stable distribution of individuals across states is the same as the normalized eigenvector associated with the dominant eigenvalue. This eigenvector is also sometimes called the right eigenvector. Therefore, if we can calculate this eigenvector from a projection matrix and normalize it so that the sum of the entries is 1, we get the proportion of individuals in each stage when the population is in a stable state.
The dominant eigenvalue itself also provides useful biological information; it is an estimate of population growth. If the eigenvalue is larger than 1, the population is predicted to grow, and if it is less than 1, the population will diminish. This estimate is useful when trying to understand the future of a population, but it assumes that the projection matrix stays constant over time, which is biologically unrealistic. As a population grows, competition and resource availability are likely to change survival and reproduction rates, resulting in changes in the projection matrix. In addition, a constant projection matrix ignores any changes in the environment from year to year. The estimate of growth provided by the dominant eigenvalue, therefore, is more a measure of what the population is capable of, given its current state, than a measure of what will actually happen in the future. In addition, the quality of our estimates is directly related to the accuracy of the initial data that were used to generate the projection matrix. Estimating projection matrices from multiple years of data and incorporating environmental uncertainty into the model with a stochastic approach can lead to more biologically accurate estimates; see [8] for more details. We can calculate the dominant eigenvalue and its associated eigenvector using MATLAB and R, as demonstrated below. When we do so for our ginseng example, we find that the estimated growth rate of the population is 1.1841. Because this value is larger than one, we expect that if the projection matrix did not change this population would grow. Through the computations below, we obtain again the steady state distribution of individuals across stages (see Table 7.5). As expected, they are the same as the approximated values we obtained for large in the previous sections. We predict that if the population illustrated in Figure 7.1 were to reach a stable state, the majority of individuals in the population would be seeds; this is not surprising from a biological perspective, given the low rate at which seeds become seedlings. It is also interesting to note that we predict more four-leaved individuals than two- and three-leaved individuals.
Table 7.5
Steady state distribution of individuals across stages for a population of American ginseng (data from [10]).
Stage | Number of Individuals |
Seeds | 0.8067 |
Seedlings | 0.1022 |
One-leaved plants | 0.0352 |
Two-leaved plants | 0.0168 |
Three-leaved plants | 0.0170 |
Four-leaved plants | 0.0221 |
We can define the projection matrix (7.2) as A in MATLAB with the following command:
A = [0 0 0 0.27 3.90 40.00; 0.15 0 0 0 0 0; 0 0.21 0.55 0.05 0 0; 0 0 0.35 0.45 0 0; 0 0 0 0.41 0.78 0; 0 0 0 0.05 0.19 1.0]
We can generate all eigenvalues of A using the command eig (A), and we want to give the vector of eigenvalues a name so we can use it in later commands, so we define eig (A) as E with the command
The largest eigenvalue is the dominant eigenvalue, and the growth rate of the population. To identify the largest eigenvalue, first we ask MATLAB to identify the index of the maximum eigenvalue, then we rename the maximum eigenvalue “lambda” with the following two commands:
To calculate the right eigenvector, which gives the stable distribution of individuals across stages, we use the following command to generate eigenvectors and their associated eigenvalues:
(We are using the same eig (A) command as before, but now we are requesting MATLAB to calculate right eigenvectors, which we define as W, and the associated eigenvalues, which MATLAB gives us as diagonals in the matrix that we have labeled d.) Next, we must pull out the dominant eigenvector and find its associated eigenvalue using:
Next, we need to rescale the dominant right eigenvector so that it sums to 1 and represents proportions of individuals across stage classes:
If you are using R to conduct your calculations, first download and install the popbio package. We can define our projection matrix (7.2) as A in R with the following command:
A <- matrix (c(0, 0, 0, 0.27, 3.90, 40.00, 0.15, 0, 0, 0, 0, 0, 0, 0.21, 0.55, 0.05, 0, 0, 0, 0, 0.35, 0.45, 0, 0, 0, 0, 0, 0.41, 0.78, 0, 0, 0, 0, 0.05, 0.19, 1.0), nrow = 6, byrow = T)
To identify the dominant eigenvalue, which is the growth rate of the population, we use the command
To get the stable distribution of individuals across stages, standardized to total to 1, we use the command
Note that the R commands are tailored specifically for generating values directly related to projection matrices in biology, rather than presenting all of the eigenvalues and eigenvectors for the matrix. If you would like to see all the eigenvalues and eigenvectors for your matrix, use the command
In this chapter, we have introduced you to the basic structure and theory behind using projection matrices in estimating population growth and structure. Biologists and mathematicians have gone much further in exploring methods to account for natural variability in populations. Sensitivity analysis uses partial derivatives of matrix elements to determine the relative impact of each element on and therefore examines the importance of each life stage on overall population growth [1]. Computer simulations are used to generate multiple estimates of growth rates under different environmental states, providing a better understanding of how robust a population is to environmental change [8]. Models can also incorporate spatial data on patches or multiple populations, providing information on the relative contribution of each patch or population to overall growth [8]. The flexibility and power of the projection matrix in population growth models has led to many important discoveries, contributions, and practical applications in ecology and conservation biology. We encourage you to explore the topic further through the literature cited here, or better yet, collect your own data from a wild population and apply the tools you learned here to predict its future growth.
1. Caswell H. Matrix population models: construction, analysis, and interpretation. 2nd ed. Sunderland, MA: Sinauer Associates; 2001.
2. Burns JH, Blomberg SP, Crone EE, et al. Empirical tests of life-history evolution theory using phylogenetic analysis of plant demography. J Ecol. 2010;98:334–344.
3. Steets JA, Knight TM, Ashman T-L. The interactive effects of herbivory and mixed mating for the population dynamics of Impatiens capensis. Am Nat. 2007;170:113–127.
4. Knight TM, Caswell H, Kalisz S. Population growth rate of an understory herb decreases non-linearly across a gradient of deer herbivory. Forest Ecol Manag. 2009;257:1095–1103.
5. Le Corff J, Horvitz CC. Population growth versus population spread of an ant-dispersed neotropical herb with a mixed reproductive strategy. Ecol Model. 2005;188:41–51.
6. Lande R. Demographic models of the northern spotted owl (Strix occidentalis caurina). Oecologia. 1988;75:601–607.
7. Crone EE, Menges ES, Ellis MM, et al. How do plant ecologists use matrix population models?. Ecol Lett. 2010;14:1–8.
8. Morris WF, Doak DF. Quantitative conservation biology: theory and practice of population viability analysis. Sunderland, MA: Sinauer Associates; 2002.
9. Menges ES. Population viability analyses in plants: challenges and opportunities. Trends Ecol Evol. 2000;15:51–56.
10. Charron D, Gagnon D. The demography of northern populations of Panax quinquefolium (American ginseng). J Ecol. 1991;79:431–445.
11. Lay D. Linear algebra and its applications. 3rd ed. Boston, MA: Addison Wesley; 2002.
12. Horn RA, Johnson CR. Matrix analysis. Cambridge, England: Cambridge University Press; 1985.
13. Brault S, Caswell H. Pod-specific demography of killer whales (Orcinus orca). Ecology. 1993;74:1444–1454.
14. Charles S, Ferreol M, Chaumot A, Péry ARR. Food availability effect on population dynamics of the midge Chironomous riparius: a Leslie modeling approach. Ecol Model. 2004;175:217–229.
15. Schwarzkopf T, Trevisan MC, Silva JF. A matrix model for the population dynamics of Hyptis suaveolens, an annual weed. Ecotropicos. 2009;22:23–26.
1It is somewhat easy to confuse the projection matrix with the transition matrix of Markov chains, because some entries in the projection matrix are transition probabilities. We caution against trying to make connections between the two.