Types of Particle Swarm Optimizations

The following is a list of just some of the variants of Particle Swarm Optimization:

  • Traditional Particle Swarm Optimization
  • Canonical Particle Swarm Optimization
  • Fully informed Particle Swarm Optimization

Let's now talk a little bit about the theory behind swarm intelligence, and then we'll move into two of the more specialized types of study in that field: Particle Swarm Optimization and ant colony optimization, both of which can be used as alternatives to backpropagation for training a neural network's weights!

However fascinating and intriguing you find this, please remember that nothing is perfect and there's no single silver bullet that works for everything. This is a fascinating field of study, and entire books have been written on the subject. However, we always need to keep in mind the No Free Lunch Theorem for Optimization.

The No Free Lunch Theorem for Optimization states that no single algorithm can be the best choice for every optimization problem. The success of an algorithm on one specific set of problems does not guarantee that it will succeed on all of them. More concretely, when averaged over every possible optimization problem, all optimization techniques perform equally well, regardless of how well any one of them performs on a particular subset of problems.

In a very well written paper titled A time performance comparison of Particle Swarm Optimization in mobile devices, written by Luis Antonio Beltrán Prieto, Zuzana Komínkova-Oplatková, Rubén Torres Frías, and Juan Luis Escoto Hernández, Particle Swarm Optimization is described like this:

"PSO is an optimization technique developed by Kennedy and Eberhart inspired by the collective behavior of animal groups, such as swarms of insects, to build a swarm of particles, i.e., a set of candidate solutions which flow through the parameter space generating trajectories driven by the best individuals. The initial population (swarm) consists of random solutions (particles) for the problem and is considered as a population of homogeneous agents which interact locally with other individuals without any central control. As a result, collective behavior is generated, thus evolution relies on cooperation and competition among individuals through the different epochs (generations). Each particle defines trajectories in the parameter space according to a motion function which is affected by velocity, inertia, cognitive coefficient and social coefficient. The objective is to find the global best solutions by stochastic weighting of the aforementioned elements. The process is iterative until a stopping criterion is met."

More intuitive analogies for Particle Swarm Optimization are birds and how they behave collaboratively, or a swarm of bees and how they determine which flowers to visit or which humans to attack! If you've ever watched a flock of birds flying or inadvertently knocked down a bee's nest then you know exactly what I am referring to.

Now, instead of dealing in just theory, let's take a hypothetical journey, a Treasure Hunt. I am intentionally going to make this as verbose as possible to ensure the analogy fits the problem space. It goes something like this.

You and several of your friends are in a mountainous region trying to find a hidden treasure worth a lot of money. We are not quite sure where the treasure is located, but we do know that it is in the deepest valley in the region. This equates to the minimum elevation in terms of height above sea level.

Let us also state that all our friends can communicate with one another using their cell phones (let's assume we have cell service here!). Let's also assume for now that our cell phones have GPS apps on them that tell us the elevation we are currently at. We will search each day for the treasure until we find it, and at the end of each day we will have either found the treasure and are rich, or we need to update our information and try again the next day. So, here's what each person has:

  • A cell phone with a GPS app to determine elevation.
  • Pen and paper to track our information at the end of each day. On this we will write the best position we have found (individually), which is our personal best, or PBEST. We will also write on this paper the best position that the entire team has found thus far, being our global best value or GBEST.

The following are the rules that we need to follow in our search:

  • Each person will start in a random location and in a random direction. We determine our elevation right away and write it on our paper. It would be best for us if each person was spread out as much as possible so that we can be efficient and cover more ground, but this is not necessary.
  • Our journey will take T days; at this point, we do not know what that value is or will be.
  • Every morning we will plan our day.
  • Communications can only take place at the end of each day.
  • Each morning, everyone compares the elevations reported the previous evening and updates GBEST on their paper.
  • GBEST is the only piece of information each person can share (location and elevation).
  • Each person will update PBEST on their paper if they find a better position.
  • PBEST information is not shared; no one cares about anything but GBEST.
  • Take note of this one: to move each day, each person takes (for instance) x steps in the same direction as the previous day, y steps in the direction of their PBEST, and z steps in the direction of GBEST. Confused?
  • The step counts are random, as we need some randomness in the search to create a stochastic search pattern for everyone as a collective group (that is, a flock or swarm of people).
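
That daily movement rule can be sketched in a few lines of Python. This is just my own illustration of the rule above; the function name `daily_move` and the example coordinates are made up for the sake of the analogy:

```python
import random

def daily_move(position, last_step, pbest, gbest):
    # Random step counts x, y, z make each day's move stochastic:
    # some momentum from yesterday, a pull toward your own best spot,
    # and a pull toward the team's best spot.
    x, y, z = random.random(), random.random(), random.random()
    step = [x * ls + y * (pb - p) + z * (gb - p)
            for p, ls, pb, gb in zip(position, last_step, pbest, gbest)]
    return [p + s for p, s in zip(position, step)], step

# One hiker at (3, 4): yesterday's direction (1, 0),
# personal best at (2, 5), team's best at (0, 0).
new_position, new_step = daily_move([3.0, 4.0], [1.0, 0.0],
                                    [2.0, 5.0], [0.0, 0.0])
```

Note that the pulls toward PBEST and GBEST grow with distance: the further a hiker strays from a good spot, the harder they are tugged back toward it.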

With these few rules behind us, we can start our journey to the treasure. The team as a collective will keep locating different regions while watching the GBEST location found thus far. There is no guarantee of course that we will find the treasure, or that we will find it in the minimal number of days, but generally our search should be effective. Remember, no individual knows the exact location of the treasure, but cooperates with the swarm to develop collective intelligence to help find the treasure faster. For sure, it's better than a completely random search!

Let's try and plot out our steps in pseudo-pseudo-code:

  1. Initialize a population of random solutions. For n decision variables, we have an n-dimensional space in which each candidate solution exists as a particle. Each particle has n variables and stores the best fitness it has found for itself and the best found by the team.
  2. For each iteration (until we hit either an iteration limit or a target fitness value), calculate each particle's fitness, update its best fitness found so far (PBEST), and communicate this to the swarm.
  3. Identify GBEST by comparing all the information received from the collective swarm.
  4. Compute a velocity that will take us in the direction of GBEST, considering both our PBEST and GBEST.
  5. Move each particle by one time step along its velocity vector.
  6. Over time, each team member (our particles in the swarm) will identify better GBEST values and navigate towards them, thus also improving their PBEST at the same time.
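
The six steps above can be sketched as a minimal Python implementation. This is my own sketch, not code from the chapter: the function name `pso` and its parameters are assumptions, and I use Clerc's constriction-style values (w = 0.729, c1 = c2 ≈ 1.49) rather than the 2.0 default discussed later, since 2.0 is usually paired with velocity clamping to stay stable:

```python
import random

def pso(fitness, dim, n_particles=20, iters=100,
        w=0.729, c1=1.49445, c2=1.49445, bounds=(-10.0, 10.0)):
    lo, hi = bounds
    # 1. Initialize a population of random solutions (the swarm).
    pos = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    # 3. GBEST is the best of all the personal bests.
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # 4-5. Velocity = inertia + cognitive pull + social pull,
                # then move one time step along that velocity.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            # 2, 6. Update PBEST, and GBEST when a particle beats the swarm.
            if f < pbest_fit[i]:
                pbest_fit[i], pbest[i] = f, pos[i][:]
                if f < gbest_fit:
                    gbest_fit, gbest = f, pos[i][:]
    return gbest, gbest_fit

# The "deepest valley" of the sphere function sits at the origin.
best, best_fit = pso(lambda p: sum(x * x for x in p), dim=2)
```

Running this on the 2-D sphere function, the swarm homes in on the origin, just as our treasure hunters home in on the deepest valley.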

With Particle Swarm Optimization we have three basic components that we should briefly discuss. They are, in no particular order:

  • Position: Similar to the location in the preceding analogy, referring to the parameter values. This refers to where a particle (bird, bee, fish, and so on) is in an x-dimensional search space.
  • Velocity: Similar to the movement direction in the preceding analogy, it is used for storing velocity, which will update each particle's position.
  • Fitness: Similar to the elevation in the preceding analogy, this shows how fit the particle is.

Velocity is the main driver of our Particle Swarm Optimization. It considers the current position of the particle, the best position found by the entire swarm (GBEST), and the best position found by the particle itself (PBEST). Mathematically, it breaks down like this:

v_i(t+1) = W * v_i(t) + C1 * r1 * (PBEST_i - x_i(t)) + C2 * r2 * (GBEST - x_i(t))
x_i(t+1) = x_i(t) + v_i(t+1)

Here r1 and r2 are random numbers drawn uniformly from [0, 1] at every update, and W, C1, and C2 are the hyperparameters described next.
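
A single 1-D update makes the rule concrete. The numbers below are made up purely for illustration:

```python
import random

# One 1-D update of the velocity rule:
#   v = W*v + C1*r1*(PBEST - x) + C2*r2*(GBEST - x)
W, C1, C2 = 0.7, 2.0, 2.0
x, v = 5.0, 1.0            # current position and velocity
pbest, gbest = 4.0, 1.0    # this particle's best and the swarm's best

r1, r2 = random.random(), random.random()  # fresh randomness each update
v = W * v + C1 * r1 * (pbest - x) + C2 * r2 * (gbest - x)
x = x + v                                  # position follows velocity
```

Since both PBEST (4.0) and GBEST (1.0) lie to the left of the particle at x = 5.0, both pull terms are negative, so the particle is tugged back toward the better positions it and the swarm have seen.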

There are also three hyperparameters that we should mention as you will be hearing about them a lot.

Inertia Weight (W): The inertia weight controls the impact of the previous velocity on the current one. It regulates the trade-off between the swarm's global and local exploration abilities. If the inertia is high, particles resist changing their direction and thus turn around much more slowly. This implies a larger exploration area, but less chance of quickly converging towards the optimum.

If inertia is small, then only a small amount of momentum carries over from the previous time step; this allows for much faster changes in direction. The trade-off is that the swarm searches a smaller area and can converge prematurely on a local optimum.

A common strategy is therefore to start with a high inertia weight and decrease it over the course of the run: early on, the particles explore globally, and as the weight drops, they enter the optimal value area and refine the best region found so far.
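
One simple decreasing-inertia schedule is a linear decay. This is a sketch of the common textbook approach; the endpoints 0.9 and 0.4 are popular conventions, not values prescribed by this chapter:

```python
def inertia(t, T, w_start=0.9, w_end=0.4):
    # Linearly decay the inertia weight from w_start at iteration 0
    # down to w_end at iteration T: explore widely early, refine late.
    return w_start - (w_start - w_end) * t / T

# Over a 100-iteration run, the weight falls from 0.9 to 0.4.
schedule = [round(inertia(t, 100), 3) for t in (0, 50, 100)]
# schedule == [0.9, 0.65, 0.4]
```

Plugging `inertia(t, T)` in for a fixed W in the velocity update gives the swarm its exploration-first, exploitation-later behavior.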

  • C1: Cognitive intelligence
  • C2: Social intelligence

It should be noted that C1 and C2 are positive constants that control how far an individual particle can move within a single iteration. Lower values allow particles to roam further from the targeted regions before being reined in. Higher values result in more abrupt movements toward, or past, the targeted region. By default, we will set both values to 2.0.

You should experiment with the cognitive intelligence and social intelligence values, as sometimes different values lead to improved performance.