Bad character heuristic

The Boyer-Moore algorithm compares the pattern and the text string in the direction from right to left. It uses the bad character heuristic to shift the pattern. According to the bad character shift concept, if there is a mismatch between the character of the pattern and the text, then we check if the mismatched character of the text occurs in the pattern or not. If this mismatched character (also known as a bad character) does not appear in the pattern, then the pattern will be shifted next to this character, and if that character appears somewhere in the pattern,  we shift the pattern to align with the occurrence of that character with the bad character of the text string.

Let's understand this concept by using an example. Consider a text string (T) and the pattern = {acacac}. We start by comparing the characters from right to left, that is, character b of the text string and character c of the pattern. They do not match, so we look for the mismatched character of the text string, that is, b, in the pattern. Since it does not occur in the pattern, we shift the pattern next to the mismatched character, as shown in the following diagram:

Let's look at another example. We start by comparing characters of the text string and the pattern from right to the left, and we get a mismatch for the character d of the text. Here, the suffix ac is matched, but the characters d and c do not match, and the mismatched character d does not occur in the pattern. Therefore, we shift the pattern to the mismatched character, as shown in the following diagram:

Let's consider another example case for the bad character heuristic. Here, the suffix ac is matched, but the next characters, a and c, do not match, so we search for the occurrences of the mismatched character a in the pattern. Since it has two occurrences in the pattern, we have two options so that we can align the mismatched character, as shown in the following diagram. In such a situation, where we have more than one option to shift the pattern, we move the pattern with the minimum amount of shifts to avoid any possible match. (In other words, it would be the rightmost occurrence of that character in the pattern.) If we would have only one occurrence of the mismatched character in the pattern, we can easily shift the pattern in such a way that the mismatched character is aligned.

In the following example, we would prefer option 1 to shift the pattern:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset