6.3 Password Guessing

Password guessing takes many forms. Several unusual stories have appeared over the years on computer security discussion groups.

An Incident: A woman found a coworker using her password-protected computer. She confronted the coworker, who admitted to having found her password by phoning the “Psychic Friends Hotline.” A helpful psychic had told her the password.

The story is probably an urban legend, but it carries a nugget of truth. Careless or untrained computer users often choose passwords with a strong personal association. Through careful listening and observation of such people, a potential attacker can construct a list of likely passwords. Traditionally, a so-called “psychic” establishes credibility by speaking of things apparently unknown to others. Such psychics develop this information by studying and observing the target and by making shrewd guesses. This is also an effective strategy for guessing passwords.

In the 1980s, the U.S. government published guidelines for using passwords. At the time, there were two perceived risks to passwords: disclosure (intentional or accidental) and trial-and-error guessing. Neither attack would necessarily be discovered immediately, if at all. To limit the vulnerability, the guidelines recommended periodic password changes: monthly, yearly, or something in between. The recommended time frame was not based on the amount of time it would take for a masquerading attacker to do serious damage. Instead, it was calibrated to the estimated time it would take for a trial-and-error guessing attack to succeed.

Interactive Guessing

Trial-and-error attacks take on several guises. The traditional approach was to perform successive login attempts, guessing a different password each time. A more automated form might use an automated script to feed a series of password guesses to the login command. Many systems address this by inserting time delays into the login operation. The delays may lengthen in the face of multiple failed password guesses.

Login operations also may keep a count of the number of failed password attempts. The number of failed guesses is reported to system administrators. The system may report the number of guesses directed against a particular user identity to the owner of that identity. Finally, many systems define a maximum number of guesses. When the number exceeds that threshold, the system won’t allow that user identity to log in. Early guidelines suggested a limit of five guesses in a row; some modern systems limit users to three incorrect guesses.

The restriction to three guesses may make sense on a game show or perhaps in decades past when a user only had a single password to remember. Today, a typical user may need to juggle hundreds of password-protected accounts. Unless the user consults a written list whenever he logs in, he is not always going to enter the correct password on the first or even the second try. Matters become worse for seldom-used accounts; users can’t possibly remember such passwords except by trial-and-error guessing or by consulting a saved list.

Guess restrictions may have serious side effects when applied to administrative accounts. An attacker once disabled a system by making numerous guesses to the administrator’s user identity. The system locked out the administrator. The attacker then caused another problem that froze the system until an administrator cleared the problem. With the administrator locked out, there was no way to clear the problem.

Network-Based Guessing

Attackers are not limited to interactive login guessing on some modern networking systems. For example, some older forms of Microsoft file sharing use passwords for authentication. An attacker can try to crack a password by making successive file sharing requests, each with a different password guess. This attack takes place at network speeds.

Unfortunately, it isn’t always practical to treat such attacks the same as interactive logins. On some systems, the natural behavior of the network system may make it hard to distinguish legitimate traffic from a trial-and-error guessing attack.

Offline Password Cracking

The most powerful modern attack on passwords is the offline attack, which attacks hashed passwords. We also call this password cracking.

For many years, Unix password files were protected from writing but not from reading. The designers relied on the hash function to protect the passwords. As systems improved, however, so did their ability to calculate password hashes very quickly. At first, Unix designers redesigned the hash function to take longer, but that was only a temporary measure. By the 1990s, Unix systems were storing password hashes in inaccessible files.

FIGURE 6.9 illustrates an automated attack on password hashes. For input, the attack takes one or more password hashes, like Alice’s shown in the lower right-hand corner (the value “u3#”). The attack generates all possible passwords, one by one. In the figure, the attack generates three-letter passwords. It then calculates the hash for each of these passwords and compares it against the hash of the user’s password. Note that Alice’s hash matches the seventh hash value generated. When (if) hash values match, the attack retrieves the password it used to produce the hash (the seventh password, “aag”).

A procedure diagram depicts the steps in the offline trial and error attack on password hashes.

FIGURE 6.9 Offline trial-and-error attack on Alice’s password hash.

This attack is much faster, and thus more dangerous, than online attacks. While an online attack may be slowed down or blocked entirely, if the attacked system detects too many bad passwords, the offline attack proceeds at full speed. Moreover, the system under attack can’t possibly detect that the offline attack is in progress.

Modern systems try to reduce the risk of offline attacks in two ways. First, many systems demand users create longer and more complicated passwords. Second, no modern system intentionally discloses its hashed password file to its users. Unfortunately, modern systems don’t keep all encrypted and hashed passwords protected. Networking software often transmits encrypted or hashed passwords between systems. Attackers may intercept these hashed passwords and apply an offline attack.

6.3.1 Password Search Space

Our passwords cannot resist a guessing attack if the attacker can guess every possible password through trial and error. Whether the attack is online or offline, it’s obviously easier to try to guess a three- or four-character password than to guess a much longer one. Our first step in measuring the strength of our password system is to count the total number of possible passwords. We call this the search space.

The search space provides the upper bound for the number of guesses required in a trial-and-error attack. The attacker will always find the password if he can try every possible choice. This is actually the attacker’s worst-case scenario: The password will generally be found by random chance long before every possible choice has been tried.

Older—and cheaper—computers could not handle both uppercase and lowercase letters. Early users chose passwords from the 26-letter alphabet. In theory, any letter in a password could be one of the 26 available letters. We calculate the search space S of different-sized passwords as follows:

images

If we know that all passwords are of exactly the same length, then the calculation is very simple, with A representing the number of characters we can choose from (letters, digits, etc.), and L being the number of characters in the password:

images

Most systems that allow L-length passwords also accept shorter passwords. Traditional Unix systems, for example, accepted any password from length zero (no password) to eight characters. Although the fixed-length calculation may provide a tolerable estimate of the number of passwords, it underestimates the real number. To calculate the search space for all possible Unix passwords, we add up the search space for passwords of that length and shorter:

images

This summation produces a finite geometric progression, which we calculate by doing the sum. We also can reduce it to an equivalent and simpler algebraic form:

images

This yields a great number of passwords: 217 billion worth. In the early days of passwords, this was enough to discourage the most motivated attacker, but the risks grew as computer performance improved.

We make passwords harder to crack by making the attacker work harder. We refer to this as the work factor. For passwords, we increase the work factor by increasing the search space:

  • ■   Increase the length of the password; allow it to contain more characters.

  • ■   Increase the size of the alphabet or character set from which users choose their passwords.

When we vary the password’s length, character set size, or both, we use the following sum to calculate the search space; A being the size of the character set, and L being the maximum length of the password in characters.

images

Again, the progression has a simplified algebraic form:

images

If we include both uppercase and lowercase letters in our eight-letter passwords, we double the size of the alphabet and greatly increase the search space. We make the search space even larger, of course, if our passwords include digits and punctuation. In all, today’s standard American Standard Code for Information Interchange (ASCII) character set has 94 printable characters, excluding the space character. Because words are delimited by spaces, password software developers don’t always allow spaces.

Some high-security applications use passphrases that use longer, multiword phrases as secrets. People might not remember a long collection of letters and punctuation, but they might remember a long text phrase. Unlike passwords, a passphrase may include spaces as well as uppercase and lowercase letters and punctuation. TABLE 6.4 shows the number of possible passwords or phrases given various character set sizes and password lengths.

TABLE 6.4 Search Space for Random Passwords or Passphrases

images

The table illustrates password styles that yield search spaces ranging from 1011 to 1063. Modern computational power can efficiently perform trial-and-error attacks in the range of billions and trillions (1012). Problems in the quintillions (1018) have been cracked, though each attack requires a lot of dedicated resources.

Lacking further information, this would suggest that eight-character passwords are borderline safe, especially if we don’t restrict ourselves to single-case characters. Extending passwords to 16 characters would seem to place them out of the range of current attacks. This assumes, however, that people choose random passwords. This doesn’t always happen in practice.

6.3.2 Truly Random Password Selection

In an ideal world, the attacker’s work factor is the same as the password’s search space; the only way the attacker can reliably guess a password is by trying every possible password. If we look at the probabilities, the attacker has an even chance of success after performing half as many trials as the size of the password’s search space—but this is the ideal case. If passwords are not chosen randomly, then the attacker may exploit the bias and guess passwords faster.

When we use the word random in casual conversation, it often means “unplanned” or even “impulsive.” People often talk about almost any hard-to-interpret information as being “random.” In statistics, the word random refers to a collection of unbiased and uncorrelated data. That means something specific about what is in the collection, but says very little about the order in which the data appears.

In the field of security, however, we demand a really strong definition of random. When we require something to be random for security purposes, we can’t rely on personal impulses. True randomness arises from a truly random process, like flipping a coin, rolling dice, or some other independent, unbiased action. Random events may include opening to the random page of a large book or measuring stochastic events at the electronic level inside a circuit.

This is cryptographic randomness. For data to be cryptographically random, it’s not enough to produce a collection that’s free of statistical bias in the traditional sense. The data must not be part of a computed—or computable—sequence. Random data can’t be produced by a procedure. We need each choice to be produced independently from the previous one. The choices can only come from measuring truly random events.

This poses a problem inside a computer. It is easy to produce pseudorandom numbers. That is what we get when we call the “random” function in typical programming languages or in applications like Excel. These numbers come from a procedure best called a pseudorandom number generator (PRNG). These numbers are not truly random because they are determined by the procedure that generates them. Typically, these generators start from a “seed” value (often the time of day in seconds) and generate hard-to-predict numbers. In fact, the only randomness in the sequence often comes from the seed, which may or may not be random. There may be ways to unravel such sequences of numbers in a way that threatens security. When we need truly random choices, we should not use pseudorandom numbers.

There are many ways to generate truly random numbers, but they all require truly random inputs. For example, the four dice shown in FIGURE 6.10 can generate a random four-digit decimal number. The term “diceware” is often applied to techniques that produce passwords from groups of dice. The Electronic Frontier Foundation (EFF) publishes a well-respected strategy using five standard dice.

A photograph depicts four decimal dice. The first die represents numbers in thousands. The second represents numbers in hundreds. The third die represents numbers in tens and the last die represents numbers in ones.

FIGURE 6.10 Decimal dice can produce truly random numbers.

Courtesy of Dr. Richard Smith.

If the random numbers don’t need to be secret, we can find truly random numbers on the internet. Sites like “random.org” provide numbers generated by truly random phenomena. The website doesn’t exactly roll dice to generate the numbers, but the effect is the same. Some operating systems, notably Linux, have developed “truly random sources” that carefully monitor certain types of behavior in the system and derive randomness from variations in such behavior.

There are several applications and other products that generate secret and truly random passwords from truly random input. We also can do this manually by selecting letters, digits, and punctuation randomly from large books. We randomly open the book, put our finger down, and pick the next character we find. Although this will yield hard-to-guess passwords, we still won’t choose passwords that take full advantage of the search space. The books themselves are biased: Individual letters don’t occur with the same frequency. The letter “e” will appear much more often than the letter “q” in such passwords. We will look more closely at such biases in Section 6.4.

6.3.3 Cracking Speeds

For now, we’ll assume our passwords are perfectly random. If so, how big must our passwords be to resist trial-and-error attacks? If we have a hashed password file from a system restricted to single-case words of eight characters or less, then we need to hash 217 billion separate passwords. How long would that take?

Moore’s law predicts that computing power doubles every 18 months, thus, our ability to crack passwords will continue to improve. TABLE 6.5 summarizes different degrees of cracking feasibility, based on theory and capabilities in 2014. The following paragraphs discuss the prospects of different cracking technologies.

TABLE 6.5 Different Degrees of Cracking Feasibility

Cracking Technology Tests Completed Estimated Time
Desktop 109 Hours
DES Cracker 1018 Hours
Pico computing 1020 Hours
Physical limit using the sun’s energy 1075 10 billion years

A modern desktop computer can easily calculate 100,000 hashes (105) per second. It would take most of a month to hash every eight-letter word. This may be practical if the password is important enough and the delay isn’t significant. If, however, we put some serious computational muscle into it, we can dramatically reduce the calculation time. For example, if we throw 100 such computers at the problem, we hash all of those passwords in a little over 6 hours.

For even more serious password cracking, consider the Data Encryption Standard (DES) Cracker. Designed and built in 1998, the Cracker tests encryption keys through trial and error at a rate of 88 billion keys per second (almost 1011). Over 10 years later, we can safely assume that it’s possible to build a similar machine that cracks password hashes. At that rate, we could calculate 217 billion hashes in less than 3 seconds. Although we will discuss its purpose and construction further in Section 7.3, for now we should appreciate its sheer speed.

Although it seems likely that progress will continue for many years, some researchers believe computations will ultimately be limited by the fundamentals of physics. One U.S. government study examined the minimum theoretical energy requirements needed to sustain information state transitions at a physical level. The result was used to estimate the energy required to crack encryption keys. Their conclusion was that, in theory, we could test as many as 1075 keys or hashes, but it would take all of the energy generated by the sun over its entire lifetime. This is only a theoretical result. Some researchers suspect that “quantum computing” techniques might speed up key cracking. Today, however, we are limited to trial-and-error computation, possibly aided by testing numerous keys in parallel on separate CPUs.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset