The structure of the Falcon project we're about to code is nowhere near as extended as the interface one. We'll code five files altogether. In your ch12 folder, create a new one called pwdapi. This is its final structure:
$ tree -A pwdapi/
pwdapi/
├── core
│   ├── handlers.py
│   └── passwords.py
├── main.py
└── tests
    └── test_core
        ├── test_handlers.py
        └── test_passwords.py
The API was all coded using TDD, so we're also going to explore the tests. However, I think it's going to be easier for you to understand the tests if you first see the code, so we're going to start with that.
This is the code for the Falcon application:
main.py
import falcon
from core.handlers import (
    PasswordValidatorHandler,
    PasswordGeneratorHandler,
)

validation_handler = PasswordValidatorHandler()
generator_handler = PasswordGeneratorHandler()

app = falcon.API()
app.add_route('/password/validate/', validation_handler)
app.add_route('/password/generate/', generator_handler)
As in the example in Chapter 10, Web Development Done Right, we start by creating one instance for each of the handlers we need, then we create a falcon.API object and, by calling its add_route method, we set up the routing to the URLs of our API. We'll get to the definitions of the handlers in a moment. Firstly, we need a couple of helpers.
In this section, we will take a look at a couple of classes that we'll use in our handlers. It's always good to factor out some logic following the Single Responsibility Principle.
In OOP, the Single Responsibility Principle (SRP) states that every class should have responsibility for a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All of its services should be narrowly aligned with that responsibility.
The Single Responsibility Principle is the S in S.O.L.I.D., an acronym for the first five OOP and software design principles introduced by Robert Martin.
I heartily suggest you open a browser and read up on this subject; it is very important.
All the code in the helpers section belongs to the core/passwords.py module. Here's how it begins:
from math import ceil
from random import sample
from string import ascii_lowercase, ascii_uppercase, digits

punctuation = '!#$%&()*+-?@_|'
allchars = ''.join(
    (ascii_lowercase, ascii_uppercase, digits, punctuation))
We'll need to handle some randomized calculations but the most important part here is the allowed characters. We will allow letters, digits, and a set of punctuation characters. To ease writing the code, we will merge those parts into the allchars string.
The PasswordValidator class is my favorite bit of logic in the whole API. It exposes an is_valid and a score method. The latter runs all defined validators ("private" methods in the same class), and collects the scores into a single dict which is returned as a result. I'll write this class method by method so that it does not get too complicated:
class PasswordValidator:
    def __init__(self, password):
        self.password = password.strip()
It begins by setting password (with no leading or trailing spaces) as an instance attribute. This way we won't then have to pass it around from method to method. All the methods that will follow belong to this class.
    def is_valid(self):
        return (len(self.password) > 0 and
                all(char in allchars for char in self.password))
A password is valid when its length is greater than 0 and all of its characters belong to the allchars string. When you read the is_valid method, it's practically English (that's how amazing Python is). all is a built-in function that tells you whether every element of the iterable you feed to it is truthy.
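As a small illustration of how all behaves, here is a standalone sketch. The only_allowed function is a hypothetical helper written for this example; it is not part of the API code.

```python
from string import ascii_lowercase, ascii_uppercase, digits

# Same character set as in core/passwords.py.
punctuation = '!#$%&()*+-?@_|'
allchars = ascii_lowercase + ascii_uppercase + digits + punctuation


def only_allowed(text):
    # Hypothetical helper: all() stops at the first falsy element.
    return all(char in allchars for char in text)


print(only_allowed('abc123'))   # True
print(only_allowed('abc 123'))  # False: the space is not in allchars
print(only_allowed(''))         # True: all() of an empty iterable is True
```

The last line shows why is_valid also checks the length: on its own, all would happily accept an empty password.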
    def score(self):
        result = {
            'length': self._score_length(),
            'case': self._score_case(),
            'numbers': self._score_numbers(),
            'special': self._score_special(),
            'ratio': self._score_ratio(),
        }
        result['total'] = sum(result.values())
        return result
This is the other main method. It's very simple, it just prepares a dict with all the results from the validators. The only independent bit of logic happens at the end, when we sum the grades from each validator and assign it to a 'total' key in the dict, just for convenience.
As you can see, we score a password by length, by letter case, by the presence of numbers and special characters and, finally, by the ratio between letters and numbers. Each letter can be any of 26 * 2 = 52 possible characters, while each digit can be one of only 10. Therefore, passwords with a higher letters-to-digits ratio are more difficult to crack.
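To put rough numbers on that claim, the following sketch compares the number of possible passwords of a given length made only of digits with the number made only of letters. The length of 8 is an arbitrary choice for illustration.

```python
length = 8  # arbitrary example length

digits_only = 10 ** length   # 10 choices per position
letters_only = 52 ** length  # 26 lowercase + 26 uppercase per position

print(digits_only)   # 100000000
print(letters_only)  # 53459728531456
print(letters_only > 500000 * digits_only)  # True
```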
Let's see the length validator:
    def _score_length(self):
        scores_list = ([0]*4) + ([1]*4) + ([3]*4) + ([5]*4)
        scores = dict(enumerate(scores_list))
        return scores.get(len(self.password), 7)
We assign 0 points to passwords whose length is less than four characters, 1 point for those whose length is less than 8, 3 for a length less than 12, 5 for a length less than 16, and 7 for a length of 16 or more.
In order to avoid a waterfall of if/elif clauses, I have adopted a functional style here. I prepared a scores_list, which is basically [0, 0, 0, 0, 1, 1, 1, 1, 3, ...]. Then, by enumerating it, I got a (length, score) pair for each length less than 16. I put those pairs into a dict, which looks like this: {0:0, 1:0, 2:0, 3:0, 4:1, 5:1, ...}. I then perform a get on this dict with the length of the password, setting a value of 7 as the default (which will be returned for lengths of 16 or more, which are not in the dict).
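To convince yourself that the lookup is equivalent to the if/elif waterfall it replaces, you could compare the two side by side. Both functions below are standalone sketches that take the length directly (rather than reading self.password), and their names are made up for this example.

```python
def score_length_lookup(length):
    # Same logic as _score_length, but taking the length as an argument.
    scores_list = ([0] * 4) + ([1] * 4) + ([3] * 4) + ([5] * 4)
    scores = dict(enumerate(scores_list))
    return scores.get(length, 7)


def score_length_branches(length):
    # The if/elif version the lookup table replaces.
    if length < 4:
        return 0
    elif length < 8:
        return 1
    elif length < 12:
        return 3
    elif length < 16:
        return 5
    return 7


# Both versions agree for every length we try.
assert all(
    score_length_lookup(n) == score_length_branches(n) for n in range(100))

print([score_length_lookup(n) for n in (0, 3, 4, 8, 12, 15, 16, 40)])
# [0, 0, 1, 3, 5, 5, 7, 7]
```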
I have nothing against if/elif clauses, of course, but I wanted to take the opportunity to show you different coding styles in this final chapter, to help you get used to reading code which deviates from what you would normally expect. It's only beneficial.
    def _score_case(self):
        lower = bool(set(ascii_lowercase) & set(self.password))
        upper = bool(set(ascii_uppercase) & set(self.password))
        return int(lower or upper) + 2 * (lower and upper)
The way we validate the case is again with a nice trick. lower is True when the intersection between the password and all lowercase characters is non-empty, otherwise it's False. upper behaves in the same way, only with uppercase characters.
To understand the evaluation that happens on the last line, let's use the inside-out technique once more: lower or upper is True when at least one of the two is True. When it's True, it will be converted to a 1 by the int class. This equates to saying, if there is at least one character, regardless of the casing, the score gets 1 point, otherwise it stays at 0.
Now for the second part: lower and upper is True when both of them are True, which means that we have at least one lowercase and one uppercase character. This means that, to crack the password, a brute-force algorithm would have to loop through 52 letters instead of just 26. Therefore, when that's True, we get an extra two points.
This validator therefore produces one of three possible values (0, 1, or 3), depending on what the password is.
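The three possible outcomes are easy to verify with a standalone version of the method. score_case below is a sketch that takes the password as an argument instead of reading it from the instance.

```python
from string import ascii_lowercase, ascii_uppercase


def score_case(password):
    # Standalone copy of the _score_case logic.
    lower = bool(set(ascii_lowercase) & set(password))
    upper = bool(set(ascii_uppercase) & set(password))
    # bool is a subclass of int, so 2 * True == 2 and 2 * False == 0.
    return int(lower or upper) + 2 * (lower and upper)


print(score_case('1234'))  # 0: no letters at all
print(score_case('abcd'))  # 1: lowercase only
print(score_case('ABCD'))  # 1: uppercase only
print(score_case('abCD'))  # 3: both cases
```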
    def _score_numbers(self):
        return 2 if (set(self.password) & set(digits)) else 0
Scoring on the numbers is simpler. If we have at least one number, we get two points, otherwise we get 0. In this case, I used a ternary operator to return the result.
    def _score_special(self):
        return 4 if (
            set(self.password) & set(punctuation)) else 0
The special characters validator has the same logic as the previous one but, since special characters add quite a bit of complexity when it comes to cracking a password, we have scored four points instead of just two.
The last one validates the ratio between the letters and the digits.
    def _score_ratio(self):
        alpha_count = sum(
            1 if c.lower() in ascii_lowercase else 0
            for c in self.password)
        digits_count = sum(
            1 if c in digits else 0 for c in self.password)
        if digits_count == 0:
            return 0
        return min(ceil(alpha_count / digits_count), 7)
Note the conditional expressions inside the sum calls. In the first case, we get a 1 for each character whose lowercase version is in ascii_lowercase. This means that summing all those 1's up gives us exactly the count of all the letters. Then, we do the same for the digits, only we use the digits string for reference, and we don't need to lowercase the character. When digits_count is 0, alpha_count / digits_count would cause a ZeroDivisionError, therefore we check on digits_count and when it's 0 we return 0. If we have digits, we calculate the ceiling of the letters:digits ratio, and return it, capped at 7.
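A few worked values may help here. The score_ratio function below is a standalone sketch of the same calculation; it filters inside the generator expression instead of using a conditional expression, which is an equivalent way of counting.

```python
from math import ceil
from string import ascii_lowercase, digits


def score_ratio(password):
    # Count letters and digits, then score the letters:digits ratio.
    alpha_count = sum(1 for c in password if c.lower() in ascii_lowercase)
    digits_count = sum(1 for c in password if c in digits)
    if digits_count == 0:
        return 0  # avoids ZeroDivisionError
    return min(ceil(alpha_count / digits_count), 7)


print(score_ratio('abcABC0123'))         # 2: ceil(6 / 4)
print(score_ratio('abcdefgh'))           # 0: no digits
print(score_ratio('1234'))               # 0: ceil(0 / 4)
print(score_ratio('abcdefghijklmnop1'))  # 7: ceil(16 / 1), capped at 7
```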
Of course, there are many different ways to calculate a score for a password. My aim here is not to give you the finest algorithm to do that, but to show you how you could go about implementing it.
The password generator is a much simpler class than the validator. However, I have coded it so that we won't need to create an instance to use it, just to show you yet again a different coding style.
class PasswordGenerator:

    @classmethod
    def generate(cls, length, bestof=10):
        candidates = sorted([
            cls._generate_candidate(length)
            for k in range(max(1, bestof))
        ])
        return candidates[-1]

    @classmethod
    def _generate_candidate(cls, length):
        password = cls._generate_password(length)
        score = PasswordValidator(password).score()
        return (score['total'], password)

    @classmethod
    def _generate_password(cls, length):
        chars = allchars * (ceil(length / len(allchars)))
        return ''.join(sample(chars, length))
Of the three methods, only the first one is meant to be used. Let's start our analysis with the last one: _generate_password.
This method simply takes a length, which is the desired length for the password we want, and calls the sample function to draw length elements from the chars string. The return value of sample is a list of length elements, which we turn into a string using join.
Before we can call sample, think about this: what if the desired length exceeds the length of allchars? The call would result in ValueError: Sample larger than the population.
Because of this, we create the chars string in a way that it is made by concatenating the allchars string to itself just enough times to cover the desired length. To give you an example, let's say we need a password of 27 characters, and let's pretend allchars is 10 characters long. length / len(allchars) gives 2.7, which, when passed to the ceil function, becomes 3. This means that we're going to assign chars to a triple concatenation of the allchars string, hence chars will be 10 * 3 = 30 characters long, which is enough to cover our requirements.
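The 27-character scenario can be reproduced in a few lines. This is only a sketch: the 10-character allchars below is the pretend population from the example, not the real one from core/passwords.py.

```python
from math import ceil
from random import sample

allchars = 'abcdefghij'  # pretend population of 10 characters
length = 27

try:
    sample(allchars, length)  # asking for more than the population holds
    failed = False
except ValueError as err:
    failed = True
    print('sample failed:', err)

# Repeat the population just enough times to cover the desired length.
chars = allchars * ceil(length / len(allchars))
password = ''.join(sample(chars, length))
print(len(chars), len(password))  # 30 27
```

Since sample draws without replacement, with this population each character can appear at most three times in the result.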
Note that, in order for these methods to be called without creating an instance of this class, we need to decorate them with the classmethod decorator. The convention is then to call the first argument cls, instead of self, because Python, behind the scenes, will pass the class object to the call.
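If class methods are new to you, this tiny sketch shows what ends up in cls. The Greeter and LoudGreeter classes are hypothetical and exist only for this illustration.

```python
class Greeter:
    greeting = 'Hello'

    @classmethod
    def greet(cls, name):
        # cls is the class the method was called on, passed by Python.
        return '{}, {}! (from {})'.format(cls.greeting, name, cls.__name__)


class LoudGreeter(Greeter):
    greeting = 'HELLO'


# No instance is created in either call.
print(Greeter.greet('Fab'))      # Hello, Fab! (from Greeter)
print(LoudGreeter.greet('Fab'))  # HELLO, Fab! (from LoudGreeter)
```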
The code for _generate_candidate is also very simple. We just generate a password of the given length, calculate its score, and return a tuple (score, password).
We do this so that in the generate method we can generate 10 (by default) passwords each time the method is called and return the one that has the highest score. Since our generation logic is based on a random function, it's a good idea to employ a technique like this to avoid worst case scenarios.
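The selection works because Python compares tuples element by element, so sorting (score, password) pairs orders them by score first. The sketch below uses a few hand-written candidates; max is shown as an alternative to sorting and is not what the generate method itself uses.

```python
# Hand-written (score, password) candidates, for illustration only.
candidates = [
    (16, '&a69Ly+0H4jZ'),
    (21, 'aB4Ge_KdTgwR'),
    (12, 'IRLT*XEfcglm'),
]

best_sorted = sorted(candidates)[-1]  # the approach used by generate
best_max = max(candidates)            # same result, without a full sort

print(best_sorted)  # (21, 'aB4Ge_KdTgwR')
print(best_max)     # (21, 'aB4Ge_KdTgwR')
```

If two candidates had the same score, the comparison would fall back to the password strings, so a winner is always well defined.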
This concludes the code for the helpers.
As you may have noticed, the code for the helpers isn't related to Falcon at all. It is just pure Python that we can reuse when we need it. On the other hand, the code for the handlers is of course based on Falcon. The code that follows belongs to the core/handlers.py module so, as we did before, let's start with the first few lines:
import json
import falcon
from .passwords import PasswordValidator, PasswordGenerator

class HeaderMixin:
    def set_access_control_allow_origin(self, resp):
        resp.set_header('Access-Control-Allow-Origin', '*')
That was very simple. We import json, falcon, and our helpers, and then we set up a mixin which we'll need in both handlers. The need for this mixin is to allow the API to serve requests that come from somewhere else. This is the other side of the CORS coin to what we saw in the JavaScript code for the interface. In this case, we boldly go where no security expert would ever dare, and allow requests to come from any domain ('*'). We do this because this is an exercise and, in this context, it is fine, but don't do it in production, okay?
This handler will have to respond to a POST request, therefore I have coded an on_post method, which is the way you react to a POST request in Falcon.
class PasswordValidatorHandler(HeaderMixin):

    def on_post(self, req, resp):
        self.process_request(req, resp)
        password = req.context.get('_body', {}).get('password')
        if password is None:
            resp.status = falcon.HTTP_BAD_REQUEST
            return None

        result = self.parse_password(password)
        resp.body = json.dumps(result)

    def parse_password(self, password):
        validator = PasswordValidator(password)
        return {
            'password': password,
            'valid': validator.is_valid(),
            'score': validator.score(),
        }

    def process_request(self, req, resp):
        self.set_access_control_allow_origin(resp)

        body = req.stream.read()
        if not body:
            raise falcon.HTTPBadRequest('Empty request body',
                'A valid JSON document is required.')
        try:
            req.context['_body'] = json.loads(
                body.decode('utf-8'))
        except (ValueError, UnicodeDecodeError):
            raise falcon.HTTPError(
                falcon.HTTP_753, 'Malformed JSON',
                'JSON incorrect or not utf-8 encoded.')
Let's start with the on_post method. First of all, we call the process_request method, which does a sanity check on the request body. I won't go into the finest detail because it's taken from the Falcon documentation, and it's a standard way of processing a request. Let's just say that, if everything goes well (the json.loads call in the try block), we get the body of the request (already decoded from JSON) in req.context['_body']. If things go badly for any reason, we raise an appropriate HTTP error.
Let's go back to on_post. We fetch the password from the request context. At this point, process_request has succeeded, but we still don't know if the body was in the correct format. We're expecting something like: {'password': 'my_password'}.
So we proceed with caution. We get the value for the '_body' key and, if that is not present, we fall back to an empty dict. We get the value for 'password' from that. We use get instead of direct access to avoid KeyError issues.
If the password is None, we simply return a 400 error (bad request). Otherwise, we validate it and calculate its score, and then set the result as the body of our response.
You can see how easy it is to validate and calculate the score of the password in the parse_password method, by using our helpers.
We return a dict with three pieces of information: password, valid, and score. The password information is technically redundant because whoever made the request would know the password but, in this case, I think it's a good way of providing enough information for things such as logging, so I added it.
What happens if the JSON-decoded body is not a dict? I will leave it up to you to fix the code, adding some logic to cater for that edge case.
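If you want a starting point for that exercise, here is one possible direction, sketched outside of Falcon so that it can be run on its own. The extract_password helper is hypothetical: it is not part of the API code, and how you wire something like it into the handler (and which HTTP status you return) is up to you.

```python
import json


def extract_password(raw_body):
    """Hypothetical helper: return the password string, or None."""
    try:
        body = json.loads(raw_body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(body, dict):
        # Valid JSON, but a list, string, number, and so on.
        return None
    password = body.get('password')
    return password if isinstance(password, str) else None


print(extract_password('{"password": "abc123"}'))  # abc123
print(extract_password('["abc123"]'))              # None
print(extract_password('{"password": 42}'))        # None
print(extract_password('not json'))                # None
```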
The generator handler has to handle a GET request with one query parameter: the desired password length.
class PasswordGeneratorHandler(HeaderMixin):

    def on_get(self, req, resp):
        self.process_request(req, resp)
        length = req.context.get('_length', 16)
        resp.body = json.dumps(
            PasswordGenerator.generate(length))

    def process_request(self, req, resp):
        self.set_access_control_allow_origin(resp)
        length = req.get_param('length')
        if length is None:
            return
        try:
            length = int(length)
            assert length > 0
            req.context['_length'] = length
        except (ValueError, TypeError, AssertionError):
            raise falcon.HTTPBadRequest('Wrong query parameter',
                '`length` must be a positive integer.')
We have a similar process_request method. It does a sanity check on the request, even though a bit differently from the previous handler. This time, we need to make sure that if the length is provided on the query string (which means, for example, http://our-api-url/?length=23), it's in the correct format. This means that length needs to be a positive integer.
So, to validate that, we do an int conversion (req.get_param('length') returns a string), then we assert that length is greater than zero and, finally, we put it in context under the '_length' key.
Doing the int conversion of a string which is not a suitable representation for an integer raises ValueError, while a conversion from an unsupported type (such as None) raises TypeError, therefore we catch those two in the except clause.
We also catch AssertionError, which is raised by the assert length > 0 line when length is not a positive integer. We can then safely guarantee that the length is as desired with one single try/except block.
Note that, when coding a try/except block, you should usually try and be as specific as possible, separating instructions that would raise different exceptions if a problem arose. This would allow you more control over the issue, and easier debugging. In this case though, since this is a simple API, it's fine to have code which only reacts to a request for which length is not in the right format.
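For comparison, this is roughly what a more specific version might look like, written as a plain function so it can run without Falcon. The parse_length name and its error messages are made up for this sketch; it also uses an explicit check instead of assert, since assert statements are skipped when Python runs with the -O flag.

```python
def parse_length(raw, default=16):
    """Hypothetical helper: convert a query string value to a length."""
    if raw is None:
        return default  # parameter not provided
    try:
        length = int(raw)
    except (ValueError, TypeError):
        # Only the conversion is guarded by this try block.
        raise ValueError('`length` must be an integer.') from None
    if length <= 0:
        raise ValueError('`length` must be positive.')
    return length


print(parse_length(None))  # 16
print(parse_length('23'))  # 23

for bad in ('abc', '0', '-5'):
    try:
        parse_length(bad)
    except ValueError as err:
        print(repr(bad), '->', err)
```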
The code for the on_get method is quite straightforward. It starts by processing the request, then the length is fetched, falling back to 16 (the default value) when it's not passed, and then a password is generated and dumped to JSON, and then set to be the body of the response.
In order to run this application, you need to remember that we set the base URL in the interface to http://127.0.0.1:5555. Therefore, we need the following command to start the API:
$ gunicorn -b 127.0.0.1:5555 main:app
Running that will start the app defined in the main module, binding the server instance to port 5555 on localhost. For more information about Gunicorn, please refer to either Chapter 10, Web Development Done Right or directly to the project's home page (http://gunicorn.org/).
The code for the API is now complete so if you have both the interface and the API running, you can try them out together. See if everything works as expected.
In this section, let's take a look at the tests I wrote for the helpers and for the handlers. Tests for the helpers are heavily based on the nose_parameterized library, as my favorite testing style is interface testing, with as little patching as possible. Using nose_parameterized allows me to write tests that are easier to read because the test cases are very visible.
On the other hand, tests for the handlers have to follow the testing conventions for the Falcon library, so they will be a bit different. This is, of course, ideal since it allows me to show you even more.
Due to the limited number of pages I have left, I'll show you only a part of the tests, so make sure you check them out in full in the source code.
Let's see the tests for the PasswordGenerator class:
tests/test_core/test_passwords.py
class PasswordGeneratorTestCase(TestCase):

    def test__generate_password_length(self):
        for length in range(300):
            assert_equal(
                length,
                len(PasswordGenerator._generate_password(length))
            )

    def test__generate_password_validity(self):
        for length in range(1, 300):
            password = PasswordGenerator._generate_password(
                length)
            assert_true(PasswordValidator(password).is_valid())

    def test__generate_candidate(self):
        score, password = (
            PasswordGenerator._generate_candidate(42))
        expected_score = PasswordValidator(password).score()
        assert_equal(expected_score['total'], score)

    @patch.object(PasswordGenerator, '_generate_candidate')
    def test__generate(self, _generate_candidate_mock):
        # checks `generate` returns the highest score candidate
        _generate_candidate_mock.side_effect = [
            (16, '&a69Ly+0H4jZ'),
            (17, 'UXaF4stRfdlh'),
            (21, 'aB4Ge_KdTgwR'),  # the winner
            (12, 'IRLT*XEfcglm'),
            (16, '$P92-WZ5+DnG'),
            (18, 'Xi#36jcKA_qQ'),
            (19, '?p9avQzRMIK0'),
            (17, '4@sY&bQ9*H!+'),
            (12, 'Cx-QAYXG_Ejq'),
            (18, 'C)RAV(HP7j9n'),
        ]
        assert_equal(
            (21, 'aB4Ge_KdTgwR'), PasswordGenerator.generate(12))
Within test__generate_password_length we make sure the _generate_password method handles the length parameter correctly. We generate a password for each length in the range [0, 300), and verify that it has the correct length.
In the test__generate_password_validity test, we do something similar but, this time, we make sure that whatever length we ask for, the generated password is valid. We use the PasswordValidator class to check for validity.
Finally, we need to test the generate method. The password generation is random, therefore, in order to test this function, we need to mock _generate_candidate, thus controlling its output. We set the side_effect argument on its mock to be a list of 10 candidates, from which we expect the generate method to choose the one with the highest score. Setting side_effect on a mock to a list causes that mock to return the elements of that list, one at a time, each time it's called. To avoid ambiguity, the highest score is 21, and only one candidate has scored that high. We call the method and make sure that that particular one is the candidate which is returned.
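The behavior of side_effect with a list is easy to try in isolation. In this sketch, roll is just an arbitrary name for a standalone mock.

```python
from unittest.mock import Mock

# Each call returns the next element of the list.
roll = Mock(side_effect=[3, 5, 1])
results = [roll(), roll(), roll()]
print(results)          # [3, 5, 1]
print(roll.call_count)  # 3

# A fourth call would raise StopIteration, because the list is exhausted.
```

This is also why the test provides exactly 10 candidates: generate calls _generate_candidate 10 times by default.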
Testing the PasswordValidator class requires many more lines of code, so I'll show only a portion of these tests:
pwdapi/tests/test_core/test_passwords.py
from unittest import TestCase
from unittest.mock import patch
from nose_parameterized import parameterized, param
from nose.tools import (
    assert_equal, assert_dict_equal, assert_true)
from core.passwords import PasswordValidator, PasswordGenerator

class PasswordValidatorTestCase(TestCase):

    @parameterized.expand([
        (False, ''),
        (False, ' '),
        (True, 'abcdefghijklmnopqrstuvwxyz'),
        (True, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
        (True, '0123456789'),
        (True, '!#$%&()*+-?@_|'),
    ])
    def test_is_valid(self, valid, password):
        validator = PasswordValidator(password)
        assert_equal(valid, validator.is_valid())
We start by testing the is_valid method. We test whether or not it returns False when it's fed an empty string, as well as a string made up of only spaces, which verifies that we're calling .strip() when we assign the password.
Then, we use all the characters that we want to be accepted to make sure the function accepts them.
I understand the syntax behind the parameterized.expand decorator can be challenging at first but really, all there is to it is that each tuple constitutes an independent test case which, in turn, means that the test_is_valid test is run individually for each tuple, and that the two tuple elements are passed to the method as arguments: valid and password.
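If it helps to see the idea without the extra library, the standard library's subTest gives a loosely similar effect: one test method, many visible cases, each reported separately on failure. Note that this is a different technique from parameterized.expand (which generates one test method per tuple), and the is_valid function below is only a stand-in for PasswordValidator(password).is_valid().

```python
import unittest
from string import ascii_letters, digits

ALLOWED = ascii_letters + digits + '!#$%&()*+-?@_|'


def is_valid(password):
    # Stand-in for PasswordValidator(password).is_valid().
    password = password.strip()
    return len(password) > 0 and all(c in ALLOWED for c in password)


class IsValidTestCase(unittest.TestCase):
    CASES = [
        (False, ''),
        (False, '  '),
        (True, 'abcXYZ'),
        (True, '0123456789'),
        (False, 'a b'),
    ]

    def test_is_valid(self):
        for valid, password in self.CASES:
            # Each tuple is reported on its own if it fails.
            with self.subTest(password=password):
                self.assertEqual(valid, is_valid(password))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsValidTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```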
We then test for invalid characters. We expect them all to fail so we use param.explicit, which runs the test for each of the characters in that weird string.
    @parameterized.expand(
        param.explicit(char) for char in '>]{<`\\;,[^/"\'~:}=.'
    )
    def test_is_valid_invalid_chars(self, password):
        validator = PasswordValidator(password)
        assert_equal(False, validator.is_valid())
They all evaluate to False, so we're good.
    @parameterized.expand([
        (0, ''),  # 0-3: score 0
        (0, 'a'),  # 0-3: score 0
        (0, 'aa'),  # 0-3: score 0
        (0, 'aaa'),  # 0-3: score 0
        (1, 'aaab'),  # 4-7: score 1
        ...
        (5, 'aaabbbbccccddd'),  # 12-15: score 5
        (5, 'aaabbbbccccdddd'),  # 12-15: score 5
    ])
    def test__score_length(self, score, password):
        validator = PasswordValidator(password)
        assert_equal(score, validator._score_length())
To test the _score_length method, I created 16 test cases for the lengths from 0 to 15. The body of the test simply makes sure that the score is assigned appropriately.
    def test__score_length_sixteen_plus(self):
        # all password whose length is 16+ score 7 points
        password = 'x' * 255
        for length in range(16, len(password)):
            validator = PasswordValidator(password[:length])
            assert_equal(7, validator._score_length())
The preceding test is for lengths from 16 to 254. We only need to make sure that any length after 15 gets 7 as a score.
I will skip over the tests for the other internal methods and jump directly to the one for the score method. In order to test it, I want to control exactly what is returned by each of the _score_* methods so I mock them out and in the test, I set a return value for each of them. Note that to mock methods of a class, we use a variant of patch: patch.object. When you set return values on mocks, it's never good to have repetitions because you may not be sure which method returned what, and the test wouldn't fail in the case of a swap. So, always return different values. In my case, I am using the first few prime numbers to be sure there is no possibility of confusion.
    @patch.object(PasswordValidator, '_score_length')
    @patch.object(PasswordValidator, '_score_case')
    @patch.object(PasswordValidator, '_score_numbers')
    @patch.object(PasswordValidator, '_score_special')
    @patch.object(PasswordValidator, '_score_ratio')
    def test_score(
            self,
            _score_ratio_mock,
            _score_special_mock,
            _score_numbers_mock,
            _score_case_mock,
            _score_length_mock):

        _score_ratio_mock.return_value = 2
        _score_special_mock.return_value = 3
        _score_numbers_mock.return_value = 5
        _score_case_mock.return_value = 7
        _score_length_mock.return_value = 11

        expected_result = {
            'length': 11,
            'case': 7,
            'numbers': 5,
            'special': 3,
            'ratio': 2,
            'total': 28,
        }

        validator = PasswordValidator('')
        assert_dict_equal(expected_result, validator.score())
I want to point out explicitly that the _score_* methods are mocked, so I set up my validator instance by passing an empty string to the class constructor. This makes it even more evident to the reader that the internals of the class have been mocked out. Then, I just check if the result is the same as what I was expecting.
This last test is the only one in this class in which I used mocks. All the other tests for the _score_* methods are in an interface style, which reads better and usually produces better results.
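The advice about distinct return values can be seen on a much smaller scale. In the sketch below, Report is a hypothetical class invented for this example; patch.object is used as a context manager rather than as a decorator, which is just another way of applying the same patch.

```python
from unittest.mock import patch


class Report:
    # Hypothetical class with two "private" methods to mock out.
    def _width(self):
        return 0

    def _height(self):
        return 0

    def size(self):
        return {'width': self._width(), 'height': self._height()}


# Distinct return values: if size() swapped the two calls by mistake,
# the resulting dict would differ and a test would catch it.
with patch.object(Report, '_width', return_value=2), \
        patch.object(Report, '_height', return_value=3):
    result = Report().size()

print(result)  # {'width': 2, 'height': 3}
```

Had both mocks returned the same value, a swapped implementation would produce exactly the same dict and go unnoticed.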
Let's briefly see one example of a test for a handler:
pwdapi/tests/test_core/test_handlers.py
import json
from unittest.mock import patch
from nose.tools import assert_dict_equal, assert_equal

import falcon
import falcon.testing as testing

from core.handlers import (
    PasswordValidatorHandler, PasswordGeneratorHandler)

class PGHTest(PasswordGeneratorHandler):
    def process_request(self, req, resp):
        self.req, self.resp = req, resp
        return super(PGHTest, self).process_request(req, resp)

class PVHTest(PasswordValidatorHandler):
    def process_request(self, req, resp):
        self.req, self.resp = req, resp
        return super(PVHTest, self).process_request(req, resp)
Because of the tools Falcon gives you to test your handlers, I created a child for each of the classes I wanted to test. The only thing I changed (by overriding a method) is that in the process_request method, which is called by both classes, before processing the request I make sure I set the req and resp arguments on the instance. The normal behavior of the process_request method is thus not altered in any other way. By doing this, whatever happens over the course of the test, I'll be able to check against those objects.
It's quite common to use tricks like this when testing. We never change the code to adapt it for a test; that would be bad practice. Instead, we find a way of adapting our tests to suit our needs.
class TestPasswordValidatorHandler(testing.TestBase):

    def before(self):
        self.resource = PVHTest()
        self.api.add_route('/password/validate/', self.resource)
The before method is called by the Falcon TestBase logic, and it allows us to set up the resource we want to test (the handler) and a route for it (which is not necessarily the same as the one we use in production).
    def test_post(self):
        self.simulate_request(
            '/password/validate/',
            body=json.dumps({'password': 'abcABC0123#&'}),
            method='POST')
        resp = self.resource.resp

        assert_equal('200 OK', resp.status)
        assert_dict_equal(
            {'password': 'abcABC0123#&',
             'score': {'case': 3, 'length': 5, 'numbers': 2,
                       'special': 4, 'ratio': 2, 'total': 16},
             'valid': True},
            json.loads(resp.body))
This is the test for the happy path. All it does is simulate a POST request with a JSON payload as body. Then, we inspect the response object. In particular, we inspect its status and its body. We make sure that the handler has correctly called the validator and returned its results.
We also test the generator handler:
class TestPasswordGeneratorHandler(testing.TestBase):

    def before(self):
        self.resource = PGHTest()
        self.api.add_route('/password/generate/', self.resource)

    @patch('core.handlers.PasswordGenerator')
    def test_get(self, PasswordGenerator):
        PasswordGenerator.generate.return_value = (7, 'abc123')
        self.simulate_request(
            '/password/generate/',
            query_string='length=7',
            method='GET')
        resp = self.resource.resp

        assert_equal('200 OK', resp.status)
        assert_equal([7, 'abc123'], json.loads(resp.body))
For this one as well, I will only show you the test for the happy path. We mock out the PasswordGenerator class because we need to control which password it will generate and, unless we mock, we won't be able to do it, as it is a random process.
Once we have correctly set up its return value, we can simulate the request again. In this case, it's a GET request, with a desired length of 7. We use a technique similar to the one we used for the other handler, and check the response status and body.
These are not the only tests you could write against the API, and the style could be different as well. Some people mock often; I tend to mock only when I really have to. Just try to see if you can make some sense out of them. I know they're not really easy but they'll be good training for you. Tests are extremely important so give it your best shot.