The transform functions defined in the previous recipes can be chained together to normalize chunks. The resulting chunks are often shorter with no loss of meaning.
In transforms.py is the function transform_chunk(). It takes a single chunk and an optional list of transform functions. It calls each transform function on the chunk, one at a time, and returns the final chunk:
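    def transform_chunk(chunk, chain=[filter_insignificant, swap_verb_phrase, swap_infinitive_phrase, singularize_plural_noun], trace=0):
        # Apply each transform in order; each one receives the previous result.
        for f in chain:
            chunk = f(chunk)

            # With trace enabled, show the chunk after every step.
            if trace:
                print(f.__name__, ':', chunk)

        return chunk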
Using it on the phrase "the book of recipes is delicious", we get "delicious recipe book":
    >>> from transforms import transform_chunk
    >>> transform_chunk([('the', 'DT'), ('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS'), ('is', 'VBZ'), ('delicious', 'JJ')])
    [('delicious', 'JJ'), ('recipe', 'NN'), ('book', 'NN')]
The transform_chunk() function defaults to chaining the following functions in the given order:
filter_insignificant()
swap_verb_phrase()
swap_infinitive_phrase()
singularize_plural_noun()
Each function transforms the chunk that results from the previous function, starting with the original chunk.
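The chaining just described amounts to a left fold over the list of functions. The following is a minimal, self-contained sketch of that idea. Note that drop_determiners() and lowercase_words() are hypothetical stand-ins invented for illustration, not the functions defined in transforms.py, and apply_chain() is an assumed name rather than part of the recipe:

```python
from functools import reduce


def drop_determiners(chunk):
    # Hypothetical transform: remove words tagged as determiners (DT).
    return [(word, tag) for (word, tag) in chunk if tag != 'DT']


def lowercase_words(chunk):
    # Hypothetical transform: lowercase every word, leaving tags untouched.
    return [(word.lower(), tag) for (word, tag) in chunk]


def apply_chain(chunk, chain):
    # Feed the output of each function into the next one,
    # starting from the original chunk.
    return reduce(lambda current, f: f(current), chain, chunk)


chunk = [('The', 'DT'), ('Book', 'NN')]
result = apply_chain(chunk, [drop_determiners, lowercase_words])
print(result)  # [('book', 'NN')]
```

Because each function only sees the output of the one before it, the order of the list matters. A custom list like this one can be supplied to transform_chunk() through its chain argument in place of the default.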
You can pass trace=1 into transform_chunk() to get an output at each step:
    >>> from transforms import transform_chunk
    >>> transform_chunk([('the', 'DT'), ('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS'), ('is', 'VBZ'), ('delicious', 'JJ')], trace=1)
    filter_insignificant : [('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS'), ('is', 'VBZ'), ('delicious', 'JJ')]
    swap_verb_phrase : [('delicious', 'JJ'), ('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS')]
    swap_infinitive_phrase : [('delicious', 'JJ'), ('recipes', 'NNS'), ('book', 'NN')]
    singularize_plural_noun : [('delicious', 'JJ'), ('recipe', 'NN'), ('book', 'NN')]
    [('delicious', 'JJ'), ('recipe', 'NN'), ('book', 'NN')]
This shows you the result of each transform function, which is then passed to the next transform until the final chunk is returned.
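If you would rather inspect the intermediate chunks programmatically than read them from printed output, one option is to collect them in a list. This is only a sketch under stated assumptions: trace_chain() is a hypothetical helper that is not part of transforms.py, and the two toy transforms are deliberately naive placeholders for the real ones:

```python
def remove_prepositions(chunk):
    # Hypothetical transform: drop words tagged as prepositions (IN).
    return [(word, tag) for (word, tag) in chunk if tag != 'IN']


def naive_singularize(chunk):
    # Hypothetical, deliberately naive transform: strip a trailing "s"
    # from plural nouns (NNS) and retag them as singular (NN).
    return [
        (word[:-1], 'NN') if tag == 'NNS' and word.endswith('s') else (word, tag)
        for (word, tag) in chunk
    ]


def trace_chain(chunk, chain):
    # Return one (function name, resulting chunk) pair per step,
    # instead of printing each step as trace=1 does.
    steps = []
    for f in chain:
        chunk = f(chunk)
        steps.append((f.__name__, chunk))
    return steps


chunk = [('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS')]
steps = trace_chain(chunk, [remove_prepositions, naive_singularize])

for name, result in steps:
    print(name, ':', result)
```

The last entry in steps holds the final chunk, and the earlier entries make it easy to check in a test which transform introduced a given change.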