The Earley algorithm was introduced by Jay Earley in 1970. Like top-down parsing, it expands predictions from the start symbol, but it can handle left-recursive grammars and does not require the grammar to be in Chomsky Normal Form (CNF). It fills in a chart of partial parses from left to right.
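To make the chart-filling idea concrete, here is a minimal Earley recognizer sketch in plain Python. This is an illustration of the algorithm, not NLTK's implementation, and the toy grammar (including the left-recursive rule NP -> NP PP) is an assumption chosen for this example:

```python
# Minimal Earley recognizer sketch -- illustrative only, not NLTK's code.
# The toy grammar below is an assumption chosen to mirror the demos;
# note the left-recursive rule NP -> NP PP, which Earley handles directly.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["NP", "PP"], ["Det", "Noun"], ["I"]],
    "VP":   [["Verb", "NP"]],
    "PP":   [["Prep", "NP"]],
    "Det":  [["a"]],
    "Noun": [["dog"]],
    "Verb": [["saw"]],
    "Prep": [["with"]],
}

def earley_recognize(words, start="S"):
    """Return True if words can be derived from start under GRAMMAR."""
    # chart[i] holds items (lhs, rhs, dot, origin): lhs -> rhs with the
    # dot placed after rhs[:dot], and the item started at position origin.
    chart = [set() for _ in range(len(words) + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(words) + 1):
        # Run PREDICT and COMPLETE to a fixed point; the set-based
        # duplicate check is what makes left recursion terminate.
        added = True
        while added:
            added = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in GRAMMAR:      # PREDICT
                    for prod in GRAMMAR[rhs[dot]]:
                        item = (rhs[dot], tuple(prod), 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            added = True
                elif dot == len(rhs):                           # COMPLETE
                    for l2, r2, d2, o2 in list(chart[origin]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            item = (l2, r2, d2 + 1, o2)
                            if item not in chart[i]:
                                chart[i].add(item)
                                added = True
        if i < len(words):                                      # SCAN
            for lhs, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] == words[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(words)])

print(earley_recognize("I saw a dog".split()))               # accepted
print(earley_recognize("I saw a dog with a dog".split()))    # uses NP -> NP PP
print(earley_recognize("saw a dog".split()))                 # rejected
```

Each chart cell corresponds to one of the `[i:j]` edges in the NLTK traces below; the predicted items with the dot at position 0 are the `>` edges, and completed items are the fully bracketed ones.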
Consider an example that illustrates parsing using the Earley chart parser:
>>> import nltk
>>> nltk.parse.earleychart.demo(print_times=False, trace=1, sent='I saw a dog', numparses=2)
* Sentence:
I saw a dog
['I', 'saw', 'a', 'dog']

|.    I    .   saw   .    a    .   dog   .|
|[---------]         .         .         .| [0:1] 'I'
|.         [---------]         .         .| [1:2] 'saw'
|.         .         [---------]         .| [2:3] 'a'
|.         .         .         [---------]| [3:4] 'dog'
|>         .         .         .         .| [0:0] S -> * NP VP
|>         .         .         .         .| [0:0] NP -> * NP PP
|>         .         .         .         .| [0:0] NP -> * Det Noun
|>         .         .         .         .| [0:0] NP -> * 'I'
|[---------]         .         .         .| [0:1] NP -> 'I' *
|[--------->         .         .         .| [0:1] S -> NP * VP
|[--------->         .         .         .| [0:1] NP -> NP * PP
|.         >         .         .         .| [1:1] VP -> * VP PP
|.         >         .         .         .| [1:1] VP -> * Verb NP
|.         >         .         .         .| [1:1] VP -> * Verb
|.         >         .         .         .| [1:1] Verb -> * 'saw'
|.         [---------]         .         .| [1:2] Verb -> 'saw' *
|.         [--------->         .         .| [1:2] VP -> Verb * NP
|.         [---------]         .         .| [1:2] VP -> Verb *
|[-------------------]         .         .| [0:2] S -> NP VP *
|.         [--------->         .         .| [1:2] VP -> VP * PP
|.         .         >         .         .| [2:2] NP -> * NP PP
|.         .         >         .         .| [2:2] NP -> * Det Noun
|.         .         >         .         .| [2:2] Det -> * 'a'
|.         .         [---------]         .| [2:3] Det -> 'a' *
|.         .         [--------->         .| [2:3] NP -> Det * Noun
|.         .         .         >         .| [3:3] Noun -> * 'dog'
|.         .         .         [---------]| [3:4] Noun -> 'dog' *
|.         .         [-------------------]| [2:4] NP -> Det Noun *
|.         [-----------------------------]| [1:4] VP -> Verb NP *
|.         .         [------------------->| [2:4] NP -> NP * PP
|[=======================================]| [0:4] S -> NP VP *
|.         [----------------------------->| [1:4] VP -> VP * PP
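The Earley parser can also be invoked directly rather than through the demo. The sketch below builds an EarleyChartParser over a small hand-written CFG; the grammar is an assumption approximating the demo grammar above, not the exact output of nltk.parse.chart.demo_grammar():

```python
import nltk
from nltk.parse.earleychart import EarleyChartParser

# A small CFG approximating the demo grammar above (an assumption,
# not the exact output of nltk.parse.chart.demo_grammar()).
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> NP PP | Det Noun | 'I'
    VP -> VP PP | Verb NP | Verb
    PP -> Prep NP
    Det -> 'a'
    Noun -> 'dog'
    Verb -> 'saw'
    Prep -> 'with'
""")

parser = EarleyChartParser(grammar)
trees = list(parser.parse('I saw a dog'.split()))
for tree in trees:
    print(tree)
```

With this grammar the sentence has a single derivation, since neither a PP attachment nor the bare VP -> Verb rule can cover the whole input.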
Consider an example that illustrates parsing using the Chart parser in NLTK:
>>> import nltk
>>> nltk.parse.chart.demo(2, print_times=False, trace=1, sent='John saw a dog', numparses=1)
* Sentence:
John saw a dog
['John', 'saw', 'a', 'dog']

* Strategy: Bottom-up

|.  John   .   saw   .    a    .   dog   .|
|[---------]         .         .         .| [0:1] 'John'
|.         [---------]         .         .| [1:2] 'saw'
|.         .         [---------]         .| [2:3] 'a'
|.         .         .         [---------]| [3:4] 'dog'
|>         .         .         .         .| [0:0] NP -> * 'John'
|[---------]         .         .         .| [0:1] NP -> 'John' *
|>         .         .         .         .| [0:0] S -> * NP VP
|>         .         .         .         .| [0:0] NP -> * NP PP
|[--------->         .         .         .| [0:1] S -> NP * VP
|[--------->         .         .         .| [0:1] NP -> NP * PP
|.         >         .         .         .| [1:1] Verb -> * 'saw'
|.         [---------]         .         .| [1:2] Verb -> 'saw' *
|.         >         .         .         .| [1:1] VP -> * Verb NP
|.         >         .         .         .| [1:1] VP -> * Verb
|.         [--------->         .         .| [1:2] VP -> Verb * NP
|.         [---------]         .         .| [1:2] VP -> Verb *
|.         >         .         .         .| [1:1] VP -> * VP PP
|[-------------------]         .         .| [0:2] S -> NP VP *
|.         [--------->         .         .| [1:2] VP -> VP * PP
|.         .         >         .         .| [2:2] Det -> * 'a'
|.         .         [---------]         .| [2:3] Det -> 'a' *
|.         .         >         .         .| [2:2] NP -> * Det Noun
|.         .         [--------->         .| [2:3] NP -> Det * Noun
|.         .         .         >         .| [3:3] Noun -> * 'dog'
|.         .         .         [---------]| [3:4] Noun -> 'dog' *
|.         .         [-------------------]| [2:4] NP -> Det Noun *
|.         .         >         .         .| [2:2] S -> * NP VP
|.         .         >         .         .| [2:2] NP -> * NP PP
|.         [-----------------------------]| [1:4] VP -> Verb NP *
|.         .         [------------------->| [2:4] S -> NP * VP
|.         .         [------------------->| [2:4] NP -> NP * PP
|[=======================================]| [0:4] S -> NP VP *
|.         [----------------------------->| [1:4] VP -> VP * PP
Nr edges in chart: 33
(S (NP John) (VP (Verb saw) (NP (Det a) (Noun dog))))
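Outside the demo, a bottom-up chart parser can be constructed directly with BottomUpChartParser; chart_parse returns the chart itself, so the edge count and the parses can both be inspected. The small grammar here is again an assumption modelled on the demo, not demo_grammar():

```python
import nltk
from nltk.parse.chart import BottomUpChartParser

# Toy grammar (an assumption modelled on the demo above).
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det Noun | 'John'
    VP -> Verb NP
    Det -> 'a'
    Noun -> 'dog'
    Verb -> 'saw'
""")

parser = BottomUpChartParser(grammar)
chart = parser.chart_parse('John saw a dog'.split())
print("Nr edges in chart:", chart.num_edges())
for tree in chart.parses(grammar.start()):
    print(tree)
```

The edge count differs from the demo's 33 because this toy grammar has fewer productions than the demo grammar.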
Consider an example that illustrates parsing using the Stepping Chart parser in NLTK:
>>> import nltk
>>> nltk.parse.chart.demo(5, print_times=False, trace=1, sent='John saw a dog', numparses=2)
* Sentence:
John saw a dog
['John', 'saw', 'a', 'dog']

* Strategy: Stepping (top-down vs bottom-up)

*** SWITCH TO TOP DOWN
|.  John   .   saw   .    a    .   dog   .|
|[---------]         .         .         .| [0:1] 'John'
|.         [---------]         .         .| [1:2] 'saw'
|.         .         [---------]         .| [2:3] 'a'
|.         .         .         [---------]| [3:4] 'dog'
|>         .         .         .         .| [0:0] S -> * NP VP
|>         .         .         .         .| [0:0] NP -> * NP PP
|>         .         .         .         .| [0:0] NP -> * Det Noun
|>         .         .         .         .| [0:0] NP -> * 'John'
|[---------]         .         .         .| [0:1] NP -> 'John' *
|[--------->         .         .         .| [0:1] S -> NP * VP
|[--------->         .         .         .| [0:1] NP -> NP * PP
|.         >         .         .         .| [1:1] VP -> * VP PP
|.         >         .         .         .| [1:1] VP -> * Verb NP
|.         >         .         .         .| [1:1] VP -> * Verb
|.         >         .         .         .| [1:1] Verb -> * 'saw'
|.         [---------]         .         .| [1:2] Verb -> 'saw' *
|.         [--------->         .         .| [1:2] VP -> Verb * NP
|.         [---------]         .         .| [1:2] VP -> Verb *
|[-------------------]         .         .| [0:2] S -> NP VP *
|.         [--------->         .         .| [1:2] VP -> VP * PP
|.         .         >         .         .| [2:2] NP -> * NP PP
|.         .         >         .         .| [2:2] NP -> * Det Noun
*** SWITCH TO BOTTOM UP
|.         .         >         .         .| [2:2] Det -> * 'a'
|.         .         .         >         .| [3:3] Noun -> * 'dog'
|.         .         [---------]         .| [2:3] Det -> 'a' *
|.         .         .         [---------]| [3:4] Noun -> 'dog' *
|.         .         [--------->         .| [2:3] NP -> Det * Noun
|.         .         [-------------------]| [2:4] NP -> Det Noun *
|.         [-----------------------------]| [1:4] VP -> Verb NP *
|.         .         [------------------->| [2:4] NP -> NP * PP
|[=======================================]| [0:4] S -> NP VP *
|.         [----------------------------->| [1:4] VP -> VP * PP
|.         .         >         .         .| [2:2] S -> * NP VP
|.         .         [------------------->| [2:4] S -> NP * VP
*** SWITCH TO TOP DOWN
|.         .         .         .         >| [4:4] VP -> * VP PP
|.         .         .         .         >| [4:4] VP -> * Verb NP
|.         .         .         .         >| [4:4] VP -> * Verb
*** SWITCH TO BOTTOM UP
*** SWITCH TO TOP DOWN
*** SWITCH TO BOTTOM UP
*** SWITCH TO TOP DOWN
*** SWITCH TO BOTTOM UP
*** SWITCH TO TOP DOWN
*** SWITCH TO BOTTOM UP
Nr edges in chart: 37
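The stepping parser can also be driven by hand: set a strategy, draw edges one at a time from step(), and switch strategies whenever you like. The following is a hedged sketch under the assumption that step() yields None once the current strategy can add nothing more; the toy grammar is again an assumption, not demo_grammar():

```python
import nltk
from nltk.parse.chart import SteppingChartParser, TD_STRATEGY, BU_STRATEGY

# Toy grammar (an assumption modelled on the demos above).
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det Noun | 'John'
    VP -> Verb NP
    Det -> 'a'
    Noun -> 'dog'
    Verb -> 'saw'
""")

cp = SteppingChartParser(grammar)
cp.initialize('John saw a dog'.split())

# Alternate strategies, taking at most 10 edges with each, until a
# complete parse appears (the outer range keeps the loop bounded).
for _ in range(100):
    if list(cp.parses()):
        break
    for strategy in (TD_STRATEGY, BU_STRATEGY):
        cp.set_strategy(strategy)
        for i, edge in enumerate(cp.step()):
            if edge is None or i >= 10:
                break

for tree in cp.parses():
    print(tree)
```

This mirrors what demo(5) does internally: the "SWITCH TO TOP DOWN" and "SWITCH TO BOTTOM UP" lines in the trace correspond to the set_strategy calls.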
Consider an example that illustrates feature chart parsing in NLTK:
>>> import nltk
>>> nltk.parse.featurechart.demo(print_times=False, print_grammar=True, parser=nltk.parse.featurechart.FeatureChartParser, sent='I saw a dog')
Grammar with 18 productions (start state = S[])
    S[] -> NP[] VP[]
    PP[] -> Prep[] NP[]
    NP[] -> NP[] PP[]
    VP[] -> VP[] PP[]
    VP[] -> Verb[] NP[]
    VP[] -> Verb[]
    NP[] -> Det[pl=?x] Noun[pl=?x]
    NP[] -> 'John'
    NP[] -> 'I'
    Det[] -> 'the'
    Det[] -> 'my'
    Det[-pl] -> 'a'
    Noun[-pl] -> 'dog'
    Noun[-pl] -> 'cookie'
    Verb[] -> 'ate'
    Verb[] -> 'saw'
    Prep[] -> 'with'
    Prep[] -> 'under'

* FeatureChartParser
Sentence: I saw a dog
|. I .saw. a .dog.|
|[---]   .   .   .| [0:1] 'I'
|.   [---]   .   .| [1:2] 'saw'
|.   .   [---]   .| [2:3] 'a'
|.   .   .   [---]| [3:4] 'dog'
|[---]   .   .   .| [0:1] NP[] -> 'I' *
|[--->   .   .   .| [0:1] S[] -> NP[] * VP[] {}
|[--->   .   .   .| [0:1] NP[] -> NP[] * PP[] {}
|.   [---]   .   .| [1:2] Verb[] -> 'saw' *
|.   [--->   .   .| [1:2] VP[] -> Verb[] * NP[] {}
|.   [---]   .   .| [1:2] VP[] -> Verb[] *
|.   [--->   .   .| [1:2] VP[] -> VP[] * PP[] {}
|[-------]   .   .| [0:2] S[] -> NP[] VP[] *
|.   .   [---]   .| [2:3] Det[-pl] -> 'a' *
|.   .   [--->   .| [2:3] NP[] -> Det[pl=?x] * Noun[pl=?x] {?x: False}
|.   .   .   [---]| [3:4] Noun[-pl] -> 'dog' *
|.   .   [-------]| [2:4] NP[] -> Det[-pl] Noun[-pl] *
|.   .   [------->| [2:4] S[] -> NP[] * VP[] {}
|.   .   [------->| [2:4] NP[] -> NP[] * PP[] {}
|.   [-----------]| [1:4] VP[] -> Verb[] NP[] *
|.   [----------->| [1:4] VP[] -> VP[] * PP[] {}
|[===============]| [0:4] S[] -> NP[] VP[] *
(S[] (NP[] I) (VP[] (Verb[] saw) (NP[] (Det[-pl] a) (Noun[-pl] dog))))
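The feature constraints can be seen doing real work with a cut-down feature grammar of our own. The grammar below is an assumption modelled on the demo grammar: inside an NP, Det and Noun must agree on the pl (plural) feature, so 'a dog' unifies while 'a dogs' does not:

```python
import nltk
from nltk.grammar import FeatureGrammar
from nltk.parse.featurechart import FeatureChartParser

# Cut-down feature grammar (an assumption modelled on the demo grammar).
fg = FeatureGrammar.fromstring("""
    % start S
    S -> NP VP
    VP -> Verb NP
    NP -> Det[pl=?x] Noun[pl=?x]
    NP -> 'I'
    Det[-pl] -> 'a'
    Det[+pl] -> 'these'
    Noun[-pl] -> 'dog'
    Noun[+pl] -> 'dogs'
    Verb -> 'saw'
""")

parser = FeatureChartParser(fg)
good = list(parser.parse('I saw a dog'.split()))   # pl features unify
bad = list(parser.parse('I saw a dogs'.split()))   # Det[-pl]/Noun[+pl] clash
print(len(good), len(bad))
```

The `{?x: False}` annotation in the trace above records exactly this kind of binding: the variable ?x is instantiated when the [-pl] determiner is scanned, and the noun must then carry a compatible value.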
The following demo function is found in NLTK's implementation of the Earley algorithm:
def demo(print_times=True, print_grammar=False,
         print_trees=True, trace=2,
         sent='I saw John with a dog with my cookie', numparses=5):
    """
    A demonstration of the Earley parsers.
    """
    import sys, time
    from nltk.parse.chart import demo_grammar

    # The grammar for ChartParser and SteppingChartParser:
    grammar = demo_grammar()
    if print_grammar:
        print("* Grammar")
        print(grammar)

    # Tokenize the sample sentence.
    print("* Sentence:")
    print(sent)
    tokens = sent.split()
    print(tokens)
    print()

    # Do the parsing.
    earley = EarleyChartParser(grammar, trace=trace)
    t = time.clock()
    chart = earley.chart_parse(tokens)
    parses = list(chart.parses(grammar.start()))
    t = time.clock() - t

    # Print results.
    if numparses:
        assert len(parses) == numparses, 'Not all parses found'
    if print_trees:
        for tree in parses:
            print(tree)
    else:
        print("Nr trees:", len(parses))
    if print_times:
        print("Time:", t)

if __name__ == '__main__':
    demo()