diff --git a/book/ch05.rst b/book/ch05.rst
index 4a25d6ac..8d50d440 100644
--- a/book/ch05.rst
+++ b/book/ch05.rst
@@ -316,7 +316,7 @@ category of the Brown corpus:
 We can use these tags to do powerful searches using a graphical
 POS-concordance tool ``nltk.app.concordance()``. Use it
 to search for any combination of words and POS tags, e.g.
-``N N N N``, ``hit/VD``, ``hit/VN``, or ``the ADJ man``.
+``NOUN NOUN NOUN NOUN``, ``hit/VBD``, ``hit/VBN``, or ``the ADJ man``.
 
 .. Screenshot
 
@@ -416,8 +416,9 @@ will do this for the WSJ tagset rather than the universal tagset:
 
 To clarify the distinction between ``VBD`` (past tense) and ``VBN``
 (past participle), let's find words which can be both ``VBD`` and
-``VBN``, and see some surrounding text:
+``VBN`` from the WSJ tagset, and see some surrounding text:
 
+    >>> cfd1 = nltk.ConditionalFreqDist(wsj)
     >>> [w for w in cfd1.conditions() if 'VBD' in cfd1[w] and 'VBN' in cfd1[w]]
     ['Asked', 'accelerated', 'accepted', 'accused', 'acquired', 'added', 'adopted', ...]
     >>> idx1 = wsj.index(('kicked', 'VBD'))
@@ -565,7 +566,7 @@ the distinctions between the tags.
     >>> data = nltk.ConditionalFreqDist((word.lower(), tag)
     ...                                 for (word, tag) in brown_news_tagged)
     >>> for word in sorted(data.conditions()):
-    ...     if len(data[word]) > 3:
+    ...     if len(data[word]) >= 3:
     ...         tags = [tag for (tag, _) in data[word].most_common()]
     ...         print(word, ' '.join(tags))
     ...