nltk.test.unit package

Submodules

nltk.test.unit.test_2x_compat module

Unit tests for nltk.compat. See also nltk/test/compat.doctest.

class nltk.test.unit.test_2x_compat.TestFraction(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_unnoramlize_fraction()[source]
class nltk.test.unit.test_2x_compat.TestTextTransliteration(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_repr()[source]
test_str()[source]
txt = <Text: São Tomé and Príncipe...>
nltk.test.unit.test_2x_compat.setup_module(module)[source]

nltk.test.unit.test_aline module

Unit tests for nltk.metrics.aline

class nltk.test.unit.test_aline.TestAline(methodName='runTest')[source]

Bases: unittest.case.TestCase

Test Aline algorithm for aligning phonetic sequences

test_aline()[source]
test_aline_delta()[source]

Test aline for computing the difference between two segments

nltk.test.unit.test_brill module

Tests for Brill tagger.

class nltk.test.unit.test_brill.TestBrill(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_brill_demo()[source]
test_pos_template()[source]

nltk.test.unit.test_chunk module

class nltk.test.unit.test_chunk.TestChunkRule(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_tag_pattern2re_pattern_quantifier()[source]

Test for bug https://github.com/nltk/nltk/issues/1597

Ensures that curly bracket quantifiers can be used inside a chunk rule. This type of quantifier has been used for the supplementary example in http://www.nltk.org/book/ch07.html#exploring-text-corpora.

nltk.test.unit.test_classify module

Unit tests for nltk.classify. See also: nltk/test/classify.doctest

nltk.test.unit.test_classify.assert_classifier_correct(algorithm)[source]
nltk.test.unit.test_classify.test_megam()[source]
nltk.test.unit.test_classify.test_tadm()[source]

nltk.test.unit.test_collocations module

class nltk.test.unit.test_collocations.TestBigram(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_bigram2()[source]
test_bigram3()[source]
test_bigram5()[source]
nltk.test.unit.test_collocations.close_enough(x, y)[source]

Verify that two sequences of n-gram association values are within _EPSILON of each other.
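
The comparison described above can be sketched as follows. This is a minimal illustration, not the module's actual code: it assumes each sequence holds (n-gram, score) pairs, and the tolerance value assigned to `_EPSILON` here is an assumed placeholder.

```python
# Assumed tolerance for illustration; the value used by the test module may differ.
_EPSILON = 1e-8


def close_enough(x, y, eps=_EPSILON):
    """Return True if two sequences of (ngram, score) pairs agree within eps."""
    if len(x) != len(y):
        return False
    for (ngram_x, score_x), (ngram_y, score_y) in zip(x, y):
        # The n-grams must match exactly; the scores only approximately.
        if ngram_x != ngram_y or abs(score_x - score_y) > eps:
            return False
    return True


expected = [(("a", "b"), 1.0), (("b", "c"), 0.5)]
observed = [(("a", "b"), 1.0 + 1e-12), (("b", "c"), 0.5)]
drifted = [(("a", "b"), 1.1), (("b", "c"), 0.5)]

print(close_enough(expected, observed))
print(close_enough(expected, drifted))
```

Comparing with a tolerance rather than `==` keeps the tests stable against small floating-point differences between platforms.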

nltk.test.unit.test_concordance module

class nltk.test.unit.test_concordance.TestConcordance(methodName='runTest')[source]

Bases: unittest.case.TestCase

Text constructed using: http://www.nltk.org/book/ch01.html

setUp()[source]

Hook method for setting up the test fixture before exercising it.

classmethod setup_class()[source]
tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

classmethod teardown_class()[source]
test_concordance_lines()[source]
test_concordance_list()[source]
test_concordance_print()[source]
test_concordance_width()[source]
nltk.test.unit.test_concordance.stdout_redirect(where)[source]
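
The setUp/tearDown hooks and the idea of redirecting standard output can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in: `FixtureDemo` and its word list are hypothetical, and `contextlib.redirect_stdout` is used in place of this module's own `stdout_redirect` helper, whose implementation is not shown here.

```python
import io
import unittest
from contextlib import redirect_stdout


class FixtureDemo(unittest.TestCase):
    """Hypothetical test case showing per-test fixture hooks."""

    def setUp(self):
        # Runs before every test method: build a fresh fixture.
        self.words = ["monstrous", "whale", "monstrous", "size"]

    def tearDown(self):
        # Runs after every test method, whether it passed or failed.
        self.words = None

    def test_print_is_captured(self):
        buffer = io.StringIO()
        # Everything printed inside the block goes to `buffer`, not the console.
        with redirect_stdout(buffer):
            print(self.words.count("monstrous"))
        self.assertEqual(buffer.getvalue(), "2\n")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FixtureDemo)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```

Capturing output this way lets a test assert on what a printing function such as a concordance display writes, without changing the function itself.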

nltk.test.unit.test_corenlp module

Mock test for Stanford CoreNLP wrappers.

class nltk.test.unit.test_corenlp.TestParserAPI(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_dependency_parser()[source]
test_parse()[source]
class nltk.test.unit.test_corenlp.TestTaggerAPI(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_ner_tagger()[source]
test_pos_tagger()[source]
test_unexpected_tagtype()[source]
class nltk.test.unit.test_corenlp.TestTokenizerAPI(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_tokenize()[source]

nltk.test.unit.test_corpora module

nltk.test.unit.test_corpus_views module

Corpus View Regression Tests

class nltk.test.unit.test_corpus_views.TestCorpusViews(methodName='runTest')[source]

Bases: unittest.case.TestCase

data()[source]
linetok = <nltk.tokenize.simple.LineTokenizer object>
names = ['corpora/inaugural/README', 'corpora/inaugural/1793-Washington.txt', 'corpora/inaugural/1909-Taft.txt']
test_correct_length()[source]
test_correct_values()[source]

nltk.test.unit.test_data module

class nltk.test.unit.test_data.TestData(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_find_raises_exception()[source]
test_find_raises_exception_with_full_resource_name()[source]

nltk.test.unit.test_disagreement module

class nltk.test.unit.test_disagreement.TestDisagreement(methodName='runTest')[source]

Bases: unittest.case.TestCase

Class containing unit tests for nltk.metrics.agreement.Disagreement.

test_advanced()[source]

More advanced test, based on http://www.agreestat.com/research_papers/onkrippendorffalpha.pdf

test_advanced2()[source]

Same more advanced example, but with 1 rating removed. Again, removal of that 1 rating should not matter.

test_easy()[source]

Simple test, based on https://github.com/foolswood/krippendorffs_alpha/raw/master/krippendorff.pdf.

test_easy2()[source]

Same simple test with 1 rating removed. Removal of that rating should not matter: K-Alpha ignores items with only 1 rating.
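
Why a singly rated item does not matter can be seen in a small pure-Python sketch of Krippendorff's alpha for nominal labels. This is an illustration only, not the code in nltk.metrics.agreement: the function name `nominal_alpha`, the input layout (one list of labels per item), and the sample ratings are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists; each inner list holds the labels
    that the coders assigned to one item.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single rating forms no pairs, so the item is ignored
        # Every ordered pair of ratings on the same item counts 1 / (m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    )
    if expected == 0:
        return 1.0  # only one label in use; convention assumed for this sketch
    return 1.0 - (n - 1) * observed / expected


ratings = [["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"]]
with_singleton = ratings + [["a"]]  # extra item rated by only one coder

print(nominal_alpha(ratings) == nominal_alpha(with_singleton))
```

Because alpha is built from pairs of ratings on the same item, an item with one rating contributes no pairs and leaves both the observed and the expected disagreement unchanged.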

nltk.test.unit.test_hmm module

nltk.test.unit.test_hmm.setup_module(module)[source]
nltk.test.unit.test_hmm.test_backward_probability()[source]
nltk.test.unit.test_hmm.test_forward_probability()[source]
nltk.test.unit.test_hmm.test_forward_probability2()[source]

nltk.test.unit.test_json2csv_corpus module

Regression tests for json2csv() and json2csv_entities() in Twitter package.

class nltk.test.unit.test_json2csv_corpus.TestJSON2CSV(methodName='runTest')[source]

Bases: unittest.case.TestCase

setUp()[source]

Hook method for setting up the test fixture before exercising it.

tearDown()[source]

Hook method for deconstructing the test fixture after testing it.

test_file_is_wrong()[source]

Sanity check that file comparison is not giving false positives.

test_retweet_original_tweet()[source]
test_textoutput()[source]
test_tweet_hashtag()[source]
test_tweet_media()[source]
test_tweet_metadata()[source]
test_tweet_place()[source]
test_tweet_place_boundingbox()[source]
test_tweet_url()[source]
test_tweet_usermention()[source]
test_user_metadata()[source]
test_userurl()[source]
nltk.test.unit.test_json2csv_corpus.are_files_identical(filename1, filename2, debug=False)[source]

Compare two files, ignoring carriage returns.
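
A comparison that ignores carriage returns might look like the sketch below. It is a hedged illustration rather than the module's implementation: the function name `files_match_ignoring_cr` and the sample CSV contents are made up for this example.

```python
import os
import tempfile


def files_match_ignoring_cr(path1, path2):
    """Return True if both files have equal bytes once '\\r' is removed."""
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        return f1.read().replace(b"\r", b"") == f2.read().replace(b"\r", b"")


with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "unix.csv": b"id,text\n1,hello\n",       # LF line endings
        "dos.csv": b"id,text\r\n1,hello\r\n",    # CRLF line endings, same data
        "other.csv": b"id,text\n2,bye\n",        # genuinely different data
    }
    paths = {}
    for name, payload in samples.items():
        paths[name] = os.path.join(tmp, name)
        with open(paths[name], "wb") as handle:
            handle.write(payload)

    same = files_match_ignoring_cr(paths["unix.csv"], paths["dos.csv"])
    different = files_match_ignoring_cr(paths["unix.csv"], paths["other.csv"])

print(same, different)
```

The second comparison plays the role of a sanity check: a helper that returned True for every pair of files would make the regression tests pass vacuously.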

nltk.test.unit.test_naivebayes module

class nltk.test.unit.test_naivebayes.NaiveBayesClassifierTest(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_simple()[source]

nltk.test.unit.test_pos_tag module

Tests for nltk.pos_tag

class nltk.test.unit.test_pos_tag.TestPosTag(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_pos_tag_eng()[source]
test_pos_tag_eng_universal()[source]
test_pos_tag_rus()[source]
test_pos_tag_rus_universal()[source]
test_pos_tag_unknown_lang()[source]
test_unspecified_lang()[source]

nltk.test.unit.test_rte_classify module

class nltk.test.unit.test_rte_classify.RTEClassifierTest(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_feature_extractor_object()[source]
test_rte_classification_with_megam()[source]
test_rte_classification_without_megam()[source]
test_rte_feature_extraction()[source]

nltk.test.unit.test_seekable_unicode_stream_reader module

The following test performs a random series of reads, seeks, and tells, and checks that the results are consistent.

nltk.test.unit.test_seekable_unicode_stream_reader.check_reader(unicode_string, encoding, n=1000)[source]
nltk.test.unit.test_seekable_unicode_stream_reader.teardown_module(module=None)[source]
nltk.test.unit.test_seekable_unicode_stream_reader.test_reader()[source]
nltk.test.unit.test_seekable_unicode_stream_reader.test_reader_on_large_string()[source]
nltk.test.unit.test_seekable_unicode_stream_reader.test_reader_stream_is_closed()[source]
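
The random read/seek/tell consistency check described for this module can be sketched with the standard library. The example below is an assumption: it exercises `io.TextIOWrapper` rather than NLTK's own stream reader, and the function name `check_text_reader` and its `seed` parameter are hypothetical.

```python
import io
import random


def check_text_reader(text, encoding, n=200, seed=0):
    """Randomly read and seek, checking every read against the source string."""
    rng = random.Random(seed)
    raw = io.BytesIO(text.encode(encoding))
    # newline="" disables newline translation so characters map one to one.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="")

    consumed = 0
    # Each checkpoint pairs an opaque tell() value with the characters read so far.
    checkpoints = [(stream.tell(), consumed)]
    for _ in range(n):
        if rng.random() < 0.5:
            chunk = stream.read(rng.randint(1, 5))
            # A read must return exactly the next slice of the original text.
            assert chunk == text[consumed:consumed + len(chunk)]
            consumed += len(chunk)
            checkpoints.append((stream.tell(), consumed))
        else:
            # Jump back to a remembered position; later reads verify the seek.
            position, consumed = rng.choice(checkpoints)
            stream.seek(position)
    stream.close()
    return True


sample = "São Tomé and Príncipe " * 20
ok = check_text_reader(sample, "utf-8")
print(ok)
```

Using non-ASCII text matters here: with a multi-byte encoding, byte offsets and character offsets diverge, which is exactly where a seekable text reader is most likely to go wrong.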

nltk.test.unit.test_senna module

Unit tests for Senna

class nltk.test.unit.test_senna.TestSennaPipeline(methodName='runTest')[source]

Bases: unittest.case.TestCase

Unittest for nltk.classify.senna

test_senna_pipeline()[source]

Senna pipeline interface

class nltk.test.unit.test_senna.TestSennaTagger(methodName='runTest')[source]

Bases: unittest.case.TestCase

Unittest for nltk.tag.senna

test_senna_chunk_tagger()[source]
test_senna_ner_tagger()[source]
test_senna_tagger()[source]

nltk.test.unit.test_stem module

class nltk.test.unit.test_stem.PorterTest(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_oed_bug()[source]

Test for bug https://github.com/nltk/nltk/issues/1581

Ensures that ‘oed’ can be stemmed without throwing an error.

test_vocabulary_martin_mode()[source]

Tests all words from the test vocabulary provided by M Porter

The sample vocabulary and output were sourced from:
http://tartarus.org/martin/PorterStemmer/voc.txt
http://tartarus.org/martin/PorterStemmer/output.txt

and are linked to from the Porter Stemmer algorithm’s homepage.

test_vocabulary_nltk_mode()[source]
test_vocabulary_original_mode()[source]
class nltk.test.unit.test_stem.SnowballTest(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_arabic()[source]

Unit test for the Snowball Arabic light stemmer, which deals with prefixes and suffixes.

test_german()[source]
test_russian()[source]
test_short_strings_bug()[source]
test_spanish()[source]

nltk.test.unit.test_tag module

nltk.test.unit.test_tag.setup_module(module)[source]
nltk.test.unit.test_tag.test_basic()[source]

nltk.test.unit.test_tgrep module

Unit tests for nltk.tgrep.

class nltk.test.unit.test_tgrep.TestSequenceFunctions(methodName='runTest')[source]

Bases: unittest.case.TestCase

Class containing unit tests for nltk.tgrep.

test_bad_operator()[source]

Test error handling of undefined tgrep operators.

test_comments()[source]

Test that comments are correctly filtered out of tgrep search strings.

test_examples()[source]

Test the Basic Examples from the TGrep2 manual.

test_labeled_nodes()[source]

Test labeled nodes.

Test case from Emily M. Bender.

test_multiple_conjs()[source]

Test that multiple (3 or more) conjunctions of node relations are handled properly.

test_node_encoding()[source]

Test that tgrep search strings handle bytes and strs the same way.

test_node_nocase()[source]

Test selecting nodes using case insensitive node names.

test_node_noleaves()[source]

Test node name matching with the search_leaves flag set to False.

test_node_printing()[source]

Test that the tgrep print operator ' is properly ignored.

test_node_quoted()[source]

Test selecting nodes using quoted node names.

test_node_regex()[source]

Test regex matching on nodes.

test_node_regex_2()[source]

Test regex matching on nodes.

test_node_simple()[source]

Test a simple use of tgrep for finding nodes matching a given pattern.

test_node_tree_position()[source]

Test matching on nodes based on NLTK tree position.

test_rel_precedence()[source]

Test matching nodes based on precedence relations.

test_rel_sister_nodes()[source]

Test matching sister nodes in a tree.

test_tokenize_encoding()[source]

Test that tokenization handles bytes and strs the same way.

test_tokenize_examples()[source]

Test tokenization of the TGrep2 manual example patterns.

Test tokenization of basic link types.

test_tokenize_macros()[source]

Test tokenization of macro definitions.

test_tokenize_node_labels()[source]

Test tokenization of labeled nodes.

test_tokenize_nodenames()[source]

Test tokenization of node names.

test_tokenize_quoting()[source]

Test tokenization of quoting.

test_tokenize_segmented_patterns()[source]

Test tokenization of segmented patterns.

test_tokenize_simple()[source]

Simple test of tokenization.

test_trailing_semicolon()[source]

Test that semicolons at the end of a tgrep2 search string won’t cause a parse failure.

test_use_macros()[source]

Test defining and using tgrep2 macros.

tests_rel_dominance()[source]

Test matching nodes based on dominance relations.

tests_rel_indexed_children()[source]

Test matching nodes based on their index in their parent node.

nltk.test.unit.test_tokenize module

Unit tests for nltk.tokenize. See also nltk/test/tokenize.doctest

class nltk.test.unit.test_tokenize.TestTokenize(methodName='runTest')[source]

Bases: unittest.case.TestCase

test_phone_tokenizer()[source]

Test a string that resembles a phone number but contains a newline

test_remove_handle()[source]

Test remove_handle() from casual.py with specially crafted edge cases

test_stanford_segmenter_arabic()[source]

Test the Stanford Word Segmenter for Arabic (default config)

test_stanford_segmenter_chinese()[source]

Test the Stanford Word Segmenter for Chinese (default config)

test_treebank_span_tokenizer()[source]

Test TreebankWordTokenizer.span_tokenize function

test_tweet_tokenizer()[source]

Test TweetTokenizer using words with special and accented characters.

test_word_tokenize()[source]

Test word_tokenize function

nltk.test.unit.test_twitter_auth module

nltk.test.unit.test_wordnet module

nltk.test.unit.utils module

nltk.test.unit.utils.skip(reason)[source]

Unconditionally skip a test.

nltk.test.unit.utils.skipIf(condition, reason)[source]

Skip a test if the condition is true.
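
The two decorators above mirror `unittest.skip` and `unittest.skipIf` from the standard library. The sketch below uses the standard-library versions, on the assumption that the helpers in this module behave similarly; the `SkipDemo` class and its skip reasons are hypothetical.

```python
import io
import sys
import unittest


class SkipDemo(unittest.TestCase):
    """Hypothetical test case with one unconditional and one conditional skip."""

    @unittest.skip("demonstrating an unconditional skip")
    def test_never_runs(self):
        self.fail("should not be reached")

    @unittest.skipIf(sys.version_info >= (3, 0), "only relevant on Python 2")
    def test_python2_only(self):
        self.fail("should not be reached on Python 3")

    def test_runs(self):
        self.assertTrue(True)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipDemo)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)

# Each entry in result.skipped is a (test, reason) pair.
for test, reason in result.skipped:
    print(test.id(), "skipped:", reason)
```

A skipped test is reported separately from passes and failures, so a run that skips tests for missing optional dependencies still counts as successful.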

Module contents