
Source Code for Module nltk.detect

# Natural Language Toolkit: Detect Features
#
# Copyright (C) 2001-2008 NLTK Project
# Author: Edward Loper <[email protected]>
#         Steven Bird <[email protected]> (porting)
# URL: <http://nltk.org>
# For license information, see LICENSE.TXT

"""
Functions for detecting a token's X{features}.  Features are stored in
a dictionary which maps feature names to feature values.

(Not yet ported from NLTK: A X{feature encoder} can then be used to
translate the feature dictionary into a homogenous representation
(such as a sparse boolean list), suitable for use with other
processing tasks.)
"""

def feature(functions):
    """
    Return a feature detector that applies the supplied functions
    to a token.

    @type functions: dictionary of functions
    @param functions: maps each feature name to a function that takes
        one string argument and computes that feature's value.
    @return: a function that maps a token to a list of
        (feature name, feature value) tuples.
    """

    return lambda token: [(name, function(token)) for
                          (name, function) in functions.items()]


def get_features(str):
    """
    Take a string and return a list of
    (feature type, feature value) tuples.
    """
    # Stub: there is no body yet, so this currently returns None.


def text_feature():
    # Detector whose single 'text' feature is the token itself.
    return feature({'text': lambda t: t})

def stem_feature(stemmer):
    # Detector whose single 'stem' feature is stemmer(token); stemmer
    # must be a function from a token to its stem.
    return feature({'stem': stemmer})

# def context_feature():
# Meet the need that motivated BagOfContainedWordsFeatureDetector
# and SetOfContainedWordsFeatureDetector


######################################################################
## Demo
######################################################################

def demo():
    from nltk.corpus import brown
    from nltk import detect

    detector = detect.feature({'initial': lambda t: [t[0]],
                               'len': lambda t: [len(t)]})

    # brown.words() yields individual words, so each item is one token.
    for word in brown.words('a')[:10]:
        print(detector(word))

if __name__ == '__main__': demo()
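
The module docstring mentions a feature encoder that has not been ported. The sketch below shows how a detector's output could be turned into a dictionary and then into a sparse boolean list. It is only an illustration: `feature` is re-declared so the snippet runs without NLTK or the Brown corpus, while `encode`, `vocabulary`, and the sample tokens are hypothetical names that do not appear in the module. It assumes Python 3.7 or later (ordered dictionaries) and uses scalar feature values rather than the one-element lists in `demo()`, so that each (name, value) pair is hashable.

```python
def feature(functions):
    """Stand-in for the detector factory in the listing above."""
    return lambda token: [(name, function(token))
                          for (name, function) in functions.items()]


def encode(feature_pairs, vocabulary):
    """Hypothetical encoder (not part of NLTK): return the sorted indices
    of the (name, value) pairs found in vocabulary, i.e. the positions
    that would be True in a boolean feature vector."""
    return sorted(vocabulary[pair] for pair in feature_pairs
                  if pair in vocabulary)


detector = feature({'initial': lambda t: t[0],
                    'len': lambda t: len(t)})

# Sample tokens chosen for illustration only.
tokens = ['The', 'jury', 'said']
detected = [detector(token) for token in tokens]

# Give every distinct (name, value) pair an index, in first-seen order.
vocabulary = {}
for pairs in detected:
    for pair in pairs:
        vocabulary.setdefault(pair, len(vocabulary))

encoded = [encode(pairs, vocabulary) for pairs in detected]

# dict() recovers the name-to-value mapping the module docstring describes.
print(dict(detected[0]))
print(encoded)
```

Tokens that share a feature value, such as the two four-letter words here, end up sharing an index, which is what makes the encoded form usable as a common representation across tokens.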