Package nltk :: Package draw :: Module rechunkparser :: Class RegexpChunkDemo

Class RegexpChunkDemo

object --+
         |
        RegexpChunkDemo

A graphical tool for exploring the regular expression based chunk parser (RegexpChunkParser).

See HELP for instructional text.

Instance Methods

normalize_grammar(self, grammar)

__init__(self, devset_name='conll2000', devset=None, grammar='', chunk_node='NP', tagset=None)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature

_init_bindings(self, top)

_init_fonts(self, top)

_init_menubar(self, parent)

toggle_show_trace(self, *e)

_eval_plot(self, *e, **config)

_eval_demon(self)

_adaptively_modify_eval_chunk(self, t)
Modify _EVAL_CHUNK to try to keep the amount of time that the eval demon takes between _EVAL_DEMON_MIN and _EVAL_DEMON_MAX.

_init_widgets(self, top)

show_trace(self, *e)

show_help(self, tab)

_history_prev(self, *e)

_history_next(self, *e)

_view_history(self, index)

_devset_next(self, *e)

_devset_prev(self, *e)

destroy(self, *e)

_devset_scroll(self, command, *args)

show_devset(self, index=None)

_chunks(self, tree)

_syntax_highlight_grammar(self, grammar)

_grammarcheck(self, grammar)

update(self, *event)

_highlight_devset(self, sample=None)

_chunkparse(self, words)

_color_chunk(self, sentnum, chunk, tag)

reset(self)

save_grammar(self, filename=None)

load_grammar(self, filename=None)

save_history(self, filename=None)

about(self, *e)

set_devset_size(self, size=None)

resize(self, size=None)

mainloop(self, *args, **kwargs)
Enter the Tkinter mainloop.

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  TAGSET = {'#': 'pound sign (currency marker)', '$': 'dollar si...
A dictionary mapping from part of speech tags to descriptions, which is used in the help text.
  HELP = [('Help', '20', 'Welcome to the regular expression chun...
Contents for the help box.
  HELP_AUTOTAG = [('red', {'foreground': '#a00'}), ('green', {'f...
  _EVAL_DELAY = 1
If the user has not pressed any key for this amount of time (in seconds), and the current grammar has not been evaluated, then the eval demon will evaluate it.
  _EVAL_CHUNK = 15
The number of sentences that should be evaluated by the eval demon each time it runs.
  _EVAL_FREQ = 0.2
The interval (in seconds) between successive runs of the eval demon.
  _EVAL_DEMON_MIN = 0.02
The minimum amount of time that the eval demon should take each time it runs -- if it takes less than this time, _EVAL_CHUNK will be modified upwards.
  _EVAL_DEMON_MAX = 0.04
The maximum amount of time that the eval demon should take each time it runs -- if it takes more than this time, _EVAL_CHUNK will be modified downwards.
  _GRAMMARBOX_PARAMS = {'background': '#efe', 'border': 2, 'heig...
  _HELPBOX_PARAMS = {'background': '#efe', 'border': 2, 'foregro...
  _DEVSETBOX_PARAMS = {'background': '#eef', 'border': 2, 'heigh...
  _STATUS_PARAMS = {'background': '#9bb', 'border': 2, 'relief':...
  _FONT_PARAMS = {'family': 'helvetica', 'size': -20}
  _FRAME_PARAMS = {'background': '#777', 'border': 3, 'padx': 2,...
  _EVALBOX_PARAMS = {'background': '#eef', 'border': 2, 'height'...
  _BUTTON_PARAMS = {'activebackground': '#777', 'background': '#...
  _HELPTAB_BG_COLOR = '#aba'
  _HELPTAB_FG_COLOR = '#efe'
  _HELPTAB_FG_PARAMS = {'background': '#efe'}
  _HELPTAB_BG_PARAMS = {'background': '#aba'}
  _HELPTAB_SPACER = 6
  _SCALE_N = 5
  _DRAW_LINES = False
  _eval_demon_running = False
  _showing_trace = False
  SAVE_GRAMMAR_TEMPLATE = '# Regexp Chunk Parsing Grammar\n# Sav...
Instance Variables
  chunker
The chunker built from the grammar string
  grammar
The unparsed grammar string
  normalized_grammar
A normalized version of self.grammar.
  grammar_changed
The last time() that the grammar was changed.
  devset
The development set -- a list of chunked sentences.
  devset_name
The name of the development set (for save files).
  devset_index
The index into the development set of the first instance that's currently being viewed.
  _last_keypress
The time() when a key was most recently pressed
  _history
A list of (grammar, precision, recall, fscore) tuples for grammars that the user has already tried.
  _history_index
When the user is scrolling through previous grammars, this is used to keep track of which grammar they're looking at.
  _eval_grammar
The grammar that is being currently evaluated by the eval demon.
  _eval_normalized_grammar
A normalized copy of _eval_grammar.
  _eval_index
The index of the next sentence in the development set that should be looked at by the eval demon.
  _eval_score
The ChunkScore object that's used to keep track of the score of the current grammar on the development set.
Properties

Inherited from object: __class__

Method Details

__init__(self, devset_name='conll2000', devset=None, grammar='', chunk_node='NP', tagset=None)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Parameters:
  • devset_name - The name of the development set; used for display and in save files. If the name is 'treebank' or 'conll2000' and devset is None, then devset is set automatically.
  • devset - A list of chunked sentences
  • grammar - The initial grammar to display.
  • tagset - Dictionary from tags to string descriptions, used for the help page. Defaults to self.TAGSET.
Overrides: object.__init__

_adaptively_modify_eval_chunk(self, t)

Modify _EVAL_CHUNK to try to keep the amount of time that the eval demon takes between _EVAL_DEMON_MIN and _EVAL_DEMON_MAX.

Parameters:
  • t - The amount of time that the eval demon took.
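
The adjustment described above can be sketched as a small standalone function. This is only an illustration of the idea, not the method's actual code: the name adapt_chunk_size, the proportional scaling, the doubling cap, and the floor parameter are all assumptions; only the 0.02 and 0.04 second bounds come from _EVAL_DEMON_MIN and _EVAL_DEMON_MAX on this page.

```python
def adapt_chunk_size(chunk, elapsed, t_min=0.02, t_max=0.04, floor=1):
    """Return a new chunk size given how long the last run took (seconds).

    Hypothetical helper: the scaling rule here is an assumption, not
    taken from the RegexpChunkDemo source.
    """
    if elapsed > t_max:
        # Too slow: shrink in proportion to the overshoot, by at least one
        # sentence, but never below the floor.
        scaled = int(chunk * t_max / elapsed)
        return max(floor, min(chunk - 1, scaled))
    if elapsed < t_min:
        # Too fast: grow in proportion to the undershoot, by at least one
        # sentence, but at most double in a single step. The max() guards
        # against a zero reading from a coarse clock.
        scaled = int(chunk * t_min / max(elapsed, 1e-6))
        return max(chunk + 1, min(scaled, chunk * 2))
    # Within the target window: leave the chunk size alone.
    return chunk


print(adapt_chunk_size(15, 0.03))  # inside the window, unchanged
print(adapt_chunk_size(15, 0.08))  # too slow, smaller chunk
print(adapt_chunk_size(15, 0.01))  # too fast, larger chunk
```

Capping the growth step keeps one unusually fast run from inflating the chunk size so far that the next run blocks the interface.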

mainloop(self, *args, **kwargs)

Enter the Tkinter mainloop. This function must be called if this demo is created from a non-interactive program (e.g. from a script); otherwise, the demo will close as soon as the script completes.
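
As a rough sketch of the script case, the function below builds the demo and then blocks in mainloop() until the window is closed. The wrapper name launch_demo and the sample grammar are assumptions made for illustration; the import path is the one given on this page and may differ in other NLTK releases. Running it needs a display and the conll2000 corpus to be installed.

```python
def launch_demo(grammar="{<DT>?<JJ>*<NN>}"):
    """Open the chunk demo and block until its window is closed.

    Hypothetical wrapper; the default grammar is only a sample NP rule.
    """
    # Imported lazily so that defining this function does not require
    # NLTK or a display. Module path as given on this page.
    from nltk.draw.rechunkparser import RegexpChunkDemo

    demo = RegexpChunkDemo(devset_name="conll2000", grammar=grammar)
    # Without this call a script would exit and take the window with it.
    demo.mainloop()
```

Calling launch_demo() as the last statement of a script keeps the window open until the user closes it.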


Class Variable Details

TAGSET

A dictionary mapping from part of speech tags to descriptions, which is used in the help text. (This should probably live with the conll and/or treebank corpus instead.)

Value:
{'#': 'pound sign (currency marker)',
 '$': 'dollar sign (currency marker)',
 '\'\'': 'close quote',
 '(': 'open parenthesis',
 ')': 'close parenthesis',
 ',': 'comma',
 '.': 'period',
 ':': 'colon',
...

HELP

Contents for the help box. This is a list of tuples, one for each help page, where each tuple has three elements:

  • A title (displayed as a tab)
  • A string description of tabstops (see Tkinter.Text for details)
  • The text contents for the help page. You can use expressions like <red>...</red> to colorize the text; see HELP_AUTOTAG for a list of tags you can use for colorizing.
Value:
[('Help',
  '20',
  '''Welcome to the regular expression chunk-parser grammar editor.  Y\
ou can use this editor to develop and test chunk parser grammars based\
 on NLTK\'s RegexpChunkParser class.

Use this box (\'Help\') to learn more about the editor; click on the t\
abs for help on specific topics:<indent>
...
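
One way to picture the <red>...</red> markup is to extract the tagged spans with a regular expression. The sketch below is an illustration only and makes no claim about how show_help processes the text; the function name find_tagged_spans and the sample string are assumptions.

```python
import re


def find_tagged_spans(text, tag_names):
    """Return (tag, inner_text) pairs for each <tag>...</tag> span in text.

    Hypothetical helper, not part of RegexpChunkDemo.
    """
    spans = []
    for tag in tag_names:
        name = re.escape(tag)
        # Non-greedy match so that adjacent spans are kept separate;
        # DOTALL lets a span run across line breaks.
        pattern = re.compile(r"<%s>(.*?)</%s>" % (name, name), re.DOTALL)
        for match in pattern.finditer(text):
            spans.append((match.start(), tag, match.group(1)))
    # Report the spans in the order they appear in the text.
    spans.sort()
    return [(tag, inner) for _, tag, inner in spans]


sample = "Use <red>braces</red> for chunks and <green>tags</green> inside."
print(find_tagged_spans(sample, ["red", "green"]))
# prints [('red', 'braces'), ('green', 'tags')]
```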

HELP_AUTOTAG

Value:
[('red', {'foreground': '#a00'}),
 ('green', {'foreground': '#080'}),
 ('highlight', {'background': '#ddd'}),
 ('underline', {'underline': True}),
 ('h1', {'underline': True}),
 ('indent', {'lmargin1': 20, 'lmargin2': 20}),
 ('hangindent', {'lmargin1': 0, 'lmargin2': 60}),
 ('var', {'foreground': '#88f'}),
...

_GRAMMARBOX_PARAMS

Value:
{'background': '#efe',
 'border': 2,
 'height': 12,
 'highlightbackground': '#efe',
 'highlightthickness': 1,
 'relief': 'groove',
 'width': 40,
 'wrap': 'word'}

_HELPBOX_PARAMS

Value:
{'background': '#efe',
 'border': 2,
 'foreground': '#555',
 'height': 15,
 'highlightbackground': '#efe',
 'highlightthickness': 1,
 'relief': 'groove',
 'width': 15,
...

_DEVSETBOX_PARAMS

Value:
{'background': '#eef',
 'border': 2,
 'height': 10,
 'highlightbackground': '#eef',
 'highlightthickness': 1,
 'relief': 'groove',
 'tabs': (30),
 'width': 70,
...

_STATUS_PARAMS

Value:
{'background': '#9bb', 'border': 2, 'relief': 'groove'}

_FRAME_PARAMS

Value:
{'background': '#777', 'border': 3, 'padx': 2, 'pady': 2}

_EVALBOX_PARAMS

Value:
{'background': '#eef',
 'border': 2,
 'height': 280,
 'highlightbackground': '#eef',
 'highlightthickness': 1,
 'relief': 'groove',
 'width': 300}

_BUTTON_PARAMS

Value:
{'activebackground': '#777',
 'background': '#777',
 'highlightbackground': '#777'}

SAVE_GRAMMAR_TEMPLATE

Value:
'''# Regexp Chunk Parsing Grammar
# Saved %(date)s
#
# Development set: %(devset)s
#   Precision: %(precision)s
#   Recall:    %(recall)s
#   F-score:   %(fscore)s

...
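
The %(name)s placeholders are filled by Python's string formatting with a dictionary. The sketch below uses only the header lines shown above, since the rest of the template is elided here; the variable names and all of the values are made up for illustration.

```python
# Only the header lines visible above; the remainder of the real
# template is not reproduced.
template_head = (
    "# Regexp Chunk Parsing Grammar\n"
    "# Saved %(date)s\n"
    "#\n"
    "# Development set: %(devset)s\n"
    "#   Precision: %(precision)s\n"
    "#   Recall:    %(recall)s\n"
    "#   F-score:   %(fscore)s\n"
)

# Hypothetical values, chosen only to show the substitution.
values = {
    "date": "2009-01-01",
    "devset": "conll2000",
    "precision": "62.5%",
    "recall": "50.0%",
    "fscore": "55.6%",
}

# Each %(key)s is replaced by the matching dictionary entry.
header = template_head % values
print(header)
```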