Class textmodel.TextHash

This is the class for the TextHash document model.

It is used in the samples to supply the doubletree.DoubleTree with the information it needs to create its doubletree.Trie data structures. It is not a required component of DoubleTree; all that really matters is that the relevant doubletree.Trie objects get created. For example, instead of TextHash, a database query might provide the input to doubletree.Trie.

The TextHash maintains a hash of the data items, where the keys are based on the distinguishing fields. The data parameters, except for baseField, should be the same as those used in the DoubleTree that will visualize the data.

"Items" are types, which are associated with token ids. Some functions expect a full item as an argument. Others expect a key as an argument. #itemToIndex converts a full item to a key.
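To make the item/key distinction concrete, the sketch below derives a key from a full item by keeping only the distinguishing fields. It illustrates the idea and is not the TextHash implementation: the helper sketchItemToKey, the sample field names, and the choice to rejoin the kept values with the delimiter are all assumptions. The actual key format produced by #itemToIndex is not described here.

```javascript
// Hypothetical helper: illustrates the idea of a key, NOT the real #itemToIndex.
function sketchItemToKey(item, fldNames, fldDelim, distinguishingFlds) {
  const values = item.split(fldDelim);
  // Keep only the values whose field name is a distinguishing field.
  return fldNames
    .map((name, i) => (distinguishingFlds.includes(name) ? values[i] : null))
    .filter((v) => v !== null)
    .join(fldDelim);
}

// Assumed sample data: items of the form token/POS/lemma.
const sketchFldNames = ["token", "POS", "lemma"];
const keyA = sketchItemToKey("dogs/NNS/dog", sketchFldNames, "/", ["token", "POS"]);
const keyB = sketchItemToKey("dogs/NNS/hound", sketchFldNames, "/", ["token", "POS"]);

console.log(keyA);          // dogs/NNS
console.log(keyA === keyB); // true: the items differ only in a non-distinguishing field
```

Under these assumptions, two items that differ only in a non-distinguishing field share one key, which is why some functions ask for a key and others for a full item.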


Defined in: TextHash.js.

Class Summary
Constructor Attributes Constructor Name and Description
textmodel.TextHash(string, caseSensitive, fldNames, fldDelim, distinguishingFldsArray, baseField, useRecords)
Field Summary
Field Attributes Field Name and Description
numTokens
The number of tokens (readOnly)
numTypes
The number of types (readOnly)
Method Summary
Method Attributes Method Name and Description
containsIndex(item)
is the item key in the model
containsItem(item)
is the item in the model
fromJSON(obj)
make this TextHash have the values of a (previously saved) TextHash JSON object
getItem(item, contextLen, includeOnly, itemIsRegex, contextFilters, maxRandomHits, puncToExclude)
get the information associated with an item
The information is an array of preceding items, an array of matching items, and an array of following items, where the preceding and following items are of length contextLen.
getItemContext(item, contextLen, id, itemIsRegex)
get a string of context around a single hit
getItems(regex, contextLen, includeOnly, contextFilters, maxRandomHits, puncToExclude)
Convenience function for textmodel.TextHash#getItem with itemIsRegex=true
getUniqItems()
get the unique item keys in the model
getUniqItemsWithCounts()
get the unique item keys in the model, each followed by tab and its token count
itemToIndex(item)
convert a full item to its key form
Class Detail
textmodel.TextHash(string, caseSensitive, fldNames, fldDelim, distinguishingFldsArray, baseField, useRecords)
Parameters:
string
the input string, where each item is separated by whitespace.
caseSensitive
whether the comparison of the baseField is case sensitive
fldNames
the names of the fields in the data items
fldDelim
the field delimiter in the data items. Note: it cannot be whitespace (e.g. tab), since whitespace is used to delimit items
distinguishingFldsArray
the fields that determine identity
baseField
the primary field for comparison and display (typically token or lemma, but also possibly part of speech)
useRecords
if true, blank lines are treated as delimiting units (records) in the text. Default is false
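A construction call might look like the following sketch. The sample text, field names, and delimiter are illustrative assumptions, and the sketch assumes that field names are passed as strings and that baseField is given by name. The constructor call is guarded so that the snippet also runs where TextHash.js has not been loaded.

```javascript
// Assumed sample input: whitespace-separated items, each of the form token/POS.
const sampleText = "The/DT dog/NN barks/VBZ ./SENT The/DT cat/NN sleeps/VBZ ./SENT";

const ctorArgs = [
  sampleText,        // string: the input, items separated by whitespace
  false,             // caseSensitive
  ["token", "POS"],  // fldNames
  "/",               // fldDelim (must not be whitespace)
  ["token", "POS"],  // distinguishingFldsArray
  "token",           // baseField (assumed to be given by name)
  false,             // useRecords
];

// Sanity check on the input: every whitespace-separated item should split
// into exactly one value per field name.
const wellFormed = sampleText
  .split(/\s+/)
  .every((item) => item.split(ctorArgs[3]).length === ctorArgs[2].length);
console.log(wellFormed); // true

// Only construct the model when the library is actually available.
if (typeof textmodel !== "undefined") {
  const model = new textmodel.TextHash(...ctorArgs);
  console.log(model.numTokens, model.numTypes);
}
```

Checking the input against fldNames and fldDelim before construction is a design choice of this sketch, not something the class is documented to require.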
Field Detail
{number} numTokens
The number of tokens (readOnly)

{number} numTypes
The number of types (readOnly)
Method Detail
containsIndex(item)
is the item key in the model
Parameters:
item
a key (not a full item)
Returns:
true if the item key is in the model, false otherwise

containsItem(item)
is the item in the model
Parameters:
item
a full item (not a key)
Returns:
true if the item is in the model, false otherwise

fromJSON(obj)
make this TextHash have the values of a (previously saved) TextHash JSON object
Parameters:
obj
the TextHash JSON object

getItem(item, contextLen, includeOnly, itemIsRegex, contextFilters, maxRandomHits, puncToExclude)
get the information associated with an item

The information is an array of preceding items, an array of matching items, and an array of following items, where the preceding and following items are of length contextLen.

Parameters:
item
a key (not a full item)
contextLen
the length of the preceding and following context to be returned
includeOnly
an object where the keys are the item ids to be included (optional)
itemIsRegex
true if the item parameter should be considered as a regular expression instead of as a true item
contextFilters
an object with either "include" or "exclude", each an object whose keys are fields and whose values are arrays of values of those fields. For example, {"include":{"POS":["NN","NNS"]}} would include in the context only those items whose POS is NN or NNS. Similarly, "leftEnd" and "rtEnd" (which may be used together) give the properties that determine the left and right end points of the context; the matching elements themselves are excluded. So {"leftEnd":{"POS":["SENT"]}, "rtEnd":{"POS":["MD"]}} would include in the left context the elements up to but not including the first SENT POS, and in the right context the elements up to but not including the first MD POS.
maxRandomHits
how many random hits to return. -1 or null to return all
puncToExclude
a string of punctuation to exclude from the base field (will override any punctuation allowed via "include" in contextFilters). Default is null (i.e. include all punctuation)
Returns:
array of [array of prefixes, array of items, array of suffixes, array of ids]
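The contextFilters objects from the parameter description can be written out as in the sketch below, which also illustrates the "rtEnd" rule on plain data. The helper sketchCutRight and the sample context are hypothetical and only mirror the description; they are not part of TextHash, where the filtering happens inside getItem. The sketch also assumes that a match on any listed field value ends the context.

```javascript
// Filter objects in the shape described for the contextFilters parameter.
const includeNouns = { include: { POS: ["NN", "NNS"] } };
const sentenceToModal = { leftEnd: { POS: ["SENT"] }, rtEnd: { POS: ["MD"] } };

// Hypothetical helper mirroring the "rtEnd" description: keep right-context
// elements up to, but not including, the first one matching the end condition.
function sketchCutRight(context, endCondition) {
  const stop = context.findIndex((el) =>
    Object.keys(endCondition).some((fld) => endCondition[fld].includes(el[fld]))
  );
  return stop === -1 ? context : context.slice(0, stop);
}

// Assumed sample right context, one object per item.
const rightContext = [
  { token: "dog", POS: "NN" },
  { token: "that", POS: "WDT" },
  { token: "can", POS: "MD" },
  { token: "bark", POS: "VB" },
];

const kept = sketchCutRight(rightContext, sentenceToModal.rtEnd);
console.log(kept.map((el) => el.token)); // [ 'dog', 'that' ]
```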

getItemContext(item, contextLen, id, itemIsRegex)
get a string of context around a single hit
Parameters:
item
the item key (not a full form)
contextLen
the length of the preceding and following context to include
id
the id of the hit to return
itemIsRegex
true if the item parameter should be considered as a regular expression instead of as a true item (This should match the value in the original query)
Returns:
string of context around the item with id, including the item itself
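As a rough picture of how the context of a single hit relates to the arrays returned by getItem, the sketch below joins the prefix, the item, and the suffix for one id. The result shape follows the Returns description of getItem, but the sample data, the helper sketchContextString, and the use of a single space as separator are assumptions; the exact string format produced by getItemContext is not specified here.

```javascript
// Assumed sample shaped like getItem's return value:
// [array of prefixes, array of items, array of suffixes, array of ids].
const sampleResult = [
  [["The", "old"], ["A", "small"]],        // prefixes, one array per hit
  ["dog", "dog"],                          // matching items
  [["barks", "loudly"], ["sleeps", "."]],  // suffixes, one array per hit
  [2, 17],                                 // ids of the hits
];

// Hypothetical helper: build a context string for the hit with the given id.
function sketchContextString(result, id) {
  const [prefixes, items, suffixes, ids] = result;
  const i = ids.indexOf(id);
  if (i === -1) return null; // no hit with that id
  return [...prefixes[i], items[i], ...suffixes[i]].join(" ");
}

console.log(sketchContextString(sampleResult, 17)); // A small dog sleeps .
```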

getItems(regex, contextLen, includeOnly, contextFilters, maxRandomHits, puncToExclude)
Convenience function for textmodel.TextHash#getItem with itemIsRegex=true
Parameters:
regex
contextLen
includeOnly
contextFilters
maxRandomHits
puncToExclude

getUniqItems()
get the unique item keys in the model
Returns:
a sorted (case insensitive) array of item keys

getUniqItemsWithCounts()
get the unique item keys in the model, each followed by tab and its token count
Returns:
a sorted (case insensitive) array of item keys with their token counts
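Since each entry is described as a key followed by a tab and its token count, the entries can be split back into pairs. The sketch below works on assumed sample data rather than on a real model, and sketchParseCounts is a hypothetical helper.

```javascript
// Assumed sample in the described form: key + tab + token count.
const sampleEntries = ["cat/NN\t3", "dog/NN\t5", "The/DT\t12"];

// Hypothetical helper: split each entry at its last tab into [key, count].
function sketchParseCounts(entries) {
  return new Map(
    entries.map((entry) => {
      const tab = entry.lastIndexOf("\t");
      return [entry.slice(0, tab), Number(entry.slice(tab + 1))];
    })
  );
}

const counts = sketchParseCounts(sampleEntries);
console.log(counts.get("dog/NN")); // 5
```

Splitting at the last tab rather than the first is a defensive choice, since whitespace is the item separator and a key should not contain one.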

itemToIndex(item)
convert a full item to its key form
Parameters:
item
the full item
Returns:
the key form of the item

Documentation generated by JsDoc Toolkit 2.4.0 on Sun Sep 24 2017 15:56:41 GMT-0500 (CDT)