Hermit Crab Parsing Engine Specification



Date: 31.07.2017

5.3 Superentries


The notion of a superentry of a lexical entry is used in the debugging function show_morphings and in the lexicon function find_lexical_entries. Intuitively, a lexical entry is a superentry of a template lexical entry if it is a (possibly more specific) instantiation of the template. The template is usually supplied by the user, who may wish (for instance) to find all lexical entries in the lexicon that match it.

More precisely, lexical entry X is a superentry of a (possibly partially specified) lexical entry Template iff:

(1) If Template specifies a Phonetic Shape, then the Phonetic Shape of Template is a substring of the Phonetic Shape of X. (“Substring” here includes the case where the two strings match exactly, i.e. an improper substring.) There is no provision for “wildcards.” However, the special character # (ASCII 35) at the beginning or end of the string representing the Phonetic Shape of Template must correspond to the respective terminus of the string representing the Phonetic Shape of X. (In other words, # represents the boundary of the stem.)

(2) If Template specifies a Family, it is identical to the Family of X.

(3) If Template specifies a Gloss, it is identical to the Gloss of X.

(4) If Template specifies a Part of Speech, it is identical to the Part of Speech of X.

(5) If Template specifies a Subcategorization, its Subcategorization list is a subset of the Subcategorization list of X.

(6) If Template specifies a Morphological Rules field, it is a subset of the Morphological Rules list specified for X. (The order of the two lists need not be identical.)

(7) If Template specifies a Stratum, it is identical to the Stratum of X.

(8) If Template specifies lists of MPR Features, Head Features, Foot Features, and/or Obligatory Head Features, each such list is a subset (not necessarily in the same order) of the corresponding list of X.
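
Conditions (1) through (8) can be illustrated with a minimal Python sketch. This is not Hermit Crab's own implementation. The dictionary keys (shape, family, gloss, pos, stratum, subcat, mrules, and the four feature-list keys) and the function names are hypothetical, a field that Template does not specify is modeled as an absent key, and the Phonetic Shape of X is assumed to contain no # of its own.

```python
# Illustrative sketch only: keys and function names are hypothetical,
# not part of the Hermit Crab specification.

IDENTICAL_FIELDS = ("family", "gloss", "pos", "stratum")        # conditions 2, 3, 4, 7
SUBSET_FIELDS = ("subcat", "mrules", "mpr_features",            # conditions 5, 6, 8
                 "head_features", "foot_features", "oblig_head_features")


def shape_matches(template_shape: str, shape: str) -> bool:
    """Condition (1): substring match, where a leading or trailing '#'
    in the template anchors the match to the start or end of the stem."""
    at_start = template_shape.startswith("#")
    at_end = len(template_shape) > 1 and template_shape.endswith("#")
    core = template_shape[1:] if at_start else template_shape
    core = core[:-1] if at_end else core
    if at_start and at_end:
        return shape == core
    if at_start:
        return shape.startswith(core)
    if at_end:
        return shape.endswith(core)
    return core in shape


def is_superentry(x: dict, template: dict) -> bool:
    """Return True if entry x is a superentry of the (partial) template."""
    if "shape" in template and not shape_matches(template["shape"], x.get("shape", "")):
        return False
    for key in IDENTICAL_FIELDS:
        if key in template and template[key] != x.get(key):
            return False
    for key in SUBSET_FIELDS:
        # Order is irrelevant, so the lists are compared as sets.
        if key in template and not set(template[key]) <= set(x.get(key, ())):
            return False
    return True


entry = {"shape": "kitab", "pos": "N", "mrules": ["plural", "dimin"]}
print(is_superentry(entry, {"shape": "#kit", "pos": "N"}))  # anchored at stem start
print(is_superentry(entry, {"shape": "ita#"}))              # "ita" is not stem-final
```

Under these assumptions an empty template matches every entry, which corresponds to a template that specifies no fields at all.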


5.4 Character Definition Table


A Character Definition Table defines the translation between character strings (each a sequence of one or more characters that represents a single segment) and the sets of phonetic features representing those segments.

Record Label: char_table

Fields:

5.4.1 Character Table Name


Optionality: obligatory

Label: name

Type: atom

Purpose: To refer to this table. For instance, the definition of a stratum must specify the Character Definition Table it uses (see Stratum Property Setting Record, section 5.5).

Warning: The morpher enforces uniqueness of character definition table names; no two tables can have the same name.

5.4.2 Encoding


Optionality: obligatory

Label: encoding

Type: string

Purpose: This field is not used in Hermit Crab; it appears solely so that LevelOfRepr objects can be passed back and forth between Hermit Crab and Cellar (which does utilize this field).

5.4.3 Segment Definitions


Optionality: optional

Label: seg_defs

Type: list

Contents: Each member of this list is itself a list of length two. The first member of the sublist is a string, whose characters define the external representation of a segment. The second member of the sublist is an atomic-valued feature list, which defines the features of the segment. Any features which do not appear in that list are undefined for the segment.

Default: the empty list.

Purpose: The character representation of segments is used in lexical entries and in input tokens. It may also be used in the Phonetic Output field of morphological rules to represent affixal material.

If this field does not appear, the character definition table does not define any segments (such a table is unlikely to be of much use).

5.4.4 Boundary Definitions

Optionality: optional

Label: bdry_defs

Type: list

Contents: Zero or more strings, each of which represents a boundary marker. (There is no provision for the representation of boundary markers as sets of features.)

Default: the empty list.

Purpose: Boundary markers may be used in morphological and phonological rules. They may also appear in lexical entries and in input tokens, although such usage is likely to be rare.

If this field does not appear, the character definition table does not define any boundary markers.
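
As a rough illustration of how such a table relates strings to segments, the Python sketch below models a char_table record as a dictionary and splits an input string into segments and boundary markers. The table contents, the feature names, and the function segment are hypothetical. The longest-match strategy is an assumption as well: this section does not say how Hermit Crab resolves overlapping representations such as "c" and "ch".

```python
# Illustrative sketch only: table contents and the matching strategy
# (longest match first) are assumptions, not taken from the specification.

char_table = {
    "name": "example_table",          # hypothetical table name
    "encoding": "",                   # not used by Hermit Crab itself
    "seg_defs": [
        ["a",  {"syllabic": "+"}],
        ["c",  {"syllabic": "-", "continuant": "-"}],
        ["h",  {"syllabic": "-", "continuant": "+"}],
        ["ch", {"syllabic": "-", "delayed_release": "+"}],
    ],
    "bdry_defs": ["+"],
}


def segment(text: str, table: dict) -> list:
    """Split text into (string, kind, features) triples, where kind is
    'seg' or 'bdry' and features is None for boundary markers."""
    symbols = {}
    for string, features in table.get("seg_defs", []):   # default: empty list
        if string:
            symbols[string] = ("seg", features)
    for marker in table.get("bdry_defs", []):            # default: empty list
        if marker:
            symbols[marker] = ("bdry", None)
    candidates = sorted(symbols, key=len, reverse=True)  # try longest first
    result, pos = [], 0
    while pos < len(text):
        for cand in candidates:
            if text.startswith(cand, pos):
                kind, features = symbols[cand]
                result.append((cand, kind, features))
                pos += len(cand)
                break
        else:
            raise ValueError(f"no segment or boundary defined at position {pos}")
    return result


for string, kind, features in segment("cha+c", char_table):
    print(string, kind, features)
```

With a table that omits seg_defs and bdry_defs, any non-empty input raises ValueError, which mirrors the remark that a table defining no segments is unlikely to be of much use.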


5.5 Stratum Property Setting Record


A Stratum Property Setting Record specifies, for a given stratum, one of the following properties: the name of its character definition table; whether it is cyclic or noncyclic; how its morphological or phonological rules are ordered; or the set of Affix Templates pertaining to the stratum. The reason for setting these properties individually, rather than loading them together in some sort of stratum record, is that some changes (e.g. to the character table) have much more far-reaching repercussions than others (e.g. to the rule order). The use of individual property setting records allows the latter properties to be changed without having to reset the former.

Record Label: stratum_setting

Fields:

5.5.1 Stratum Name


Optionality: obligatory

Label: nm

Type: atom

Purpose: To identify the stratum whose property is being set.

5.5.2 Setting Type


Optionality: obligatory

Label: type

Type: atom; one of ctable, cyclicity, mrule, prule, or templates

Purpose: Specifies the property of the stratum that is being set: its character definition table; its cyclicity; the ordering of its morphological or phonological rules; or its Affix Templates.

Warning: Resetting the character definition table for a stratum means that any lexical entries or rules for that stratum are invalidated, and must be reloaded.

5.5.3 Value


Optionality: obligatory

Label: value

Type: If the Setting Type is ctable, an atom (the name of a character definition table); if the Setting Type is cyclicity, one of the atoms cyclic or noncyclic; if the Setting Type is mrule, one of the atoms unordered or linear; if the Setting Type is prule, one of the atoms linear or simultaneous; or if the Setting Type is templates, a list of Affix Templates.

Default: None. Except for the character definition table, the properties of the stratum have defaults; but the use of this command requires an explicit value.

The defaults for the Stratum itself are as follows: the Cyclicity is ‘noncyclic’; the Morphological Rule Order is ‘unordered’; the Phonological Rule Order is ‘simultaneous’; and there are no Affix Templates.
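
The three fields and the defaults above can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the morpher's actual loader: the function apply_stratum_setting and the dictionary layout are hypothetical, and two behaviors are assumed rather than specified here, namely that a stratum is created with its defaults the first time a setting names it, and that a ctable value must name an already loaded table.

```python
# Illustrative sketch only: function name, data layout, and error handling
# are assumptions, not part of the Hermit Crab specification.

STRATUM_DEFAULTS = {
    "cyclicity": "noncyclic",
    "mrule": "unordered",
    "prule": "simultaneous",
}

ALLOWED_VALUES = {
    "cyclicity": {"cyclic", "noncyclic"},
    "mrule": {"unordered", "linear"},
    "prule": {"linear", "simultaneous"},
}


def apply_stratum_setting(strata: dict, record: dict, table_names: set) -> dict:
    """Validate one stratum_setting record and apply it to strata."""
    for label in ("nm", "type", "value"):          # all three fields are obligatory
        if label not in record:
            raise ValueError(f"missing obligatory field: {label}")
    kind, value = record["type"], record["value"]
    if kind == "ctable":
        if value not in table_names:               # assumed check, see lead-in
            raise ValueError(f"unknown character definition table: {value}")
    elif kind in ALLOWED_VALUES:
        if value not in ALLOWED_VALUES[kind]:
            raise ValueError(f"invalid value for {kind}: {value}")
    elif kind == "templates":
        if not isinstance(value, list):
            raise ValueError("templates requires a list of Affix Templates")
    else:
        raise ValueError(f"unknown setting type: {kind}")
    # Assumed: an unseen stratum starts from the defaults, with no templates
    # and no character definition table (which has no default).
    stratum = strata.setdefault(record["nm"], {**STRATUM_DEFAULTS, "templates": []})
    stratum[kind] = value
    return stratum


strata = {}
apply_stratum_setting(strata, {"nm": "root", "type": "ctable", "value": "example_table"},
                      {"example_table"})
apply_stratum_setting(strata, {"nm": "root", "type": "mrule", "value": "linear"},
                      {"example_table"})
print(strata["root"])
```

Because each record sets exactly one property, the rule order can be changed here without touching the ctable entry, which is the design motivation given at the start of this section.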



