diff --git a/ICHLL_Brenon.md b/ICHLL_Brenon.md
index 356d2503d1555a2ca46242e61bd5f8401611b55a..b6a6a2411ebcd2a3cb66954bbd6b5a0f61640a5e 100644
--- a/ICHLL_Brenon.md
+++ b/ICHLL_Brenon.md
@@ -212,24 +212,50 @@ near the "surface" of article entries.
 
 The central element of the *dictionaries* module is the `<entry/>` element meant
 to encode one single entry in a dictionary, that is to say a head word
-associated to its definition. It is the natural entry point from the `<body/>`
+associated to its definition. It is the natural way in from the `<body/>`
 element to the dictionary module: indeed, although `<body/>` may also contain
 `<entryFree/>` or `<superEntry/>` elements, the former is a relaxed version of
 `<entry/>` while the latter is a device to group several related entries
 together. Both can contain an `<entry/` directly while no obvious inclusion
-exists the other way around. Most of the inclusion paths of "reasonable" depth
-(which we define to strictly inferior to 5, that is twice the average shortest
-depth between any two nodes) seem to either include `<figure/>`
+exists the other way around. Most (> 96.2%) of the inclusion paths of
+"reasonable" depth (which we define as strictly inferior to 5, that is twice the
+average shortest depth between any two nodes) seem to either include `<figure/>`
+or `<castList/>`, two elements unrelated to encyclopedia articles in the general
+case. Hence, not only the semantics conveyed by the documentation but also the
+structure of the elements graph evidence `<entry/>` as the natural top-most
+element for an article.
+
+### Information about the word itself
+
+Once a block for an article is created, it may contain elements useful to
+represent features such as
+
+- its written and spoken forms: `<form/>`
+- a group of grammatical information: `<gramGrp/>`, that may itself contain as
+  we've seen above `<case/>`, `<gen/>`, `<number/>` or `<pers/>` to describe the
+  form itself for instance, but also information about the categories it belongs
+  to like `<iType/>` for its inflexion class or `<pos/>` for its part-of-speech
+- its etymology
+- its variants if there is a different spelling in a variety of the language or
+  if it has changed through time
+
+All these are examples and by no means an exhaustive list; the complete set
+provides the encoder with a toolbox to describe all the information related to
+the form the entry is found at and seems general enough to accommodate the
+structure of any book indexing entries by words.
+
+### Cross-references
+
+A common feature shared by dictionaries and encyclopedias is the ability to
+connect entries together by using a word or short phrase as the link, referring
+the reader to the related concept. This is known as cross-references and can
+appear either when the definition of a term is adjacent to another one or to
+catch alternative spellings where some readers might expect the word to appear
+and redirect them to the form chosen as the reference. In XML-TEI, this is done
+with the `<xr/>` element.
+
+### Content
 
-Once a block for an article is created
-
-It contain elements useful to represent the features occurring at the begining
-of an article such as its written and spoken forms (`<form/>`), a group of
-grammatical information (`<gramGrp/>`), that may itself contain as we've seen
-above `<case/>`, `<gen/>`, `<number/>` or `<pos/>` to describe the form itself for instance, or `
-
-All these are quite exhaustive and seem general enough to accomodate any book
-structure indexing entries by words. A more
 
 # A new standard ?
 