diff --git a/ICHLL_Brenon.md b/ICHLL_Brenon.md index f29ecc995fa9b0b5ba39a114200dd3c30ab205fa..ce51093a678128a8944f5a0db915cc668a865ba6 100644 --- a/ICHLL_Brenon.md +++ b/ICHLL_Brenon.md @@ -425,23 +425,135 @@ relevant. ### The notion of meaning -### Nested structures +Notwithstanding the correct way to represent domains of knowledge, their extent +itself raises concerns regarding the *dictionaries* module. Indeed, among the +vast collection of domains covered are sometimes historical articles and +biographies. If the notion of meaning can appear ill-fitting for a text +describing a series of historical events, one may still argue that it groups +them into a concept and associates it with the name of the event. But when it +comes to relating the life of a person, describing their relation to events and +other persons strays even further from the notion of meaning. To what extent +is it relevant to consider that having discovered such and such a thing or having +been born at a certain time in a certain place *defines* someone? + + + +Moreover, encyclopedias, inheriting as much as they have from the philosophical +Enlightenment, are not only spaces designed to assert; they also intrinsically +include an interrogative component. Some articles lay down the basis required to +understand the complexity of an issue and invite the reader to consider it +without providing a definite answer, going as far as explicitly using +question marks. + + + +In this extract, the author devises a hypothetical situation to illustrate how +difficult it is to draw the line between two supposedly mutually exclusive +subcategories of legal actions. The whole point of the passage is to convey the +idea that the term eludes definition; wrapping it in a `<sense/>`, or worse, a +`<def/>` element would be an utter misnomer. + +As a result, the use of `<sense/>` and `<def/>` is not appropriate for +encyclopedic content in general. 
-### Candidates in the *dictionaries* module +### Nested structures -- `<sense/>` -- `<entryFree/>` -- `<note/>` -- `<dictScrap/>` / `<floatingText/>` +The final difficulty can be considered as a partial consequence of the previous +one on the structure of articles. The difficulty of defining complex concepts is +the very reason why authors approach their subjects from various angles, +circumnavigating them as a best approximation. This strategy favours long, +structured developments with sections and subsections covering the multiple +aspects of the topic: from a historical, political, scientific point of view… +The longest articles can thus span several dozen pages. They can contain +substructures with titles on at least three levels (for instance, an `a)` under a +`1)` under a `I.`), each of which is in turn generally developed over several +paragraphs. + + + +The nested structure that we have just evidenced of course demands a nesting +structure to accommodate it. More precisely, it guides our search for XML elements +by giving us several constraints: we are looking for a pair of elements. The +first, representing a (sub)section, must be able to include both itself and the +second element. The second has no special constraint beyond the one +it shares with the first, which is to have semantics compatible with our +purpose. In addition, the first element must be able to contain several `<p/>` +elements, `<p/>` being the reference element to encode paragraphs according to +the XML-TEI documentation. + +We have seen that the *dictionaries* module was equipped with a questionable but +possible element for subject domains. However, it does not include any element +for section titles. In the rest of the TEI specification, the elements `<head/>` +and `<title/>` (the latter with the possibility to set its `type` attribute to +`sub`) stand out as the best candidates for the semantic condition on the +second element. 
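
The constraints listed above can be read as a query on a containment graph. What follows is only a minimal sketch under stated assumptions: `TOY_MODEL` is a hand-written, illustrative map of direct inclusions, not the actual TEI content models, and `section_candidates` is a hypothetical helper name introduced for this example.

```python
# Toy content model: element -> set of elements it may directly contain.
# These entries are illustrative assumptions, NOT the real TEI content models.
TOY_MODEL = {
    "entry": {"sense", "note"},
    "sense": {"sense", "def"},
    "note": {"note", "p"},
    "div": {"div", "head", "p"},
    "def": set(),
    "p": set(),
    "head": set(),
    "title": set(),
}


def section_candidates(model, paragraph="p", titles=("head", "title")):
    """Return (section, title) pairs satisfying the constraints:
    the section element contains itself, the paragraph element,
    and one of the candidate title elements."""
    pairs = []
    for element, children in model.items():
        nests = element in children              # can include itself
        holds_paragraphs = paragraph in children  # can include <p/>
        for title in titles:
            if nests and holds_paragraphs and title in children:
                pairs.append((element, title))
    return sorted(pairs)


print(section_candidates(TOY_MODEL))
```

With a real content model extracted from the TEI specification, the same predicate could be restricted to the elements of a single module to reproduce the kind of filtering discussed here.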
+ +#### Candidates in the *dictionaries* module + +Filtering the content of the module to keep only the elements which can at the +same time contain themselves, be included under `<entry/>` and include a `<p/>` +and either the `<head/>` or `<title/>` elements yields absolutely no candidates. + +The lack of results from this simple query forces us to somewhat relax the +constraints on the elements we are willing to use. We can for instance make the +assumption that an intermediate element could be needed between +the `<entry/>` element and the recursing one used to encode sections. This +"section" element could also need a companion element to be able to include +itself, or, to formalise it in terms of graph theory, we could relax the +condition on this element to admit a loop by considering a cycle of a given +(small, since this still needs to represent a fairly direct inclusion) length to be +enough. We simultaneously extend the maximum depth of the inclusion paths we are +looking for between `<entry/>`, the pair of elements and the `<p/>` element. + +By setting this depth to 3, that is, by accepting one intermediate element to +occur in the middle of each one of the inclusion paths that define the structure +required to encode encyclopedic discourse, we find 21 elements but none of them +stands out as an obviously good solution: all paths to include the `<p/>` element +from any *dictionaries* element either contain a `<figure/>` (which we have +encountered earlier when we were practising our graph approach to +search for inclusions between `<entry/>` and `<entryFree/>` and dismissed as not +useful in general), a `<stage/>` (reserved for stage directions in dramatic works) +or a `<state/>` (used to describe a temporary quality in a person or place), +again not even close to what we want. The paths to either `<head/>` or +`<title/>` are similarly disappointing. 
While this is not a thorough proof that +none of these elements could fulfill our purpose, it is a fact that no element +in this module appears as an obvious solution, and a serious hint to keep looking +somewhere else. + +#### Widening the search + +We hence widen our search to include elements outside the *dictionaries* module +which could be used to encode our sections and subsections, under the same +constraint as before to try and find a composite solution that would remain +under the `<entry/>` element even if resorting to subcomponents outside of the +dedicated module. Only three elements are returned: + +- `<figure/>`: not any more useful to represent the content of encyclopedic + discourse than as a helper to include paragraphs +- `<metamark/>`: a very useful device to transcribe the editing marks that may + appear on a particular primary source to alter the normal flow of the text and + suggest an alternative reading (deletion, insertion, reordering, this is about + a human editing the text from a given physical copy of it), again really of no + use for a part of an article describing the geology of Europe for instance. +- `<note/>`: the first element that might at least resemble what we are looking + for. It is meant to contain text, is about explaining something and seems + general enough (not specific to a given genre, or to the occurrence of a + particular object on the page). Unfortunately, its semantics still seems a bit + off compared to our need. The documentation describes it as an "additional + comment", and, moreover "out of the main textual stream" whereas the long + developments in articles are the very matter that inhabits the columns of text + encyclopedias are made of. ## Encoding within the *core* module The above remarks explain why the *dictionary* module by itself is unable to -represent encyclopedias, where discourse with nested structures of arbitrary -depth can occur. 
Since the *core* module of course accomodates these structures -by means of the `<div/>`, `<head/>` and `<p/>` elements, we devise an encoding -scheme using them which we recommend using for other projects aiming at -representing encyclopedias. +represent encyclopedias, where the notion of "meaning" is less central than in +dictionaries and where discourse with nested structures of arbitrary depth can +occur. Since the *core* module of course accommodates these structures by means +of the `<div/>`, `<head/>` and `<p/>` elements, which have the additional +advantage of carrying less semantic payload than `<sense/>` or `<def/>`, we +devise an encoding scheme using them which we recommend for other projects +aiming at representing encyclopedias. To remain consistent with the above remarks we will only concern ourselves with what happens at the level of each article, right under the `<body/>` element.
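
To make the nesting concrete, here is a minimal sketch of how `<div/>`, `<head/>` and `<p/>` can be combined over three levels of titles. It is an illustration under assumptions, not the encoding scheme itself: the helper names `build_div` and `depth`, the section titles and the paragraph texts are all hypothetical, and the TEI namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_div(title, paragraphs=(), subsections=()):
    """Build a <div> holding a <head>, some <p> elements and nested <div>s.

    Each subsection is a (title, paragraphs, subsections) tuple.
    """
    div = ET.Element("div")
    ET.SubElement(div, "head").text = title
    for text in paragraphs:
        ET.SubElement(div, "p").text = text
    for sub in subsections:
        div.append(build_div(*sub))
    return div


def depth(div):
    """Number of nested <div> levels, counting the element itself."""
    return 1 + max((depth(child) for child in div.findall("div")), default=0)


# Hypothetical article skeleton with titles on three levels: I. > 1) > a)
article = build_div(
    "I. First level",
    ["Introductory paragraph."],
    [
        (
            "1) Second level",
            ["A paragraph."],
            [("a) Third level", ["One paragraph.", "Another paragraph."], [])],
        )
    ],
)

print(ET.tostring(article, encoding="unicode"))
print(depth(article))
```

Because `build_div` calls itself for each subsection, the same function covers any depth of nesting, which mirrors the fact that a `<div/>` can contain other `<div/>` elements.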