Commit 82ce3cf7 authored by Alice Brenon

Abandon interp for subject indicators

parent 847c5286
......@@ -493,32 +493,37 @@ second element.
Filtering the content of the module to keep only the elements which can at the
same time contain themselves, be included under `<entry/>` and include a `<p/>`
and either the `<head/>` or `<title/>` elements yields absolutely no candidates.
It is remarkable that even replacing the `<entry/>` element for the root of each
article with an `<entryFree/>`, an element supposed to relax some constraints to
accommodate more unusual structures in dictionaries, does not bring any
improvement.
The lack of results from this simple query forces us to somewhat release the
constraints on the elements we are willing to use. We can for instance make the
The lack of results from these simple queries forces us to somewhat release the
constraints on the encoding we are willing to use. We can for instance make the
assumption that the occurrence of an intermediate element could be needed between
the `<entry/>` element and the recursing one used to encode sections. This
"section" element could also need a companion element to be able to include
itself, or, to formalise it in terms of graph theory, we could relax the
condition on this element to admit a loop by considering a cycle of a given
(small, this still needs to represent a fairly direct inclusion) length to be
enough. We simultaneously extend the maximum depth of the inclusion paths we are
looking for between `<entry/>`, the pair of elements and the `<p/>` element.
the element wrapping the whole article and the recursive one used to encode each
section. This "section" element could also need a companion element to be able
to include itself or, to formalise it in terms of graph theory, we could relax
the condition that this element admits a loop and instead accept cycles of a
given length (small, since this still needs to represent a fairly direct
inclusion). We simultaneously extend the maximum depth of the inclusion paths we
are looking for between `<entry/>`, the pair of elements and the `<p/>` element.
By setting this depth to 3, that is, by accepting one intermediate element to
occur in the middle of each one of the inclusion paths that define the structure
required to encode encyclopedic discourse, we find 21 elements but none of them
stand out as an obvious good solution: all paths to include the `<p/>` element
from any *dictionaries* element either contain a `<figure/>` (which we have
from any *dictionaries* element either contains a `<figure/>` (which we have
encountered earlier when we were practising our graph approach to
search for inclusions between `<entry/>` and `<entryFree/>` and dismissed as not
useful in general), a `<stage/>` (reserved for stage directions in dramatic works)
or a `<state/>` (used to describe a temporary quality in a person or place),
again not even close to what we want. The paths to either `<head/>` or
`<title/>` are similarly disappointing. If that is not a thorough proof that
none of these elements could fulfill our purpose, it is a fact that no element
in this module appears as an obvious solution and a serious hint to keep looking
somewhere else.
`<title/>` are similarly disappointing. Again, changing `<entry/>` for
`<entryFree/>` returns the exact same candidates. If that is not a thorough
proof that none of these elements could fulfill our purpose, it is a fact that
no element in this module appears as an obvious good solution and a serious hint
to keep looking somewhere else.
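
The relaxed query described above can be sketched as a search on an inclusion
graph. The snippet below is only an illustration under stated assumptions: the
`TOY` graph and the element names `wrapper`, `section` and `companion` are
hypothetical and are not taken from the actual schema, and `max_len` counts
inclusion steps (edges), so accepting one intermediate element corresponds to
two steps here, which may differ from the depth convention used in the text.

```python
# Hypothetical inclusion graph: each element maps to the elements it may contain.
# This is a toy example, NOT the content of any real TEI module.
TOY = {
    "entry": {"wrapper", "sense"},
    "wrapper": {"section"},
    "section": {"companion", "p", "head"},
    "companion": {"section"},
    "sense": {"def"},
    "def": set(),
    "p": set(),
    "head": set(),
}


def within(graph, src, dst, max_len):
    """Return True if dst is reachable from src in 1..max_len inclusion steps."""
    frontier = {src}
    for _ in range(max_len):
        # Advance one inclusion step from every element reached so far.
        frontier = {child for node in frontier for child in graph.get(node, ())}
        if dst in frontier:
            return True
    return False


def candidates(graph, root, max_len):
    """List the elements usable as a recursive "section" under root.

    An element qualifies if, within max_len steps each time, it can be
    reached from root, lies on a cycle (it can include itself, possibly
    through a companion element), and can include `p` and either `head`
    or `title`.
    """
    found = []
    for element in sorted(graph):
        if (
            within(graph, root, element, max_len)
            and within(graph, element, element, max_len)
            and within(graph, element, "p", max_len)
            and (
                within(graph, element, "head", max_len)
                or within(graph, element, "title", max_len)
            )
        ):
            found.append(element)
    return found


strict = candidates(TOY, "entry", 1)   # direct inclusions and self-loops only
relaxed = candidates(TOY, "entry", 2)  # one intermediate element allowed
```

On this toy graph the strict query finds nothing, while the relaxed one accepts
`section`, which reaches itself through `companion`. Swapping the `root`
argument is how the `<entry/>` and `<entryFree/>` variants of the query could be
compared.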
#### Widening the search
......@@ -546,14 +551,16 @@ dedicated module. Only three elements are returned:
## Encoding within the *core* module
The above remarks explain why the *dictionary* module by itself is unable to
represent encyclopedias, where the notion of "meaning" is less central than in
The above remarks explain why the *dictionary* module is unable to represent
encyclopedias, where the notion of "meaning" is less central than in
dictionaries and where discourse with nested structures of arbitrary depth can
occur. Since the *core* module of course accommodates these structures by means
of the `<div/>`, `<head/>` and `<p/>` elements which have the additional
advantage of carrying less semantic payload than `<sense/>` or `<def/>` we
devise an encoding scheme using them which we recommend using for other projects
aiming at representing encyclopedias.
occur. Even composite encodings using elements outside of the *dictionaries*
module under an `<entry/>` element do not meet our requirements. Since the
*core* module of course accommodates these structures by means of the `<div/>`,
`<head/>` and `<p/>` elements, which have the additional advantage of carrying
less semantic payload than `<sense/>` or `<def/>`, we devise an encoding scheme
using them, which we recommend for other projects aiming at representing
encyclopedias.
To remain consistent with the above remarks we will only concern ourselves with
what happens at the level of each article, right under the `<body/>` element.
......@@ -578,11 +585,25 @@ to avoid issues with the XML encoding.
Inside this element should be a `<head/>` enclosing the headword of the article.
The usual sub-`<hi/>` elements are available within `<head/>` if the headword is
highlighted by any special typographic means such as bold, small capitals, etc.
This element should also contain the optional subject indicator within
parentheses that sometimes accompanies the headword, with the appropriate standard
elements like `<persName/>` occurring in biographical articles or `<interp/>`
with a `theme` attribute if the article is given a specific domain in a
taxonomy.
The one disappointment of the encoding scheme we are currently defining is the
lack of support for a proper way to encode subject indicators.
The best candidate we have found so far was `<usg/>` from the *dictionaries*
module but it is not available directly under a `<head/>` element. All inclusion
paths from the latter to the former of length less than or equal to 3 contain
irrelevant elements (`<cit/>`, `<figure/>`, `<castList/>` and `<nym/>`) so it
must be discarded. The next best elements appear to be `<term/>` (not very
accurate) and `<rs/>` ("referring string", with quite general semantics but a
possible match, since subject indicators refer to a given domain of knowledge,
although all the examples in the documentation refer to concrete persons,
places or objects, not to the abstract objects that mathematics or poetry can
be). For this reason, we do not recommend any special encoding of the subject
indicator but leave it open to each particular context. Subject indicators are
often abbreviated, so an `<abbr/>` may apply. In *La Grande Encyclopédie*,
biographies are not labeled by a knowledge domain but usually include the first
name of the person when it is known, so in that case an element like
`<persName/>` is still appropriate.
![](snippets/cathète_1.png)
......
......@@ -4,7 +4,7 @@ header-includes:
\usepackage{graphicx}
\usepackage[left=0cm,top=0cm,right=0cm,nohead,nofoot]{geometry}
\geometry{
paperwidth=12.4cm,
paperwidth=8.8cm,
paperheight=1.4cm,
margin=0cm
}
......@@ -12,6 +12,6 @@ header-includes:
```xml
<div xml:id="cathète-0">
<head>CATHÈTE (<interp theme="domain">Archit.</interp>)</head>
<head>CATHÈTE (<abbr>Archit.</abbr>)</head>
</div>
```
......@@ -4,7 +4,7 @@ header-includes:
\usepackage{graphicx}
\usepackage[left=0cm,top=0cm,right=0cm,nohead,nofoot]{geometry}
\geometry{
paperwidth=12.4cm,
paperwidth=8.8cm,
paperheight=1.8cm,
margin=0cm
}
......@@ -12,7 +12,7 @@ header-includes:
```xml
<div xml:id="cathète-0">
<head>CATHÈTE (<interp theme="domain">Archit.</interp>)</head>
<head>CATHÈTE (<abbr>Archit.</abbr>)</head>
<div type="sense" n="0"></div>
</div>
```
......@@ -12,7 +12,7 @@ header-includes:
```xml
<div xml:id="cathète-0">
<lb/><head>CATHÈTE (<interp theme="domain">Archit.</interp>).</head>
<lb/><head>CATHÈTE (<abbr>Archit.</abbr>).</head>
<div type="sense" n="0">
<p>
On désigne ainsi la ligne
......