{height=830px #fig:dictionaries-subgraph}
### Definitions {-}
By iterating several times the operation of moving on that graph along one edge,
that is, by considering the transitive closure of the relation "be connected by
an edge", we define *inclusion paths* which allow us to explore which elements
...
@@ -283,8 +281,6 @@ directly contain another one, it may contain a `<geogName/>` which, in turn, may
contain a new `<address/>` element. From a graph theory perspective, we can say
that it admits an inclusion cycle of length two.
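
As a minimal sketch of the notion of inclusion cycle, the following Python snippet searches a toy inclusion graph breadth-first for the shortest cycle through a given element. The dictionary `INCLUSIONS` and the function `shortest_cycle_length` are hypothetical names introduced for illustration; only the two edges mentioned above are listed, whereas the real schema contains many more.

```python
from collections import deque

# Toy inclusion graph: an edge a -> b means "a may directly contain b".
# Assumption: only the two edges discussed in the text are represented.
INCLUSIONS = {
    "address": ["geogName"],
    "geogName": ["address"],
}


def shortest_cycle_length(graph, start):
    """Return the length of the shortest inclusion cycle through `start`,
    or None if no inclusion path leads back to it."""
    queue = deque([(start, 0)])
    seen = set()
    while queue:
        node, depth = queue.popleft()
        for child in graph.get(node, []):
            if child == start:
                # One more edge closes the cycle.
                return depth + 1
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return None


print(shortest_cycle_length(INCLUSIONS, "address"))
```

Because the search proceeds level by level, the first time it reaches `start` again gives the shortest cycle through that element.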
### Applications {-}
Using classical, well-known methods such as Dijkstra's algorithm [@dijkstra59]
allows us to explore the shortest inclusion paths that exist between elements.
Though a particular caution should be applied because there is no guarantee that
...
@@ -327,8 +323,6 @@ can appear quite near the "surface" of article entries.
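
The search for shortest inclusion paths mentioned in this section can be sketched as follows, assuming every edge of the inclusion graph has weight one. The graph `INCLUSIONS` is a hypothetical, heavily simplified subset chosen for illustration and does not reproduce the actual schema; `shortest_inclusion_path` is likewise a name introduced here.

```python
import heapq

# Toy inclusion graph: an edge a -> b means "a may directly contain b".
# Assumption: a simplified subset for illustration, not the full TEI schema.
INCLUSIONS = {
    "body": ["entry"],
    "entry": ["form", "def", "usg"],
    "form": ["case"],
}


def shortest_inclusion_path(graph, source, target):
    """Dijkstra's algorithm with unit edge weights.

    Returns the list of elements along a shortest inclusion path from
    `source` to `target`, or None when `target` is unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        distance, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in settled:
            continue
        settled.add(node)
        for child in graph.get(node, []):
            if child not in settled:
                heapq.heappush(queue, (distance + 1, child, path + [child]))
    return None


print(shortest_inclusion_path(INCLUSIONS, "body", "case"))
```

With unit weights the result coincides with a breadth-first search; the priority queue only becomes necessary if edges are given different costs.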
## Content of the module
### The `<entry/>` element {-}
The central element of the *dictionaries* module is the `<entry/>` element, meant
to encode one single entry in a dictionary, that is to say a head word
associated with its definition. It is the natural way in from the `<body/>`
...
@@ -347,8 +341,6 @@ as the natural top-most element for an article. This somewhat contrived example
hopes to further demonstrate the application of a graph-centred approach to
understand the inner workings of the XML-TEI schema.
### Information about the headword itself {-}
Once a block for an article is created, it may contain elements that
represent various features of its headword. Its written and spoken forms are usually
encoded by `<form/>` elements. Grammatical information like the `<case/>`,
...
@@ -364,8 +356,6 @@ the encoder with a toolbox to describe all the information related to the form
the entry is found at and seems general enough to accommodate the structure of
any book indexing entries by words.
### Cross-references {-}
A common feature shared by dictionaries and encyclopedias is the ability to
connect entries together by using a word or short phrase as the link, referring
the reader to the related concept. Such links are known as cross-references and can
...
@@ -383,8 +373,6 @@ in this description of the toolbox because it is particularly useful in the
context of dictionaries. This element may have a `target` attribute which points
to the other resource to be accessed by the interested reader.
### Definitions {-}
The remaining part of an entry is usually also the largest and represents the
content associated with the headword by the entry. In a dictionary, that is its
meaning.
...
@@ -395,8 +383,6 @@ of this versatile element) and other high-level information such as translations
in other languages. Both `<def/>` and `<usg/>` elements may appear directly
under the `<entry/>`.
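
To make this layout concrete, here is a minimal sketch that builds such an entry with Python's standard `xml.etree.ElementTree` module. The helper `build_entry`, the `<orth/>` child used to hold the written form, and the sample content are assumptions made for illustration; TEI namespaces and attributes are left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_entry(headword, usage, definition):
    """Build a bare <entry/> whose <usg/> and <def/> sit directly under it.

    Assumption: the written form is stored in an <orth/> child of <form/>;
    namespaces and attributes are omitted."""
    entry = ET.Element("entry")
    form = ET.SubElement(entry, "form")
    ET.SubElement(form, "orth").text = headword
    ET.SubElement(entry, "usg").text = usage
    ET.SubElement(entry, "def").text = definition
    return entry


# Hypothetical sample content, not taken from any encoded dictionary.
entry = build_entry("headword", "domain label", "text of the definition")
print(ET.tostring(entry, encoding="unicode"))
```

The resulting tree has `<form/>`, `<usg/>` and `<def/>` as direct children of `<entry/>`, which is the flattest arrangement the description above allows.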
### Structural remarks {-}
Before concluding this description of the *dictionaries* module from the
perspective of someone trying to concretely encode a particular dictionary or
encyclopedia, we make use of the graph approach again to highlight some of its
...
@@ -459,8 +445,6 @@ noticeable differences. It is difficult to make a precise list because the
editorial choices may vary greatly between encyclopedias, but we discuss some of
the most obvious.
### Organised knowledge {-}
The first immediately visible feature that sets encyclopedias apart from
dictionaries, and can be found in the *Encyclopédie* as well as in *La Grande
Encyclopédie*, is the presence of subject indicators at the beginning of articles
...
@@ -506,8 +490,6 @@ This point, although not the most concerning, still remains the hardest to
address but, all things considered, the `<usg/>` element stands out as the most
relevant.
### The notion of meaning {-}
Notwithstanding the correct way to represent domains of knowledge, their extent
itself raises concerns regarding the *dictionaries* module. Indeed, among the
vast collection of domains covered in encyclopedias in general and in *La Grande
...
@@ -539,8 +521,6 @@ idea that the term eludes definition, wrapping it in a `<sense/>`, or worse, a
As a result, the use of `<sense/>` and `<def/>` is not appropriate for
encyclopedic content in general.
### Nested structures {-}
The final difficulty can be considered as a partial consequence of the previous
one on the structure of articles. The difficulty of defining complex concepts is
the very reason why authors approach their subjects from various angles,
...
@@ -657,8 +637,6 @@ article, "Cathète" from tome 9 reproduced in Figure @fig:cathete-photo.
)](ressources/cathète_t9.png){#fig:cathete-photo}
### The scheme {-}
Remaining within the *core* module for the structure, almost all useful elements
are available and our encoding scheme merely quotes the official documentation.
Each article is represented by a `<div/>`. We suggest setting an `xml:id`
...
@@ -768,8 +746,6 @@ encoding scheme as demonstrated by Figure @fig:alcala-xml.
{#fig:alcala-xml}
### Currently implemented {-}
The reference implementation for this encoding scheme is the program
soprano
([https://gitlab.huma-num.fr/disco-lge/soprano](https://gitlab.huma-num.fr/disco-lge/soprano)) developed within the scope of project DISCO-LGE to
...
@@ -874,8 +850,6 @@ back and forth between trying to find patterns in the graph which reflects the p
found in the text and questioning the relevance of the results explains the
choice we ended up making but also the alternatives we have considered.
### Bend the semantics {-}
Several times, the issue of the semantics of some elements which possess the
properties we need came up. This is the case, for instance, of the `<sense/>` and
`<node/>` elements. It is very tempting to bend their documented semantics or to
...
@@ -890,8 +864,6 @@ the encyclopedic developments that occur in the articles.
We have chosen not to follow the same path, in the name of the FAIR principles, to
avoid the emergence of a custom usage differing from the documented one.
### Custom schema {-}
The other major reason behind our choice was the inclusion rules that exist
between TEI elements, which pushed us to look for different combinations. Another
valid approach would have consisted in changing the structure of the inclusion