Commit 68acc9d8 authored by Alice Brenon

Finished editing the intro apparently

parent 3020cab8
@@ -69,63 +69,56 @@ semantics and philosophical considerations:
(*a language dictionary, which appears to be only a word dictionary, must often
be a thing dictionary when it is made properly*). A similar criticism is made by
@haiman_dictionaries_1980 who attacks no less than six criteria on which
dictionaries and encyclopedias are generally opposed to reach the conclusion
that there is no distinction between them because "dictionaries *are*
@haiman_dictionaries_1980 [p. 331] who attacks no less than six criteria on
which dictionaries and encyclopedias are generally opposed to reach the
conclusion that there is no distinction between them because "dictionaries *are*
encyclopedias". Regardless of the validity of his reasoning, it only proves one
inclusion: that perhaps, dictionaries would be a special case of encyclopedias.
This, as will be evidenced, by no means implies that encyclopedias are
This, as will be shown, by no means implies that, conversely, encyclopedias are
dictionaries.
XML-TEI is a set of guidelines collectively developed by the
@tei_consortium_tei_2023 in the form of XML schemas, along with a range of
tools to handle them and training resources in order to represent text in a
highly structured and machine-readable format. Its toolbox has a modular
structure consisting of optional parts each covering specific needs such as the
physical features of a source document, the transcription of oral corpora or
particular requirements for textual domains like poetry, or, in the case at
hand, dictionaries.
After describing why the dedicated
module was a natural candidate to consider, I formalise tools from graph
theory to browse the specifications of this guideline in a rational way and
explore this module in detail.
@romary_formal_2007
(@ide_encoding_1995 *dictionaries* only for western dictionaries) have been
applied for both historical (@bohbot2018) and digitally native
(@bowers_bridging_2018). In addition, a specific set of guidelines tailored to
encoding dictionaries, TEI-Lex0, has been published [@banski_tei_lex0_2017].
Systematic study of the guidelines @ide_background_1998 but here's a new method.
Less than ten years after the beginnings of the TEI, @ide_background_1998 gives a
thorough account of the criteria
# Dictionaries and encyclopedias
After emerging over the course of the 18^th^ century, encyclopedias became a
fertile subgenre in themselves and a rich subject of study to digital humanities
for their particular relation to knowledge and its evolution. This section
describes the goal of the project, then looks at the origin of the term
"encyclopedia" itself before comparing the approaches of encyclopedias and
dictionaries.
## Context of the project
CollEx-Persée project DISCO-LGE
XML-TEI is a set of guidelines, tools and training resources collectively
developed by the @tei_consortium_tei_2023 to represent text in a highly
structured and machine-readable format. Its toolbox has a modular structure
consisting of optional parts each covering specific needs such as the physical
features of a source document, the transcription of oral corpora or particular
requirements for textual domains like poetry, or, in the case at hand,
dictionaries. The intrinsic complexity of dictionaries has been well identified
since the inception of the project [@tei_vault] and @ide_encoding_1995
underlines the amount of work which went into the third version of the
guidelines (P3) to provide a toolbox both general and expressive enough to
account for the variety of conventions found in dictionaries.
@romary_formal_2007 This module has been successfully used to encode both
historical [@williams2017], [@bohbot2018] and digitally native dictionaries
[@bowers_bridging_2018]. In addition, a specific set of guidelines tailored to
encoding dictionaries, named TEI-Lex0, has also been published [@banski_tei_lex0_2017].
The TEI effort is described as "first steps" by @ide_background_1998 to reach a
standard to encode corpora and lay a common basis for corpora comparisons and
reuse. They point out some minor inconsistencies in the design, remark that there is
generally more than one way to encode a given text in XML-TEI and identify nine
criteria to design a sound standard. Their claims are backed by concrete
examples of encoding situations but without giving any idea of the prevalence of
the issues found. In fact, the sheer complexity of the guidelines can make it
hard to ascertain whether a particular element structure is impossible to
represent (not finding a suitable encoding is not a proof that there is none).
This chapter will use results from graph theory to give a systematic study of
the possibilities and shortcomings of the TEI *dictionaries* module.
# Context of the study
## CollEx-Persée Project DISCO-LGE
The project
([https://www.collexpersee.eu/projet/disco-lge/](https://www.collexpersee.eu/projet/disco-lge/))
set out to study *La Grande Encyclopédie, Inventaire raisonné des Sciences, des
Lettres et des Arts par une Société de savants et de gens de lettres*, an
encyclopedia published in France between 1885 and 1902 by an organised team of
over two hundred specialists divided into eleven sections. This text comprises
31 tomes of about 1200 pages each and according to @jacquet-pfau2015 [, pp. 88 et
seq.] was the last major French encyclopedic endeavour directly inheriting from
the prestigious ancestor that was the *Encyclopédie ou Dictionnaire raisonné des
sciences des arts et des métiers* published by Diderot and d'Alembert 130 years
earlier, between 1751 and 1772.
Lettres et des Arts par une Société de savants et de gens de lettres* (henceforth
*LGE*), an encyclopedia published in France between 1885 and 1902 by an
organised team of over two hundred specialists divided into eleven sections.
This text comprises 31 tomes of about 1200 pages each and according to
@jacquet-pfau2015 [, pp. 88 et seq.] was the last major French encyclopedic
endeavour directly inheriting from the prestigious ancestor that was the *EDdA*
published by Diderot and d'Alembert 130 years earlier, between 1751 and 1772.
The aim of the project was to digitise and make *La Grande Encyclopédie*
available to the scientific community as well as the general public. A previous
@@ -136,8 +129,8 @@ pictures with an Optical Character Recognition (OCR) system. This prevented an
exhaustive study of the text with textometry tools such as TXM [@heiden2010]. As
a prelude to project GEODE
([https://geode-project.github.io/](https://geode-project.github.io/)), the goal
of CollEx-Persée was to produce a digital version of *La Grande Encyclopédie*
with a quality comparable to that of the *Encyclopédie* provided by the ARTFL
of CollEx-Persée was to produce a digital version of *LGE* with a quality
comparable to that of the *Encyclopédie* provided by the ARTFL
([http://artfl-project.uchicago.edu/](http://artfl-project.uchicago.edu/))
project in order to conduct a diachronic study of both encyclopedias.