diff --git a/README.md b/README.md
index 3ecf1ce8897e404d15a7a851157f834299e846d5..37252a4c8ec979da310243a37c0d2427fd0b0fd2 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,9 @@
-# PyEDdA
+# Classifying encyclopedia articles
 
-Ce dépôt contient le code réalisé dans le cadre du projet
-[GEODE](https://geode-project.github.io/) par **Khaled Chabane**, **Ludovic
-Moncla** et **Alice Brenon**.
+This repository contains the code developed for classifying French encyclopedia articles as part of the [GEODE](https://geode-project.github.io/) project.
 
-Il contient le code développé à l'origine pour l'article "*Classification
-automatique d'articles encyclopédiques*"
-([https://hal.archives-ouvertes.fr/hal-03481219v1](https://hal.archives-ouvertes.fr/hal-03481219v1))
-présenté lors de la conférence [EGC 2022](https://egc2022.univ-tours.fr/).
 
+<!--
 ## Utilisation
 
 Ce dépôt est un paquet python pouvant être installé avec
@@ -28,112 +23,148 @@ pip install -e .
 ```sh
 guix shell python -f guix.scm
 ```
+-->
+
+## Overview
+
+
+This Git repository contains the code developed for a comparative study of supervised classification approaches applied to the automatic classification of encyclopedia articles written in French.
+Our dataset is composed of the 17 volumes of text of the *Encyclopédie* by Diderot and d'Alembert (1751-72), which contain about 70,000 articles.
+We combine text vectorization (bag-of-words and word embeddings) with classic machine learning methods, deep learning, and transformer architectures.
+In addition to evaluating these approaches, we review the classification predictions using a variety of quantitative and qualitative methods.
+The best model obtains an average F-score of 86% over 38 classes.
+Using network analysis, we highlight the difficulty of classifying semantically close classes. We also introduce examples of opportunities for qualitative evaluation of "misclassifications" in order to understand the relationship between content and different ways of ordering knowledge.
+
+
+## Experiments
+
+Our experiments compare approaches to classifying *EDdA* articles using vectorization and supervised classification. We test the following combinations:
+
+1. Bag-of-words vectorization and classic ML algorithms (Naive Bayes, Logistic Regression, Random Forest, SVM and SGD);
+    
+2. Vectorization using static word embeddings (Doc2Vec) and classic ML algorithms (Logistic Regression, Random Forest, SVM and SGD);
+    
+3. Vectorization using static word embeddings (FastText[fr]) and deep learning algorithms (CNN and BiLSTM);
+    
+4. An *end-to-end* approach using pre-trained contextual language models (BERT, CamemBERT) with fine-tuning to adapt the model for our task.
+
+
+
+## Results
+
+All the results (measured with precision, recall and F-score) are in the [reports](./reports) directory. Some tables and figures are listed below:
+
+### Mean F-scores of the different models on the test set, with sampling capped at 500 (1) or 1 500 (2) articles per class, and without sampling (3).
+
+| Classifier                      | Vectorizer    |       | F-score |     |
+| ------------------------------- | ------------- | ----- | ----- | ----- |
+|                                 |               | (1)   | (2)   | (3)   |
+| Naive Bayes                     | Bag of Words  | 0.63  | 0.71  | 0.70  |
+|                                 | TF-IDF        | 0.74  | 0.69  | 0.44  |
+| Logistic Regression             | Bag of Words  | 0.74  | 0.77  | 0.79  |
+|                                 | TF-IDF        | 0.77  | 0.79  | 0.81  |
+|                                 | Doc2Vec       | 0.64  | 0.69  | 0.77  |
+| Random Forest                   | Bag of Words  | 0.57  | 0.54  | 0.16  |
+|                                 | TF-IDF        | 0.55  | 0.53  | 0.16  |
+|                                 | Doc2Vec       | 0.63  | 0.66  | 0.60  |
+| SGD                             | Bag of Words  | 0.70  | 0.73  | 0.75  |
+|                                 | TF-IDF        | 0.77  | 0.81  | 0.81  |
+|                                 | Doc2Vec       | 0.68  | 0.72  | 0.76  |
+| SVM                             | Bag of Words  | 0.71  | 0.75  | 0.78  |
+|                                 | TF-IDF        | 0.77  | 0.80  | 0.81  |
+|                                 | Doc2Vec       | 0.68  | 0.74  | 0.78  |
+| CNN                             | FastText      | 0.65  | 0.72  | 0.74  |
+| BiLSTM                          | FastText      | 0.69  | 0.79  | 0.80  |
+| BERT Multilingual (fine-tuning) | -             | 0.81  | 0.85  | 0.86  |
+| CamemBERT (fine-tuning)         | -             | 0.78  | 0.83  | 0.86  |
+
+
+### F-scores for classes on the test set obtained with SGD + TF-IDF (1), BiLSTM + FastText (2) and BERT Multilingual (3).
 
-## Présentation
-
-Ce dépôt contient le code développée pour une étude comparative de différentes
-approches de classification supervisée appliquées à la classification
-automatique d’articles encyclopédiques. Notre corpus d’apprentissage est
-constitué des 17 volumes de texte de l’Encyclopédie de Diderot et d’Alembert
-(1751-1772) représentant un total d’environ 70 000 articles. Nous avons
-expérimenté différentes approches de vectorisation de textes (sac de mots et
-plongement de mots) combinées à des méthodes d’apprentissage automatique
-classiques, d’apprentissage profond et des architectures BERT. En plus de la
-comparaison de ces différentes approches, notre objectif est d’identifier de
-manière automatique les domaines des articles non classés de l’Encyclopédie
-(environ 2 400 articles).
-
-## Méthodes testées
-
-Nos expérimentations concernent l’étude de différentes approches de
-classification comprenant deux étapes principales : la vectorisation et la
-classification supervisée. Nous avons testé et comparé les différentes
-combinaisons suivantes :
-
-1. vectorisation en sac de mots et apprentissage automatique classique (Naive
-   Bayes, Logistic regression, Random Forest, SVM et SGD) ;
-2. vectorisation en plongement de mots statiques (Doc2Vec) et apprentissage
-   automatique classique (Logistic regression, Random Forest, SVM et SGD) ;
-3. vectorisation en plongement de mots statiques (FastText) et apprentissage
-   profond (CNN et LSTM) ;
-4. approche *end-to-end* utilisant un modèle de langue pré-entraîné
-   (BERT,CamemBERT) et une technique de *fine-tuning* pour adapter le modèle sur
-   notre tâche de classification.
-
-## Résultats
-
-### F-mesures moyennes des différents modèles pour les jeux de validation et de test avec un échantillonnage max de 500 (1) et 1 500 (2) articles par classe et sans échantillonnage (3).
-
-| Classifieur                     | Vectorisation |      | Test |      |
-| ------------------------------- | ------------- | ---- | ---- | ---- |
-|                                 |               | (1)  | (2)  | (3)  |
-| Naive Bayes                     | Bag of Words  | 0.72 | 0.68 | 0.61 |
-|                                 | TF-IDF        | 0.74 | 0.59 | 0.37 |
-| Logistic Regression             | Bag of Words  | 0.85 | 0.85 | 0.86 |
-|                                 | TF-IDF        | 0.88 | 0.88 | 0.88 |
-|                                 | Doc2Vec       | 0.39 | 0.39 | 0.44 |
-| Random Forest                   | Bag of Words  | 0.50 | 0.49 | 0.17 |
-|                                 | TF-IDF        | 0.48 | 0.48 | 0.16 |
-|                                 | Doc2Vec       | 0.28 | 0.29 | 0.37 |
-| SGD                             | Bag of Words  | 0.85 | 0.86 | 0.86 |
-|                                 | TF-IDF        | 0.88 | 0.88 | 0.88 |
-|                                 | Doc2Vec       | 0.43 | 0.42 | 0.44 |
-| SVM                             | Bag of Words  | 0.85 | 0.85 | 0.86 |
-|                                 | TF-IDF        | 0.86 | 0.86 | 0.87 |
-|                                 | Doc2Vec       | 0.32 | 0.32 | 0.43 |
-| CNN                             | FastText      | 0.04 | 0.05 | 0.09 |
-| LSTM                            | FastText      | 0.10 | 0.10 | 0.12 |
-| BERT Multilingual (fine-tuning) | -             | 0.84 | 0.88 | 0.89 |
-| CamemBERT (fine-tuning)         | -             | 0.82 | 0.86 | 0.88 |
-
-### F-mesures obtenues par ensemble de domaines avec les approches SGD + TF-IDF (1), LSTM + FastText (2) et BERT (3) sans échantillonnage et sur le jeu de test.
 
 | Ensemble de domaines    | Support | (1)  | (2)  | (3)  | Ensemble de domaines | Support | (1)  | (2)  | (3)  |
 | ----------------------- | ------- | ---- | ---- | ---- | -------------------- | ------- | ---- | ---- | ---- |
-| Géographie              | 2 870   | 0.98 | 0.22 | 0.99 | Arts et métiers      | 132     | 0.45 | 0.00 | 0.51 |
-| Droit - Jurisprudence   | 1 452   | 0.92 | 0.39 | 0.94 | Blason               | 126     | 0.93 | 0.00 | 0.93 |
-| Métiers                 | 1 220   | 0.87 | 0.07 | 0.89 | Chasse               | 124     | 0.92 | 0.01 | 0.92 |
-| Histoire naturelle      | 1 130   | 0.92 | 0.06 | 0.95 | Maréchage [\ldots]   | 118     | 0.90 | 0.00 | 0.88 |
-| Histoire                | 726     | 0.76 | 0.08 | 0.80 | Chimie               | 115     | 0.75 | 0.02 | 0.72 |
-| Grammaire               | 575     | 0.77 | 0.08 | 0.81 | Philosophie          | 115     | 0.75 | 0.01 | 0.69 |
-| Médecine [\ldots]       | 535     | 0.87 | 0.07 | 0.87 | Beaux-arts           | 103     | 0.86 | 0.00 | 0.84 |
-| Marine                  | 454     | 0.93 | 0.03 | 0.94 | Monnaie              | 74      | 0.81 | 0.00 | 0.79 |
-| Commerce                | 437     | 0.85 | 0.04 | 0.85 | Pharmacie            | 75      | 0.65 | 0.00 | 0.58 |
-| Religion                | 389     | 0.89 | 0.02 | 0.90 | Jeu                  | 67      | 0.85 | 0.00 | 0.87 |
-| Architecture            | 326     | 0.88 | 0.01 | 0.88 | Pêche                | 48      | 0.93 | 0.00 | 0.90 |
-| Antiquité               | 321     | 0.80 | 0.01 | 0.82 | Mesure               | 43      | 0.65 | 0.00 | 0.74 |
-| Physique                | 309     | 0.85 | 0.04 | 0.86 | Economie domestique  | 31      | 0.75 | 0.00 | 0.58 |
-| Militaire [\ldots]      | 304     | 0.92 | 0.01 | 0.92 | Médailles            | 28      | 0.84 | 0.00 | 0.79 |
-| Agriculture [\ldots]    | 259     | 0.80 | 0.04 | 0.80 | Caractères           | 27      | 0.67 | 0.00 | 0.51 |
-| Belles-lettres - Poésie | 246     | 0.75 | 0.01 | 0.74 | Politique            | 27      | 0.31 | 0.00 | 0.00 |
-| Anatomie                | 245     | 0.92 | 0.02 | 0.91 | Minéralogie          | 26      | 0.68 | 0.00 | 0.65 |
-| Mathématiques           | 164     | 0.88 | 0.00 | 0.89 | Superstition         | 26      | 0.81 | 0.00 | 0.73 |
-| Musique                 | 163     | 0.94 | 0.01 | 0.94 | Spectacle            | 11      | 0.17 | 0.00 | 0.00 |
-
-### Matrice de confusion obtenue avec l’approche SGD+TF-IDF sur le jeu de test
+| Géographie              | 2 621   | 0.96 | 0.98 | 0.99 | Chasse               | 116     | 0.87 | 0.87 | 0.92 |
+| Droit - Jurisprudence   | 1 284   | 0.88 | 0.90 | 0.93 | Arts et métiers      | 112     | 0.15 | 0.27 | 0.36 |
+| Métiers                 | 1 051   | 0.79 | 0.76 | 0.81 | Blason               | 108     | 0.87 | 0.86 | 0.89 |
+| Histoire naturelle      | 963     | 0.90 | 0.87 | 0.93 | Maréchage [\ldots]   | 105     | 0.83 | 0.86 | 0.90 |
+| Histoire                | 616     | 0.64 | 0.64 | 0.75 | Chimie               | 104     | 0.70 | 0.58 | 0.77 |
+| Médecine [\ldots]       | 455     | 0.83 | 0.80 | 0.86 | Philosophie          | 94      | 0.75 | 0.49 | 0.72 |
+| Grammaire               | 452     | 0.58 | 0.54 | 0.71 | Beaux-arts           | 86      | 0.70 | 0.62 | 0.82 |
+| Marine                  | 415     | 0.83 | 0.86 | 0.88 | Pharmacie            | 65      | 0.53 | 0.38 | 0.63 |
+| Commerce                | 376     | 0.71 | 0.69 | 0.74 | Monnaie              | 63      | 0.63 | 0.50 | 0.72 |
+| Religion                | 328     | 0.78 | 0.77 | 0.84 | Jeu                  | 56      | 0.84 | 0.74 | 0.85 |
+| Architecture            | 278     | 0.79 | 0.74 | 0.80 | Pêche                | 42      | 0.85 | 0.84 | 0.85 |
+| Antiquité               | 272     | 0.66 | 0.68 | 0.74 | Mesure               | 37      | 0.35 | 0.10 | 0.56 |
+| Physique                | 265     | 0.75 | 0.76 | 0.82 | Economie domestique  | 27      | 0.41 | 0.48 | 0.44 |
+| Militaire [\ldots]      | 258     | 0.83 | 0.82 | 0.88 | Caractères           | 23      | 0.61 | 0.08 | 0.46 |
+| Agriculture [\ldots]    | 233     | 0.68 | 0.58 | 0.71 | Médailles            | 23      | 0.77 | 0.70 | 0.86 |
+| Anatomie                | 215     | 0.89 | 0.84 | 0.90 | Politique            | 23      | 0.15 | 0.22 | 0.53 |
+| Belles-lettres - Poésie | 206     | 0.58 | 0.41 | 0.70 | Minéralogie          | 22      | 0.38 | 0.39 | 0.70 |
+| Mathématiques           | 140     | 0.82 | 0.85 | 0.89 | Superstition         | 22      | 0.72 | 0.48 | 0.41 |
+| Musique                 | 137     | 0.87 | 0.83 | 0.88 | Spectacle            | 9       | 0.33 | 0.46 | 0.61 |
+
+
+
+
+### F-scores obtained with BERT Multilingual and CamemBERT for each class
+
+![image info](./img/F1Scores_BERTvsCAMEMBERT.png)
+
+### F-scores obtained with Naive Bayes + TF-IDF on each class with three different sampling settings.
+
+![image info](./img/F1Scores_NB_TF.png)
+
+
+### F-scores obtained with SGD on each class with three different vectorizers without sampling.
+
+![image info](./img/F1Scores_SGD.png)
+
+
+### F-scores for classes on the test set obtained with SGD + TF-IDF, BiLSTM + FastText and BERT Multilingual.
+
+![image info](./img/F1Scores_SGD_BiLSTM_BERT.png)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+### Confusion matrix for SGD+TF-IDF 
+
+
 
 ![image info](./img/sgd_tf_idf_s10000.png)
 
-Cette figure présente la matrice de confusion obtenue avec la méthode SGD+TF-IDF
-sur le jeu de test. On peut voir qu’un grand nombre d’articles des classes *Arts
-et métiers* et *Economie domestique* a été classé dans la classe *Métiers*, de
-la même manière les classes *Mesure*, *Minéralogie*, *Pharmacie* et
-*Politique* sont souvent confondues avec les classes *Commerce*, *Histoire
-naturelle*, *Médecine - Chirurgie* et *Droit - Jurisprudence*, respectivement.
-Les proximités sémantiques entre ces classes montrent bien la difficulté pour
-les modèles de choisir entre l’une ou l’autre et les résultats confirment qu’en
-cas de trop grande proximité les modèles choisissent la classe la plus
-représentée dans le jeu de données.
+This confusion matrix presents the results of the SGD+TF-IDF model on the test set. A large number of articles in the classes *Arts et métiers* and *Economie domestique* (Domestic economy) were classified as *Métiers*. Similarly, *Mesure* (Measurement), *Minéralogie* (Mineralogy), *Pharmacie* (Pharmacy) and *Politique* (Politics) were often confused with *Commerce*, *Histoire naturelle* (Natural history), *Médecine - Chirurgie* (Medicine - Surgery) and *Droit - Jurisprudence* (Law), respectively. The semantic similarity between these classes illustrates how difficult it is for a model to choose a “best match.” The results confirm that, when classes are semantically very close, the model chooses the class that is best represented in the dataset, thereby privileging the ENCCRE domains that contain more articles.
+
 
-## Citation
+
+## Cite our work
 
 Moncla, L., Chabane, K., et Brenon, A. (2022). Classification automatique
 d’articles encyclopédiques. *Conférence francophone sur l’Extraction et la
-Gestion des Connaissances (EGC)*. Blois, France.
+Gestion des Connaissances ([EGC](https://egc2022.univ-tours.fr/))*. Blois, France. [https://hal.archives-ouvertes.fr/hal-03481219v1](https://hal.archives-ouvertes.fr/hal-03481219v1)
+
+
+
+## Acknowledgements
+
+
+Data courtesy of the ARTFL Encyclopédie Project, University of Chicago.
+
+This work was supported by the [ASLAN project](https://aslan.universite-lyon.fr/) (ANR-10-LABX-0081) of Université de Lyon, within the program « Investissements d’Avenir » operated by the French National Research Agency (ANR).
 
-## Remerciements
 
-Les auteurs remercient le [LABEX ASLAN](https://aslan.universite-lyon.fr/)
-(ANR-10-LABX-0081) de l'Université de Lyon pour son soutien financier dans le
-cadre du programme français  "Investissements d'Avenir" géré par l'Agence
-Nationale de la Recherche  (ANR).