From 6fe5e31cd115b634a08f124d5098f2d615724b43 Mon Sep 17 00:00:00 2001
From: Jacques Fize <jacques.fize@insa-lyon.fr>
Date: Mon, 7 Sep 2020 12:14:59 +0200
Subject: [PATCH] Change Readme (add Bert)

---
 .gitignore |  4 +++-
 README.md  | 11 +++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 10ad02e..2f0ed0e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -154,4 +154,6 @@ subset*
 time*
-/data*
\ No newline at end of file
+/data*
+
+output_bert_allcooc_adjsampling3radius20km_batch32_epoch10
\ No newline at end of file
diff --git a/README.md b/README.md
index 3797136..9bd2998 100644
--- a/README.md
+++ b/README.md
@@ -93,3 +93,14 @@ grid.run()
 | -e,--epochs | number of epochs |
 | -d,--dimension | size of the ngram embeddings |
 | --admin_code_1 | (Optional) If you wish to train the network on a specific region |
+
+
+# New model based on BERT embeddings
+
+In recent years, the BERT architecture proposed by Google researchers has outperformed state-of-the-art methods on various NLP tasks (POS tagging, NER, classification). To check whether BERT embeddings improve the performance of our approach, we wrote a script that applies BERT to our data. Our previous model returned two values, each in [0,1]. With BERT, the task shifts to classification (softmax), where each class corresponds to a cell on the globe. We use the hierarchical projection model HEALPix. Other projection models, such as S2 Geometry (https://s2geometry.io/about/overview), could also be considered.
+
+To train this model, run the `bert.py` script:
+
+    python3 bert.py <train_dataset> <test_dataset>
+
+The train and test datasets are tabular data with two columns: sentence and label.
\ No newline at end of file
-- 
GitLab
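The README change above describes turning geocoding into classification, where each class corresponds to a cell on the globe (HEALPix in the patch). As a self-contained illustration of that cell-as-class idea, here is a minimal sketch that maps a coordinate pair to a cell label using a simple equal-angle latitude/longitude grid as a hypothetical stand-in for HEALPix; the function name and grid resolution are assumptions, not part of the patch:

```python
def cell_label(lat, lon, n_rows=36):
    """Map a (lat, lon) pair in degrees to an integer grid-cell id
    usable as a classification label.

    Stand-in for HEALPix: an equal-angle grid with `n_rows` latitude
    bands (5 degrees each by default) and twice as many longitude bands.
    """
    n_cols = 2 * n_rows
    # Shift lat/lon into [0, 180] / [0, 360], then bucket into cells;
    # min() keeps the poles / antimeridian inside the last cell.
    row = min(int((lat + 90.0) / 180.0 * n_rows), n_rows - 1)
    col = min(int((lon + 180.0) / 360.0 * n_cols), n_cols - 1)
    return row * n_cols + col
```

With the actual HEALPix projection, the same mapping would instead come from a HEALPix library such as healpy (e.g. its `ang2pix` routine), whose nested hierarchy allows refining cells without re-labelling the whole dataset.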