Commit 34bee52e authored by Nelly Barret's avatar Nelly Barret

[M] fixed JS + readme

parent 9c7a2d08
Showing with 256 additions and 330 deletions
# ![predihood](/predihood/static/img/favicon.png?raw=true "Logo predihood")     predihood # Predihood
This application visualizes [IRIS](https://www.insee.fr/fr/metadonnees/definition/c1523) (administrative areas defined by INSEE, somewhat similar to neighbourhoods; there are about 50,000 IRIS across France) and the indicators that describe them (e.g., number of bakeries, number and type of schools, percentage of inhabitants by socio-professional category). Predihood is an application for visualizing [IRIS](https://www.insee.fr/fr/metadonnees/definition/c1523) (administrative areas defined by the French institute of statistics, which can be considered as neighbourhoods) and the indicators that describe them (e.g. number of bakeries, average income and even the number of houses over 250m^2).
The *mapiris* tool lets you search for IRIS by code or by name (of the IRIS or of the municipality) and display the IRIS in a given geographical area. ## Statement of need
<img src="/predihood/static/img/screenshot-mapiris.jpg?raw=true" alt="mapiris screenshot" width="100%"> Finding real estate in a new city is still a challenge. We often arrive in a city we don't know, so finding the perfect place to live becomes complex. Nearby public transport on one hand, a rural landscape on the other, a lively neighbourhood for some, far from the urban hustle and bustle for others: there are many criteria for choosing your future neighbourhood. Our approach, Predihood, aims at facilitating the comparison between neighbourhoods. It defines and predicts the environment of any neighbourhood in France using supervised learning.
## Prerequisites ## Installation instructions
### Requirements
- Python, version >=3 - Python, version >=3
- [MongoDB](https://www.mongodb.com/), version >=4, into which the IRIS database must be imported (see Installation). - [MongoDB](https://www.mongodb.com/), version >=4, for importing the neighbourhood database.
## Installation ### Installation
To install *predihood*, run the following in a terminal: To install Predihood, type in a terminal:
``` ```
python3 -m pip install -e predihood/ --process-dependency-links python3 -m pip install -e predihood/ --process-dependency-links
``` ```
This command installs the dependencies, including [mongiris](https://gitlab.liris.cnrs.fr/fduchate/mongiris), which allows querying a MongoDB database containing the IRIS. This command installs the dependencies, including [mongiris](https://gitlab.liris.cnrs.fr/fduchate/mongiris), which provides querying of the MongoDB database containing information about neighbourhoods.
The database containing the IRIS collection therefore has to be created. To do so, run the following command (from the MongoDB executables directory if needed): Creating this database is mandatory. To do so, run this command (from the MongoDB executables directory if needed):
``` ```
./mongorestore --archive=/path/to/dump-iris.bin ./mongorestore --archive=/path/to/dump-iris.bin
``` ```
`/path/to/` is the path to the dump file of the IRIS collection (shipped with the mongiris package in `mongiris/data/dump-iris.bin`) where `/path/to/` is the path to the dump file of the IRIS collection (provided with the mongiris package in `mongiris/data/dump-iris.bin`).
## Launching the interface ### Run Predihood
To launch *predihood*, type in a terminal: To run *Predihood*, type in a terminal:
``` ```
python3 main.py python3 main.py
``` ```
After printing some information, the terminal shows the URL for trying *predihood*: `http://localhost:8080/` After printing some information, the terminal displays the URL for testing *Predihood*: `http://localhost:8080/`. To try the cartographic interface, click on the "Search a neighbourhood" button. To configure and test your own algorithm in the interface, click on the "Tune my classifier" button.
## Credits ## Example usage
Data source: [INSEE](https://www.insee.fr/) For the cartographic interface, an example would be:
Contributors: [LIRIS](https://liris.cnrs.fr/) laboratory, [CMW](https://www.centre-max-weber.fr/) laboratory and Labex [Intelligence des Mondes Urbains (IMU)](http://imu.universite-lyon.fr/projet/hil) 1. Type a query in the panel on the left, e.g. "Lyon". This will display all neighbourhoods that contain "Lyon" in their name or their township.
2. Click on a neighbourhood (the small areas in blue). A tooltip will appear with some information about the neighbourhood. More information is available by clicking on the "More details" link.
3. To predict the environment variables, choose a classifier. The "Random Forest" classifier is recommended by default. After a few seconds, predictions will appear in the tooltip. This helps you compare neighbourhoods.
For the algorithmic interface, an example would be:
1. Choose an algorithm
2. Tune it as desired
3. Click on the "Train, test and evaluate" button. Once the accuracies are computed, a table shows the results for each environment variable and each list of indicators.
## Tests
Tests are in the `tests.py` file.
\ No newline at end of file
...@@ -17,5 +17,6 @@ ...@@ -17,5 +17,6 @@
@article{barretpredicting, @article{barretpredicting,
title={Predicting the environment of a neighbourhood: a use case for France}, title={Predicting the environment of a neighbourhood: a use case for France},
author={Barret, Nelly and Duchateau, Fabien and Favetta, Franck and Bonneval, Loic} author={Barret, Nelly and Duchateau, Fabien and Favetta, Franck and Bonneval, Loic},
year={2020}
} }
\ No newline at end of file
...@@ -13,47 +13,64 @@ authors: ...@@ -13,47 +13,64 @@ authors:
affiliation: 1 affiliation: 1
- name: Fabien Duchateau - name: Fabien Duchateau
orcid: 0000-0001-6803-917X orcid: 0000-0001-6803-917X
affiliation: 2 affiliation: 1
- name: Franck Favetta - name: Franck Favetta
orcid: 0000-0003-2039-3481 orcid: 0000-0003-2039-3481
affiliation: 2 affiliation: 1
affiliations: affiliations:
- name: Université Claude Bernard Lyon 1, Lyon, France
index: 1
- name: LIRIS UMR5205, Université Claude Bernard Lyon 1, Lyon, France - name: LIRIS UMR5205, Université Claude Bernard Lyon 1, Lyon, France
index: 2 index: 1
date: 16 July 2020 date: 16 July 2020
bibliography: paper.bib bibliography: paper.bib
--- ---
# Statement of need 1 # Introduction
Finding real estate in a new city is still a challenge. We often arrive in a city we don't know, so finding the perfect place to live becomes complex. Nearby public transport on one hand, a rural landscape on the other, a lively neighbourhood for some, far from the urban hustle and bustle for others: there are many criteria for choosing your future neighbourhood.
Finding real estate in a new city is still a challenge. We often arrive in a city we don't know, so finding the perfect place to live becomes complex. Nearby public transport on one hand, a rural landscape on the other, a lively neighbourhood for some, far from the urban hustle and bustle for others: there are many criteria for choosing your future neighbourhood. # Statement of need
Some projects have focused on qualifying neighbourhoods, such as Livehoods [@cranshaw2012livehoods] and Hoodsquare [@zhang2013hoodsquare]. The Livehoods project aims at defining and computing dynamics of neighbourhoods based on data gathered from social networks while the Hoodsquare project detects similar areas based on Foursquare check-ins. Compared with the many papers on these challenges, our contribution differs on several points. Many works are limited to a few cities, others introduce bias by relying on social networks, and most focus on quality of life. In contrast, our approach covers a whole country (France), is based on reliable and frequently updated sources and on a social study, and focuses on the environment of neighbourhoods. Some projects focus on qualifying neighbourhoods, such as Livehoods [@cranshaw2012livehoods], Hoodsquare [@zhang2013hoodsquare] and [DataFrance](https://datafrance.info/). The Livehoods project aims at defining and computing dynamics of neighbourhoods based on data gathered from social networks. The Hoodsquare project detects similar areas based on Foursquare check-ins. DataFrance is an interface that integrates data from several sources, such as indicators provided by the National Institute of Statistics ([INSEE](https://insee.fr/en/accueil)), geographical information from the National Geographic Institute ([IGN](http://www.ign.fr/institut/activites/geoservices-ign)) and surveys from newspapers for prices (L'Express).
To describe the environment of a neighbourhood as accurately as possible, social science researchers have defined six environment variables with a limited number of values for each one. These six variables are the _building type_, the _building usage_, the _landscape_, the _social class_, the _morphological position_ and the _geographical position_. As an example, the _landscape_ can be evaluated as _urban_, _green areas_, _forest_ or _countryside_ while the _social class_ takes values from _lower_ to _upper_. These variables are commonly accepted and easy to understand and use. Describing every neighbourhood of a whole country with these six variables remains a challenge. To tackle it, our objective is to predict the environment variables of any neighbourhood by supervised learning.
# Methodology # Methodology
For predicting the environment of neighbourhoods, we have to gather data about them. There are mainly two types of data: the geometry which describe the shape of the neighbourhood and indicators that quantify the environment. Each neighbourhood can be described by thousands of indicators. Even if it is not possible to manually exploit these indicators, they are useful in an automatic approach. For example, there are the number of restaurants, the average income or even the number of houses over 250 $m^2$. Predihood integrates such data for France by using [mongiris](https://gitlab.liris.cnrs.fr/fduchate/mongiris), an interface for querying French administrative areas. There are only data about French areas, but this can be extended to other countries. Our approach Predihood aims at facilitating the comparison between neighbourhoods. It defines and predicts the environment of any neighbourhood in France using supervised learning.
## Describing neighbourhoods
To describe the environment of a neighbourhood as accurately as possible, social science researchers have defined six environment variables with a limited number of values for each one. These six variables are the _building type_, the _building usage_, the _landscape_, the _social class_, the _morphological position_ and the _geographical position_. As an example, the _landscape_ can be evaluated as _urban_, _green areas_, _forest_ or _countryside_ while the _social class_ takes values from _lower_ to _upper_. These variables are commonly accepted and easily understandable.
## Predicting neighbourhoods
There are four main steps: producing supervised neighbourhoods, collecting data about neighbourhoods, computing datasets and finally running algorithms to predict the environment.
The first step, producing supervised neighbourhoods, is a manual task that has been done by social science researchers. This task consists of giving a value to each environment variable. This has been done by investigating Google Street View (pictures of buildings and streets, parked cars, facilities and green areas) and requires between one and two hours for a single neighbourhood. A total of 300 neighbourhoods have been annotated.
The second step is about collecting data that represents neighbourhoods. There are mainly two types of data:
After gathering data, the next step is to assess some neighbourhoods because of the supervised learning approach. This manual assessment has been carried out by social science researchers. This has been done by investigating Google Street View (pictures of buildings and streets, parked cars, facilities and green areas) and requires between one and two hours for a single neighbourhood. A total of 300 IRIS have been annotated, which will be used as training data. - The geometry, stored as a GeoJSON object, which describes the shape of the neighbourhood.
- The indicators which quantify the environment. Each neighbourhood can be described by thousands of indicators, such as the number of restaurants, the average income or even the number of houses over 250 $m^2$. Even if it is not possible to manually exploit these indicators, they are useful in an automatic approach.
In order to unify the view between assessed neighbourhoods and their indicators, datasets have been constructed. They look like Figure 1 and are composed of the code INSEE of the neighbourhood (grey column), its indicators (yellow columns) that have been normalized by density of population (green column) and the assessment of social science researchers for the six environment variables (blue columns). Our approach Predihood aims at automatically filling question marks for neighbourhoods that are not yet assessed. Predihood integrates such data for France by using [mongiris](https://gitlab.liris.cnrs.fr/fduchate/mongiris), an interface for querying French administrative areas. Predihood is instantiated only for French data, but this can be easily extended to other countries.
The third step aims at computing datasets that aggregate the aforementioned data. A dataset looks like Figure 1 and is composed of the INSEE code of the neighbourhood (grey column), its indicators (yellow columns), which have been normalized by population density (green column), and the assessment by social science researchers of the six environment variables (blue columns). As a reminder, Predihood aims at automatically filling in the question marks for neighbourhoods that are not yet assessed.
![An example of the computed dataset.](predihood-indicators.png) ![An example of the computed dataset.](predihood-indicators.png)
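The normalization step can be illustrated with a minimal sketch. It assumes that indicators are divided by the population count of the neighbourhood, as the `population` normalization branch of `MethodPrediction` in this commit does. The function name `normalize_by_population` and the choice of `0.0` for missing indicators or empty neighbourhoods are assumptions made for illustration, not Predihood's actual behaviour.

```python
def normalize_by_population(raw_indicators, selected_indicators, population):
    """Return the selected indicators divided by the population count.

    raw_indicators: dict mapping indicator names to raw values (numbers or numeric strings).
    selected_indicators: list of indicator names to keep, in order.
    population: number of inhabitants of the neighbourhood.
    """
    values = []
    for name in selected_indicators:
        if name in raw_indicators and population > 0:
            values.append(float(raw_indicators[name]) / population)
        else:
            # Assumption: a missing indicator or an empty neighbourhood yields 0.0.
            values.append(0.0)
    return values


# Hypothetical neighbourhood with 2000 inhabitants.
row = normalize_by_population({"bakeries": "4", "schools": 2}, ["bakeries", "schools", "parks"], 2000)
```

Dividing by the population makes indicators comparable between small and large neighbourhoods, which is the purpose of the green column in Figure 1.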
It is now possible to predict the environment of any neighbourhood in France using our unified dataset. Because neighbourhoods are represented by hundreds of indicators, a selection process selects subsets of relevant indicators. These subsets are called _lists_ and contain from 10 to 100 indicators. They are used in the Predihood interface to predict environment. The last step predicts the environment of any neighbourhood in France. Because neighbourhoods are represented by hundreds of indicators, a selection process selects subsets of relevant indicators. These subsets are called _lists_ and contain from 10 to 100 indicators. They are used in the Predihood interface to predict environment.
Predihood proposes a generic interface for tuning algorithms more easily. This interface is based on [Scikit-learn](https://scikit-learn.org/stable/) algorithms but can handle hand-made ones. To implement your own algorithm and test it on our dataset, follow these steps:
## Algorithmic interface
Because the prediction of these variables is a complex task, several algorithms have to be tested to compare results. To make the algorithms easier to tune and use, Predihood proposes a generic and easy-to-use interface for algorithms. This interface is based on [Scikit-learn](https://scikit-learn.org/stable/) algorithms but can handle hand-made ones. To implement your own algorithm and test it on our dataset, follow these steps:
1. Create a new class that represents your algorithm, e.g. `MyOwnClassifier` and inherits from `Classifier`. 1. Create a new class that represents your algorithm, e.g. `MyOwnClassifier`, and inherits from `Classifier`.
2. Then, implement the core of your algorithm by coding `fit()` and `predict()` functions. The `fit` function aims at fitting your classifier on assessed neighbourhoods while the `predict` function aims at predicting environment variables for a given neighbourhood. 2. Implement the core of your algorithm by coding `fit()` and `predict()` functions. The `fit` function aims at fitting your classifier on assessed neighbourhoods while the `predict` function aims at predicting environment variables for a given neighbourhood.
3. Next, add `get_params()` to be compatible with Scikit-learn framework. 3. Add `get_params()` to be compatible with Scikit-learn framework.
5. Finally, do not forget to comment your classifier with the Numpy style if you want to tune it. 5. Comment your classifier with the Numpy style in order to be able to tune it in the interface.
Below, there is a very simple example to illustrate the aforementioned steps. Below is a very simple example to illustrate the aforementioned steps.
```python ```python
# file ./algorithms/MyOwnClassifier.py # file ./algorithms/MyOwnClassifier.py
...@@ -61,7 +78,7 @@ from predihood.classes.Classifier import Classifier ...@@ -61,7 +78,7 @@ from predihood.classes.Classifier import Classifier
class MyOwnClassifier(Classifier): class MyOwnClassifier(Classifier):
"""Some text. """Description of the classifier.
Parameters Parameters
------------ ------------
a : float, default=0.01 a : float, default=0.01
...@@ -87,18 +104,22 @@ class MyOwnClassifier(Classifier): ...@@ -87,18 +104,22 @@ class MyOwnClassifier(Classifier):
After that, your algorithm is ready to be used in Predihood. After that, your algorithm is ready to be used in Predihood.
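As a complement, here is a self-contained sketch of what such a classifier could look like. It is not taken from Predihood: the class `MajorityClassifier`, its `fallback` parameter and the `fit(X, y)` / `predict(X)` signatures (borrowed from the Scikit-learn convention) are assumptions. To keep the sketch runnable without Predihood installed, the class does not inherit from `Classifier`; inside Predihood it would, as described in step 1.

```python
from collections import Counter


class MajorityClassifier:
    """Predict the most frequent label seen during training.

    Parameters
    ----------
    fallback : str, default="unknown"
        Label returned when the classifier has not been fitted on any label.
    """

    def __init__(self, fallback="unknown"):
        self.fallback = fallback
        self.majority_ = None

    def fit(self, X, y):
        # Remember the most frequent environment value among assessed neighbourhoods.
        if len(y) > 0:
            self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # Return the same label for every neighbourhood to predict.
        label = self.majority_ if self.majority_ is not None else self.fallback
        return [label for _ in X]

    def get_params(self, deep=True):
        # Expose constructor parameters, following the Scikit-learn convention.
        return {"fallback": self.fallback}


clf = MajorityClassifier().fit([[1], [2], [3]], ["urban", "urban", "forest"])
predicted = clf.predict([[4], [5]])
```

Such a baseline is only useful as a reference point: any real classifier should reach a higher accuracy than always predicting the majority value.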
# Mentions of Predihood Figure 2 shows the generic interface of Predihood for tuning algorithms. The left panel lets you tune parameters and hyperparameters, such as training and test sizes. On the right, the table shows the accuracies obtained for each list (generated during the selection process) and each environment variable. You can export these results by clicking on the download icon.
Our approach Predihood has been presented during the DATA conference [@barretpredicting].
This first screenshot shows the generic interface of Predihood for tuning algorithms. The left panel allows to tune parameters and hyper parameters, such as training and test sizes. On the right, the table illustrates the accuracies obtained for each lists (generated during the selection process) and each environment variable. You can export these results by clicking on the download icon. ![Screenshot of algorithmic interface of Predihood](predihood-accuracies.png)
![Screenshot of algorithm interface of Predihood](predihood-accuracies.png) ## Cartographic interface
This screenshot shows the cartographic interface of Predihood, used mostly by people who are searching for a new place to live. By searching for an area in the inputs on the left and then clicking on neighbourhoods, you will be able to choose an algorithm to predict the environment variables of the chosen neighbourhood. For beginners, the `Random Forest` classifier is recommended. For example, Alice is an IT sales representative and has been recruited for a mission in Lyon for 6 months before going back to Paris. She easily compares many neighbourhoods in the CBD (Central Business District) of Lyon and chooses the "Part-Dieu" neighbourhood. Figure 3 shows the cartographic interface of Predihood, used mostly by people who are searching for a new place to live. By searching for an area in the inputs on the left and then clicking on neighbourhoods, you will be able to choose an algorithm to predict the environment variables of the chosen neighbourhood. For beginners, the `Random Forest` classifier is recommended. For example, Alice is an IT sales representative and has been recruited for a mission in Lyon for 6 months before going back to Paris. She easily compares many neighbourhoods in the CBD (Central Business District) of Lyon and chooses the "Part-Dieu" neighbourhood.
![Screenshot of the cartographic interface of Predihood](predihood-predictions.png) ![Screenshot of the cartographic interface of Predihood](predihood-predictions.png)
# Mentions of Predihood
Our approach Predihood has been presented during the DATA conference [@barretpredicting].
Results vary from 30% to 65% depending on the environment variable, but proposing new algorithms can help to improve these results.
# Acknowledgements # Acknowledgements
This work has been partially funded by LABEX IMU (ANR-10-LABX-0088) from Université de Lyon, in the context of the program "Investissements d'Avenir" (ANR-11-IDEX-0007) from the French Research Agency (ANR). This work has been partially funded by LABEX IMU (ANR-10-LABX-0088) from Université de Lyon, in the context of the program "Investissements d'Avenir" (ANR-11-IDEX-0007) from the French Research Agency (ANR).
......
predihood-indicators.png (image replaced: 560 KiB before, 503 KiB after)
...@@ -54,7 +54,6 @@ class MethodPrediction(Method): ...@@ -54,7 +54,6 @@ class MethodPrediction(Method):
iris_indicators_names.append(indicator) iris_indicators_names.append(indicator)
elif self.dataset.normalization == "population": elif self.dataset.normalization == "population":
for indicator in self.dataset.selected_indicators: for indicator in self.dataset.selected_indicators:
# if indicator == "P14_POP": continue # skip this indicator because of normalisation # TODO
if indicator in iris_object["properties"]["raw_indicators"] and iris_population > 0: if indicator in iris_object["properties"]["raw_indicators"] and iris_population > 0:
iris_indicators_values.append(float(iris_object["properties"]["raw_indicators"][indicator]) / iris_population) iris_indicators_values.append(float(iris_object["properties"]["raw_indicators"][indicator]) / iris_population)
else: else:
......
...@@ -39,7 +39,6 @@ def index(page): ...@@ -39,7 +39,6 @@ def index(page):
predihood.config.PREFERRED_LANGUAGE = request.args["lang"] predihood.config.PREFERRED_LANGUAGE = request.args["lang"]
else: else:
predihood.config.PREFERRED_LANGUAGE = "english" predihood.config.PREFERRED_LANGUAGE = "english"
print(predihood.config.PREFERRED_LANGUAGE)
return render_template("index.html", language=predihood.config.PREFERRED_LANGUAGE) return render_template("index.html", language=predihood.config.PREFERRED_LANGUAGE)
......
...@@ -129,18 +129,16 @@ def predict_one_iris(iris_code, data, clf, train_size, test_size, remove_outlier ...@@ -129,18 +129,16 @@ def predict_one_iris(iris_code, data, clf, train_size, test_size, remove_outlier
algorithm = MethodPrediction(name='', dataset=dataset, classifier=clf) algorithm = MethodPrediction(name='', dataset=dataset, classifier=clf)
algorithm.fit() algorithm.fit()
algorithm.predict(iris_code) algorithm.predict(iris_code)
print(predihood.config.PREFERRED_LANGUAGE)
if predihood.config.PREFERRED_LANGUAGE == "french": if predihood.config.PREFERRED_LANGUAGE == "french":
predicted_value = list(ENVIRONMENT_VALUES[env].keys())[list(ENVIRONMENT_VALUES[env].values()).index(algorithm.prediction)] # get french translation of the predicted value predicted_value = list(ENVIRONMENT_VALUES[env].keys())[list(ENVIRONMENT_VALUES[env].values()).index(algorithm.prediction)] # get french translation of the predicted value
else: else:
predicted_value = algorithm.prediction predicted_value = algorithm.prediction
predictions_lst.append(predicted_value) predictions_lst.append(predicted_value)
print(predictions_lst)
if predihood.config.PREFERRED_LANGUAGE == "french": if predihood.config.PREFERRED_LANGUAGE == "french":
predictions[ENVIRONMENT_VARIABLES_FR[env]] = get_most_frequent(predictions_lst) predictions[ENVIRONMENT_VARIABLES_FR[env]] = get_most_frequent(predictions_lst)
else: else:
predictions[env] = get_most_frequent(predictions_lst) # get the most frequent value and the number of occurrences predictions[env] = get_most_frequent(predictions_lst) # get the most frequent value and the number of occurrences
print(predictions) # TODO: give an example of the dictionary print(predictions) # {'building_type': {'most_frequent': 'Towers', 'count_frequent': 7}, 'building_usage': {'most_frequent': 'Housing', 'count_frequent': 4}, ... }
return predictions return predictions
......
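The dictionary shown in the comment above suggests that `get_most_frequent` returns the winning value together with its number of occurrences. A minimal sketch of such a helper follows. The name `most_frequent_sketch` and the handling of an empty list are assumptions, and the real implementation in Predihood may differ.

```python
from collections import Counter


def most_frequent_sketch(values):
    """Return the most frequent value of a list and how many times it occurs."""
    if not values:
        # Assumption: an empty list of predictions yields an empty result.
        return {"most_frequent": None, "count_frequent": 0}
    value, count = Counter(values).most_common(1)[0]
    return {"most_frequent": value, "count_frequent": count}


# Predictions obtained for one environment variable with several lists of indicators.
summary = most_frequent_sketch(["Towers", "Houses", "Towers"])
```

Keeping the count alongside the value lets the interface show how strongly the different lists of indicators agree on a prediction.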
...@@ -5,8 +5,10 @@ let generalParameters = ["class_weight", "cv", "kernel", "max_iter", "memory", " ...@@ -5,8 +5,10 @@ let generalParameters = ["class_weight", "cv", "kernel", "max_iter", "memory", "
let trainPercentage = $("#trainPercentage").val(); // to update test percentage depending on train percentage let trainPercentage = $("#trainPercentage").val(); // to update test percentage depending on train percentage
let testPercentage = $("#testPercentage").val(); // to update train percentage depending on test percentage let testPercentage = $("#testPercentage").val(); // to update train percentage depending on test percentage
let request_run = null; // the request send to the server (with classifier and its parameters) let request_run = null; // the request send to the server (with classifier and its parameters)
const MAX_PARAMETERS = 5; const MAX_PARAMETERS = 2;
let preferred_language_algo = get_preferred_language(); let preferred_language_algo = get_preferred_language();
// get parameters of the selected classifier and display them in the interface. // get parameters of the selected classifier and display them in the interface.
$("#selectAlgorithm").change(function () { $("#selectAlgorithm").change(function () {
let algorithm_name = $(this).children("option:selected").val(); let algorithm_name = $(this).children("option:selected").val();
...@@ -54,166 +56,156 @@ $("#testPercentage") ...@@ -54,166 +56,156 @@ $("#testPercentage")
}); // prevent user from clicking on the input }); // prevent user from clicking on the input
// run the classifier with specified parameters and display results in the results section. // run the classifier with specified parameters and display results in the results section.
$("#runBtn") $("#runBtn").click("on", function () {
.click("on", function () { $("body").css("cursor", "progress");
$("body").css("cursor", "progress"); $(".wrapperTable input[type='checkbox']:not(:checked)").each(function () {
$(".wrapperTable input[type='checkbox']:not(:checked)").each(function () { $(this).parent().parent().empty(); // remove tables that are not checked in the interface
$(this).parent().parent().empty(); // remove tables that are not checked in the interface });
}); let userParameters = {};
let userParameters = {}; let chosen_clf = $("#formAlgorithm")[0].elements[0].value;
let chosen_clf = $("#formAlgorithm")[0].elements[0].value; for (let key in $("#formParameters")[0].elements) {
for (let key in $("#formParameters")[0].elements) { let elem = $("#formParameters")[0].elements[key];
let elem = $("#formParameters")[0].elements[key]; if (parseInt(key) === undefined || isNaN(parseInt(key))) {
if (parseInt(key) === undefined || isNaN(parseInt(key))) { continue;
continue; } // key is not an element of the form
} // key is not an element of the form // console.log(elem.title + " : " + elem.value + " / " + current_parameters[elem.title]["default"]);
// console.log(elem.title + " : " + elem.value + " / " + current_parameters[elem.title]["default"]); if (elem.value !== current_parameters[elem.title]["default"] && elem.value !== "") { // get only parameters filled by user
if (elem.value !== current_parameters[elem.title]["default"] && elem.value !== "") { // get only parameters filled by user let label_name = elem.title;
let label_name = elem.title; let val = elem.value;
let val = elem.value; if (elem.type === "text") { // input with text type
if (elem.type === "text") { // input with text type if (elem.title.includes("int") && parseInt(elem.value)) {
if (elem.title.includes("int") && parseInt(elem.value)) { val = parseInt(elem.value)
val = parseInt(elem.value) } else if (elem.title.includes("float") && parseFloat(elem.value)) {
} else if (elem.title.includes("float") && parseFloat(elem.value)) { val = parseFloat(elem.value)
val = parseFloat(elem.value)
}
userParameters[label_name] = val;
} else if (elem.type === "number") { // input with number type
userParameters[label_name] = parseFloat(elem.value);
} else if (elem.type === "checkbox") { // input with checkbox type
userParameters[label_name] = elem.checked;
} }
userParameters[label_name] = val;
} else if (elem.type === "number") { // input with number type
userParameters[label_name] = parseFloat(elem.value);
} else if (elem.type === "checkbox") { // input with checkbox type
userParameters[label_name] = elem.checked;
} }
} }
userParameters["train_size"] = $("#trainPercentage")[0].valueAsNumber; }
userParameters["test_size"] = $("#testPercentage")[0].valueAsNumber; userParameters["train_size"] = $("#trainPercentage")[0].valueAsNumber;
userParameters["remove_outliers"] = $("#removeOutliers").prop("checked"); userParameters["test_size"] = $("#testPercentage")[0].valueAsNumber;
userParameters["remove_rural"] = $("#removeRural").prop("checked"); userParameters["remove_outliers"] = $("#removeOutliers").prop("checked");
console.log(chosen_clf); userParameters["remove_rural"] = $("#removeRural").prop("checked");
console.log(userParameters); console.log(chosen_clf);
request_run = $.ajax({ console.log(userParameters);
"type": "GET", request_run = $.ajax({
"url": "/run", "type": "GET",
//"async": false, "url": "/run",
data: { "async": false,
"clf": chosen_clf, data: {
"parameters": JSON.stringify(userParameters) "clf": chosen_clf,
}, "parameters": JSON.stringify(userParameters)
success: function (result) { },
// each result is displayed with : success: function (result) {
// - a checkbox to keep the results available in the next run // each result is displayed with :
// - the results table, highlighted cells are best means for each EV // - a checkbox to keep the results available in the next run
// - the list of parameters associated to the results // - the results table, highlighted cells are best means for each EV
// - the mean for this classifier (all EV combined) // - the list of parameters associated to the results
let keep = $("<label class='h5'><input type='checkbox' style='margin-right: 1rem;'/>" + chosen_clf + "</label>"); // - the mean for this classifier (all EV combined)
let table = $("<table id='tableToExport'>").addClass("table table-hover table-responsive").append($("<tbody>")); let keep = $("<label class='h5'><input type='checkbox' style='margin-right: 1rem;'/>" + chosen_clf + "</label>");
let results = result["results"] let table = $("<table id='tableToExport'>")
.addClass("table table-hover table-responsive")
.append($("<tbody>"));
let results = result["results"]
// header of table: None, 10, 20, ..., 100, Mean // header of table: None, 10, 20, ..., 100, Mean
let header = $("<tr>") let header = $("<tr>")
header.append("<th></th>"); header.append("<th></th>");
if(preferred_language_algo === "french") { if(preferred_language_algo === "french") {
header.append("<th title='Précision obtenue avec tous les indicateurs'><i>I</i></th>") header.append("<th title='Précision obtenue avec tous les indicateurs'><i>I</i></th>")
for (let key of result["tops_k"]) { header.append("<th title='Précision obtenue avec la liste de "+key+" indicateurs'>" + key + "</th>") } // adding header with tops-k for (let key of result["tops_k"]) { header.append("<th title='Précision obtenue avec la liste de "+key+" indicateurs'>" + key + "</th>") } // adding header with tops-k
header.append("<th title=\"Précision moyenne obtenue pour la variable d'environnement\">Moyenne</th>") header.append("<th title=\"Précision moyenne obtenue pour la variable d'environnement\">Moyenne</th>")
} else { } else {
header.append("<th title='Accuracy obtained with all indicators'>I</th>") header.append("<th title='Accuracy obtained with all indicators'>I</th>")
for (let key of result["tops_k"]) { header.append("<th title='Accuracy obtained by list with "+key+" indicators'>" + key + "</th>") } // adding header with tops-k for (let key of result["tops_k"]) { header.append("<th title='Accuracy obtained by list with "+key+" indicators'>" + key + "</th>") } // adding header with tops-k
header.append("<th title='Mean accuracy for the environment variable'>Mean</th>") header.append("<th title='Mean accuracy for the environment variable'>Mean</th>")
}
table.append(header);
console.log(results)
// content of table with computed accuracies
for (let key in results) { // iterating over env variables
let row = $("<tr>");
let env = results[key];
let max = getMaxValueDict(results[key]["accuracies"], env["accuracy_none"]);
row.append("<td>" + capitalizeFirstLetter(key.split("_").join(" ")) + "</td>")
let col = $("<td>").text(env["accuracy_none"].toFixed(2) + "%")
if (env["accuracy_none"] === max) {
col.css("background-color", "#71dd8a")
} }
table.append(header); row.append(col);
console.log(results)
// content of table with computed accuracies
for (let key in results) { // iterating over env variables
let row = $("<tr>");
console.log(key)
console.log(typeof(results[key]))
let env = capitalizeFirstLetter(results[key].split("_").join(" "));
let max = getMaxValueDict(env["accuracies"], env["accuracy_none"]);
row.append("<td>" + key + "</td>")
let col = $("<td>").text(env["accuracy_none"].toFixed(2) + "%") for (let top_k in env["accuracies"]) { // iterating over top-k for each EV
if (env["accuracy_none"] === max) { let col = $("<td>").text(env["accuracies"][top_k].toFixed(2) + "%");
col.css("background-color", "#71dd8a") if (env["accuracies"][top_k] === max) {
col.css("background-color", "#71dd8a");
} }
row.append(col); row.append(col);
for (let top_k in env["accuracies"]) { // iterating over top-k for each EV
let col = $("<td>").text(env["accuracies"][top_k].toFixed(2) + "%");
if (env["accuracies"][top_k] === max) {
col.css("background-color", "#71dd8a");
}
row.append(col);
}
row.append("<td>" + env["mean"].toFixed(2) + "%</td>");
table.append(row)
} }
row.append("<td>" + env["mean"].toFixed(2) + "%</td>");
table.append(row)
}
// download icon // download icon
let download; let download;
if(preferred_language_algo === "french") { if(preferred_language_algo === "french") {
download = $("<i class='fas fa-download' style='margin-left: 1rem;' title='Exporter cette table comme un fichier Excel.'></i>") download = $("<i class='fas fa-download' style='margin-left: 1rem;' title='Exporter cette table comme un fichier Excel.'></i>")
} else { } else {
download = $("<i class='fas fa-download' style='margin-left: 1rem;' title='Export this table as an Excel file.'></i>") download = $("<i class='fas fa-download' style='margin-left: 1rem;' title='Export this table as an Excel file.'></i>")
} }
download.on("click", function (e) { download.on("click", function (e) {
e.preventDefault(); e.preventDefault();
console.log($(this)) $("#tableToExport").table2excel({
console.log($(this)[0].nextElementSibling) type: 'xls',
console.log($(this)[0].nextSibling) filename: chosen_clf + '.xls',
console.log($("#tableToExport")) preserveColors: true
$("#tableToExport").table2excel({
type: 'xls',
filename: chosen_clf + '.xls',
preserveColors: true
});
}); });
let containing_table = $("<div>").prop("class", "wrapperTable"); });
containing_table.append(keep).append(download).append(table); let containing_table = $("<div>").prop("class", "wrapperTable");
containing_table.append(keep).append(download).append(table);
// list of parameters used to have the current results
let params = "";
for (let elem in current_parameters) {
if (elem in userParameters) {
params += "<i>" + elem + "</i>: " + userParameters[elem] + " ; "; // adding user value
} else {
params += "<i>" + elem + "</i>: " + current_parameters[elem]["default"] + " ; ";
} // adding default value
}
containing_table.append(params);
// Mean accuracy for this classifier // list of parameters used to have the current results
let mean_clf = 0; let params = "";
for (let env in results) { for (let elem in current_parameters) {
mean_clf += results[env]["mean"]; if (elem in userParameters) {
} params += "<i>" + elem + "</i>: " + userParameters[elem] + " ; "; // adding user value
mean_clf /= Object.keys(results).length;
if(preferred_language_algo === "french") {
containing_table.append("<br/> <b>Moyenne de cet algorithme : </b>" + mean_clf.toFixed(2) + "%");
} else { } else {
containing_table.append("<br/> <b>Mean for this classifier: </b>" + mean_clf.toFixed(2) + "%"); params += "<i>" + elem + "</i>: " + current_parameters[elem]["default"] + " ; ";
} } // adding default value
}
containing_table.append(params);
// append all to HTML // Mean accuracy for this classifier
$("#resultsDiv").append(containing_table); let mean_clf = 0;
$("body").css("cursor", "default"); for (let env in results) {
}, mean_clf += results[env]["mean"];
error: function (result, textStatus, errorThrown) { }
console.log(errorThrown); mean_clf /= Object.keys(results).length;
alert("something went wrong while training. Please check your parameters<br>" + textStatus); if(preferred_language_algo === "french") {
$("body").css("cursor", "default"); containing_table.append("<br/> <b>Moyenne de cet algorithme : </b>" + mean_clf.toFixed(2) + "%");
} else {
containing_table.append("<br/> <b>Mean for this classifier: </b>" + mean_clf.toFixed(2) + "%");
} }
});
return false; // don't reload // append all to HTML
$("#resultsDiv").append(containing_table);
$("body").css("cursor", "default");
},
error: function (result, textStatus, errorThrown) {
console.log(errorThrown);
alert("something went wrong while training. Please check your parameters<br>" + textStatus);
$("body").css("cursor", "default");
}
}); });
$("#abortRun").on("click", function () { return false; // do not reload
request_run.abort(); // TODO: abort also on Flask
alert("Request have been aborted");
}); });
// empty results when clicking on the trash icon
$("#clearResults").on("click", function () { $("#clearResults").on("click", function () {
// empty div and add title + "clear all" button // empty div and add title + "clear all" button
$("#resultsDiv") $("#resultsDiv")
...@@ -223,11 +215,11 @@ $("#clearResults").on("click", function () { ...@@ -223,11 +215,11 @@ $("#clearResults").on("click", function () {
/** /**
* Adds the parameter in the interface, with label and input. * Adds the parameter in the interface, with label and input.
* @param {string} label The name of the parameter. * @param {string} label The name of the parameter.
* @param {string} content The value of the parameter (default value). * @param {string} content The value of the parameter (default value).
* @param {string} type The type of the input (i.e. str, int, float, bool or None). * @param {string} type The type of the input (i.e. str, int, float, bool or None).
* @param {string} description The description of the parameter. It corresponds to the first sentence in the doc (sklearn) for the parameter. * @param {string} description The description of the parameter. It corresponds to the first sentence in the doc (sklearn) for the parameter.
* @param {boolean} hidden A boolean that indicates if the field is hidden or not (because we display only 5 parameters by default). * @param {boolean} hidden A boolean that indicates if the field is hidden or not (because we display only 5 parameters by default).
*/ */
function addElement(label, content, type, description, hidden) { function addElement(label, content, type, description, hidden) {
// adds something like : // adds something like :
......
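The table-building code above highlights, for each environment variable (EV), the cell holding the best accuracy, then reports a per-classifier mean as the average of the per-EV means. A minimal Python sketch of those two computations follows. It assumes the `results` layout read by the JavaScript (`accuracy_none`, `accuracies`, `mean`). The helper names `best_accuracy` and `classifier_mean` and the sample numbers are hypothetical, and `best_accuracy` only guesses at what `getMaxValueDict` returns, since its body is not shown in this diff.

```python
def best_accuracy(env):
    """Highest accuracy for one EV, over the all-indicators run and every top-k list.

    Assumption: this mirrors what getMaxValueDict computes; its body is not shown here.
    """
    return max([env["accuracy_none"], *env["accuracies"].values()])


def classifier_mean(results):
    """Average of the per-EV means, as in the `mean_clf` loop."""
    return sum(env["mean"] for env in results.values()) / len(results)


# Made-up numbers, for illustration only ("other_variable" is a placeholder name).
sample_results = {
    "building_type": {"accuracy_none": 50.0, "accuracies": {"10": 60.0, "20": 40.0}, "mean": 50.0},
    "other_variable": {"accuracy_none": 70.0, "accuracies": {"10": 65.0, "20": 75.0}, "mean": 70.0},
}

if __name__ == "__main__":
    for name, env in sample_results.items():
        print(name, best_accuracy(env))
    print("classifier mean:", classifier_mean(sample_results))
```

Note that the JavaScript compares floats with `===` to decide which cell to highlight, so several cells are highlighted when accuracies tie.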
...@@ -40,9 +40,9 @@ function initialize() { ...@@ -40,9 +40,9 @@ function initialize() {
} }
/* /**
** Event for zoom changes : updates a label and if zoom enabled and above min zoom level, display iris * Event for zoom changes : updates a label and if zoom enabled and above min zoom level, display iris
*/ */
function zoomendEvent() { function zoomendEvent() {
zoomLevel = map.getZoom(); zoomLevel = map.getZoom();
document.getElementById("spanZoomLevel").innerHTML = zoomLevel; document.getElementById("spanZoomLevel").innerHTML = zoomLevel;
...@@ -56,25 +56,15 @@ function zoomendEvent() { ...@@ -56,25 +56,15 @@ function zoomendEvent() {
} }
} }
/* /**
** Method for deleting a layer (e.g., all iris in irisLayer) * Method for deleting a layer (e.g., all iris in irisLayer)
*/ */
function removeLayer() { function removeLayer() {
map.removeLayer(irisLayer); map.removeLayer(irisLayer);
irisLayer = null; irisLayer = null;
$("#zoneMessages").html(""); $("#zoneMessages").html("");
} }
/**
* Reset the style of all highlighted layer elements
*/
function resetHighlightAll() {
if(irisLayer != null) {
$.each(irisLayer["layers"], function(key, value) {
irisLayer["layers"].resetStyle(key);
});
}
}
/** /**
* Display popup with several information about the neighbourhood (descriptive information and environment variables) when clicking on it. * Display popup with several information about the neighbourhood (descriptive information and environment variables) when clicking on it.
...@@ -221,7 +211,7 @@ function addLayerFromGeoJSON(geojson, events, style, typeMethod){ ...@@ -221,7 +211,7 @@ function addLayerFromGeoJSON(geojson, events, style, typeMethod){
*/ */
function eventsIRIS(feature, layer) { function eventsIRIS(feature, layer) {
layer.on({ layer.on({
mouseover: highlightFeature, // mouseover: highlightFeature,
//mouseout: resetHighlight, //mouseout: resetHighlight,
click: displayPopup //showPredictions click: displayPopup //showPredictions
}); });
......
/**
* Send an AJAX request to get predictions for the given IRIS.
* @param iris_code a string containing the code of the IRIS to predict.
* @param algorithm_name a string containing the name of the algorithm used to predict environment.
* @returns {} a dictionary containing results of predictions, i.e. a value and a score for each EV.
*/
function predict(iris_code, algorithm_name) { function predict(iris_code, algorithm_name) {
$("body").css("cursor", "progress"); $("body").css("cursor", "progress");
let predictions = null; let predictions = null;
......
...@@ -173,6 +173,10 @@ function parseFloatComplex(str) { ...@@ -173,6 +173,10 @@ function parseFloatComplex(str) {
} }
} }
/**
* Get a list containing the environment variables.
* @returns {[]} a list containing the environment variables' names
*/
function getEnvironmentVariables() { function getEnvironmentVariables() {
let env_var = null; let env_var = null;
$.ajax({ $.ajax({
...@@ -190,17 +194,10 @@ function getEnvironmentVariables() { ...@@ -190,17 +194,10 @@ function getEnvironmentVariables() {
return env_var; return env_var;
} }
/**
function highlightFeature(e) { * Get the preferred language that has been selected by the user.
var layer = e.target; * @return {string} a string containing the name of the preferred language.
layer.setStyle({ */
weight: 1,
color: '#666',
fillOpacity: 0.25
});
}
function get_preferred_language() { function get_preferred_language() {
let chosen_language = undefined; let chosen_language = undefined;
......
...@@ -10,7 +10,7 @@ ...@@ -10,7 +10,7 @@
{% else %} {% else %}
<p class="text-gray font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Choose an algorithm among the list below.">Select an algorithm</p> <p class="text-gray font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Choose an algorithm among the list below.">Select an algorithm</p>
{% endif %} {% endif %}
<form id="formAlgorithm" class="px-3 small pb-4 mb-0"> <form id="formAlgorithm" class="px-3 small">
<select id="selectAlgorithm"> <select id="selectAlgorithm">
{% if language == "french" %} {% if language == "french" %}
<option selected value="Algorithme"> -- choisir un algorithme --</option> <option selected value="Algorithme"> -- choisir un algorithme --</option>
...@@ -24,9 +24,9 @@ ...@@ -24,9 +24,9 @@
<hr/> <hr/>
{% if language == "french" %} {% if language == "french" %}
<p class="font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Paramétrer l'algorithme choisi.">Paramétrer l'algorithme</p> <p class="font-weight-bold text-uppercase px-3 small" title="Paramétrer l'algorithme choisi.">Paramétrer l'algorithme</p>
{% else %} {% else %}
<p class="font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Tune the selected algorithm.">Tune algorithm</p> <p class="font-weight-bold text-uppercase px-3 small" title="Tune the selected algorithm.">Tune algorithm</p>
{% endif %} {% endif %}
<form id="formParameters"> <form id="formParameters">
<div class="col-12" id="divParameters"> <div class="col-12" id="divParameters">
...@@ -45,9 +45,9 @@ ...@@ -45,9 +45,9 @@
</form> </form>
{% if language == "french" %} {% if language == "french" %}
<p class="font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Paramétrer la répartition entre les jeux d'apprentissage et de test." style="padding-top: 1rem; padding-bottom: 0; margin-bottom: 0">Jeux de données</p> <p class="font-weight-bold text-uppercase px-3 small" title="Paramétrer la répartition entre les jeux d'apprentissage et de test." style="padding-top: 1rem; padding-bottom: 0; margin-bottom: 0">Jeux de données</p>
{% else %} {% else %}
<p class="font-weight-bold text-uppercase px-3 small pb-4 mb-0" title="Tune the repartition of the data into train and test sets." style="padding-top: 1rem; padding-bottom: 0; margin-bottom: 0">Tune dataset</p> <p class="font-weight-bold text-uppercase px-3 small" title="Tune the repartition of the data into train and test sets." style="padding-top: 1rem; padding-bottom: 0; margin-bottom: 0">Tune dataset</p>
{% endif %} {% endif %}
<ul class="nav flex-column bg-white mb-0"> <ul class="nav flex-column bg-white mb-0">
<li class="nav-item"> <li class="nav-item">
...@@ -108,7 +108,6 @@ ...@@ -108,7 +108,6 @@
Train, test and evaluate Train, test and evaluate
</button> </button>
{% endif %} {% endif %}
<!--<button id="abortRun" class="btn btn-danger" title="Arbort the current request">Abort</button>-->
</div> </div>
{% include 'footer.html' %} {% include 'footer.html' %}
</aside> </aside>
......
<header class="mb-3"> <header class="mb-3">
<h1><a href="/"><img src="{{url_for('static', filename='img/favicon.png')}}"></a>&emsp;predihood</h1> <h1><a href="/"><img src="{{url_for('static', filename='img/favicon.png')}}"></a>&emsp;predihood</h1>
{% if language == "french" %}
<em>Un outil de visualisation des IRIS</em>
{% else %}
<em>A tool for visualizing IRIS</em>
{% endif %}
<hr> <hr>
</header> </header>
\ No newline at end of file
...@@ -3,9 +3,11 @@ ...@@ -3,9 +3,11 @@
# ============================================================================= # =============================================================================
# Unit tests for predihood. # Unit tests for predihood.
# ============================================================================= # =============================================================================
import os
import pandas as pd
import unittest import unittest
from predihood.config import FOLDER_DATASETS, ENVIRONMENT_VALUES
from predihood.utility_functions import check_train_test_percentages, intersection, union, similarity, \ from predihood.utility_functions import check_train_test_percentages, intersection, union, similarity, \
get_most_frequent, address_to_code, address_to_city, indicator_full_to_short_label, \ get_most_frequent, address_to_code, address_to_city, indicator_full_to_short_label, \
indicator_short_to_full_label, get_classifier, set_classifier, signature, add_assessment_to_file indicator_short_to_full_label, get_classifier, set_classifier, signature, add_assessment_to_file
...@@ -47,9 +49,6 @@ class TestCase(unittest.TestCase): ...@@ -47,9 +49,6 @@ class TestCase(unittest.TestCase):
full_label = indicator_short_to_full_label(short_label) full_label = indicator_short_to_full_label(short_label)
assert full_label == "Pop 11-17 ans en 2014 (princ)" assert full_label == "Pop 11-17 ans en 2014 (princ)"
def test_hierarchy(self):
assert True == True # TODO
def test_get_classifier(self): def test_get_classifier(self):
# test if selecting a classifier gives the correct object # test if selecting a classifier gives the correct object
classifier_name = "KNeighbors Classifier" classifier_name = "KNeighbors Classifier"
...@@ -140,6 +139,14 @@ class TestCase(unittest.TestCase): ...@@ -140,6 +139,14 @@ class TestCase(unittest.TestCase):
result = add_assessment_to_file(code_iris, values) result = add_assessment_to_file(code_iris, values)
assert result == "iris already assessed" assert result == "iris already assessed"
def test_values_dataset(self):
# test that the values used in the dataset match those declared by the social science researchers
filename = os.path.join(FOLDER_DATASETS, "data_density.csv")
dataset = pd.read_csv(filename)
values_for_building_type = set([value for key, value in ENVIRONMENT_VALUES["building_type"].items()])
assert set(dataset["building_type"].tolist()) == values_for_building_type
if __name__ == "__main__": if __name__ == "__main__":
unittest.main(verbosity=2) # run all tests with verbose mode unittest.main(verbosity=2) # run all tests with verbose mode
......
#!/usr/bin/env python
# encoding: utf-8
# =============================================================================
# Unit tests for predihood.
# =============================================================================
import os
import pandas as pd
import unittest

from predihood.config import FOLDER_DATASETS, ENVIRONMENT_VALUES


class TestCase(unittest.TestCase):
    """
    A class for Predihood unit tests.
    """

    def test_values_dataset(self):
        # test that the values used in the dataset match those declared by the social science researchers
        filename = os.path.join(FOLDER_DATASETS, "data_density.csv")
        dataset = pd.read_csv(filename)
        values_for_building_type = set([value for key, value in ENVIRONMENT_VALUES["building_type"].items()])
        assert set(dataset["building_type"].tolist()) == values_for_building_type


if __name__ == "__main__":
    unittest.main(verbosity=2)  # run all tests with verbose mode
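The assertion in `test_values_dataset` only says whether the two sets differ. When it fails, it can help to see which values are unexpected and which are missing. A small sketch, assuming plain Python iterables rather than the real `data_density.csv` and `ENVIRONMENT_VALUES`; `diff_values` and the placeholder labels are hypothetical:

```python
def diff_values(observed, declared):
    """Return the values present on only one side of the comparison."""
    observed, declared = set(observed), set(declared)
    return {
        "unexpected": observed - declared,  # in the dataset but not declared
        "missing": declared - observed,     # declared but absent from the dataset
    }


if __name__ == "__main__":
    # Placeholder labels, not the real building types.
    declared = ["label_a", "label_b", "label_c"]
    column = ["label_a", "label_a", "label_d"]
    print(diff_values(column, declared))
```

Both sets come back empty when the dataset and the declaration agree, which is the condition the unit test asserts.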
...@@ -221,7 +221,7 @@ def signature(chosen_algorithm): ...@@ -221,7 +221,7 @@ def signature(chosen_algorithm):
try: try:
# model = eval(_chosen_algorithm) # never use eval on untrusted strings # model = eval(_chosen_algorithm) # never use eval on untrusted strings
model = get_classifier(chosen_algorithm) model = get_classifier(chosen_algorithm)
doc = model.__doc__ # TODO: specify case when there is no doc (user-implemented algorithm) doc = model.__doc__
param_section = "Parameters" param_section = "Parameters"
dashes = "-" * len(param_section) # ------- dashes = "-" * len(param_section) # -------
number_spaces = doc.find(dashes) - (doc.find(param_section) + len(param_section)) number_spaces = doc.find(dashes) - (doc.find(param_section) + len(param_section))
......
# Predihood
Predihood is an application for visualizing [IRIS](https://www.insee.fr/fr/metadonnees/definition/c1523) (administrative areas defined by the French institute of statistics, which can be considered as neighbourhoods) and the indicators that describe them (e.g. number of bakeries, average income, and even the number of houses over 250 m²).
## Statement of need
Predihood provides an interface for searching and comparing neighbourhoods.
## Installation instructions
### Requirements
- Python, version >=3
- [MongoDB](https://www.mongodb.com/), version >=4 for importing the database about neighbourhoods.
### Installation
To install Predihood, type in a terminal:
```
python3 -m pip install -e predihood/ --process-dependency-links
```
This command installs the dependencies, including [mongiris](https://gitlab.liris.cnrs.fr/fduchate/mongiris), which provides the queries on the MongoDB database containing information about neighbourhoods.
Creating this database is mandatory. To do so, run this command (from MongoDB's executables directory if needed):
```
./mongorestore --archive=/path/to/dump-iris.bin
```
where `/path/to/` is the path to the dump file of the IRIS collection (provided with the package mongiris in `mongiris/data/dump-iris.bin`).
### Run the interface
To run *Predihood*, type in a terminal:
```
python3 main.py
```
After printing some information, the terminal displays the URL for testing *Predihood*: `http://localhost:8080/`. To try the cartographic interface, click on the button "Search a neighbourhood". To configure and test your algorithm in our interface, click on the button "Tune my classifier".
## Example usage
For the cartographic interface, an example would be:
1. Type a query in the panel on the left, e.g. "Lyon". This will display all neighbourhoods that contain "Lyon" in their name or their township.
2. Click on a neighbourhood (the small areas in blue). A tooltip will appear with some information about the neighbourhood. More information is available by clicking on the "More details" link.
3. To predict the environment variables, choose a classifier. The "Random Forest" classifier is recommended by default. After a few seconds, predictions will appear in the tooltip. This helps you compare neighbourhoods with one another.
For the algorithmic interface, an example would be:
1. Choose an algorithm
## Community guidelines
## Functionality
## Tests
\ No newline at end of file