Commit 2a077a5d authored by Ludovic Moncla

Update README.md

parent a9f3a5ce
......@@ -9,19 +9,19 @@ Inscription : [https://framaforms.org/inscription-seminaire-8-decembre-traitemen
## Programme
### How can voting mechanisms improve the robustness and generalizability of toponym disambiguation?
### Harvesting geospatial information from natural language texts
Xuke Hu (German Aerospace Center)
A vast amount of geographic information exists in natural language texts, such as tweets and historical documents. Extracting geographic information from texts is called geoparsing, which comprises two subtasks: toponym recognition and toponym disambiguation, i.e., identifying the geospatial representations of toponyms. In this report, I will share our latest findings in toponym disambiguation. Specifically, we proposed a spatial clustering-based voting approach that combines several individual approaches to improve
state-of-the-art (SOTA) performance in terms of robustness and generalizability. We conducted experiments comparing a voting ensemble with 20 recent and commonly used approaches (especially deep learning-based ones) on 12 public datasets, including several highly ambiguous
and challenging datasets (e.g., WikToR and CLDW). The datasets are of six types: tweets, historical documents, news, web pages, scientific articles, and Wikipedia articles, containing in total 98,300 places across the world. The results demonstrate the generalizability
and robustness of the voting approach. The voting ensemble also drastically improves performance in resolving fine-grained places, i.e., POIs, natural features, and traffic ways.
A vast amount of geospatial information exists in natural language texts (e.g., social media posts, website texts, and historical archives) in the form of toponyms, place names, and location descriptions. Extracting geographic information from texts is called geoparsing. It benefits not only scientific studies, such as sociolinguistics and spatial humanities, but also various practical applications, such as disaster management, urban planning, and disease surveillance.
In the presentation, I will share our latest findings on the two sub-tasks of geoparsing: toponym recognition and toponym resolution. Specifically, I will introduce our proposed approaches for the two sub-tasks and compare them with numerous existing approaches across many datasets.
#### Reference
Hu, X., Sun, Y., Kersten, J., Zhou, Z., Klan, F. and Fan, H., 2022. How can voting mechanisms improve the robustness and generalizability of toponym disambiguation? arXiv preprint arXiv:2209.08286.
#### Related publications
[1] Hu, X., Al-Olimat, H.S., Kersten, J., Wiegmann, M., Klan, F., Sun, Y. and Fan, H., 2022. GazPNE: annotation-free deep learning for place name extraction from microblogs leveraging gazetteer and synthetic data by rules. International Journal of Geographical Information Science, 36(2), pp.310-337.
[2] Hu, X., Zhou, Z., Sun, Y., Kersten, J., Klan, F., Fan, H. and Wiegmann, M., 2022. GazPNE2: A general place name extractor for microblogs fusing gazetteers and pretrained transformer models. IEEE Internet of Things Journal.
[3] Hu, X., Zhou, Z., Li, H., Hu, Y., Gu, F., Kersten, J., Fan, H. and Klan, F., 2022. Location reference recognition from texts: A survey and comparison. arXiv preprint arXiv:2207.01683.
[4] Hu, X., Sun, Y., Kersten, J., Zhou, Z., Klan, F. and Fan, H., 2022. How can voting mechanisms improve the robustness and generalizability of toponym disambiguation? arXiv preprint arXiv:2209.08286.
### Crowdsourcing content for cultural geo-analytics: The case of Wikipedia
......