Mathieu Loiseau authored b38252cd
To learn more about this project, read the wiki.
wikstraktor
A python tool to query the wiktionary and extract structured lexical data.
Dependencies
This project depends on the following Python packages:
- pywikibot: gives access to the MediaWiki API
- wikitextparser: parses MediaWiki pages and extracts sections, templates and links
- importlib (part of the Python standard library): used to import parser modules
Installation
(maybe to be replaced by an automation of some sort)
Wikstraktor Server
To run wikstraktor as a server, you need to install Flask; best practice is to do so in a virtual environment.
The following commands are taken from the aforementioned documentation; it is probably safer to follow the link and the modules' own documentation:
python3 -m venv wikstraktorenv #create wikstraktorenv environment
. wikstraktorenv/bin/activate #activate environment
pip install Flask
Use
Wikstraktor
from wikstraktor import Wikstraktor
f = Wikstraktor.get_instance('fr', 'en') #create a wikstraktor,
# first parameter is the language of the wiki
# second parameter is the language of the word sought for
f.fetch("blue") #fetch an article
str(f) #convert content to json
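Since `str(f)` yields JSON text, the result can be loaded back into Python structures with the standard `json` module. The following is a minimal sketch, assuming only that `str()` of the instance returns valid JSON as stated above. The `to_data` helper and the `FakeExtractor` stand-in are hypothetical, and the payload is made up; it does not reflect the real shape of wikstraktor's output.

```python
import json


def to_data(extractor):
    """Parse the JSON text produced by str(extractor) into Python objects."""
    return json.loads(str(extractor))


class FakeExtractor:
    """Hypothetical stand-in so the sketch runs without wikstraktor installed."""

    def __str__(self):
        # Made-up payload for illustration only, not real wikstraktor output.
        return '[{"word": "blue"}]'


data = to_data(FakeExtractor())
print(type(data).__name__, len(data))
```

With a real instance, `to_data(f)` would be called after `f.fetch("blue")`.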
Wikstraktor Server
The server runs on port 5000 by default; you can change that in the wikstraktor_server_config.py file.
./wikstraktor_server.py
The server then exposes a very simple API:
GET server_url/search/<word> #Searches the word in the default wiktionary
GET server_url/search/<wiktlang>/<wordlang>/<word> #Searches the word in wordlang in the wiktlang wiktionary
Both API calls return a JSON object.
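The two routes above can be queried from Python with the standard library alone. This is a rough sketch rather than an official client: `build_search_url` and `search` are hypothetical helper names, the `http://localhost:5000` address in the usage note assumes a local server on the default port, and percent-encoding the path segments is a precaution added here, not something the server is documented to require.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_search_url(server_url, word, wiktlang=None, wordlang=None):
    """Build the URL for one of the two search routes (hypothetical helper)."""
    base = server_url.rstrip("/")
    encoded_word = quote(word, safe="")
    if wiktlang is not None and wordlang is not None:
        # Route: server_url/search/<wiktlang>/<wordlang>/<word>
        return (
            f"{base}/search/{quote(wiktlang, safe='')}"
            f"/{quote(wordlang, safe='')}/{encoded_word}"
        )
    # Route: server_url/search/<word> (default wiktionary)
    return f"{base}/search/{encoded_word}"


def search(server_url, word, wiktlang=None, wordlang=None, timeout=10):
    """Send the GET request and decode the JSON body of the response."""
    url = build_search_url(server_url, word, wiktlang, wordlang)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, with the server running locally, `search("http://localhost:5000", "blue")` would query the default wiktionary, and `search("http://localhost:5000", "blue", "fr", "en")` would look up the English word in the French wiktionary.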
Licence
TODO but will be open source