wikstraktor

A Python tool to query Wiktionary and extract structured lexical data.

This experimental version identifies every piece of structured information and merges information from different sources.

Dependencies

This project depends on Python packages, listed in requirements.txt (and in server_requirements.txt for the server version).

Installation

(This may later be replaced by some form of automation. Using a virtual environment is probably better; see the server version.)

Basic version

python3 -m venv wikstraktorenv #optional for basic version
. wikstraktorenv/bin/activate #activate environment (optional)
pip install -r requirements.txt
./setup.py

Wikstraktor Server

If you want to run wikstraktor as a server, you need to install flask and flask-cors (the latter allows other domains to query the server). Best practice is to do so in a virtual environment.

The following commands are taken from the documentation of those modules; it is probably safer to follow that documentation directly:

python3 -m venv wikstraktorenv #create wikstraktorenv environment
. wikstraktorenv/bin/activate #activate environment
pip install -r server_requirements.txt
./setup.py

Use

Wikstraktor

Python

from wikstraktor import Wikstraktor

# Create a wikstraktor:
#   first parameter: the language of the wiki
#   second parameter: the language of the word sought
f = Wikstraktor.get_instance('fr', 'en')
f.fetch("blue")  # fetch an article
json_text = str(f)  # convert the content to JSON

Bash

usage: wikstraktor.py [-h] [-l LANGUAGE] [-w WIKI_LANGUAGE] [-m MOT]
                      [-f DESTINATION_FILE] [-A] [-C]

Query a wiktionary
	e.g.:
	‣./wikstraktor.py -m blue
	‣./wikstraktor.py -m blue -f blue.json -A -C
	‣./wikstraktor.py -l en -w fr -m blue -f blue.json -A -C

options:
  -h, --help            show this help message and exit
  -l LANGUAGE, --language LANGUAGE
                        the language of the word
  -w WIKI_LANGUAGE, --wiki_language WIKI_LANGUAGE
                        the language of the wiki
  -m MOT, --mot MOT     the word to look up
  -f DESTINATION_FILE, --destination_file DESTINATION_FILE
                        the file in which to store the result
  -A, --force_ascii     JSON with ASCII characters only
  -C, --compact         JSON without indentation
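
To illustrate what the -A and -C flags mean for the output, here is a minimal sketch using Python's standard json module. It assumes, without confirmation from the wikstraktor source, that the flags behave like the usual `ensure_ascii` and `indent` options of `json.dumps`. The `render` helper, the sample entry, and the default indentation of 4 are all hypothetical.

```python
import json

def render(data, force_ascii=False, compact=False):
    """Hypothetical helper mimicking the effect of -A and -C on JSON output."""
    return json.dumps(
        data,
        ensure_ascii=force_ascii,       # -A: escape non-ASCII characters
        indent=None if compact else 4,  # -C: no indentation (4 is an assumed default)
    )

# Made-up sample data, not real wikstraktor output.
sample = {"word": "café"}

print(render(sample))                                  # indented, keeps the accent
print(render(sample, force_ascii=True, compact=True))  # single line, accent escaped
```

Forcing ASCII is mostly useful when the destination file will be read by tools that do not handle UTF-8 well; otherwise the default output is easier to read.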

Wikstraktor Server

The server runs on port 5000 by default; you can change that in the wikstraktor_server_config.py file.

./wikstraktor_server.py

The API is then very simple:

  • GET server_url/search/<word> : searches for the word in the default wiktionary
  • GET server_url/search/<wiktlang>/<wordlang>/<word> : searches for the word in wordlang in the wiktlang wiktionary

Both API calls return a JSON object.
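
The two routes above can be called from Python with the standard library alone. This is only a sketch, assuming a server running locally on the default port 5000; `SERVER_URL`, `search_url`, and `search` are names introduced here for illustration and are not part of wikstraktor.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

SERVER_URL = "http://localhost:5000"  # assumption: local server on the default port

def search_url(word, wiktlang=None, wordlang=None):
    """Build the URL for one of the two search routes."""
    parts = ["search"]
    if wiktlang is not None and wordlang is not None:
        parts += [wiktlang, wordlang]
    parts.append(word)
    # Percent-encode each path segment so accented words stay valid in a URL.
    return SERVER_URL + "/" + "/".join(quote(p, safe="") for p in parts)

def search(word, wiktlang=None, wordlang=None):
    """Query the server and decode the JSON response."""
    with urlopen(search_url(word, wiktlang, wordlang)) as response:
        return json.load(response)

if __name__ == "__main__":
    print(search_url("blue"))              # http://localhost:5000/search/blue
    print(search_url("blue", "fr", "en"))  # http://localhost:5000/search/fr/en/blue
```

With the server started, `search("blue", "fr", "en")` would return the decoded JSON object for the English word "blue" as described in the French wiktionary.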

Licence

To be determined, but the project will be open source.