Commit 81f92ac7 authored by EricBoix

Added project propositions.

parent 068b1bc4
...@@ -12,11 +12,11 @@ Concern: manpower is key. A team must be constituted and seeking benevolent cont ...@@ -12,11 +12,11 @@ Concern: manpower is key. A team must be constituted and seeking benevolent cont
* On the ITA side of the force: * On the ITA side of the force:
* YPE is off LIDA to sysadmin and won't contribute anymore * YPE is off LIDA to sysadmin and won't contribute anymore
* …? * …?
* On the researchers side of the force: \\ Bellow is the list of the participants to LIDA V1 that helped in defining the needs. Are they still "benevolent" or charged of anything ? * On the researchers side of the force: bellow is the list of the participants to LIDA V1 that helped in defining the needs. Are they still "benevolent" or charged of anything ?
* [Sylvie Servigne](http://liris.cnrs.fr/membres/?id=15) [BD](http://liris.cnrs.fr/equipes?id=61) * [Sylvie Servigne](http://liris.cnrs.fr/membres/?id=15) [BD](http://liris.cnrs.fr/equipes?id=61)
* [Pierre Antoine Champin](http://liris.cnrs.fr/pierre-antoine.champin/en/) [TWEAK](https://liris.cnrs.fr/equipes/?id=75) * [Pierre Antoine Champin](http://liris.cnrs.fr/pierre-antoine.champin/en/) [TWEAK](https://liris.cnrs.fr/equipes/?id=75)
* [Pierre Edouard Portier](http://liris.cnrs.fr/pierre-edouard.portier/) DRIM * [Pierre Edouard Portier](http://liris.cnrs.fr/pierre-edouard.portier/) DRIM
* |Mickael Mrissa](https://liris.cnrs.fr/membres/?id=1708) |SOC]https://liris.cnrs.fr/equipes?id=62) * [Mickael Mrissa](https://liris.cnrs.fr/membres/?id=1708) [SOC](https://liris.cnrs.fr/equipes?id=62)
* Julien Milles left LIRIS * Julien Milles left LIRIS
## Who are the LIDA users ## Who are the LIDA users
...@@ -25,7 +25,7 @@ Concern: LIDA start with very limited resources and there are many platforms/tea ...@@ -25,7 +25,7 @@ Concern: LIDA start with very limited resources and there are many platforms/tea
* Where is the teams/platforms need stated ? What are their current demand ? * Where is the teams/platforms need stated ? What are their current demand ?
* For MSH: Which teams/platforms needs should be addressed ? * For MSH: Which teams/platforms needs should be addressed ?
* LIDA V1 initially targeted [Data Pole](https://liris.cnrs.fr/axes?id=68) and namely BD, DM2L and GOAL. Are these still the official "guilty parties"? * LIDA V1 initially targeted [Data Pole](https://liris.cnrs.fr/axes?id=68) and namely BD, DM2L and GOAL. Are these still the official "guilty parties"?
* The [PLEID]http://liris.cnrs.fr/pleiad/) platform already has some support from a private company. Will the LIRIS further contribute to this software platform ? * The [PLEID](http://liris.cnrs.fr/pleiad/) platform already has some support from a private company. Will the LIRIS further contribute to this software platform ?
## What is the LIDA mission statement ## What is the LIDA mission statement
Concern: LIDA' objective is to satisfy research needs as opposed to be a resource provider. A clear mission statement would avoid LIDA being conceived as a storage provider (without pointing fingers [storing video](https://liris.cnrs.fr/equipes?id=48) require lots of disks) but as a platform/research enabler. Concern: LIDA' objective is to satisfy research needs as opposed to be a resource provider. A clear mission statement would avoid LIDA being conceived as a storage provider (without pointing fingers [storing video](https://liris.cnrs.fr/equipes?id=48) require lots of disks) but as a platform/research enabler.
......
## Federate data related "tools" (ETL)
Before being able to work on any given raw data one first needs to:
* be able to read/parse the format (be it a file to open or a stream) in one's own target language (if the researcher's pipeline is in e.g. Python, then the data must be readable from Python). Even when the data is not encoded, e.g. for XML files, its structure needs to be understood, and some syntactic sugar might be appreciated,
* possibly anonymize the data (in order to respect some legal constraints),
* sanitize the data (remove degenerate records, or degenerate/ill-formed fields),
* re-sample the data when some captures are missing in time,
* qualify the data: some data might be too redundant to contain "enough" information.
All such tasks might require dedicated tools and specific know-how that is of low interest to the researcher, yet can be time-consuming because of its technicality.
Proposition: gather such tools, libraries, recipes, code snippets in order to ease the burden of researchers.
Notes:
* Gathering such tools might start by pointing to already existing ad hoc websites...
* There already exist [ETL](https://en.wikipedia.org/wiki/Extract,_transform,_load) frameworks as well as specialized ETL tools (like [HALE](https://www.wetransform.to/products/hale/) or [FME](https://en.wikipedia.org/wiki/Feature_Manipulation_Engine) for spatial data). Using such frameworks (as opposed to general-purpose scripting languages boosted with wrapped ad hoc libraries) to express one's recipes might prove to be a big time saver.
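As an illustration of the kind of recipe that could be gathered, the preparation steps listed above (parse, anonymize, sanitize, re-sample, qualify) can be sketched in plain Python. This is only a minimal sketch under stated assumptions: the CSV layout, the column names (`user`, `minute`, `value`), the salt and the redundancy indicator are all hypothetical and do not come from any LIDA dataset.

```python
import csv
import hashlib
import io

# Hypothetical raw capture: one reading per minute, with a gap and a bad row.
RAW = """user,minute,value
alice,0,10.0
alice,1,12.0
bob,2,not-a-number
alice,4,18.0
"""

def parse(text):
    """Read CSV text into a list of dict records (all fields are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def anonymize(records, salt="demo-salt"):
    """Replace the user name with a salted hash prefix (pseudonymization)."""
    out = []
    for record in records:
        digest = hashlib.sha256((salt + record["user"]).encode()).hexdigest()[:8]
        out.append({**record, "user": digest})
    return out

def sanitize(records):
    """Convert fields to numbers and drop the records that cannot be converted."""
    clean = []
    for record in records:
        try:
            clean.append({
                "user": record["user"],
                "minute": int(record["minute"]),
                "value": float(record["value"]),
            })
        except ValueError:
            continue  # ill-formed field: skip the whole record
    return clean

def resample(records):
    """Fill missing minutes by linear interpolation (assumes a single series)."""
    records = sorted(records, key=lambda r: r["minute"])
    filled = []
    for prev, nxt in zip(records, records[1:]):
        filled.append(prev)
        gap = nxt["minute"] - prev["minute"]
        for step in range(1, gap):
            value = prev["value"] + (nxt["value"] - prev["value"]) * step / gap
            filled.append({"user": prev["user"],
                           "minute": prev["minute"] + step,
                           "value": value})
    filled.extend(records[-1:])
    return filled

def qualify(records):
    """Crude redundancy indicator: share of distinct values (1.0 = all differ)."""
    values = [r["value"] for r in records]
    return len(set(values)) / len(values) if values else 0.0

prepared = resample(sanitize(anonymize(parse(RAW))))
print(prepared, qualify(prepared))
```

Each step is a small function taking and returning a list of records, so steps can be reordered or replaced; a dedicated ETL framework would express the same chain declaratively.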
## Display a team's know-how on some given data
When New York City opened its data, it took two students a few weeks before they could concretely show some 3D rendering of the city geometry based on such data. Extracting advanced information (getting the juice out of the data) might require real know-how. For example, computing the road network load out of a geometrical description (dedicated to 3D rendering) of a road network will first require some topological fixes of the data (two endpoints of road segments might be geometrically superimposed yet not topologically connected). Blending the geometry/topology of such a network with the traffic-light schedules might not be science, yet might prove to be a non-trivial technical task.
A team wishing to illustrate such know-how might need to "offer" some resulting data samples and, beyond that, present such pre-treatment algorithms (e.g. a service offering on-line clean-up of client-uploaded data). Offering such a service will create technical needs for the team...
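The topological fix mentioned above (segment endpoints that are geometrically superimposed yet not connected) could look like the following sketch. It is an assumption-laden illustration, not a description of any existing LIRIS tool: the function names, the 2D coordinates and the `tolerance` value are hypothetical, and the quadratic endpoint search would need a spatial index on real city data.

```python
from collections import defaultdict

def build_topology(segments, tolerance=0.01):
    """Merge endpoints closer than `tolerance` into shared nodes.

    `segments` is a list of ((x1, y1), (x2, y2)) pairs. Returns (nodes, edges)
    where `nodes` holds one representative coordinate per merged endpoint and
    `edges` holds (node_index, node_index) pairs, one per input segment.
    """
    nodes = []

    def node_for(point):
        # Naive linear scan: reuse the first node within the tolerance radius.
        for index, (x, y) in enumerate(nodes):
            if (x - point[0]) ** 2 + (y - point[1]) ** 2 <= tolerance ** 2:
                return index
        nodes.append(point)
        return len(nodes) - 1

    edges = [(node_for(start), node_for(end)) for start, end in segments]
    return nodes, edges

def count_components(node_count, edges):
    """Count connected components of the resulting graph (iterative DFS)."""
    neighbours = defaultdict(list)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    seen = set()
    components = 0
    for start in range(node_count):
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node])
    return components

# Hypothetical road segments: the second one starts 4 mm away from the end
# of the first one, so a purely geometric description leaves them disconnected.
SEGMENTS = [((0.0, 0.0), (1.0, 0.0)),
            ((1.0, 0.004), (2.0, 0.0)),
            ((5.0, 5.0), (6.0, 5.0))]

nodes, edges = build_topology(SEGMENTS, tolerance=0.01)
print(len(nodes), "nodes,", count_components(len(nodes), edges), "components")
```

Running the same data with `tolerance=0` keeps the first two segments disconnected, which is the situation a load computation would silently get wrong.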
## Meta-data
When given some raw data, automatically (or semi-automatically) extracted or generated meta-data can be valuable and a result per se. Such meta-data might range from simple information like boundary values, number of samples, or content access limits to more advanced quality indicators, be they qualitative (poor, medium, high) or quantitative.
When valuable, such meta-data might need to be stored, retrieved, mined...
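A minimal sketch of such automatic extraction is given below, assuming a series of numeric samples where `None` marks a missing capture. The field names and the completeness thresholds behind the poor/medium/high label are hypothetical choices made for illustration only.

```python
import json

def extract_metadata(samples):
    """Derive simple meta-data from a list of samples (None = missing capture)."""
    present = [s for s in samples if s is not None]
    completeness = len(present) / len(samples) if samples else 0.0
    # Hypothetical thresholds mapping the quantified indicator to a label.
    if completeness >= 0.95:
        quality = "high"
    elif completeness >= 0.75:
        quality = "medium"
    else:
        quality = "poor"
    return {
        "count": len(samples),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "completeness": completeness,
        "quality": quality,
    }

metadata = extract_metadata([3.0, None, 7.5, 1.2])
# Serializing to JSON is one simple way to store the result next to the data.
print(json.dumps(metadata, indent=2))
```

Returning a plain dictionary keeps the result easy to store and to query later, whatever catalogue ends up holding it.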
...@@ -8,17 +8,15 @@ ...@@ -8,17 +8,15 @@
### Various immature pages: ### Various immature pages:
* Informal [LIRIS Data inventory](/DataUsedLiris.md) * Informal [LIRIS Data inventory](/DataUsedLiris.md)
* [[TeamDemands|Research teams correspondents and demands]] * [Proposed activities](/ProjectPropositions.md)
* [[ProjectPropositions|Proposed activities]] * [Open questions](/OpenQuestions.md)
* [[OpenQuestions|Open questions]]
* [[DataManagementPlan|DMP]] (Data Management Plan) * [[DataManagementPlan|DMP]] (Data Management Plan)
* City domain: * City domain:
* [[CityPartnersProjects|Partners and projects]] * [[CityPartnersProjects|Partners and projects]]
* [[CityDataTools|City oriented data Tools]] * [[CityDataTools|City oriented data Tools]]
### Technologies: ### Technologies:
* [[CKAN|CKAN]] * [[CKAN|CKAN]]
* [[https://tech.knime.org/installation-0|Knime]] * [[https://tech.knime.org/installation-0|Knime]]
* Open source (GPL), German made * Open source (GPL), German made
* [[https://tech.knime.org/community/developers|Installation out of sources]] * [[https://tech.knime.org/community/developers|Installation out of sources]]