> This branch contains the crawler results for the study linked with [BioFlow-Insight](https://gitlab.liris.cnrs.fr/sharefair/bioflow-insight).
"/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1\n",
" warnings.warn(f\"A NumPy version >={np_minversion} and <{np_maxversion}\"\n"
"plt.title(\"Evolution of the yearly and cumulative number of Nextflow workflows available on GitHub\");\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We only want to use open workflows, so we are only gonna use the workflows which have an open license. We are gonna keep the ones which have :\n",
"\n",
"* Apache License 2.0\n",
"* GNU General Public License v3.0\n",
"* MIT License"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Only keeping open workflows that leaves us with 677 workflows.\n"
]
}
],
"source": [
"nb_open = len(df[(df[\"license\"] ==\"Apache License 2.0\") | (df[\"license\"] == \"GNU General Public License v3.0\") | (df[\"license\"] == \"MIT License\")])\n",
"print(f\"Only keeping open workflows that leaves us with {nb_open} workflows.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
%% Cell type:markdown id: tags:
# Analysis of the crawler results
%% Cell type:code id: tags:
``` python
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

sns.set(style='darkgrid', palette="Accent")
taille = (9, 5)
```
%% Output
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
%% Cell type:code id: tags:
``` python
import json
import pandas as pd

with open('wf_crawl_nextflow.json') as json_file:
    dict = json.load(json_file)
_ = dict.pop("last_date")
```
%% Cell type:code id: tags:
``` python
print(f"The crawler found {len(dict)} Nextflow workflows with at least one Nextflow file at the root.")
```
%% Output
The crawler found 752 Nextflow workflows with at least one Nextflow file at the root.
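%% Cell type:markdown id: tags:
The loading and license-filtering steps of this notebook can be illustrated on toy data. The following is only a minimal sketch, not the notebook's actual pipeline: the repository names, the `last_date` value, the per-entry `license` field, and the helpers `load_workflows` and `keep_open` are all hypothetical and were chosen for illustration. It uses only the standard library, so it runs without the crawl file or pandas.
%% Cell type:code id: tags:
```python
import json

# Hypothetical stand-in for wf_crawl_nextflow.json: the repository names, the
# "last_date" value and the per-entry "license" field are assumptions.
raw = """
{
  "last_date": "2024-01-01",
  "owner-a/workflow-1": {"license": "MIT License"},
  "owner-b/workflow-2": {"license": "Apache License 2.0"},
  "owner-c/workflow-3": {"license": null},
  "owner-d/workflow-4": {"license": "GNU General Public License v3.0"},
  "owner-e/workflow-5": {"license": "Other"}
}
"""

# The three licenses that the notebook treats as open.
OPEN_LICENSES = {
    "Apache License 2.0",
    "GNU General Public License v3.0",
    "MIT License",
}


def load_workflows(text):
    """Parse the crawl JSON and drop the "last_date" bookkeeping key."""
    workflows = json.loads(text)
    workflows.pop("last_date", None)
    return workflows


def keep_open(workflows):
    """Keep only the entries whose license is one of the three open licenses."""
    return {
        name: meta
        for name, meta in workflows.items()
        if meta.get("license") in OPEN_LICENSES
    }


workflows = load_workflows(raw)
open_workflows = keep_open(workflows)
print(f"{len(workflows)} workflows found, {len(open_workflows)} with an open license.")
```
%% Cell type:markdown id: tags:
Entries with a missing license (`null` in the JSON) are excluded by the membership test, which mirrors how a comparison against the three license names behaves on missing values.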