>> This is a work in progress; contact us if you have questions.

# About

This project extracts entities (i.e., `taxa`, `phenotypes`, `habitats`, `disease names`, `hosts`, `pathogen`, `vector`, `dates` and `geographic names`) from textual data for scientific monitoring.

It contains a workflow based on the [AlvisNLP](https://github.com/Bibliome/alvisnlp) framework and uses the OntoBiotope ontology and the NCBI taxonomy.
## Usage
The workflow runs from the command line (e.g., `GNU bash, version 4.4.x`) and requires `singularity version 3.4.x`
installed on your computer (see [how to install singularity](https://sylabs.io/guides/3.4/user-guide/quick_start.html#quick-installation-steps)).
It is compatible with `AlvisNLP version 0.7.x`, provided as a [Singularity](https://sylabs.io/) image. Follow the steps below to test the workflow.
A test corpus is provided in `corpus/pesv/Xylella-test/txt/` (`16 GB` of RAM is required to run the test corpus).
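
Before starting, it may help to confirm that the installed tools match the versions listed above. The following is only a sketch: the function names `version_matches` and `check_prerequisites` are hypothetical, and it assumes that `singularity --version` prints the version number as the last word of its output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a version string equals the expected
# major.minor prefix or starts with it followed by a dot (e.g. 3.4 or 3.4.2).
version_matches() {
    local actual="$1" expected="$2"
    case "$actual" in
        "$expected"|"$expected".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Advisory check of the tools named in this README (not part of the workflow).
check_prerequisites() {
    if ! command -v singularity >/dev/null 2>&1; then
        echo "error: singularity is not installed" >&2
        return 1
    fi
    # Assumption: output looks like "singularity version 3.4.2".
    local sing_version
    sing_version="$(singularity --version | awk '{print $NF}')"
    version_matches "$sing_version" "3.4" \
        || echo "warning: singularity $sing_version found, 3.4.x expected"
    # BASH_VERSION looks like "4.4.20(1)-release"; keep the major.minor part.
    local bash_mm="${BASH_VERSION%.*}"
    version_matches "$bash_mm" "4.4" \
        || echo "note: bash ${BASH_VERSION:-unknown} found, 4.4.x is the version given as an example"
}
```

Calling `check_prerequisites` prints a warning for each mismatch and returns a non-zero status only when `singularity` is missing.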

1. Clone the project.

```
git clone https://forgemia.inra.fr/mandiayba/pesv-tm.git
cd pesv-tm
```

2. Pull the AlvisNLP Singularity image.
> A `login` and `password` are required to pull the AlvisNLP Singularity image from forgemia. Please contact the maintainer if you don't have permission.

```
# From the project root (pesv-tm/), where step 1 left you
cd softwares
singularity pull --docker-login alvisnlp.sif oras:registry.forgemia.inra.fr/migale/tm-tools-packages/sif/alvisnlp:v0.0.4
```
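
The next step runs `softwares/alvisnlp.sif` directly as a command, so the pulled file has to exist and be executable. A minimal sketch of such a check follows; `check_image` is a hypothetical helper, and its default path assumes you call it from the project root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verifies that the pulled image is present and executable.
# Returns 1 if the file is missing or empty, 2 if it is not executable.
check_image() {
    local image="${1:-softwares/alvisnlp.sif}"
    if [ ! -s "$image" ]; then
        echo "missing or empty image: $image" >&2
        return 1
    fi
    if [ ! -x "$image" ]; then
        echo "image is not executable, try: chmod +x $image" >&2
        return 2
    fi
    echo "image ready: $image"
}
```

If the pull was interrupted or the credentials were rejected, the first branch is the one most likely to trigger.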

3. Run the workflow.
> Execute the workflow on the test corpus `corpus/pesv/Xylella-test/txt/`; results are stored in `corpus/pesv/Xylella-test/`.

```
# Return to the project root (pesv-tm/) after pulling the image in softwares/
cd ..

softwares/alvisnlp.sif -J-Xmx32G -verbose -cleanTmp \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```
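
To run the same command on another corpus, the invocation can be parameterized. The sketch below only prints the command so it can be reviewed before running. `build_workflow_cmd` is a hypothetical helper that uses only the options shown in step 3. It assumes that your corpus follows the test corpus layout (input texts in `<corpus>/txt/`, results written to `<corpus>/`) and that the heap size after `-J-Xmx` can be adjusted to your machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the AlvisNLP command for a given corpus directory.
#   $1  corpus directory (assumed to contain a txt/ subdirectory)
#   $2  JVM heap size passed via -J-Xmx (defaults to 32G, as in step 3)
build_workflow_cmd() {
    local corpus_dir="${1%/}"   # drop one trailing slash, if present
    local heap="${2:-32G}"
    # printf repeats the format for each argument, so every word is
    # followed by a single space; the plan file ends the line.
    printf '%s ' \
        softwares/alvisnlp.sif "-J-Xmx${heap}" -verbose -cleanTmp \
        -alias input "${corpus_dir}/txt/" \
        -outputDir "${corpus_dir}/" \
        -entity ontobiotope resources/BioNLP-OST+EnovFood \
        -feat inhibit-syntax inhibit-syntax
    printf '%s\n' plans/PESV_workflow.plan
}
```

For example, `build_workflow_cmd corpus/pesv/Xylella-test/` prints the step 3 command on a single line; run it from the project root once it looks right.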

4. View the results in `corpus/Xylella/visualisation_html`.

*. You may browse the results using the `-browser` option. To do so, run the following command, check the logs, and go to [http://localhost:8878](http://localhost:8878).

```
# Run from the project root (pesv-tm/), as in step 3

softwares/alvisnlp.sif -J-Xmx32G -verbose -cleanTmp \
-browser \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```
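
Since the workflow has to finish processing before the results can be browsed, a small polling loop in a second terminal can tell you when the address above starts responding. This is a sketch only: `wait_for_browser` is a hypothetical helper, it requires `curl`, and it treats any HTTP response as "ready" without inspecting the page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: polls a URL until it answers or the attempts run out.
#   $1  URL to probe (defaults to the address given in this README)
#   $2  maximum number of attempts (default 30)
#   $3  delay in seconds between attempts (default 2)
wait_for_browser() {
    local url="${1:-http://localhost:8878}"
    local attempts="${2:-30}"
    local delay="${3:-2}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # curl exits with 0 as soon as the server returns any HTTP response.
        if curl --silent --output /dev/null --max-time 2 "$url"; then
            echo "ready: $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "not reachable after $attempts attempts: $url" >&2
    return 1
}
```

With the defaults, `wait_for_browser` gives up after roughly one to two minutes; raise the attempt count for larger corpora.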

## Maintainer
Mouhamadou Ba : mouhamadou.ba@inrae.fr