Data files to integrate
All data required to build MalariaMine is included in bio/tutorial/malariamine/malaria-data.tar.gz. Copy this file to a local data directory (DATA_DIR in the commands below) and extract the archive there.
cp bio/tutorial/malariamine/malaria-data.tar.gz DATA_DIR
cd DATA_DIR
tar -zxvf malaria-data.tar.gz
Edit /malariamine/project.xml so that all occurrences of DATA_DIR point to your local data directory. For example:
<sources>
  <source name="malaria-gff" type="malaria-gff">
    <property name="gff3.taxonId" value="36329"/>
    <property name="src.data.dir" location="DATA_DIR/malaria/genome/gff"/>
  </source>
  ...
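The substitution can also be scripted instead of done by hand. The following is a minimal sketch, assuming a POSIX shell and a sed that accepts -i with a backup suffix (GNU and BSD sed both do). set_data_dir is a hypothetical helper, not part of InterMine, and it assumes the data directory path contains no | or & characters, since sed would treat those specially.

```shell
# Hypothetical helper: replace every DATA_DIR placeholder in a project.xml file.
#   $1 = path to project.xml
#   $2 = absolute path of the local data directory (assumed to contain no | or &)
set_data_dir() {
    project_xml="$1"
    data_dir="$2"
    # Edit in place, keeping the original as project.xml.bak in case of mistakes.
    sed -i.bak "s|DATA_DIR|${data_dir}|g" "$project_xml"
}
```

Calling set_data_dir with the path to project.xml and, say, /home/user/malaria-data (an example path only) would rewrite the location above to /home/user/malaria-data/malaria/genome/gff and leave the untouched original next to it with a .bak suffix.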
The data included is:
/malaria-genome
The malaria genome as GFF3 and FASTA, originally downloaded from PlasmoDB.
/uniprot
UniProt XML with protein information and sequences from Swiss-Prot and TrEMBL. Downloaded from http://www.ebi.ac.uk/uniprot/database/download.html and filtered on taxon ID 36329.
/gene_ontology
The Gene Ontology structure. Downloaded from http://www.geneontology.org/
/go_annotation
GO term assignments for P. falciparum. Downloaded from http://www.geneontology.org/
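After extracting the archive, it can be worth confirming that the expected directories are present before starting a build. The sketch below assumes a POSIX shell; check_data_dirs is a hypothetical helper, and the directory names to pass are whatever the extracted archive actually contains (the names listed above are a starting point, but the layout inside the archive may differ).

```shell
# Hypothetical helper: report which of the named subdirectories are missing.
#   $1   = base data directory
#   $2.. = subdirectory names expected under it
# Prints one "missing: <path>" line per absent directory and
# returns 1 if any are missing, 0 otherwise.
check_data_dirs() {
    base="$1"
    shift
    missing=0
    for name in "$@"; do
        if [ ! -d "$base/$name" ]; then
            echo "missing: $base/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For instance, check_data_dirs DATA_DIR uniprot gene_ontology go_annotation prints nothing and succeeds when all three directories exist under DATA_DIR.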
