Data files to integrate

All data required to build MalariaMine is included in bio/tutorial/malariamine/malaria-data.tar.gz. Copy this file to a local data directory (shown as DATA_DIR below) and extract the archive there.

cp bio/tutorial/malariamine/malaria-data.tar.gz DATA_DIR
cd DATA_DIR
tar -zxvf malaria-data.tar.gz
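
The commands above assume DATA_DIR already exists. As a minimal sketch of the same step, the helper below creates the target directory if needed and extracts straight into it without copying the archive first. The function name extract_data is hypothetical and not part of InterMine; it relies only on the standard mkdir -p and tar -C options.

```shell
# Hypothetical helper (not part of InterMine): extract a .tar.gz archive
# into a target directory, creating the directory first if it is missing.
extract_data() {
    archive="$1"
    target_dir="$2"
    # -C tells tar to change into target_dir before extracting.
    mkdir -p "$target_dir" && tar -zxf "$archive" -C "$target_dir"
}
```

For example, `extract_data bio/tutorial/malariamine/malaria-data.tar.gz DATA_DIR`, with DATA_DIR replaced by your own path.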

Edit /malariamine/project.xml so that all occurrences of DATA_DIR point to your local data directory. For example:

  <sources>
    <source name="malaria-gff" type="malaria-gff">
      <property name="gff3.taxonId" value="36329"/>
      <property name="src.data.dir" location="DATA_DIR/malaria/genome/gff"/>
    </source>
    ...
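
Rather than editing each source by hand, the DATA_DIR placeholder can be replaced in one pass. The following is a rough sketch, assuming the placeholder appears literally as DATA_DIR and that your data path contains no `|` or `&` characters (both are special in the sed expression used here). The function name set_data_dir is hypothetical.

```shell
# Hypothetical helper (not part of InterMine): replace every DATA_DIR
# placeholder in a project.xml with a real data directory path.
# A backup of the original file is kept as <file>.bak.
set_data_dir() {
    project_xml="$1"
    data_dir="$2"
    # '|' is the sed delimiter, so slashes in the path need no escaping.
    sed -i.bak "s|DATA_DIR|${data_dir}|g" "$project_xml"
}
```

For example, `set_data_dir project.xml /home/me/malaria-data` run from the directory containing project.xml. Check the result against the backup before building, since every literal DATA_DIR string in the file is replaced.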

The archive contains the following data:

/malaria-genome

The malaria genome in GFF3 and FASTA format, originally downloaded from PlasmoDB.

/uniprot

UniProt XML with protein information and sequences from Swiss-Prot and TrEMBL. Downloaded from http://www.ebi.ac.uk/uniprot/database/download.html and filtered on taxon ID 36329.

/gene_ontology

The Gene Ontology structure. Downloaded from http://www.geneontology.org/

/go_annotation

GO term assignments for P. falciparum. Downloaded from http://www.geneontology.org/