Sunflower Unigene Repository (SUR v1.1)

About the Sunflower Unigene Repository

This database is maintained by the Bioinformatics Unit at INTA in Hurlingham, Buenos Aires, Argentina. The site is built on a Chado database with a web user interface written in Python using web2py. The interface will be released soon; the working codename for that project is "Another Tool for Genome Comprehension". For more information on the technical setup, contact Bernardo J. Clavijo at bernardo.clavijo@tgac.ac.uk or Sergio Gonzalez at gonzalez.sergio@inta.gob.ar.

For details on how the information in the database was developed and curated, refer to the works listed in the citation section of this page. For more information on data curation and use, contact Paula Fernandez at fernandez.pc@inta.gob.ar.

The GO terms used in this project come from different analyses and tools:

- Local installation of Blast2GO and all of its databases. Blast2GO is a tool for functional annotation of (novel) sequences using BLAST.

- Local installation of all software tools from InterProScan and their databases, from which IPR results were mapped to their respective GO terms. Features from the InterProScan suite:
  - ProDom: provider of sequence clusters built from UniProtKB using PSI-BLAST.
  - PROSITE patterns: provider of simple regular expressions.
  - PROSITE and HAMAP profiles: providers of sequence matrices.
  - PRINTS: provider of fingerprints, which are groups of aligned, un-weighted Position Specific Sequence Matrices (PSSMs).
  - PANTHER, PIRSF, Pfam, SMART, TIGRFAMs, Gene3D and SUPERFAMILY: providers of hidden Markov models (HMMs).

The information stored in the repository currently includes:

  • 28209 total ESTs
  • 12924 total overlapping_EST_sets
  • 132 total primers
  • 40169 total probes
  • 123 total SNPs
  • 8369 GO annotated overlapping_EST_sets
  • 11983 GO annotated ESTs
  • 90455 GO Term annotations
  • 1954 unique GO Terms in use for annotation


Using the web interface


Searching

The search menu has four different tools. Each one has its purpose, but the first is the lightest and fastest, so please try not to put too much stress on our servers by running hundreds of futile searches with hundreds of results on the others.

The first and simplest tool, Search -> Features by Name, lets you locate a feature by looking up its name in the database.

The second tool, Search -> List of Features by Name, lets you locate a list of features by their names.

The third tool, Search -> Ontology Terms and annotated Features, lets you search by any ontology annotation. Some remarks apply here, because we do a complete search over the ontology graph. For example, if you search for "GO:0009582" with this tool, you will find that the term "detection of abiotic stimulus" has no direct features and a few indirect ones. Direct features are sequences that were associated with that specific term during the annotation process, while indirect features are those associated with some child of that term. If you switch to Feature List on the results page, you will see a list of features along with the GO terms associated with each feature that matched the search criteria. Next to each term is either a "(d)" for a direct annotation or an "(i)" for an indirect one.
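The distinction between direct and indirect features can be illustrated with a minimal Python sketch. This is not the code the site runs; the term IDs, feature names and function names below are hypothetical, and the sketch assumes that "indirect" covers any descendant of the searched term, since the search runs over the complete graph.

```python
# Hypothetical toy ontology: child term -> list of parent terms (a DAG).
PARENTS = {
    "TERM:child_a": ["TERM:parent"],
    "TERM:child_b": ["TERM:parent"],
    "TERM:grandchild": ["TERM:child_a", "TERM:child_b"],
}

# Hypothetical annotations: feature name -> terms assigned during annotation.
ANNOTATIONS = {
    "feature_1": {"TERM:child_a"},
    "feature_2": {"TERM:grandchild"},
    "feature_3": {"TERM:parent"},
}


def ancestors(term, parents):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def classify(term, annotations, parents):
    """Map each matching feature to "d" (direct) or "i" (indirect)."""
    result = {}
    for feature, terms in annotations.items():
        if term in terms:
            # The feature was annotated with the searched term itself.
            result[feature] = "d"
        elif any(term in ancestors(t, parents) for t in terms):
            # The feature was annotated with some descendant of the term.
            result[feature] = "i"
    return result


print(classify("TERM:parent", ANNOTATIONS, PARENTS))
# → {'feature_1': 'i', 'feature_2': 'i', 'feature_3': 'd'}
```

In this toy data, a search for "TERM:parent" returns one direct feature and two indirect ones, which mirrors the "(d)" and "(i)" marks shown in the Feature List view.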

The fourth tool, Search -> Features by BLAST Matches, is there just in case. We strongly discourage its use, because it searches over the BLAST matches we used as a means of annotation. If you need to do a really fast and imprecise search without any certainty, give it a try, but do not treat it as a really valuable source of information. Instead, download the sequences and align them yourself, so you know you are using the right parameters and databases for your search.


The Feature Detail Page

By clicking a feature name in any search result, or wherever a feature name appears, you will go to its detail page. There you can see and download the sequence, view the annotation (including the partial GO graph down to the annotation terms) and see the BLAST matches used for further annotation.


Navigating the Ontology annotation tree

The ontology trees represent the ontology directed acyclic graph as a tree. You can enter a tree either from a feature or a search, by clicking on the GO term name/ID, or via the Ontology Annotation menu (GO, SO, etc.).

Each ontology term shows its next-level composition both as a pie chart and as a tree list, which you can navigate and use to open other lower-level terms. Keep in mind that the totals displayed for annotation correspond to unique features, so the numbers rarely add up, because a sequence can have more than one annotation on more than one child of a particular ontology term. Once again, a good understanding of the ontology will help you get the most out of this.
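The reason the totals rarely add up can be seen in a small Python sketch. The child terms and feature names are hypothetical and the functions are only illustrative: a feature annotated under two children is counted once in the parent total but once per child in the breakdown.

```python
# Hypothetical child terms of one ontology term, each with its annotated features.
CHILDREN = {
    "child_a": {"feature_1", "feature_2"},
    "child_b": {"feature_2", "feature_3"},
}


def unique_total(children):
    """Count the unique features annotated under any child term."""
    return len(set().union(*children.values()))


def sum_of_children(children):
    """Add up the per-child counts, counting shared features repeatedly."""
    return sum(len(features) for features in children.values())


print(unique_total(CHILDREN))     # → 3
print(sum_of_children(CHILDREN))  # → 4
```

Here "feature_2" sits under both children, so the per-child counts sum to 4 while the parent reports 3 unique features.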

For each term displayed, you can run an annotated-features search by clicking on the Feature List link below it on the tree. This opens a search by that ontology term on a new page and switches automatically to the Feature List view.


Data download

You can download both the features as FASTA files and the feature->GO annotation for further analysis of your own, such as BLAST searches over the collections. Please do not distribute these resources without pointing to this site, so that everyone can find the latest available data here.
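As a starting point for working with a downloaded FASTA file, the following Python sketch reads sequences using only the standard library. The sample records and their names are hypothetical, and the header layout of the real download is an assumption: the sketch simply takes the first whitespace-delimited token after ">" as the feature name.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Hypothetical sample standing in for a downloaded file.
SAMPLE = ">example_est_1 hypothetical description\nACGTAC\nGTTA\n>example_est_2\nTTGACC\n"

records = dict(read_fasta(io.StringIO(SAMPLE)))
for name, sequence in records.items():
    print(name, len(sequence))
```

To read a real file, pass an open file object instead of the io.StringIO wrapper, for example the result of open() on the path where you saved the download.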


Blast

In this section you can search the sequences stored in the database using the following BLAST programs:

  • blastn: Search a nucleotide database using a nucleotide query
  • tblastn: Search translated nucleotide database using a protein query
  • tblastx: Search translated nucleotide database using a translated nucleotide query
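
The choice among these three programs depends only on the query, because the stored sequences are nucleotide. The Python sketch below encodes that choice and builds a command line for a local run. It assumes a local NCBI BLAST+ installation and a nucleotide database built from the downloaded FASTA file; the function names, file name and database name are hypothetical, and the command is only assembled, not executed.

```python
def choose_program(query_is_protein, translate_query=False):
    """Pick a BLAST program for a search against a nucleotide database."""
    if query_is_protein:
        return "tblastn"  # protein query vs. translated nucleotide database
    if translate_query:
        return "tblastx"  # translated nucleotide query vs. translated database
    return "blastn"       # nucleotide query vs. nucleotide database


def build_command(program, query_path, db_name, evalue="1e-5"):
    """Build an argument list suitable for subprocess.run (not executed here)."""
    return [program, "-query", query_path, "-db", db_name,
            "-evalue", evalue, "-outfmt", "6"]


# Hypothetical query file and database name.
command = build_command(choose_program(query_is_protein=True),
                        "my_protein.fasta", "sur_ests")
print(" ".join(command))
# → tblastn -query my_protein.fasta -db sur_ests -evalue 1e-5 -outfmt 6
```

The "-outfmt 6" option requests tabular output, which is convenient for further filtering; the E-value threshold shown is only a placeholder to adjust for your own search.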