Help Notes:
LIRCPs (LC3-Interacting Region Containing Proteins)
Through this tab, users can search the canonical LIR motifs present in 8 model organisms: Arabidopsis thaliana, Caenorhabditis elegans, Gallus gallus, Homo sapiens, Mus musculus, Rattus norvegicus, Saccharomyces cerevisiae and Danio rerio. Selecting an organism displays its predicted list of LIR-containing proteins (LIRCPs), arranged alphabetically. For each LIRCP, the following information is provided: the motif with its start and end positions, the PSSM score, similar LIR motifs in other model organisms, the protein description and the Gene Ontology (GO) classification (if any).
PSSM Score: A Position Specific Scoring Matrix (PSSM) is a commonly used representation of motifs or patterns in biological sequences. The matrix is derived from a set of aligned sequences that are thought to be functionally related. Each value in the matrix is a log-odds score for the presence of a residue at the corresponding position of the alignment. Negative scores are assigned to residues that are rarely observed at that position, while high positive scores are assigned to the most frequent residues. (Further information).
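To make the log-odds idea concrete, here is a minimal sketch in Python. It is not the scoring code used by this website. The uniform background frequencies, the pseudocount of 1 and the function names build_pssm and score are assumptions made for illustration, and the four aligned hexamers are toy data (two are taken from the BLAST example on this page, two are made up).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy alignment: KEFEKL and SVFTSV come from this help page,
# the other two hexamers are hypothetical.
EXAMPLE_SITES = ["KEFEKL", "SVFTSV", "DDWEEL", "EEYVEI"]


def build_pssm(aligned, background=None, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    if background is None:
        # Assumption: every residue is equally likely in the background (1/20).
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        column = {}
        for aa in AMINO_ACIDS:
            # The pseudocount keeps unobserved residues from giving log(0).
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background[aa])
        pssm.append(column)
    return pssm


def score(pssm, peptide):
    """Sum the per-position log-odds scores of a peptide of matching length."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must equal the number of PSSM columns")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


pssm = build_pssm(EXAMPLE_SITES)
```

With this toy matrix, a residue that is frequent in a column (F at the third position) receives a positive score, while a residue that never occurs there (for example C) receives a negative one, so score(pssm, "KEFEKL") is higher than score(pssm, "CCCCCC").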
BLAST (Basic Local Alignment Search Tool)
The BLAST option lets users search a query sequence against the LIRCPs of the model organisms. PSI-BLAST with the phi-pattern option is used for this.
Important:
a) Only one sequence in FASTA format can be searched at a time. Multiple sequences are not allowed.
b) Searches against the whole proteomes of the model organisms described on the website are available. These include both Swiss-Prot and TrEMBL sequences.
c) The default e-value is 0.01 (used when the e-value textbox is left blank). If the user provides an e-value greater than 0.1, it is capped at 0.1 to avoid long search times.
d) PSI-BLAST with Phi-Pattern is used to search against the proteomes of the model organisms.
Any residue allowed at a given position of the LIR pattern [ADEFGLPRSK][DEGMSTV][FWY][DEILQTV][ADEFHIKLMPSTV][ILV] may appear at that position in the result; an exact match with the query amino acid is not required.
For example, the LIR pattern KEFEKL in a query protein sequence can match a subject in the following way:
Query KEFEKL
Subject SVFTSV
Here, the first amino acid of the subject, 'S', belongs to the first bracketed set of the LIR pattern above, the same set that contains 'K' from the query sequence. The second amino acid, 'V', belongs to the second bracketed set, which also contains 'E' from the query. The remaining residues are matched in the same way.
The position of the pattern in the query sequence submitted for the BLAST search is reported, and in the matches the red asterisks (******) mark the query pattern. If multiple patterns are present, all of their positions are reported.
e) Clicking on the bit score link takes the user to the respective alignment in the result.
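The e-value rule in (c) and the position-wise matching in (d) can be sketched in Python as follows. This only illustrates the logic and is not the PSI-BLAST code behind the website. The function names effective_evalue and find_lir_motifs are hypothetical, and reporting positions as 1-based and inclusive is an assumption.

```python
import re

# LIR pattern as given in this help page: one bracketed set per position.
LIR_PATTERN = "[ADEFGLPRSK][DEGMSTV][FWY][DEILQTV][ADEFHIKLMPSTV][ILV]"
# A lookahead is used so that overlapping motifs are all reported.
_LIR_RE = re.compile(f"(?=({LIR_PATTERN}))")

DEFAULT_EVALUE = 0.01  # used when the textbox is left blank
MAX_EVALUE = 0.1       # larger user values are capped at this limit


def effective_evalue(text):
    """Apply rule (c): blank input -> 0.01, values above 0.1 -> 0.1."""
    text = text.strip()
    if not text:
        return DEFAULT_EVALUE
    return min(float(text), MAX_EVALUE)


def find_lir_motifs(sequence):
    """Return (start, end, motif) tuples; positions are 1-based and inclusive (assumption)."""
    hits = []
    for match in _LIR_RE.finditer(sequence.upper()):
        start = match.start() + 1
        hits.append((start, start + 5, match.group(1)))
    return hits
```

Both hexamers from the example, KEFEKL and SVFTSV, satisfy the pattern at every position, which is why one can appear as a match for the other even though they share only one identical residue.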
Gene Ontology (GO) Annotation
GO provides an ontology of defined terms representing gene product properties.
a) The GO Slim tab gives a broad overview of the ontology content, without the detail of the specific fine-grained terms associated with the LIRCPs of the model organisms.
For each GO Slim category, the count of LIRCPs within the total reviewed set of proteins from UniProt is given. A hypergeometric test was performed to indicate whether a particular GO Slim category is significantly abundant in LIRCPs when compared to the total number of proteins under the same GO Slim category for a model organism. The significance is reported as a p-value. In addition, an adjusted p-value computed with the Benjamini-Hochberg procedure is provided to control the False Discovery Rate (FDR).
b) The Distribution tab shows the same information as the Count component of the GO Slim tab, presented graphically.
c) The Enrichment tab allows GO Slim classes to be compared across two different model organisms. Users can restrict the results to different significance levels based on the p-value.
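The statistics behind the GO Slim tab can be illustrated with a short Python sketch using only the standard library. It is not the code used to produce the values on the website. The parameterisation (N reviewed proteins in the organism, K of them in the GO Slim category, n LIRCPs in total, k LIRCPs in the category) and the use of the upper-tail probability P(X >= k) are assumptions, and the function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, K, n, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: reviewed proteins in the organism (population size)
    K: proteins annotated with the GO Slim category
    n: LIRCPs in the organism (sample size)
    k: LIRCPs annotated with the GO Slim category
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In use, one p-value would be computed per GO Slim category with hypergeom_enrichment_p, and the full list of p-values for an organism would then be passed to benjamini_hochberg to obtain the adjusted values.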
Search
The Search option has two tabs: Keyword and LIR motif.
The Keyword tab searches the database by gene name, protein description or UniProt ID.
The LIR motif tab finds LIR motifs in a protein sequence submitted by the user.