6 GO-Enrichment Tab

6.1 Analysis Methods

6.1.1 Over Representation Analysis (ORA)

ORA is designed for analyzing specific lists of proteins, such as:

  • Significantly changed proteins
  • Proteins from a cluster
  • Manually curated lists of interest

Key features:

  • Uses Fisher’s exact test for statistical analysis
  • Compares your protein list against all known proteins in each GO term
  • Well suited to focused analysis of specific protein sets
  • Results show how strongly your proteins are enriched in each GO term relative to the background
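To make the test concrete, here is a minimal sketch of a one-sided Fisher's exact test for a single GO term, computed as the upper tail of the hypergeometric distribution. The function name, the parameter names, and the example counts are hypothetical. Whether the tool uses a one-sided or two-sided test, and how it defines the background, is not stated here, so treat both as assumptions.

```python
from math import comb


def ora_p_value(hits: int, list_size: int, term_size: int, background_size: int) -> float:
    """One-sided Fisher's exact test (hypergeometric upper tail).

    hits            proteins in your list that are annotated to the term
    list_size       proteins in your list
    term_size       background proteins annotated to the term
    background_size all proteins in the background
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    # Probability of seeing `hits` or more term members in a random list
    # of the same size drawn from the background.
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 40 listed proteins fall in a term of 60,
# out of a background of 2000 proteins (about 1.2 expected by chance).
p = ora_p_value(hits=8, list_size=40, term_size=60, background_size=2000)
print(f"enrichment p-value: {p:.2e}")
```

A small p-value means the term holds more of your proteins than a random list of the same size would be expected to contain.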

6.1.2 Correlation Adjusted MEan RAnk gene set test (CAMERA)

CAMERA takes a more comprehensive approach by analyzing your entire dataset:

  • Uses all proteins and their measurements
  • Ranks proteins based on both abundance changes and statistical significance
  • Calculates a score using: -log10(p-value) * (Direction of the Fold-Change)
  • Ideal for discovering subtle but consistent changes across multiple proteins
  • Integrated with the Limma package for robust statistical analysis
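The ranking score can be sketched as follows. This is only an illustration of the formula above, not the tool's code: it assumes the direction of the fold change is the sign of the log2 fold change, and the protein names and values are hypothetical.

```python
import math


def camera_score(p_value: float, log2_fold_change: float) -> float:
    """-log10(p-value) multiplied by the direction of the fold change."""
    # Assumption: direction is +1 for up, -1 for down, 0 for no change.
    direction = (log2_fold_change > 0) - (log2_fold_change < 0)
    return -math.log10(p_value) * direction


# Hypothetical proteins: (p-value, log2 fold change)
proteins = {
    "P1": (0.001, 1.8),
    "P2": (0.5, -0.2),
    "P3": (0.01, -2.1),
}
scores = {name: camera_score(p, lfc) for name, (p, lfc) in proteins.items()}

# Strongly up-regulated proteins rank first, strongly down-regulated last.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the score is signed, weakly changed proteins end up in the middle of the ranking, and a term whose members cluster at either end stands out.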

6.2 Interface Components

6.2.1 Settings Panel

The settings panel allows you to configure your analysis:

  1. Analysis Method Selection
    • Choose between ORA and CAMERA
    • ORA requires a defined protein list; CAMERA requires the full dataset with its measurements
  2. GO Term Size Filters
    • Minimum term size (default: 50)
    • Maximum term size (default: 500)
    • Adjust based on your research focus:
      • Small range (10-50) for specific processes
      • Large range (250-1000) for broader processes
  3. Statistical Options
    • Choose whether to display raw or adjusted p-values
    • Adjusted p-values are computed for all results using the Benjamini-Hochberg correction
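The two numeric settings above can be illustrated with a short sketch: a term size filter followed by the Benjamini-Hochberg adjustment. The function names and example values are hypothetical, and treating the size limits as inclusive is an assumption.

```python
def filter_terms_by_size(term_sizes: dict[str, int],
                         min_size: int = 50, max_size: int = 500) -> list[str]:
    """Keep terms whose size lies within the limits (assumed inclusive)."""
    return [term for term, n in term_sizes.items() if min_size <= n <= max_size]


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical term sizes and raw p-values
kept = filter_terms_by_size({"term_A": 30, "term_B": 120, "term_C": 800})
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(kept, adj)
```

With the default limits only `term_B` survives the filter. Narrowing or widening the range changes how many terms are tested, which in turn changes the adjusted p-values.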

6.2.2 Results Visualization

6.2.2.1 Bar Plot

  • X-axis: (adjusted) P-value
  • Y-axis: Ranked GO terms
  • Interactive selection of terms
  • Filterable by GO categories (BP, MF, CC)

6.2.2.2 Volcano Plot

Shows different metrics based on analysis method:

  • ORA: Percentage of significant proteins per GO term
  • CAMERA: Average scoring value
  • X-axis: (adjusted) P-value
  • Y-axis: Enrichment metric
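As a rough sketch of the two enrichment metrics, the snippet below computes the percentage of significant proteins in a term and the mean score of its members. The function names and data are hypothetical, and taking the plain arithmetic mean over term members with a score is an assumption about how the average is formed.

```python
from statistics import mean


def percent_significant(term_members: set[str], significant: set[str]) -> float:
    """ORA metric: share of a term's proteins that are significant, in percent."""
    if not term_members:
        return 0.0
    return 100.0 * len(term_members & significant) / len(term_members)


def average_score(term_members: set[str], scores: dict[str, float]) -> float:
    """CAMERA metric: mean score over the term members that have a score."""
    values = [scores[p] for p in term_members if p in scores]
    return mean(values) if values else 0.0


# Hypothetical term with four member proteins
members = {"P1", "P2", "P3", "P4"}
pct = percent_significant(members, significant={"P1", "P3", "P7"})
avg = average_score(members, scores={"P1": 3.0, "P2": -1.0, "P3": 2.0})
print(pct, avg)
```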

6.2.2.3 Network Plot

Visualizes relationships between GO terms:

  • Node size based on:
    • ORA: Percentage of proteins found
    • CAMERA: Adjusted P-value
  • Edge thickness shows term overlap
  • Interactive clustering option
  • Customizable edge cutoff and term count
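One common way to quantify term overlap is the Jaccard index of the two protein sets. The sketch below uses it to build weighted edges and drop those below a cutoff. The overlap measure the tool actually uses is not stated, so the Jaccard index, the function names, and the cutoff value are all assumptions.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared proteins divided by all proteins in either term."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_edges(term_proteins: dict[str, set[str]],
               cutoff: float = 0.2) -> list[tuple[str, str, float]]:
    """Return (term, term, overlap) for every pair at or above the cutoff."""
    edges = []
    for t1, t2 in combinations(sorted(term_proteins), 2):
        weight = jaccard(term_proteins[t1], term_proteins[t2])
        if weight >= cutoff:
            edges.append((t1, t2, weight))
    return edges


# Hypothetical terms and member proteins
terms = {
    "term_A": {"P1", "P2", "P3"},
    "term_B": {"P2", "P3", "P4"},
    "term_C": {"P9"},
}
edges = term_edges(terms)
print(edges)
```

Raising the cutoff removes weak edges and leaves only clusters of closely related terms, which is the effect the edge cutoff setting has on the plot.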

6.2.3 Term-Specific Analysis

Selecting a specific GO term gives access to:

6.2.3.1 Term-Protein Network

  • Interactive visualization of term-protein relationships
  • Multiple coloring options:
    • P-value
    • Adjusted P-value
    • Log2 Fold Change
    • -log10(adjusted P-value) * Fold Change Direction
  • Shows protein connections between terms

6.2.3.2 Detailed Information

  1. Heatmap
    • Z-scored log2 LFQ values
    • Hierarchical clustering of proteins
  2. General Info
    • Term name and ID
    • Description
    • Statistical metrics
  3. Enrichment Plot (CAMERA only)
    • Protein ranking visualization
    • Distribution of term members

6.3 Practical Tips

  1. Choosing Analysis Method
    • Use ORA for specific protein lists
    • Use CAMERA for exploratory analysis
    • Consider biological context when interpreting results
  2. Term Size Selection
    • Smaller ranges (10-50) for specific processes
    • Larger ranges (250-1000) for pathway analysis
    • Default (50-500) works well for most analyses
  3. Result Interpretation
    • Consider both p-values and effect sizes
    • Look for biological coherence
    • Use network visualization for context
  4. Data Export
    • Results can be downloaded as CSV
    • Includes all statistical metrics
    • Preserves protein annotations

6.4 References

  1. Gene Ontology Consortium (http://geneontology.org/)
  2. Wu D, Smyth GK. Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research. 2012
  3. Alexa A, Rahnenfuhrer J. topGO: Enrichment Analysis for Gene Ontology. R package version 2.48.0, https://bioconductor.org/packages/topGO