Help page

eDGAR is a free-to-use database that collects and organizes data on gene/disease associations derived from OMIM, Humsavar and ClinVar. For each disease-associated gene, eDGAR provides information on its annotation. For diseases associated with multiple genes, eDGAR provides information on the relations among the involved genes.

Browsing

The database can be browsed from the "Main Table" page, which reports all the associations between diseases and genes. The "Search" window (upper right) allows searching for specific genes by HGNC code, or for specific diseases by OMIM code (including the phenotypic series code) or by disease name. By default, only 10 records are shown in the table. More records can be displayed by changing the number of entries in the upper-left pull-down menu or by browsing pages using the links at the bottom right of the page.

Search

Advanced search can be performed in the "Search Page". The user may search by means of a single entry (gene, protein, disease) or with a set of genes.

Genes can be searched by HGNC code or Ensembl code (ENSG). Proteins can be searched by UniProt accession or by Ensembl code (ENSP). Diseases can be searched by OMIM code, including the phenotypic series code, or via text search.
The user must:
i) select the type of search to be performed among: "Gene name (HGNC)", "Gene name (ENSG)", "Protein (UniProt)", "Protein (ENSP)", "Disease (OMIM ID)" and "Disease name" (text search);
ii) enter the query;
iii) press "Submit".
OMIM codes included in phenotypic series are automatically redirected to the corresponding phenotypic series code.
If your query is directly associated with a gene or a disease collected in eDGAR, you will get a link to the corresponding "Gene page" or "Disease page". If your query is a text search, you will retrieve a list of all the matching disease entries.

The user may also search by entering a group of genes, and will retrieve the list of shared annotations.
Only genes associated with a disease are currently present in eDGAR.
The user must:
i) enter the set of gene codes, one per line or separated by spaces, commas or semicolons;
ii) press "Submit".
The process may take a few minutes.
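
The accepted input format (one code per line, or codes separated by spaces, commas or semicolons) can be reproduced locally to check a gene list before submitting it. The following is a minimal sketch, not eDGAR's own parser: the function name parse_gene_set is hypothetical, and the removal of repeated codes is an assumption made here for convenience.

```python
import re


def parse_gene_set(text):
    """Split a pasted gene list on whitespace, commas, or semicolons.

    Repeated codes are dropped while keeping first-seen order
    (an assumption of this sketch, not documented eDGAR behaviour).
    """
    tokens = re.split(r"[\s,;]+", text.strip())
    seen = set()
    genes = []
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes


# Mixed separators, including a newline and a repeated code.
genes = parse_gene_set("BRCA1, BRCA2;TP53\nATM  BRCA1")
print(genes)  # ['BRCA1', 'BRCA2', 'TP53', 'ATM']
```

The cleaned list can then be pasted into the form one code per line.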

Gene pages

Each page collects:
A) the list of diseases associated with the gene, along with the source from which the association has been annotated (OMIM, ClinVar, Humsavar). By default, only 10 records are shown in the table. More records can be displayed by changing the number of entries in the upper-left pull-down menu or by browsing pages using the links at the bottom right of the page.
B) the gene annotations. Specifically, the following annotations are reported:

  1. the Ensembl name of the gene and the corresponding external link;
  2. the SwissProt accession code(s) of the proteins encoded by the gene and the corresponding external link(s);
  3. the PDB IDs (if any) of the proteins encoded by the gene and the corresponding external link(s);
  4. the chromosomal localization of the gene;
  5. the KEGG pathways associated to the gene and the corresponding external links;
  6. the REACTOME pathways associated to the gene and the corresponding external links;
  7. the GO terms for molecular function associated to the gene and the corresponding external links;
  8. the GO terms for biological process associated to the gene and the corresponding external links;
  9. the GO terms for cellular component associated to the gene and the corresponding external links;

When the annotation is available, the corresponding row is coloured in black and the information can be accessed by clicking on the left arrow. When the information is unavailable, the corresponding row is coloured in pale gray.

Disease pages

For monogenic diseases, eDGAR reports the table showing the associated gene.

For diseases associated with multiple genes, eDGAR reports:
A) the table showing the associated genes. By default, only 10 records are shown in the table. More records can be visualized by changing the number of entries in the upper left pull-down menu or by browsing pages with the links at the bottom right side of the page.
B) the relations among the genes co-involved in the disease:

  1. presence of pairs of genes in the same tandem repeat, as annotated in DGD;
  2. presence of pairs of genes in the same cytogenetic band;
  3. regulatory relationships among genes as derived from TRRUST. Pairs of transcription factor/target and groups of genes co-regulated by the same transcription factor are reported;
  4. Physical interactions among genes as extracted from BIOGRID. Both direct and indirect interactions involving one intermediate gene are reported.
  5. Genetic interactions among genes as extracted from BIOGRID. Both direct and indirect interactions involving one intermediate gene are reported.
  6. Interactions among genes as extracted from STRING. High confidence links (STRING score > 0.7) with annotated "action" are collected. Both direct and indirect interactions involving one intermediate gene are reported.
  7. Interactions in the same CORUM structural complex.
  8. Interactions in the same CENSUS structural complex.
  9. Interactions reported in PDB.
  10. Interactions extracted from literature and UniProt text fields.
  11. KEGG pathway annotation. eDGAR reports both the terms shared by pairs of genes and terms enriched by NET-GE. Information Content values (IC) for all terms and p-values for enriched terms are also reported.
  12. REACTOME pathway annotation. eDGAR reports both the terms shared by pairs of genes and terms enriched by NET-GE. Information Content values (IC) for all terms and p-values for enriched terms are also reported.
  13. Gene Ontology - Molecular Function annotation. eDGAR reports both the terms shared by pairs of genes and terms enriched by NET-GE. Information Content values (IC) for all terms and p-values for enriched terms are also reported.
  14. Gene Ontology - Biological Process annotation. eDGAR reports both the terms shared by pairs of genes and terms enriched by NET-GE. Information Content values (IC) for all terms and p-values for enriched terms are also reported.
  15. Gene Ontology - Cellular Component annotation. eDGAR reports both the terms shared by pairs of genes and terms enriched by NET-GE. Information Content values (IC) for all terms and p-values for enriched terms are also reported.

The interactions from BIOGRID and STRING can be visualized in a graph where the genes associated with the disease are represented as blue nodes and the other interacting genes as pale blue nodes; direct interactions are drawn as green edges and indirect interactions as thin black edges. Clicking on a node redirects the user to the corresponding gene page.
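
The distinction between direct interactions and indirect interactions through one intermediate gene can be illustrated with a small graph search. This is a minimal sketch under stated assumptions, not the eDGAR pipeline: the function name, the (gene_a, gene_b, score) edge format and the gene labels are hypothetical, and the score filter mirrors the STRING high-confidence threshold (score > 0.7) mentioned in the list above.

```python
from itertools import combinations


def classify_interactions(disease_genes, scored_edges, min_score=0.7):
    """Return (direct, indirect) relations among the disease genes.

    scored_edges: iterable of (gene_a, gene_b, score) tuples
    (hypothetical format). Only edges with score > min_score are kept.
    An indirect relation links two disease genes through exactly one
    intermediate gene.
    """
    neighbours = {}
    for a, b, score in scored_edges:
        if score > min_score:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    direct, indirect = [], []
    for a, b in combinations(sorted(disease_genes), 2):
        if b in neighbours.get(a, set()):
            direct.append((a, b))
        # Genes adjacent to both a and b act as the single intermediate.
        shared = (neighbours.get(a, set()) & neighbours.get(b, set())) - {a, b}
        for mid in sorted(shared):
            indirect.append((a, mid, b))
    return direct, indirect


# Hypothetical data: G1, G2, G3 are disease genes, X is another gene.
edges = [
    ("G1", "G2", 0.9),
    ("G1", "X", 0.8),
    ("X", "G3", 0.75),
    ("G2", "G3", 0.5),  # below threshold, ignored
]
direct, indirect = classify_interactions({"G1", "G2", "G3"}, edges)
print(direct)    # [('G1', 'G2')]
print(indirect)  # [('G1', 'X', 'G3')]
```

In the terms of the graph described above, G1 and G2 would be joined by a green edge, while G1 and G3 would be joined through the pale blue node X by thin black edges.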

By default, only 10 records are shown in all tables. More records can be visualized by changing the number of entries in the upper left pull-down menu or by browsing pages with the links at the bottom right side of the page.

When the annotation is available, the corresponding row is coloured in black and the information can be accessed by clicking on the left arrow. When the information is unavailable, the corresponding row is coloured in pale gray.

Search by a set of genes: result page

When searching by a list of gene codes, eDGAR reports the list of all the diseases associated to each gene in the list. Genes not associated to diseases are not considered in the following analysis.

For each set of genes, eDGAR reports the following relations:

  1. presence of pairs of genes in the same tandem repeat, as annotated in DGD;
  2. presence of pairs of genes in the same cytogenetic band;
  3. regulatory relationships among genes as derived from TRRUST. Pairs of transcription factor/target and groups of genes co-regulated by the same transcription factor are reported;
  4. Physical interactions among genes as extracted from BIOGRID. Both direct and indirect interactions involving one intermediate gene are reported.
  5. Genetic interactions among genes as extracted from BIOGRID. Both direct and indirect interactions involving one intermediate gene are reported.
  6. Interactions among genes as extracted from STRING. High confidence links (STRING score > 0.7) with annotated "action" are collected. Both direct and indirect interactions involving one intermediate gene are reported.
  7. Interactions in the same CORUM structural complex.
  8. Interactions in the same CENSUS structural complex.
  9. Interactions reported in PDB.
  10. Interactions extracted from literature and UniProt text fields.
  11. KEGG pathways annotation. eDGAR reports the shared terms and their Information Content values (IC).
  12. REACTOME pathways annotation. eDGAR reports the shared terms and their Information Content values (IC).
  13. Gene Ontology - Molecular Function annotation. eDGAR reports the shared terms and their Information Content values (IC).
  14. Gene Ontology - Biological Process annotation. eDGAR reports the shared terms and their Information Content values (IC).
  15. Gene Ontology - Cellular Component annotation. eDGAR reports the shared terms and their Information Content values (IC).
  16. NET-GE enrichment. eDGAR reports the links to the NET-GE submission page where the user may start the enrichment analysis.
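
This page does not state how the Information Content (IC) of a term is computed. A common definition, assumed here purely for illustration, is the negative base-2 logarithm of the fraction of annotated genes that carry the term, so that rarer (more specific) terms score higher. The function name and the counts below are hypothetical.

```python
import math


def information_content(genes_with_term, total_genes):
    """IC of a term as log2(total / annotated).

    This equals -log2(annotated / total). It is a common definition,
    assumed here; it is not confirmed by this help page.
    """
    if not 0 < genes_with_term <= total_genes:
        raise ValueError("need 0 < genes_with_term <= total_genes")
    return math.log2(total_genes / genes_with_term)


# A term carried by every gene is uninformative; a rarer term scores higher.
print(information_content(1024, 1024))  # 0.0
print(information_content(128, 1024))   # 3.0
```

Under this reading, a shared term with a high IC is stronger evidence of a functional relation between genes than a shared term with an IC close to zero.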

Downloads

Users can obtain the raw eDGAR data in csv format. The main table page, as well as gene pages, disease pages and the result pages obtained by searching with a set of genes, contains a link to the corresponding csv file. All the available csv files for genes and diseases are stored in a folder (folder link), and the user may retrieve all files directly from this folder. The csv file of the main table is called maintable.csv, while the csv files for genes and diseases are named after the corresponding gene or disease ID (e.g. 104300.csv for Alzheimer disease).
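
The naming scheme above can be used to work with downloaded files programmatically. The sketch below only builds file names and parses csv text; both function names are hypothetical, and because the column layout of the eDGAR files is not described on this page, the rows are read as plain lists and the sample content is a placeholder rather than real eDGAR data.

```python
import csv
import io


def csv_filename(entry_id=None):
    """File name for a gene or disease ID, or for the main table if no ID is given."""
    return "maintable.csv" if entry_id is None else f"{entry_id}.csv"


def read_rows(csv_text):
    """Parse csv text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(csv_text)))


print(csv_filename())          # maintable.csv
print(csv_filename("104300"))  # 104300.csv

# Placeholder content: the real column names are not documented here.
sample = "col_a,col_b\nvalue_1,value_2\n"
print(read_rows(sample))  # [['col_a', 'col_b'], ['value_1', 'value_2']]
```

For a file saved locally, its contents can be passed to read_rows after reading them with open(csv_filename("104300"), newline="").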

For further information and bug reports, please contact Giulia Babbi at giulia.babbi3@unibo.it