Katherine James, Jennifer Hallinan, Anil Wipat, Wan-Fai Ng
Objectives: Most experimental datasets can be represented as networks of parts and interactions. A network representation presents biological data in a form that is both tractable for human visual study and amenable to computation. One of the most powerful approaches to the integration of heterogeneous data is the use of probabilistic functional integrated networks (PFINs), since these networks carry statistical weights that indicate the level of confidence in the evidence for each interaction. These weights allow the use of a variety of statistical algorithms that take interaction confidence into account. This study aims to build an immune-specific PFIN by integrating a number of relevant experimental datasets and to apply it to the study of primary Sjögren's Syndrome (pSS) and related diseases.
Methods: Functional interaction data were sourced from the BioGRID and InnateDB resources. Datasets were confidence-scored against a metabolic pathway Gold Standard dataset derived from the BioSystems Database. For each individual interaction, the confidence scores were integrated using a weighted sum. The proteins of the network were then annotated using Gene Ontology biological process terms. Finally, the network was filtered to produce a sub-network of immune proteins and their high-confidence interaction partners. The final network was assessed for its ability to predict known autoimmune disease-related proteins before being applied to the prediction of novel pSS-associated proteins and to the comparison of autoimmune diseases using a variety of network analysis techniques.
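The integration and filtering steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the form of the weighted sum, so the rank-based decay used here (the highest dataset score counts in full, and each subsequent score is divided by a further factor of d) is an assumption, as are the function names, the parameter d, the confidence threshold, and the toy inputs.

```python
from collections import defaultdict


def integrate_scores(scores, d=2.0):
    """Combine the confidence scores supporting one interaction.

    ASSUMPTION: a rank-weighted sum. Scores are sorted in descending
    order and the i-th score (0-based) is divided by d**i, so the
    strongest evidence dominates. The source only says "weighted sum".
    """
    ranked = sorted(scores, reverse=True)
    return sum(s / (d ** i) for i, s in enumerate(ranked))


def build_network(datasets, d=2.0):
    """Build an integrated network from scored datasets.

    `datasets` is a list of (dataset_score, interactions) pairs, where
    `interactions` is an iterable of (protein_a, protein_b) tuples.
    Every interaction inherits the score of the dataset reporting it.
    Edges are undirected, so each is keyed by a frozenset of endpoints.
    """
    evidence = defaultdict(list)
    for dataset_score, interactions in datasets:
        for a, b in interactions:
            evidence[frozenset((a, b))].append(dataset_score)
    return {edge: integrate_scores(s, d) for edge, s in evidence.items()}


def immune_subnetwork(network, immune_proteins, threshold):
    """Keep high-confidence edges that touch at least one immune protein.

    `immune_proteins` is a set of protein identifiers (in the study these
    would come from Gene Ontology annotation; here it is a toy input).
    `threshold` is a hypothetical confidence cut-off.
    """
    return {
        edge: weight
        for edge, weight in network.items()
        if weight >= threshold and edge & immune_proteins
    }
```

With this weighting, an interaction reported by two datasets scored 3.0 and 1.0 receives an integrated weight of 3.0 + 1.0/2 when d = 2. Setting d = 1 reduces the scheme to a plain sum of the dataset scores.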
Results: A probabilistic functional integrated network of immunity was produced. The core immune PFIN contained ~1,700 proteins involved in >10,000 interactions. Clustering of the network based on interaction confidence revealed distinct patterns of interaction between pSS-associated proteins and several biological processes, in particular the stress responses. In addition, a ranked list of candidate pSS-associated genes was produced.
Conclusion: Probabilistic network analysis is a powerful approach to data integration and the study of human disease. The immune PFIN generated by this work provides a valuable resource for the future study of pSS and its comparison with other autoimmune diseases.