ExTRI: Extracted Transcription Regulation Interactions

Previously, several Transcription Factor – Target Gene (TF-TG) resources have been generated by manual curation after mining the literature with PubMed queries (HTRIdb) or text mining (TRRUST). For each of these resources, the initial information extraction mainly served to generate source material for manual curation, during which relevant abstracts and information were selected.

Here, we present the results of a machine-learning-assisted text mining approach that allowed the automatic extraction of TF-TG information with high precision and recall. We provide a list of additional resources produced by this work, among them the ExTRI sentences, which comprise more than 40,000 unique TF-TG interactions retrieved from abstracts across the entire Medline. Transcription Factors (TFs) were restricted to those evidenced by TFClass (Wingender et al. 2015).

All interactions have been converted to RDF specifying Transcription Factor to Target Gene interactions. Together with the TF-TG data from TFactS, TRRUST, Signor, GEREDB, CytReg and HTRIdb, they constitute the tfact2gene graph of the BioGateway triple store, which is available for network building through the BioGateway Cytoscape App.
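
To illustrate what an RDF representation of TF-TG interactions can look like, the following is a minimal Python sketch that serializes (TF, target gene) pairs as N-Triples lines. The namespace, the `regulates` predicate, the `to_ntriples` helper and the placeholder identifiers are all assumptions made for illustration; the text does not specify the URIs or the vocabulary actually used in the tfact2gene graph.

```python
# Hypothetical namespace and predicate: the real BioGateway URIs are not
# given in the text, so example.org placeholders are used here.
BASE = "http://example.org/tfact2gene/"
REGULATES = BASE + "regulates"


def to_ntriples(pairs):
    """Serialize (TF, target gene) pairs as N-Triples lines.

    Duplicate pairs collapse into a single triple, and the result is
    sorted so the output is deterministic.
    """
    lines = set()
    for tf, tg in pairs:
        lines.add(f"<{BASE}{tf}> <{REGULATES}> <{BASE}{tg}> .")
    return sorted(lines)


# Placeholder identifiers only, not taken from the ExTRI data.
example_pairs = [("TF_A", "GENE_X"), ("TF_B", "GENE_Y"), ("TF_A", "GENE_X")]
triples = to_ntriples(example_pairs)
for line in triples:
    print(line)
```

Each output line is one subject-predicate-object statement, so the repeated pair above yields a single triple. A production graph would typically attach further information to each interaction, such as the supporting sentence or source database, which this sketch leaves out.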

The combined tfact2gene resource represents a valuable contribution to the systems-oriented Gene Regulation Knowledge Commons and serves as an important platform for generating and retrieving information and knowledge about parts of the Gene Regulation domain that still need further analysis.