Hu Li Lab Mayo Clinic


Data and Software

  • Hypothesis-Driven AI LIFE

    Hypothesis-driven AI is a new class of AI that has not been previously described. Unlike conventional AI, the design of the learning algorithm in hypothesis-driven AI is guided by an underlying hypothesis that can explain how a system behaves. This new AI technology offers a way to test a hypothesis and make new discoveries using an AI approach. It offers a targeted and informed approach to address many of the challenges in diseases. By centering on specific hypotheses or research questions, it performs focused investigations and uses prior knowledge to guide its exploration. This approach can generate more interpretable and explainable results than conventional AI tools, since the underlying hypotheses provide a mechanistic framework for understanding the logic behind certain predictions or outcomes. Hypothesis-driven AI tends to use resources more efficiently and encourages the integration of domain-specific knowledge to generate meaningful insights within a specific context. It also allows researchers to test and validate hypotheses via AI-mediated "thought experiments" (or "Gedankenexperiment", as Albert Einstein called them), which in turn guide future experimental designs.
    Reference: Cancers 2024.

    Spatially resolved sequencing technologies help us dissect how cells are organized in space. Several available computational approaches focus on the identification of spatially variable genes (SVGs), genes whose expression patterns vary in space. The detection of SVGs is analogous to the identification of differentially expressed genes and permits us to understand how genes and associated molecular processes are spatially distributed within cellular niches. However, the expression activities of SVGs fail to encode all information inherent to the spatial distribution of cells. Here, we devised a deep learning model, Spatially Informed Artificial Intelligence (SPIN-AI), to identify spatially predictive genes (SPGs), genes whose expression can predict how cells are organized in space, without any prior assumptions about spatial distribution. We used SPIN-AI on spatial transcriptomic data from squamous cell carcinoma (SCC) as a proof of concept. Our results demonstrated that SPGs not only recapitulate the biology of SCC but also identify genes distinct from SVGs. Moreover, we found a substantial number of ribosomal genes that are SPGs but not SVGs. Since SPGs can predict spatial cellular organization, we reason that SPGs capture more biologically relevant information for a given cellular niche. Hence, SPIN-AI has broad applications to detect SPGs and uncover which biological processes play important roles in governing cellular organization.
    Reference: Biomolecules 2023.
    SPIN-AI source code is publicly available at Li Lab.

    Artificial neural networks (ANNs) were initially created to model how the human brain works. Over the past few decades, ANNs have evolved into numerous sophisticated algorithms with proven outstanding performance in various recognition tasks. Artificial Neural Network Encoder (ANNE) is a novel weight-engineering deep machine learning method that harnesses the power of autoencoders and demonstrates that it is possible to decode meaningful information encoded in ANN models trained for specific tasks. We applied ANNE to breast cancer gene expression data with known clinical properties as case studies. Our work illustrates that trained autoencoder models are indeed information encoders from which meaningful gene-gene associations, supported by extensive evidence, can be retrieved. ANNE opens a new avenue in machine intelligence in which ANN models are no longer perceived only as tools to perform recognition tasks but also as powerful tools to extract meaningful information embedded within the sea of high-dimensional data.
    Reference: Front Immunol. 2022.
    ANNE source code is publicly available at Li lab GitHub.

    MALANI (Machine Learning-Assisted Network Inference) is a hybrid computational platform that harnesses the power of both machine learning and network biology methodologies to provide new insights and improve understanding of complex biological systems. MALANI assesses all genes, regardless of expression or mutational status, in the context of disease etiology by building more than 2 million machine learning models for reconstructing gene regulatory networks. MALANI has the power to uncover "dark" disease genes that are neither mutated nor differentially expressed but play important pathological roles in disease development.
    Reference: Sci Rep. 2017 Aug 01.
    MALANI source code can be downloaded at Li lab GitHub.
  • Computational Drug Discoveries Platform, Machine Learning, Feature Selection, AI Drug Discoveries

    AI and machine learning methods and feature selection approaches for predicting specific pharmacodynamic, pharmacokinetic, or toxicological properties of pharmaceutical agents are useful for facilitating new drug discovery and development. Pharmaceutical agents are developed and tested for desirable pharmacodynamic and pharmacokinetic properties and a minimal level of toxicity. Computational methods have been explored for predicting these properties, aimed at discovering promising leads and eliminating unsuitable ones in the early stages of drug development. AI, machine learning, and deep machine learning models have shown great potential for predicting these properties across structurally diverse sets of agents, and they have been used to predict agents with a variety of pharmacodynamic, pharmacokinetic, and toxicological properties.
    Reference: J Pharm Sci. 2007.; Drug Development Research. 2006.; J Mol Graph Model. 2006.; J Chem Inf Model. 2005.

    PERsonalized MUtation evaluaTOR (PERMUTOR) is a novel computational pipeline which collects potent disease gene cooperative pathways to envision individualized disease etiology and therapies. Our algorithm constructs individualized disease networks and modules de novo which enable us to elucidate the importance of mutated genes in specific patients and to understand the synthetic penetrance of these genes across patients. Individualized module disruption enables us to devise customized singular and combinatorial target therapies which were highly varied across patients demonstrating the need for precision therapeutics pipelines. As the first analysis of de novo individualized disease networks and modules, we illustrate the power of individualized disease modules for precision medicine by providing deep novel insights on the activity of diseased genes in individuals.
    Reference: Genome Res. 2021.
    PERMUTOR source code is publicly available at Li lab GitHub.

    Gene Utility Model (GUM) is a novel computational pipeline to understand the importance of genes under specific cellular contexts. GUM states that it is the utility of genes that provides selective pressure for the survival and fitness of aberrant cells. Using GUM, it is possible to construct a "utility karyotype" by mapping differentially utilized genes to their respective chromosomal loci. Further, GUM predicts that the resulting utility karyotype can recapitulate, to a certain extent, the chromosomal aberrations observed in diseases.
    Reference: Comput Struct Biotechnol J. 2022.

    RSI (Regulostat Inferelator) is a novel computational algorithm to decipher intrinsic molecular devices called regulostats that predetermine cellular phenotypic responses.
    Reference: Nucleic Acids Res. 2019 May.
    RSI web interface and source code are available at the RSI website portal and Li lab GitHub.
  • Single-nucleus m6A-CUT&Tag (sn-m6A-CT) data analysis GBM

    sn-m6A-CT is a method for simultaneous profiling of m6A methylomes and transcriptomes within a single nucleus. sn-m6A-CT is capable of enriching m6A-marked RNA molecules in situ, without isolating RNAs from cells. sn-m6A-CT profiling is sufficient to determine cell identity and allows the generation of cell-type-specific m6A methylome landscapes from heterogeneous populations.
    Reference: Mol Cell. 2023 Aug 25:S1097-2765(23)00649-4.
    Source code is publicly available at sn-m6A-CT.
  • ASTAR-seq

    ASTAR-seq (assay for single-cell transcriptome and accessibility regions) is an automated, highly sensitive method for simultaneous measurement of the whole-cell transcriptome and chromatin accessibility within the same single cell.
    Reference: Genome Research. 2020 July; Science Advances 2020 September.
    Source code is publicly available at ASTAR-seq.
  • Multi-Regional-GBM-Imaging-and-Genetics GBM

    Reference: Nat Commun. 2023 Sep 28; 14 (1):6066.
    Source code is publicly available at Li Lab.

    EDDI (Expression Dosage Dependent Inferelator) is a machine learning and systems biology approach to characterize dosage-based gene dependencies.
    Reference: J Bioinform Syst Biol. 2021.
    EDDI source code is publicly available at Li lab GitHub.
  • DPYD-Varifier

    DPYD-Varifier (DPYD gene-specific variant classifier) is a highly accurate in silico classifier to predict the functional impact of DPYD variants on DPD activity. DPYD-Varifier has great potential to advance systems pharmacology and individualized medicine and to improve the clinical decision-making process.
    Reference: Clin Pharmacol Ther. 2018 Jan 12.
  • P-Map

    P-Map (Phenotype mapping) is a network-based phenotype mapping approach to identify genes and regulatory networks that modulate drug response phenotypes.
    Reference: Sci Rep. 2016 Nov 14.
    P-Map source code can be downloaded at Li lab GitHub.
  • NetDecoder

    NetDecoder is a network biology computational platform to dissect context-specific biological networks and gene activities. NetDecoder provides freely available source code and web portal resource for researchers to explore genome-wide context-dependent information flow profiles and key genes using pairwise phenotypic comparative analyses. NetDecoder also allows researchers to prioritize drug targets for genes that affect pathological contexts.
    Reference: Nucleic Acids Res. 2016 Mar 14.
    NetDecoder web interface and other materials are available at the website portal.
    NetDecoder source code can be downloaded at Li lab GitHub.
    For support of NetDecoder, please subscribe to our web forum.
  • CellNet

    CellNet is a network biology-based computational platform that more accurately assesses the fidelity of cellular engineering than existing methodologies and generates hypotheses for improving cell derivations.
    Reference: Cell. 2014 Aug 14;158(4):903-15.; Cell. 2014 Aug 14;158(4):889-902.
    CellNet web interface and other materials are available at the website portal.
  • Modified RNA

    Highly efficient reprogramming to pluripotency and directed differentiation of human cells with synthetic modified mRNA.
    Reference: Cell Stem Cell. 2010.
  • StemSite

    StemSite is a database of the regulatory networks underlying the developmental origin of mouse hematopoietic stem cells.
    Reference: Cell Stem Cell. 2012 Nov 2; 11(5):701-14.
    StemSite Database is available here.
  • MNI

    MNI (Mode-of-action by Network Inference) is a reverse engineering network biology algorithm to identify the gene targets and key mediators of a biomedical phenotype based on transcriptome data.
    Reference: Nat Biotechnol. 2005 Mar;23(3):377-83.
    Reference: Sci Transl Med. 2014 Jan 1;6(217):217ra2.
    MNI source code can be downloaded here.
  • CLR

    CLR (Context Likelihood of Relatedness) is a network biology algorithm to reverse-engineer and infer regulatory interactions between master regulators and their targets using a compendium of transcriptome profiles.
    Reference: PLoS Biol 5(1): e8.
    CLR source code can be downloaded here.
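
    The core idea behind CLR can be sketched in a few lines of Python. This is an illustrative sketch only, not the released CLR source code: a simple equal-width histogram estimate of mutual information stands in for whatever estimator the released code uses, and the number of bins, the exclusion of the diagonal from the background, and all function names are assumptions made for this example. Each pairwise mutual information value is compared against the background distribution of values for both genes, negative z-scores are clipped to zero, and the two z-scores are combined into one score.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=3):
    """Plug-in mutual information (nats) between two equal-width-binned profiles."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant profiles
        return [min(int((v - lo) / width), bins - 1) for v in values]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    joint, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def mi_matrix(profiles, bins=3):
    """Symmetric matrix of pairwise mutual information; the diagonal is left at 0."""
    n = len(profiles)
    mi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mi[i][j] = mi[j][i] = mutual_information(profiles[i], profiles[j], bins)
    return mi


def clr_scores(mi):
    """Score each pair against the MI background of both genes (assumed form)."""
    n = len(mi)
    mean, std = [], []
    for i in range(n):
        background = [mi[i][j] for j in range(n) if j != i]
        m = sum(background) / len(background)
        var = sum((v - m) ** 2 for v in background) / len(background)
        mean.append(m)
        std.append(math.sqrt(var) or 1.0)  # avoid division by zero
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                zi = max(0.0, (mi[i][j] - mean[i]) / std[i])
                zj = max(0.0, (mi[i][j] - mean[j]) / std[j])
                scores[i][j] = math.sqrt(zi * zi + zj * zj)
    return scores


# Toy compendium: gene 1 is a scaled copy of gene 0; genes 2 and 3 are unrelated.
profiles = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8],
    [0, 2, 4, 6, 8, 10, 12, 14, 16],
    [0, 4, 8, 1, 5, 7, 2, 3, 6],
    [0, 3, 6, 4, 7, 1, 8, 2, 5],
]
scores = clr_scores(mi_matrix(profiles))
```

    In this toy example only the pair (gene 0, gene 1) stands out from the background, so it is the only pair with a non-zero score.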
  • GEDI

    GEDI (Gene Expression Dynamics Inspector), developed by Dr. Ingber's Lab, is a computational program that opens a new perspective on the analysis of transcriptome data. By treating each high-dimensional sample, such as one transcriptome experiment, as an object, it accentuates and visualizes the genome-wide response of a tissue or a patient and treats it as an integrated biological entity. GEDI honors the new spirit of a system-level approach in biology and unites a novel holistic perspective with the traditional gene-centered approach in molecular biology.
    Reference: Bioinformatics. 2003 Nov 22;19(17):2321-2.
    GEDI source code can be downloaded here.
    For general questions on GEDI source code, please contact Dr. Donald Ingber or Hu Li.
  • Pathway Modelling and Simulation

    One of the most commonly used approaches to model biological systems is ordinary differential equations (ODEs). In general, a differential equation describes the rate of change of a molecular species over time as a function of the concentrations of the species participating in the reaction. The temporal dynamic behavior of molecular species in a biological signaling pathway network can be captured by a set of coupled ODEs.
    Reference: Bioinformatics. 2009.; Cancer. 2009.; FEBS Lett. 2008.
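
    As a minimal illustration of coupled ODEs for a pathway step, the sketch below simulates a single reversible binding reaction A + B <-> C under mass-action kinetics. The reaction, the rate constants, the initial concentrations, and the fixed-step fourth-order Runge-Kutta integrator are all assumptions chosen for the example; they are not taken from the referenced studies.

```python
def mass_action_rhs(state, kf, kr):
    """Time derivatives (dA/dt, dB/dt, dC/dt) for A + B <-> C."""
    a, b, c = state
    flux = kf * a * b - kr * c  # net forward reaction rate
    return (-flux, -flux, flux)


def rk4_step(f, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
        for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )


def simulate(state, kf, kr, dt=0.01, t_end=20.0):
    """Return the trajectory of (A, B, C) sampled every dt up to t_end."""
    steps = int(round(t_end / dt))
    trajectory = [tuple(state)]
    for _ in range(steps):
        state = rk4_step(lambda s: mass_action_rhs(s, kf, kr), state, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical parameters: kf = 1.0, kr = 0.5, starting from A = B = 1, C = 0.
trajectory = simulate((1.0, 1.0, 0.0), kf=1.0, kr=0.5)
a_end, b_end, c_end = trajectory[-1]
```

    With these values the system relaxes toward the equilibrium where kf * A * B equals kr * C, and the totals A + C and B + C stay constant throughout, which is a quick sanity check on any mass-action model.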

© 2024 H Li • All Rights Reserved