
Breakthrough Technique Predicts Gene Regulation at the Single-Cell Level Using Deep Learning

A new technique has been developed that is expected to advance our knowledge of numerous underlying biological processes, including those implicated in complex diseases like cancer. Using machine learning, a form of artificial intelligence, scientists can now predict gene regulation at the level of individual cells, a task that until now has been nearly impossible to perform accurately.

An important goal in epigenetic research is to identify regions in the genome that are vulnerable to molecular factors that can alter gene expression without modifying the underlying DNA code. These factors include mechanisms such as DNA methylation, histone modifications, non-coding RNA expression, and chromatin structural changes. Because these mechanisms regulate many vital biological processes, such as those controlling cell division and differentiation, understanding them holds immense promise for future medical applications.

Over the years, researchers have improved their understanding of epigenetic modifications and the computational approaches needed to identify these changes and where they occur along the genome. However, extracting and analyzing the data has been time-consuming and expensive. In addition, most techniques are designed to assess genome-wide binding profiles across a cell population rather than at the single-cell level. Sparse and noisy datasets have made it challenging to study transcription factor (TF) binding within an individual cell.

Scientists at the University of California, Irvine, have now addressed these challenges by drawing on expertise across multiple departments. In a recent study published in Science Advances, the team describes a deep-learning framework based on artificial neural networks that predicts TF binding at the single-cell level. Called the single-cell factor analysis network, or scFAN for short, the pipeline consists of a “pre-trained model” trained on large amounts of genomic and epigenetic data, which can then predict TF binding in individual cells. scFAN incorporates DNA sequence data, ChIP-sequencing (ChIP-seq) data, single-cell ATAC-seq (scATAC-seq) data aggregated from similar cells, and mappability data.
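To make the idea of combining these inputs concrete, here is a minimal Python sketch of how a DNA sequence, an accessibility signal pooled from similar cells, and a mappability track might be stacked into one feature matrix for a neural network. It is an illustration only and not the scFAN implementation: the function names, the cosine-similarity neighbour search, the choice of `k`, and the channel layout are all assumptions made for this example. ChIP-seq data is left out because the article does not say how it enters the model.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) matrix; unknown bases such as N stay all-zero."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def aggregate_similar_cells(cell_signals, cell_index, k=2):
    """Average one cell's accessibility track with those of its k most similar cells.

    Pooling similar cells is one way to soften the sparsity of single-cell data.
    Cosine similarity is an assumption chosen for this sketch.
    """
    cell_signals = np.asarray(cell_signals, dtype=np.float64)
    target = cell_signals[cell_index]
    norms = np.linalg.norm(cell_signals, axis=1) * np.linalg.norm(target)
    sims = cell_signals @ target / np.where(norms == 0, 1.0, norms)
    sims[cell_index] = -np.inf  # exclude the cell itself from the neighbour search
    neighbours = np.argsort(sims)[-k:]
    members = np.append(neighbours, cell_index)
    return cell_signals[members].mean(axis=0)

def build_input(seq, cell_signals, cell_index, mappability, k=2):
    """Stack sequence, pooled accessibility, and mappability into a (length, 6) matrix."""
    access = aggregate_similar_cells(cell_signals, cell_index, k)
    return np.column_stack([one_hot(seq), access, np.asarray(mappability, dtype=np.float64)])

# Toy data: 4 cells, one 5-base window (values are made up for illustration).
cells = np.array([
    [1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0],
])
mapp = np.array([1.0, 1.0, 0.5, 1.0, 0.0])
features = build_input("ACGTN", cells, cell_index=0, mappability=mapp, k=1)
print(features.shape)
```

In a real pipeline, a matrix like `features` would be built for each genomic window and cell, then passed to a trained network that outputs binding predictions per transcription factor.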

“The breakthrough was in realizing that we could leverage deep learning and massive datasets of tissue-level TF binding profiles to understand how TFs regulate target genes in individual cells through specific signals,” said co-author Xiaohui Xie, UCI professor of computer science.

While almost every cell in the body has the same genomic sequence, not every tissue is the same. The variation among cells, seen as different phenotypes, arises from the specific subset of instructions each cell uses, which is largely controlled by transcriptional regulatory pathways. Transcription factors (TFs) and other proteins orchestrate this vital process and help determine whether a gene is turned on or off by binding to nearby DNA or RNA. In general, TFs allow cells to perform their operations in the proper sequence, bringing together various sources of information to decide how and when a gene is expressed.

Co-senior author Qing Nie, UCI Chancellor’s Professor of Mathematics, believes that the ability to predict whether TFs bind to DNA in a specific cell or cell type, and at what time, “provides a new way to tease out small populations of cells that could be critical to understanding and treating diseases.”

“At the bulk level, we found that scFAN can predict TF binding motifs more accurately than other deep learning models,” wrote the researchers. “At the single-cell level, scFAN robustly identifies cellular identities, even in cells that are genetically similar.” They also report that scFAN allows for more accurate identification of distinct cell types at the chromatin accessibility level and can handle batch effects across multiple samples.

It is becoming more evident that deep-learning technologies like scFAN can greatly improve our understanding of previously unexplored domains, especially in epigenetic research. But these techniques are still limited by the quality and quantity of the data available. The current study had some shortcomings in data compilation and used just three cell lines for its model. Still, the tool’s potential is promising, especially since more TF-related data can be incorporated into the scFAN model, which could improve its predictions.

Nie noted that scientists could use this new deep-learning method to identify key signals in small cell populations that are notoriously difficult to quantify or target in treatment, such as cancer stem cells. He added, “This interdisciplinary project is a prime example of how researchers with different areas of expertise can work together to solve complex biological questions through machine-learning techniques.”

Source: Fu L. et al. (2020). Predicting transcription factor binding in single cells through deep learning. Science Advances. 6(51).

Reference: Bell B. (2021). UCI Researchers use deep learning to identify gene regulation at single-cell level. University of California – Irvine.

Natalie Crowley

