Bringing the dark matter of proteins to light using deep learning

Ilham Abbasi

ProtENN, a deep learning approach for protein function prediction, increases coverage of the protein family database, Pfam, by 9.5%, comparable to the coverage achieved over a decade by alignment-based methods.

Protein domains with similar amino acid sequences tend to have similar functions. This principle is the backbone of existing computational tools that leverage sequence homology to predict protein function. Despite their success in providing functional annotations for a large number of proteins, these algorithms struggle to predict the function of proteins with low sequence homology to known proteins. However, recent work by Google Research presents a deep learning solution called ProtENN, which has produced functional annotations for 6.8 million previously unannotated protein domains1.

State-of-the-art methods for protein function prediction, such as the protein Basic Local Alignment Search Tool (BLASTp), rely primarily on pairwise alignment2. In these methods, a protein sequence is aligned to sequences of known function. If the sequences share at least 30% sequence identity, they are inferred to share function. To refine these techniques, probability-based methods have been introduced that model the degree of conservation in a multiple sequence alignment. For instance, tools built on profile hidden Markov models (HMMs), such as HMMER, compare the protein sequence to a profile HMM that represents a known protein domain or protein family3. If the sequence matches a profile HMM, its function can be inferred.
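The identity threshold described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-"), and the function names, example sequences and the way gapped columns are skipped are illustrative assumptions. Real tools such as BLASTp score alignments statistically rather than applying a single cutoff.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def inferred_to_share_function(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """Apply the 30% rule of thumb mentioned in the text (threshold is adjustable)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up example sequences, already aligned; one gap in the query.
query = "MKT-AYIAKQR"
known = "MKTLAYLGKQR"
identity = percent_identity(query, known)
```

Here 8 of the 10 ungapped columns match, so the pair clears the threshold comfortably. Sequences in the so-called twilight zone near the cutoff are where this kind of inference becomes unreliable.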

Although these methods have advanced protein function prediction, Pfam, the well-known database for protein annotation (now hosted by InterPro4), has seen a mere 5% expansion in coverage over the past 5 years5. Dependence on sequence alignment limits the ability of such approaches to annotate proteins that diverge in sequence from known protein families, as well as families that contain relatively few sequences. Additionally, proteins are not simply linear chains: their secondary and tertiary structure can influence function, which alignment-based methods fail to consider1.

To overcome the limitations of alignment-based approaches, Bileschi and colleagues1 propose a deep learning model that predicts protein function without relying on sequence alignment (fig. 1). They use a one-dimensional convolutional neural network (CNN) that classifies proteins into one of 17,929 possible functional classes found in the Pfam database. Their model, ProtENN, considers both local and global sequence information to recognize characteristics that are indicative of specific functions. Within ProtENN, a filter moves along the input amino acid sequence to identify features and patterns. These patterns are then processed through multiple layers of the model, with higher layers identifying increasingly complex patterns. The function of novel protein domains can then be predicted, offering a quick, automated approach to annotation with minimal human intervention.
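The idea of a filter moving along the sequence can be sketched in a few lines of plain Python. This is not the authors' architecture: the one-hot encoding, the hand-set filter and the made-up sequence are assumptions chosen to show a single one-dimensional convolution. In a trained network the filter weights are learned and many filters are stacked in layers.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode each residue as a 20-element indicator vector."""
    return [[1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS] for residue in sequence]


def conv1d(encoded, kernel):
    """Slide a (width x 20) kernel along the sequence, one activation per window."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            row = encoded[start + offset]
            weights = kernel[offset]
            total += sum(x * w for x, w in zip(row, weights))
        activations.append(max(0.0, total))  # ReLU non-linearity
    return activations


def motif_kernel(motif):
    """Hypothetical hand-set filter that responds most strongly to `motif`."""
    return one_hot(motif)


# Made-up sequence containing the made-up motif "HGS" twice.
activations = conv1d(one_hot("MKHGSCAAHGS"), motif_kernel("HGS"))
strongest = max(activations)  # global max pooling over positions
```

The filter fires at the two windows where the motif occurs and stays silent elsewhere. Pooling over positions then tells later layers that the pattern is present, regardless of where it sits in the sequence.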

Figure 1: Comparison of ProtENN with alignment-based approaches. In alignment-based algorithms, an unknown protein sequence is aligned to known proteins and sequence similarity is used to infer protein domain function. In the ProtENN deep learning algorithm, an unknown protein sequence is processed through multiple model layers, which outputs a predicted protein domain classification without sequence alignment.

The greatest challenge in developing an accurate model for protein function prediction lies not in building the model itself, but in designing training and test datasets that represent diverse sequences and prevent model bias1. To account for this, sequences in Pfam obtained from UniProtKB reference proteomes were split into training and test sets either (1) randomly or (2) by grouping related sequences together and placing each entire group in either the training set or the test set. The latter ensures that sequence similarity between the two sets is low, so that test performance reflects how accurately the model classifies proteins with low similarity to anything it was trained on.
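The two splitting strategies can be contrasted with a minimal sketch. The record layout (a sequence identifier paired with a group label), the group names and the 20% test fraction are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import defaultdict


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle individual sequences, so related sequences can land on both sides."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def grouped_split(records, test_fraction=0.2, seed=0):
    """Assign whole groups of related sequences to either train or test."""
    groups = defaultdict(list)
    for seq_id, group_id in records:
        groups[group_id].append((seq_id, group_id))
    group_ids = sorted(groups)
    rng = random.Random(seed)
    rng.shuffle(group_ids)
    n_test = max(1, int(len(group_ids) * test_fraction))
    train = [r for g in group_ids[n_test:] for r in groups[g]]
    test = [r for g in group_ids[:n_test] for r in groups[g]]
    return train, test


# Hypothetical records: (sequence id, group of similar sequences).
records = [
    ("seq1", "groupA"), ("seq2", "groupA"), ("seq3", "groupB"),
    ("seq4", "groupB"), ("seq5", "groupC"), ("seq6", "groupD"),
    ("seq7", "groupD"),
]
train, test = grouped_split(records)
```

With the grouped split, no group label appears on both sides, so a model cannot score well on the test set simply by recognising near-duplicates of its training sequences.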

To benchmark model performance, the team at Google Research compared ProtENN against the well-established alignment-based methods, BLASTp and HMMER. Remarkably, ProtENN outperformed the two methods, achieving the lowest error rate and highest accuracy in both the random and grouped split datasets. This showcases ProtENN’s ability to make accurate predictions for diverse sequences.

Strikingly, the authors found that combining ProtENN with alignment-based methods improves prediction accuracy beyond what either method achieves alone. Combining ProtENN with HMMER not only reduced error rates by a further 38.6%, but also increased protein coverage in Pfam by 9.5%, or 6.8 million sequence regions. This added annotations for 1.8 million full-length proteins that previously had none, including 360 human proteins. These annotations have been publicly released as Pfam-N, available on the European Bioinformatics Institute website. Following this work, Pfam-N now contains 5.2 million protein sequences, expanding UniProtKB reference proteome coverage by 8% (fig. 2)4.

Figure 2: Pfam coverage over the last decade. The orange line depicts the number of annotations matched to UniProtKB reference proteomes with each Pfam update. The blue line depicts the number of Pfam annotations added by Pfam-N. Figure taken from 1.

As an emerging area of proteomics, deep learning still faces many challenges. The information ProtENN uses to make predictions is largely unknown. Uncovering this information is crucial to understanding the relationship between protein sequence and function, but it remains a difficult task6. Additionally, deep learning models rely heavily on a high volume of sequence data to learn meaningful patterns. To overcome this, a machine learning technique called transfer learning has recently been tested in conjunction with ProtENN and shown to further increase prediction accuracy7. This suggests that, despite its limitations, deep learning will likely become a core component of future tools for protein function prediction.

Alongside these advancements, integrative models will likely be developed that combine deep learning with approaches that consider protein information beyond sequence, such as structure and phylogenetic relationships. Such models will be useful for developments in biomedicine and therapeutics, such as de novo protein design, which requires precise protein sequence evaluation and functional prediction8. To make ProtENN easier to use and build upon for various applications, the authors have made the information used to build it publicly available.

As public protein databases continue to grow, accurate protein function prediction becomes increasingly important. To meet this challenge, ProtENN has paved the way for the use of deep learning in protein classification. The approach is still in its infancy, and ProtENN’s full capabilities are only beginning to be explored.


1.        Bileschi, M. L. et al. Using deep learning to annotate the protein universe. Nat Biotechnol 40, 932–937 (2022).

2.        Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J Mol Biol 215, 403–410 (1990).

3.        Finn, R. D., Clements, J. & Eddy, S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res 39, W29 (2011).

4.        Paysan-Lafosse, T. et al. InterPro in 2022. Nucleic Acids Res 51, (2023).

5.        Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Res 49, D412–D419 (2021).

6.        de Crécy-Lagard, V. et al. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022, (2022).

7.        Bugnon, L. A. et al. Transfer learning: The key to functionally annotate the protein universe. Patterns 4, 100691 (2023).

8.        Unsal, S. et al. Learning functional properties of proteins with language models. Nat Mach Intell 4, 227–245 (2022).

Epigenetics behind age reversal in mice: what does that mean for humans?

Syed A K Shifat Ahmed

New research findings in mice suggest that old cells retain a copy of their young state that can be reactivated to regain phenotypes lost to aging.

Let’s admit it: consciously or subconsciously, we have all been lured by that anti-aging skincare commercial on a roadside billboard or that anti-aging hack on our social feed. The desire to look young and avoid age-related health complications is real. Science has been trying for decades to understand and control aging1. It is known that specific signatures such as methylation marks tag stretches of DNA. Collectively known as the epigenome, these marks are critical in deciding which DNA sites are functional, a biological phenomenon called epigenetics2. A recent article published in Cell by Yang et al. (2023) provides evidence that disrupting the epigenome accelerates molecular, cognitive, and physiological aging in mice3. Conversely, restoring the disrupted areas reset the epigenome to its younger biological state3. This suggests that epigenetic information is not completely lost; rather, the cell saves the information in a form that can be retrieved upon appropriate activation (Figure 1a). Understanding this age-reversal mechanism could open the door to novel therapies for a host of age-related disorders.

Aging has largely been associated with mutations or other unwanted changes in DNA as the cell responds to DNA damage such as double-stranded breaks (DSBs)1,4. However, recent research suggests that aging-associated changes are also triggered by loss of epigenetic information4,5. A previous study from this research group showed that, when subjected to DNA damage, cells recruit special chromatin repair proteins to the sites of damage, where they are tasked with repairing the errors5. These proteins would normally leave after fixing the damage, allowing the DNA to return to its usual compactness. But with repeated cycles of DNA damage and repair, these proteins may get displaced inappropriately. Since DNA compactness influences which genes are exposed and expressed, this displacement can result in altered expression of genes that are critical to aging.

In this study, the researchers investigated whether regulating the epigenetic landscape could alter cell aging3. They tested this using genetically modified mice carrying a gene for a scissor-like enzyme called I-PpoI endonuclease. Upon induction with the drug tamoxifen, which is known to increase oxidative cellular stress, the enzyme produced cuts in non-protein-coding DNA regions, mimicking the DSBs seen in the physiological state. The increased DNA damage induced in the transgenic mice resulted in rapid modification of the epigenetic landscape, and the system was termed ICE (inducible changes to the epigenome). Both the transgenic ICE mice (with I-PpoI endonuclease) and non-transgenic control mice (without I-PpoI endonuclease) were treated with tamoxifen for 3 weeks, and their phenotypes were then observed over time. There were no noticeable differences in the first 4-6 months post-treatment. However, after 10 months the ICE mice started exhibiting features typical of aging, such as grey hair, reduced body weight, lower activity, and cognitive decline, all of which were absent in the control group (Figure 1b). This was attributed to a higher rate of DNA damage and epigenetic reshuffling caused by endonuclease-mediated DNA cutting in the ICE mice. ChIP sequencing, which is used to study interactions between epigenetic regulators and DNA, confirmed increased epigenetic disruption in the ICE mice, which advanced the epigenetic clock by 50% and caused the ICE mice to age biologically faster3.

Having provided evidence that epigenome erosion accelerates aging, the team tested whether the epigenome could be restored to its original landscape in post-treatment ICE mice. The researchers used a subset of Yamanaka factors called OSK3. The Yamanaka factors are proteins known for their role in reprogramming adult cells into embryonic-like stem cells, and they can alleviate old-age phenotypes and increase the lifespan of progeroid mice6. In this study, cyclic expression of Yamanaka factors reversed age-associated gene expression in ICE mice, with genes associated with chromatin modification showing expression profiles similar to those of younger mice3 (Fig 1c). The results consolidated earlier findings from this research group, in which Yamanaka factors were used to cure blindness in mice by restoring the youthful epigenome7.

The concept of age reversal has attracted a lot of interest among researchers and investors. While our lifespan has improved considerably, has our health span improved equally? Exploring the mechanisms involved in aging and cell rejuvenation could pave the way for novel treatment interventions for conditions like cancer, diabetes, and blindness. This work has shown that our perception of cell aging as being driven only by the accumulation of DNA mutations is a bit misleading. When DNA from ICE mice and non-ICE mice was sequenced, it did not reveal significant differences, despite the former showing more pronounced aging phenotypes3. This led the authors to conclude that epigenetics holds cues to cell aging, and that reprogramming the epigenome would be a more feasible option than correcting mutations in the DNA as scientists continue to tackle cell aging.

When Steve Horvath first developed the concept of an epigenetic clock to measure biological age, based on the epigenetic markers the DNA has accumulated, scientists thought of dialing this clock up and down to regulate aging8. The current work not only gives hope of gaining control of such a dial but also shows promise in reversing age by retrieving the encrypted copy of the “young epigenome” information. At present, how and where this information is stored, and what signals could authorise the cell to download this epigenetic software permanently, are questions for further investigation.

Figure 1: Regulating the epigenetic landscape to accelerate and reverse aging in mice. a) Chromatin reorganization and restoration following repair of double-stranded breaks (DSBs). When the chromatin modifier gets displaced, it induces changes in the epigenetic landscape that promote normal aging. In the study, the epigenetic clock was accelerated by increasing DNA damage in ICE (inducible changes to the epigenome) mice, which caused changes in gene expression and promoted faster aging. b) The epigenome was restored by dialing the epigenetic clock backwards using OSK proteins. The OSK proteins aided reprogramming of the epigenome, which returned ICE mice to their “youthful” state. c) ICE and control (CRE) mice from the same litter post-treatment (1 month and 10 months), showing accelerated aging in ICE mice. Adapted from Yang et al. 20233.


  1. Melzer, D., Pilling, L.C. & Ferrucci, L. The genetics of human ageing. Nat Rev Genet 21, 88–101 (2020).
  2. Benayoun, B., Pollina, E. & Brunet, A. Epigenetic regulation of ageing: linking environmental inputs to genomic stability. Nat Rev Mol Cell Biol 16, 593–610 (2015).  
  3. Yang, J.-H. et al. Loss of epigenetic information as a cause of mammalian aging. Cell 186, (2023).
  4. Tian, X. et al. SIRT6 is responsible for more efficient DNA double-strand break repair in long-lived species. Cell 177, (2019).
  5. Oberdoerffer, P. et al. SIRT1 redistribution on chromatin promotes genomic stability but alters gene expression during aging. Cell 135, 907–918 (2008).
  6. Ocampo, A. et al. In vivo amelioration of age-associated hallmarks by partial reprogramming. Cell 167, (2016).
  7. Lu, Y. et al. Reprogramming to recover youthful epigenetic information and restore vision. Nature 588, 124–129 (2020).
  8. Horvath, S. DNA methylation age of human tissues and cell types. Genome Biology 14, (2013).

Mapping the genetic risk of common diseases with blood metabolites

Nowrin Aman

What if the key predictors of common diseases lie in the components of blood? Genome-wide association studies have identified blood metabolites as risk factors and potential treatment targets for 12 common traits and diseases.

Human blood is a mystery. It carries small biochemical constituents called metabolites, small molecules essential for maintaining normal homeostasis and physiological processes (Figure 1)1,2. Metabolites are produced by various catalytic metabolic reactions and go on to form a network of chemical reaction pathways known as the metabolome2. The metabolome presents a vivid picture of how the cell operates, and traces back to how the genome acts to give a functional product for everyday life1. In a recently published Nature Genetics article, Chen et al.3 elucidated the pivotal role of blood metabolites as functional genomic candidates in determining treatment strategies for common traits and diseases. Defining the specific function of these molecules, together with the cellular pathways they are involved in, deviations from normal physiology, and heritability, can provide indications of disease3–5.

Figure 1. Illustration showing a conversation between a patient and physician unveiling the mystery of blood metabolites and related genes for treatment.

Metabolite-ratios estimate the proportion of substrate to product for an enzymatic reaction3. Chen et al. explored this relationship by using genome-wide association studies (GWAS) to identify metabolites and metabolite-ratios with their related genes and chromosome positions, in addition to their causal effects in disease mechanisms. Twelve common traits and diseases were selected as study outcomes in three categories: 1) estimated bone mineral density (eBMD) from ultrasound measurements, Alzheimer’s disease, Parkinson’s disease and osteoarthritis, influenced by aging; 2) body mass index (BMI), coronary artery disease (CAD), ischemic stroke, and type 2 diabetes (T2D), influenced by metabolism; 3) type 1 diabetes (T1D), inflammatory bowel disease (IBD), multiple sclerosis (MS) and asthma, influenced by immunity (Figure 2)3.
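To make the metabolite-ratio idea from the start of this passage concrete, here is a minimal sketch that computes a substrate-to-product ratio per participant. The participant labels and levels are made-up illustrative values, and taking the natural logarithm of the ratio is an assumption about a convenient scale, not a description of the authors' pipeline.

```python
import math


def log_ratio(substrate_level: float, product_level: float) -> float:
    """Natural log of substrate/product; 0 means the two levels are equal."""
    if substrate_level <= 0 or product_level <= 0:
        raise ValueError("metabolite levels must be positive")
    return math.log(substrate_level / product_level)


# Hypothetical plasma levels (arbitrary units): (substrate, product).
levels = {
    "participant_1": (8.0, 2.0),  # substrate accumulates relative to product
    "participant_2": (3.0, 3.0),  # balanced
}
ratios = {pid: log_ratio(s, p) for pid, (s, p) in levels.items()}
```

A ratio that differs between participants in this way can then be treated as a trait in its own right and tested for association with genetic variants, which is why a variant in the gene for the converting enzyme may stand out more clearly for the ratio than for either metabolite alone.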

Gene-metabolite associations have been studied in GWAS since 2012, an approach often termed “metabolomics profiling”2–4,6,7. However, sample size limitations made it difficult to infer the causal role of the functional variants, a gap that Chen and colleagues addressed. They ran a series of large GWAS encompassing 1091 metabolites and 309 metabolite-ratios8. Participants had genome-wide genotyping and circulating plasma metabolite levels measured. The analysis not only highlighted the eight previously known superpathways to which the molecules belong, but also uncovered further pathways for metabolites linked to multiple gene associations, the majority of them for lipids and amino acids3,6. For example, genetic markers were identified for the ratios of phosphate to 21 metabolites from 4 different superpathways. The authors reported around 20% heritability for metabolites and up to 84% for metabolite-ratios.

Figure 2. An overview of the study showing3: 1) study cohort used8 2) disease mechanisms selected on possible genes relating to common traits and diseases as outcomes measured; comprehensive list of main causal metabolites and metabolite-ratios identified in the study; Orotate is identified as a major risk for incidental hip fractures by correlating with eBMD. Highlighting major findings: 3) superpathways of metabolite-gene variant associations identified 4) 14 genes, mostly for encoding enzymes and transporters finalized as potential therapeutic targets using knockout mice models, Mendelian disease traits and drug target information. 5) Future prospects of metabolome studies highlighted by increasing ethnicity, functional studies, and controlling the roles of diet, environment and gene interaction. Illustration created by

Analyzing the genetic markers and metabolomes using different databases, the authors identified 94 genes for 109 metabolites and 48 metabolite-ratios3. The results echoed observations from other studies, in which a single gene exerts its effect on multiple conditions or, conversely, a single trait is influenced by multiple genes3,5,6. One example is the fatty acid desaturase (FADS) gene family at the FADS locus on chromosome 11, which affects 79 metabolites, including 75 lipids and 1 amino acid, and which also showed the highest number of associated metabolite-ratios at a single locus3. About 26% of the genetic markers identified for metabolite-ratios encode enzymes that are used to make the metabolite pairs. For example, the gene affecting bilirubin-glucuronide conjugates encodes a catalytic enzyme of the glucuronidation reaction. Delving into disease-gene associations in a drug database9, the authors discovered 42 genes related to ~580 pre-clinical and clinical drugs, which act as antagonists, agonists, substrates, inhibitors, or inducers of the encoded proteins. Integrating this pharmacological information with Mendelian disease traits and murine knockout models, they deduced that 14 genes have therapeutic potential as drug targets for regulating metabolite levels3.

The authors further validated their analysis, prioritizing 22 metabolites and 20 metabolite-ratios that showed a causal relationship with one or more traits and disease outcomes3. The most prominent causal finding was the three-way relationship between genetically predicted plasma orotate levels, eBMD, and hip fracture: orotate levels regulate eBMD, which is a known strong risk factor for hip fracture. The authors delved deeper and validated the relationship between higher orotate levels and increased fracture risk in a separate nested study3,10. This could be a novel advance in the diagnosis of osteoporosis, one of the leading causes of hip fracture and disability in aging populations globally, especially among women10. It is perhaps the clearest illustration of how GWAS can help establish metabolites as diagnostic or prognostic biomarkers for common diseases.

The authors restricted their main analysis to participants of European ancestry to reduce population stratification bias, with only a small separate cohort of other ethnicities, so the study shares this limitation with earlier work3. Moreover, they acknowledged that the study focused on plausible gene-metabolite causes, so other metabolites that regulate daily life remain to be explored. Ethnicity and diet could be major variables in shaping the diversity of genetics and associated metabolomics, hence more studies are needed to find disease-specific biomarkers for other populations3,5,6. The authors have scratched the surface of the heritability of the metabolome, which calls for more functional studies to understand its role in disease patterns7.

Overall, Chen and colleagues emphasized genes and their role in regulating metabolites through pathways, specific enzymes, transporters or proteins that could be targeted for therapeutic intervention in common chronic conditions3. This paper leaves the floor open for clinicians, diagnostic labs, pharmaceutical companies and basic scientists to collaborate on developing clinical screening panels and prognostic tests that include metabolites, and eventually on implementing them in clinical settings to guide treatment goals6. The authors advocated refocusing the aim of scientific advances: should things not get easier for the patient, perhaps with just a simple, quick blood test?


1.        Zhang, A., Sun, H., Xu, H., Qiu, S. & Wang, X. Cell Metabolomics. OMICS 17, 495 (2013).

2.        Roberts, L. D., Souza, A. L., Gerszten, R. E. & Clish, C. B. Targeted Metabolomics. Curr Protoc Mol Biol 98, (2012).

3.        Chen, Y. et al. Genomic atlas of the plasma metabolome prioritizes metabolites implicated in human diseases. Nat Genet 55, 44–53 (2023).

4.        Shin, S. Y. et al. An atlas of genetic influences on human blood metabolites. Nat. Genet. 46, 543–550 (2014).

5.        Gallois, A. et al. A comprehensive study of metabolite genetics reveals strong pleiotropy and heterogeneity across time and context. Nat Commun 10, (2019).

6.        Bar, N. et al. A reference map of potential determinants for the human serum metabolome. Nature 588, 135–140 (2020).

7.        Yousri, N. A. et al. Long term conservation of human metabolic phenotypes and link to heritability. Metabolomics 10, 1005–1017 (2014).

8.        Raina, P. et al. Cohort Profile: The Canadian Longitudinal Study on Aging (CLSA). Int J Epidemiol 48, 1752–1753j (2019).

9.        Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res 46, D1074–D1082 (2018).

10.      Lu, T., Forgetta, V., Greenwood, C. M. T. & Richards, J. B. Identifying causes of fracture beyond bone mineral density: evidence from human genetics. J. Bone Min. Res. 37, 1592–1602 (2022).

Past history of obesity remodels the epigenetic landscape of the immune system in mice

Nina Anggala

New insights into the molecular relationship between obesity and Age-related Macular Degeneration (AMD) are the first evidence of the systemic impact of obesity, even after weight loss.

Until recently, obesity was considered a disease of overindulgence rather than biological fact. The advent of Genome Wide Association Studies (GWAS) challenged these preconceptions. With the ensuing flood of candidate genes, specifically those regulating satiety pathways, came the realization that obesity is a complex disorder with a significant contribution from our DNA1.

Hata et al.2 report for the first time that obesity not only has a genetic basis, but that living with obesity impacts the genome – even after weight loss. Specifically, they focus on the previously reported association between obesity and Age-Related Macular Degeneration (AMD), a leading cause of vision loss for adults over the age of 503.

Currently, no causal genes have been identified for AMD4. The implication: AMD likely owes its etiology to environmental challenges, specifically blue light damage to the eye, that accumulate over a lifetime. Over time, the immune system’s response to repeated assaults manifests in the form of retinal thinning, invasive neovascularization, and Drusen deposits: insoluble extracellular material containing lipids, proteins, and inflammatory factors secreted by immune cells5 (Fig 1).

The physiology of AMD, by virtue of the inflammatory content within Drusen deposits, has been linked to immune processes3. Canonically, as macrophages infiltrate the site of injury, they initiate a cascade of events, including angiogenesis and cell death. Within the microenvironment of the retina specifically, debris from cells targeted for apoptosis forms Drusen deposits. These deposits are themselves harmless, since vision loss is a function of retinal thinning rather than the accumulation of deposits, but the neovascularization that precedes their formation may be the catalyst for AMD. Specifically, it is thought that invasive neovascularization into the retina directly causes retinal thinning. This process facilitates the infiltration of immune cells and is the culprit behind the signs of inflammation with which we are all familiar (the red nose and fever that accompany a cold, for example). Similarly, researchers have time and again linked obesity to systemic inflammation6. Hata and colleagues’ investigation thus began with the question of whether and how obesity lays the foundation for AMD, on the assumption that the two disorders interact with, and perhaps through, the immune system.

Figure 1: Early stages of AMD are characterized by Drusen deposits beneath the Retinal Pigment Epithelium (green arrows). As disease progresses, the immune response triggers neovascularization from the choroid layer into the retina. The retina itself thins, eventually leading to vision loss. Figure taken from 7.

To test their theory, Hata and colleagues placed experimental mice on a 20-week weight gain/weight loss (WG/WL) regimen to simulate a past history of obesity. After weight loss, experimental and control mice underwent laser-induced injury to the retina, mimicking the blue light damage that leads to AMD. Compared to mice kept on a regular diet throughout, experimental mice who had a history of obesity displayed increased neovascularization (Fig 2). Transplants of adipose tissue and bone marrow from experimental and control mice into naive recipients recapitulated the above phenotypes: macrophages in mice that received tissue from previously obese mice were hyperactive and, in their hyperactivity, destructive.

Hata and colleagues turned to the epigenome, the mediator between gene regulation and environment, to determine how this change in behaviour, which must have occurred during the period of obesity, could be maintained after weight loss. Global patterns of chromatin accessibility, which are shaped by epigenetic markers at the histone and nucleotide level, determine which gene programs are poised for expression within a cell and are typically stable within differentiated cells unless acted on by external pressures (Fig 2)8. When they analyzed these patterns in the macrophages of mice that had undergone the WG/WL regimen without receiving laser damage, the change in landscape was striking: accessible regions in the chromatin of previously obese mice were significantly shifted towards pro-inflammatory, pro-angiogenic pathways compared to controls. Essentially, obesity primed the immune response of these mice, potentially accelerating disease progression in the future. To establish causality, they eradicated macrophage precursors in a separate cohort and repeated the laser-injury experiment in both WG/WL and control groups. In the absence of a fully functional innate immune system, previously obese mice did not develop AMD, substantiating their conclusion that immune system hyperactivity drives disease progression.

Figure 2: Experimental pipeline for WG/WL and control mice. (1-2) WG/WL mice were placed on a High-Fat Diet (HFD) for 11 weeks followed by a Regular Diet (RD) for 9 weeks to simulate a past history of obesity. (3-4) After laser-induced injury to the retina, WG/WL mice displayed increased neovascularization: damage to the eye initiated by immune activity. (5) Epigenetic profiling of macrophages within the eye of WG/WL mice showed increased chromatin accessibility at pro-inflammatory gene pathways compared to control mice. Epigenetic states can be stably transmitted through cell divisions, thus connecting a history of obesity to future retinal damage7. Me, red: methyl group associated with closed chromatin; Ac, blue: acetyl group associated with open chromatin.

If these outcomes can be validated in human studies, primed immune cells could present a novel therapeutic target for AMD and other age-related diseases, with the caveat that immunosuppressive medications also pave the way for opportunistic pathogens9. Alternatively, one could curtail AMD well before onset by targeting inflammation at the source: obesity. The outcomes of this study are particularly timely, as they join a wave of obesity research fueled by groundbreaking weight loss drugs10. These discoveries may very well motivate drug developers further, though attention must be paid to persistent weight stigma in both popular culture and health care. Wherever their next endeavors may lead, Hata et al.’s efforts have yielded insights into the lifelong impact of obesity and have uncovered, for the first time, an epigenetic link between obesity and AMD.


  1. Loos, R. J. & Yeo, G. S. The genetics of obesity: From Discovery to Biology. Nature Reviews Genetics 23, 120–133 (2021).
  2. Hata, M. et al. Past history of obesity triggers persistent epigenetic changes in innate immunity and exacerbates neuroinflammation. Science 379, 45–62 (2023).
  3. Age-related macular degeneration (AMD). Age-Related Macular Degeneration (AMD) | Johns Hopkins Medicine (2021). Available at: (Accessed: 3rd February 2023)
  4. DeAngelis, M. M. et al. Genetics of age-related macular degeneration (AMD). Hum Mol Genet 26, R45–R50 (2017).
  5. Adams, M. K. et al. Abdominal obesity and age-related macular degeneration. American Journal of Epidemiology 173, 1246–1255 (2011).
  6. What is AMD? Macular Disease Foundation Australia (2021). Available at: (Accessed: 3rd February 2023)
  7. Kawai, T., Autieri, M. V. & Scalia, R. Adipose tissue inflammation and metabolic dysfunction in obesity. American Journal of Physiology-Cell Physiology 320, (2021).
  8. Dupont, C., Armant, D. & Brenner, C. Epigenetics: Definition, mechanisms and clinical perspective. Seminars in Reproductive Medicine 27, 351–357 (2009).
  9. Rasmussen, L. & Arvin, A. Chemotherapy-induced immunosuppression. Environmental Health Perspectives 43, 21–25 (1982).
  10. Prillaman, M. K. The ‘Breakthrough’ obesity drugs that have stunned researchers. Nature 613, 16–18 (2023).

Untangling the Cellular Nuance of Endometriosis

Thomas Barbazuk

The last decade of genetic medicine has seen a paradigm shift from the investigation of genetic constitution to a focus on active gene expression. The human genome is now understood to be incredibly plastic, and gene expression is too multidimensional to fit within the confines of classical Mendelian genetics. The inception of transcriptomics, the analysis of actively transcribed mRNA molecules in the cell, has allowed for a tangible understanding of genetic dysregulation in the context of human disease. More specifically, the cutting-edge ability to analyze gene expression in individual cells has allowed us to disentangle the expression profiles found in complex tissues. This technology has recently been used to decode the cellular heterogeneity of endometriosis, a dysregulation of uterine tissue that affects 10% of individuals with female-assigned reproductive systems. Endometriosis is characterized by endometrial-like tissue (resembling the inner lining of the uterus) proliferating outside of the uterine cavity1,2. In addition to causing chronic pain, infertility, and inconsistencies in menstrual cycles, endometriosis has been observed to significantly raise an individual’s risk of epithelial ovarian cancer1,3. Fonseca et al. at Cedars-Sinai used single-cell RNA sequencing methods to construct a cellular atlas of endometriosis in an effort to characterize the molecular hallmarks that set aberrant endometrial tissues apart1.

Endometriosis is generally classified into three subtypes: ovarian endometriosis/endometrioma (fig. 1), superficial peritoneal endometriosis (superficial deposits on the lining of the abdominal wall) and deep infiltrating endometriosis (characterized by lesions that infiltrate 5 mm or more under the peritoneal surface)1,4. Numerous and diverse tissue samples were taken from the sample population, allowing for a comprehensive analysis of the disease. All of these tissues were analyzed using a high-throughput single-cell RNA sequencing method called droplet-based scRNA-seq, a method that earns its name by using microfluidics to encapsulate individual cells in nanoliter droplet emulsions5,6,7.

Figure 1: Anatomy of the uterus, depicting possible endometrioma and endometriosis locations. (Cleveland Clinic8)

With over 9.2 billion reads sequenced, 373,851 of 432,751 fully profiled individual cells were carried through to analysis after stringent quality control1. The cells were grouped by analyzing the expression levels of cell-specific markers, yielding 114 distinct cell clusters1. The transcript data were compared across tissue types to investigate differences in tissue composition between the samples. Analysis confirmed that eutopic endometrium tissues were enriched for epithelial cells and endothelial cells relative to control tissues. There was a 7-fold depletion in epithelial cells in endometrioma tissues, accompanied by an enrichment in immune cells such as B and plasma cells. Ectopic endometrial tissues were also particularly rich in mast cells and killer T cells. These results affirm our understanding of the aggressive immune response that can accompany the disease, as well as the primary cell types present in dysregulated tissues.
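The quality-control figures quoted above can be checked with a little arithmetic. The short Python sketch below is only an illustration: the variable names are invented here, and the only inputs are the two cell counts given in the text.

```python
# Cell counts quoted in the article; the variable names are illustrative only.
profiled_cells = 432_751   # cells profiled before quality control
retained_cells = 373_851   # cells carried through to analysis

# Cells removed by quality control, and the share that survived it.
discarded_cells = profiled_cells - retained_cells
retained_percent = 100 * retained_cells / profiled_cells

print(f"Discarded by quality control: {discarded_cells:,}")
print(f"Retained for analysis: {retained_percent:.1f}%")
```

In other words, roughly 86% of the profiled cells passed quality control and contributed to the clusters described here.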

Further refinement in the grouping of the transcript sequence data was accomplished by conducting a principal component analysis (PCA) of the genetic profiles of each tissue type. Cluster 1 was primarily composed of endometriosis samples, with cluster 2 containing the majority of the endometrioma and eutopic endometrium samples. Unaffected ovarian tissues consistently clustered in group 3. These results reiterate the distinctly different tissue landscapes across different forms of endometriosis. This is a crucial observation, as it presents the possibility that different forms of endometriosis demand the development of different treatments and thus should not necessarily be categorized as the same disease between patients. Furthermore, it was observed that ectopic endometrial tissue and eutopic ovarian tissue from separate ovaries within the same patient grouped to their respective PCA clusters.
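For readers unfamiliar with PCA, the idea of samples "grouping" along a principal component can be sketched in a few lines of Python. This is a toy example under stated assumptions, not the authors' pipeline: the expression values, sample labels, and function names are all invented for illustration, and the leading component is approximated with simple power iteration rather than a dedicated library.

```python
# Toy PCA by power iteration. All expression values below are invented
# for illustration and are not taken from the study.
import math

def center_columns(rows):
    """Subtract each gene's mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance between genes (columns)."""
    n = len(centered)
    d = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

def first_principal_axis(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pc1_scores(rows):
    """Project each centered sample onto the first principal axis."""
    centered = center_columns(rows)
    axis = first_principal_axis(covariance_matrix(centered))
    return [sum(x * a for x, a in zip(row, axis)) for row in centered]

# Hypothetical samples: two per tissue group, three made-up genes each.
labels = [
    "endometriosis_1", "endometriosis_2",
    "endometrioma_1", "endometrioma_2",
    "unaffected_ovary_1", "unaffected_ovary_2",
]
profiles = [
    [1.0, 2.0, 1.0], [1.5, 2.5, 1.0],
    [5.0, 6.0, 1.2], [5.5, 6.5, 0.9],
    [9.0, 10.0, 1.1], [9.5, 10.5, 1.0],
]

for label, score in zip(labels, pc1_scores(profiles)):
    print(f"{label}: {score:+.2f}")
```

In this invented dataset, samples from the same tissue group receive similar scores on the first component while the three groups sit far apart, which is the kind of separation the clusters described here reflect. Real single-cell data involve thousands of genes and several components.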

The molecular consequences of somatic mutation within endometrial tissues, and how such mutations facilitate aberrant expression of cancer genes, are relatively poorly understood. It was previously known that somatic mutations within the cancer driver genes ARID1A and KRAS are associated with endometriosis, but the in vivo transcriptional result was especially unclear1. The in vivo transcriptional behaviors of these cancer genes were investigated and better characterized in the study using immunohistochemical and PCR methods. Heterogeneous ARID1A expression was observed, indicating a heterozygous loss-of-function mutation within endometrioma and peritoneal lesion samples. KRAS mutations at codon 12 were also observed within these tissues (fig. 2).

Figure 2: ARID1A and KRAS mRNA expression by mutation state. Fonseca et al. 1

Endometrial tissues strongly differed in expression from unaffected tissues, suggesting aberrant expression and hormone regulation within the diseased tissues. Differential gene expression analysis showed that these tissues go through extensive expression remodeling in conjunction with different stages of the menstrual cycle. Gene expression in healthy and diseased tissues showed inverse relationships across the luteal and follicular phases of the menstrual cycle, suggesting a close relationship between hormone regulation and disease phenotype.

Endometriosis has remained poorly characterized despite its prevalence in the human population. This study addressed the complexity of endometrial disease with a robust analysis of mRNA transcripts in the context of multiple tissue types, hormonal regulation, and somatic mutation. This was a landmark study in characterizing how the molecular profiles of endometrial-type epithelium and stroma differ in expression by locale, affirming the growing body of literature indicating that endometriomas and peritoneal lesions represent two distinct disease entities. Specific genes were identified as upregulated within ectopic endometrial tissues, providing a promising outlook for non-invasive screening, assuming these biomarkers can be detected in a blood test. Dysregulation of innate immunity was also observed in the ectopic endometrial tissues studied. Continued large-scale cellular analyses of endometrial tissues are crucial, as the distinction between endometrioma and peritoneal endometriosis detected in this study needs to be explored further given their different treatment and diagnostic demands.

Works cited:

1. Fonseca, M. A. S. et al. Single-cell transcriptomic analysis of endometriosis. Nat. Genet. (2023) doi:10.1038/s41588-022-01254-1.
2. Amro, B. et al. New Understanding of Diagnosis, Treatment and Prevention of Endometriosis. Int. J. Environ. Res. Public. Health 19, 6725 (2022).
3. Kheil, M. H., Sharara, F. I., Ayoubi, J. M., Rahman, S. & Moawad, G. Endometrioma and assisted reproductive technology: a review. J. Assist. Reprod. Genet. 39, 283–290 (2022).
4. Rolla, E. Endometriosis: advances and controversies in classification, pathogenesis, diagnosis, and treatment. F1000Research 8, F1000 Faculty Rev-529 (2019).
5. Salomon, R. et al. Droplet-based single cell RNAseq tools: a practical guide. Lab. Chip 19, 1706–1727 (2019).
6. Zhang, X. et al. Comparative Analysis of Droplet-Based Ultra-High-Throughput Single-Cell RNA-Seq Systems. Mol. Cell 73, 130-142.e5 (2019).
7. De Rop, F. V. et al. Hydrop enables droplet-based single-cell ATAC-seq and single-cell RNA-seq using dissolvable hydrogel beads. eLife 11, e73971 (2022).
8. Endometriosis: Causes, Symptoms, Diagnosis & Treatment. Cleveland Clinic

A New Standard for Pediatric Cancer Care

Lise Cinq-Mars

The SickKids Cancer Sequencing (KiCS) program is advancing pediatric cancer care by performing somatic-germline and tumor sequencing to personalize therapeutic efforts.

The KiCS group at SickKids, headed by Dr. Anita Villani, Dr. David Malkin, and Dr. Adam Shlien1, recently published a study investigating 300 pediatric patients who had previously been treated and had a poor cancer prognosis, rare tumors, or suspected cancer predisposition. There has been almost no improvement in treating patients who have relapsed metastatic or treatment-refractory diseases in about 40 years. The KiCS group aimed to identify new, targeted approaches that improve the prognosis of pediatric oncology patients (Figure 1). Through in-depth somatic-germline and tumor sequencing, they arrived at two major findings: a notable proportion of relapsed pediatric cancers carry a high mutation burden, and defects in the homologous recombination repair pathway are prevalent in pediatric cancers. This work suggests that analysis of every child and young adult with cancer is likely a realistic goal and should be readily incorporated into clinical care2.

The KiCS program identified at least one clinically actionable finding in 56% of participants2. 54% of patients who had tumor analysis done were found to have variants that were therapeutically targetable2. Interestingly, out of all the cancers examined, Central Nervous System (CNS) tumors had the highest proportion (61%) of targetable findings2. Most of these findings were targets of MEK-ERK inhibitors, PARP inhibitors, immune checkpoint inhibitors and cell cycle inhibitors. MEK-ERK inhibitors have been found to be extremely effective in suppressing tumor growth in MAPK-dependent cancers3. PARP is an enzyme that aids in DNA repair, so inhibiting the enzyme can push cancer cells towards death4. Of the 69 patients who needed a new therapeutic option, 60% were treated with a matching targeted agent2. Additionally, germline sequencing identified pathogenic/likely pathogenic variants in cancer predisposition genes in 15% of participants2. Most of these variants were not found in genes typically associated with pediatric cancers but in genes involved in Homologous Recombination (HR) and mismatch repair (MMR) pathways.

Figure 1. Kids Cancer Sequencing Program. The method pipeline by which the KiCS group provides personalized therapies for pediatric patients through sequencing cancer samples and personal genomes.

Villani et al. wanted to investigate this involvement of the HR pathway further and decided to compare Whole Genome Sequencing (WGS) of 293 available pediatric tumor samples to a group of adult control tumor samples. They found that Signature 3 (a BRCAness mutational signature which requires a minimum of 100 mutations) was detected in 23.1% of samples with no known HR mutations and only 13.3% of adult samples2. They also found that Signature 3 was significantly higher in KiCS samples with previously identified somatic mutations in the HR pathway and highest in tumors from the previously stated group of KiCS patients (pathogenic/likely pathogenic variants associated with the HR pathway in their germline)2. This notable increase suggests that mutations in the HR pathway could be a driver of pediatric cancers.

Tumor sequencing shows that it is possible to identify the evolution of tumors by taking multiple samples at different points in cancer treatment. Villani et al. found that tumor genomes showed massive changes over the course of cancer and that most mutations could be identified from a sample at only one time point. This information was important for treatment, as physicians were able to determine whether the relapse came from an older, previously identified cancer strain or was a completely new “relapse”2.

It is clear from their work that there is a high mutational burden in relapsed pediatric cancers. Specifically, tumors previously treated with chemotherapy and radiation tend to have a significantly higher number of mutations when compared to pre-therapy biopsies. The high number of defects in the HR repair pathway suggests that PARP inhibitors might be useful in treatment.

Additionally, the KiCS group made it very clear that germline sequencing is a critical part of the sequencing effort. Understanding what patients are already predisposed to can influence therapeutic options and inform family history and family planning discussions. A major interest of this study was to sequence participants’ germlines on an 864-gene cancer panel. These patients had not previously had cancer but were suspected of being highly predisposed. Villani et al. found that the panel identified a pathogenic variant in only 12% of these patients. Given that 88% of patients were still unaccounted for, the KiCS group suggests that other, yet-to-be-discovered mechanisms must be at play. They suggest continuing to look for novel genes, variants in regulatory regions, epigenetic changes, and multiple gene interactions resulting from several low-penetrance variants in a common pathway.

In terms of setting a new standard in pediatric oncology care, the KiCS program suggests that NGS-based tumor analysis can provide more robust assessments than the current molecular/cytogenetic testing standard. It would also cut down on the time required to perform sequential assays, meaning that patient oncology teams would have a therapeutic plan devised in a fraction of the time. This approach is flexible, adaptable, and inexpensive. This work is the first of its kind in Canadian precision oncology. It will open the doors and lead the way for what cancer care should look like. The KiCS program shows that careful clinical annotation of variants can significantly accelerate biological insights. Cancer genomics is just getting started and it’s amazing to see such impressive results in such a small cohort. We look forward to seeing what Dr. Villani and the team at KiCS come out with next!


  1. The Hospital for Sick Children. SickKids study demonstrates how comprehensive genetic sequencing informs a new standard of cancer care. (2023)
  2. Villani, A. et al. The clinical utility of integrative genomics in childhood cancers extends beyond targetable mutations. Nat. Cancer. 10.1038/s43018-022-00474-y (2022)
  3. Merchant, M. et al. Combined MEK and ERK inhibition overcomes therapy-mediated pathway reactivation in RAS mutant tumors. PLoS One. 12, 10 (2017).
  4. National Cancer Institute. PARP Inhibitor.

Single Nuclei RNA Sequencing Breathes New Life into Single Cell Transcriptomic Analyses of Cancer in Clinical Settings

Teresa Brooke-Lynn Coe

Single-nuclei RNA sequencing has been established as a reliable and scalable method of analyzing single cell transcriptomes in frozen, clinically obtained tumor samples, allowing large-scale clinical application of single cell analyses and superior patient experiences.

Single-cell RNA sequencing (scRNA-seq) has been an invaluable player in cancer research for over a decade. Tumor landscapes are complex, thus scRNA-seq enables gene expression analysis of individual cell populations within those environments, helping to identify rare cellular populations and associated non-cancerous cells like immune cells1. This allows for unbridled insights into tumor progression, metastasis, and drug resistance1. However, scRNA-seq research does not always translate to the clinical world – in fact, it is falling fatally behind. There is a deep need in clinical cancer genomics for reliable and scalable single cell characterization. Unfortunately, scRNA-seq cannot achieve this due to the clinical impracticality of large, fresh tissue samples that require immediate processing1. A recently published article by Wang et al. (2023) sought to establish a multimodal method in which single cell transcriptomics could be reliably assessed from frozen clinical samples using single-nuclei RNA sequencing (snRNA-seq) thereby overcoming the practical issues that make scRNA-seq irreconcilable with clinical work.

Single cell transcriptomics is a subset of genomics that examines gene expression levels for individual cells in a population via quantification of messenger RNA (mRNA)2. Two common methods for this are scRNA-seq and snRNA-seq, which assess expression using cellular and nuclear mRNA, respectively3. While analogous in most technical aspects, snRNA-seq holds significant advantages in clinical settings as it does not require large amounts of fresh tissue samples like scRNA-seq does1,3.

Clinical tumor sample collection is typically done via needle biopsy, providing small quantities of sample which are frozen after initial analysis1. However, frozen tissue samples cannot be adequately dissociated before scRNA-seq4,5. The dissociation step of scRNA-seq protocol is important as it releases and isolates individual cells from solid tissue samples5. With fresh samples, the intense enzymatic and mechanical processing can affect cell quantity and quality, and potentially result in the exclusion of rare cell populations (Figure 1)4,5. Moreover, the temperature of dissociation causes transcriptional machinery to remain active, thus mechanical assaults can create stress-response signals called artifacts, potentially biasing results4-6.

Due to their increased membrane rigidity, nuclei are more tolerant to mechanical isolation than whole cells, thus snRNA-seq can be performed on frozen samples (Figure 1)3,7. Likewise, snRNA-seq can be done on smaller tissue samples more akin to those obtained clinically1. This overcomes the logistical processing hurdles posed by scRNA-seq while also correcting downstream data processing issues. However, snRNA-seq has its own concerns – nuclei contain less RNA than whole cells, thus obtaining enough data to identify and classify cell types introduces different challenges3. Further validation of snRNA-seq’s ability to accurately establish cellular expression is vital to its clinical implementation.

Figure 1. Differences associated with scRNA-seq and snRNA-seq sample preparation and isolation. (left) The dissociation process for scRNA-seq results in mechanical assaults that can damage fragile cells (pink) in fresh tissue samples, making them unable to be sequenced/included in data4. This leads to limited cell recovery and biased results, like the exclusion of rare cell populations. (right) The dissociation process for snRNA-seq from a frozen tissue sample, leading to isolated nuclei. Nuclei classification leads to inferred cellular classification based off nuclear RNA3,4. Created via

Wang and colleagues sought to develop a framework that produced reliable and comparable data using snRNA-seq and that could be easily implemented into clinical practice1. To do this, the researchers tested scRNA-seq and snRNA-seq protocols on a variety of tumor tissue samples (using matched fresh and frozen samples). In accordance with previous work3,4, snRNA-seq reliably performed on par with scRNA-seq in isolating cells, classifying cell types, and producing high-quality data. Notably, when assessing fragile tumor samples, snRNA-seq reproducibly outperformed scRNA-seq – isolating more high-quality nuclei and reducing stress-response signals owing to the tolerance of nuclei to dissociation. Thus, the snRNA-seq method was able to provide consistently accurate data using frozen samples and 1000-fold less starting material.

To test how effectively the method could track cellular differentiation in response to treatment, Wang et al. (2023) tested past clinical trial samples. These were long-term frozen samples (some archived for up to ten years). Here, snRNA-seq was able to assess diversity in tumor cell populations and associated immune cell populations with excellent technical quality. This protocol was also able to track cellular sub-types involved in resistance to the drug when assessing samples captured at different times of treatment. Additionally, the researchers were able to show changes in chromosomal copy number that occurred after treatment in resistant cells. Whole genome sequencing confirmed these results. Although the copy number results are notable, they require further study of their reliability before becoming a potential standard in clinical practice.

By tracking cellular transcriptomic differences related to drug resistance in long-term frozen samples, the authors clearly demonstrated the potential for clinical impact. Validation of this technique on small, long-term frozen samples shows that snRNA-seq is adequate for small and/or frozen clinical samples, negating the need for immediate processing that is so incompatible with clinical workflows. By utilizing samples that would likely be collected regardless, this method provides a less intensive way of achieving valuable single cell genomic results so that clinicians can better assess their patients’ personal treatment needs over the course of treatment. Moreover, the ability to assess frozen samples and still achieve reproducible, high-quality data opens doors for broad, coordinated, multi-institutional genomic cancer studies capable of producing more widely applicable results on complicated aspects of cancer, like drug treatments and resistance targets1.

A potentially interesting application of this technique is the exploration of metastasis. Since scRNA-seq has proven to be a powerful tool for studying metastasis8, it would be intriguing to demonstrate the efficacy of this snRNA-seq method in detecting metastatic expression markers. Not only would this help establish metastatic markers via archived sample studies, but it would also provide a methodology for identifying those markers in clinical spaces, which could have profound effects on patient outcomes.

Additionally, cancer-cell reference atlases are becoming popular to synthesize genomic data as a means of aiding clinical identification of cell type and potential evolutionary pathways6. Here, researchers demonstrated the capability of this method to effectively evaluate single cell genomics in long-term archival samples. This creates a hugely advantageous way to add large amounts of data into such digital archives, generating comprehensive, versatile, and clinically valuable reference atlases.

Although there are many high-impact transcriptomic tools in development, the translation of these methodologies into clinical practice is lacking due to incompatible practical requirements. There is a need to rethink how these tools can be adapted for clinical spaces and enhance patient experiences. The potential of the multimodal snRNA-seq framework outlined by Wang et al. (2023) to produce broadly applicable data and highly personalized medicine illustrates just how technical advances in the field of genomics need to be implemented in clinical spaces.


  1. Wang, Y. et al. Multimodal single-cell and whole-genome sequencing of small, frozen clinical specimens. Nature Genetics 55, 19–25 (2023).
  2. Kanter, I. & Kalisky, T. Single cell transcriptomics: Methods and applications. Frontiers in Oncology 5, (2015).
  3. Slyper, M. et al. A single-cell and single-nucleus RNA-seq toolbox for fresh and frozen human tumors. Nature Medicine 26, 792–802 (2020).
  4. Bakken, T. E. et al. Single-nucleus and single-cell transcriptomes compared in matched cortical cell types. PLOS ONE 13, (2018).
  5. Denisenko, E. et al. Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows. Genome Biology 21, (2020).
  6. Rozenblatt-Rosen, O. et al. The Human Tumor Atlas Network: Charting tumor transitions across space and time at single-cell resolution. Cell 181, 236–249 (2020).
  7. Lammerding, J. Mechanics of the nucleus. Comprehensive Physiology 783–807 (2011). doi:10.1002/cphy.c100038
  8. Han, Y. et al. Single-cell sequencing: A promising approach for uncovering the mechanisms of tumor metastasis. Journal of Hematology & Oncology 15, (2022).

Altered Serotonin Receptor Linking Obesity and Maladaptive Behaviour

Kamalika Bhandari Deka

Obese patients with maladaptive behaviour, such as anxiety, are found to carry a rare serotonin receptor variant that explains the link between maladaptive behaviour and weight gain.

Severe early-onset obesity (SEOO) has been potentially linked to the manifestation of behavioural issues such as depression in adulthood.1 Patients prescribed selective serotonin reuptake inhibitors (SSRIs), or antidepressants, have complained of weight gain.2 Studies have found that the non-specificity of SSRI drugs is the causal factor.3 Therefore, understanding the role of the serotonin receptor in mood and food intake is important for developing targeted therapy.

A recent study by He et al. explored the role of the serotonin 2C receptor in weight regulation and behaviour.4 Using exome analysis, 13 rare variants were found in HTR2C, a gene encoding the serotonin 2C receptor. The individuals harbouring an HTR2C variant reportedly had maladaptive behaviour and an extreme, unsatisfied drive to consume food (also called hyperphagia). Additionally, using a mouse model, it was demonstrated that specific serotonin receptor-activating drugs, such as Lorcaserin, are better suited to treating obese patients carrying the variant.

The study cohort consisted of 2548 individuals of European ancestry diagnosed with SEOO and 1117 ancestry-matched controls. The rare variants identified in the HTR2C gene were found in 19 unrelated individuals. Sixteen heterozygous XX females (variant on one allele, here on one of the two X chromosomes) and 3 hemizygous XY males (variant on their only X chromosome) carried a variant.
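The zygosity terms used above follow from how many X chromosomes a person has and how many of them carry the variant. The Python sketch below illustrates that rule and tallies the 19 carriers reported in the text. The function name, the karyotype strings, and the "non-carrier" and "homozygous" labels are assumptions made for this example, not terms taken from the study.

```python
# Illustrative classifier for an X-linked variant; the function name and
# labels are hypothetical, only the 16/3 carrier split comes from the text.
from collections import Counter

def x_linked_zygosity(karyotype, variant_copies):
    """Classify a carrier from sex chromosomes and number of variant copies."""
    x_count = karyotype.count("X")
    if variant_copies < 0 or variant_copies > x_count:
        raise ValueError("variant copies must fit the number of X chromosomes")
    if variant_copies == 0:
        return "non-carrier"
    if x_count == 1:
        return "hemizygous"      # a single X, so the variant is the only copy
    if variant_copies < x_count:
        return "heterozygous"    # variant on one X, reference on the other
    return "homozygous"          # variant on both X chromosomes

# Carriers as described: 16 XX females and 3 XY males, one variant copy each.
carriers = [("XX", 1)] * 16 + [("XY", 1)] * 3
tally = Counter(x_linked_zygosity(k, c) for k, c in carriers)
print(dict(tally))
```

The distinction matters for an X-linked gene such as HTR2C: a hemizygous male has no second, unaffected copy, whereas a heterozygous female does.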

To understand the molecular consequences of the rare variants, cellular functional assays were performed. One of the variants leads to replacement of the amino acid phenylalanine, which produces an altered serotonin receptor. This receptor is unable to bind neurotransmitters such as serotonin (fig. 1).5 By studying plasma membrane expression, He and team found that this newly formed protein loses the ability to bind serotonin.4 This illustrates that a mutated gene is indeed impairing the normal function of serotonin receptors in such patients. The inability to bind serotonin can impact many normal physiological pathways, including those governing mood and satiety.1

Mouse model studies have shown that the serotonin receptor plays a major role in mediating the action of serotonin on appetite suppression through the melanocortin pathway.6,7 The activation of Proopiomelanocortin (POMC) neurons in the hypothalamus is necessary to induce satiety (fig. 1). POMC activation is dependent on serotonin receptors.6 To study the human HTR2C variant’s impact on satiety, a CRISPR-Cas9 approach was used to generate a knock-in (KI) mouse model. The POMC neurons in KI-mice displayed reduced activity, indicating deficient melanocortin signalling.4 This finding can be beneficial in studying the impaired melanocortin pathway in humans.

Figure 1: Schematic representation of Proopiomelanocortin (POMC) neuron activation. A. The arcuate nucleus region of the hypothalamus receives signals from other parts of the body following food consumption. After crossing the blood-brain barrier, the neurotransmitters are received by the serotonin receptors on the POMC neurons, which in turn triggers a cascade reaction in the POMC neurons that produces POMC-derived peptides. These peptides are received by the paraventricular hypothalamic neuron (PVHN). The PVHN in turn releases regulating signals to other parts of the brain, and these signals are carried further to other parts of the body to reduce food intake. B. In the presence of altered serotonin receptors, the signals are not received properly and the cascade reaction is hindered. Thus, the feeling of satiety is not induced, which leads to hyperphagia in SEOO patients. Figure created using BioRender.

Serotonin receptor drugs or melanocortin receptor (MC4R) drugs that activate serotonin receptors (also called agonists) have shown efficacy in treating SEOO caused by other genetic variations affecting the melanocortin pathway.8 A serotonin receptor drug, Lorcaserin, has shown promise as a substitute for the non-specific SSRI drugs.8 The response of the melanocortin pathway impaired by the rare variant to MC4R drugs was explored. Comparable hunger control was observed in the KI-mouse when dosed with the medication (fig. 2).4 These observations indicate that similar MC4R drugs can be used to repair the melanocortin signalling that is impaired in severely obese patients due to damaged POMC neural activity. Currently, obese patients with other rare variants in different genes acting on POMC activity are being included in clinical trials of various MC4R drugs.8 Therefore, it is projected that patients carrying HTR2C variants with impaired POMC neural activity can also benefit from such trials.

Figure 2: A schematic representation of the binding site of MC4R agonist obesity medications. An MC4R or serotonin receptor agonist obesity drug, such as Lorcaserin, binds to the serotonin receptor to help transmit the neurotransmitter signal and in turn activate the POMC neurons, which produce POMC-derived peptides, leading to the feeling of satiety and in turn reducing food intake in the mouse model. The obese mouse carrying the altered serotonin receptor displayed anorectic effects when given Lorcaserin. (Figure adapted from7, modified using BioRender.)

Both the male and female KI-mice harbouring the rare variant displayed an uncontrolled hunger drive. The behavioural aspects studied in such mice also showed increased offences, reduced sociability and reduced risk assessment ability. The results are comparable with other obesity mouse model studies, although the serotonin 2C receptor specifically was not studied in them.9,10 Similarly, the individuals in this cohort carrying the HTR2C variant also reported early-onset anxiety, a maladaptive behaviour. This association highlights the influence of mood, which creates a vicious cycle of uncontrolled eating habits.

The study also examined the role of biological sex in the development of such maladaptive behaviour and obesity. Female KI-mice carrying the rare variant on one allele (heterozygotes) displayed similar but less pronounced behavioural and hyperphagic effects compared with male mice. This suggests that the penetrance of a heterozygous rare HTR2C variant is variable. Additionally, most probands carrying HTR2C variants were female, suggesting a possible X-linked inheritance pattern. As the gene is located on the X chromosome, this is speculated to be due to a gene-dosage effect. However, the authors called for further exploration to understand the roles of other genetic and environmental factors influencing the full effect of this variant.

It has largely been assumed that anxiety in obese females is caused by societal pressure and stigma. However, this study puts forward a new biological perspective on the associated symptoms of obesity and maladaptive behaviour. The study shows that impaired serotonin signalling in females carrying an HTR2C variant causes these symptoms to co-occur.

SEOO patients are often reported to have maladaptive behaviour.1 This study has provided evidence for a shared mechanistic origin between SEOO and maladaptive behaviour.4 Through the study, He et al. have emphasised that a systematic psychological analysis of maladaptive behaviour in SEOO patients can be useful in guiding them towards genetic testing and finding an actionable variant.4

He and team have performed an elaborate functional study. Strengthening the statistical power behind this association would require larger studies of thousands of participants along with control samples. The study establishes an important missing link between mood and hunger. Population studies can solidify this association across diverse populations. Although animal studies cannot truly emulate the complex social environment we live in, they do provide a solid foundation for human disease research. These observations can be used to broaden the clinical effectiveness of MC4R drugs. The study underscores that understanding and targeting the genetic mechanisms that regulate the serotonin receptor could play a major role in developing psychiatric medications that are effective and cause no chronic weight gain.


1. Mills, J. K. & Andrianopoulos, G. D. The relationship between childhood onset obesity and psychopathology in adulthood. Journal of Psychology: Interdisciplinary and Applied 127, 547–551 (1993).

2. Brian, H. & Bouwer, C. D. Neuropharmacology of paradoxic weight gain with selective serotonin reuptake inhibitors. Clin. Neuropharmacol. 23, 90–97 (2000).

3. Marston, O. J., Garfield, A. S. & Heisler, L. K. Role of central serotonin and melanocortin systems in the control of energy balance. Eur. J. Pharmacol. 660, 70–79 (2011).

4. He, Y. et al. Human loss-of-function variants in the serotonin 2C receptor associated with obesity and maladaptive behavior. Nat Med 28, 2537–2546 (2022).

5. Peng, Y. et al. 5-HT2C Receptor Structures Reveal the Structural Basis of GPCR Polypharmacology. Cell 172, 719–730 (2018).

6. Berglund, E. D. et al. Serotonin 2C receptors in pro-opiomelanocortin neurons regulate energy and glucose homeostasis. J. Clin. Invest. 123, 5061–5070 (2013).

7. D’Agostino, G. et al. Nucleus of the Solitary Tract Serotonin 5-HT2C Receptors Modulate Food Intake. Cell Metab 28, 619–630 (2018).

8. Clément, K. et al. Efficacy and safety of setmelanotide, an MC4R agonist, in individuals with severe obesity due to LEPR or POMC deficiency: single-arm, open-label, multicentre, phase 3 trials. Lancet Diabetes Endocrinol 8, 960–970 (2020).

9. Walf, A. A. & Frye, C. A. The use of the elevated plus maze as an assay of anxiety-related behavior in rodents. Nat Protoc 2, 322–328 (2007).

10. Haller, J., Mikics, É., Halász, J. & Tóth, M. Mechanisms differentiating normal from abnormal aggression: Glucocorticoids and serotonin. Eur. J. Pharmacol. 526, 89–100 (2005).

Genes gone rogue lead to inflamm-aging

Solomiya Hnatovska

Study uncovers causal relationship between the breakdown of the nuclear lamina and uncontrolled overexpression of CGI- genes, potentially explaining the chronic inflammation symptoms that are associated with aging.

It is well established that the 3D organization of chromatin is important to normal gene expression. In healthy cells, heterochromatin is found tethered to the inner walls of the nucleus by nuclear lamina proteins, while the more actively expressed regions loop out into the center of the nucleus. Unraveling of heterochromatin is seen in aging tissue and has long been believed to contribute to aging-associated degenerative changes; however, the underlying mechanism has remained elusive1,2. Lee et al. have made important strides in our understanding of this mechanism using mouse models and meta-analyses of human and mouse data. They discovered that increased expression of a specific group of genes lacking CpG islands is the missing link between the chromatin unraveling and inflammation associated with aging3.

CpG islands are regions of high CG nucleotide density which are known to affect the regulation of genes if found in their promoters. Genes lacking CpG islands (CGI- genes) make up 40% of genes, are expressed in a tissue-specific manner, and are silenced through association with heterochromatin4,5. These are unlike the other 60% of genes, which contain CpG islands (CGI+ genes), are broadly expressed across tissues and are silenced through a different mechanism, called polycomb inactivation4,5. See figure 1 for a visualization of this CGI-mediated dual-mode form of gene regulation. Given that CGI- genes are silenced by association with heterochromatin, the authors hypothesized that disorganization of nuclear lamina proteins, which occurs during aging, would disproportionately affect CGI- gene expression. Indeed, they found a 33.7% increase in expression of the CGI- genes with age in mouse liver and brain tissue, compared to only a 9.5% increase in expression of CGI+ genes. They found similar gene expression changes, accompanied by loss of heterochromatin markers, in mouse lines with disrupted and knocked-out nuclear lamina proteins. These results indicate that, at least in mice, the CGI- genes specifically are overexpressed during aging, because the nuclear lamina proteins become disorganized and release the heterochromatin to unravel.
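To make "high CG nucleotide density" concrete, the following minimal Python sketch computes two quantities commonly used to characterise CpG-rich sequence: GC content and the observed-to-expected CpG ratio. The function names and toy sequences are illustrative assumptions and are not taken from Lee et al.; published CpG island annotations apply additional length and threshold criteria that are not shown here.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # Count CpG dinucleotides by scanning every adjacent pair of bases.
    cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cpg * len(seq) / (c * g)


# Toy sequences (hypothetical, for illustration only).
cpg_rich = "CGCGCGCG"
cpg_poor = "ATGCATGC"

for name, seq in [("CpG-rich", cpg_rich), ("CpG-poor", cpg_poor)]:
    print(name, gc_content(seq), cpg_obs_exp(seq))
```

A promoter scoring high on both measures would be a candidate CGI+ promoter under this simplified view, while a promoter with few CG dinucleotides would resemble the CGI- class described above.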

Figure 1. Diagram showing how CGI+ and CGI- genes are regulated through different mechanisms. While CGI- genes are silenced through association with heterochromatin, CGI+ genes are silenced through Polycomb-mediated inactivation.3

Having determined that nuclear lamina disruption and heterochromatin unraveling lead to CGI- gene upregulation, the researchers wanted to find out what effect this has on known hallmarks of aging. One poorly understood but prominent marker of aging is the senescence-associated secretory phenotype (SASP). SASP is a phenotype associated with the chronic inflammation that occurs with age, in which non-immune cells begin to secrete pro-inflammatory proteins into the extracellular space6. Thus, the researchers chose to investigate whether CGI- gene upregulation might explain what causes SASP.

Initial findings showed that older mice with particularly high expression of CGI- genes showed signs of liver damage, local inflammation, and an increased expression of genes encoding inflammatory markers. While this critical finding supports the hypothesis that CGI- gene misexpression contributes to inflamm-aging, further investigations using human data were essential to confirm this link exists in humans. Through meta-analysis of both mouse and human data they found that a large proportion of mis-expressed CGI- genes encode secreted extracellular or transmembrane proteins, many of which have previously been implicated in SASP. This was a major finding as the mechanism underlying SASP was previously unknown.

Having found evidence associating the misexpression of CGI- genes with SASP and other markers of aging, the authors turned to testing potential applications of their findings. They suggest using the misexpression of CGI- genes as a marker of aging when testing the effectiveness of treatments for aging and aging-related diseases. They found that administering treatments such as caloric restriction to healthy aged mice significantly reduces their misexpression of CGI- genes. Through another meta-analysis of both mouse and human data, they found that individuals with age-related diseases, such as Alzheimer’s Disease (AD), show significantly increased expression of CGI- genes relative to healthy age-matched individuals. Given the current lack of therapeutic targets for complex diseases such as AD, this is a significant finding as it supports the potential of CGI- gene expression as a therapeutic target. Thus, misexpression of CGI- genes may not only be a marker for aging, but also a potential target for treatments of age-related diseases.

This study shows how aging-associated inflammation in both mice and humans may be explained by underlying uncontrolled overexpression of CGI- genes. Misexpression of CGI- genes is likely caused by the disorganization of 3D chromatin conformation that occurs with aging. While these findings improve our understanding of the mechanisms of aging, further research is needed to resolve what causes the initial breakdown of the nuclear lamina and how misexpression of CGI- genes contributes to systemic aging and age-related diseases. Nevertheless, this research is an essential stepping stone to future studies investigating the potential to target this mechanism of aging and reduce the inflammatory processes associated with aging.


1. Scaffidi, P. & Misteli, T. Lamin A-dependent nuclear defects in human aging. Science 312, 1059–1063 (2006).

2. Chandra, T. et al. Global reorganization of the nuclear landscape in senescent cells. Cell Rep. 10, 471–483 (2015).

3. Lee, J.-Y. et al. Misexpression of genes lacking CpG islands drives degenerative changes during aging. Sci. Adv. 7, eabj9111 (2021).

4. Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022 (2011).

5. Lee, J.-Y. et al. Conserved dual-mode gene regulation programs in higher eukaryotes. Nucleic Acids Res. 49, 2583–2597 (2021).

6. Childs, B. G. et al. Senescent cells: an emerging target for diseases of ageing. Nat. Rev. Drug Discov. 16, 718–735 (2017).

Risk alleles for Systemic Lupus Erythematosus may be protective against severe COVID-19

Vivian Hong

Genome-wide association studies (GWAS) find genetic associations between severe COVID-19 and systemic lupus erythematosus (SLE) – identifying the TYK2 locus to have a significant negative local genetic correlation in terms of disease severity.

The coronavirus disease-19 (COVID-19) pandemic has garnered much interest in understanding the genetics involved in the response to viral infection1. COVID-19 is caused by the virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), leading to a range of mild to severe symptoms that include respiratory, gastrointestinal, and cardiac problems2. To investigate genetic loci that are associated with severe COVID-19, many studies conduct genome-wide association studies (GWAS), which can detect variants that are significantly correlated with the phenotype of interest across the whole genomes of a large selection of individuals1,3.

Recent research using GWAS has shown that the clinical variability of COVID-19 outcomes may be partly attributed to variations in an individual’s genetics1,3. These studies have identified several genetic loci involved in response pathways of host immunity that appear to be highly associated with COVID-19 outcome heterogeneity1,3. Variants that upregulate factors required in immune response pathways were found to have protective effects against COVID-191. Conversely, variants that decrease the expression of essential factors in the immune pathways may lead to more severe phenotypes1. As a result, comparing autoimmune diseases (associated with an overactive immune system that attacks its own host) to severe COVID-19 (associated with a weaker antiviral immune response) could contribute to understanding the genetic mechanisms of COVID-19 pathogenicity1. One autoimmune disease that has been analyzed alongside COVID-19 is systemic lupus erythematosus (SLE), a chronic illness that can cause systemic inflammation across multiple organs4. In the study by Wang et al., a series of genetic association analyses between severe COVID-19 and SLE identified TYK2 as a significant locus that had reciprocal effects on the two diseases1.

To compare the genetics of severe COVID-19 and SLE, the authors used SLE data from three previously published European GWAS meta-analysis studies and European COVID-19 association datasets obtained from GenOMICC1. They conducted a genome-wide genetic correlation analysis between the two diseases by performing cross-trait linkage disequilibrium score regression (LDSR) on the GWAS summary statistics1. LDSR can provide the genetic correlation values between different phenotypes5. The results showed that there was indeed a significant genetic correlation between COVID-19 and SLE1.
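The idea behind cross-trait LDSR can be sketched in a few lines of Python. Under the standard model, the expected product of a variant's z-scores for the two traits rises linearly with its LD score, and the slope is proportional to the genetic covariance. The sketch below is a simplified illustration, not the pipeline used by Wang et al.: it uses unweighted least squares, assumes the two heritabilities are already known, and runs on made-up numbers. All function and variable names are hypothetical.

```python
import math


def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def genetic_correlation(ld_scores, z_products, n1, n2, m, h2_1, h2_2):
    """Estimate r_g from per-variant LD scores and z1*z2 products.

    Model: E[z1*z2] = sqrt(n1*n2) * rho_g / m * ld_score + intercept,
    where rho_g is the genetic covariance and m the number of variants.
    """
    slope, intercept = ols_fit(ld_scores, z_products)
    rho_g = slope * m / math.sqrt(n1 * n2)      # genetic covariance
    r_g = rho_g / math.sqrt(h2_1 * h2_2)        # genetic correlation
    return r_g, intercept


# Toy inputs (invented for illustration): five variants whose z-score
# products lie exactly on a line with slope 1.0 and intercept 0.05.
ld = [1.0, 2.0, 4.0, 8.0, 16.0]
zz = [0.05 + l for l in ld]

r_g, intercept = genetic_correlation(
    ld, zz, n1=10_000, n2=10_000, m=1_000, h2_1=0.25, h2_2=0.25
)
print(round(r_g, 3), round(intercept, 3))
```

In practice the regression is weighted, the heritabilities are themselves estimated from single-trait regressions, and standard errors come from resampling, so this sketch only conveys the shape of the calculation.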

To identify the regions that contributed to the observed correlation, Wang et al. performed local genetic correlation analyses1. This was done using an approach called ρ-HESS, which quantifies the correlation between the two tested traits that is due to genetic variation at a given locus6. The analyses found fifteen significant loci with positive and negative correlations1. Of those genetic loci, the TYK2 gene was shown to be the most significantly correlated, displaying a negative relationship in disease severity between COVID-19 and SLE1.

To further support their results, the authors performed TYK2 locus-wide association tests to show significant alleles associated with SLE and COVID-191. Wang et al. found that the significantly associated TYK2 alleles had reciprocal effects in terms of disease outcome1. Some TYK2 alleles are found to confer risk for SLE while being protective against COVID-19, and vice-versa1.

The TYK2 gene encodes a protein kinase that is involved in host immune response pathways, helping to promote the signalling and production of immune factors such as type I interferon (IFN-I)1,7. Mutations that affect TYK2 function may lead to dysregulation of IFN signalling during immune responses1. Previous studies have shown that SLE is associated with upregulated TYK2, leading to an overactive immune system1,7.

For the TYK2 alleles found to confer SLE risk and COVID-19 protection, the mutations led to either increased gene expression or increased protein activity1. The increased interferon production due to TYK2 activity may help enhance the host defence system against viral infection1,7. Additionally, the authors suggest that the identified SLE risk alleles may also be involved in intracellular viral sensing pathways in which the immune system loses tolerance when sensing foreign molecules1. This may lead to autoimmune attacks, but may also provide extra protection against foreign viruses1. Wang et al. suggest that these mechanisms might explain why those SLE risk alleles may be protective against COVID-191. In contrast, the TYK2 alleles that are protective against SLE and confer risk for COVID-19 were found to have either decreased gene expression or decreased protein activity (e.g., impaired phosphorylation)1. This likely leads to a decrease in IFN response signals and the corresponding downstream factors1. The downregulated immune pathway could result in poor viral clearance and severe COVID-19 symptoms, while protecting against SLE1,8. Therefore, Wang et al. show why the TYK2 alleles may have opposing effects on COVID-19 and SLE severity.

Figure 1. Proper and downregulated immune responses to viral infection lead to different clinical COVID-19 outcomes. The proper pathway shows a normal IFN signalling cascade in response to viral infection, allowing for effective viral clearance and a protective immune response. The dysregulated pathway shows a downregulated IFN response that leads to poor viral clearance and acute respiratory distress syndrome (ARDS), which is a severe COVID-19 outcome. Adapted from ref. 8.

Overall, these studies suggest that there are alleles that predispose an individual to autoimmune disease but provide more protection against viral infection1. There appears to be a delicate balance between calibrating our immune system to fight viruses (and other foreign agents) and increasing our risk of developing autoimmune diseases. Although this study has provided some insight into the mechanisms underlying the shared genetic effects of COVID-19 and SLE, further studies need to be conducted with larger and more representative datasets. The study performed its analyses primarily using datasets limited to individuals of European ancestry1. Moreover, future functional studies of the significantly associated loci could help researchers better understand the host immune response system and disease pathogenicity, which could contribute to the development of therapeutic treatments that rescue gene dysregulation in the immune pathways for COVID-19 and other immune-related diseases.


1. Wang, Y. et al. COVID-19 and systemic lupus erythematosus genetics: A balance between autoimmune disease risk and protection against infection. PLoS Genet 18(11), e1010253 (2022).
2. Ofner, M. et al. COVID-19 signs, symptoms and severity of disease: A clinician guide. Government of Canada (2022).
3. Pairo-Castineira, E. et al. Genetic mechanisms of critical illness in COVID-19. Nature 591, 92–98 (2021).
4. Fava, A. & Petri, M. Systemic lupus erythematosus: Diagnosis and clinical management. J Autoimmun 96, 1–13 (2019).
5. Byun, J. et al. Shared genomic architecture between COVID-19 severity and numerous clinical and physiologic parameters revealed by LD score regression analysis. Sci Rep 12(1), 1891 (2022).
6. Shi, H., Mancuso, N., Spendlove, S. & Pasaniuc, B. Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits. Am J Hum Genet 101(5), 737–751 (2017).
7. Shao, W. H. & Cohen, P. L. The role of tyrosine kinases in systemic lupus erythematosus and their potential as therapeutic targets. Expert Rev Clin Immunol 10(5), 573–582 (2014).
8. Spihlman, A. P., Gadi, N., Wu, S. C. & Moulton, V. R. COVID-19 and Systemic Lupus Erythematosus: Focus on Immune Response and Therapeutics. Front Immunol 11, 589474 (2020).