Advancing Prenatal Care: Integrating New Genomic Technologies to Improve Pregnancy Outcomes

Dr. Elena Greenfeld MD, PhD, CCMG, FACMG has implemented new technologies that have improved the accuracy of prenatal and perinatal diagnosis in Canada.

Aisha Wada and Azin Keshavarz

Three decades ago, the genetic screening of unborn children (prenatal testing) was limited to detecting abnormalities in chromosome number and structure1. This testing always required taking a sample from the pregnancy using an invasive method such as amniocentesis or chorionic villus sampling (Figure 1). Genetic testing technologies have advanced significantly and can now detect even minor genetic changes. Additionally, non-invasive prenatal testing has become more widely available. This test can identify chromosome abnormalities in the fetus using just a blood sample from the pregnant woman, avoiding the risks associated with invasive procedures that can lead to early pregnancy loss.

Dr. Greenfeld embarked on her professional journey by initially training as a medical doctor and then delving into medical laboratory technology. In the early stages of her career, she fell in love with genetics. With an aim to pursue a clinical genetics fellowship with the Canadian College of Medical Geneticists (CCMG), she undertook doctoral studies in cancer biology and earned her PhD in record time. She thoroughly enjoyed her CCMG fellowship at Mount Sinai Hospital, Toronto, and has since played a significant role in advancing their genetic testing capabilities. Currently, she holds the position of Director of the Clinical Cytogenetics laboratory and is Head of the Division of Diagnostic Medical Genetics in the Department of Pathology and Laboratory Medicine at Mount Sinai Hospital.

Figure 1: Prenatal testing methods2.

A – Amniocentesis. A needle is inserted through the abdomen to collect a sample from the amniotic fluid. B – Chorionic Villus Sampling. The needle is placed in the placenta for sample collection. Ultrasound imaging is used to guide the needle to the right position in both methods.

“The desire to have a healthy child is universal, and a genetic diagnosis for a condition could make a difference not only to the child and their parents but also to their extended family,” says Dr. Greenfeld. This fact ignited her passion for prenatal genetics. She describes how genetic technologies have changed rapidly since she began her career, and an increasing number of conditions can now be diagnosed during or even before pregnancy. Her lab uses a variety of cytogenetic and molecular genetic techniques to make these diagnoses. Genetic testing is akin to examining the human genome as if reading a book. Cytogenetic methods analyze large-scale changes and are like checking if the book has all the chapters and pages in the correct order by investigating structural and numerical abnormalities. Molecular genetic techniques analyze more discrete changes, as they help find missing information in genetic material, like missing paragraphs or phrases3. Each method has its own set of advantages and limitations. In her lab, Dr. Greenfeld strives to adopt testing techniques that maximize these advantages and minimize the impact of the limitations.

Dr. Greenfeld’s involvement in prenatal diagnosis has made a positive impact on care practices in Canada. She reflects that when she first started in the field, the main way to study chromosomes in prenatal diagnosis was through methods such as G-banding and fluorescence in situ hybridization (commonly known as FISH), which basically involved staining chromosomes so you could see their structure under a microscope. These techniques helped identify big genetic issues like changes in the number of chromosomes, which cause conditions such as Down syndrome. Her lab was the first in the country to introduce several faster and more accurate testing methods. Currently, the lab mainly uses a more advanced technology called microarray, which can spot tiny changes in chromosomes that older methods couldn’t. Her efforts have enhanced the lab’s testing capabilities. One improvement from her lab that she’s particularly pleased with is the introduction of a new method for analyzing samples from early pregnancy losses. This ‘no-culture technique’ is quicker and more reliable than the older methods4. Prior to the introduction of this method, cells from perinatal samples needed to be grown in the lab for days (i.e., cultured) before they could be analyzed, with varying success rates. With this novel technique, Dr. Greenfeld remarks, they had an exceptionally low failure rate of just 1%, in contrast to the failure rate of up to 40% seen with other methods. Currently, whole-genome sequencing (WGS) represents the next significant advancement in the study of fetal genetics, illustrating the considerable evolution of prenatal testing over time. WGS analyzes the complete genetic makeup of the fetus down to every letter (Figure 2). Each new technique adds to the lab’s ability to diagnose genetic problems in babies before they’re born, making prenatal care more accurate and comprehensive for Canadians.

Figure 2: Depiction of gene sequencing.

Chromosomes in the nucleus contain all the ‘letters,’ or nucleotides, that make up all of the DNA in a cell, i.e., the human genome. Whole genome sequencing can determine each letter that makes up the genome. In prenatal testing, genetic material from the growing fetus can be sequenced to diagnose various medical conditions. Figure made with BioRender.

A common misconception about prenatal testing is that it’s about being given the choice to terminate a pregnancy. As Dr. Greenfeld points out, this is not the case: “It is about providing options, which could include in-utero therapies, in-utero surgeries, and in-utero gene therapy. If treatments can be offered very early, it could make a big difference. The more we know, the more information we can provide to couples and clinicians to act on”. When rare abnormalities are found on prenatal ultrasound, many parents-to-be face an unending ‘diagnostic odyssey,’ having to endure several tests with no guarantee of an answer. “In prenatal diagnosis, we do qfPCR (a cytogenetic test) first, then we do microarray, and that does not always explain abnormalities in ultrasound. There isn’t always an answer. Couples might even opt for a termination if they are left without a diagnosis.” Considering the fact that a diagnosis can make all the difference, she adds: “Some of the conditions seen prenatally might even have favorable outcomes if we know the genetic basis.” This fact underlies the importance of her work, and she believes that introducing these new technologies that can better investigate the cause is paramount.

Dr. Greenfeld recalls some memorable cases from her lab in which advanced genetic testing changed the lives of a family. For one couple with recurrent pregnancy loss, testing the products of conception led to the discovery of a rare chromosomal abnormality in which a segment of a chromosome breaks off, and the broken ends get fused together. The couple was offered pre-implantation genetic testing (PGT), in which an embryo is tested for genetic disorders before pregnancy. In the case of this couple, an unaffected embryo was identified and resulted in a healthy child. In another instance, a couple who had had two pregnancies affected with very different but severe abnormalities were found to both be carriers of two rare variants. With extensive prenatal testing, their next pregnancy was confirmed to be unaffected, and they went on to have a healthy child.

When asked about her next steps, Dr. Greenfeld excitedly reveals her newest project: “I want to introduce WGS as a first-tier diagnostic test in prenatal and perinatal diagnosis.” This would entail sequencing the fetal genome from a sample taken from the pregnancy. With WGS, she aims to improve diagnosis rates. She has already demonstrated the accuracy and cost-effectiveness of low-pass genome sequencing compared to conventional testing5 and is working to introduce this technology in the laboratory.

Accepting new technologies can be uncomfortable as there is a tendency to take more of a conservative approach that resists new technology. “This is understandable,” she explains, knowing that the diagnosis provided can lead couples to make irreversible decisions. New knowledge is difficult to explain to patients, and as novel disorders are identified, it is even more challenging because we do not know the postnatal outcome. However, Dr. Greenfeld feels that when it comes to communicating with patients, we should not presume that we know what they need to know and hold back difficult information from them. Rather, she emphasizes that the role of scientists and clinicians is “to provide the information available in a way that can be easily understood and help patients in their decision-making while also trusting the patients to understand this information and make informed decisions.”

Embracing WGS as a primary diagnostic tool presents a notable challenge: because the test examines the entire genome, it might uncover genetic changes whose impact is not yet known. Although she acknowledges this concern, she points out the fact that these are not disclosed to the patient or physician unless it is believed that they could explain abnormal findings on a prenatal ultrasound. She stresses the benefits of adopting this testing method, noting that these findings can sometimes drive the discovery of new genetic diseases. Therefore, she believes it is important not to let this challenge deter us from introducing newer testing methods that can increase the chances of making a difficult diagnosis.

Dr. Greenfeld’s efforts to implement new genetic tests are improving outcomes for families dealing with congenital abnormalities. Reflecting on her work, Dr. Greenfeld notes, “If you give someone an option, a hope, it will make a huge difference.” Her belief in the transformative power of choices echoes throughout her contributions, emphasizing the positive impact that informed decisions can have on individuals and families facing genetic challenges.

References

1. Stranc, L. C., Evans, J. A. & Hamerton, J. L. Prenatal diagnosis in Canada — 1990: A review. Prenat. Diagn. 14, 1253-1265 (1994).

2. Jelin, A. C. & Van den Veyver, I. B. in Thompson & Thompson Genetics and Genomics in Medicine (eds Cohn, R. D., Scherer, S. W. & Hamosh, A.) (Elsevier – OHCE. Available from: Elsevier eBooks+, 2023).

3. Berisha, S. Z., Shetty, S., Prior, T. W. & Mitchell, A. L. Cytogenetic and molecular diagnostic testing associated with prenatal and postnatal birth defects. Birth Defects Research 112, 293-306 (2020).

4. Morgen, E. K., Maire, G. & Kolomietz, E. A clinical algorithm for efficient, high-resolution cytogenomic analysis of uncultured perinatal tissue samples. European Journal of Medical Genetics 55, 446-454 (2012).

5. Mighton, C. et al. Validation of low-pass genome sequencing for prenatal diagnosis. Prenatal Diagnosis (2024).

Scientist develops computer program that discovers new viruses at an unprecedented rate

Dr. Artem Babaian has harnessed the powers of computational biology and open science to improve human health by unravelling the diversity of Earth’s RNA viruses.

Pamela Alamilla and Lekshmi Mohan

Dr. Artem Babaian, Assistant Professor of Molecular Genetics and Principal Investigator at the University of Toronto.

The COVID-19 pandemic was a favourable time for virologist job security. Experts on viruses and viral genetics were being recruited left and right to perform vital COVID-19 research, run PCR testing labs, and provide guidance to public health experts. And yet, Dr. Artem Babaian, a Canadian computational biologist and virologist, was unemployed. On purpose. 

Babaian, who currently works as an Assistant Professor of Molecular Genetics at the University of Toronto, was donating his time to a project that he believed would significantly advance our understanding of virology. He was working on Serratus, an open-access computer program that would find new species of viruses, including novel coronaviruses, in existing genetic databases1. He was hopeful that he might find the sequence of the then-uncharacterized virus that was causing COVID-19 in another scientist’s data: “I was thinking: could we find someone who accidentally sequenced SARS-CoV-2 in 2012 or 2016 – maybe in some field sample or some cell line?”

Education and Early Career

Babaian knew from a young age that he wanted to become a virologist. When he was 11 or 12 years old, he came across a book entitled ‘The Hot Zone,’ which tells the real story of a team of scientists and physicians who travelled to Africa to prevent a large-scale Ebola pandemic. This book, he says, inspired him to begin his journey into the field of virology. When asked about his graduate school training, Babaian stated: “I opted to go to a smaller molecular lab when I went to graduate school knowing that I [would] acquire the skills to think like a geneticist and [bring those skills] back to virology.” In 2019, when the COVID-19 pandemic started, Babaian drew on his years of research experience to start working on a computational tool that would change the field of virology: Serratus.

Project Serratus is Initiated at hackseq Hackathon

hackseq organizing committee in 2018. Babaian is third from the left. Taken from hackseq.com/gallery.

Babaian knew that he couldn’t tackle this new Serratus project on his own – it would require a combination of advanced programming skills and an in-depth understanding of virology. He certainly possessed the latter, but he was no software developer. “I’m a biologist, not an engineer,” he explains. So, in 2020, Babaian took project Serratus to a Vancouver-based genomics hackathon called ‘hackseq’ to look for collaborators. hackseq is an annual, three-day computer programming marathon aimed at bringing together students and trainees to tackle genomics problems3. Participants can pitch ideas for projects, recruit team members, and work together to come up with solutions that they then present to their peers3. hackseq’s philosophy is one of scientific collaboration and openness; the founders believe that “every facet of the scientific process [can be] 100% transparent, from initial hypotheses through open-access publication”3. Babaian helped to co-found and organize the first event in 2016, after which the event ran annually (and successfully) until 2020. 

The 2020 iteration of hackseq took place virtually during the COVID-19 pandemic and focused on research projects involving SARS-CoV-2, the virus that causes COVID-19. It was at this event that Babaian started assembling a team to tackle building Serratus. By then, Babaian had graduated from his doctorate program and was working as a bioinformatician, but he recognized that it would take months of uninterrupted work to make any headway on the project. So, he did what few scientists would dare to do: he quit his job to work full-time on his passion project. “We were a team of volunteers and we got 30 thousand dollars in computing credits and that was the project budget,” he said of Serratus’ humble beginnings.

Scope and Mission of Serratus

The overarching aim of the Serratus project is to find as many new viruses as possible. There are an estimated 10^8 to 10^12 viruses on planet Earth, and only about 0.001% of these are currently known to scientists2. The ‘Dark Virome’ is a term used to describe the unknown viruses that are particularly hard to find using current technologies because their genomic sequences differ so much from those of viruses we already know1. Babaian gives the analogy that our current ability to locate viruses is like “[holding] a candle in the middle of a football field,” with the candlelight representing the viruses that we can locate and the football field representing all those we can’t – those that make up the Dark Virome. He wanted to build Serratus because he believed that understanding the viral diversity on our planet would help us to protect human, animal, and environmental well-being. Speaking on the benefits that discovering new viruses could have on our understanding of human disease, he points out that “one of these viruses [could be] the causal agent for Alzheimer’s or Parkinson’s or… chronic fatigue syndrome.” In other words, viral infection could be the root cause of some poorly understood chronic diseases. It became Babaian’s mission to create a tool that would allow researchers to make meaningful virological discoveries and to provide clinicians with the data they would need to diagnose patients with infectious diseases more effectively.

Number of viruses (unique sOTUs) discovered by Serratus in RNA databases over a period of 11 days. Figure adapted from 1.

Serratus was, by all accounts, a resounding success. The search engine finds new RNA viruses by searching for one key viral gene, RdRp, in public databases. The RdRp gene is an essential component of RNA viruses that can replicate their own genomes without relying on the machinery of a host cell. Viral ‘palmprints’ are protein sequences that fall within this RdRp gene. Serratus takes a list of palmprints and searches for matching sequences in RNA databases. If a match is found, this indicates the presence of an RNA virus. Babaian and his team built the core of their search engine in 3-4 months, then spent the next few months perfecting the system and testing it on publicly available data. Over a period of only eleven days, Serratus compared all known viral palmprints (at the time) to about 5.7 billion RNA sequences across nearly 4 million public datasets2. This cost the group $23,980 USD and resulted in the identification of 131,957 novel RNA viruses, nine of which were coronaviruses2. The significance of these results cannot be overstated. For comparison’s sake, the Global Virome Project is an international scientific initiative that aims to identify approximately 1.2 million viruses over a 10-year time frame at a cost of ~$1.2 billion USD; that’s an average of $1013 USD spent per new virus found and ~325 viruses discovered per day. Babaian’s team managed to spend under $0.19 USD per virus and found an average of 11,996 viruses per day. In alignment with Babaian’s philosophy of open science, the Serratus team made their program free to use and cloud-based so that researchers across the globe could access it. As a result, their work was published in Nature, a top scientific journal. “I was unemployed, I had no university affiliation, and I was sending a paper off to Nature,” Babaian beamed, recalling his success despite being a self-proclaimed underdog.
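The cost and speed comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted in the article, and for the Global Virome Project it relies on the rounded inputs given in the text (about 1.2 million viruses, about $1.2 billion USD, 10 years). The assumption of 365 days per year and all variable and function names are ours, so the Global Virome Project results come out close to, but not exactly equal to, the $1013 and ~325 quoted above.

```python
# Figures quoted in the article for the Serratus search.
SERRATUS_COST_USD = 23_980
SERRATUS_VIRUSES = 131_957
SERRATUS_DAYS = 11

# Rounded Global Virome Project (GVP) figures from the article.
# Assumption: a 10-year time frame is treated as 10 * 365 days.
GVP_COST_USD = 1.2e9
GVP_VIRUSES = 1.2e6
GVP_DAYS = 10 * 365


def cost_per_virus(total_cost_usd, viruses_found):
    """Average amount spent for each newly identified virus."""
    return total_cost_usd / viruses_found


def viruses_per_day(viruses_found, days):
    """Average number of viruses identified per day."""
    return viruses_found / days


serratus_cost = cost_per_virus(SERRATUS_COST_USD, SERRATUS_VIRUSES)
serratus_rate = viruses_per_day(SERRATUS_VIRUSES, SERRATUS_DAYS)
gvp_cost = cost_per_virus(GVP_COST_USD, GVP_VIRUSES)
gvp_rate = viruses_per_day(GVP_VIRUSES, GVP_DAYS)

print(f"Serratus: ${serratus_cost:.2f} per virus, {serratus_rate:,.0f} viruses per day")
print(f"GVP (rounded inputs): ${gvp_cost:,.0f} per virus, {gvp_rate:,.0f} viruses per day")
```

With these inputs, Serratus works out to roughly $0.18 per virus and about 11,996 viruses per day, while the rounded Global Virome Project figures give about $1,000 per virus and about 329 viruses per day. On this estimate, the per-virus cost differs by a factor of several thousand.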

Real-World Applications of Serratus

When asked about the impact of his project on the lives of ordinary people, Babaian fondly recounted the time Serratus was used by a research group from Stanford University. Their research studied the utility of genomics in cell therapy products such as CAR-T cells, which are used in cancer treatment4. Using Serratus, they observed that human herpesvirus 6 (HHV-6) can be reactivated in patients receiving CAR-T cell therapy4. This discovery, made possible by Serratus, will aid clinicians and researchers in improving CAR-T cell therapy and patient recovery outcomes.

The Future of Biology is Computational

Projects like Serratus owe their existence to centuries of biological discoveries, but also to the modern computational tools capable of manipulating huge amounts of biological data. This fusion of biology, computer science, and statistics has given rise to an entirely new field: computational biology. Babaian embodies this new kind of biologist; he doesn’t work in the wet lab and instead spends his days on a laptop, writing code. When asked about the role that computational biology will play in our scientific future, he simply responds: “Biology is going to be computational biology.” He likens the development of computational methods for biological analysis to the invention of PCR in the 1980s, which fundamentally changed the field of biology by allowing researchers to amplify existing DNA samples. “This is the turning point where we’re going to be transitioning away from molecular genetics and into computational genetics”, Babaian declares with confidence. He predicts that molecular biology will transition away from the current model in which wet lab experimentation is most important and data analysis is secondary. Instead, he expects that as more and more biological data becomes widely available, data analysis and computer modelling will become the first-line methods of discovery, with limited in-lab experimentation needed to confirm their predictions. He describes computational biology as “a new language with which to study genetics”, one that will accelerate research discoveries and fundamentally change the way we study biological systems. “Just don’t tell the molecular geneticists,” he chuckles.

Dr. Artem Babaian’s story highlights the importance of trying something even if you might fail. His unwavering dedication to collaboration and open science has led to the creation of a powerful computational tool with the potential to advance our understanding of viral infections and even prevent pandemics. In the final moments of our interview, Babaian shared the core belief that pushed him to quit his job and develop Serratus, despite the possibility of failure: “You’re never promised any success in science, but you can try.”

References

  1. Edgar, R.C., Taylor, B., Lin, V. et al. Petabase-scale sequence alignment catalyses viral discovery. Nature 602, 142–147 (2022).
  2. Koonin, E.V., Dolja, V.V., Krupovic, M., et al. Global organization and proposed megataxonomy of the virus world. Microbiol. Mol. Biol. Rev. 84, e00061-19 (2020).
  3. hackseq Organizing Committee. hackseq: Catalyzing collaboration between biological and computational scientists via hackathon. F1000Res. 6, 197 (2017).
  4. Lareau, C. A. et al. Latent human herpesvirus 6 is reactivated in CAR T cells. Nature 623, 608–615 (2023).

New disease discovery provides answers for families struggling with rare disease

Dr. Deshwar’s efforts in rare disease research extend beyond the laboratory. His research uncovers never-before-described diseases, providing answers and insight for families living with these conditions.

C’airah Ceolin, Renée Smith, and Jade Zhang

Ashish Deshwar, MD, PhD, FRCPC

You’ve got the sniffles, a cough, and a chill that runs down your spine. The first thing that you do is turn to Google and try to find out what illness you have by entering all those keywords into your search query. This is a common occurrence for many of us, and we do these searches hoping to quell our anxiety with answers to those rampant questions running through our brains. But what happens when the disease you’re trying to find answers to hasn’t yet been classified? What happens when you are the first case of a rare disease in the entire country? What happens when the doctors you consult have never seen anything like this before? What happens when all of the above is true, but it’s happening to your child?

These are the challenges that Dr. Ashish Deshwar is trying to address. Dr. Deshwar is a Staff Physician in the Division of Clinical and Metabolic Genetics at the Hospital for Sick Children and a scientist-track investigator in the Developmental and Stem Cell Biology program at the SickKids Research Institute. He is also an Assistant Professor in the departments of Pediatrics and Molecular Genetics at the University of Toronto.

Dr. Deshwar first fell in love with science when reading the book Endless Forms Most Beautiful by Sean B. Carroll1. Exploring topics including how a single cell can develop into a complex organism like an animal, this book solidified his love for science and the need to pursue it, specifically in the field of developmental biology. Through this journey and after achieving his MD and PhD from the University of Toronto, Dr. Deshwar found himself in the Medical Genetics program at the Hospital for Sick Children researching a new disease gene with Dr. Jim Dowling, Ms. Cheryl Cytrynbaum, and Dr. Rosanna Weksberg. Using his background in zebrafish developmental biology, Dr. Deshwar began researching the claudin-5 (CLDN5) gene2 and variants in this gene, collecting functional evidence to define the genetic basis of disease. CLDN5 is a major contributor to the integrity of the blood-brain barrier (BBB), helping to form tight junctions and preventing unwanted molecules from crossing the barrier into the central nervous system (CNS)3. Dr. Deshwar and his team identified a cohort of 15 patients with potential disease-causing variants in CLDN5 and set out to prove that these patients had a new disease. The patients in the cohort experienced symptoms including microcephaly (small head size), seizures, delayed development and distinct brain calcifications4.

When describing his work with the CLDN5 gene, Dr. Deshwar spoke about the importance of collaboration in the fields of medicine and genomics, emphasizing that his work was an international collaboration involving more than 10 research centres across the world. This massive collaboration permitted the sharing of patient data to help uncover genetic associations with CLDN5 pathogenesis. He believes that this communal effort is what made it possible to elucidate the findings he presented, sharing that his lab “got so much help from so many people who offered their time very freely”, highlighting how these efforts are “essential to these collaborations”. Too often, labs working in similar fields work in isolation, resulting in less change being made at a global level; by working together, Dr. Deshwar believes that we can make a bigger impact and that this impact is what changes lives. He is confident that the work carried out in his 2022 paper4 can help shed light on other disorders affecting the CNS, providing a framework for investigating neurological disorders.

Dr. Deshwar employs a method of genetic variant analysis that aims to uncover the functional causes of disease using a model organism, the zebrafish. When discussing this choice, he emphasized that zebrafish were an ideal fit because of their rapid development, the ease of live imaging, and a genome that is readily amenable to editing with the CRISPR/Cas9 system. While it may seem odd to study human disease in a different organism, zebrafish are quite commonly used in biological research and have greatly advanced our molecular understanding of many human disorders5. Performing analyses in a zebrafish model system provides an approach that can be broadly applied in genetic variant interrogation. As a result, their work not only aids in the understanding of CLDN5 variation and its implications for disease prognosis, but also acts as a stepping stone to understanding and potentially creating therapeutics for other neurological diseases.

Dr. Deshwar highlighted three key takeaways from his CLDN5 work. Firstly, his work describes a new disease that has not been previously reported. When working in rare diseases, there is often such a small patient cohort that it can be difficult to establish that a given gene is causal. In fact, in the work presented, there were only 15 patients, which underscores how rare these disorders are. Dr. Deshwar believes that his published work on CLDN5 disorders, together with the symptoms seen in the patient cohort, helps to establish a standardized description of a novel disease. This description can now be used by other clinicians and researchers in the field, which opens the possibility of not only helping patients reach a diagnosis but also bringing new treatment and care plans to individuals living with CLDN5-related disorders. Because this work is the first of its kind in establishing a disease model for CLDN5-related disorders, there is a multitude of new questions to be answered to gain an understanding of the disease and thus increase the overall quality of patient care. In the years to come, Dr. Deshwar would like to follow up on the original patient cohort to monitor disease progression with the ultimate goal of providing families with a developmental trajectory of CLDN5 disorders.

A second key highlight is that the observed patient variants shed light on the function of critical protein domains and amino acid residues that otherwise would not have been identified. All patient variants cluster around a small domain within CLDN5, providing unique biological insight into how variation in this specific location leads to disease. This defining feature suggests that the domain is important for CLDN5 to maintain proper integrity of the BBB4. This finding opens the possibility of domain-targeted therapeutics to help restore proper integrity of the BBB when CLDN5 function is impaired. Additionally, the patients exhibited brain calcifications, pointing to a unique characteristic of the disease that can potentially be used for diagnosis. Brain calcifications can be identified by neuroimaging, which bodes well for the future of CLDN5 disease diagnosis as “there’s not too many genetic disorders you can identify just based on [neuroimaging], but […] this will be one of them”. When describing a novel rare disease, these unique biological features improve the ease of diagnosis and expedite the journey to receiving optimized care, ultimately leading to enhanced quality of life.

A final important element of the paper is that Dr. Deshwar and his team established a functional assay to evaluate new variants that are reported in the future. Establishing and utilizing such an assay allows for additional CLDN5 variation to be evaluated in zebrafish with relative ease as these new variants are uncovered. He stated that the model and analysis techniques described in the paper could be used to understand new CLDN5 variants as they arise, and that he would be interested in further exploring these new variants in future projects. The proposed methodology holds promise for future applications for other CNS-related disorders by providing functional evidence and understanding of the basis of disease.

The work presented by Dr. Deshwar is inspiring to say the least. His passion for the field and his work shines through in conversation. Rare disease research like the work he and his collaborators perform allows for a greater number of patients to receive a formal diagnosis, improving patient experience, and providing answers to those with CLDN5-related diseases worldwide. The possibility of diagnosis via brain imaging also presents new diagnostic avenues that are more accessible than before, highlighting the clinical implications this research has for patients. On top of this, the CLDN5 genetic variation discovered in his work could lead to the future implementation of genetic tests containing these proposed variants as a means for disease diagnosis or confirmation. These specific variants and their shared clustering location in the gene also provide insight for new research in drug discovery to treat CLDN5-related diseases. A major goal for Dr. Deshwar is to eventually build upon this developmental trajectory for CLDN5-related disease, providing families with anticipatory guidance on what to expect for their child diagnosed with a CLDN5-related disease and ultimately improving patient experience and quality of life. 

The revelations uncovered by Dr. Deshwar and his team act as a foundation for research into other BBB and CNS disorders while accentuating the need for increased and continued collaboration in the field of medical research. The Deshwar lab is working to understand various new diseases, as many still need to be described and require functional evidence to do so. His findings help to provide answers for patients and their families struggling with rare diseases and their futures with these diagnoses, emphasizing that diagnostic tools like those established in his work “make sure that [children] get the medical care they need to best optimize their life.”

References

  1. Carroll, S. B. Endless Forms Most Beautiful by Sean B. Carroll. (W. W. Norton, 2005).
  2. Ling, Y. et al. CLDN5: From structure and regulation to roles in tumors and other diseases beyond CNS disorders. Pharmacological Research 200, 107075 (2024).           
  3. Tsukita, S., Tanaka, H. & Tamura, A. The Claudins: From Tight Junctions to Biological Systems. Trends in Biochemical Sciences 44, 141–152 (2019).
  4. Deshwar, A. R. et al. Variants in CLDN5 cause a syndrome characterized by seizures, microcephaly and brain calcifications. Brain 146, 2285–2297 (2023).
  5. Veldman, M. B. & Lin, S. Zebrafish as a Developmental Model Organism for Pediatric Research. Pediatr Res 64, 470–476 (2008).

A Deep Dive into Deep Learning: The Role of Artificial Intelligence in Genomic Medicine

Dr. Ken Kron, Director of Biology at Toronto startup Deep Genomics, shares his motivations to transition from academia to industry and reveals the current and future implications of AI models in genomic medicine and drug development.

Rushil Dua and Faizan Hasan

Dr. Ken Kron: an active Director of Biology at Deep Genomics. Photo provided by Dr. Kron.

The realm of genomic medicine is advancing at an unprecedented pace as vast amounts of genomic data are regularly generated. Geneticists and computational biologists have been working tirelessly to develop research-focused institutions to make effective use of this tsunami of genomic data. One such institution is the Toronto-based biotechnology startup, Deep Genomics, where Dr. Ken Kron is a Director of Biology. He has been with the company for five years following the completion of his academic training at the University of Toronto, which focused on targeting prostate cancer-linked epigenetic modifications.

Before Dr. Kron’s transition to Deep Genomics, his research sought to gain insight into cancer-linked epigenetic modifications and their therapeutic ramifications. Specifically, he discusses the enzyme inhibitors he studied during his PhD and postdoctoral training, which are overexpressed in certain cancers and may signify advancing disease in the patient. He hoped to identify markers that were indicative of cancer prognosis and to correlate these findings with a system for grading the severity of prostate tumours. He further sought to explore epigenetic and mutational differences between tumours, that is, modifications to gene expression and to the gene sequence that may impact tumour growth. In doing so, he found that tumour development is spurred on not only by mutations in coding regions of the genome, which are responsible for protein creation, but by mutations in non-coding regions as well1. With this rigorous academic background under his belt, Dr. Kron sought out industry leaders, such as Deep Genomics, to contribute his vast skill set to groundbreaking genomic research.

What is Deep Genomics and what do they do?

Deep Genomics’ work focuses on the use of artificial intelligence to interpret genomic data and draw conclusions about RNA biology, using this newly gained knowledge to identify therapeutics and design relevant drugs. Dr. Kron’s discovery of the importance of non-coding regions aligns well with Deep Genomics’ recent work in the same niche. He later elaborated on the company’s values and mission, and on the role such companies play in the wider genomics landscape: “There’s so much out there that can be utilized to understand the biology that’s driving a lot of different diseases, [and] we are uniquely positioned to do that with the expertise that we have at Deep Genomics, in taking information that already exists and applying it.”

Deep Genomics is working on developing gene-level therapeutics, exploiting the potential of RNA. These RNA-based modalities aim to upregulate gene expression in diseases where the patient manifests their phenotype due to haploinsufficiency, the insufficiency of gene product due to a defunct copy of a gene. The primary goal is to overexpress the functional copy to compensate for the defunct copy and negate the haploinsufficiency. One modality that the company is exploring is steric-blocking oligonucleotides (SBOs): short, single-stranded RNA molecules that modulate gene expression by binding to RNA transcripts and blocking access by cellular machinery2. Specifically, SBOs are antisense RNA oligonucleotides that bind their target through complementary base pairing and can be chemically modified to improve their binding success. The goal of this methodology is to identify gene regulatory elements within the genome and employ SBOs to interfere with the inhibitory action of these elements, ultimately restoring sufficient levels of gene product in the target organism2. According to Dr. Kron, there are three target tissues that are presently most likely to benefit from SBO-based therapeutics: the liver, central nervous system, and retina, which may be due to the biochemical properties of these tissues. He discusses the utility of artificial intelligence in assessing potential binding targets for SBOs, allowing his team to focus on developing and honing the therapeutic power of this modality.

What is artificial intelligence?

The past decade has been marked by an explosion of AI-related tools such as ChatGPT and Bing’s AI platform. AI tools operate on the analysis of big data: large datasets, organised by data scientists, from which models learn tasks that would usually require human intelligence3. For example, ChatGPT was trained on the abundance of text on the internet to simulate human speech. Therefore, wherever there is data, AI is bound to follow. With the digitization of data storage, a wide range of big data is now available. E-commerce data, social media data, and stock market data have been used to create recommendation algorithms, sentiment analysis algorithms, and financial prediction models. With the invention of high-throughput technologies such as next generation sequencing (NGS), genomics has joined the ranks of platforms benefiting from big data.

What role does AI play in genomics?

Dr. Kron explained that the wealth of genomic data in databases such as GenBank (Figure 1) is unlikely to be meaningfully interpreted by a human, yet holds many key insights into genetic disease. The nature of monogenic disease has largely been elucidated, and as such, complex genetic disease is where AI technologies “really shine through”. The multifaceted nature of these diseases means that the answers are not always evident at face value; hence, an AI algorithm that can efficiently parse entire genomic datasets will see great success in identifying patterns across disease traits. Another great application of AI, and the specialty of Deep Genomics, is drug design. As an example, Dr. Kron explained that if we aim to identify a 20-nucleotide sequence to target a gene with 50,000 nucleotides, there are 49,981 possible start positions (50,000 − 20 + 1) that need considering. If the search is then modified to incorporate chemical modifications or variable sequence lengths, the number of possible sequences increases exponentially. Deep Genomics uses AI to screen through all regions of the gene and quickly identify an optimal target. As Dr. Kron maintains, “… having an integrated AI or [machine learning] driven approach to understanding diseases and disease biology [and] finding new targets is critical to the medicine of the future.”
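The arithmetic behind Dr. Kron’s example can be checked with a short Python sketch. It is only an illustration of how the search space is counted, not a description of Deep Genomics’ software: the function names, the toy sequence, and the assumption of a fixed number of chemical choices per nucleotide are all hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def count_windows(gene_length: int, oligo_length: int) -> int:
    """Count the contiguous windows of oligo_length within a gene of gene_length."""
    if oligo_length <= 0:
        raise ValueError("oligo_length must be positive")
    return max(gene_length - oligo_length + 1, 0)


def count_windows_variable(gene_length: int, lengths: Iterable[int]) -> int:
    """Count windows when several oligo lengths are allowed."""
    return sum(count_windows(gene_length, k) for k in lengths)


def count_modified_designs(gene_length: int, oligo_length: int,
                           choices_per_position: int) -> int:
    """Count designs if every nucleotide could independently take one of
    choices_per_position chemical forms (a simplifying assumption)."""
    return count_windows(gene_length, oligo_length) * choices_per_position ** oligo_length


def candidate_windows(sequence: str, oligo_length: int) -> Iterator[Tuple[int, str]]:
    """Yield (start index, subsequence) for every window in a sequence."""
    for start in range(len(sequence) - oligo_length + 1):
        yield start, sequence[start:start + oligo_length]


if __name__ == "__main__":
    print(count_windows(50_000, 20))                       # 49981, as in the text
    print(count_windows_variable(50_000, range(18, 23)))   # lengths 18 to 22
    print(count_modified_designs(50_000, 20, 2))           # two chemistries per position
    for start, window in candidate_windows("AUGGCUAGCU", 4):  # toy sequence
        print(start, window)
```

Allowing several lengths only adds the window counts together, whereas per-position chemical choices multiply the count by a power of the oligonucleotide length, which is where the exponential growth comes from.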

Figure 1: Recorded number of nucleotides on GenBank over time, data collected from NCBI4.

Deep Genomics has recently published a seminal paper about their AI platform, BigRNA, believed to be the world’s first foundational AI model for RNA therapeutics5. Foundational models are larger machine learning models and have the ability to carry out multiple tasks simultaneously, making them exceptionally versatile in decoding complex biological processes. Dr. Kron elaborated that BigRNA stands out for its ability to accurately predict how specific perturbations to a DNA sequence can influence RNA expression levels and previously unknown tissue-specific mechanisms. Additionally, BigRNA can subsequently design therapeutics to target specific genes that are incorrectly expressed. BigRNA can potentially have a huge clinical impact, helping decode variants of unknown significance in both coding and non-coding regions, and efficiently designing effective therapeutics for rare variants.

Dr. Kron additionally discusses the unique advantages of BigRNA over previous RNA-based ML models. Similar models are based on the cap analysis of gene expression sequencing technique (CAGEseq), whereas BigRNA is trained on RNA sequencing (RNAseq) data. Due to CAGEseq’s limited resolution when reading RNA, it is unable to ascertain small-scale sequence variation, including single nucleotide variants (SNVs)6. SNVs are responsible for a great deal of genetic variation and many are believed to be pathogenic, or disease-causing. Thus, possessing the ability to screen for the presence of SNVs in RNA is a massive upgrade for the BigRNA model. In fact, it is capable of making predictions concerning the functional effects of these SNVs as well. In addition, this model is able to assess the role of untranslated regions within gene bodies, and the downstream effects of their variation. As Dr. Kron expresses, “I think we’re taking it a little bit further, and trying to push BigRNA to the next level”. The genomics community is beginning to grow its understanding of non-coding variants, and Deep Genomics’ BigRNA is a perfect example of how to approach this gargantuan undertaking.

However, as with any groundbreaking technology, the promise of foundational AI models such as BigRNA must be scrutinised in order to determine their clinical validity. Dr. Kron emphasised that a common limitation of AI models is the lack of experimental validation, reminding us that “… where …large claims [are] made around what AI can do with a lack of experimental data…, I think you need to take that with a fairly large grain of salt”. He believes that much of Deep Genomics’ work is easily validatable, and that getting empirical data to validate predictions is critical to accepting the reliability of any AI/ML recommendations. Validation of BigRNA sought to ensure that the predicted targets would repeatedly have a tangible downstream effect on gene expression. In this validation, it was shown that BigRNA consistently designed SBOs capable of upregulating expression of 14 different genes, some of which are known to be involved in diseases including Wilson’s disease and spinal muscular atrophy5. mRNA expression is known to be altered by pathogenic mutations associated with autoimmune disorders, developmental disorders, cancer, and other conditions. Hence, the clinical impact of correcting mRNA expression can be profound, creating the possibility of potentially curing a wide range of diseases that are prevalent in today’s society.

The Future of AI and Genomics

The intersection of AI and genomics, epitomised by the groundbreaking BigRNA platform from Deep Genomics, represents a paradigm shift in genomics, and biology more broadly. Dr. Kron agrees that AI will instigate greater advancements in genomics than the invention of NGS, and foresees it will do much more. In the context of BigRNA, Dr. Kron claims, “it’s not just a powerful starting point. It actually gives you the answer with a high degree of confidence”. The implications of being able to trust AI tool claims without the need for extensive validation are profound. This shift could significantly reduce costs in both monetary and temporal terms for both testing and assessment of genomic data. As such, the process of diagnosing and developing therapeutics for genetic diseases will be streamlined, promising faster turnaround times with increased accuracy. Recognising the manner in which exponentially growing genomic datasets will result in further improvement of AI models, a world where AI predictions are truly trusted is not merely wishful thinking.

References

  1. Kron, K. et al. Discovery of Novel Hypermethylated Genes in Prostate Cancer Using Genomic CpG Island Microarrays. PLOS ONE 4, e4830 (2009).
  2. Holgersen, E. M. et al. Transcriptome-Wide Off-Target Effects of Steric-Blocking Oligonucleotides. 2020.09.03.281667 Preprint at https://doi.org/10.1101/2020.09.03.281667 (2020).
  3. Holzinger, A., Langs, G., Denk, H., Zatloukal, K. & Müller, H. Causability and explainability of artificial intelligence in medicine. WIREs Data Min. Knowl. Discov. 9, e1312 (2019).
  4. GenBank and WGS Statistics. https://www.ncbi.nlm.nih.gov/genbank/statistics/.
  5. Celaj, A. et al. An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. 2023.09.20.558508 Preprint at https://doi.org/10.1101/2023.09.20.558508 (2023).
  6. Guerrini, M. M., Oguchi, A., Suzuki, A. & Murakawa, Y. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin. Immunopathol. 44, 127–136 (2022).

Finding Solutions for Liver Transplant Rejection From Rat Liver Cell Atlas

Liver transplantation is one of the main treatment options for uncontrollable liver diseases, but organ rejection severely affects patient survival. Dr. Gary Bader utilizes computational tools and genomic technologies to create the active cell map, aiming to uncover the mystery behind liver transplant rejection.

Heidi Li, Jasmine Li, and Yuxi Yang

Gary Bader, PhD., Ontario Research Chair in Biomarkers of Disease, Principal Investigator at The Donnelly Centre, and Professor at University of Toronto. Image taken from the University of Toronto Department of Molecular Genetics website.

Exploring the Intersection of Biology and Computer Science

As numerous computational tools become available, opportunities in the field of bioinformatics are on the rise, fundamentally altering the way people process information and transforming research methods in the lab. The rapid development in the field accelerates the process of genomic data analysis and facilitates data sharing.

With the emergence of bioinformatics, many scientists pursue their interest by applying computational tools to better understand genomic data and support biomedical research. Dr. Bader is among the many scientists who discovered their passion for bioinformatics. As a pioneer in this interdisciplinary field, Dr. Bader’s journey into bioinformatics started back in high school biology and computer science classes. Despite encountering individuals who commented on the uncertain future of computer science at the time, Dr. Bader continued to pursue his initial interests and actively explored this research field during his undergraduate years. After graduating, Dr. Bader joined a bioinformatics lab, where he solidified his passion for life science and computer science. Now, Dr. Bader is a professor at the University of Toronto and affiliated with the Terrence Donnelly Centre for Cellular and Biomolecular Research.

The Bader lab is actively contributing to the field of health and disease research by uncovering essential biological networks and pathways involved in disease progression. They use computational tools to study integrated genomics, transcriptomics, and proteomics to investigate detailed molecular mechanisms and interactions, providing valuable results in clinical research and precision medicine.

When asked what has inspired him in doing the work and research, Dr. Bader says, “[It’s] all motivated by interests, needs and opportunities.”

From Active Cell Map to Treatment Strategies

The need for liver transplantation has increased over the years. Globally, the number of liver transplants in 2021 almost reached 34 700, marking a 6.5% increase from 2020 and a 20% increase from 20151. However, around one-quarter of the patients did not survive 5 years after transplantation2. Studies have shown that up to 40% of transplant recipients develop rejection against the donor liver, and the organ rejection significantly reduces the recipients’ survival rate3. Investigating the cause of liver transplant rejection is urgently needed.

There has been unprecedented advancement in the development of novel sequencing technology and computational tools. As noted by Dr. Bader, modern technologies, such as single-cell RNA (scRNA) sequencing, are able to “give [us] new kinds of information. We see things that we never saw before, and that generates a whole new field.” Dr. Bader recognizes the need to find cures for patients suffering from transplant rejection. He takes the opportunity provided by these technological advances to explore possible solutions for transplant rejection.

To look for a cure, one strategy is to create the active cell map, in which the expression of genes in liver cells is profiled under healthy and diseased conditions in rat liver models4. By revealing the differences between the diseased and healthy states, the active cell map will hopefully guide us to the molecular mechanism causing transplant rejection.

When discussing the inspiration behind developing the active cell map in the rat liver model, Dr. Bader shares with us a fascinating observation from previous studies: two strains of rats, belonging to the same species, show completely opposite outcomes for liver transplant operation – either rejection or tolerance5. To find out why the opposite outcomes happen, Dr. Bader collaborates with surgeons and immunologists in the liver transplant program at the University Health Network, aiming to construct active cell maps that characterize differences at the gene expression level in liver cells between the two strains of rats4. As one might expect, this work demands resources and dedication, with scientists taking multiple steps to achieve the ultimate goal. Their first paper, published in 2023, focuses on liver cell profiling under the healthy state4, “to see what the differences are at baseline or at the healthy state between the 2 strains of rats.”

To construct the active cell map, scRNA sequencing is used to collect gene expression information4, and the gathered data is processed, analyzed, and visualized, using bioinformatics tools that require expertise in both biomedical and computational science (Figure 1). Specifically, machine learning algorithms are employed to visualize intricate cellular heterogeneity in the liver model4. Bioinformatics tools are utilized to identify active pathways in specific regions in the rat liver, to characterize liver cell identities and to capture strain-specific differences4.

Figure 1. A Schematic Overview of the Rat Liver Cell Profiling Using Single Cell RNA Sequencing Technology. Heterogeneous liver cell populations obtained from two strains of rat liver model undergo single cell RNA sequencing4. The single cell RNA sequencing technology generates a large amount of raw data which requires computational tools for analysis. The analyzed data is then interpreted for the transcriptomic characterization of liver cells obtained in the rat strains4. The liver cell profiling provides important reference for future liver transplantation studies. (Figure created in BioRender.com)
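As a rough illustration of the kind of processing this involves, the Python sketch below normalizes raw single-cell counts and compares one gene between two groups of cells. It is a simplified teaching example and not the pipeline used for the rat liver atlas: the counts, the gene names, and the strain labels are invented, and real analyses rely on dedicated single-cell toolkits with additional quality-control, clustering, and statistical testing steps.

```python
import math
from statistics import mean
from typing import Dict, List

Cell = Dict[str, float]


def normalize_cell(counts: Dict[str, int], scale: float = 10_000.0) -> Cell:
    """Scale a cell's raw counts to a common library size, then log-transform."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(count / total * scale) for gene, count in counts.items()}


def mean_expression(cells: List[Cell], gene: str) -> float:
    """Average normalized expression of a gene across a group of cells."""
    return mean(cell.get(gene, 0.0) for cell in cells)


def strain_difference(cells_a: List[Cell], cells_b: List[Cell], gene: str) -> float:
    """Difference in mean log-normalized expression between two groups."""
    return mean_expression(cells_a, gene) - mean_expression(cells_b, gene)


# Invented toy counts: two cells per hypothetical strain, two placeholder genes.
strain_1_raw = [{"marker_gene": 30, "housekeeping_gene": 70},
                {"marker_gene": 20, "housekeeping_gene": 80}]
strain_2_raw = [{"marker_gene": 5, "housekeeping_gene": 95},
                {"marker_gene": 10, "housekeeping_gene": 90}]

strain_1_cells = [normalize_cell(cell) for cell in strain_1_raw]
strain_2_cells = [normalize_cell(cell) for cell in strain_2_raw]

if __name__ == "__main__":
    diff = strain_difference(strain_1_cells, strain_2_cells, "marker_gene")
    print(f"marker_gene difference (strain 1 minus strain 2): {diff:.2f}")
```

Scaling every cell to the same total before comparing means that a difference reflects the share of a cell’s RNA devoted to a gene rather than how deeply that cell happened to be sequenced.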

One of the most interesting findings is that “one [rat] strain has, by default, more active macrophages, and that is maybe one of the reasons why this strain is more likely to reject a transplanted organ,” says Dr. Bader. Macrophages are important in activating host defense against foreign entities and in triggering the adaptive immune response5. Previous studies suggest that macrophages are an essential contributor to transplant rejection6, which is consistent with the results produced in the rat liver model. This finding also opens the door to future studies investigating the details of the immunogenic mechanism underlying organ rejection.

The current study has successfully characterized heterogeneous liver cells from two healthy rat strains4. Building upon these discoveries, Dr. Bader shares with us that “the next phase of that project is to study the same [rat model] system, but in their diseased state.” At this moment, Dr. Bader and his team are working to “find out if we can learn more about the transplant rejections and see if we can treat the system to reduce transplant rejections. The ultimate goal is to improve the reliability and success rate for organ transplants.”

Having a comprehensive understanding of transplant rejection will significantly improve transplantation outcomes in the future. Scientists may be able to develop predictive software to accurately forecast post-transplantation prognosis. In terms of treatment, the analysis of differentially expressed genes and active pathways can potentially lead to the development of novel therapeutic strategies for downregulating immune responses targeting the organ. Eventually, such studies will provide valuable insights into transplant patient management and offer significant benefits to patients suffering from liver cancer and other uncontrollable liver diseases.

Challenges in Data Shortage and Complexity

One of the primary hurdles in bioinformatics research is the shortage of data. Dr. Bader emphasizes the vastness and complexity of the potential data that could be collected, and notes that the currently available data “is only a tiny, tiny fraction of possible data that we could collect,” despite billions of dollars being invested.

Developing a comprehensive active cell map requires prior knowledge of human biology. Significant progress has been made in understanding gene functions, protein complexes, biological pathways, and human body systems. Nonetheless, “A lot of genes in the human genome don’t have any known function, so we’re missing a lot of information there,” explains Dr. Bader. This issue underscores the need for additional research to bridge the gap between our current understanding of biological mechanisms and unresolved research questions.

To add another layer of complexity, technical challenges exist. For instance, the lack of immunological tools for rat cells poses a significant challenge, making in vitro validation difficult unless more antibodies designed specifically for rats become available. Furthermore, ambient RNA, background RNA in the sample that does not come from the cell being profiled, complicates bioinformatics analysis.

Science as a Team Sport

To address the challenges in data shortage and complexity, open science plays a vital role. Dr. Bader defines open science as “a way of doing science, that you share your results as early as you can with other people publicly, so that everybody can use all the information that’s available.” Additionally, reproducible science enables researchers to efficiently reproduce identical results from shared data. Open and reproducible science are essential for swiftly translating scientific knowledge into actionable insights. They facilitate efficient information sharing among the research community, clinical health care professionals, and individuals affected by diseases. This collaborative activity can guide physicians to make well-informed decisions at critical stages, ultimately benefiting patients’ health.

However, obvious conflicts exist when implementing open and reproducible science, as “the traditional science culture values publications and authorship.” To alleviate the conflicts, Dr. Bader highlights the importance of recognizing the value of open science in transplant patient care and practicing good behavior by respecting people’s work and embracing the concept of open and reproducible science.

Dr. Bader emphasizes the importance of teamwork, suggesting that “we should all think about how to work as a team.” One step we can take to support open and reproducible science is publishing our work on a preprint server, as the results can be utilized more quickly. Additionally, Dr. Bader suggests making all of our code available as notebooks on GitHub, so people can reproduce the code, enhancing efficiency and facilitating reproducibility.

As noted by Dr. Bader, “Open and reproducible science, and learning how to develop skills in that area help people work better as a team, and science is a team.” In the long run, patients eventually benefit from the practice of open and reproducible science. Patients’ well-being should always be prioritized in scientific research activities, which aligns with Dr. Bader’s initial motivations for pursuing biomedical research. His projects are driven by his interests in biology and computer science, the need for a deeper understanding of the human body system, and the opportunities afforded by automated data processing and analysis.

With over 20 years of experience in bioinformatics, Dr. Bader’s expertise serves as a guiding force in understanding complex genomic data. Ultimately, his dedication to advancing biomedical research holds promise for improving liver transplant patient outcomes and tackling emerging challenges in bioinformatics research.

Dr. Bader and his lab members, Donnelly Centre for Cellular and Biomolecular Research, University of Toronto. Photo provided by Dr. Bader.

References

  1. Global Observatory on Donation and Transplantation. Available at: https://www.transplant-observatory.org/. (Accessed: 8th March 2024)
  2. Liver Transplant. MAYO CLINIC (2024). Available at: https://www.mayoclinic.org/tests-procedures/liver-transplant/about/pac-20384842#:~:text=Your%20chances%20of%20a%20successful,for%20at%20least%20five%20years. (Accessed: 8th March 2024)
  3. NACIF, L. S. et al. Late acute rejection in liver transplant: A systematic review. ABCD. Arquivos Brasileiros de Cirurgia Digestiva (São Paulo) 28, 212–215 (2015).
  4. Pouyabahar, D. et al. A rat liver cell atlas reveals intrahepatic myeloid heterogeneity. iScience 26, 108213 (2023).
  5. Wang, X., MacParland, S. A. & Perciani, C. T. Immunological Determinants of Liver Transplant Outcomes Uncovered by the Rat Model. Transplantation 105, 1944–1956 (2021).
  6. Li, J. et al. The Evolving Roles of Macrophages in Organ Transplantation. J. Immunol. Res. 2019, 5763430 (2019).

Cell-Free DNA: The Future of Cancer Diagnostics

Dr. Trevor Pugh and the CHARM Consortium’s research in cell-free DNA analysis heralds a new dawn in the early detection and personalized treatment of hereditary cancer syndromes, showcasing the power of liquid biopsy to redefine the landscape of oncology.

Andeep Turna and Yash Patel

Dr. Trevor Pugh, PhD, FACMG, (he/him) is the Canada Research Chair in Translational Genomics, Senior Scientist at Princess Margaret Cancer Centre, Genomics Director at the Ontario Institute for Cancer Research (OICR), and Professor at the Department of Medical Biophysics, University of Toronto. He directs the OICR integrated genomics program that encompasses several key research teams and platforms, focusing on advancing genomic medicine through basic, translational and clinical research. Photo provided by Dr. Trevor Pugh.

In the relentless fight against cancer, a groundbreaking approach promising to redefine early detection and monitoring is emerging, particularly for those carrying the heavy burden of hereditary cancer syndromes. This new frontier is the analysis of cell-free DNA (cfDNA), a revolutionary biomarker found circulating in our blood, shedding light on the genetic blueprint of cancer without the need for invasive procedures1. With the potential to transform the landscape of cancer care, cfDNA and the technique known as liquid biopsy stand at the cusp of a new era in precision medicine.

cfDNA refers to tiny fragments of DNA that freely circulate in the bloodstream, originating from the natural turnover of cells2. When cancer is present, these fragments include circulating tumor DNA (ctDNA), which carries the specific genetic alterations of the tumor (Figure 1)2. The ability to detect and analyze ctDNA through a simple blood draw, also known as a liquid biopsy, opens unprecedented possibilities for cancer detection1. This new method offers a non-invasive window into the molecular makeup of a patient’s cancer1. The significance of cfDNA extends beyond its diagnostic value; it heralds a shift towards more proactive and personalized cancer management. About 5% of cancer cases are linked to a specific genetic trait passed down in families, known as hereditary cancer syndrome (HCS)1. Individuals with HCSs face a significantly elevated risk of developing various cancers throughout their lives3. One example of this is Li-Fraumeni syndrome, where patients face a near 100% lifetime risk of developing cancer3. For these patients, the advent of liquid biopsy is particularly impactful: it enables early detection and ongoing monitoring of cancer with a level of precision and convenience previously unattainable1,3.

Figure 1. Key Sources of cfDNA in Cancer: Illustration of primary sites where cell-free DNA (cfDNA) originates, highlighting tumor locations where cfDNA is released into the bloodstream due to cancerous cell processes. This figure also notes the presence of cfDNA from fetal cells in pregnant women, showcasing the diverse origins of cfDNA relevant to cancer diagnosis and monitoring. Abbreviations: cfDNA – cell-free DNA, WBC – white blood cells, CTC – circulating tumor cells, RBC – red blood cells. Figure adapted from2.

At the forefront of this promising field is Dr. Trevor Pugh, a visionary scientist and a leading figure in the cfDNA in Hereditary and High-Risk Malignancies (CHARM) consortium. His work epitomizes the collaborative effort to harness the power of cfDNA and liquid biopsies in revolutionizing cancer care. By leveraging advanced DNA sequencing technologies, Dr. Pugh and his team are not only detecting cancer earlier but are also tracking its evolution in real-time, paving the way for tailor-made therapeutic strategies that cater to the unique genetic profile of each patient’s tumor3,4. Fortunately, Dr. Pugh is also an educator at heart who was willing to share his expertise and latest work with us with no reward other than for aspiring scientists to learn of such groundbreaking work.

Before diving into his work, we asked Dr. Pugh what inspired him to pursue research in the field of cfDNA. From our discussion, Dr. Pugh revealed that his motivation stems from the inherent limitations of traditional cancer biopsies—they are difficult to obtain, cannot be performed repeatedly, and often are not entirely composed of cancer cells. Dr. Pugh emphasized that “The main reason we wanted to get into the cell-free DNA space was the ability to look at how cancer genomes change during treatment… no one wants to sign up for a monthly biopsy, but a monthly blood test is something that you could do.” This interest quickly evolved to the potential for early cancer detection in hereditary cancer carriers and represented a pivotal shift towards more accessible, non-invasive cancer monitoring strategies.

Building on this foundation, CHARM was born: a consortium driven by a nationwide need for standardized cfDNA surveillance across Canada and focused on advancing cfDNA technology for early cancer detection. The consortium’s development was further accelerated by the need to resolve the variability found in HCS management across the provinces and territories of Canada. “It was an opportunity for us to do cell-free DNA surveillance in a standardized way across multiple centers,” explained Dr. Pugh. This effort is particularly poignant in a country such as Canada, which has an abundance of rural and remote areas where access to traditional cancer screening is limited. When questioned on the social ramifications of his work, Dr. Pugh was proud to note that the CHARM consortium is not just pioneering cfDNA technology for early cancer detection; it’s also championing equitable access to life-saving surveillance. In a healthcare landscape where geography often dictates the availability of advanced diagnostics, CHARM’s work is a beacon of hope. “There’s a huge opportunity for just access to cancer surveillance in general,” Dr. Pugh noted, emphasizing the initiative’s potential to bridge the gap for underserved populations. By leveraging cfDNA testing, CHARM is poised to offer a lifeline to remote and rural communities, ensuring that advanced cancer surveillance transcends the barriers of location and resources. Furthermore, a critical goal of Dr. Pugh and CHARM is to evolve cfDNA testing into a point-of-care system, making it widely accessible across Canada. The current centralized analysis in Toronto, while effective for initial studies, isn’t a long-term solution and “… the goal is to have sequencing capacity also being distributed,” Dr. Pugh confirmed3. He envisions a future where cfDNA testing can be conducted locally, drastically reducing costs and expediting results. This decentralization aims to enable broader, more equitable access to cutting-edge cancer surveillance, ultimately transforming the early detection landscape.

These long-term, far-reaching goals may seem ambitious, but the ambition is well founded. The advent of liquid biopsy through cfDNA analysis represents a significant advancement over traditional cancer detection methods, promising a paradigm shift in how we approach cancer surveillance1. Dr. Pugh highlights two critical benefits: accessibility and early detection. Unlike conventional methods that require physical visits to centralized facilities, cfDNA can be collected anywhere, significantly improving access to cancer surveillance. More importantly, cfDNA’s potential to detect cancer earlier than existing methods could lead to more effective treatments and better outcomes, marking a pivotal step forward in patient care. Not only do these advantages help with prognosis, but they also ease the psychological strain that comes from the constant stress of a potential cancer diagnosis5. “The term we’ve heard from the patients we’ve worked with is ‘Scanxiety’. You’re really worried about what the next scan is going to make the patients feel,” Dr. Pugh explained, emphasizing the need to consider the emotional toll on individuals awaiting their results. A white paper by LUNGevity, one of the largest lung cancer non-profits, noted that clinicians do not always recommend additional tissue biopsies because of how patients might feel or the potential worsening of their condition6. With liquid biopsies, by contrast, patients feel anxiety similar to that of getting a blood test, which is typically far lower than with traditional invasive methods. Additionally, the ability to perform multiple tests would further help alleviate concerns and allow for more frequent routine testing5. In Dr. Pugh’s experience, “…the frequency of the test, especially if you have multiple negatives in a row, may acclimatize people to getting these types of testing, and there may actually be reduced anxiety by having a test more frequently versus a single, large, high stakes test every year.”

In fact, the benefits are so enticing, Dr. Pugh and CHARM have recently been awarded just under $7.5 million over five years by the Canadian Cancer Society and the Canadian Institute for Cancer Research. According to the Canadian Cancer Society, “The results of this project could have wide-reaching implications for early detection of cancer above and beyond those people living with FCS [Familial Cancer Syndrome]”7. When asked to comment on this, Dr. Pugh said, “It is probably the most ambitious study we’ve taken on… we’re hoping to be able to have the data within those four years to make a very strong statement as to [if] cell-free DNA [can] find these cancers early?” Research at this scale can take many years, sometimes even decades, but Dr. Pugh is pushing to get the data needed to bring this test to greater clinical utility. 

Currently, the CHARM consortium has published data from a retrospective study of cfDNA-based cancer detection in patients with Li-Fraumeni syndrome. Dr. Pugh highlighted the impressive accuracy of the test in excluding cancer, noting that its negative predictive value exceeded 95%, which implies a high level of confidence in negative results. Moreover, he mentioned that the positive predictive value was slightly above 50%, meaning that just over half of the positive results corresponded to an actual cancer3. The significance of this test lies in its ability to confidently rule out cancer in numerous patients with HCS3. The success of cfDNA testing in detecting and ruling out cancer significantly earlier than traditional methods was enough to shock Dr. Pugh when he first saw the results of his work. Dr. Pugh confessed, “I’m surprised at how much earlier we’ve been able to find these early cancers…even before it’s apparent by imaging,” underscoring the transformative potential of liquid biopsy in changing the discipline of cancer detection and treatment. Dr. Pugh wants to improve these results by running another trial of cfDNA methylation testing in a future project, provided that the results of this five-year prospective study are promising. He plans to use methylation testing as a secondary research measure after a positive genetic result, aiming to deduce the cancer’s tissue of origin, as every tissue exhibits unique methylation patterns. When asked why they did not attempt to test methylation in their current study, Dr. Pugh stated, “The challenge with methylation-based result was we needed large numbers of reference plasma samples from patients with the cancers that our population is at risk for, and we didn’t yet have that reference set to set that up as a high-quality clinical assay.”
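For readers unfamiliar with the two predictive values quoted above, the short Python sketch below shows how they are calculated from test outcomes. The function name and the counts are hypothetical, chosen only so that the numbers land near the figures Dr. Pugh described; they are not data from the CHARM study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from counts of test outcomes.

    tp: positive tests where cancer was present
    fp: positive tests where no cancer was found
    tn: negative tests where no cancer was found
    fn: negative tests where cancer was present
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative result")
    ppv = tp / (tp + fp)  # share of positive results that were real cancers
    npv = tn / (tn + fn)  # share of negative results that were truly cancer-free
    return ppv, npv


# Hypothetical counts for illustration only; NOT taken from the CHARM study.
ppv, npv = predictive_values(tp=11, fp=9, tn=97, fn=3)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these made-up counts, 11 of 20 positive results reflect a real cancer, while 97 of 100 negative results are correct. This is why a test can be very reassuring when negative even though a positive result still needs follow-up.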

Undeniably impressive results aside, the technology being developed is not without challenges. Implementing cfDNA technology across Canada presents formidable obstacles, chiefly cost and engagement. Transporting samples to centralized testing facilities currently incurs significant expenses, a barrier Dr. Pugh acknowledges yet views as surmountable with scale and efficiency improvements. Equally critical is fostering understanding and buy-in from both patients and healthcare providers. It is imperative that both parties understand cfDNA’s role as an adjunct to, not a replacement for, existing surveillance methods and that incorporating such methods is a big commitment. 

Addressing these challenges through future research is paramount for CHARM to realize its vision of making early cancer detection a universal standard. One avenue of doing so, that Dr. Pugh is particularly excited about, is researching the potential of “fragmentomics” in improving early testing technology. Dr. Pugh celebrates that “One spin-off study that’s actually come from CHARM is building a fragment database…to look for these specific breakpoints associated with time of cancer diagnosis,” highlighting the global collaboration aimed at pooling fragmentomic data to advance cancer diagnosis. This innovative approach could not only detect cfDNA from cancer cells but also infer its cell of origin, dramatically enhancing the precision of cancer detection. So, what do these results tell us? Can we look forward to not having to perform tumor biopsies ever again? Dr. Pugh sure seems to think so, noting that he “… can certainly see a day where we might not necessarily have a tissue biopsy. We might do all of this from cell-free DNA as costs get lower.” Furthermore, cfDNA assays with methylation testing may expand this testing to all types of cancer as we may be able to determine the tissue of origin. We may not be far away from the day where everyone can get routine cancer screening from blood tests via our yearly physical examinations thanks to the tireless efforts of our scientists and physicians. However, all this hard work would be impossible without the participation from the community. When asked what Dr. Pugh would like to say to someone who has been impacted by a hereditary cancer syndrome, he emphasized, “I think in general, participate in research. You get extraordinary oversight from really the world’s greatest leaders who are working and trying to understand your disease and work with clinical investigators… we thank people for participating in research. And it’s the way to get cancer treatment and cancer surveillance better and better.”

References

  1. Farncombe, K. M. et al. Current and new frontiers in hereditary cancer surveillance: Opportunities for liquid biopsy. The American Journal of Human Genetics 110, 1616–1627 (2023). 
  2. Arshad, S. et al. Cell free DNA; Diagnostic and prognostic approaches to oncology. Advances in Cancer Biology – Metastasis 5, 100052 (2022). 
  3. Wong, D. et al. Early cancer detection in Li–Fraumeni syndrome with cell-free DNA. Cancer Discovery 14, 104–119 (2023). 
  4. El Ghamrasni, S. et al. Mutations in noncoding cis-regulatory elements reveal cancer driver cistromes in luminal breast cancer. Molecular Cancer Research 20, 102–113 (2021). 
  5. Adi-Wauran, E. et al. “I just wanted more”: Hereditary cancer syndromes patients’ perspectives on the utility of circulating tumour DNA testing for cancer screening. European Journal of Human Genetics 32, 176–181 (2023).
  6. Roy, U. B., Mantel, S., Jacobson, M. & Ferris, A. OA16.06 Willingness for multiple biopsies to improve quality of lung cancer care: Understanding the patient perspective. Journal of Thoracic Oncology 12, (2017).
  7. Canadian Cancer Society / Société canadienne du cancer. Engaging people with gene mutations to detect cancer earlier with a blood test. Canadian Cancer Society Available at: https://cancer.ca/en/research/for-researchers/funding-results/breakthrough-team-grants/engaging-people-with-gene-mutations-to-detect-cancer-earlier-with-a-blood-test. (Accessed: 8th March 2024)

DNA Methylation Signatures: Their Future in Personalized Genomics

In the evolving landscape of epigenetics, Dr. Rosanna Weksberg’s groundbreaking work on DNA methylation signatures illuminates the intricate interplay between epigenetics and disease and paves the way for transformative insights into autism spectrum disorder (ASD) and other neurodevelopmental conditions.

Areeba Imran, Sananda Pragalathan, and Kobe Huynh

Rosanna Weksberg, Ph.D., MD
Clinical Geneticist, Division of Clinical and Metabolic Genetics
Senior Associate Scientist, Genetics and Genome Biology Program
The Hospital for Sick Children
Professor, Departments of Paediatrics and Molecular Genetics
Institute of Medical Science, University of Toronto

Many astounding scientific breakthroughs do not arise without controversy. In the intricate landscape of genetics, epigenetics emerges as a dynamic realm, constantly unfolding with new revelations, in which epigenetic changes alter gene expression without changing the DNA sequence itself1. These subtle but powerful changes, which include chemical modifications to DNA and histone proteins, can be influenced by genetic factors, environmental factors, and lifestyle choices1. Many discoveries within this captivating field focus on using epigenetics as a diagnostic tool for various neurodevelopmental diseases2. At the forefront of epigenetics research is Dr. Rosanna Weksberg, a distinguished clinical geneticist and senior associate scientist at The Hospital for Sick Children and a professor at the University of Toronto. In our enthralling discussion with Dr. Weksberg, she unveiled the riveting tale of her first landmark contribution to the epigenetics field, a pivotal moment that would sculpt the trajectory of her career. In 2011, Dr. Weksberg and her esteemed team submitted for publication a groundbreaking DNA methylation (DNAm) signature for Sotos syndrome, a rare genetic disorder stemming from pathogenic variants (PVs) in the NSD1 gene. DNAm signatures are unique patterns of genome-wide DNA methylation that offer a glimpse into the molecular fingerprints of disease, serving as invaluable biomarkers for early detection and diagnosis2. DNA methylation involves adding a methyl group to specific sites on the DNA molecule known as CpG sites2. PVs are changes in the DNA sequence of a gene that cause a patient to have, or be at risk of developing, a specific disease1. Dr. Weksberg’s team’s journey to publication was not simple. The discovery that PVs impact DNA methylation across the genome was met with curiosity and skepticism. Editors and reviewers, although intrigued, required several rounds of revision and letters of support from independent bioinformaticians before finally accepting the paper for publication in Nature Communications. Yet this publication marked only the beginning of Dr. Weksberg’s team’s inspiring odyssey. Her team’s ongoing efforts aim to unravel the intricate tapestry of epigenetic markers and shed light on the interplay between genetics, epigenetics, and disease. In doing so, they pave the way for a deeper understanding of normal development, health, and the complex mechanisms underlying genetic disorders.

Dr. Weksberg has dedicated her career to unraveling the enigmatic complexities of DNAm signatures. During our conversation, she fondly recounted her inspirational key research findings that led to her lifelong exploration into the world of DNAm signatures. Her team analyzed blood DNA from patients with PVs in an array of epigenetic regulators, a group of genes that play a significant role in controlling the expression of other genes through various epigenetic mechanisms2. They revealed that downstream DNAm changes in blood mirror the pathophysiology of disorders. Therefore, her team hypothesized that this downstream effect occurred in many epigenetic regulator genes. However, this downstream effect also occurred for some genes that interact with DNA but aren’t known to have epigenetic regulator functions. By not focusing on a specific set of genes, Dr. Weksberg and her collaborators discovered that this phenomenon extends beyond known epigenetic regulators and potentially could involve as many as 500 genes.

Today, the study of DNA methylation has emerged as a beacon of exploration, especially in the realm of epigenetics. To date, around 50 DNAm signatures have been identified2. They serve as potent tools in deciphering the functionality of PVs associated with disease phenotypes or the genes related to disease2. In a remarkable discovery, Dr. Weksberg and her team identified a DNAm signature for the EZH2 gene that is distinctly associated with Weaver syndrome (WS), an overgrowth and intellectual disability disorder3. Their investigation led them to several loss of function (LoF) variants, which commonly occur in WS and reduce or eliminate the gene product’s activity3. Additionally, they identified a gain of function (GoF) variant, where the gene product’s activity is increased or a new function is acquired. When compared with the DNAm signature generated by the LoF variants, the GoF variant produced a DNAm profile opposite to the Weaver LoF profile but at the same CpG sites genome-wide (Figure 1). The GoF signature mirrored the phenotype, given that the individual presented with growth restriction, the opposite of the overgrowth phenotype seen in patients with WS3. However, the narrative took an intriguing turn as Dr. Weksberg’s team delved into the familial landscape of WS. A father and son presented with starkly different clinical features of WS despite having the same EZH2 variant3. Dr. Weksberg’s team considered the potential for somatic mosaicism, which occurs when a PV arises during the development of an organism, so that a particular cell population has a genotype distinct from all other cells. They confirmed their suspicions by genotyping the variant allele and discovering different percentages of the variant allele in the blood of the father, who was mosaic, and the son, who was not mosaic3.
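The comparison of variant allele percentages described above can be illustrated with a minimal Python sketch. It assumes the general expectation that a heterozygous variant present in every cell appears in roughly half of the sequencing reads, whereas a mosaic variant appears in clearly fewer. The function names, read counts, and tolerance are hypothetical, and a real laboratory analysis would use a statistical test that accounts for sequencing depth and error; none of these values come from the Weaver syndrome study.

```python
def variant_allele_fraction(variant_reads, total_reads):
    """Fraction of sequencing reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


def looks_mosaic(vaf, het_expected=0.5, tolerance=0.1):
    """Flag a fraction clearly below the ~50% expected for a heterozygous variant.

    The tolerance is an arbitrary illustrative cut-off, not a clinical threshold.
    """
    return 0 < vaf < het_expected - tolerance


# Hypothetical read counts for illustration only.
father_vaf = variant_allele_fraction(variant_reads=60, total_reads=400)
son_vaf = variant_allele_fraction(variant_reads=196, total_reads=400)

print(f"father: {father_vaf:.0%}, mosaic pattern: {looks_mosaic(father_vaf)}")
print(f"son:    {son_vaf:.0%}, mosaic pattern: {looks_mosaic(son_vaf)}")
```

In this made-up example the father's variant appears in 15% of reads, consistent with only some of his blood cells carrying it, while the son's 49% is what would be expected if every cell carried one copy.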

Figure 1. Heatmap of the GoF variant, LoF variants of WS and control samples. The heatmap is a technique used to visualize DNAm signature patterns. Each row represents a gene, and each column is a sample/individual. The colour and intensity of each box represent the level of DNAm. The yellow bands indicate high DNAm, while the blue bands indicate low DNAm. The heatmap shows that the GoF variant (pink) has a different DNAm signature compared to the LoF variants (red) and controls (blue). Its DNAm profile is opposite to the profile of individuals with WS and also differs from the profile of controls, which indicates a GoF variant instead of a LoF variant of EZH2. The DNAm profile of the GoF variant also matches the phenotype observed in the affected child. WS, Weaver syndrome; GoF, gain of function; LoF, loss of function. Figure taken from 3.

DNAm signatures serve as windows into the profound physiological changes induced by disease. Neurodevelopmental disorders are predominantly caused by PVs present in every tissue from early development, resulting in the impaired functioning of multiple organ systems2. Dr. Weksberg’s team has embarked on a multitude of pioneering studies to support these claims, one of which focused on Nicolaides-Baraitser syndrome, a neurodevelopmental disorder caused by PVs in the SMARCA2 gene2,4. In a remarkable twist of fate, Dr. Weksberg encountered an individual with Nicolaides-Baraitser syndrome who didn’t meet conventional expectations. “We had one child with all the somatic features [of] Nicolaides-Baraitser syndrome but was going to university. These kids normally have severe intellectual impairment,” Dr. Weksberg recounted. When analyzing the DNAm signature, they found an intriguing intermediate signal in the heatmap4. This enigmatic pattern resembled an overlap between the signatures they expected for controls and cases4. From this revelation, Dr. Weksberg and her team concluded that the DNAm signature of SMARCA2 PVs can aid in understanding the syndrome’s pathophysiology and potentially predict an individual’s phenotypic outcome. Their innovative work illuminates new avenues for exploration, a frontier where DNAm signatures can be used to develop biomarkers for effective treatments and prognostic insights. Dr. Weksberg casts her gaze toward the horizon of future research when she proudly said: “And we’re now doing a lot of studies looking very carefully at neurocognitive outcomes and DNA methylation and whether we can correlate DNA methylation with the kinds of outcomes we want to target with treatment.”

With its promising features, DNA methylation emerges as a beacon of hope, holding immense potential as a diagnostic tool. DNAm signatures hold the key to classifying variants of uncertain significance (VUS) as pathogenic or benign while helping to predict their phenotypic outcome2. However, amidst this potential, Dr. Weksberg has emphasized that signatures vary based on cell type. Across the roughly 200 distinct cell types found within the human body, DNAm signatures change when cell types change. This poses a formidable challenge for diagnostic testing in the clinic. In practice, DNAm signatures have been developed using DNA from peripheral blood cells, whose DNAm overlaps variably with that of other cell types2. Critically assessing the origin of DNAm signatures is essential before determining which neurodevelopmental disorders are associated with a specific DNAm signature.

Dr. Weksberg’s research into DNAm signatures reveals cases in which a patient shows the DNAm signature tied to a specific gene and genetic disorder but was not found to have a PV in that gene. Dr. Weksberg notes that this could be because genomic sequencing and analysis did not cover the region containing the variant, or because the disease-causing gene is an interacting gene, one of two or more genes encoding proteins that form a functional complex. The functionally related genes in such a complex may share the same or overlapping DNAm signatures. Another possibility is that some DNAm signatures for different genes can overlap because “epigenetic regulators affect the downstream expression of multiple developmental genes,” Dr. Weksberg explained. This raises an essential point for diagnostic testing with DNAm signatures: gene-specific signatures must be bioinformatically engineered so that they do not overlap with other gene-specific signatures.

The likelihood of the disease-causing gene being an interacting gene is discussed extensively in a paper Dr. Weksberg wrote on WS3. As mentioned above, this disease arises from PVs in the EZH2 gene, which encodes a core component of the Polycomb repressive complex-2 (PRC2)3. This complex involves three genes with similar DNAm signatures3. The DNAm signatures must be analyzed carefully to ensure that the right gene is linked to the neurodevelopmental disorder.

In her relentless pursuit of understanding the genomic architecture of autism spectrum disorder (ASD), Dr. Weksberg turned to whole genome sequencing5 because “…autism is [due to] rare high-risk genes…When whole genome sequencing is done, [we] only identify pathogenic variants in ~25% of cases.” According to Dr. Weksberg, this may be because more common genes may be associated with ASD but are not observed in most patients; in addition, the environment can play a role in ASD. DNAm signature analysis could be useful in future work examining primary epigenetic regulatory genes and ASD.

In the labyrinth of ASD diagnosis, Dr. Weksberg’s lab employs a layered approach. First, they determine whether there is a high-risk PV in a primary epigenetic regulatory gene associated with ASD, observed in many patients. Dr. Weksberg notes that even in the MSSNG database, the largest whole genome sequencing database on ASD, many of these high-risk variants have been classified as VUS6. When the DNAm signatures relevant to these variants are examined, the variants are often re-classified as benign or pathogenic. This shows that DNAm signatures can help geneticists transform the lives of patients with ASD by identifying the gene responsible for their diagnosis. However, there are cases when a DNAm signature classification cannot be linked to a PV in the relevant gene, even when the phenotype matches it. Consequently, the Weksberg lab is harnessing the power of long-read sequencing, a cutting-edge method that sequences lengthy DNA fragments without breaking them into smaller fragments, to better capture genomic variation and pinpoint the precise location of PVs in ASD genes.

The initial skepticism that once shrouded DNAm signatures has been overcome, and they are now accepted as an invaluable tool within genomics research and clinical diagnostics. As the horizon of technological innovation expands, fueled by the incorporation of long-read sequencing to capture genomic variations more accurately, the future of genetics holds exciting possibilities. When asked what advice she would give young researchers entering the field, Dr. Weksberg encouraged the next generation of scientists, saying, “You are making the right choice; [it’s] an amazing space to work in. There’s so much going on, and there are so many potential applications. [The field] allows people to follow their passions in different ways.” For aspiring researchers, her advice underscores the immense potential and opportunities in studying DNA methylation and personalized genomics. The collaborative efforts and interdisciplinary approaches in this space offer avenues for further discoveries, ultimately contributing to a deeper understanding of genetic mechanisms and their implications for human health.

References:

  1. Siu, M. T. et al. Functional DNA methylation signatures for autism spectrum disorder genomic risk loci: 16p11.2 deletions and CHD8 variants. Clinical Epigenetics 11, (2019).
  2. Chater-Diehl, E. et al. Anatomy of DNA methylation signatures: Emerging insights and applications. The American Journal of Human Genetics 108, 1359–1366 (2021).
  3. Choufani, S. et al. DNA methylation signature for EZH2 functionally classifies sequence variants in three PRC2 complex genes. The American Journal of Human Genetics 106, 596–610 (2020).
  4. Chater-Diehl, E. et al. New insights into DNA methylation signatures: SMARCA2 variants in Nicolaides-Baraitser syndrome. BMC Medical Genomics 12, (2019).
  5. Trost, B. et al. Genomic architecture of autism from comprehensive whole-genome sequence annotation. Cell 185, (2022).
  6. MSSNG. https://research.mss.ng/ (2024).

Advancing Genetic Research: Paving the Way Towards Equitable Genomics

Dr. Naveed Aziz is pioneering equitable access to diverse genomic data through transformative initiatives like HostSeq and the Pan-Canadian Genome Library, redefining the landscape of precision medicine and advancing healthcare for all.

Amna Shah, Farah Shah, and Monica Chacón Grijalva

Dr. Naveed Aziz (he/him) is the Vice President of Research and Innovation at Genome Canada. Formerly, he held the position of Chief Executive Officer (CEO) at CGEn and has contributed his expertise to several genetics advisory boards. Recognized as one of Canada’s emerging executive leaders, Dr. Aziz was selected as a member of the inaugural cohort of adMare and Pfizer Canada’s Executive Institute training program in 2018.

Genetics and genomics research primarily focuses on populations with European ancestry, leaving out many underrepresented groups, although change is on the horizon1. As a result of this imbalance, underrepresented populations might miss out on advancements such as deeper insights into disease causes, early disease identification and diagnosis, strategic drug development, and enhanced clinical treatment1. According to the National Human Genome Research Institute (NHGRI), around 78% of genomic data is from individuals of European ancestry, while 10% is from Asian ancestry, 2% from African ancestry, and 1% from Hispanic ancestry2. Moreover, phenotypic responses to diseases vary among individuals, possibly due to genetic variations within the human population. This variability underscores the significance of analyzing and comprehending key genetic variations related to disease, which in turn lead to diverse phenotypes.

To address these challenges and gain insights into the genetic diversity of populations concerning disease, Dr. Naveed Aziz has been leading the HostSeq initiative, which provides insights into the genetic basis of diseases and helps identify potential drug or vaccine targets3. Dr. Aziz has also been one of the key members of the team involved in the planning and design of the Pan-Canadian Genome Library, which aims to store, share and analyze large-scale genomic data from Canadians, recognizing Canada’s diverse cultural makeup. This data will be stored in a central data repository. As a result, this repository becomes a valuable resource for genomic research, aiding in the development of treatments tailored to diverse population groups.

The HostSeq Project:

Amidst the Coronavirus disease 2019 (COVID-19) pandemic, researchers worldwide recognized the importance of understanding virus sequences and monitoring the emergence of new strains. This realization led to one of the collaborative initiatives known as the “HostSeq project” (Figure 1). It has been observed in the recent pandemic that individuals exhibit varying responses to the same viral strain. Studies on severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) viruses also suggest that genetic variations among hosts influence their response to the viruses4. Dr. Aziz and his team recognized that this variability in human response could stem from the interaction between the host genetics and the virus.

Figure 1: The operational framework of the HostSeq project. The project was funded by Genome Canada through CanCOGeN (Genome Canada’s initiative, the Canadian COVID-19 Genomics Network) and delivered by CGEn (Canada’s National Platform for Genome Sequencing & Analysis). Initially, HostSeq responsibilities included COVID-19-affected patient enrollment, obtaining consent, gathering clinical data, sample collection (blood), and securing ethical approval. Subsequently, CGEn conducted whole genome sequencing (WGS), utilized bioinformatic tools for data interpretation, conducted quality control tests, and reported major findings to the principal investigator. The project’s findings are currently undergoing further research studies for validation. Figure taken from 3.

Dr. Aziz highlights, “Sequencing human genomes, particularly in cases of population-wide viral diseases like COVID-19, would help to identify genetic variations or genetic signatures responsible for the severity of the disease.” While similar projects were initiated in countries like the UK and the USA, Canada emerged as one of the leading countries to have delivered such a difficult project. The HostSeq project aimed to sequence the genomes of 10,000 affected Canadians to investigate how genetic makeup influences their response to the virus and explore genetic diversity among them.

Given the complexity of human genomes, identifying variations associated with viral diseases presents a considerable challenge, like finding a needle in a haystack. Presently, the HostSeq project is in the early stages of data analysis, and bioinformaticians are working to differentiate between genetic variants that do and do not correlate with viral diseases. While the project has identified some promising candidate genes that may contribute to severe disease, the data is yet to be published. All genomic data collected through this project, alongside the clinical profiles of patients, will be incorporated into a database accessible to research communities. This repository will enable researchers to conduct further studies on similar diseases.

The insights gained from the HostSeq project will be invaluable for guiding responses to future pandemics or disease outbreaks. Advancements in science and technology will also allow us to gather clues about potential future pandemics5. This can be achieved by analyzing past pandemic datasets and evaluating their various factors and variables through machine learning models5. This research will enhance our understanding of these diseases in advance, offering potential drug and vaccine targets that could be translated to both current and future diseases.

The Pan-Canadian Genome Library Project:

In the complex world of Canadian genomics, Dr. Aziz is co-leading another groundbreaking initiative: the Pan-Canadian Genome Library. This initiative, born out of the pressing need to streamline genomic data management, represents a significant leap forward for the country’s scientific community.

As Dr. Aziz explains, Canada faces critical challenges in its genomic research landscape. While there is a considerable amount of sequencing being conducted across various projects, the resulting data remains fragmented and scattered among different research teams. Each project operates within its own silo, with data residing in disparate locations and inaccessible to others6. This fragmentation hinders collaboration and limits our collective potential to unlock genomic discoveries, and it reflects the absence of a centralized repository where researchers can effortlessly deposit and access genomic data6. He emphasizes, “We still don’t have a place in Canada where we can deposit appropriately consented data in one place for researchers to access”.

The Pan-Canadian Genome Library is supported by $15 million from the Canadian Institutes of Health Research (CIHR) and $10 million from Canada’s National Platform for Genome Sequencing & Analysis (CGEn)6,7. Dr. Aziz envisions, “This initiative is a collaborative effort, bringing together experts who can build databases and construct a unified platform for housing genomic data in Canada where all the newly generated data, obtained with the appropriate consent and deemed readily shareable, can be seamlessly integrated.” This centralized repository will be easily accessible, fostering collaboration and driving innovation in genomic research across the nation, thus serving as new Canadian infrastructure moving forward. Drawing parallels with the UK Biobank, he paints a vision in which the Pan-Canadian Genome Library brings different datasets together, makes them easily accessible, and encourages open collaboration within the scientific community.

Dr. Aziz explains that handling data integration in large-scale projects involves several technical and ethical considerations. While the computational aspects of data integration are manageable due to advancements in computing technology, the consent process is crucial, as individuals must consent to share their data for research purposes. Ensuring that participants are properly informed and have given consent is paramount, as it upholds their rights and privacy. He states, “The data not only benefits Canadians but also originates from them. We must ensure it improves the lives of all Canadians while safeguarding those who provide it for analysis and sharing.”

The significance of the Pan-Canadian Genome Library cannot be overstated. Dr. Aziz highlights its importance by emphasizing the transformative impact it will have on genomics research in Canada. Once the project is completed, it will provide the infrastructure necessary to host and share genomic data on a large scale, effectively revolutionizing the landscape of genomic research8. By establishing a centralized repository, the library will facilitate seamless data sharing among researchers nationwide. Access to comprehensive datasets is essential for conducting meaningful analyses, driving innovation, and unlocking new insights into genetics, drug discovery, vaccine targeting, and beyond8.

Limitations & Future Directions of The Pan-Canadian Genome Library Project:

Dr. Aziz recognizes that there will be challenges along the way as the Pan-Canadian Genome Library project gets under way. Given the number of people involved, deciding which methodologies or processes to apply will be more difficult because of the many diverse perspectives. He acknowledges that not all existing genomic datasets can be readily incorporated into the central repository due to varying consent agreements and accessibility restrictions. Sharing data publicly where consent is lacking could be challenging, if not impossible in some cases. Additionally, taking on liability for the storage and safety of the data will be burdensome, as it involves numerous layers of privacy protection and entails great responsibility.
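The consent constraint described above can be illustrated with a minimal sketch. It assumes each existing dataset carries two simple flags, one for consent to research sharing and one for access restrictions; the `DatasetRecord` type, its field names, and the cohort names are hypothetical and are not taken from the Pan-Canadian Genome Library project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical description of one existing genomic dataset."""
    name: str
    consent_for_research_sharing: bool  # assumed flag: participants agreed to sharing
    access_restricted: bool             # assumed flag: other restrictions apply


def partition_by_shareability(records):
    """Split datasets into those that could be deposited and those held back."""
    shareable, held_back = [], []
    for record in records:
        if record.consent_for_research_sharing and not record.access_restricted:
            shareable.append(record)
        else:
            held_back.append(record)
    return shareable, held_back


# Illustrative records only; these are not real cohorts.
records = [
    DatasetRecord("cohort_a", consent_for_research_sharing=True, access_restricted=False),
    DatasetRecord("cohort_b", consent_for_research_sharing=False, access_restricted=False),
    DatasetRecord("cohort_c", consent_for_research_sharing=True, access_restricted=True),
]
shareable, held_back = partition_by_shareability(records)
print("Could be deposited:", [r.name for r in shareable])
print("Held back:", [r.name for r in held_back])
```

In practice consent is rarely a single yes-or-no flag, so a real repository would likely need a richer model than this.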

Dr. Aziz also highlights the importance of interoperability, meaning consistency across establishments during data acquisition, which is vital for accurate usage and downstream analysis of data. He further mentions that quality metrics are critical for accurate and precise genomic data. These metrics allow for confident comparison between different subjects, which is essential for validating genomic information. Lastly, Dr. Aziz stresses the challenge of achieving equity within a data repository, where greater inclusivity will expand the potential applications of the data in future research and clinical practice.

Dr. Aziz indicates that although the Pan-Canadian Genome Library project has been recently established, “it will take around four years to be completed”. This project will have multiple applications not only in Canada but internationally as well. He highlights that generating equitable genomic data will contribute to the field of precision medicine. With the decline in the cost of DNA sequencing, the success of precision medicine has been observed in fruitful medical applications, approvals of several targeted therapies and the inclusion of pharmacogenetics in healthcare settings9. However, one of the constant challenges in the research applications of current databases is the lack of diverse subjects8. While the Pan-Canadian Genome Library will be able to host large-scale genomic data, investments in actually generating such datasets will be critical. It is important that any large-scale datasets generated pay close attention to the population diversity captured within the dataset. Dr. Aziz illustrates the equity gap with the example of cardiovascular disease studies, where most of the research has focused on males, leaving women out when it comes to accurate diagnosis and treatment. When asked why precision medicine matters, Dr. Aziz says, “The ultimate goal is to make the life of every human being better”.

Dr. Aziz’s initiatives shed light on the importance of access to equitable data and the challenges this demands. The HostSeq and the Pan-Canadian Genome Library projects offer a foundation for future personalized applications in the healthcare field. Nonetheless, he clarifies that genomic technologies and data acquisition play a vital role not only in medicine but also in other fields, such as agriculture, the environment, and the bioeconomy. The significance of large-scale genomic data and its precise utilization will continue to expand to various sectors. Therefore, by fostering collaboration, innovation, and discovery, these initiatives will elevate Canadian genomics to new heights and pave the way for groundbreaking advancements in healthcare and beyond.

References

  1. Fatumo, S. et al. A roadmap to increase diversity in genomic studies. Nature Medicine 28, 243-250 (2022). https://doi.org/10.1038/s41591-021-01672-4
  2. Diversity in Genomic Research, https://www.genome.gov/about-genomics/fact-sheets/Diversity-in-Genomic-Research (2023).
  3. HostSeq, https://genomecanada.ca/challenge-areas/cancogen/hostseq/ (2024).
  4. Di Maria, E., Latini, A., Borgiani, P. & Novelli, G. Genetic variants of the human host influencing the coronavirus-associated phenotypes (SARS, MERS and COVID-19): rapid systematic review and field synopsis. Human Genomics 14, 1-19 (2020). https://doi.org/10.1186/s40246-020-00280-6
  5. Jana, P. K., Majumdar, A. & Dutta, S. Predicting Future Pandemics and Formulating Prevention Strategies: The Role of ChatGPT. Cureus 15 (2023). https://doi.org/10.7759/cureus.44825
  6. Canadian Institutes of Health Research. Government of Canada invests $15M in first-of-its-kind Pan-Canadian Genome Library – Canada.ca, https://www.canada.ca/en/institutes-health-research/news/2023/10/government-of-canada-invests-15m-in-first-of-its-kind-pan-canadian-genome-library.html (2023).
  7. Katz, N. Genome Canada invests in inclusive research and innovation, https://genomecanada.ca/genome-canada-celebrates-launch-of-the-pan-canadian-genome-library/ (2023).
  8. Aziz, N. A large-scale national data approach is key to unlocking the power of genomics in Canada, https://medium.com/@drnaveedaziz/a-large-scale-national-data-approach-is-key-to-unlocking-the-power-of-genomics-in-canada-bce7d5e051fc (2022).
  9. Hodson, R. Precision medicine. Nature 537, S49 (2016). https://doi.org/10.1038/537S49a

A Father’s Unyielding Quest: Terry Pirovolakis’ Journey Towards a Cure for SPG50 through Novel Gene Therapy

Terry Pirovolakis embarks on an inspiring journey to save his son through the creation of a novel gene therapy for Spastic Paraplegia Type 50. The journey continues as his newly founded company, Elpida Therapeutics, hopes to save the lives of many other children living with rare diseases.

Aastha Patel, Erin Hsue, and Liliana Trajceska

“If there’s something important to you and you’re willing to risk your life for … then you just can’t give up on it,” explains Terry Pirovolakis, a father ready to move mountains to cure his son, Michael, who was diagnosed with Spastic Paraplegia Type 50 (SPG50).

Terry Pirovolakis pictured with his son Michael. Michael is the only known child in Canada to be diagnosed with SPG50 and was the first to receive a novel gene therapy for the condition as the single patient in a clinical trial. Photo taken from CureSPG501.

A parent’s intuition – their instinct and gut feeling – is usually right. Michael’s parents felt something seemed off when Michael was not hitting all the milestones for his age. Soon after this observation, at just 15 months old, Michael underwent genetic testing, which revealed two disease-causing variants in his AP4M1 gene. Mutations in this gene are responsible for the development of the rare neurodegenerative disorder, SPG50, affecting only 80 children worldwide2,3. Normally, the AP4M1 gene encodes a protein subunit important in neuronal activity4. Thus, with a non-functional AP4M1 gene, a child will lack the protein necessary for adequate neuron performance, essentially leading to the symptoms associated with SPG504. Characteristic clinical manifestations of SPG50 include developmental delays, small head size, and seizures2. In SPG50-diagnosed children such as Michael, changes in muscle tone cause rigidity of leg muscles, consequently making it difficult to walk2. With no treatment currently available, the stiffening of a child’s muscles can spread toward the upper body and lead to paralysis in all limbs2.

Mr. Pirovolakis, however, was not willing to accept the devastating fate of SPG50 patients, nor the massive hurdle of funding that needs to be overcome for the efficient and life-changing development of therapeutics targeting rare diseases. Instead, he and his family took it upon themselves to create CureSPG50 – an organization founded in hopes of raising money to develop an SPG50 gene therapy1. Mr. Pirovolakis later became the founder and CEO of Elpida Therapeutics: a socially responsible corporation aiming to develop gene therapies as fast as possible, for as many children as possible5. Mr. Pirovolakis shows just how far the love and passion of a father can go as he shares his role in the roadmap toward developing a gene therapy for Michael and other children living with rare genetic conditions. Following a long fundraising journey, Michael received MELPIDA – the first SPG50 gene therapy – in March of 2022. The goal was to counteract the symptoms associated with SPG50, and to accomplish this, they utilized a viral vector – a recombinant adeno-associated viral vector type 9 (AAV9). AAV9, a non-pathogenic virus we all eventually get in our lives, was specifically selected for its ability to enter the central nervous system by safely and efficaciously bypassing the blood-brain barrier6. Inside this vector, a functional version of the AP4M1 gene, which is missing from Michael and other patients with SPG50, is packaged. Along with the AP4M1 gene, an expression cassette is included to ensure that the AAV9 knows which cell to target and how much of the gene to express. The MELPIDA gene therapy product is injected into the spine in the Trendelenburg downward position with an X-ray machine to monitor drug movement through the spinal column before and after administration (Figure 1)7. The patient is then rotated for the next hour and a half, promoting movement of the drug to the neuronal cells in the brain where it can release the gene.

Figure 1. Trendelenburg downward position. Patient is placed in a position with their legs elevated above the level of the heart at a downward 12-degree angle to allow for drug flow through the spinal column and into the brain. Figure taken from7.

Once the functional AP4M1 gene is delivered to the affected neuronal cells, the protein necessary for neuronal performance can be made by the cells, resolving the symptoms seen in SPG50. In theory, the goal would be to hit 100% of neuronal cells, but Mr. Pirovolakis notes that this gene therapy is very new, and “as much as we want it to hit 100% of the brain cells, we’re lucky to hit maybe 10%. But that 10% is enough in most diseases.” In the case of Michael and three other children, they hope it will be enough.

The clinical trial for Michael was developed and delivered in less than three years, and Mr. Pirovolakis credits the success and speed of the therapy to the many scientists and teams he worked with. He notes, “we were very fortunate to have this really good team of people that understood that it’s complicated, it’s difficult, but we have to try.” In particular, he highlights the efforts of Dr. Steven Gray and Dr. Xin Chen from the University of Texas Southwestern Medical School, who are experts in AAV-based gene therapy for diseases involving the nervous system. Prior to administering MELPIDA to Michael, Dr. Gray, Dr. Chen, and their team designed the AAV9 carrying a functional AP4M1 gene and investigated its safety and efficacy in mouse models to ensure that MELPIDA could be safely administered to children with SPG50 (Figure 2)6.

Figure 2. Evaluating the efficacy of MELPIDA in vivo in mice. Mice lacking the AP4M1 gene were intrathecally injected with doses of the AP4M1 gene in an AAV9. Mice receiving the gene therapy at a younger age and higher dose were observed to have the greatest therapeutic benefits. Through these preclinical results, a safe and efficacious dose of intrathecally-administered gene therapy was determined, supporting the next phase of the clinical trial on patients with SPG50. Figure taken from6.

With the success of the gene therapy Michael received, Mr. Pirovolakis knew there were other children with rare diseases that he could help. He leveraged the knowledge accumulated from his own journey collaborating with different partners in creating the gene therapy for Michael to kickstart Elpida Therapeutics. Elpida Therapeutics’ core mission is to help children with rare diseases obtain gene therapy treatment as soon as possible through its unique drug approval pipeline. They identified a niche in the biotech space, coined the “Great Abandonment”, in which many biotech companies have shelved or abandoned rare disease programs, even though many of these diseases are treatable8.

Elpida Therapeutics aims to secure approval for five gene therapy programs, with each program dosing 8-12 children, within the next 2-3 years5. To help streamline the process, the programs will all use the AAV9 technology that was used in MELPIDA5. Currently, MELPIDA is anticipated to begin additional clinical trials in July of 2024 once adequate funding is received. Elpida Therapeutics has also taken on other ultra-rare pediatric diseases, such as Charcot-Marie-Tooth Disease Type 4J and Neuronal Ceroid Lipofuscinosis 7 Disease. Incredibly, in less than one year since its establishment, the company has already made a global impact with studies conducted in Spain, Germany, Italy, and London. As such, Elpida’s innovative and inspiring approach is poised to have an even greater impact in the field of pediatric rare diseases.

The company mirrors the format of a not-for-profit organization, setting it apart from many other therapeutic companies. While maintaining the goal of helping as many children as possible and acknowledging that “there’s probably around 10,000 rare diseases, 95% [of which] have no treatment at all,” Mr. Pirovolakis emphasizes the need to think outside the box when it comes to developing and manufacturing treatments for such conditions. He remarks, “if we’re ever going to make a dent in that number, we have to think a bit differently about how we do things.”

Many large pharmaceutical companies involved in drug discovery, development, and approval follow a distinct pathway when it comes to clinical trials. Oftentimes, the approval of a new drug relies on three separate clinical trial phases after the pre-clinical research is complete. The first phase typically studies the safety and dosage of a drug, and if the drug is found to be safe, the trial can move into phase two where efficacy and side effects are evaluated9. Following this, phase three further analyzes the efficacy and possible adverse reactions of a drug9. There are two main issues with the classic, well-known drug approval pipeline: patients with rare diseases are few, and time is truly of the essence for young patients. Elpida Therapeutics, however, is using a “new, innovative way of thinking” which Mr. Pirovolakis describes as having a “rolling phase one through three.” Rather than doing a “phase one, then a [phase] two, then a [phase] three … [where] you’d make different [batches] of drugs along the way,” explains Mr. Pirovolakis, “we just make commercial [drug] batch right off the bat … as if we’re going to get it to children or patients.” With the rolling phase concept, “the goal now is to choose the right patient to show efficacy and then they continuously just roll into a [phase] three,” adds Mr. Pirovolakis. Taken together, a rolling phase clinical trial style for drug approval has the potential to solve the two major problems associated with treating rare diseases. Elpida Therapeutics’ avant-garde pipeline may shorten the often decade-long drug approval process to get rare disease patients treated quickly. Essentially, having patients roll into the next phase eliminates the need to wait for a new trial phase to begin with a new set of patients (Figure 3). Unfortunately, for these rare disease patients, there is no time to lose and every second counts.
The typical drug approval pipeline requires a lot of patience as there are long wait times between trial phases, “and then throughout those periods of time, people are waiting … years for their children to be treated with no hope.” But, by implementing their rolling phase pipeline for therapeutic treatment for SPG50 and other rare diseases, Mr. Pirovolakis says, “our goal is that we do trials that end in three to five years, and then during that period of time, we actually treat patients as well. So, we kind of don’t stop treating patients, but we try to get the drug approved [as well].”

Of course, forward-thinking comes with big risks. When asked what the biggest challenge was during the Pirovolakis family’s journey, Mr. Pirovolakis answered “funding”, with no hesitation. “Funding is the biggest challenge that all of us, all the parents that are trying to raise [money] … to treat children within their families, [face],” he continued. Research, drug development, and clinical trials each carry immense costs. Unfortunately, the lack of funding is a reality for many newly discovered gene therapies; this was certainly the case for MELPIDA, totalling over “4.5 million dollars of risk.” The lack of funding available for research in this area of genomics may halt the progression towards life-changing treatments. The burden of insufficient funding often falls on the parents of these children, adding to the large amount of stress already faced by these families. Although discovering treatments appears to be the most apparent obstacle in transitioning from bench to bedside, raising sufficient funds may actually be the biggest mountain these parents have to move.

Figure 3. Comparison of generic drug approval pipeline with Elpida Therapeutics drug approval pipeline. With the rolling phase clinical trial approach applied at Elpida Therapeutics, the wait time between phases, which can be three to five additional years, is eliminated. The overall process from drug discovery and research to final FDA approval should consequently have a faster turnaround time. Figure created in Biorender.com.
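The timing argument behind the rolling phase approach can be sketched as simple arithmetic. The only figure taken from the article is the three-to-five-year wait between phases mentioned in the Figure 3 caption; the phase durations below are hypothetical placeholders, and the sketch assumes a rolling design removes the waits entirely without shortening the phases themselves.

```python
def sequential_years(phase_years, wait_years):
    """Total time when each phase waits `wait_years` before the next one starts."""
    gaps = max(len(phase_years) - 1, 0)
    return sum(phase_years) + wait_years * gaps


def rolling_years(phase_years):
    """Total time when patients roll straight into the next phase (no waits)."""
    return sum(phase_years)


# Hypothetical phase durations in years; NOT taken from the article.
phases = [1.0, 2.0, 3.0]

for wait in (3, 5):  # wait range quoted in the Figure 3 caption
    saved = sequential_years(phases, wait) - rolling_years(phases)
    print(f"wait of {wait} years between phases: rolling design saves {saved} years")
```

With three phases there are two gaps, so under these assumptions the time saved is twice the per-gap wait, whatever the phase durations are.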

Open science stands as a key principle within scientific inquiry, but in the context of treatment for ultra-rare diseases – where information is already limited – it becomes even more essential. Elpida Therapeutics plans to publish all of its work where it can, so that others in a similar position can use these documents to help them along their journey. While the main focus is to get treatment for the children, Mr. Pirovolakis also identifies the importance of sharing the information they have. He notes that “it’ll help everybody, not just one disease, right? Because in the end, we’re not in competition with each other. There’s enough [companies] out there. We should be helping each other.”

After reflecting on the long and brave journey undertaken by the Pirovolakis family, we circle back to one of Mr. Pirovolakis’ main messages: “never give up”. In the words of an advocate, leader, and above all, an inspiring dad, some things are “important enough that you just don’t give up.” And when the thing you don’t give up on can help save hundreds of children, it’s certainly worth fighting for. Mr. Pirovolakis also emphasizes the significance of this current generation of students in advancing treatments for rare diseases. He states, “1 in 10 of us will be affected by rare diseases. Someone that you care about will be affected, and I think that we can get [the current generation of students] involved to really take it up and figure out different ways of [treatment].” Mr. Pirovolakis’ insight left us filled with inspiration and optimism, eagerly anticipating the continued evolution of treatments for SPG50 and other rare diseases. His story serves as a reminder of the collective determination and collaborative spirit driving the advancements in the field of medical genetics.

References

1. CureSPG50: Looking forward to a brighter tomorrow | Rare Disease | Gene Therapy. https://www.curespg50.org/ (2020).

2. Ebrahimi-Fakhari, D., Behne, R., Davies, A. K. & Hirst, J. AP-4-Associated Hereditary Spastic Paraplegia. in GeneReviews® (eds. Adam, M. P. et al.) (University of Washington, Seattle, Seattle (WA), 2018).

3. Oleksiw, B. Battling SPG50 and Changing the World. https://www.jax.org/news-and-insights/2022/August/battling-spg50-and-changing-the-world (2022).

4. Tüysüz, B. et al. Autosomal recessive spastic tetraplegia caused by AP4M1 and AP4B1 gene mutation: Expansion of the facial and neuroimaging features. Am. J. Med. Genet. A. 164, 1677–1685 (2014).

5. Rare Disease | Elpida Therapeutics. https://www.elpidatx.com/.

6. Chen, X. et al. Intrathecal AAV9/AP4M1 gene therapy for hereditary spastic paraplegia 50 shows safety and efficacy in preclinical studies. J. Clin. Invest. 133, (2023).

7. Wilcox, S. & Vandam, L. Alas, Poor Trendelenburg and His Position!: A Critique of Its Uses and Effectiveness. Anesth Analg (1988).

8. Yingling, N., Sena-Esteves, M. & Gray-Edwards, H. L. A Paradox of the Field’s Own Success: Unintended Challenges in Bringing Cutting-Edge Science from the Bench to the Market. Hum. Gene Ther. 35, 83–88 (2024).

9. Step 3: Clinical Research | FDA. https://www.fda.gov/patients/drug-development-process/step-3-clinical-research#Clinical_Research_Phase_Studiesn (2018).

Researchers discover that viruses encode RNA-based anti-CRISPRs

Pamela Alamilla

Short repetitive sequences in phage genomes encode small non-coding RNAs that competitively inhibit the formation of host CRISPR-Cas effector complexes

In our unending quest to solve human health issues, it is easy to overlook the importance of understanding the inner workings of our microscopic neighbours: bacteria, archaea, and viruses. Only when a finding is pertinent to human health and disease do we rediscover the utility of studying these microbes. CRISPR-Cas systems are an example of this – their ability to recognize and cleave specific DNA sequences reinvigorated many scientists’ interest in microbiology1. In 2019, bioinformaticians expanded on existing CRISPR-Cas research by scanning viral genomes for CRISPR arrays resembling those found in prokaryotic genomes2. They found short, CRISPR-like repetitive sequences not flanked by spacer sequences and speculated that these solitary repeat units (SRUs) might inhibit the CRISPR-Cas systems of bacterial and archaeal hosts2. Now, a new publication by Camara-Wilpert et al. reports that many SRUs encode small non-coding RNAs that competitively bind some of the Cas proteins required for the formation of functional CRISPR-Cas effector complexes3. This finding points to a new, exciting component of viral pathogenicity and introduces a potential new mechanism for fine-tuning CRISPR-Cas gene therapy.

CRISPR-Cas systems are the adaptive immune systems of bacteria and archaea1. These microbial genomes contain Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs), and they also encode a variety of CRISPR-associated (Cas) proteins1. The palindromic repeats in CRISPR arrays encode CRISPR RNAs (crRNAs) that guide Cas protein complexes to foreign genetic elements for fragmentation1 (Figure 1). Over time, viruses developed their own methods for evading or combating this system. In their newest publication, Camara-Wilpert et al. located an SRU of unknown function in an intragenic region of a Thiocystis violascens-infecting prophage3. As its sequence was similar to that of CRISPR repeats, the researchers hypothesized that the SRU might be involved in CRISPR–Cas inhibition3.

Figure 1 | CRISPR-Cas systems protect bacteria and archaea from previously-encountered viruses. A. During initial infection, host cells process the invading viral RNA to generate a spacer, which gets added to the host genome’s CRISPR array. B. When reinfection occurs, the spacer is transcribed into crRNA that guides a complex of Cas proteins to the viral RNA for cleavage. Created with BioRender.com.

The researchers first exogenously expressed the SRU and its surrounding region under the native prophage promoter in Pectobacterium atrosepticum. As a result, a small non-coding RNA was transcribed that gave a different phage a replicative advantage in the bacteria. They named this SRU ‘RNA anti-CRISPR IF1’ (RacrIF1). This finding caught the researchers’ eyes because RNAs had never been reported to function as anti-CRISPRs. In further experiments, they found that Cas6f was responsible for processing RacrIF1 into a more stable RNA transcript. Moreover, RacrIF1 was found to form a subcomplex with Cas6f and Cas7f, preventing these Cas proteins from participating in the formation of the Cascade protein complex. In other words, RacrIF1 was competing with phage-targeting crRNAs for the subunits required to form a functional CRISPR-Cas complex (Figure 2). Another important component of CRISPR-Cas immunity is the ability to acquire new spacers from viruses so they can be stored in the bacterial genome as reference sequences2. Camara-Wilpert et al. found that RacrIF1 inhibited this process, preventing host bacteria from adapting to the invading phage3.

Figure 2 | Racrs suppress CRISPR-Cas systems through competitive inhibition. A bacteriophage injects its genetic material containing an SRU into a bacterial host. The bacterial Cas6f protein processes the SRU into a stable RNA transcript, a Racr. This Racr attracts Cas6f and Cas7f to form an aberrant subcomplex, drawing these proteins away from the Cascade complex and thereby preventing its formation. Created with BioRender.com.

Next, the group used a specialized algorithm, SRUFinder, to detect putative SRUs in viral and plasmid databases. They identified over 2000 SRUs in viral genomes belonging to a wide range of viruses that infect bacteria and archaea, and 90 SRUs in plasmids. This result indicated that Racrs may be a widespread anti-defence mechanism for viruses. In support of this idea, the group identified at least seven other Racrs that allowed the ΦTE phage to suppress the CRISPR-Cas systems of P. atrosepticum, P. aeruginosa, and M. bovoculi. The researchers knew that genes encoding anti-CRISPR proteins (Acrs) often co-localize in the viral genome in anti-defence gene clusters. They wanted to investigate if Racrs might also form part of these clusters. Bioinformatic analysis confirmed that regions within 1 kb of Racrs were enriched with Acrs, which led them to hypothesize that Acrs and Racrs might function under the same operon in some viruses. Their experiments revealed that co-expression of an Acr and a Racr from the same promoter led to inhibition of a CRISPR-Cas system in P. aeruginosa, even when one of the two components was inactivated.
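The co-localization check described above, whether Acr genes fall within 1 kb of a Racr, can be sketched in a few lines. This is an illustration only and not the authors’ pipeline or SRUFinder itself: it assumes each element is reduced to a single coordinate on the same genome, and the function names and positions are hypothetical.

```python
def acrs_near_racr(racr_position, acr_positions, window=1000):
    """Return the Acr positions that lie within `window` bases of a Racr."""
    return [p for p in acr_positions if abs(p - racr_position) <= window]


def fraction_of_racrs_with_nearby_acr(racr_positions, acr_positions, window=1000):
    """Fraction of Racrs that have at least one Acr within the window."""
    if not racr_positions:
        return 0.0
    hits = sum(
        1 for racr in racr_positions
        if acrs_near_racr(racr, acr_positions, window)
    )
    return hits / len(racr_positions)


# Hypothetical coordinates (base pairs) on one viral genome.
racr_positions = [5_000, 20_000, 42_000]
acr_positions = [5_600, 30_000, 42_900]

fraction = fraction_of_racrs_with_nearby_acr(racr_positions, acr_positions)
print(f"Racrs with an Acr within 1 kb: {fraction:.2f}")
```

A claim of enrichment would additionally require comparing this fraction against a background expectation, such as the same statistic computed for randomly placed positions, which this sketch does not attempt.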

Microorganisms developed CRISPR-Cas systems – and CRISPR-fighting systems – millennia before human scientists discovered their utility4. It was a mere decade ago that Drs. Doudna and Charpentier published their method for hijacking the CRISPR-Cas9 system for use in gene editing, although viruses had already been manipulating these systems for millennia5. In their newest paper, Camara-Wilpert et al. provide evidence for the continued importance of investigating microbial genomes. Their discovery of Racrs immediately conjures ideas for their role in gene editing. Other groups have already demonstrated that introducing Acrs into off-target cells protects them from the potential off-target effects of gene editing6. Racrs could likely also be used as an ‘off-switch’ to counteract the activity of CRISPR-Cas systems. As synthetic gene editing systems continue to evolve, it is likely that a combination of Acrs and Racrs will be required to achieve precise calibrations of CRISPR-Cas-mediated gene circuits. Nonetheless, our understanding of Racrs is still in its infancy; more research is needed to investigate if they have other functions, why some of them lie outside of anti-defence gene clusters, and if bacteria or archaea have developed methods to suppress Racrs. SRUs are also still largely a mystery; if not all of them encode Racrs, then what do they encode, if anything? And why are they largely found in intragenic regions? This new publication from Camara-Wilpert et al. offers a fresh scientific gap to fill. The endless arms race between pathogen and host has again provided us with another powerful tool to combat human disease: Racrs.

References

  1. Xu, Y. & Li, Z. CRISPR-Cas systems: Overview, innovations and applications in human disease research and gene therapy. Comput Struct Biotechnol J 18, 2401-2415 (2020).
  2. Faure, G., Shmakov, S.A., Yan, W.X., et al. CRISPR–Cas in mobile genetic elements: counter-defence and beyond. Nat Rev Microbiol 17, 513-525 (2019).
  3. Camara-Wilpert, S., Mayo-Muñoz, D., Russel, J., et al. Bacteriophages suppress CRISPR–Cas immunity using RNA-based anti-CRISPRs. Nature 623, 601-607 (2023).
  4. Alonso-Lerma, B., Jabalera, Y., Samperio, S. et al. Evolution of CRISPR-associated endonucleases as inferred from resurrected proteins. Nat Microbiol 8, 77–90 (2023).
  5. Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., & Charpentier, E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012).
  6. Nakamura, M., Srinivasan, P., Chavez, M., et al. Anti-CRISPR-mediated control of gene editing and synthetic circuits in eukaryotic cells. Nat Commun 10, 194 (2019).