Why Should We Care4Rare?

The Care4Rare initiative has revolutionized the way we diagnose rare genetic disorders in Canada through providing access to genetic sequencing and other ‘-omics’ technologies. 

Kassandra Bisson, Radhika Mahajan, Paul McKay, and Hamid Farahmand

Dr. Kym Boycott pictured with Eli, one of the children who participated in the Care4Rare initiative and received a resulting diagnosis for his rare genetic condition. Photo courtesy of Melanie Tempel.

Imagine having a child who is sick and, after years of tireless diagnostic testing and countless specialist appointments, still having no conclusive diagnosis. That is often the dilemma facing a parent whose child suffers from a rare disease. In the United States, a disease is considered ‘rare’ if it affects fewer than 200,000 people.1 Collectively, however, rare disorders are common, affecting millions of people worldwide. These diseases are generally chronically debilitating and can even be life threatening.2 In Canada, over a million people suffer from one or more of the 7,000 rare genetic diseases (RDs), of which a third have an unknown underlying genetic cause.3 In search of answers, Dr. Kym Boycott has been changing the game of rare disease patient care and diagnosis. Within the Department of Genetics at the Children’s Hospital of Eastern Ontario (CHEO), Dr. Boycott has been a pioneer in improving patient care by understanding the molecular pathogenesis of rare diseases. In addition to her role as a Tier 1 Canada Research Chair in Rare Disease Precision Health, she is a renowned Clinical Geneticist and a Senior Scientist at the CHEO Research Institute.

When asked what sparked her career path, Dr. Boycott said that a lecture given by Dr. Patrick McLeod during her undergraduate degree ignited her interest in human genetics. Dr. Boycott stated, “When you look back at your life at my age, you will see those forks in the road and that was one of them.” Her experience working with both clinicians and researchers motivated her to pursue a PhD and MD, followed by FRCPC training in Medical Genetics at the University of Calgary. One of the most prominent turning points in her academic journey came in 2011 when she, alongside her colleagues, launched a national network called the Finding of Rare Disease Genes in Canada (FORGE Canada) project. This project primarily used next generation sequencing (NGS) technology to study rare diseases.4 Speaking of diagnosing rare diseases with NGS, she mentioned, “These [were] amongst the first exomes done for rare disease in Canada at scale.” To her surprise, Jeremy Schwartzentruber, a bioinformatics master’s student at the time, interpreted the genomic data and identified candidate genes for several of the syndromes on the first sequencing runs. She candidly stated, “It had taken me six years to find my first gene. And during this one afternoon in 2011, we’d found six genes for syndromes that had been without a known genetic cause for decades in 1 hour. […] This is going to be something really important for genetics.” With these major advances in NGS technology over the past decade, Dr. Boycott has led genomic sequencing initiatives worldwide, including FORGE and Care4Rare in Canada, in combination with various ‘-omics’ technologies to unlock the secrets behind rare diseases.

What Is Care4Rare?

One of Dr. Boycott’s greatest milestones is the Care4Rare project (Figure 1),5 which focuses on finding diagnoses for individuals with rare diseases that remain undiagnosed. Founded in 2011, Care4Rare is a pan-Canadian consortium consisting of clinicians, bioinformaticians, scientists, and researchers. The consortium is exploring ways to improve the care of patients with rare diseases in Canada and around the world. In addition to its headquarters at CHEO, Care4Rare has 21 academic sites across the country, and is recognized internationally as a pioneer in genomics and personalized medicine.

Figure 1: Care4Rare milestones by the numbers. The figure depicts the major outcomes of the Care4Rare project over the past decade. Figure adapted from.5

Care4Rare has two main goals: 1) access and 2) understanding. The first goal is to provide access to exome sequencing (ES) or genome sequencing (GS) for all eligible individuals with a suspected rare genetic disease in Canada. The second goal is to better understand how genetic variation contributes to disease. Over a 10-year period, Care4Rare has studied more than 5000 families. When asked about Care4Rare’s proudest accomplishment, Dr. Boycott cited “The fact that all of those 5000 families got the opportunity to access this sequencing technology before it was available in the clinic.” Over 50% of those families have already received answers from this research, while the remainder are still being investigated after inconclusive genomic sequencing results. Dr. Boycott expects that within the next few years, genomic sequencing will be incorporated early in the diagnostic care pathway for individuals with suspected rare genetic syndromes. Dr. Boycott explained further, “The more we can push it to the front of the diagnostic pathway, the better.” The early integration of genetic sequencing will likely shorten the diagnostic timeline and avoid other inconclusive testing and specialist referrals.

The type of sequencing most appropriate for clinical use is hotly debated. Dr. Boycott stated, “genome sequencing provides about a 5% increase in diagnostic yield over exome sequencing. [There is] not much ‘genome’ can find that an exome didn’t already find for you, especially if you’ve had a microarray done, but our understanding of the genome will improve over time.” She did acknowledge that genome sequencing plays a critical role in revealing mutational mechanisms and ‘hidden answers’ not accessible by exome sequencing alone. These revelations will push genomic understanding further and make the data produced by ES/GS much more medically actionable6.

Integrating The ‘-omics’ Technologies

Care4Rare – SOLVE, the third phase of the project, is currently focussing on optimizing the delivery of both clinical genome-wide sequencing and multi ‘-omics’ approaches6. This work proceeds alongside global data sharing and new bioinformatics approaches, facilitating delivery of innovative diagnostic care for rare diseases. Any individual still undiagnosed after ES, with no candidate variants identified, likely has a complex disease mechanism that will be challenging to detect. For example, a disease mechanism involving long range genomic interactions, or heterogeneity in the genetic makeup of the affected tissue, means that ‘deeper digging’ is often required to uncover a diagnosis6. For families who did not receive a clear diagnosis from initial ES, Care4Rare’s clinical laboratory teams will follow up by supplementing the genomic data with multi ‘-omics’ technologies (Figure 2)6,7,8. The integration of these newer ‘-omics’ technologies is a current focus of Care4Rare, with the hope that it can help ‘solve’ the underlying disease mechanism in individuals or families that were undiagnosed after clinical ES6. Dr. Boycott particularly emphasized the impact of long-read genome sequencing, transcriptomics, methylomics, metabolomics and lipidomics methodologies in rare disease diagnostics6. Because these technologies are relatively new, understanding them is a primary focus. Care4Rare subsequently hopes to develop a decision-making tool for determining which ‘-omics’ technologies to use next in the clinical diagnostic pathway based on the suspected disease mechanism. Combining these technologies generates valuable data, which increases the potential for clinical actionability6. From this increased understanding of genomic variation and disease, novel therapeutic targets can be identified, allowing the development of more precise treatment approaches tailored to an individual.

Figure 2: Integration of multi ‘-omics’ technologies in the Care4Rare bioinformatics pipeline. This multi-approach method allows for deeper understanding of the many layers of interacting biomolecules in rare diseases. Together the many ‘-omics’ pieces fit together to uncover the bigger picture of the underlying diagnosis. Figure adapted from.7,8

When asked about potential barriers to the current expansion of the Care4Rare initiative, Dr. Boycott said the only real challenge recently has been the impact of COVID-19 pandemic restrictions. In particular, the team’s ability to readily collect samples, and therefore recruitment, was reduced; however, this has been improving as restrictions are lifted. At CHEO, the aim is to set up a clinic for undiagnosed patients, supported by collaboration between clinical research staff, clinical geneticists, and genetic counselors. Since various sample types can be required for other ‘-omics’ technologies, the clinic’s mission is to provide a central location for families to undergo multi-sample collection. This clinic will thereby help shorten the research and testing process and ultimately further Care4Rare’s main goal of improving access to genetic testing.

The RareConnect Platform 

The RareConnect platform, initially set up by EURORDIS (Rare Diseases Europe), accompanies the research of Care4Rare.9 It offers a private, supportive, and safe social network platform in 13 languages for families affected by ultra-rare diseases who wish to connect, ask questions, and share their experiences and stories.9 The RareConnect platform is divided into disease-specific online discussion groups and communities based on topics pertaining to many disease areas.9 It also offers a community for those without a current diagnosis.9 Dr. Kym Boycott pointed out, “These tools have helped address the isolation that families often experience when they have a rare disease”. The CHEO initiatives led by Dr. Boycott have helped thousands of individuals reach a diagnosis for their rare genetic disease, often giving families immediately actionable therapeutic avenues once they finally receive their highly elusive diagnosis.

Future Prospects for Medical Genomics 

The Care4Rare initiative has been a pioneering project leading the way for integration of medical genomics into clinical practice. This project has demonstrated the usefulness of ES/GS alongside multiple ‘-omics’ technologies in diagnosing individuals and families with rare genetic diseases6. Identification of new disease-causing genes will help clinicians and researchers better understand what causes a rare disease and may inform approaches to development of subsequent therapeutics6. While there is currently limited knowledge regarding the epidemiology, diagnosis, and treatment of RDs, global efforts are ongoing to increase awareness, treatment options, and education. 

When asked why she thinks medical genomics research is so important, Dr. Boycott stated, “I think it’s so important because we don’t understand the medical genome – and this impacts patient care – its clinical utility will only increase with our increased understanding”. Dr. Boycott emphasized the importance of medical genomics research for rare disease and cancer management in the future. As genomics and other ‘-omics’ become more widely used, all the data produced will need to be interpreted. She also noted how interesting it will be to see how “ultimately patients’ treatment might change.” As the Care4Rare initiative has demonstrated, the advancement of genomic and other ‘-omics’ technologies greatly increases the need for individuals and researchers who are trained in medical genomics. Overall, Care4Rare serves as a fantastic model for other rare genetic disease research and will pave the way for novel research, therapeutic approaches, and diagnostic care.


1. Diseases | Genetic and Rare Diseases Information Center (GARD) – an NCATS Program. https://rarediseases.info.nih.gov/diseases (Accessed 2022).

2. Boycott, K. M., Vanstone, M. R., Bulman, D. E. & MacKenzie, A. E. Rare-disease genetics in the era of next-generation sequencing: discovery to translation. Nat. Rev. Genet. 14, 681–691 (2013).

3. Care4Rare Canada: Harnessing multi-omics to deliver innovative diagnostic care for rare genetic diseases in Canada (C4R-SOLVE) | Genome Canada. https://www.genomecanada.ca/en/care4rare-canada-harnessing-multi-omics-deliver-innovative-diagnostic-care-rare-genetic-diseases/ (Accessed 2022).

4. Beaulieu, C. L. et al. FORGE Canada Consortium: Outcomes of a 2-Year National Rare-Disease Gene-Discovery Project. Am. J. Hum. Genet. 94, 809–817 (2014).

5. CARE for RARE. CARE for RARE http://care4rare.ca/ (Accessed 2022).

6. Driver, H. G. et al. Genomics4RD: An integrated platform to share Canadian deep-phenotype and multiomic data for international rare disease gene discovery. Hum. Mutat. doi: 10.1002/humu.24354. Epub ahead of print. PMID: 35181971 (2022).

7. Computational Multi-Omics. Computational Multi-Omics https://comics.dcv.fct.unl.pt/ (Accessed 2022).

8. Labory, J. et al. Multi-Omics Approaches to Improve Mitochondrial Disease Diagnosis: Challenges, Advances, and Perspectives. Front. Mol. Biosci. 7, 590842 (2020).

9. RareConnect. https://www.rareconnect.org/en/ (Accessed 2022).

Clinical Genetics: Medicine, Genomics, and Education, in Action

Dr. Faghfoury, a prominent clinical geneticist at SickKids, shares her expertise on all things medical genomics, including her professional journey and the misconceptions and challenges surrounding genetic testing.

George Guirguis, Yasmeen Kurdi, and Anahita Bahreini-Esfahani

Dr. Hanna Faghfoury is a well-known clinical geneticist currently working at some of the most prominent healthcare facilities in Canada, including Mount Sinai Hospital, The Hospital for Sick Children (SickKids), and University Health Network. She obtained her MD degree from McGill University in 2004 and pursued her interest in medical genetics by completing her post-graduate studies in Medical Genetics, followed by an additional two years of training in Clinical Biochemical Genetics, both at the University of Toronto. She is currently the post-graduate director of the Medical Genetics and Genomics program at the University of Toronto, and also holds an associate professor position at the Temerty Faculty of Medicine. Photo courtesy of Dr. Faghfoury.

Imagine being in medical school after years of hard work and dedication only to find yourself not drawn to any of its disciplines. Most medical disciplines are categorized by organ group. Dr. Hanna Faghfoury found herself in exactly this situation: not drawn to any particular organ system, she was uncertain whether she would find a suitable specialty. This doubt changed to passion and excitement when she enrolled in a Medical Genetics elective. “After the first day, I called my parents, and I said I found what I want to do for the rest of my life.” What really stood out to Dr. Faghfoury was that a medical geneticist is not focused on a single organ system, yet is not considered a generalist. More importantly, medical geneticists are able to follow patients longitudinally, from birth and throughout the patient’s life. After being accepted into the medical genetics residency at the University of Toronto (UofT), she enrolled in an elective she had never heard of before in medicine: Metabolics. Her undergraduate degree in biochemistry was helpful, despite what Dr. Faghfoury described as the often dry and seemingly irrelevant delivery of biochemical pathways. In light of genetics, metabolic pathways made more sense, as they provided clearly actionable targets of intervention. This intensified Dr. Faghfoury’s passion for medical genetics and she pursued this specialty for her career. Today, Dr. Faghfoury is the post-graduate director of the Medical Genetics and Genomics program at UofT, where she also holds an associate professor position at the Temerty Faculty of Medicine.

The completion of the Human Genome Project in 2003 ushered in a new era of modern medicine and led to the advent of sophisticated technologies used to sequence DNA. These advances have since transformed the landscape of clinical diagnostics and management of genetic disorders. Contemporary medical genetics has become an expansive subspecialty of medicine, entailing the use of genetic principles such as inheritance and gene mapping in the diagnosis and management of disease. Previously, a geneticist’s expertise in recognizing dysmorphological features was a pivotal factor in identifying candidates for genetic testing1. Furthermore, genetic testing was widely inaccessible due to the slow turnaround times of lab results and the astronomically high cost of sequencing. Fast forward to 2011, when the FDA approved next generation sequencing for application in clinical diagnosis2; this marked a paradigm shift in clinical assessment. As genetic testing became cheaper and more accessible, geneticists increasingly integrated these sequencing technologies into their practice, slowly moving away from strictly assessing clinical presentation, or phenotyping, to identify or rule out disease. Dr. Faghfoury notes that as technological and financial barriers surrounding genetic testing decrease over time, the need for phenotyping, the skill clinical geneticists have traditionally been trained in, will decrease. She notes that this gradual shift poses somewhat of a professional identity crisis for clinical geneticists in terms of distinguishing the profession from that of a genetic counselor. That being said, medical geneticists have skills distinct from those of lab personnel and counselors because they are trained in patient management. One limitation that prevents geneticists from broadening the scope of their practice is the capacity and resource constraints of the current model of care. Addressing these limitations will require a systemic re-imagination of the role and scope of medical geneticists in the rapidly changing era of genomics. Despite these capacity and resource constraints, medical geneticists, like Dr. Faghfoury, maintain an invaluable role in patient care.

Dr. Faghfoury’s day-to-day work is dynamic and varied given her multitude of roles. However, a constant part of her work is patient education, where she addresses hesitancies and misconceptions surrounding genetic testing. In pre-test consultations with patients, she emphasizes that “there isn’t a one size fits all for genetic testing”, and that a myriad of tests can offer varied insights that together aid in clinical evaluation. A type of genetic test routinely used in genetic clinics, such as the Fred A. Litwin Family Centre in Genetic Medicine where Dr. Faghfoury works as a geneticist, is whole exome sequencing (WES). This technique made its way into clinical diagnostics around 2011, and applies next generation sequencing to determine variation in the coding regions of genes, also known as exons. About 85% of disease-causing mutations in Mendelian disorders (disorders caused by mutations in only one gene) are contained in exons3. One example of a disease where WES provides a high level of sensitivity and specificity to identify or rule out disease is Wilson’s disease, a genetic disorder that interferes with the body’s ability to remove excess copper. One important limitation of WES is that it examines only one percent of the human genome4. At times, this limitation may render WES ineffective at determining a genetic cause for a patient’s suspected disorder. This is because regulatory regions that modulate the expression of genes, essentially turning them on or off, exist outside of exons5. For example, in malformations of cortical development, many patients have no mutations in their genes, but rather in the regulatory regions surrounding them6. Similarly, intronic repeat expansions have been shown to cause brain disorders such as epilepsy7. The mutations present in these patients are often missed by WES. This is why Dr. Faghfoury tells her patients that a normal WES result is not a negative result; rather, it is inconclusive. “I don’t call a negative result negative, I say ‘it’s inconclusive’ because we just haven’t found the cause of [the] problem”. At the other extreme, many patients believe that genetic testing is the be-all-end-all, and that it will always provide answers. “The misconceptions either fall in the category of overvaluing genetic testing or undervaluing it”. Whole genome sequencing (WGS), on the other hand, captures virtually the entire genome, including regulatory regions. Because of this, WGS can provide a more conclusive result. The trade-off is that capturing so much more of the genome requires far more analysis. WGS is also more accurate than WES at detecting variants within the exome4. Regrettably, for most Ontario patients, WGS is not currently requestable by physicians. Instead, it is conducted on a random basis in lieu of WES.

Figure 1: Diagram depicting the whole exome sequencing pipeline. The left side of the figure displays an enrichment of DNA fragments to isolate for protein coding regions (exons). The exons then go through the process of Next-generation sequencing, which involves mapping reads to a reference genome to identify variants including deletions and single nucleotide polymorphisms. Processed reads are then filtered and annotated for associations with disease. (Retrieved from8)
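
The filtering and annotation steps at the end of this pipeline can be sketched in a few lines of Python. This is a toy illustration only, not a clinical pipeline: the variant records, the 1% population-frequency cutoff, and the one-entry gene-to-disease table are assumptions made up for the example (ATP7B is included because it is the gene associated with Wilson’s disease, mentioned above).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    gene: str
    region: str             # "exon", "intron", or "regulatory"
    population_freq: float  # fraction of the general population carrying the variant

# Hypothetical stand-in for a curated gene-disease database.
DISEASE_GENES = {"ATP7B": "Wilson's disease"}

def filter_exome_variants(variants, max_freq=0.01):
    """Keep rare exonic variants and annotate each with a known disease, if any."""
    hits = []
    for v in variants:
        if v.region != "exon":
            continue  # exome sequencing does not capture non-coding regions
        if v.population_freq > max_freq:
            continue  # common variants are unlikely to explain a rare disease
        hits.append((v, DISEASE_GENES.get(v.gene, "unknown significance")))
    return hits

# Made-up variant calls: one rare exonic, one common exonic, one rare intronic.
calls = [
    Variant("ATP7B", "exon", 0.0001),
    Variant("GENE_A", "exon", 0.20),
    Variant("GENE_B", "intron", 0.0001),
]

for variant, note in filter_exome_variants(calls):
    print(variant.gene, "->", note)
```

Note that the intronic variant is dropped before any annotation happens, which mirrors the limitation discussed above: a variant outside the exons never reaches the analyst when only the exome is sequenced.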

There are many challenges facing the field of clinical genetics, and limited resources are an especially pertinent one. Ideally, a clinical geneticist would diagnose a patient and continue to follow up with them long term. Unfortunately, Canada has only seven genetics residency programs, which graduate a handful of students each year, creating a high demand for geneticists with a low supply. Because there are not enough clinical geneticists to go around, patients are often followed by their family physician after diagnosis. This can pose issues, as clinical genetics is a rapidly evolving specialty and family physicians may not have the specific expertise to follow patients diagnosed with genetic disorders. This led clinical geneticists, who often find themselves disengaged from patient management, to coin the term ‘diagnose and adios’. This is an area of the current medical system that requires more advocacy and change. Not all patients diagnosed with a genetic disorder follow up with their family physician, however. For certain genetic disorders, there are clinics where clinical geneticists follow up with their patients, such as the GoodHope clinic (for Ehlers-Danlos syndrome) and the Genometabolic clinic, where Dr. Faghfoury practices. Unfortunately for many patients, this creates an equity problem. A patient whose genetic disorder has no dedicated clinic will not find clinical geneticists to follow up with, and must do so with their family physician. “Why is it their fault that their mutation happened to be in a gene that didn’t have a subspecialty clinic attached to it? It’s not fair.” This inequity between patients with different genetic disorders is a target for many genetics professionals, whose goal is to ensure that all patients get the best care possible.

The future of the field of clinical genetics looks promising. Recent developments such as whole genome sequencing and whole exome sequencing have drastically changed the landscape of managing genetic disorders. An exciting paradigm shift for clinical geneticists mentioned by Dr. Faghfoury is the move away from strict dependence on phenotyping for clinical identification, thanks to genetic testing. One example of this shift can be seen in the rapidly expanding field of pharmacogenomics, the study of how genes affect an individual’s response to drugs. Cytochrome P450 2D6 (CYP2D6) is an important gene involved in the metabolism of about 20% of commonly prescribed drugs9. Interestingly, CYP2D6 is highly variable across different populations, which can directly influence drug metabolism in individuals carrying such variants. To date, 72 different drugs have CYP2D6 clinical guidelines mentioned within their FDA-approved product labels9. Instead of the trial and error approach typically needed to assess drug efficacy in patients, genetic testing of CYP2D6 can identify individuals who may experience adverse reactions or reduced efficacy, so that therapeutic doses can be tailored accordingly9. While pharmacogenomics offers exciting potential for personalizing medicine, barriers remain to clinical implementation. Such barriers include the educational and equipment infrastructure needed to perform and interpret such tests. Moving forward, there will be a greater need for expertise to efficiently integrate genetic testing into commonplace clinical practice. As Dr. Faghfoury puts it, “right now we need all hands on deck” to effectively usher in this new and rapidly evolving era of healthcare.


  1. Tromans, E., Barwell, J. Clinical genetics: past, present and future. Eur J Hum Genet (2022). https://doi.org/10.1038/s41431-022-01041-w
  2. Efthymiou, S., Manole, A., & Houlden, H. Next-generation sequencing in neuromuscular diseases. Current opinion in neurology, 29(5), 527–536. (2016). https://doi.org/10.1097/WCO.0000000000000374
  3. Rabbani, B., Tekin, M. & Mahdieh, N. The promise of whole-exome sequencing in medical genetics. J Hum Genet 59, 5–15 (2014). https://doi.org/10.1038/jhg.2013.114
  4. Belkadi, A., et al. Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proc Natl Acad Sci U S A. 112(17), 5473–5478. (2015). https://doi.org/10.1073/pnas.1418631112
  5. Barrett, L. W., Fletcher, S., & Wilton, S. D. Regulation of eukaryotic gene expression by the untranslated gene regions and other non-coding elements. Cellular and molecular life sciences : CMLS, 69(21), 3613–3634. (2012). https://doi.org/10.1007/s00018-012-0990-9
  6. Perenthaler, E., Yousefi, S., Niggl, E., & Barakat, T. S. Beyond the Exome: The Non-coding Genome and Enhancers in Neurodevelopmental Disorders and Malformations of Cortical Development. Frontiers in cellular neuroscience, 13, 352. (2019). https://doi.org/10.3389/fncel.2019.00352
  7. Scheffer IE. The Key to FAME: Intronic Repeat Expansions Cause Human Epilepsies. Epilepsy Curr. 2018;18(4):238-239. doi:10.5698/1535-7597.18.4.238
  8. Goh, G., Choi, M. Application of Whole Exome Sequencing to identify Disease-Causing Variants in Inherited Human Disease. Genomics Inform. 10(4):214-219. (2014).
  9. Taylor C, Crosby I, Yip V, Maguire P, Pirmohamed M, Turner RM. A Review of the Important Role of CYP2D6 in Pharmacogenomics. Genes (Basel). 2020;11(11):1295. Published 2020 Oct 30. doi:10.3390/genes11111295

A Study in DNA: The Adventures of a Clinical Geneticist

Genetic disorders often present with a puzzling array of symptoms, making diagnosis challenging. Fortunately, clinical geneticists are on the case! Dr. Marjan Nezarati takes us through the process of providing her patients and their families with answers.

Meredith Laver & Alex Margaritescu

Dr. Marjan Nezarati, M.D., RCPSC Specialist. Photo courtesy of Marjan Nezarati.

The Clinical Genetics department at North York General Hospital (NYGH) sees a wide variety of cases that span a few main categories. Generally, a person is referred to clinical genetics if they are suspected of having a genetic disorder, either because of a family history or because they are presenting with symptoms. Children are often referred for developmental delay combined with one or more dysmorphic features. Prenatal cases are referred when a parent has an unusual screening test result, such as on an ultrasound, or a family history of genetic disorders. NYGH also runs a hereditary cancer clinic which sees individuals with a familial history of cancer. Dr. Nezarati laments lengthy wait times and explains that they “really don’t have the resources”, given the number of referrals they receive. After a referral is accepted, the patient is scheduled to see a clinical geneticist like Dr. Nezarati. She then gets a detailed picture of the patient’s family history and gives a preliminary overview of possible findings. Most cases require additional testing to elucidate physical symptoms, or to investigate genetic causes.

Figure 1 – Process of diagnosing genetic disorders in prenatal, child, adolescent or adult patients. All cases begin with a visit to a physician, who may write a referral to a genetics clinic if the findings suggest the possibility of a genetic disorder. At the clinic, a geneticist re-examines the patient’s physical symptoms and family history, and orders appropriate genetic testing. Image created in BioRender.

Dr. Nezarati has access to a toolkit of genetic tests to help identify the molecular causes of disease. Genetic tests look for the presence of potentially disease-causing changes in a patient’s DNA. Prenatal cases receive either non-invasive prenatal screening (NIPS) or invasive prenatal testing (IPT). Prenatal testing is time-sensitive as parents must make informed decisions and prepare for health challenges before a child is born. Although IPT is faster and provides more information, some parents opt for NIPS first because of the small risk of miscarriage associated with IPT4. One of two common IPT methods, chorionic villus sampling (CVS) or amniocentesis, is used to acquire a sample of fetal DNA. CVS harvests a small tissue sample of the chorion which is a membrane enveloping the fetus, and amniocentesis harvests the amniotic fluid which surrounds the fetus4. The DNA derived from these samples can then be tested for common disease-causing mutations and chromosomal abnormalities.

In contrast, children and adults who present with suspected genetic syndromes usually receive microarray testing of blood samples. Microarrays detect duplications or deletions of specific genomic regions. If a particular condition is suspected, the microarray ordered tests the sites where duplications or deletions are known to cause that condition. Since microarrays have become fairly common tests, geneticists are now encouraging family physicians and specialists to order them independently, instead of submitting a genetics referral.

If microarray testing doesn’t reveal a diagnosis, and a genetic syndrome is still suspected, Dr. Nezarati will often order either a gene sequencing panel or whole exome sequencing (WES). Sequencing identifies the DNA sequence of a portion of the genome. Gene panels involve sequencing only the genes which are commonly associated with a specific disorder or symptom, and are typically used to confirm a clinical diagnosis. WES looks at the entire exome, which is the portion of the genome that contains instructions to make cellular products such as proteins. Although the exome makes up only 1% of the genome, approximately 85% of disease-causing mutations are located in these areas5. It can be much more cost effective to sequence the entire exome than to run multiple gene panels if the first is inconclusive, making WES a good diagnostic test for patients whose clinical diagnosis remains elusive5.

In some cases, the usual genetic tests fail to identify a causative mutation, leaving patients and families without answers. Geneticists can bridge the gap between emerging research and clinical practice by submitting these especially puzzling cases to research studies. This practice helps to provide patients with a diagnosis, and uncover new molecular signatures of disease. Dr. Nezarati is the primary investigator at NYGH for two research studies which use expanded testing methods to investigate undiagnosed cases: Care4Rare–SOLVE and EpiSign.
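
The stepwise pathway described above for children and adults can be summarized as a small lookup. This is a simplified sketch for illustration only: the step names and the `next_step` helper are assumptions introduced here, and real test selection depends on clinical judgment rather than a fixed order.

```python
# Simplified ordering of the tests described above for children and adults.
# Prenatal cases follow a different route (NIPS or IPT) and are not modelled here.
PATHWAY = [
    "microarray",
    "gene panel or whole exome sequencing",
    "referral to a research study",
]

def next_step(completed_without_diagnosis):
    """Return the first step not yet tried, or None if the pathway is exhausted."""
    tried = set(completed_without_diagnosis)
    for step in PATHWAY:
        if step not in tried:
            return step
    return None

print(next_step([]))              # first-line test
print(next_step(["microarray"]))  # what follows an uninformative microarray
```

Keeping the order in a single list makes the escalation explicit: each step is attempted only after the previous one has failed to produce a diagnosis.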

Care4Rare is a consortium that was founded in 2011 to unite researchers and clinicians across Canada in providing care for individuals with rare diseases6. The current iteration of the project is called Care4Rare–SOLVE and is focused on identifying the molecular causes of rare genetic conditions6. Clinical researchers like Dr. Nezarati collect and share data to help expedite patient diagnosis and the classification of new disorders. Patients enrolled in Care4Rare receive access to whole exome and genome sequencing, as well as expanded testing methods which include RNA sequencing6. Dr. Nezarati signed a young girl up for an early form of Care4Rare after a battery of standard tests failed to produce a diagnosis. They entered the patient’s phenotype and genotype data into a knowledge sharing database called Matchmaker Exchange and suddenly the pieces began falling into place. There was “someone from Australia and another person from the US, and they [had] patients with mutations in the same gene.” Researchers and clinicians around the world were able to work together to formally classify a new rare genetic disorder and begin to build a knowledge base7. Around half of the individuals enrolled in Care4Rare have received a diagnosis for their rare disease6. A formal diagnosis can help patients and families to seek appropriate healthcare, inform family planning decisions, and allow them to connect with others through shared experiences. 

Even advanced DNA testing methods can sometimes fail to produce a diagnosis. In these cases, patients can be enrolled in EpiSign for epigenetic analysis. Genetic and environmental differences create changes in the way that DNA regions are packaged and read. Epigenetics is a branch of genetics that looks at how these differences impact gene expression. Certain genetic disorders such as Fragile X, Prader-Willi, or Kabuki Syndromes are associated with recognizable epigenetic signatures8. EpiSign analyzes a patient’s epigenetic pattern in order to identify these signatures and connect them to a diagnosis8.

So how does a patient qualify for submission to a research study? “Really, it’s when we are highly suspicious…that it’s a syndromic diagnosis that we’re not catching by routine testing. And sometimes it’s individuals who have a clinical diagnosis”, Dr. Nezarati explains. “So I’m looking at this person and I think they have Kabuki syndrome, let’s just say, and we do the [sequencing] panel of Kabuki genes and we don’t find a hit. Then that would be a case where you could say, well, let’s submit this to Care4Rare–SOLVE or even to EpiSign to see if the epigenetic signature matches the epigenetic signature for Kabuki syndrome.” The interest and consent of the family is also paramount – “if they don’t want to do it, that’s the end of the discussion.”

In some cases, clinical geneticists are able to collaborate with researchers around the world to help assess the impact of new mutations. “I find most of the time when I’ve reached out to people internationally, even big names… I hear back from them”, Dr. Nezarati recounts. “Geneticists are generally… very, very generous with their time.” One couple who had lost multiple pregnancies was looking for an answer. Often in these cases, recurrent mutations in the fetus are responsible. Genetic testing identified mutations in the fetus and parents in a gene which had not been formally recognized as disease-causing. Dr. Nezarati reached out to a group researching the gene to help solve the case. The researchers recreated the mutations in yeast and found that this particular combination of mutations completely disabled the gene. Fortunately, the couple was able to receive prenatal testing for these mutations in future pregnancies.

Nevertheless, a clinical geneticist’s job isn’t all thrilling detective work and happy endings. Even if a diagnosis can be found, many genetic disorders lack therapy options which address the root cause; patients rely on treatments to manage each individual symptom. Families may also face hurdles from the medical system; Dr. Nezarati describes how one child’s mother “had to really fight to get a referral.” Nonetheless, Dr. Nezarati finds that many patients and families take comfort in understanding their situation, and in feeling understood. “Sometimes I really find I’m sort of just a listener. Sometimes I make very little difference and it’s just the willingness, and having the time to sit and listen to someone. That may be all I can do for them, but sometimes that’s helpful.”


1. Baird, P. A., Anderson, T. W., Newcombe, H. B. & Lowry, R. B. Genetic disorders in children and young adults: a population study. Am J Hum Genet 42, 677–693 (1988).

2. Basel, D. Dysmorphology in a Genomic Era. Clin Perinatol 47, 15–23 (2020).

3. About CORD | Canadian Organization for Rare Disorders. https://www.raredisorders.ca/about-cord/.

4. Beta, J., Zhang, W., Geris, S., Kostiv, V. & Akolekar, R. Procedure-related risk of miscarriage following chorionic villus sampling and amniocentesis. Ultrasound in Obstetrics & Gynecology 54, 452–457 (2019).

5. Choi, M. et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci U S A 106, 19096–19101 (2009).

6. Osmond, M. et al. Outcome of over 1500 matches through the Matchmaker Exchange for rare disease gene discovery: The 2-year experience of Care4Rare Canada. Genetics in Medicine 24, 100–108 (2022).

7. White, S. M. et al. A DNA repair disorder caused by de novo monoallelic DDB1 variants is associated with a neurodevelopmental syndrome. Am J Hum Genet 108, 749–756 (2021).

8. Sadikovic, B. et al. Clinical epigenomics: genome-wide DNA methylation analysis for the diagnosis of Mendelian disorders. Genet Med 23, 1065–1074 (2021).

Slipping into the DNA architecture of tandem repeat expansion disorders

Understanding the mechanism of repeat expansion has allowed Dr. Christopher E. Pearson and colleagues to target unique disease-associated mutagenic DNA structures as a potential therapeutic avenue.

Elvira Mukharryamova, Sornnujah Kathirgamanathan, and Tanvi Anadampillai

Dr. Christopher E. Pearson is a Canada Research Chair in Disease-associated Genome Instability, a Senior Scientist at The Hospital for Sick Children in Toronto, and a Full Professor with the Department of Molecular Genetics at the University of Toronto. Photo from The Hospital for Sick Children.

            The progressive neurodegeneration (loss of brain cells) in individuals with Huntington Disease (HD) highlights the limits of modern medicine in relation to prognosis and cure. Because the condition worsens over time, individuals with HD become entirely reliant on others for their daily living. The characteristic neurodegeneration in HD is due to a curious type of DNA mutation, called a tandem repeat expansion, in the protein-coding gene HTT, which is involved in brain development. These repeat expansions consist of nucleotide sequence units, such as CAG in the case of HD, that occur in tandem (‘CAG CAG CAG…’). For example, healthy individuals carry repeat tract lengths of 5-35 ‘CAG’ units in the HTT gene. Individuals with 36-39 copies are at an increased risk for HD, while those with 40 or more copies will develop HD earlier in life1 (Fig. 1A). Importantly, Pearson says, “As patients age, the mutation continues in their brains and their disease worsens. For example, ‘THE CAT ATE THE FAT FAT RAT’ mutates to ‘THE CAT ATE THE FAT FAT FAT RAT,’ which eventually mutates to ‘THE CAT ATE THE FAT FAT FAT FAT FAT RAT,’ and so on.” The number of tandem repeats in functionally relevant genes – also referred to as repeat length – is negatively correlated with symptom age-of-onset and positively correlated with disease progression and severity (Fig. 1B). Generally, longer repeat lengths lead to an earlier age-of-onset with a more severe disease phenotype2. “Essentially, for therapy we would like to put that RAT on a diet, which should delay onset and slow progression”, says Pearson. Tandem repeat expansions also cause 69 other serious disorders3.
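To make the repeat-length ranges above concrete, here is a minimal Python sketch that counts the longest uninterrupted run of CAG units in a DNA string and sorts it into the three ranges described in the text. The function names and category labels are hypothetical, chosen only for illustration. The sketch also assumes that 35 units belongs to the typical range and that the increased-risk range starts at 36. It is a teaching aid, not a clinical tool; real repeat sizing relies on dedicated laboratory assays.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the number of units in the longest uninterrupted CAG tract."""
    # Each match is one back-to-back stretch of 'CAG' units.
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_htt_repeat(units: int) -> str:
    """Map a CAG unit count onto the ranges quoted in the article.

    Assumption: 35 units counts as typical and 36-39 as increased risk.
    """
    if units >= 40:
        return "HD expected"
    if units >= 36:
        return "increased risk"
    return "typical range"


if __name__ == "__main__":
    # Hypothetical toy sequence: 42 CAG units flanked by unrelated bases.
    toy_sequence = "AT" + "CAG" * 42 + "CC"
    units = longest_cag_run(toy_sequence)
    print(units, classify_htt_repeat(units))
```

Counting only the longest uninterrupted run is a simplification; it ignores interrupted tracts and the cell-to-cell variation in repeat length that Dr. Pearson describes.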

Figure 1: Representation of tandem repeat expansion. A) The CAG repeat tract lengthens with each subsequent expansion event. B) Longer repeats hasten disease onset and accelerate disease progression. Figure created with BioRender.

            When it comes to elucidating the underlying mechanisms of disease-associated repeat expansions, it is difficult to find someone with a higher level of expertise than Dr. Christopher E. Pearson – a Canada Research Chair in Disease-Associated Genome Instability, a Senior Scientist at The Hospital for Sick Children, and a Full Professor at the University of Toronto. In a career that spans nearly three decades, Dr. Pearson has authored 97 publications largely focusing on tandem repeat DNA sequences and the mechanism of disease-causing repeat expansion. Looking back on his decision to pursue what he calls “dynamic mutations” back in 1993, he considers himself fortunate to have discovered something that has captured his curiosity and become increasingly relevant over all these years.

The inspiring work of Dr. Pearson and his team has contributed greatly to our current understanding of repeat expansions. His recent publications, featured here, have catapulted the field closer to developing a treatment that can potentially reverse repeat-associated neurodegenerative diseases.

Repeat expansions as a driver of disease

            In molecular genetics, the adage “you can’t harvest what you haven’t planted” holds true. One cannot design a treatment for a complex genetic disorder without first understanding the molecular mechanisms of its pathogenicity4. In HD, the root cause of disease is the inheritance and ongoing expansion of tandem repeats, where the repeats expand throughout an individual’s life, causing symptoms to worsen2. Although the exact mechanism of expansion has remained elusive, several factors involved in repeat instability have been established. They include repeat length, slipped-DNA structures, and the influence of DNA repair proteins5.

A distinguishing feature of disease genes with expanded repeats is the presence of unusual slipped-DNA structures. Slipped-DNAs form at expanded repeats when unwound DNA attempts to reanneal but does so incorrectly, “much like a mis-aligned zipper”, says Pearson (Fig. 2). Slipped-DNAs occur only if the gene contains a threshold number of tandem repeat units, and a greater number of repeats enhances slip-out formation. Slipped-DNAs are critical because they act as mutagenic intermediates of instability by attracting DNA repair proteins, which ultimately drive further repeat expansion, which in turn enhances slipped-DNA formation, leading to a compounding cycle of expansion mutations. These DNA repair proteins introduce additional repeats through their error-prone attempts to repair the slipped-DNAs. In this manner, rather than protecting against mutation, the repair proteins drive it (Fig. 2)6.

Figure 2: Overview of repeat expansion mechanism. Unwound DNA (such as that found during transcription) may re-anneal out-of-register in highly repetitive regions. Mispairing between repeats results in the formation of slip-out DNA structures. DNA repair proteins attempt to resolve these slipped-DNAs, but instead induce further repeat expansions. Figure created with BioRender.

            Dr. Pearson remembers identifying slipped-DNAs by accident during his time as a post-doctoral fellow. He recalls thinking at that moment that these unusual structures must be important and might even be the key to novel therapeutics. Lo and behold, Dr. Pearson’s suspicions turned out to be right.

Overview of mutation-centric therapeutic targets

            Multiple therapeutic approaches can target various downstream pathogenic aspects of HD, such as lowering the mutant repeat RNA transcript or mutant protein aggregates. Current approaches looking to treat repeat expansion disorders at the root cause, the DNA mutation, have targeted either the repeat sequences themselves or the DNA repair proteins involved in repeat expansions4. However, a significant limitation of these approaches is that they lack the specificity required to treat only the disease-causing gene in affected cells, while avoiding the normal gene and other off-target effects. Dr. Pearson provides the example of potentially targeting MSH3 or FAN1, DNA repair proteins that drive or suppress CAG expansions6. A key feature of these proteins is their DNA structure-specificity, meaning they only recognize and process unusual structures like slipped-DNAs. MSH3 and FAN1 can modulate repeat stability by either promoting or inhibiting repeat expansion5,7. Additionally, certain variations in the MSH3 and FAN1 genes can alter the age-of-onset and progression of various repeat expansion disorders, including HD. Taken together, altering levels of MSH3 or FAN1 could therapeutically modulate expanded pathogenic repeats. However, due to the involvement of MSH3 and FAN1 in maintaining the integrity of the entire genome through DNA repair, targeting these proteins would certainly affect their actions elsewhere beyond the mutant CAG tract. One can expect that modulating the levels or activities of MSH3 and FAN1 will cause widespread DNA abnormalities, possibly resulting in cancer. This lack of specificity could be worrisome.

A novel molecule targets slip-out structures to reverse repeat expansion 

            Hoping to find an alternative therapeutic avenue that can address the challenge of specificity, Dr. Pearson and colleagues designed the small molecule DNA ligand Naphthyridine–Azaquinolone (NA). This molecule has a high degree of specificity to slip-out structures within expanded CAG repeats, effectively providing a means of differentiating between normal and pathogenic alleles, as well as the rest of the genome8. This feature of NA reduces its off-target effects and can be attributed to Dr. Pearson’s unique appreciation for the importance of structure-specificity: “Slipped-DNAs only form at the disease repeats that are long and unstable, this provides exact specificity of NA to only the disease gene.”

            Although the discovery of a molecule that could recognize and bind pathogenic CAG repeats was exciting, Dr. Pearson admits that the group had no prior knowledge of whether this molecule could prevent repeat expansions, let alone induce contractions. He adds “It was a blind experiment…stabilized repeats would be good, contractions would be even better, but enhanced expansions would be really bad”. Subsequent work by Dr. Pearson and colleagues demonstrated that in addition to its binding specificity, NA stabilized and shortened the expanded repeats in affected brain cells. “We were ecstatic that NA induced CAG contractions in the brain to less than what the HD mice inherited”, explains Dr. Pearson. NA is believed to obstruct the processing of slip-out structures by FAN1, thus inducing CAG contractions, but details of this obstruction remain to be elucidated. 

            Dr. Pearson explains that in addition to having spectacular specificity, NA induces contractions in the majority of treated brain cells in HD mice. This is an astounding feat considering that NA must cross both cellular and nuclear membranes to reach its target DNA. Moreover, Dr. Pearson and colleagues observed an improvement in motor coordination of these mice after only four weeks of treatment with NA9. Assuming the effects in mouse models can be translated into humans, the effectiveness of NA in treating repeat expansion disorders is extremely promising. Given the complexity and progressive degeneration of these conditions, NA’s rapid and effective onset of action makes the molecule an attractive treatment option for individuals with HD. While direct delivery to the central nervous system is an option, it is not yet known whether NA can cross the blood-brain barrier, an ability that would facilitate delivery. Further studies are needed to enhance delivery and characterize this molecule’s tissue distribution and safety profile.

The Future of HD Therapeutics: Just Keep Fishing

            According to Dr. Pearson, the first-of-its-kind approach of targeting slip-out-structures with NA has advanced the field of HD therapeutic development. However, as this approach is still in its infancy, whether NA will survive the “valley of death” – a term used to describe the hurdles of drug development – is still unknown. Dr. Pearson intends to continue improving the druggability and safety profile of NA up until its translation to the bedside: “We will do what we can to improve delivery and safety – we’re working on that now.”

            Dr. Pearson’s team is also investigating other potential therapeutic avenues centered upon targeting expansions – or in his terms, “fishing in multiple waters”. These approaches include identifying new DNA repair proteins involved in expansions and screening for inhibitors/modifiers of MSH3, FAN1 or other DNA repair proteins. Dr. Pearson emphasizes that “Fishing in multiple waters increases the likelihood that one of these approaches will cross the long, wide and deep valley of death” and go on to become an approved treatment for HD. Were more than one approach to succeed, combinatorial therapeutic regimens could be developed to further enhance patient outcomes. Despite current excitement and hope, Dr. Pearson acknowledges that crossing this valley is a long and challenging journey and credits the young, bright, and intelligent students and fellows in his lab for taking up the challenge.

The applicability of NA in treating repeat expansion disorders

            Might the discovery of NA be applied to other repeat expansion disorders? That NA targets CAG slip-outs suggests it could act on the other 15 CAG-expansion disorders, including spinocerebellar ataxias and dentatorubral-pallidoluysian atrophy (DRPLA). Dr. Pearson and his team recently revealed that NA contracted CAG repeats and improved motor coordination in a mouse model of DRPLA8, supporting the broader applicability of this approach.

                  Looking to the future, Dr. Pearson is expanding his focus to other repeat expansion disorders, such as amyotrophic lateral sclerosis, frontotemporal dementia, and schizophrenia. Dr. Pearson claims, “the likelihood that other repeat sequences causing other diseases are forming unusual mutagenic structures is extremely high, which is why we are searching for ligands to those”. As the field of repeat expansion disorders continues to advance, Dr. Pearson is ready to face new questions that will arise for him and his team to address. 

Hoping to motivate young minds, Dr. Pearson concludes our interview by thoughtfully reminding us of the importance of pursuing interests, not career paths: “follow your nose, follow what excites your curiosity”. 


1.        Lu, X. H. & Yang, X. W. ‘Huntingtin Holiday’: Progress toward an Antisense Therapy for Huntington’s Disease. Neuron 74, 964–966 (2012).

2.        Flower, M. D. & Tabrizi, S. J. A small molecule kicks repeat expansion into reverse. Nat. Genet. 52, 136–137 (2020).

3.        Gall-Duncan, T., Sato, N., Yuen, R. K. C. & Pearson, C. E. Advancing genomic technologies and clinical awareness accelerates discovery of disease-associated tandem repeat sequences. Genome Res. 32, 1–27 (2022).

4.        Malik, I., Kelley, C. P., Wang, E. T. & Todd, P. K. Molecular mechanisms underlying nucleotide repeat expansion disorders. Nat. Rev. Mol. Cell Biol. 22, 589–607 (2021).

5.        Deshmukh, A. L. et al. FAN1, a DNA Repair Nuclease, as a Modifier of Repeat Expansion Disorders. J. Huntingtons. Dis. 10, 95–122 (2021).

6.        Deshmukh, A. L. et al. FAN1 exo- not endo-nuclease pausing on disease-associated slipped-DNA repeats: A mechanism of repeat instability. Cell Rep. 37, 110078 (2021).

7.        Porro, A. et al. FAN1-MLH1 interaction affects repair of DNA interstrand cross-links and slipped-CAG/CTG repeats. Sci. Adv. 7, 1–13 (2021).

8.        Hasuike, Y. et al. CAG repeat-binding small molecule improves motor coordination impairment in a mouse model of Dentatorubral–pallidoluysian atrophy. Neurobiol. Dis. 163, 105604 (2022).

9.        Nakamori, M. et al. A slipped-CAG DNA-binding small molecule induces trinucleotide-repeat contractions in vivo. Nat. Genet. 52, 146–159 (2020).

Neurodevelopmental Disorders: Where does my child fall on the spectrum?

University of Toronto researcher, Dr. Lucy Osborne, aims to discover novel genetic factors contributing to the wide spectrum of phenotypes observed in cognitive disorders. She strives to help families better predict the clinical implications of such complex conditions.

Neta Pipko and Celia Pennimpede

Dr. Lucy Osborne, PhD is the Canada Research Chair in Genetics of Neurodevelopmental Disorders. She is also a Professor in the Departments of Medicine and Molecular Genetics at the University of Toronto. Photo provided by Dr. Osborne, photographed by Mikaeel Valli.

Identifying the correct diagnosis for a child’s underlying behavioural and learning disabilities is challenging since the same symptoms can be caused by a number of disorders. Therefore, when the pieces of the puzzle finally begin to form a picture, parents start to experience an immense sense of relief when placing a concrete label on their child’s symptoms. A diagnosis, however, may often serve as a double-edged sword when it comes to neurodevelopmental disorders (NDDs).

NDDs are a group of complex conditions that affect brain development and growth, impairing several cognitive and behavioural features such as learning, self-discipline, language, and social communication1. Common NDDs include conditions such as intellectual disability, autism spectrum disorder (ASD), and attention-deficit/hyperactivity disorder (ADHD). Signs and symptoms appear early in childhood development and can fall within a wide spectrum, ranging from mild to severe phenotypes1. Pinpointing the correct diagnosis is particularly challenging since symptoms often overlap and co-occur amongst different NDDs. Consequently, therapeutic interventions should be tailored to the specific NDD and its characteristic features. Therefore, an official diagnosis can offer parents a tremendous wave of comfort and ease. However, the battle does not end here, since the large spectrum of phenotypes adds a layer of uncertainty to managing this diagnosis.

Toronto-based researcher, Dr. Lucy Osborne, hopes to help parents get some of the answers they are looking for. Dr. Osborne is a principal investigator and professor at the University of Toronto in the Departments of Medicine and Molecular Genetics. She also holds the title of Canada Research Chair in Genetics of Neurodevelopmental Disorders. Her work largely focuses on two rare NDDs, Williams-Beuren syndrome (WS) and 7q11.23 duplication syndrome (Dup7). WS and Dup7 are caused by the reciprocal deletion and duplication of the same ~25 genes on human chromosome 7, respectively (Fig 1)2. Deletions and duplications are structural genetic changes called copy number variants (CNVs) that lead to the loss and gain of genetic material3. Studying reciprocal CNVs of the same genetic segment offers Dr. Osborne a golden opportunity to evaluate how the copy number of a gene may impact neuronal development.

Figure 1. The two CNVs within the 7q11.23 region on human chromosome 7. Typically developing individuals have two copies of the 7q11.23 chromosomal region. Those with WS have a deletion of this region, whereas individuals with Dup7 have a duplication of this region. Figure generated using Biorender and adapted from the Osborne Lab4.

“A small set of genes can have such a huge impact on cognition and behaviour,” Dr. Osborne answers when asked what fascinates her about the two NDDs she studies. “It really changes how somebody appears and sees the world.”

WS and Dup7 are distinct disorders with overlapping and opposing phenotypes (Fig 2), likely attributed to the varying copy number of some of the genes in the 7q11.23 critical genetic region2. While unique in their own ways, both of these NDDs are associated with a wide spectrum of clinical manifestations. “A syndrome is not written in stone. You have a list of phenotypes spread across all the people you see, and very few have all of those symptoms, but the question is why,” says Dr. Osborne. She revealed that the greatest challenge is having no way of predicting the extent of a child’s disability despite reaching a final diagnosis. “Parents want to make some sort of plan or have some expectation about what that diagnosis is going to mean, but there is huge variation,” says Dr. Osborne. “We have no predictors right now and that is [a huge burden] for families.”

Figure 2. Common phenotypic features of WS and Dup7. The genetic nature of the 7q11.23 CNVs results in both overlapping and opposing behavioural and physiological features in patients with the two disorders. Figure adapted from Osborne & Mervis2.

Interestingly, two children with the same CNV and diagnosis may fall on opposite ends of a phenotypic spectrum. Dr. Osborne aims to unravel what might be contributing to this widespread continuum to find some predictors for families. In a recent collaborative study with SickKids genetic scientist Dr. Ryan Yuen, the two research groups investigated why some individuals with Dup7 have an additional ASD diagnosis. “Anecdotally, a lot of the [Dup7] kids coming into our study already had a diagnosis with autism, but most of them did not have autism,” Dr. Osborne explained. By virtue of their separation anxiety and shy nature, those kids got labeled and lumped in with other ASD children without going through a formal diagnostic test. However, proper assessment showed that most of the Dup7 kids had been misdiagnosed; only ~20% of them had an additional clinical ASD diagnosis5. Notably, 20% is still a striking increase compared with the general population’s ASD prevalence rate of ~1.5%6. Therefore, Dr. Osborne wondered, “could they (Dup7 kids with ASD) have a ‘second hit’ layered on top of this one CNV that pushes them over the edge that the others do not?”

Unlike monogenic diseases that are caused by a single gene, complex disorders have an array of different variants (mutations) and environmental factors contributing to the disease outcome. Thus, Drs. Osborne and Yuen hypothesized that Dup7 children diagnosed with ASD are likely to carry additional rare damaging variants in ASD-relevant genes. These additional variants are known as genetic modifiers, which suppress or enhance the phenotype of the primary disease-causing gene7. Typically, the more additional variants or ‘hits’, the more severe the phenotype8. This is known as the ‘multiple hit model’, which contributes to the wide variability and overlap in symptoms observed in individuals with NDDs2,8.

To test their hypothesis, they performed whole-genome sequencing (WGS) on twenty Dup7 individuals, half of whom had an ASD diagnosis. WGS is a tool that reads the entire DNA sequence of an individual, which they used to look for second hits across the genome that may be contributing to the ASD phenotype. Unfortunately, they did not identify any variants that could explain the ASD diagnosis9. Surprised by this analysis, Dr. Osborne states, “It wasn’t as simple as that. It wasn’t the CNV and one additional hit that will push you towards autism. It’s more complicated than that”. She explains that rather than one large second hit, there may be a collection of smaller hits with smaller impacts that ultimately add up.

This concept can be visualized as a cup with two thresholds, one for Dup7 filled about halfway, and one for the ASD phenotype bordering the top of the cup (Fig 3)2. Dr. Osborne explains that on their own, modifier genes with small effects are not enough to fill up the cup and push you over either threshold. However, for those with Dup7, their cups are already half full and have surpassed the first threshold. Therefore, Dr. Osborne presumes that unlike in typically developing children, these additional small modifiers may be the distinguishing factors that push the kids with Dup7 and ASD over the edge (Fig 3, Threshold B).

Figure 3. Model of genetic factors contributing to common and variable features of 7q11.23 CNV disorders. In typically developing individuals, the combination of genetic and environmental factors falls below thresholds A and B. However, individuals with the 7q11.23 copy number variants are predisposed to WS or Dup7, which on its own is enough to pass Threshold A. Other genetic and environmental factors may modify the phenotype observed if these contributors cumulatively surpass Threshold B. Figure adapted from Osborne & Mervis2.
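The cup analogy can be written down as a small additive model. The Python sketch below is only an illustration of that idea: the unit scale, both threshold values, the size of the CNV contribution, and the modifier effects are all invented numbers, not measurements from the study. Each individual's total is the CNV contribution plus the sum of small modifier effects, and the total is then compared with Thresholds A and B.

```python
# All numbers are hypothetical, in arbitrary "cup fill" units.
THRESHOLD_A = 50   # level at which the 7q11.23 CNV phenotype appears
THRESHOLD_B = 100  # level at which an additional phenotype (e.g. ASD) appears


def cup_level(cnv_contribution: int, modifier_effects: list[int]) -> int:
    """Add the CNV contribution to the sum of small modifier effects."""
    return cnv_contribution + sum(modifier_effects)


def describe(level: int) -> str:
    """Report which thresholds a given cup level has reached."""
    if level >= THRESHOLD_B:
        return "above Threshold B"
    if level >= THRESHOLD_A:
        return "above Threshold A only"
    return "below both thresholds"


if __name__ == "__main__":
    small_modifiers = [10, 12, 8, 15]  # many small hits, none decisive alone

    # Typically developing individual: modifiers only, no CNV contribution.
    print(describe(cup_level(0, small_modifiers)))
    # Individual with the CNV and few modifiers.
    print(describe(cup_level(55, [10])))
    # Individual with the CNV and the same modifiers as the first case.
    print(describe(cup_level(55, small_modifiers)))
```

The same set of small modifiers leaves the first individual below both thresholds but carries the third past Threshold B, which is the point of the analogy: the modifiers matter most when the cup is already half full.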

Even though they failed to identify a clear correlation between having a second hit and an ASD diagnosis, it does not mean these hits are not present. Dr. Osborne explains that the smaller hits are much more difficult to find. In fact, the effects of genetic modifiers are becoming more apparent in complex diseases, including NDDs, and will likely become a major focus of genomics research moving forward.

When asked whether the lack of association with ASD was discouraging, Dr. Osborne said, “No, not really. You ask questions and do not know what answer you will get”. In fact, the team discovered a phenotypic association when shifting their focus towards examining Dup7 as a whole, rather than splitting the children into groups based on ASD diagnosis. They found that some rare variants correlate with various clinical phenotypic measures, such as intellectual ability and adaptive behavior9. This finding could lead to the future development of polygenic risk scores for Dup7. Polygenic risk scores estimate an individual’s relative risk of developing a disease by calculating a weighted sum of the risk-associated genetic variants that individual carries10. This cumulative measure can hold predictive value in estimating the severity of such phenotypic features. In the case of Dup7, polygenic risk scores could estimate an individual’s level of cognition and aspects of behaviour. Ideally, this information could help inform families about whether their child will be shy, socially independent, communicative, and what their intellectual abilities may look like in the future. “Being able to place your child at one extreme or the other would be valuable,” Dr. Osborne explains. This study “gives us hope that there will be other measures that we will be able to find” to further increase the predictive value of these symptoms.
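The weighted sum behind a polygenic risk score is simple to sketch. In the Python example below, each variant has a weight (its estimated effect size) and each person has a dosage (0, 1, or 2 copies of the variant); the score is the sum of weight times dosage. The variant names, weights, and dosages are entirely hypothetical, and the sketch covers genetic variants only. No validated score of this kind exists for Dup7; this is an illustration of the arithmetic, not of the team's analysis.

```python
def polygenic_score(dosages: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted sum of variant dosages; every weighted variant must be genotyped."""
    missing = set(weights) - set(dosages)
    if missing:
        raise ValueError(f"no genotype for: {sorted(missing)}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


if __name__ == "__main__":
    # Hypothetical effect sizes; a negative weight stands for a protective variant.
    weights = {"variant_1": 0.30, "variant_2": -0.10, "variant_3": 0.25}
    # Hypothetical genotypes: copies of each variant carried (0, 1, or 2).
    child = {"variant_1": 2, "variant_2": 1, "variant_3": 0}
    print(round(polygenic_score(child, weights), 2))
```

A score like this is only meaningful relative to the scores of other people, which is why it is described as a relative risk estimate rather than a diagnosis.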

The degree of success attained in identifying such predictors sparked a similar study in children with WS. Dr. Osborne shared that they are in the process of examining whole genomes of ~250 WS children for potential correlations between rare variants and scores measuring cognitive abilities, patterns in social behavior, as well as cardiovascular outcomes. As with Dup7, Dr. Osborne hopes this research brings them one step closer to finding enough predictors to develop polygenic risk scores for WS as well. Identifying an individual’s relative risk can allow for the introduction of personalized therapeutic interventions early on in life, such as speech therapy or cardiovascular monitoring. While these scores hold some predictive power, they should always be taken with a grain of salt and should not be used for diagnosis.

Despite not finding the associations they were looking for with ASD, Drs. Osborne and Yuen did find associations with phenotypic measures that may explain some of the variation observed with NDDs. Shifting gears when studying complex disorders is often needed as many different genetic, environmental, and lifestyle factors contribute to the overall clinical manifestation of a disease. “Don’t be afraid to tackle something that is complex,” says Dr. Osborne. “You can still find answers for things even if you know it’s going to be complicated, and it really does take teamwork.” Dr. Osborne shares that while the research field was previously quite competitive, the scientific community is beginning to realize that there is large value in collaborating and integrating patient data. Examining syndromes from different angles will give you a more comprehensive insight into NDDs, ultimately granting families more certainty when planning and investing in their child’s future.


1.   Morris-Rosendahl, D. J. & Crocq, M.-A. Neurodevelopmental disorders—the history and future of a diagnostic concept. Dialogues Clin. Neurosci. 22, 65–72 (2020).

2.   Osborne, L. R. & Mervis, C. B. 7q11.23 deletion and duplication. Curr. Opin. Genet. Dev. 68, 41–48 (2021).

3.   Hastings, P., Lupski, J. R., Rosenberg, S. M. & Ira, G. Mechanisms of change in gene copy number. Nat. Rev. Genet. 10, 551–564 (2009).

4.   About Us | Osborne Lab. http://individual.utoronto.ca/osbornelab/.

5.   Klein-Tasman, B. P. & Mervis, C. B. Autism Spectrum Symptomatology Among Children with Duplication 7q11.23 Syndrome. J. Autism Dev. Disord. 48, 1982–1994 (2018).

6.   Lyall, K. et al. The Changing Epidemiology of Autism Spectrum Disorders. Annu. Rev. Public Health 38, 81–102 (2017).

7.   Rahit, K. M. T. H. & Tarailo-Graovac, M. Genetic Modifiers and Rare Mendelian Disease. Genes 11, 239 (2020).

8.   Guo, H. et al. Genome sequencing identifies multiple deleterious variants in autism patients with more severe phenotypes. Genet. Med. 21, 1611–1620 (2019).

9.   Qaiser, F. et al. Rare and low frequency genomic variants impacting neuronal functions modify the Dup7q11.23 phenotype. Orphanet J. Rare Dis. 16, 6 (2021).

10. Lewis, A. C. F. & Green, R. C. Polygenic risk scores in the clinic: new perspectives needed on familiar ethical issues. Genome Med. 13, 14 (2021).

Maternal Vitamin C Intake Regulates the Epigenome During Germline Development

The Santos Lab has played a pivotal role in understanding how maternal diet affects the development of embryos. Their findings reveal that maternal vitamin C deficiency dysregulates the epigenetic landscape in embryonic germ cells.

Yayra Gbotsyo, Saloni Modi, Anthea Travas

Dr. Miguel Ramalho-Santos is a Senior Investigator at the Lunenfeld-Tanenbaum Research Institute, specializing in mammalian development. He is also a Professor in the Molecular Genetics Department at the University of Toronto. (Image taken from the Santos Lab website).

Maternal diet is an essential factor that affects the health and development of offspring during pregnancy. Several lines of evidence demonstrate that poor maternal nutrition, such as lack of vitamin C (vitC) during pregnancy, leads to abnormal fetal development. Changes in the in utero environment due to external factors such as smoking and drinking pose long-term consequences for the developing embryo. Such consequences are shaped by a process known as epigenetics. Epigenetics is the study of DNA and histone methylation patterns that alter chromatin state and gene expression (Goldberg et al. 2007). At the forefront of developmental research is Dr. Miguel Ramalho-Santos, who uses cutting-edge technology to understand epigenetic regulation during gestation.

Dr. Ramalho-Santos explains that he was drawn to Toronto’s vibrant research community and its collaborative efforts in developmental stem cell biology. He received his Ph.D. at Harvard University in 2002, where he trained as a developmental biologist. He then became a Fellow at the University of California San Francisco (UCSF). In 2007, he became an Assistant Professor at UCSF and was promoted to Associate Professor in 2013. In 2018, Dr. Ramalho-Santos was recruited to become the Canada 150 Research Chair in Developmental Epigenetics. This initiative recruits exceptional scholars to enhance Canada’s reputation for research and innovation. Alongside this endeavour, Dr. Ramalho-Santos is a Senior Investigator at the Lunenfeld-Tanenbaum Research Institute and a Professor in the Department of Molecular Genetics at the University of Toronto. His lab uses mouse models to investigate how environmental inputs such as inadequate diet affect proper gene transcription. Currently, he and his team aim to understand how genes are activated at the right place and level during development.

Tet Enzymes Regulate DNA Methylation Patterns in Embryonic Stem Cells

Developmental epigenetics is the study of how environmental inputs influence gene expression during gestation1–3. Environmental factors such as nutrient availability during pregnancy can positively or negatively affect the way genes are expressed in the fetus1,2,4. According to Dr. Ramalho-Santos, the epigenetic landscape during development facilitates our understanding of how certain disruptions in adulthood can be traced back to insults in early uterine life3.

One of the most important takeaways from the Santos Lab was that “mammalian embryos are acutely aware of their mother’s environment.” In utero, the embryo is remarkably responsive to environmental agents that can alter its development2,5. This became evident when the team realised that the epigenetic regulating enzyme, ten-eleven translocation (Tet), is dependent on maternal nutrient availability during development5–7. Tet enzymes play a key role in demethylating cytosine nucleotides, thereby removing methyl modifiers and making DNA more accessible6,8. This process promotes the transcription of many genes and maintains pluripotency, the capacity to give rise to several different cell types. Tet enzymes function to demethylate DNA within the germ cells of the developing embryo8. Germ cells develop in the gonads of the growing embryo and will ultimately give rise to gametes9 (Figure 1). During embryonic development, DNA demethylation is important for keeping chromatin in an accessible state so that genes are actively expressed and cell differentiation is restricted3,7. Therefore, Tet-mediated demethylation is crucial in maintaining embryonic stem cell (ESC) pluripotency10. Previous studies show that the offspring of Tet1 knockout mice have significantly fewer germ cells, which leads to compromised fertility10. To further understand this process, Dr. Ramalho-Santos’ research investigates how environmental conditions, such as adequate access to vitC, modulate Tet activity.

Figure 1. Maternal Vitamin C (VitC) promotes germ-cell development by modulating Tet-mediated demethylation. A) VitC taken by maternal F0 activates Tet enzymes, which promotes DNA demethylation in the F1 germ cells. Chromatin remodelling into an open state allows active transcription of genes and keeps embryonic germ cells in a stem cell-like state. B) VitC-deficient embryos exhibit impaired Tet demethylation activity. As a result, DNA is kept in a non-permissive state. This dysregulation results in reduced embryonic germ cells in the F1 generation. (Figure is not quantitative.) Figure created in BioRender and adapted from3.

Maternal Vitamin C Deficiency Hinders Fecundity in Offspring

In 2013, Dr. Ramalho-Santos and his group discovered that Tet enzymes induced ESCs to stay pluripotent when cultured in media containing vitC5. VitC is a potential cofactor of Tet enzymes and helps mediate demethylation5. More recently, these findings were applied in pregnant mouse models to understand how maternal vitC intake regulates Tet demethylation and thus embryonic germline development3,11. Dr. Ramalho-Santos explains that vitC supplied by the diet can modulate Tet-mediated demethylation activity in the embryonic germline cells (Figure 1A)3,5,11. This keeps gene promoters accessible for transcription. While vitC-deficient F1 embryos are viable, there is a reduction in Tet demethylation activity in the F1 germ cells, which hinders their ability to give rise to the next generation (Figures 1B and 2)3,11. Interestingly, embryos deficient in vitC have transcriptomes and phenotypes that are remarkably similar to those of embryos with Tet1 deficiency3. These phenotypes include a reduced number of germ cells in the ovary, reduced fertility, and defects in meiosis3,10,11. In Tet1-deficient mice, defects in meiosis are proposed to be due to insufficient demethylation that fails to activate meiotic genes. These novel findings exemplify intergenerational epigenetic effects, highlighting that the F0 maternal environment has long-term impacts on the F1 generation.

Figure 2. Maternal Vitamin C deficiency causes defects in embryonic germ cells. VitC deficiency in F0 females leads to intergenerational effects, where the F1 embryo has reduced germ cell count and reduced fecundity in adulthood. Figure created in BioRender and adapted from4.

When asked whether the effects of vitC deficiency were reversible in mice, Dr. Ramalho-Santos explained that this is only possible if vitC is reintroduced before mid-gestation11. After this point, the adverse effect on germ cells was irreversible. This was an important discovery, demonstrating that vitC is essential during mid-gestation11. VitC deficiency that persists beyond this point irreversibly dysregulates demethylation, thereby hindering germline development11.

Environmental Factors Influencing the Epigenetic Role of Vitamin C

Dr. Ramalho-Santos’ work reveals exciting insights into how maternal vitC intake regulates the germ cells of mouse embryos. One may wonder how these findings relate to the world outside of a lab setting. Inadequate vitC intake may be a reality for individuals with lower incomes and inadequate access to fresh produce12,13. Additionally, studies demonstrate that exposure to pollutants such as cigarette smoke and heavy metals can inhibit or inactivate vitC through oxidation reactions14,15. This ultimately hampers vitC’s role as a cofactor of Tet, thereby dysregulating the epigenetic landscape of developing embryonic germ cells16,17. Today, regardless of geographical location, exposure to these toxic substances has become common and further compounds the effects of vitC deficiency. This reality poses a concern for pregnant mothers, as these adverse environmental contributors can lead to dysregulation of Tet. The resulting epigenetic impacts, such as reduced fecundity, may not be noticed until after the fetus becomes an adult.

Translating the Effects of Maternal Vitamin C Deficiency from Mice to Humans

Today, researchers strive to accumulate insights into factors that hinder molecular pathways and lead to downstream effects on fetal development. However, it is also important to understand how these scientific findings in animal models translate to improving the health of human populations. Previous studies have demonstrated that vitC modulates Tet enzymes and maintains pluripotency in human ESCs (cells extracted from early human embryos)18. While many of the experiments done in the Santos lab are modelled in mice, they are difficult to replicate in humans and face ethical barriers. Translation into human subjects would require studying vitC deficiency during pregnancy and then tracing the fecundity of the human offspring throughout adulthood. While this data does not currently exist, Mount Sinai Hospital’s ‘Ontario Birth Study’ program holds promise for recruiting relevant cohorts. This program collects clinical data from pregnant women to understand factors that contribute to maternal and child health. This data is based on multi-generational cohorts, making it immediately accessible for researchers to study trends in the role of epigenetics across many generations.

It is well known that only a fraction of disease risk is attributed to specific genetic defects. Research in the Santos lab addresses some of the gaps in explaining the epigenetic context of diseases. By studying the role of vitC in fetal development, the lab has opened up many further opportunities for exploring the epigenetic effect of other environmental factors. Some of these factors include the availability of food, temperature changes, stress, and exposure to pathogens. Studies based on these factors will provide insights into how environmental inputs shape the epigenome over time.

Future Directions

For the Santos Lab, the effects of vitC on mammalian development have opened doors to new questions and research directions. If embryos can respond to their surroundings, “we wonder what else the offspring can sense”, says Dr. Ramalho-Santos. He aims to further explore how environmental inputs, both good and bad, ensure that gene expression happens at the appropriate pace and level. Alongside this environment-epigenome project, Dr. Ramalho-Santos is interested in understanding the biological significance of hyper-transcription, a state of accelerated gene expression in ESCs and other stem cells19. Recent work has shed light on the importance of hyper-transcription in processes such as embryogenesis, neurogenesis, and development19. Dr. Ramalho-Santos explains that there is a link between nutrient availability and hyper-transcription. Instead of entering a hyper-transcription state, ESCs deprived of nutrition become dormant, ultimately resulting in serious consequences such as missed developmental milestones.

Dr. Ramalho-Santos’ overarching goal is to provide scientific evidence that “development doesn’t happen in a vacuum.” Instead, embryonic development is remarkably reflective of its surrounding maternal environment. Overall, the Santos lab has highlighted that vitC consumption during pregnancy is important for DNA demethylation and plays a key role in establishing the epigenetic landscape of embryonic germ cells. It is important to understand that vitC influences epigenetic changes as they leave a long-lasting effect on many characteristics in offspring.


1.   John, R. M. & Rougeulle, C. Developmental Epigenetics: Phenotype and the Flexible Epigenome. Front. Cell Dev. Biol. 6, 130 (2018).

2.   Legoff, L., D’Cruz, S. C., Tevosian, S., Primig, M. & Smagulova, F. Transgenerational Inheritance of Environmentally Induced Epigenetic Alterations during Mammalian Development. Cells 8, 1559 (2019).

3.   DiTroia, S. P. et al. Maternal vitamin C regulates reprogramming of DNA methylation and germline development. Nature 573, 271–275 (2019).

4.   Coker, S. J., Smith-Díaz, C. C., Dyson, R. M., Vissers, M. C. M. & Berry, M. J. The Epigenetic Role of Vitamin C in Neurodevelopment. Int. J. Mol. Sci. 23, 1208 (2022).

5.   Blaschke, K. et al. Vitamin C induces Tet-dependent DNA demethylation in ESCs to promote a blastocyst-like state. Nature 500, 222–226 (2013).

6.   Dawlaty, M. M. et al. Loss of Tet Enzymes Compromises Proper Differentiation of Embryonic Stem Cells. Dev. Cell 29, 102–111 (2014).

7.   Jenkins, T. G. & Carrell, D. T. Dynamic alterations in the paternal epigenetic landscape following fertilization. Front. Genet. 3, 143 (2012).

8.   Kohli, R. M. & Zhang, Y. TET enzymes, TDG and the dynamics of DNA demethylation. Nature 502, 472–479 (2013).

9.   Cinalli, R. M., Rangan, P. & Lehmann, R. Germ Cells Are Forever. Cell 132, 559–562 (2008).

10. Caldwell, B. A. et al. Functionally distinct roles for TET-oxidized 5-methylcytosine bases in somatic reprogramming to pluripotency. Mol. Cell 81, 859-869.e8 (2021).

11. Ebata, K. T. et al. Vitamin C induces specific demethylation of H3K9me2 in mouse embryonic stem cells via Kdm3a/b. Epigenetics Chromatin 10, 36 (2017).

12. Mosdol, A., Erens, B. & Brunner, E. J. Estimated prevalence and predictors of vitamin C deficiency within UK’s low-income population. J. Public Health 30, 456–460 (2008).

13. Shohaimi, S. et al. Occupational social class, educational level and area deprivation independently predict plasma ascorbic acid concentration: a cross-sectional population based study in the Norfolk cohort of the European Prospective Investigation into Cancer (EPIC-Norfolk). Eur. J. Clin. Nutr. 58, 1432–1435 (2004).

14. Rider, C. F. & Carlsten, C. Air pollution and DNA methylation: effects of exposure in humans. Clin. Epigenetics 11, 131 (2019).

15. Liu, S. et al. Arsenite Targets the Zinc Finger Domains of Tet Proteins and Inhibits Tet-Mediated Oxidation of 5-Methylcytosine. Environ. Sci. Technol. 49, 11923–11931 (2015).

16. Efimova, O. A., Koltsova, A. S., Krapivin, M. I., Tikhonov, A. V. & Pendina, A. A. Environmental Epigenetics and Genome Flexibility: Focus on 5-Hydroxymethylcytosine. Int. J. Mol. Sci. 21, 3223 (2020).

17. Yin, R. et al. Ascorbic Acid Enhances Tet-Mediated 5-Methylcytosine Oxidation and Promotes DNA Demethylation in Mammals. J. Am. Chem. Soc. 135, 10396–10403 (2013).

18. Chung, T.-L. et al. Vitamin C Promotes Widespread Yet Specific DNA Demethylation of the Epigenome in Human Embryonic Stem Cells. Stem Cells 28, 1848–1855 (2010).

19. Bulut-Karslioglu, A. et al. Chd1 protects genome integrity at promoters to sustain hypertranscription in embryonic stem cells. Nat. Commun. 12, 4859 (2021).

Harnessing the power of population databases, one study at a time

Dr. Shreejoy Tripathy and his team demonstrate the power of the UK BioBank population database in a study that unpacks the complicated interplay between schizophrenia polygenic risk score, psychotic episodes, and cannabis use.

Milcah Sutanto, Gabriela Tanumihardja, & Yuan Tian

Dr. Shreejoy Tripathy, Ph.D. (right) is pictured with postdoctoral fellow Dr. Michael Wainberg, Ph.D. (left). Dr. Tripathy is an independent scientist at the Krembil Centre for Neuroinformatics within the Centre for Addiction and Mental Health and an Assistant Professor in the Department of Psychiatry at the University of Toronto. Photo provided by Dr. Tripathy.

Have you ever wondered if genetics and the environment interact to play a role in the context of mental illnesses? This is exactly what Dr. Shreejoy Tripathy (Ph.D.), an Assistant Professor at the University of Toronto and an independent scientist at the Krembil Centre for Neuroinformatics within the Centre for Addiction and Mental Health, seeks to understand. The importance of understanding mental illnesses has been heightened with the onset of the COVID-19 pandemic. The pandemic has had a significant negative impact on the mental health of the general population worldwide1. In Canada specifically, 1 in 5 people experience a mental illness annually2. This demonstrates the urgent need to better understand the underlying causes of mental illnesses in hopes of developing both preventative and treatment strategies. Emerging research has been centred around understanding the development of mental illnesses; this has included investigating the interplay between genetic and environmental factors3. One strategy used to study these gene-environment relationships is large population databases, like the UK BioBank4. In collaboration with Dr. Michael Wainberg (Ph.D.), a postdoctoral fellow, Dr. Tripathy used the UK BioBank to investigate the relationship between cannabis use and psychotic experiences in the general population and those with a genetic predisposition for schizophrenia5.


Presentation of schizophrenia

Schizophrenia is a complex heritable mental illness that has a long-term impact on patients and society6. The symptoms of schizophrenia are usually classified as either positive, negative, or cognitive (Figure 1)6. Positive symptoms are characterized by a distortion or amplification of normal behaviours, such as hallucinations, whereas negative symptoms are indicated by a loss or dampening of normal functions, such as reduced emotional expression. Cognitive symptoms consist of difficulties in memory and attention.

Figure 1: Diagram depicting the potential symptoms of schizophrenia6. There are three classifications of symptoms. Positive symptoms are behaviours that are distorted or amplified from normal behaviours, including hallucinations, delusions, disorganized speech, and confused thoughts. Negative symptoms are behaviours that show a loss or decrease in normal functions, such as a lack of pleasure and struggling with daily routine (a lack of motivation). Cognitive symptoms can include memory problems and impaired sensory perception. Image created in BioRender.com.

Unearthing the heritability behind schizophrenia

“Genetics has been really useful in psychiatry and [in] helping [us] to understand and assess risk for various [mental] illnesses”, stated Dr. Tripathy when asked about the implications of genetics in psychiatric research. One of the first methods used to study the genetic component of developing mental illnesses was twin studies3. This technique evaluates whether a certain trait is more commonly shared in monozygotic twins (genetically identical) compared to dizygotic twins (non-genetically identical). Traits that are shared more commonly between monozygotic twins are considered more heritable, indicating that the traits are more heavily influenced by genetic factors. Interestingly, recent twin studies have estimated schizophrenia’s heritability to be between 60-65%, which alludes to the importance of genetic factors for its expression3. Moreover, it has been widely accepted that first-degree relatives of schizophrenic patients have a higher risk of developing schizophrenia compared to those without affected first-degree relatives3. Overall, the variation within an individual’s genetic makeup significantly contributes to the risk of developing schizophrenia.
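As a rough illustration of how a twin comparison can be turned into a heritability estimate, the sketch below uses Falconer's formula, h² = 2(r_MZ − r_DZ). This classical approximation is not named in the text above and is assumed here purely for illustration; modern twin studies generally fit more elaborate models. The correlation values are hypothetical inputs, not figures from the cited studies.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Approximate heritability from twin-pair trait correlations.

    r_mz: trait correlation between monozygotic (identical) twins.
    r_dz: trait correlation between dizygotic (non-identical) twins.
    Uses Falconer's approximation h2 = 2 * (r_mz - r_dz), clamped to [0, 1].
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    h2 = 2.0 * (r_mz - r_dz)
    # Sampling noise can push the raw estimate outside the valid range.
    return max(0.0, min(1.0, h2))


# Hypothetical correlations chosen only to exercise the function.
estimate = falconer_heritability(r_mz=0.80, r_dz=0.50)
print(f"Estimated heritability: {estimate:.2f}")
```

The larger the gap between identical-twin and non-identical-twin similarity, the larger the share of trait variation this approximation attributes to genetic factors.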

Like virtually all mental illnesses, schizophrenia is a complex polygenic disease, which relies on the action of several different genes to manifest7. To find the genes that are significantly associated with the disease, genome-wide association studies (GWAS) are often conducted. GWAS examines the genomes of a large set of individuals, with and without the disease of interest, and looks for genetic markers that can be used to predict the occurrence of the disease (Figure 2)3. GWAS has linked more than 100 common single nucleotide polymorphisms (SNPs), spanning more than 600 genes, with the development of schizophrenia3. Each of these genetic markers, also known as genetic variants, found by GWAS can be used to statistically estimate an individual’s risk of developing the disease due to genetics alone. This statistical estimate, often referred to as polygenic risk score (PRS), is calculated by taking the weighted sum of the risk of each disease-associated genetic variant7. With many genetic variants contributing to the PRS to a small degree, it is difficult to determine the overall risk of developing the disease without considering other factors, such as the environment.
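The weighted sum behind a PRS can be sketched in a few lines of Python. This is a minimal sketch, not the pipeline used in any study cited here: the variant names, effect sizes, and genotype are all hypothetical, and treating a missing genotype as zero risk alleles is a simplifying assumption.

```python
from typing import Dict


def polygenic_risk_score(effect_sizes: Dict[str, float],
                         risk_allele_counts: Dict[str, int]) -> float:
    """Weighted sum of risk-allele counts.

    effect_sizes: per-variant weights (for example, GWAS log odds ratios).
    risk_allele_counts: copies of the risk allele a person carries (0, 1 or 2).
    Variants missing from the genotype are counted as 0 (an assumption).
    """
    score = 0.0
    for variant, weight in effect_sizes.items():
        count = risk_allele_counts.get(variant, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant}: {count}")
        score += weight * count
    return score


# Hypothetical weights and genotype, invented for illustration only.
weights = {"variant_A": 0.10, "variant_B": 0.05, "variant_C": -0.02}
person = {"variant_A": 2, "variant_B": 1, "variant_C": 0}
print(f"PRS: {polygenic_risk_score(weights, person):.3f}")
```

Because each weight is small, a single variant barely moves the total; the score only becomes informative once many variants are summed and the result is compared against the distribution of scores in a reference population.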

Figure 2: Simplified outline of a schizophrenia GWAS3. A schizophrenia GWAS seeks to understand the relationship between having both schizophrenia and common genetic variants found within the population. The genomes of two large groups of individuals with and without schizophrenia are analyzed for genetic markers that may be predictive of developing schizophrenia. These genetic markers are identified by analyzing genetic SNPs within the population. These markers are then statistically analyzed to determine if they can be significantly associated with schizophrenia. Figure created in BioRender.com.
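For a single SNP, the case-control comparison outlined in Figure 2 can be approximated with a chi-square test on allele counts. The sketch below is one simple way to do this and is not necessarily the method used in the cited studies, which typically rely on regression models with covariates. The allele counts are hypothetical, and the 5 × 10⁻⁸ significance cutoff is a widely used convention that is assumed here rather than taken from the article.

```python
import math
from typing import Tuple

# Conventional genome-wide significance cutoff (an assumption, not from the article).
GENOME_WIDE_THRESHOLD = 5e-8


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int
                        ) -> Tuple[float, float, float]:
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    on a 2x2 table of allele counts. Returns (chi2, p_value, odds_ratio)."""
    observed = [[case_risk, case_other], [control_risk, control_other]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_risk + control_risk, case_other + control_other]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")
    if case_other == 0 or control_risk == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts for one SNP, invented for illustration only.
chi2, p, oratio = allelic_association(60, 40, 40, 60)
print(f"chi2={chi2:.2f}, p={p:.4g}, OR={oratio:.2f}, "
      f"genome-wide significant: {p < GENOME_WIDE_THRESHOLD}")
```

A GWAS repeats a test like this across hundreds of thousands of SNPs, which is why the significance threshold is set so far below the usual 0.05.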

Dr. Tripathy noted that “for the most part there are no psychiatric disorders that are completely due to genetics”. In fact, it has been well established that most psychiatric illnesses are a product of the interaction between genetic and environmental factors. The development of schizophrenia has been linked with exposure to many environmental factors such as childhood trauma, contraction of certain viral and bacterial infections, socioeconomic factors, and the use of cannabis6. The interaction between genetic and environmental factors is complex, and often very difficult to disentangle. Large-scale population databases that contain significant genetic and non-genetic information, like the UK BioBank, can be used to further investigate these relationships.

Using the UK BioBank to unravel the interaction between the PRS of schizophrenia, psychotic experiences, and cannabis use

Dr. Tripathy’s research lab used the UK BioBank to unpack the relationship between the PRS of schizophrenia and cannabis use. The UK BioBank is a large open-access resource that contains anonymized genetic and non-genetic information from 500,000 UK residents and is updated regularly8. This database includes information on participants’ genome-wide genotypes, physical measurement examinations, health-related records, and answers to online questionnaires (Figure 3). When the participants joined the UK BioBank project, they ranged between 40 and 69 years old, which allowed for the collection of data on age-related health problems as well as baseline data before the onset of any severe diseases. However, an important limitation of this database is its lack of diversity: most participants were White British. All in all, the UK BioBank was created to inspire well-powered research to determine the true effect of genetic and non-genetic factors contributing to disease. The availability of this online database to researchers around the world has spurred many health-related studies aimed at improving clinical care. As explained by Dr. Tripathy, “these types of datasets are really powerful”. The wide range of information available in this population database will also allow researchers to see potential connections and correlations, inspiring new studies that could further the field.

Figure 3: Schematic of the data collection points for the UK BioBank8. The UK BioBank collects data from 500,000 study participants. This data includes genetic and non-genetic information. Non-genetic data consists of information collected from health-related records, physical measurement exams, interviews, and self-reported questionnaires. Figure created in BioRender.com.

As data analysts, Drs. Tripathy and Wainberg evaluated the available data in the UK BioBank and found that over 150,000 participants had completed the Mental Health Questionnaire and self-reported information relating to substance use5. They quickly realized that this massive amount of data could be used to investigate the interaction between schizophrenia and cannabis use, providing an important insight into the development of the disease. When talking about this study, Dr. Tripathy remarked that it was especially “timely because cannabis has been legalized in Canada… and it’s increasingly becoming decriminalized throughout the world”. The use of cannabis is very common amongst Canadians: 1 in 4 reported having used cannabis within the past 12 months in the 2021 Canadian Cannabis Survey9.

Dr. Tripathy and his team performed a cross-sectional analysis using approximately 110,000 unrelated UK BioBank participants of White British ancestry5. They compared data from healthy participants (without a clinical diagnosis of schizophrenia) with high and low schizophrenia PRS to investigate how ever having used cannabis in their lifetime related to having psychotic experiences. Specifically, they looked for statistically significant associations between PRS, cannabis use frequency, and psychotic experiences like auditory and visual hallucinations. They found that the use of cannabis is more strongly associated with early-onset psychotic experiences in participants with a higher schizophrenia PRS compared to those with a lower schizophrenia PRS5. However, it is important to note that an association does not mean causation.
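One simple way to ask whether an association differs between PRS groups is to compute an odds ratio separately within each group. The sketch below does this with the Python standard library. It is not the statistical model used in the study, and the records are toy data invented solely to exercise the code; they are not results from the UK BioBank or from the published analysis.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (prs_group, ever_used_cannabis, reported_psychotic_experience)
Record = Tuple[str, bool, bool]


def odds_ratio_by_stratum(records: Iterable[Record]) -> Dict[str, float]:
    """Odds ratio of psychotic experiences for cannabis users versus
    non-users, computed separately within each PRS group."""
    counts = Counter(records)
    groups = sorted({group for group, _, _ in counts})
    result = {}
    for group in groups:
        a = counts[(group, True, True)]    # users with experiences
        b = counts[(group, True, False)]   # users without
        c = counts[(group, False, True)]   # non-users with experiences
        d = counts[(group, False, False)]  # non-users without
        if b == 0 or c == 0:
            raise ValueError(f"odds ratio undefined for group {group!r}")
        result[group] = (a * d) / (b * c)
    return result


# Toy records, invented for illustration only (NOT study data).
toy_records = (
    [("high_prs", True, True)] * 30 + [("high_prs", True, False)] * 70
    + [("high_prs", False, True)] * 10 + [("high_prs", False, False)] * 90
    + [("low_prs", True, True)] * 15 + [("low_prs", True, False)] * 85
    + [("low_prs", False, True)] * 10 + [("low_prs", False, False)] * 90
)
for group, value in odds_ratio_by_stratum(toy_records).items():
    print(f"{group}: odds ratio = {value:.2f}")
```

An odds ratio that differs between strata describes a difference in association only; as the paragraph above notes, it does not establish that cannabis use causes psychotic experiences.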

While this study was unable to establish causation, high-powered population databases, like the UK BioBank, can be used to define meaningful associations that have potential clinical applications. With both genetic and environmental factors coming into play in the development of schizophrenia, these results have indicated a potential avenue for preventive risk management5. For example, individuals with a higher schizophrenia PRS could be advised to avoid cannabis, especially early in their lives, in hopes of delaying or preventing disease presentation. Looking to the future, the increased access to and decriminalization of cannabis across the globe should lead the way to better knowledge dissemination and education regarding the intricacies of cannabis use.

The future of research using large population databases

This study shows that meaningful associations can be made by harnessing the power of large population databases, like the UK BioBank. The use of large population databases in research can help reduce the timeframe required to complete research projects. Additionally, the results produced by research projects analyzing large amounts of data tend to be more robust, as the amount of data available for analysis is much larger than what one single study can gather. As Dr. Tripathy explains in the case of the UK BioBank, “one cross-sectional study using half a million people may be better than 100 studies that use 50 people”. This is just the beginning of population database research, and there are many possibilities within the field that have yet to be explored4.

When asked about the potential research projects that can use this type of resource, Dr. Tripathy noted that while it may be relatively “easy to generate data…it’s still really hard to figure out what it means”, referring to the difficulties present in data analysis. One potential method to overcome this problem is programming. For instance, in this study, Dr. Tripathy and his team used programming languages like Python and R to analyze data from more than 110,000 participants from the UK BioBank5. With this being such a data-driven project, Dr. Tripathy mentioned that the most exciting part of conducting this research was having the chance to collaborate with and learn from his colleagues. He emphasizes that constant learning is a large part of this field and urges the next generation of scientists to become familiar with at least one programming language. He advises, “To anyone who’s interested in research in science, I would strongly encourage taking a programming class”. Learning a programming language like R or Python can help fill the high demand for data analysts with the skillset required to process large datasets. With the future of research becoming more data-centric, this is one step you can take to better situate yourself for a successful career in data research.


1.   Tsamakis, K. et al. COVID‑19 and its consequences on mental health (Review). Exp. Ther. Med. 21, 1–1 (2021).

2.   Fast Facts about Mental Health and Mental Illness. CMHA National https://cmha.ca/brochure/fast-facts-about-mental-illness/.

3.   Zhuo, C. et al. The genomics of schizophrenia: Shortcomings and solutions. Prog. Neuropsychopharmacol. Biol. Psychiatry 93, 71–76 (2019).

4.   Stewart, R. & Davis, K. ‘Big data’ in mental health research: current status and emerging possibilities. Soc. Psychiatry Psychiatr. Epidemiol. 51, 1055–1072 (2016).

5.   Wainberg, M., Jacobs, G. R., di Forti, M. & Tripathy, S. J. Cannabis, schizophrenia genetic risk, and psychotic experiences: a cross-sectional study of 109,308 participants from the UK Biobank. Transl. Psychiatry 11, 211 (2021).

6.   Owen, M. J., Sawa, A. & Mortensen, P. B. Schizophrenia. The Lancet 388, 86–97 (2016).

7.   Foley, C., Corvin, A. & Nakagome, S. Genetics of Schizophrenia: Ready to Translate? Curr. Psychiatry Rep. 19, 61 (2017).

8.   Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).

9.   Canada, H. Canadian Cannabis Survey 2021: Summary. https://www.canada.ca/en/health-canada/services/drugs-medication/cannabis/research-data/canadian-cannabis-survey-2021-summary.html (2021).

Unravelling genetic mysteries: The present and future of whole genome sequencing

A Toronto-based research group shows the promise of whole genome sequencing as a first-tier diagnostic tool for children with rare diseases, ushering in a new era of precision medicine.

Liali Aljouda, Meredith Curtis, Anjali Jain

Dr. Gregory Costain, MD, PhD, is a staff physician in the Division of Clinical and Metabolic Genetics, a scientist-track investigator in the Genetics & Genome Biology program at SickKids, and an assistant professor in the Department of Paediatrics at the University of Toronto. Photo provided by Dr. Costain.

It was a good time to be a medical geneticist in Toronto in 2014. At a time when the genetics field was just starting to move towards whole exome sequencing (WES), a test that reads the ~1% of DNA that codes for proteins, SickKids made a revolutionary decision. They decided that the way forward was to invest in whole genome sequencing (WGS), a technology that reads every single letter in the ~3 billion letter sequence that makes up human DNA (Figure 1). Access to this amount of genetic data was simply a fantasy to geneticists decades prior, but at this time researchers at SickKids were starting to get a taste of how this data could change the landscape of precision medicine. This decision to invest in the future gave rise to the Genome Clinic Study2, which fostered Dr. Gregory Costain’s interest in WGS and its potential as a first-line test for more than just a diagnosis.

Figure 1. Comparison of whole exome and genome sequencing. All our DNA together is called the genome. About 1% of the genome consists of genes that code for proteins; this is called the exome. Whole exome sequencing is a type of genetic test that reads this 1% to look for a change in the sequence that causes disease. Whole genome sequencing reads every single letter of DNA in our cells, which includes over 3 billion letters. Therefore, it can assess all of the protein-coding genes that make up the exome as well as the areas in between known as intronic regions. Intronic regions can influence how genes function. Figure adapted from https://mygene2.org/MyGene2/exomesequencingdetails

Dr. Gregory Costain is a staff physician in the Division of Clinical and Metabolic Genetics, and a scientist-track investigator in the Genetics & Genome Biology program at SickKids. He runs a specialized Neuropsychiatric Genetics Clinic and a research program that uses genome sequencing to understand the cause and consequences of rare variation in the genome of undiagnosed paediatric patients. In September 2020, Dr. Costain and his colleagues published their findings from a study using WGS as a first-tier diagnostic test for children with medical complexity (CMC)3. However, throughout our time together it became clear that this research had significance far greater than finding a diagnosis. As Dr. Costain stated himself, “The motivations are not only to end that diagnostic odyssey for that family… but it’s because the rare helps us to understand the common”.

In the 2020 study, Dr. Costain co-led a research team which sequenced the genomes of 49 patients who are followed by the Complex Care Program at SickKids hospital and their parents. These patients are considered children with medical complexity (CMC), who are defined in Ontario as children who have multiple complex chronic conditions necessitating specialized care from multiple medical and community providers, have a functional disability, and rely on some degree of medical technology for their daily living activities (Figure 2)4. All of the study participants were suspected of having an underlying genetic condition. However, even after years of searching for a diagnosis, ranging from 5 to 17 years, no answers were found through conventional genetic tests. The results of the study showed that WGS was not only able to detect all genetic variants previously identified by conventional genetic testing, but it also identified genetic variants that could not have been detected by any other method. This was remarkable because these results gave CMC and their families new hope. Fifteen of the 49 participants received a new molecular genetic diagnosis, of which seven received a diagnosis that had an immediate impact on their treatment plan. For instance, one child was diagnosed with a genetic condition caused by an enzyme deficiency and was started on targeted therapy. The researchers proposed that using WGS by itself could save years’ worth of time and money on genetic testing for CMC and their families.

Figure 2. Model to define children with medical complexity (CMC). CMC suffer from chronic conditions which may be a known or unknown genetic condition. They may have severe functional limitations necessitating technology dependence for breathing, walking, eating etc. This results in increased needs in terms of constant care from family/caregivers and greater financial needs to cover for increased health care expenditure. Figure adapted from5

When asked why the research team decided to focus on CMC, Dr. Costain mentioned that these patients account for less than 1% of all the paediatric cases in Ontario; however, that small group accounts for over 30% of all paediatric health care expenditure in Canada6. These children usually spend a significant amount of their lives in a hospital or other healthcare setting, and they end up needing a lot of additional resources and support. “The vast majority of CMC likely have an underlying genetic condition”, he states. Unfortunately, the current conventional genetic testing strategy can be costly, time-consuming, and ultimately unsuccessful7. Therefore, he says, “to me, it made sense that they [CMC] should be at the forefront of any new advances in genetic testing, as they are the ones who may benefit the most.”

Additionally, Dr. Costain notes, “We feel there is value in having a diagnosis for a child’s issues that goes beyond an immediate impact on their clinical care.” Many families tend to have some degree of self-blame associated with having a child who has an unexplained medical complexity, and it is essential to give them a tangible explanation that takes away that locus of responsibility and worry. Moreover, uncovering these genetic causes is crucial to better inform recurrence risk. Therefore, aside from treatment, a diagnosis can provide a family with options for family planning, connections to a community of others dealing with a similar diagnosis, and a platform for advocacy. With WGS, “we were able to finally give answers to a significant fraction of those children,” states Dr. Costain.

One might think that the benefits of this study are limited to children with medical complexities; however, this research has far broader implications. “All of us are at risk of having a child with a genetic condition,” says Dr. Costain. While this is a lesser-known fact among the general public, it is an undeniable truth. In his clinic in Toronto, Dr. Costain deals with cases which are inherited in different ways. In cases of traditional inheritance patterns, where either one (dominant) or two (recessive) defective copies of a gene are required to cause disease, it is easier to calculate the risk of having a child with disease8. Nevertheless, the majority of cases that Dr. Costain works with present with new, spontaneous mutations that are not inherited from either parent. Although spontaneous changes in our DNA occur throughout every individual’s life and are a part of what makes us human, they can sometimes disrupt major pathways and lead to disease. So, really no one is completely free from the risk of having a genetic disease. Additionally, the findings from this WGS research are not only improving care for children with medical complexities but also providing the scientific community with knowledge that will inform precision medicine in the future.

With great joy, Dr. Costain shares exciting news, “the Ministry of Health last week sent out an alert acknowledging that, as of April 1, 2021, genome-wide sequencing will be available in our province.” Currently, SickKids and Children’s Hospital of Eastern Ontario (CHEO) have the necessary infrastructure and the expertise to organize sequencing in-house for clinical diagnostic purposes9. One thing that adds to the excitement is that SickKids will be at the forefront of this initiative and will be responsible for WGS in the province. “Thanks to the work of my colleagues at SickKids, CHEO, and elsewhere over the last several years” says Dr. Costain, the longstanding barrier of making WGS a clinical diagnostic test has mostly been overcome. “This is very much a team process.”

Now that WGS is slowly making its way into the clinic, there are many new pressing challenges and considerations which need to be addressed. The most important one is to view WGS through the “lens of health care providers and policymakers as opposed to researchers,” says Dr. Costain. The data from research studies have proved how valuable WGS testing is, but it is now time to shift the perspective and focus on questions like eligibility and cost of WGS relative to other diagnostic tests. After these questions are addressed, next comes the task of having an expert workforce who can interpret and return the data so it can then be responsibly communicated to the patient and family members. Dr. Costain highlights a shortage of genetic counsellors, medical geneticists, and genome analysts in Canada and other parts of the world. For WGS to be successful without being a burden, it is essential to fill the gap between the number of experts who can deal with WGS data, and the demand for this test in the clinic. In parallel, it is also essential that the people receiving this test understand its limitations. “No test in medicine is perfect and that is equally true for WGS,” says Dr. Costain. One big concern is that people may get false reassurance if their test comes out to be negative. He thinks the best way to address this is through an “education campaign” informing people about these caveats. 

As for the future of WGS in clinical care, Dr. Costain believes, “the next step with personalized medicine is using all of the genetic information we get from WGS and not just the individual diagnostic rare change”. Personalized medicine, also known as precision medicine, is medical care tailored to the patient based on extensive individual data to improve health outcomes10. Genomic data plays an important role in this approach to healthcare as it can provide not only a diagnosis but also insight into differences in treatment response and variable expression, wherein people with the same genetic changes have varying disease presentations. Clinicians can leverage all the genetic data obtained from WGS to look for clues to explain this variability. Dr. Costain aptly states, “there are always going to be unknowables, [but] I think we can do a better job of tailoring someone’s risk”.

Although the future is yet to be determined, there are a few things we know for sure. “There is going to be a learning curve and there are going to be pains and challenges,” Dr. Costain emphasizes. As with any new technology, it takes time and resources for physicians, especially those outside the genetics field, to learn how to best utilize these tools. For complex cases, such as CMC, geneticists are likely to be heavily involved in ordering, interpreting, and translating genomic information to patients and their families. However, other physicians will increasingly be asked to take on that responsibility in their own practice. This is no small request, but Dr. Costain does not see this as a hurdle that his non-genetics colleagues must overcome on their own. Genome sequencing generates a large amount of data, and it can be difficult to tease out the relevant information and understand its limitations. Dr. Costain believes that those in medical genetics and genetic counselling should be seen as “resources and colleagues who can help them [clinicians]” manage this learning curve. He also acknowledges the opportunity to create new career paths in translational genomics, whereby individuals with knowledge of the issues around translating research genomics into clinical care may have a place in the education of non-genetics care providers, the lay public, and even politicians and policy makers. Dedicated training, such as that provided through the University of Toronto’s Medical Genomics program, may help to fill this gap. “So, I’m excited that this program exists because it makes a lot of sense to me,” Dr. Costain concluded.

With the knowledge being generated out of Toronto by research groups like the one led by Dr. Costain and the new announcement from the Ministry of Health regarding routine WES and WGS for clinical care, we are on the cusp of a paradigm shift towards precision medicine. Although 2014 may have been an exciting time to be a medical geneticist, 2021 is shaping up to be an exciting time to be in the field of medical genomics.


1.         Yin, R., Kwoh, C. K. & Zheng, J. Whole genome sequencing analysis. Encycl. Bioinforma. Comput. Biol. ABC Bioinforma. 3, 176–183 (2018).

2.       Bowdin, S., Ray, P. N., Cohn, R. D. & Meyn, M. S. The Genome Clinic: A Multidisciplinary Approach to Assessing the Opportunities and Challenges of Integrating Genomic Analysis into Clinical Care. Hum. Mutat. 35, 513–519 (2014).

3.       Costain, G. et al. Genome sequencing as a diagnostic test in children with unexplained medical complexity. JAMA Netw. Open. 3, e2018109 (2020).

4.       Cohen, E., Berry, J. G., Sanders, L., Schor, E. L. & Wise, P. H. Status complexicus? The emergence of pediatric complex care. Pediatrics. 141, S202–S211 (2018).

5.       Dewan, T. & Cohen, E. Children with medical complexity in Canada. Paediatr Child Health. 18, 518–522 (2013).

6.       Cohen, E. et al. Patterns and costs of health care use of children with medical complexity. Pediatrics. 130, e1463 (2012).

7.       Oei, K., Hayeems, R., Ungar, W., Cohn, R. & Cohen, E. Genetic Testing among children in a complex care program. Children. 4, 42 (2017).

8.       Genetic Alliance. The New York-Mid-Atlantic consortium for genetic and newborn screening services. Underst. Genet. A New York, Mid- Atl. Guid. Patients Heal. Prof. Appendix E Inheritance Patterns (2009).

9.       Genome-wide Sequencing Ontario. Genome-wide sequencing Ontario: A pilot implementation for rare disease diagnostics. https://gsontario.ca/ (2021).

10.     Costain, G., Cohn, R. D. & Malkin, D. Precision child health: an emerging paradigm for paediatric quality and safety. Current Treatment Options in Pediatrics. 6, 317–324 (2020).

The Genome on Repeat: Expanding our understanding of autism spectrum disorder by studying expanded DNA

Dr. Ryan Yuen, researcher at SickKids, uses artificial intelligence to analyze repetitive DNA regions that contribute to autism spectrum disorder. 

Nihal Al Menabawy, Daniel Kiss, and Elise Poole 

Dr. Ryan Yuen is a Scientist in Genetics and Genome Biology at SickKids Research Institute and an Assistant Professor in the Department of Molecular Genetics at The University of Toronto. Image from the University of Toronto Department of Molecular Genetics website. 

As quarantine hobbyists can attest, putting together a puzzle when pieces are missing or the reference photo is incomplete is extremely difficult. For clinicians and geneticists alike, this is an everyday reality when trying to understand the genetic basis of conditions like autism spectrum disorder (ASD). ASD is a group of common neurodevelopmental disorders, impacting approximately 1 in every 66 children in Canada. Diagnosis is currently informed largely by clinical presentation and is characterized by a wide range of features including challenges in social interaction and communication, repetitive patterns of behavior, and restricted interests1.

Along with his team, Dr. Ryan Yuen, a genetics scientist at SickKids Research Institute and an assistant professor of molecular genetics at the University of Toronto, has created a novel computational pipeline to unravel hidden genetic components contributing to ASD (Figure 1). His aim is to develop novel disease gene discovery strategies, effective diagnostic approaches and better treatment options for individuals with neurodevelopmental and neurological disorders.  

Figure 1: Genome sequencing workflow. After DNA samples are prepared, they are loaded into a sequencer, which reads the DNA code and outputs millions of short DNA sequences. These sequences go into Dr. Yuen’s computational pipeline, a group of high-powered software programs that arrange the sequences and search for tandem repeats (created using biorender.com).

ASD as a genetic condition. 

As with many neurological conditions, ASD has a strong genetic component. In studies conducted on identical twins, researchers have shown that if one twin has ASD, the identical twin will also have ASD 87% of the time2. In fact, 50-90% of the risk surrounding ASD may be determined by genetics alone2. Although research in genetics has been able to unlock a great deal of information about ASD inheritance and the underlying biology, scientists are yet to discover all genetic causes3.  

ASD is classified as a complex disorder, meaning that although there is a clear genetic component, it is not caused by changes in a single gene. Rather, ASD results from multiple changes throughout the genome that each contribute to an increased risk of developing the condition4.

The curious case of missing heritability 

Over the past two decades, next generation DNA sequencing (NGS) has transformed the field of genetic research and empowered researchers to study ASD on a genome-wide level4. Despite advances in genetic research, only 20% of genetic contributors to ASD have been discovered, with the majority of individual ASD cases having an unknown genetic cause2.

When asked why the number of known genetic contributors remains low, Yuen states, “We know that there is a genetic basis in ASD, but we cannot find where the variants are.” 

ASD is a true genetic puzzle, and Dr. Yuen’s team is motivated to help put more of the pieces together. “There is so much in ASD that still needs to be uncovered”, says Yuen. He also mentions that the current sequencing analysis pipeline for ASD is primarily focused on functional (protein-coding) regions of the genome. “Our genome is mostly unexplored,” says Yuen, “and many of the unexplored parts are the repetitive regions including DNA tandem repeats.” Dr. Yuen and his team have developed a novel computational approach to identify these DNA repeats and believe they could be the key to uncovering novel contributions to the genetics of ASD. This research has recently been published in Nature3.

DNA tandem repeat expansions: When copying and pasting goes wrong 

When cells in the body divide to make new ones, the DNA within each cell has to be copied. Although most cells can complete this process without any issues, certain DNA building blocks known as nucleotides are occasionally removed, inserted, or even repeated by mistake and passed on to new cells. Tandem repeats, a specific type of DNA error, occur when short DNA sections are mistakenly copied and pasted one after another. These repeated regions act like “wrinkles” in the DNA. As with photocopying a wrinkled piece of paper, these DNA wrinkles can confuse the cell’s DNA-copying mechanism and cause it to create more wrinkles, further extending the repeated region. This is referred to as “tandem repeat expansion”.

Although tandem repeats vary in length and DNA location from person to person, certain regions of DNA can be damaged by too many repeats. These repeats may be passed on from parents to their children, where they can continue to expand and become more harmful (Figure 2). Some of the most common genetic conditions caused by tandem repeat expansion include Huntington’s disease and Fragile X syndrome, both of which are caused by an abundance of repeats in a specific gene5.

Figure 2. Tandem Repeats Explained. As DNA is passed from parent to child, tandem repeats can expand, leading to defects in gene function, and downstream complications. Repeat expansion has been thought to associate with ASD, but until now, no large-scale research has investigated this connection. 
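To make the idea of a repeat "expanding" between generations concrete, the short Python sketch below counts back-to-back copies of a motif in a DNA string. It is a toy illustration only and is not Dr. Yuen's pipeline: the function name `longest_tandem_run` and the two example sequences are made up for this sketch, and CAG is used as the motif because CAG repeats are the kind expanded in Huntington's disease.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be non-empty")
    # One or more consecutive copies of the motif, matched greedily.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical sequences: the child's repeat tract is longer than the parent's.
parent = "GGT" + "CAG" * 5 + "TTA"
child = "GGT" + "CAG" * 9 + "TTA"

print("parent copies:", longest_tandem_run(parent, "CAG"))
print("child copies:", longest_tandem_run(child, "CAG"))
```

Real sequencing data arrives as millions of short reads rather than one clean string, which is a large part of why genome-wide repeat detection needs dedicated software.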

Understanding ASD genetics: A missing link 

Until recently, scientists were only able to look for tandem repeat expansions in one gene at a time, and determining the impact of each repeat required numerous studies and experiments. Although these methods have been able to provide information about genetic disorders and suggest treatment options, not all genetic traits are controlled by one single repeat region. Because ASD is a complex disorder potentially controlled by hundreds to thousands of genes and unique regions of DNA, it is difficult to determine whether specific tandem repeats are indeed contributing to each ASD-related trait. Furthermore, most research in the field of ASD thus far has focused on mutations and repeats within protein-coding regions of genes, which make up less than 2% of the genome.

“We know that there is a genetic basis, but we don’t fully know how it works,” stated Yuen. 

With nearly a million tandem repeat regions in the human genome6, finding out which repeat locations and repeat sizes might contribute to ASD has proven to be incredibly difficult. 

Computers pave the way for a new era in genetics 

Thankfully, with recent advances in artificial intelligence technologies, Dr. Yuen’s team was able to create and implement a computer-directed genome searching method that can pinpoint which repeats are associated with ASD, and which ones are not. Using statistical methods to analyze genomes of individuals with ASD, this method can find patterns in repeat regions that occur more often in individuals with autism than those without. 

The team applied their method to genome data from over 17,000 individuals, over 5,000 of whom had been diagnosed with ASD. Collectively, they were able to identify 2,588 locations in the genome where tandem repeats were significantly more common in individuals with ASD, highlighting the importance of studying repeat expansion.
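As a rough illustration of what "significantly more common" can mean at a single repeat location, the sketch below runs a one-sided Fisher's exact test on counts of people who do and do not carry an expanded repeat. This is a generic textbook test chosen for illustration, not necessarily the statistic used in the published study; the function name `fisher_one_sided` and all counts are hypothetical.

```python
from math import comb


def fisher_one_sided(case_with: int, case_total: int,
                     control_with: int, control_total: int) -> float:
    """Probability of seeing at least `case_with` carriers among the cases
    if carrying the expansion were unrelated to diagnosis (hypergeometric tail)."""
    carriers = case_with + control_with
    total = case_total + control_total
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_with, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts at one location: 12 of 1,000 people with ASD carry
# an expansion, compared with 3 of 1,000 people without ASD.
p_value = fisher_one_sided(12, 1000, 3, 1000)
print(f"one-sided p-value: {p_value:.4f}")
```

A genome-wide analysis repeats this kind of comparison across a very large number of locations, so the resulting p-values would also need correction for multiple testing before any location is called significant.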

Although some repeats were found in genes with previous ASD association, over half were found in genes completely new to autism, paving the way for studies that can look closer at these genes to determine the specific mechanisms by which autism occurs. 

Tandem repeat studies can happen in a broad spectrum of genetic conditions 

Although this research was conducted in ASD, Yuen highlights that the computational method itself was not created exclusively for ASD, stating, “This is a breakthrough in genetics. It’s the first time a complex disorder can be related to tandem repeat expansion. Until now, they were only related to monogenic (single gene) disorders. The pipeline can be applied to other disorders. We can use this pipeline to identify virtually any repeat-associated variant in any disease.” 

Yuen emphasizes that although ASD was the first complex disorder to be analyzed, his method can be applied to any disease with a known repeat-associated genetic basis, or even to conditions with an unknown genetic cause. He states, “This is a potential avenue for understanding the biology of tandem repeats in complex neurological conditions.” Since such conditions present with a wide variety of symptoms, Yuen hopes that the pipeline will highlight the need for, and aid in the gathering of more genetic information to decipher the underlying mechanisms of various conditions. 

Unraveling the next steps: Pipeline reimagined 

Regarding future applications, Yuen highlights the power of revisiting existing data. “We focus on what we know is biologically relevant, in this case, tandem repeats, and utilize what has been missed.” Although this new methodology has provided significant advancements in understanding the genetic landscape of tandem repeats, Yuen notes that there is room for improvement.

One of Dr. Yuen’s major goals is to develop an even better analysis pipeline. An overhaul of current analysis methods using new technological advancements in gene sequencing would enable more precise detection of tandem repeats. Rather than searching retroactively, tandem repeats could be actively investigated as soon as genetic testing is requested. 

Another avenue for pipeline improvement is better analysis of large amounts of genomic data. In order to effectively detect tandem repeats throughout the genome, thousands of genomes must be analyzed simultaneously to search for patterns. To this end, Dr. Yuen shares, “One obstacle that I am facing right now is that there are many genomes available, not locally but internationally…The bottleneck is how do we access this data, and how can we process large-scale datasets?” Although the data is available, and Yuen is equipped with high-performance computers, the current iteration of his analysis pipeline is computationally intensive and time-consuming.

“I’m just hoping that in the near future the infrastructure can also develop in a way that we can analyze data more efficiently.” 

Expanding into new avenues of clinical research 

In addition to providing a new piece to help fill in the genetic puzzle associated with ASD, research conducted by Yuen and colleagues provides new insights into the importance of studying tandem repeats throughout the genome. While their novel findings in ASD are extremely valuable, the group also discovered that tandem repeats throughout the genome are much more variable than previously thought. A variety of genes containing tandem repeats may be subject to a multitude of different repeat patterns, which may have been previously overlooked. When combining this new knowledge of tandem repeats with the analysis pipeline developed by Dr. Yuen, the approach may be applied to a variety of genomic contexts. 

“I’m currently working with many clinicians and other researchers that are studying other disorders or even rare diseases that currently do not have a [known] genetic cause.” 

The development of such an analytical pipeline opens the door for researchers to explore tandem repeats as a genetic basis of disease. In many situations, especially concerning rare diseases, a diagnosis itself may be therapeutic from the perspective of the patient. For many individuals with rare diseases, diagnosis can take upwards of 6 years7, and for some, a diagnosis may never come. As the field of genomics continues to grow, there is hope that new developments in tandem repeat research and detection will strengthen our understanding of the genetic basis of disease, and subsequently improve current diagnostics.

While improvement in diagnostic capabilities is a large achievement in and of itself, Yuen highlights that in certain disorders, such as Huntington’s disease, targeting tandem repeats could provide new opportunities for medical management. Research by one of Dr. Yuen’s collaborators, Dr. Christopher Pearson, identified a small molecule that binds to the tandem repeats present in Huntington’s disease and shortens the repeat, potentially reducing the severity of the disease8. “This is our very long-term hope,” Yuen replies when asked if the combination of tandem repeat detection and therapeutic targeting could become clinically relevant in the future.

While the field of ASD genetics and tandem repeat expansion is still in its infancy, Dr. Yuen’s research paves the way for a multitude of further studies combining genomics with artificial intelligence, enabling more pieces of the ASD puzzle to come together. Overall, Dr. Yuen’s pipeline will change the way we look at our genome and will enable us to discover more genetic contributors to complex diseases. 


1. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368–372 (2010). 

2. Colvert, E. et al. Heritability of autism spectrum disorder in a UK population-based twin sample. JAMA Psychiatry 72, 415–423 (2015). 

3. Trost, B. et al. Genome-wide detection of tandem DNA repeats that are expanded in autism. Nature 586, 80–86 (2020). 

4. Rylaarsdam, L. & Guemez-Gamboa, A. Genetic Causes and Modifiers of Autism Spectrum Disorder. Front. Cell. Neurosci. 13, 1–15 (2019).

7. Canadian Organization for Rare Disorders. Now is the Time: A Strategy for Rare Diseases is a Strategy for all Diseases. (2015). 

8. Nakamori, M. et al. A slipped-CAG DNA-binding small molecule induces trinucleotide-repeat contractions in vivo. Nat. Genet. 52, 146–159 (2020).

From lab to clinic: The search for new drug developments and therapies for Congenital Myopathies

Dr. Jim Dowling shares his knowledge regarding congenital myopathies, discusses the state of gene therapy and drug development, and tells where he believes these technologies are heading in the future. 

David Di Iorio, Jim Zhang, Zhilin Xu

Dr. Jim Dowling, MD/PhD, Physician and Clinical Scientist at The Hospital for Sick Children (SickKids) and Professor in the Department of Molecular Genetics at the University of Toronto. Image taken from the University of Toronto Department of Molecular Genetics website.


Dr. Jim Dowling received his BSc and MSc from Yale University and his MD/PhD from the University of Chicago. He completed his residency at the Children’s Hospital of Philadelphia and pursued a Postdoctoral Fellowship at the University of Michigan. Dr. Dowling describes his research as “fundamentally aimed at developing therapies for patients with rare or genetic diseases, specifically for children who have muscle conditions.” Based on his clinical expertise and scientific research, he is considered one of the leading authorities on the diagnosis and treatment of congenital myopathies (CMs).

What are Congenital Muscular Diseases?

CMs are a group of muscular diseases resulting from specific genetic mutations1. They are characterized by defects in structural proteins and classified by muscle biopsy findings. Onset usually occurs during the neonatal period with hypotonia and weakness1. Additionally, CMs are relatively rare diseases1,2, with a rough incidence as low as 1 in 25,000. Although progress has been made in the field of medical genetics to provide a better understanding of the disease, treatment is still limited to ameliorating the course of the disorder and preventing pulmonary complications, which means further studies in the development of new therapeutics are required3.

Dr. Dowling’s interest in congenital myopathies began during the first week of his residency in the pediatric neurology department. “We saw a muscle biopsy from a CMs patient which was incredibly abnormal in a very unique way,” he explained. It was fascinating to him how a single gene defect could lead to such a serious disease. Since there is no definitive treatment for these diseases1, “the desire to develop a therapy for something that had no treatment was very attractive to me,” Dr. Dowling said.

Zebrafish as a model for rare diseases

Congenital muscular dystrophy (CMD) is a category of congenital muscle disease characterized by a dystrophic pattern seen in muscle tissue histology2. Animal models of CMD are important for understanding the pathogenesis of this disease and are also critical for developing novel treatments in preclinical trials4. Mouse models have already made progress in identifying potential therapeutic targets. However, more recently, zebrafish have been recognized as a better animal model for studying muscle diseases5. Zebrafish share a similar skeletal muscle structure with humans and conserve both genetic and histological features. Additionally, their rapid embryonic development, ability to produce large numbers of offspring, and the optical transparency of their embryos and larvae make zebrafish a desirable tool for studying muscle disorders.

“One of our pioneering research efforts was to start using zebrafish as a model for rare diseases,” Dr. Dowling explained. In one of his papers5, a comprehensive analysis of muscle damage was conducted using a LAMA2-related congenital muscular dystrophy (LAMA2-MD) zebrafish model (Fig. 1). This model contributes remarkably to the understanding of pathogenic mechanisms and the development of new therapeutic strategies. Furthermore, genome-editing techniques such as CRISPR/Cas9 systems can easily be used with zebrafish to generate a large number of patient-specific mutations for further studies. This cannot be accomplished cost-effectively in a mouse model, because mice have a longer period of embryonic growth and relatively fewer offspring than zebrafish.

Figure 1. Analysis of the zebrafish LAMA2-MD model. (A, A’) Birefringence assay: the organization of muscle fibers can be visualized. Arrows indicate detached muscle in A’. (B, B’) Immunohistochemistry staining: arrows indicate detached muscle fibers in B’. (C, C’) Injected fluorescent marker: arrows indicate detached fibers in C’. (D, D’) Swimming assay: swim behavior can be tracked and quantified. Indicators of muscle function include time spent, distance and speed. Fewer tracks and lower speed (indicated by green tracks) are seen in D’. Figure adapted from Fabian et al., 20205.

When asked to compare mouse and zebrafish models, Dr. Dowling explained: “there is no perfect model in any scenario.” Zebrafish models have their own drawbacks. “The key is to think about picking a model for your experiment [that has] the most pros and fewest cons.” For example, muscle disorders can impair respiratory function, leading to respiratory failure due to diaphragmatic weakness, and, according to Dr. Dowling, “that is not something we are going to see in a fish.” He concluded that zebrafish are an excellent model for large-scale drug screening because of their high fertility rate. Subsequent testing and validation in a mammalian model such as the mouse can then identify the candidate therapeutics with the highest potential for successful translation to patients. Ultimately, it is the researchers’ responsibility to select the animal model that best addresses their experimental questions. Dr. Dowling also expressed his appreciation for the academic resources that allow him to select the most appropriate model for his research needs. For example, the use of a mouse model was crucial in his lab’s discovery of tamoxifen treatment for myopathies.

Tamoxifen treatment for myopathy: An unexpected discovery

By the winter of 2018, a pre-clinical study had revealed that tamoxifen could significantly rescue the myopathy phenotype in the mouse model. This astonishing finding pointed to a promising future for treating myotubular myopathy by repurposing tamoxifen6. Interestingly, tamoxifen originally served as a negative control in another congenital myopathy study by Dr. Dowling’s team7, but was found to elicit an unexpected increase in mouse survival. When Dr. Dowling’s technician realized that the tamoxifen-treated mice were “a little better, more active, and healthier” to a degree beyond the placebo threshold, the research team was motivated to uncover the mechanisms behind this unexpected development6.

To explore this, the lab used Mtm1-knockout mice (MTM), which mimic the myotubular myopathy phenotypes, to demonstrate the beneficial effects of tamoxifen. These effects include an increased survival rate, improved muscle function, and restored muscle structure (Fig. 2)6. One of the key phenotypes of MTM is the upregulation of a gene called DNM2, resulting in disrupted microtubule networks and progressive muscle weakness8. Dr. Dowling’s team uncovered a clear estrogen-receptor-α mediated pathway through which tamoxifen modifies the DNM2 protein translation process via ubiquitination6. Fortunately, this pathway is conserved in humans, suggesting a novel therapeutic potential for tamoxifen in myopathy6.

Figure 2. Tamoxifen (TAM) treatment improves muscle structure in Mtm1-KO (MTM) mice. Tamoxifen treatment began at day 21, and muscle samples from treated mice were examined at day 36. H&E staining (pink) shows nuclear localization; SDH staining (purple) shows mitochondrial aggregation; and IF staining (green) shows membrane structure. MTM mice treated with high-dose tamoxifen showed fewer central nuclei, less mitochondrial aggregation and restored membrane structure compared with untreated MTM mice, which collectively indicates an alleviated myopathy phenotype. Figure adapted from Maani et al., 20186.

As an FDA-approved therapeutic, tamoxifen is available for immediate medical translation. However, Dr. Dowling’s team encountered obstacles in their attempts to repurpose tamoxifen, as it is understandably difficult for profit-driven pharmaceutical companies to lend their drugs to experimentation for alternative indications. But after two years of effort, Dr. Dowling can now proudly announce that the team is “poised to bring this [therapy] to patients.” As of March 2021, this approach is under evaluation in a phase I placebo-controlled randomized trial.

When asked to reflect on the unexpected success of tamoxifen in his lab, Dr. Dowling credited the “good-controlled experimentation” that allowed them to discover the novel potential of tamoxifen. The discovery demonstrated the importance of responsible and thorough protocols during experimentation, ensuring no stone is left unturned.

Challenges of Drug and Therapy Development

Achieving these exciting advances in congenital muscular disease treatments has not come without its challenges, and the Dowling lab is not immune to the common obstacles that nearly all research labs face. One notable example is the difficult process of taking research discoveries beyond the lab and into true development phases. This phenomenon, popularly known as the “Valley of Death”, represents a multitude of potential pitfalls that make it hard for research labs to create actionable therapies on their own9. “As academics, we are very good at discovery science, but [less so] when it comes to development science,” says Dr. Dowling. This is a major reason why much of drug development ends up in the hands of industry rather than research labs.

Because Dr. Dowling’s research focuses on rare diseases, his lab also faces more unique challenges that are less of a factor for labs researching more common diseases. In the case of congenital myopathies, there is a very real possibility that a lack of clinical data will slow the process of clinical trials. “It’s hard to have precision therapy for an individual if you don’t know the cause of their disease,” says Dr. Dowling, who understands that a disease with relatively unknown pathology makes it difficult to find a starting point for treatment. “[Even with] a great drug candidate, if there is very little known about the natural history of a disease, you may not have an effective clinical trial.” For rare diseases especially, the challenge lies in generating enough clinical data to create effective drug trials.

Another major factor is the cost of treatment. “Even though it is simultaneously this time of incredible discoveries and excitement surrounding gene therapy, it comes with this bittersweet pill of the fact that it is also [expensive],” says Dr. Dowling. With the development side of drug production managed mostly by industry, and the potential lack of clinical data that comes with researching rare diseases, it is no wonder that the drugs which do make it to the end of the pipeline are rather costly. For this reason, cost reduction for these therapies has become a top priority for the future.

Preparing for the Future

So, what is next in the world of congenital muscular disease gene therapy? For Dr. Dowling, cost and accessibility are two factors that need to be addressed for the future. There is no shortage of labs with research projects that push the boundaries of gene therapies, but with that abundance comes the question of where the funding will come from. “But, with this concern comes opportunity,” notes Dr. Dowling. The pressure to find a funding solution may spur creative ideas. New technologies may strive to be cheaper, or perhaps, as Dr. Dowling says, we could even “repatriate these therapies and do them on our own.” Repatriating therapies would mean in-house development that reduces costs and enables more direct accessibility.

Despite these obstacles, Dr. Dowling remains optimistic about the future of congenital myopathy therapies, and rare disease therapies in general. According to him, the past few years of research have shown companies in industry that “rare diseases are both potentially and profitably treatable.” Because of this, more and more companies are getting on board with rare disease research.

Additionally, the findings in the Dowling lab have shown promise for future applications in other, unrelated diseases. Methods and therapies first conceived in the Dowling lab may be directly actionable in other diseases and, in some cases, already have been. CMD was the first muscle disease to undergo gene therapy trials, and, in the wake of their promising results, it was not long before labs studying many other diseases followed suit. “The lessons we’ve gained from our studies are things that can apply to many diseases,” says Dr. Dowling. “It’s turned into something where we’ve done a lot of infrastructure and paradigm building that spans across many diseases.”


1. Cassandrini, D. et al. Congenital myopathies: clinical phenotypes and new diagnostic tools. Ital J Pediatr 43, 101 (2017).

2. Tubridy, N., Fontaine, B. & Eymard, B. Congenital myopathies and congenital muscular dystrophies. Curr Opin Neurol 14, 575–82 (2001).

3. Claeys, K.G. Congenital myopathies: an update. Dev Med Child Neurol 62, 297–302 (2020).

4. Widrick, J.J., Kawahara, G., Alexander, M.S., Beggs, A.H. & Kunkel, L.M. Discovery of Novel Therapeutics for Muscular Dystrophies using Zebrafish Phenotypic Screens. J Neuromuscul Dis 6, 271–287 (2019).

5. Fabian, L. & Dowling, J.J. Zebrafish Models of LAMA2-Related Congenital Muscular Dystrophy (MDC1A). Front Mol Neurosci 13, 122 (2020).

6. Maani, N. et al. Tamoxifen therapy in a murine model of myotubular myopathy. Nat Commun 9, (2018).

7. Sabha, N. et al. PIK3C2B inhibition improves function and prolongs survival in myotubular myopathy animal models. J Clin Invest 126, 3613–3625 (2016).

8. Zhao, M., Maani, N. & Dowling, J.J. Dynamin 2 (DNM2) as Cause of, and Modifier for, Human Neuromuscular Disease. Neurotherapeutics 15, 966–975 (2018).

9. Coller, B.S. & Califf, R.M. Traversing the Valley of Death: A Guide to Assessing Prospects for Translational Success. Sci Transl Med 1, 10cm9 (2009).