The researchers used Paraphase on more than 400 samples of spinal muscular atrophy (SMA) comprising 5 ethnicities.
The authors of a new study have developed an approach for comprehensive SMN1 and SMN2 profiling in spinal muscular atrophy (SMA) analysis, which they say may be able to identify more pathogenic variants and silent carriers than copy-number testing alone.
Their findings appeared in The American Journal of Human Genetics.
The informatics method, called Paraphase, uses PacBio HiFi data to detect full-length SMN1 and SMN2 haplotypes, determine the gene copy numbers, and call phased variants.
“Here we provide the most comprehensive analysis of variation in one of the most difficult, clinically important regions of the human genome. Extending beyond copy-number testing based primarily on c.840C>T as is often done, Paraphase phases the region to provide a much richer level of information,” explained the researchers, noting that testing based on c.840C>T alone often fails to identify the approximately 1% to 2% of carriers with pathogenic variants other than c.840C>T. “Using the phasing information, Paraphase can detect other pathogenic variants and enable haplotype-based screening of silent carriers.”
The researchers used Paraphase on more than 400 samples comprising 5 ethnicities. Notably, they identified 2 SMN1 haplotypes that together form a 2-copy SMN1 allele, which makes up over two-thirds of 2-copy SMN1 alleles in African populations. A positive result for these 2 haplotypes resulted in a silent carrier risk of 88.5% in the 87 African alleles with 2 copies of SMN1, significantly higher than the 1.7% to 3% associated with the currently used marker.
Through Paraphase, the group also observed cosegregation patterns. For example, an SMN1 haplogroup often cosegregates with the SMN2 haplogroup that is most similar to it in sequence. This, said the researchers, indicates that the evolution of this region is largely attributable to intrachromosomal gene conversion between SMN1 and SMN2.
The researchers acknowledged the relatively small number of samples included in their study, noting that larger sample sizes will be needed to produce more statistically robust findings.
“With larger sample datasets enabling more accurate allele frequency calculations, it should be possible to build a probabilistic model to predict the most likely allele/genotype configurations based on the haplotypes seen in an individual. This would be very helpful for silent carrier detection,” they concluded. “For example, an individual with S1-8, S1-9d, and S2-1 haplotypes is very likely a silent carrier, as S1-8 and S1-9d rarely exist as singleton SMN1 alleles and S2-1 rarely segregates with S1-8 or S1-9d. For an individual with these haplotypes, the most likely alleles are 2 copies of SMN1 (S1-8+S1-9d) with no SMN2 on 1 allele and 1 copy of SMN2 (S2-1) with no SMN1 on the other allele.”
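To make the idea of such a probabilistic model concrete, here is a minimal Python sketch of how the haplotypes seen in one individual might be scored against allele frequencies. It is not the researchers' method or part of Paraphase. The allele frequencies, the fallback frequency for unlisted alleles, the assumption that the 2 chromosomes are drawn independently, and all function names are invented for illustration. The only things taken from the article are the haplotype labels and the convention that S1 haplotypes are SMN1 copies and S2 haplotypes are SMN2 copies.

```python
from itertools import product


def allele(*haplotypes):
    """An allele is the sorted tuple of haplotypes carried on one chromosome."""
    return tuple(sorted(haplotypes))


# HYPOTHETICAL allele frequencies, invented for this sketch only.
# They are chosen to mirror the qualitative statements in the quote:
# S1-8 and S1-9d are rare as singletons, and S2-1 rarely sits with either.
ALLELE_FREQ = {
    allele("S1-8", "S1-9d"): 0.020,  # 2 copies of SMN1, no SMN2
    allele("S2-1"): 0.050,           # 1 copy of SMN2, no SMN1
    allele("S1-8"): 0.001,           # singleton SMN1 alleles
    allele("S1-9d"): 0.001,
    allele("S1-8", "S2-1"): 0.002,   # SMN1 plus SMN2 on one chromosome
    allele("S1-9d", "S2-1"): 0.002,
}
UNSEEN_FREQ = 1e-4  # hypothetical fallback for alleles not listed above


def smn1_copies(a):
    """Count SMN1 copies on an allele (haplotype labels starting with 'S1-')."""
    return sum(h.startswith("S1-") for h in a)


def configurations(haplotypes):
    """Enumerate every way to split the haplotypes across 2 chromosomes."""
    haps = list(haplotypes)
    seen = set()
    for mask in product((0, 1), repeat=len(haps)):
        a = allele(*(h for h, m in zip(haps, mask) if m == 0))
        b = allele(*(h for h, m in zip(haps, mask) if m == 1))
        seen.add(tuple(sorted((a, b))))  # unordered pair of alleles
    return sorted(seen)


def posterior(haplotypes, freq=ALLELE_FREQ, unseen=UNSEEN_FREQ):
    """Normalized probability of each configuration.

    Assumes the 2 alleles are drawn independently from the population,
    so a heterozygous pair is weighted by 2 * f(a) * f(b).
    """
    weights = {}
    for a, b in configurations(haplotypes):
        w = freq.get(a, unseen) * freq.get(b, unseen)
        if a != b:
            w *= 2
        weights[(a, b)] = w
    total = sum(weights.values())
    return {cfg: w / total for cfg, w in weights.items()}


def silent_carrier_probability(post):
    """Posterior mass where one chromosome has no SMN1 and the other has 2 or more."""
    return sum(
        p
        for (a, b), p in post.items()
        if min(smn1_copies(a), smn1_copies(b)) == 0
        and max(smn1_copies(a), smn1_copies(b)) >= 2
    )


post = posterior(["S1-8", "S1-9d", "S2-1"])
best = max(post, key=post.get)
print("most likely configuration:", best)
print("silent carrier probability:", round(silent_carrier_probability(post), 3))
```

With these made-up frequencies, the configuration that places S1-8 and S1-9d on one chromosome and S2-1 on the other dominates, which is the reasoning pattern the researchers describe. Real use would require allele frequencies estimated from a large, ethnicity-matched dataset.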
Among the single-copy SMN1 alleles, S1-1 was the most common haplotype for all ethnicities, ranging from a frequency of 29.9% in Africans to 83.3% in East Asians. S1-2 and S1-3 were common in Europeans, South Asians, and Admixed Americans, ranging from 10% to 20%.
Chen X, Harting J, Farrow E, et al. Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing. Am J Hum Genet. 2023;110(2):240-250. doi:10.1016/j.ajhg.2023.01.001