Linking protein aggregation and structural stability to predict pathogenic MYH7 variants via machine learning

The MYH7 gene encodes slow/β-myosin heavy chain (MyHC I), the principal myosin isoform in both type-I skeletal muscle fibers and ventricular cardiomyocytes (Naderi et al., 2023, Viswanathan et al., 2017). Together with two light chains, this contractile protein forms the motor complex of the sarcomere, the main contractile unit of the muscle cell. The sarcomere contains many other proteins, but MYH7 is one of the largest and plays a central role in sarcomere assembly, three-dimensional organization, and function. Consequently, changes in the structure and function of MYH7 give rise to a range of MYH7-associated diseases that significantly impair both cardiac and skeletal muscle function.

The MYH7 molecule consists of a globular N-terminal head (S1 region) that harbors ATPase activity, an S2 region serving as a neck/flexor, and a long C-terminal rod region that enables dimerization through its α-helical coiled-coil structure (Fig. 1). The coiled coil follows a seven-residue repeat pattern (abcdefg)n throughout most of its sequence, with hydrophobic residues typically occupying positions a and d and polar residues at the remaining positions. These distinct regions of MYH7 contribute to its different functions, and alterations in each region can therefore lead to diverse human diseases and phenotypes. Most pathogenic MYH7 variants affecting the myosin head cause hypertrophic or dilated cardiomyopathy through altered ATPase activity and abnormal interaction with the actin-troponin complex, which in turn result in abnormal sarcomeric contractility and Ca2+ sensitivity. In contrast, MYH7 mutations in the distal rod domain can lead to myosin storage myopathy (MSM) and other myopathic disorders through abnormal protein aggregation, destabilization of protein–protein interactions, and formation of high-molecular-weight protein inclusions, in which mutant myosin accumulates as subsarcolemmal “hyaline” bodies in type-I fibers (Naderi et al., 2023, Viswanathan et al., 2017). However, these mutations remain poorly characterized compared with those affecting the myosin head. The molecular mechanisms of MYH7-related cardiomyopathies are being studied intensively, and this work has already led to new therapeutic approaches and personalized drugs (Braunwald et al., 2023); much less is known about MYH7-related MSM and other MYH7-related skeletal muscle disorders. In a subset of muscular disorders associated with the α-helical coiled-coil region of MYH7, a cardiomyopathy phenotype has also been observed (Fiorillo et al., 2016).
Thus, structural changes in MYH7 that lead to insoluble protein aggregates can underlie both skeletal myopathies and cardiomyopathies. This aggregation mechanism has been greatly underappreciated, and far fewer structural and functional studies address MYH7 aggregate formation in cardiac and skeletal muscle phenotypes than address S1-related mutations.
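The heptad pattern described above can be illustrated with a minimal sketch that labels residues with positions a to g and measures how often the core positions a and d are hydrophobic. The residue set treated as hydrophobic, the function names, and the toy sequence are assumptions made for illustration only; the sequence is not taken from MYH7, and this is not the feature extraction used in RDSM-MYH7.

```python
# Assumed set of residues counted as hydrophobic (illustrative choice).
HYDROPHOBIC = set("AILMFVWY")
HEPTAD = "abcdefg"


def heptad_positions(seq, start_register="a"):
    """Label each residue a-g, assuming an uninterrupted heptad repeat."""
    offset = HEPTAD.index(start_register)
    return [HEPTAD[(offset + i) % 7] for i in range(len(seq))]


def core_hydrophobic_fraction(seq, start_register="a"):
    """Fraction of a/d positions occupied by hydrophobic residues."""
    labels = heptad_positions(seq, start_register)
    core = [res for res, lab in zip(seq, labels) if lab in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq):
    """Register (label of the first residue) that maximizes the a/d hydrophobic fraction."""
    return max(HEPTAD, key=lambda r: core_hydrophobic_fraction(seq, r))


# Hypothetical two-heptad toy sequence with leucine at every a and d position.
toy = "LKELEDKLKELEDK"
print(heptad_positions(toy[:7]))
print(core_hydrophobic_fraction(toy), best_register(toy))
```

A substitution that places a polar or helix-breaking residue at an a or d position lowers this fraction, which is one simple way to see why core positions of the rod are sensitive to mutation. Real coiled coils contain interruptions of the repeat (the text notes that the pattern holds for most, not all, of the sequence), so a fixed register is only a rough approximation.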

Most MSM-causing variants reported to date cluster in the rod-domain region of MYH7 called light meromyosin (Dye et al., 2006, Naderi et al., 2023). Armel and Leinwand first identified four such rod-domain mutations (L1793P, R1845W, E1886K, H1901L) that disrupt filament assembly and cause myosin storage myopathy (Armel and Leinwand, 2010). Subsequent studies detected additional rod variants; for example, an X1936Wfs stop-loss mutation and an in-frame K1784 deletion each caused MSM (Stalpers et al., 2011). These mutations are located in exons 37–40 and often affect the assembly-competence domain, which is critical for coiled-coil dimerization. MSM is typically autosomal dominant, but rare homozygous mutations (e.g., E1886K) have also been described that lead to severe recessive disease with early-onset cardiomyopathy (Armel and Leinwand, 2010).

As sequencing of individual genomes and of selected genes continues to expand, the resulting data increasingly outpace studies that link genetic variants to specific pathologies. Developing in silico prediction methods capable of associating mutations with disease has therefore become increasingly important. Previously, we developed and applied a machine learning method that primarily used data derived from the known three-dimensional (3D) tetramer structure of transthyretin (TTR) to detect potentially pathogenic variants leading to cardiomyopathy (Pyankov et al., 2025). The effectiveness of this approach depends on the availability and quality of structural data for the analyzed protein. In this study, we applied a similar approach to MYH7, using structural information obtained both from known 3D structures and from state-of-the-art modeling methods. As a result, we developed a machine learning-based predictor, RDSM-MYH7, designed to assess the pathogenicity of MYH7 mutations. Benchmarking analyses demonstrated that RDSM-MYH7 outperforms existing computational tools in predicting mutation impact. The predictor can be applied to individual gene sequencing data to identify variants associated with MYH7-related myosin storage myopathy.
