Here, we present a method to estimate the proportion of skeletal muscle fiber types from frozen samples allowing for a larger number of samples to be measured in a standardized, cost, and labor-efficient way. Skeletal muscle fiber type distribution is a determinant of physical performance [27,28,29,30] and overall health [31,32,33,34,35] and is highly heritable in humans [11, 12]. For example, a reduced proportion of oxidative slow-twitch type I fibers is associated with lower insulin sensitivity in the diabetic muscle [31,32,33,34] and muscle atrophy, e.g., age-related sarcopenia is progressing in a fiber type-specific manner . Recently, it has also been shown that skeletal muscle response and recovery from exercise training is dependent on fiber type composition and is thus an important factor to consider in the development of individualized training advice [36,37,38].
However, many of the abovementioned associations with fiber type distribution are based on low-powered studies. One of the limiting factors is that the methodology for determining fiber type distribution is labor-intensive and thus relatively expensive. The method presented here allows for a larger number of samples to be analyzed using totRNAseq in a standardized and relatively fast way. With this new method, the distribution of type I versus type II fibers can be estimated, but it cannot distinguish between type IIa and type IIx fibers. However, this is achievable with only a small amount of muscle tissue (~ 5–10 mg), e.g., from sampling with the minimally invasive microbiopsy technique . The deconvolution inference of fiber type composition was accurate even for very low sequencing depths, i.e., down to an average of ~ 10,000 PE reads. This means that a shallow-coverage totRNAseq experiment (or targeted RNAseq) will be sufficient to accurately estimate skeletal muscle fiber type composition at a low cost per sample (< 1 dollar/sample). This new method consequently allows for the measurement of fiber type distribution of a larger number of samples and makes it feasible to study the connection between fiber type distribution and health with well-powered studies. It can also be used for estimating fiber type distribution in public repositories of totRNAseq data (e.g., Genotype-Tissue Expression project) and perform in silico analyses of fiber type associations. In conclusion, totRNAseq can efficiently be used to estimate skeletal muscle fiber type distributions of frozen samples.
Materials and methods
snRNAseq data generation
For nuclei isolation from frozen tissue, all following steps were performed on ice with precooled buffers and centrifugation steps were performed at 4 °C. The tissue was disrupted and nuclei liberated through dounce homogenization in ice-cold homogenization buffer (0.32 M sucrose, 3 mM CaCl2, 2 mM magnesium acetate, 0.1 mM EDTA, 1 mM DTT, 10 mg/ml BSA, 10 mM Tris–HCL, protease inhibitors (Sigma-Aldrich)) in the presence of 0.1% NP40. The nuclei suspensions were sequentially passed through 40-, 30-, and 20-μm cell filters (Miltenyi Biotec) and centrifuged at 1000 g for 10 min. The pellet was resuspended in 1% RNase-free BSA in PBS and stained using DAPI (1:1000, BD Pharmingen). Intact single nuclei were sorted in bulk using the DAPI-positive event population, at single-cell sort precision and using a 100-µm nozzle (BD FACS AriaIII) into PBS/1% BSA. Nuclei were counted and loaded on a 10 × chromium microfluidic chip, aiming for the maximum possible number of nuclei to be targeted obtained from the sorting. Single-nucleus experiments were performed using the 10 × genomics single cell 3′ kit v.2 to encapsulate nuclei along with barcode tagged beads, generate, and amplify cDNA and to generate sequencing libraries. Each pooled library was barcoded using i7 barcodes provided by 10 × Genomics. cDNA and sequencing library quality and quantity were determined using Agilent’s High Sensitivity DNA Assay. Libraries were pooled and sequenced in 150-bp paired-end mode on Illumina’s HiSeq platform.
snRNAseq data processing and analysis
Post-processing pipeline Cell Ranger (https://support.10xgenomics.com/single-cell-gene-expression/software/) provided by 10X Genomics was used for demultiplexing, alignment, filtering, barcode, and unique molecular identifier (UMI) counting. The pipeline produced files in FASTQ and BAM formats, as well as the matrix of UMI counts. We used Seurat workflow for further quality control and downstream analysis of the snRNAseq gene expression data. The initial data set contained 2937 nuclei and 22,336 expressed genes. Nuclei with a high fraction of their counts coming from mitochondrial and ribosomal genes were removed. Next, genes with at least one UMI count present in at least one nucleus were selected. After these steps, 2699 nuclei and 22,084 genes were included in the downstream analysis. We normalized the gene expression data with the LogNormalize method of Seurat and standardized the count values prior to performing the Principal Component Analysis (Supplementary Fig. 3). JackStraw procedure  was applied as a denoising step in order to select an optimal number of Principal Components (PCs), indicating that 3 PCs to keep for further downstream analysis, which can be viewed as identifying the true intrinsic dimensionality of the snRNAseq data. Further, the number of cells predicted to be proliferative was investigated using a list of the genes annotated as functioning in the cell cycle according to . The vast majority of cells were not detected to be proliferating and the cycling cells did not form any separate cluster (Supplementary Fig. 4a-b). Graph-based clustering based on Louvain modularity optimization  with resolution parameter equal to 0.1 was used for detecting boundaries between different populations of nuclei, and tSNE non-linear dimension reduction with perplexity 50, that was chosen as a square root of the number of nuclei according to the k-nearest neighbors rule of thumb, was applied for visualization of the nuclei populations (Fig. 1a). The snRNAseq data was uploaded to Gene Expression Omnibus (GEO) and is available under accession number GSE190489.
Deconvolution analysis  was performed using the snRNAseq data as a reference. For this purpose, the snRNAseq data was used to produce cluster signatures, which are marker gene expressions averaged across cells from each cluster. After that, the gene expression of each type of the skeletal muscle fibers in a bulk RNAseq sample was inferred via linear matrix decomposition . The deconvolution analysis was performed using DeconRNASeq R/Bioconductor package, and the results are presented in Fig. 2a, b.
The MSAT cohort: 39 Swedish male subjects (Table 1) were enrolled in the study by advertising on social media and through local cycling clubs. Inclusion criteria were as follows: (1) male, (2) healthy, not on any medications, and (3) age range between 20 and 55. Subjects were given both oral and written information about the experimental procedures before giving their written informed consent. Each participant went through three visits at different time points. All subjects completed all three visits. The first visit involved a regular doctor’s examination with blood samples and measuring anthropometric characteristics. The second visit consisted, after an overnight fast, of a Wingate test followed by muscle biopsy and VO2max was measured during the third and last visit. The study was approved by the local Ethics committee, Lund University (Dnr 2015/593). For determination of peak anaerobic power (Wingate) and VO2max, subjects were instructed to perform only easy training 48 h prior to each test. To determine peak anaerobic power, a 30-s all-out Wingate test  was conducted on a cycle ergometer (Monark Peak power). Before the test, a 5-min low intensity ~ 150w warm-up, with instructions to perform a 5 s high cadence drill each minute was performed. The test started with the subject pedaling as fast as possible. When a cadence of 120 rpm was reached, a braking resistance equivalent to 0.7 N × m × kg−1 was applied to the freewheel and remained constant during the 30 s. Subjects were instructed to sit down throughout the test. Strong verbal encouragement was given throughout to ensure a maximal effort was provided. An incremental test to exhaustion was performed to determine VO2max. The test started with 3 min of cycling at 3 W × kg−1 (rounded down to nearest 10 W) and then increased by 35 W every 2 min until voluntary exhaustion or failure to maintain ≥ 60 rpm. Strong verbal encouragement was given throughout. VO2 was measured using an Oxycon Pro (Jaeger GmbH, Germany) with a mixing chamber and a 30 s sampling time. Gas sensors were calibrated according to instructions by the vendor before every test. Maximal oxygen uptake was determined as the mean of the two highest values attained during exercise from any 30-s period.
The Juntendo cohort: 23 Japanese subjects (Table 2) were recruited to examine the associations between RNA expression profiles and muscle fiber composition in the Japanese population. All subjects gave their signed informed consent before inclusion in the study. The study protocols were approved by the Ethics Committees of Juntendo University and were performed in accordance with the Declaration of Helsinki. No additional tests were performed in this cohort.
Muscle biopsies and histology
In the MSAT cohort (Table 1), muscle biopsies were taken from the vastus lateralis muscle under sterile conditions and local anesthesia (1% lidocaine) by using a 5-mm Bergström needle and frozen in liquid nitrogen. The biopsies were taken within 5 min after the Wingate test. Serial Sects. (10 μm) were cut using a cryostat at − 20 °C. Myofibrillar ATPase histochemistry was performed by preincubation at pH 4.4, 4.6, and 10.3 to identify fiber types ; the proportion of fiber types (i.e., type I, IIa, or IIx) were calculated as the number of each fiber type, divided by the total number of fibers in the section. Computer image analysis was performed using image analysis equipment (BioPix IQ 2.0.16 software, BioPix AB, Sweden).
In the Juntendo cohort (Table 2), muscle biopsies were taken from the vastus lateralis muscle under sterile conditions and local anesthesia (1% lidocaine) by using a disposal needle biopsy instrument (Max Core; C. R. Bard, Covington, GA). The biopsies were collected from approximately 15 cm above the patella in both legs of each subject under ultrasound imaging (Noblus; Aloka, Tokyo, Japan) and avoided the inclusion of subcutaneous fat and the subfascial and myotendinous parts as far as possible. In addition, any visible non-muscle tissues (e.g., adipose tissue) were removed from the biopsy samples. Samples were frozen immediately in liquid nitrogen and stored at − 80℃ until further analysis. Myosin heavy chain (MyHC) protein isoforms were assessed as markers of muscle fiber composition. Frozen muscle samples were homogenized in ice-cold lysis buffer [50 mM HEPES (pH 7.4), 10 mM EDTA, 4 mM EGTA, 50 mM β-glycerophosphate, 25 mM NaF, 5 mM Na3VO4], containing a phosphatase inhibitor (PhosSTOP tablet; Roche Diagnostics, Indianapolis, IN), and a protease inhibitor (Complete tablet; Roche Diagnostics). The lysates obtained were centrifuged at 10,000 g for 10 min at 4℃. An insoluble pellet, obtained after centrifugation, was suspended in a sufficient volume of SDS sample buffer [30% glycerol, 5% β-mercaptoethanol, 2.3% SDS, 0.05% bromophenol blue, and 62.5 mM Tris–HCl (pH 6.8)] and boiled at 95℃ for 5 min. MyHC composition was determined by glycerol SDS-PAGE, according to Kakigi et al. . Briefly, protein samples were resolved by performing glycerol SDS-PAGE [stacking gel: 4% acrylamide, 34.7% glycerol, and 125 mM Tris–HCl (pH 6.8); separating gel: 8% acrylamide, 33.3% glycerol, and 375 mM Tris–HCl (pH 8.3)]. Electrophoresis was started at 60 V with stacking gel at 8℃. The voltage was set to 150 V and run for 18 h at 8℃ when the tracking dye had entered the separating gel completely. After separation, the gels were stained with Coomassie brilliant blue (Biosafe G250; Bio-Rad Laboratories, Hercules, CA) and rinsed repeatedly with water. Each gel was scanned using a calibrated densitometer (ChemiDoc Touch Imaging System; Bio-Rad Laboratories), and the relative proportion of MyHC-I, MyHC-IIa, and MyHC-IIx were determined using the calibrated densitometer (ChemiDoc Touch Imaging System) and analytical software (Image Laboratory software version 5.2.1; Bio-Rad Laboratories).
RNA extraction and totRNAseq analysis
In the MSAT cohort, RNA was extracted from 25 to 30 mg of muscle biopsies using a TissueLyser II (Qiagen) and the miRneasy Mini Kit (Qiagen). The RNA concentration was determined using a NanoDrop ND-1000 spectrophotometer (A260/A280 > 1.8 and A260/A230 > 1.0 (NanoDrop Technologies, Wilmington, DE, USA). RNA integrity was verified using the 2200 TapeStation instrument (Agilent Technologies, CA, USA), where all samples had an average RNA integrity number (RIN) above 8. In the Juntendo cohort, frozen muscle samples were crushed with 5.0-mm zirconia beads using a Micro Smash MS-100R (Tomy Seiko, Japan) at 3000 rpm twice for 15 s at 2 °C. The total RNA was extracted from muscle samples using TRIzol® Reagent (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s protocol. RNA concentration and purity were checked using a NanoDrop 8000 UV–Vis Spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA). RNA integrity was verified using the 2200 TapeStation instrument (Agilent Technologies, CA, US), where all samples had an average RNA integrity number (RIN) above 8.
All samples from both the MSAT and Juntendo cohorts were sequenced at Lund University using 800 ng input RNA. Library preparation was made using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat Set A (Illumina), and the 75 bp paired-end sequencing was performed on a NextSeq instrument using the NextSeq® 500/550 High Output Kit v2 (150 cycles) (Illumina). The sequencing quality was checked with fastQC v0.11.9 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) and multiQC  v1.9. Gene expression was assessed using Salmon  v1.2.1. Exon expression was obtained by mapping reads with STAR  v2.7.6 and counting with featureCounts  v2.0.1 (options: -p -f -C -O). We used the GRCh38 Ensembl v77  as a reference genome. We used the Seqtk tool (https://github.com/lh3/seqtk) for gradual random downsampling of the MSAT data and applied Salmon for quantifying gene expression of the downsampled RNAseq data. The quantified gene expression for each downsampling iteration was normalized with TMM , and deconvolution analysis was performed using the gene markers identified for slow- and fast-twitch human clusters in the snRNAseq data as described above. We computed the Spearman correlation coefficient and mean square deviation between predicted and true fiber type composition for each downsampling iteration.