Y2H bait design
Bait design was organized in three successive rounds in which primary, secondary and tertiary baits were selected. Structural and functional domain predictions from TMHMM, SignalP 3.0 or PFAM (PFAM 23.0, release 19/08/2008) were used to exclude hydrophobic trans-membrane domains, signal peptides and transcriptional trans-activation domains from bait constructs. In addition, to favor the identification of novel protein-protein interactions, we usually selected protein regions whose functional role had not been previously documented. For selection of secondary and tertiary bait proteins and for design of their bait sequences, we used additional bibliographic searches and other criteria computed from our Y2H results, such as the Predicted Biological Score (PBS) categories and information from Selected Interacting Domains (SID; see the section “Identification of interacting fragments and scoring of the interactions” below). Bait candidates that came to our attention included targets of choice for therapeutic strategies, such as proteins that participate in signaling pathways, proteins involved in various forms of myopathies or proteins expressed in typical muscle cellular compartments such as the sarcomere or sarcoplasmic reticulum.
Bait cloning and library construction
Bait sequences were PCR-amplified from MRC Gene Service or Invitrogen plasmids or from a random primed cDNA library obtained by reverse transcription of a poly(A) RNA library isolated from adult (Ambion AM7983) or 18-19 week-old fetal (Stratagene #778020) human skeletal muscles. Bait PCR products were cloned in the pB27 plasmid, a plasmid derived from the original pBTM116, as a LexA C-terminal fusion. Plasmid DNA was purified with the QIAprep Spin Miniprep (QIAGEN), verified by full insert sequencing and introduced into the L40ΔGal4 (MATa) bait yeast strain. Alternatively, prey fragments were directly extracted from the prey plasmid and subsequently cloned in pB27 to use them as secondary or tertiary baits.
The prey library in yeast was constructed from adult (Ambion AM7983) and fetal (Stratagene #778020) human skeletal muscle poly(A) RNA. Random-primed cDNA fragments were prepared from these two RNA pools and cloned in the pP6 plasmid derived from the original pACT2 (Clontech) as a C-terminal fusion of the Gal4 transcription activating domain. Altogether, 90% of the plasmids contained a cDNA insert with an average size of 600 bp. After amplification in Escherichia coli (50-100 million independent clones), the Y187 (MATalpha) yeast strain was transformed with an equimolar pool of the adult and fetal cDNA libraries. Ten million independent yeast colonies were then collected, pooled and stored at -80°C as equivalent aliquot fractions of the same prey library. Validation of the prey library was performed by recapitulating several published interactions as described in. Bait proteins belonging to different functional classes were used: a GTPase (Rac1), a transcription factor (TP53), a splicing factor (SF1) and a component of an E3-ligase complex (BTRC).
Screening procedure and identification of prey fragments
Y2H screens were performed using a mating method as described in at the Hybrigenics facility (Hybrigenics, Paris). As a first step, small-scale screenings were performed to assess toxicity and auto-activation capacity of the baits and to adjust the selective pressure of the screens accordingly. In particular, the optimal concentration of 3-aminotriazole (3-AT) was determined prior to performing each large-scale screen. Auto-activating baits, able to activate transcription of the reporter gene by themselves, were identified and excluded from large-scale screenings. Subsequently, each bait clone was tested in a full-size screen against an average of 103 million yeast prey clones, equivalent to ten-fold coverage of the library. All positive clones were picked and the corresponding prey fragments were PCR-amplified and sequenced at their 5′ and 3′ junctions. Sequence contigs were built and identified by comparison to the NCBI Human RefSeq database as described in.
Identification of interacting fragments and scoring of the interactions
Following contig assembly of positive clones, the common sequence shared by the assembled prey fragments was used to define the SID along each prey protein. Furthermore, for each interaction, a PBS was computed with E-values ranging from 0 to 1 to establish six distinct categories: PBS-A to -E (see for details on calculation). The technically most reliable interactions were associated with the PBS-A, -B or -C categories (with P values < 1e-10 for PBS-A, < 1e-5 for PBS-B and < 1e-2.5 for PBS-C) and were found in two reciprocal and independent screens (X->Y and Y->X) and/or in interaction cycles (X-Y, Y-Z and X-Z) and/or in a single screen but with many overlapping prey fragments. Interactions were assigned to the PBS-D category when they were supported either by a single experimental clone from a screen or by several clones bearing the same start and stop positions, the SID being identified by a singleton fragment instead of a family of several overlapping fragments. This PBS-D category corresponds to a heterogeneous group of interactions that could theoretically include technical false positives as well as true positives that are hard to detect with Y2H systems (due to constraints on the three-dimensional conformation of bait or prey domains, toxicity in yeast, poor mRNA representation of the prey in the library, etc.). All PBS-D interactions should therefore be considered putative unless validated by a second technique. The PBS-E category characterizes SIDs that have been found as prey in more than ten independent screens with unrelated bait proteins among all screenings performed with human libraries at the Hybrigenics facility. These interactions may represent false positives of the Y2H system as well as interactions with proteins known to be highly connected due to their biological function or with proteins containing a biochemically promiscuous motif.
Finally, interactions with proteins or domains corresponding to known false positives of the Y2H system, as described above, were removed from the data and from our analyses. Examples of yeast growth assays illustrating interactions from the different PBS categories, obtained with the same experimental procedures, can be found in [16–18].
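The SID definition and the category thresholds described above can be illustrated with a minimal Python sketch. It is not the Hybrigenics scoring pipeline: the function names, the coordinate convention (inclusive start and stop positions on the prey protein), the order in which the rules are applied and the fall-through to PBS-D for scores above the PBS-C cutoff are all assumptions made for illustration. The PBS value itself is taken as an input, since its calculation is not detailed here.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start, stop) of a prey fragment on the prey
# protein, both inclusive.
Fragment = Tuple[int, int]


def shared_region(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the region common to all fragments (a SID-like interval),
    or None if the fragments do not all overlap."""
    if not fragments:
        return None
    start = max(f[0] for f in fragments)
    stop = min(f[1] for f in fragments)
    return (start, stop) if start <= stop else None


def pbs_category(score: float, fragments: List[Fragment],
                 n_unrelated_screens: int = 0) -> str:
    """Assign a PBS-like category from a precomputed score.

    Assumed rule order: E first, then D, then the A/B/C thresholds.
    """
    # PBS-E: domain found as prey in more than ten independent screens
    # with unrelated baits.
    if n_unrelated_screens > 10:
        return "E"
    # PBS-D: a single clone, or several clones with identical start/stop.
    if len(set(fragments)) <= 1:
        return "D"
    if score < 1e-10:
        return "A"
    if score < 1e-5:
        return "B"
    if score < 10 ** -2.5:
        return "C"
    # Assumption: anything weaker than the PBS-C cutoff is treated as D.
    return "D"


fragments = [(100, 300), (150, 350), (120, 280)]
print(shared_region(fragments))        # (150, 280)
print(pbs_category(1e-7, fragments))   # B
```

Taking the maximum start and minimum stop is simply the interval intersection, which matches the idea of a sequence common to all assembled prey fragments.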
The antibodies used for immunoprecipitation of the baits are BD Biosciences anti-TCAP (T26820-050), Novocastra Laboratories Ltd anti-DYSF (NCL-Hamlet) and Santa Cruz Biotechnology anti-ABI1 (sc-30038), anti-ACTN2 (sc-15335), anti-DES (sc-14026), anti-MYOM1 (sc-30390) and anti-TCAP (sc-8725).
The antibodies used for prey detection by western blot are Abcam anti-SNAPIN (ab37496), Abnova anti-ADPGK (H00083440-M01), anti-APPL1 (H00026060-A01) and anti-ENO1 (H00002023-M04), Aviva anti-KBTBD10 (ARP38732_T100) and Santa Cruz Biotechnology anti-KIF1B (sc-28540) and anti-KTN1 (sc-33562).
The antibodies used for immunochemistry and Duolink assays and their corresponding dilutions are: Abcam anti-CMYA5 (ab75351, 1:50) and anti-OPTN (ab23666, 1:100), Abgent anti-DGKD (AP8126b, 1:50), Abnova anti-DNAJB6 (H00010049-M01, 1:100) and anti-EEF1G (H00001937-M01, 1:50), Novocastra Laboratories Ltd anti-DYSF (NCL-Hamlet, 1:20), Proteintech Group anti-SNAPIN (10055-1 AP, 1:50), Santa Cruz Biotechnology anti-ACTN2 (sc-15335, 1:100), anti-ALMS1 (sc-54507, 1:50), anti-APPL1 (sc-67402, 1:50), anti-DES (sc-14026, 1:100), anti-FLNC (sc-48495, 1:100), anti-KIF1B (sc-28540, 1:50), anti-MYOM1 (sc-30390, 1:100), anti-MYOM2 (sc-50435, 1:200) and anti-NEB (sc-28286, 1:100) and Sigma anti-NPHP3 (HPA009150, 1:75).
The bait proteins were isolated from R9 cell extracts (a gift from Dr. Anne Galy, Inserm U790, Evry, France) at myoblast or myotube stage (7 to 10 days of differentiation) or from gastrocnemius muscle excised from four-week-old mice and homogenized in 6 ml lysis buffer (Tris 20 mM, pH 7.5, NaCl 50 mM, EGTA 2 mM, Triton 1%, Protease Inhibitor Cocktail (Complete mini, Roche), E64 2 μM) using a FastPrep-24 apparatus (MP Biomedicals). The mouse samples were obtained under a protocol approved by Genethon’s ethics committee under the number CE11_014 and performed in accordance with the directive of 24 November 1986 (86/609/EEC) of the Council of the European Communities. After centrifugation of the lysates, 500 μg to 1 mg of proteins in 1 ml were incubated with 30 μl of protein G-Sepharose beads (Amersham) for 1 h to pre-clear nonspecific binders. The protein extract was then subjected to immunoprecipitation by 1 h incubation at +4°C with 2 to 4 μg of primary antibodies corresponding to the baits, then 30 μl of protein G-Sepharose beads (Amersham) were added and incubation was carried out for 2 h or overnight at +4°C.
After centrifugation at 1000 g for 5 min, the immunocomplex was washed three times with 1 ml of buffer and resuspended in 15 μl 4x NuPAGE LDS sample buffer (Invitrogen) and dithiothreitol reducing agent. Samples were then heated at +70°C for 10 min and centrifuged briefly. Protein complexes were separated by SDS-PAGE on NuPAGE 4-12% Bis-Tris gels (Invitrogen). Proteins were transferred to a PVDF membrane and the transfer was verified by staining with Ponceau red. Immunostainings were performed with primary antibodies corresponding to the preys and IRD-680 or 800 donkey anti-mouse, -rabbit or -goat as secondary antibodies according to LI-COR’s protocol. Bands were then visualized with the Odyssey infra-red imaging system (LI-COR Biosciences) at 700 nm (red) and 800 nm (green).
Indirect immunofluorescence microscopy assays were carried out on transversal cryosections prepared from a biopsy of normal human paravertebral striated muscle from a 13-year-old female, obtained from the biobank Myobank (Institute of Myology, Paris) under the validation number AC-2008-87 from the French ministry of research. The sample was treated anonymously. Frozen slides were air-dried for 30 min at room temperature, fixed with 4% paraformaldehyde (PFA) for 5 min, washed 3 × 5 min in PBS, incubated in a blocking buffer (4% BSA, 0.02% Triton) for 30 min, washed in PBS, then incubated with a biotin blocking solution (Vector Laboratories, SP-2001) for 15 min and washed in PBS for 5 min. Slides were stained at room temperature for 1 h or at +4°C overnight with primary antibodies diluted in the labeling solution (1% BSA / PBS). Slides were then incubated with a donkey anti-mouse-Alexa 488 for dysferlin and a donkey (anti-rabbit or anti-goat) biotinylated secondary antibody for its partner (dilution 1:1000) for 45 min, washed 3 × 5 min in PBS and stained with streptavidin coupled to Alexa-594 (Molecular Probes, dilution 1:500 in PBS) for 30 min. For nucleic acid staining, slides were then incubated with TOPRO-3 (Molecular Probes, dilution 1:2000) for 5 min, washed 2 × 5 min in PBS and once in water for 2 min. Slides were subsequently mounted in Fluoromount-G™ (SouthernBiotech, 0100-01). Images were acquired using the 40x or 63x objective of a Zeiss Axiovert 100M LSM 510 Meta laser scanning confocal microscope and the manufacturer's software. Colocalization analyses were performed by statistical analysis of the correlation between the intensity values of red and green pixels in a dual-channel image. The JACoP plug-in for ImageJ (Rasband, W.S., ImageJ, U. S. National Institutes of Health, Bethesda, Maryland, USA, http://imagej.nih.gov/ij/, 1997-2011) was used to calculate Pearson’s correlation coefficient. Co-localization was defined as strong for 0.5<R≤1, medium for 0.25<R≤0.5 and low for R≤0.25.
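The colocalization criterion can be made concrete with a short Python sketch. It is a plain re-implementation of Pearson's correlation coefficient on paired pixel intensities, not the JACoP plug-in itself; the function names and the flat-list representation of the two channels are hypothetical, and only the three R intervals come from the text.

```python
import math
from typing import Sequence


def pearson_r(red: Sequence[float], green: Sequence[float]) -> float:
    """Pearson's correlation between paired red and green pixel intensities."""
    n = len(red)
    if n != len(green) or n < 2:
        raise ValueError("channels must have the same length (at least 2 pixels)")
    mean_red = sum(red) / n
    mean_green = sum(green) / n
    cov = sum((r - mean_red) * (g - mean_green) for r, g in zip(red, green))
    var_red = sum((r - mean_red) ** 2 for r in red)
    var_green = sum((g - mean_green) ** 2 for g in green)
    if var_red == 0 or var_green == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return cov / math.sqrt(var_red * var_green)


def colocalization_class(r: float) -> str:
    """Map R to the classes used in the text: strong, medium or low."""
    if r > 0.5:
        return "strong"
    if r > 0.25:
        return "medium"
    return "low"


# Toy intensities for four pixels (illustrative values, not measured data).
red_channel = [10, 20, 30, 40]
green_channel = [12, 18, 33, 41]
r = pearson_r(red_channel, green_channel)
print(colocalization_class(r))
```

For a real image, the two channels would first be flattened to one intensity value per pixel, optionally restricted to a region of interest.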
Proximity ligation assays
The Duolink® kit (Olink Bioscience) is based on the use of two unique and bi-functional probes called PLA™, each probe consisting of a secondary antibody attached to a unique synthetic oligonucleotide that acts as a reporter. After a 10 min fixation with 4% paraformaldehyde and a blocking step (5% BSA in PBS), muscle sections were stained overnight at +4°C with one or two primary antibodies depending on the experiment (single protein detection or detection of interacting proteins). After washing, the sections were incubated with the secondary oligonucleotide-linked antibodies (PLA probes) provided in the kit. The oligonucleotides bound to the antibodies were hybridized, ligated, amplified, and detected using a fluorescent probe (Detection Kit 563). Dots were detected with the Zeiss laser scanning confocal microscope and the signal was quantified using ImageJ software (http://imagej.nih.gov/ij/). A series of controls was performed for each analysis (bait antibody only, prey antibody only and a negative control in which the primary antibody was omitted).
For quantification, three images were acquired under the same conditions (laser power, PMT gain and pinhole) for each experiment. For each image, five fibers were randomly selected and used to count all positive spots within each compartment (15 fibers in total). The regions of interest (ROI) for the membrane and cytoplasm compartments were delineated manually and separately, and signal quantification was performed on all identified spots using the ImageJ software. For each compartment, we considered that the PPI was validated by the assay when the mean signal ratio between the PLA images of the PPI (“PPI signal”) and the control images of the prey (“PREY signal”) was greater than 0.2, indicating that the interaction potentially recruited more than 20% of the interacting partner in the delimited compartment.
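The validation rule can be sketched in a few lines of Python. The text does not say whether the "mean signal ratio" is a ratio of means or a mean of per-fiber ratios; the sketch below assumes a ratio of the mean PPI signal to the mean PREY signal within one compartment. The function name and the example values are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def ppi_validated(ppi_signals: Sequence[float],
                  prey_signals: Sequence[float],
                  threshold: float = 0.2) -> Tuple[float, bool]:
    """Return (ratio, validated) for one compartment.

    ppi_signals:  PLA signal per fiber for the bait + prey antibody pair.
    prey_signals: PLA signal per fiber for the prey-antibody-only control.
    Assumption: ratio = mean(PPI signal) / mean(PREY signal).
    """
    prey_mean = mean(prey_signals)
    if prey_mean == 0:
        raise ValueError("PREY control signal is zero; ratio is undefined")
    ratio = mean(ppi_signals) / prey_mean
    return ratio, ratio > threshold


# Illustrative values for three fibers in the membrane compartment.
ratio, ok = ppi_validated([30, 25, 35], [100, 110, 90])
print(round(ratio, 2), ok)  # 0.3 True
```

The same function would be called once per compartment (membrane, cytoplasm), since the text applies the 0.2 cutoff to each compartment separately.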
Bioinformatics and statistical analyses
IpScan with Interpro 17.0 was used to annotate the protein sequences. The SID coordinates were compared with the positions of the different Interpro domains. Cytoscape tools (http://www.cytoscape.org) were used to infer connectivity, a parameter that indicates the number of proteins that directly interact with a given protein. Comparison of PPIs identified by our Y2H screenings with previously published PPIs was performed using the iRefWeb interface (http://wodaklab.org/iRefWeb/) by considering direct interactions found in mammals.
GO mapping and clustering were performed with the DAVID 6.7 web interface [23, 24] using the Functional Annotation Clustering tool and the GOTERM_FAT annotation categories in order to filter out the broadest GO terms. For the LGMD-centered dataset, a list of official gene symbols was used to identify the proteins. Within each identified GO cluster, GO terms were analyzed in terms of hierarchy to identify the most specialized children terms common to all proteins within the cluster, and these terms were reported as “shared GO” annotations (Additional file 1: Table S1). To analyze whether our datasets were statistically different from a random dataset, GO clustering was also performed with a list of Uniprot accessions for the 19220 human protein-coding genes (HGNC, http://www.genenames.org/). The number (Shared-i) of human proteins with which a given bait protein (Bait-i) shared a GO cluster was calculated for all three GO classes (BP, MF and CC) and all baits, and the number of protein pairs not sharing a GO cluster was deduced (NonShared-i = 19220 - Shared-i). The overall expected frequencies of shared and non-shared protein pairs were calculated as the ratio between the sum of Shared-i and the sum of all pairs (76 × 19220), and the ratio between the sum of NonShared-i and the sum of all pairs, respectively. A chi-squared test (P<0.05) was used to compare expected values with observed values from the LGMD-centered dataset or the subset consisting of all PBS-A to -C categories.
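The expected frequencies and the comparison with observed counts can be sketched as follows. This is one possible reading of the procedure, namely a chi-squared goodness-of-fit test with one degree of freedom; the function names and example numbers are hypothetical, and the p-value uses the identity that for one degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).

```python
import math
from typing import Sequence, Tuple

N_PROTEINS = 19220  # human protein-coding genes, as stated in the text


def expected_shared_frequency(shared_counts: Sequence[int],
                              n_proteins: int = N_PROTEINS) -> float:
    """Sum of Shared-i over all baits, divided by all (bait, protein) pairs."""
    total_pairs = len(shared_counts) * n_proteins
    return sum(shared_counts) / total_pairs


def chi2_goodness_of_fit(obs_shared: int, obs_nonshared: int,
                         f_shared: float) -> Tuple[float, float]:
    """Chi-squared statistic and p-value (1 degree of freedom) comparing
    observed shared / non-shared pair counts with the expected frequency."""
    n = obs_shared + obs_nonshared
    exp_shared = n * f_shared
    exp_nonshared = n * (1.0 - f_shared)
    chi2 = ((obs_shared - exp_shared) ** 2 / exp_shared
            + (obs_nonshared - exp_nonshared) ** 2 / exp_nonshared)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy example with two baits (the study used 76); counts are made up.
f = expected_shared_frequency([1922, 3844])   # approximately 0.15
chi2, p = chi2_goodness_of_fit(obs_shared=30, obs_nonshared=70, f_shared=f)
print(p < 0.05)  # True
```

The non-shared frequency is simply one minus the shared frequency, since NonShared-i = 19220 - Shared-i for every bait.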
GO enrichment analyses were performed using the DAVID Functional Annotation tool with Uniprot accession numbers as identifiers, the Homo sapiens background and the GOTERM_FAT annotation categories. Enrichment at 1% significance level was defined with a modified Fisher exact P value (the “EASE” score) as recommended by the DAVID interface.
Statistical analysis of the proportions obtained in the other analyses was done using the Fisher's exact test function (fisher.test) in R.
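For readers who want to see what such a test computes for a 2 × 2 table of counts, here is a minimal pure-Python sketch of a two-sided Fisher's exact test. It is not the R code used for the analyses; the function name is hypothetical, and the two-sided rule (summing the probabilities of all tables at most as likely as the observed one, with a small relative tolerance) follows the convention documented for R's fisher.test.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell,
        # with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables no more probable than the observed one.
    total = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Toy table: 3 of 4 items in group 1 versus 1 of 4 in group 2.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```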