SUPPLEMENTARY DATA
Structural insights into the unique single-stranded DNA binding mode of DNA processing protein A, DprA from Helicobacter pylori
Wei Wang, Jingjin Ding, Ying Zhang, Yonglin Hu* and Da-Cheng Wang*
The National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, 15 Datun Road, Chaoyang District, Beijing 100101, China
* To whom correspondence should be addressed. Tel.: 86-10-64888547; Fax: 86-10-64888560; E-mail:
[email protected]
Table S1. All proteins mentioned in this study.

Protein                         Expression vector   Experiments
Full-length HpDprA              pET22b              Crystal, EMSA, MST
HpDprA L196E-F205E              pET22b              EMSA
HpDprA H8E                      pET22b              EMSA
HpDprA Y11E                     pET22b              EMSA
HpDprA R52E                     pET22b              EMSA
HpDprA Y108E                    pET22b              EMSA
HpDprA K137E                    pET22b              EMSA
HpDprA F140E                    pET22b              EMSA
HpDprA R143E                    pET22b              EMSA
HpDprA R52E/K137E               pET22b              EMSA
HpDprA K137E/R143E              pET22b              EMSA
HpDprA R52E/R143E               pET22b              EMSA
HpDprA(5-225)                   pET22b              Crystal, EMSA, MST
HpDprA(5-225)                   pGEX-6p-2           EMSA
WT HpDprA(5-217)                pET22b              Crystal, EMSA, MST
HpDprA(5-217) H8E               pET22b              MST
HpDprA(5-217) Y11E              pET22b              MST
HpDprA(5-217) R52E              pET22b              MST
HpDprA(5-217) Y108E             pET22b              MST
HpDprA(5-217) K137E             pET22b              MST
HpDprA(5-217) F140E             pET22b              MST
HpDprA(5-217) R143E             pET22b              MST
HpDprA(5-217) R52E/K137E        pET22b              MST
HpDprA(5-217) K137E/R143E       pET22b              MST
HpDprA(5-217) R52E/R143E        pET22b              MST
Full-length SpDprA              pET22b              EMSA
Table S2. All ssDNA sequences mentioned in this study.

ssDNA        Sequence
dT35 (a)     TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT
dA35         AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
dC35         CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
dR35         TACCTAACTGGATGAATCTGAGAACTACTGTGAAT
dR30         CCACAACTGGATGATGCGGCGAAGTACTGT
anti-dR30    ACAGTACTTCGCCGCATCATCCAGTTGTGG

(a) dTx denotes a poly-dT oligonucleotide of x deoxythymidine (dT) nucleotides, with x written as a subscript. By contrast, dT1-dT6 denote the first to sixth individual dT nucleotides; these numbers are not subscripts.
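The name anti-dR30 suggests a strand complementary to dR30. The short Python sketch below checks this for the sequences as listed in Table S2 and confirms the stated lengths of the mixed-sequence probes. The helper names `reverse_complement` and `homopolymer` are illustrative and are not taken from the study.

```python
# Mixed-sequence probes copied from Table S2, written 5'->3'.
SEQS = {
    "dR35": "TACCTAACTGGATGAATCTGAGAACTACTGTGAAT",
    "dR30": "CCACAACTGGATGATGCGGCGAAGTACTGT",
    "anti-dR30": "ACAGTACTTCGCCGCATCATCCAGTTGTGG",
}

# Watson-Crick pairing for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def homopolymer(base: str, length: int) -> str:
    """Build a homopolymer probe such as dT35 (hypothetical helper)."""
    return base * length


if __name__ == "__main__":
    for name, seq in SEQS.items():
        print(name, len(seq))
    print(reverse_complement(SEQS["dR30"]) == SEQS["anti-dR30"])
```

For the sequences as listed, anti-dR30 is the exact reverse complement of dR30, so the two strands can anneal over their full length.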
Figure S1. (A) Model of the RpDprA dimer and its electrostatic surface potential. (B) Model of the SpDprA dimer and its electrostatic surface potential. (C) Model of the full-length HpDprA dimer and its electrostatic surface potential. The black dashed circles indicate the additional C-terminal domain modeled by the I-TASSER server. The domains of the three DprAs are colored as in Figure 1A. The yellow dashed circles indicate their conserved positively charged pockets. (D) The most conserved residues, from three helices (α4, α6, and α7) and a flexible loop (β3-α3), are located near the positively charged binding pocket of HpDprA. (E) Superimposition of the DprA domains from the three homologous proteins. The most conserved residue, the arginine mediating the binding pocket, is shown as a stick model, colored orange for RpDprA, white for SpDprA and cyan for HpDprA.
Figure S2. (A) Location of the residues involved in ssDNA binding on the HpDprA surface. Residues of molecule B are labeled in green and those of molecule B’ in yellow. (B) Base-stacking interactions between the protein and dT6. Protein residues are colored as in A, and ssDNA bases are shown in cyan. (C) Detailed views of the H-bonding sites for dT1–dT3 observed in the HpDprA(5-217)–ssDNA crystal structure. (D) Superimposition of the main chains of apo-HpDprA(5-225) (salmon) and HpDprA(5-217)–ssDNA (pale green). (E) Arg52 of molecule A rotates by 40° upon ssDNA binding. (F) The “opened” pocket in molecule A of HpDprA(5-217)–ssDNA.
Figure S3. EMSA results for HpDprA–ssDNA using 5’-biotin-labeled ssDNA substrates (5 nM). (A) Full-length HpDprA–dC35. (B) Full-length HpDprA–dR35. (C) Full-length HpDprA–dA35. (D) Full-length HpDprA H8A-Q10A-Y11A–dT35. (E) Full-length SpDprA–dT35. (F) Tag-free HpDprA(5-225)–dT35. The tag-free protein was obtained from the GST-HpDprA(5-225) fusion protein by PreScission Protease cleavage. (G) HpDprA(5-217)–dT35. (H) HpDprA(5-217) binding to ssDNA probes of different lengths (dT20 to dT60), with quantitative analysis. The same protein concentration gradient was used in the dT30–dT60 experiments. FD, free DNA; A1, oligomeric complex; A’2 and A’3, polymeric complexes, which are apparently different from the A2 and A3 bands in the full-length HpDprA–dT35 assay. (I) SLS analyses of the HpDprA(5-217)–dT60 complex (blue) and the apo-protein (pink), giving molecular weights (MW) of 118.70 ± 2.4 kDa and 51.32 ± 2.0 kDa, respectively.
Figure S4. Effect of mutations in full-length HpDprA on ssDNA-binding activity. (A) EMSA results for wild-type HpDprA and for seven single-site and three double-site mutants with dT35. A series of nine protein solutions of different concentrations was prepared by consecutive 2-fold dilutions from a stock solution (5 µM). All gels were run under identical conditions, using an ssDNA concentration of 5 nM. (B) SDS-PAGE analyses of all of these proteins at the same concentration (5 µM).
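The dilution scheme in the Figure S4 caption can be sketched numerically. The Python snippet below lists nine concentrations obtained by consecutive 2-fold dilution from 5 µM. It assumes that the undiluted stock counts as the first of the nine solutions, which the caption does not state explicitly; the function name `two_fold_series` and the `include_stock` switch are illustrative.

```python
def two_fold_series(stock_nm: float, n: int, include_stock: bool = True) -> list[float]:
    """Return n concentrations (nM) from consecutive 2-fold dilutions.

    include_stock=True treats the stock itself as the first point (an
    assumption); set it to False if the first point is the first dilution.
    """
    start = 0 if include_stock else 1
    return [stock_nm / 2 ** k for k in range(start, start + n)]


if __name__ == "__main__":
    # 5 µM stock expressed in nM, nine solutions as in the caption.
    for conc in two_fold_series(5000.0, 9):
        print(f"{conc:.2f} nM")
```

Under this assumption the series runs from 5 µM down to about 19.5 nM, so even the lowest protein concentration remains in excess of the 5 nM ssDNA probe.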
Figure S5. MST binding curves of HpDprA–ssDNA. The X-axis shows DNA concentration (10^0–10^6 nM). (A) MST ligand-binding measurements of full-length HpDprA (left panel) and HpDprA(5-225) (right panel). Prior to measurement, a series of ssDNA solutions (1.46 nM to 48 µM) was incubated with a fixed concentration of labeled protein (200 nM) at the same volume ratio for 20 min at 298 K. (B) Results for wild-type HpDprA(5-217) and for seven single-site and three double-site mutants with dT35. (C) MST analyses of shorter poly-dT ssDNAs binding to HpDprA(5-217). (D) Results for HpDprA(5-217) interacting with different types of DNA, including dA35, dR30 and dsDNA30.
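The ssDNA range quoted for the MST titration (1.46 nM to 48 µM) is consistent with a 2-fold serial dilution, although the caption does not name the dilution factor. The sketch below, in Python, infers the number of 2-fold steps between the two endpoints under that assumption; the function name `two_fold_steps` is illustrative.

```python
import math


def two_fold_steps(high_nm: float, low_nm: float) -> int:
    """Nearest whole number of 2-fold dilution steps between two concentrations."""
    return round(math.log2(high_nm / low_nm))


if __name__ == "__main__":
    high_nm = 48_000.0  # 48 µM expressed in nM
    low_nm = 1.46       # lowest concentration quoted in the caption
    steps = two_fold_steps(high_nm, low_nm)
    # Lowest concentration implied by an exact 2-fold series from 48 µM.
    implied_low = high_nm / 2 ** steps
    print(steps, steps + 1, round(implied_low, 2))
```

If the 2-fold assumption holds, the endpoints correspond to 15 dilution steps, that is, 16 titration points, with an implied lowest concentration that rounds to the quoted 1.46 nM.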
Figure S6. Primary sequence alignment of DprA family proteins from both NT-competent (blue stars) and non-NT-competent bacteria. Identical residues are shown in white on a red background, and conserved residues are boxed with red labels. The conserved residues near the binding pocket are marked with blue dots. These sequences cover most representative DprA proteins from different bacterial taxa. The alignment excludes some DprA homologues from viruses, archaea and eukaryotes.
Figure S7. Three canonical protein folds that bind ssDNA. (A) The OB fold of the Plasmodium falciparum Ssb monomer, wrapped by ssDNA (PDB code 3ULP). (B) Two RecA-like folds binding cooperatively to ssDNA (PDB code 3CMW). (C) The complex of the HhH domain of human XPF protein with ssDNA (PDB code 2KN7).