supplementary information - Nature

SUPPLEMENTARY INFORMATION

doi:10.1038/nature12959

SUPPLEMENTARY NOTES

Supplementary Note 1: Whole genome amplification of single filaments. The success of whole genome amplification (WGA) can be strongly influenced by the condition of the cells prior to multiple displacement amplification (MDA), including cell preservation after collection, cell storage, and cell lysis conditions. For "Entotheonella factor", WGA was successful when cells were sorted immediately after differential centrifugation. Cells sorted into 96-well plates for later use were stored at 4 °C rather than frozen at -80 °C, as is standard. Prior to MDA, heat (95 °C) was sufficient to lyse the cells, and genome amplification was conducted according to the manufacturer's protocols.

Supplementary Note 2: Assembly of the 16S rRNA region from the metagenome. Because only a single assembled contig contained an "Entotheonella"-derived 16S rRNA gene sequence, the original reads for the 16S region were manually inspected, revealing 35 SNPs with frequencies of 30% to 50%. The origin of these variants was reassessed by analyzing 16S rRNA gene sequences amplified from the enriched filamentous sample as well as from the MDA plates. This identified two highly similar sequences (97.6% pairwise identity), one with 100% identity to the genomic sequence and a second with 36 nt differences, of which 35 were identical to the SNPs identified from the genome assembly.
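The pairwise identity quoted in Supplementary Note 2 can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length; the function names and the aligned length of 1,500 nt are hypothetical (the note does not state the amplicon length), chosen only because 36 differences over 1,500 positions reproduce the reported 97.6%.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_from_mismatches(mismatches: int, aligned_length: int) -> float:
    """Percent identity given a mismatch count and an aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * (1 - mismatches / aligned_length)


if __name__ == "__main__":
    # Toy sequences, for illustration only (not from the study).
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    # 36 differences; the 1,500 nt aligned length is an assumption.
    print(round(identity_from_mismatches(36, 1500), 1))
```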

Supplementary Note 3: Detailed protein isolation and mass exchange assay for A domain characterization. Cells from overnight expression cultures were harvested by centrifugation (5000 rpm, 20 min, 4 °C), resuspended in lysis buffer (25 mM Tris-HCl pH 8.0, 400 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole), and lysed by French press. The soluble fraction was purified using Ni-NTA resin with increasing amounts of imidazole and analyzed by SDS-PAGE. Pure fractions were pooled, desalted (PD10 column, GE), and concentrated (Vivaspin MWCO 30 kDa, Sartorius). For mass exchange-based adenylation assays, a 6 µL reaction mixture consisting of 600 nM enzyme, 1 mM amino acid, 20 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 5 mM inorganic pyrophosphate, 0.3 mM DTT, and 1 mM γ-18O4-ATP was incubated at 25 °C for 2 h before being quenched with 6 µL of 9-aminoacridine in acetone (10 mg mL-1). The data were recorded on a MALDI Thermo LTQ Orbitrap™ XL equipped with a nitrogen laser at λ = 337 nm. The MS was operated in negative ionization and FTMS mode. The laser energy was tuned semi-automatically on the 9-aminoacridine matrix and set to 35 µJ. The following parameters were applied: automatic spectrum filtering (ASF) = off, automatic gain control (AGC) = on, microscans = 1, resolution = 15,000, scan range = 500-520 m/z, and crystal positioning system (CPS). The average of 100 scans was used for each experiment. Substrate conversion (%) was calculated with the equation

$$\%\ \text{exchange} = \frac{100}{0.833} \times \frac{{}^{16}\mathrm{O}}{{}^{16}\mathrm{O} + {}^{18}\mathrm{O}}$$

and normalized to the amino acid with the greatest specificity.
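The exchange calculation in Supplementary Note 3 can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis script: the function names and the example intensities are hypothetical, and the 16O and 18O terms are assumed to be the measured peak intensities of the unlabeled and 18O-labeled species, respectively. The factor 0.833 is taken directly from the equation; it presumably reflects the 5:1 ratio of pyrophosphate to labeled ATP in the assay (5/6 ≈ 0.833), which is an interpretation rather than something the note states.

```python
from __future__ import annotations


def percent_exchange(intensity_16o: float, intensity_18o: float,
                     max_fraction: float = 0.833) -> float:
    """% exchange = (100 / 0.833) * 16O / (16O + 18O)."""
    total = intensity_16o + intensity_18o
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return (100.0 / max_fraction) * intensity_16o / total


def normalize_to_best(exchange: dict[str, float]) -> dict[str, float]:
    """Scale all substrates so the best-converted amino acid is 100%."""
    best = max(exchange.values())
    if best <= 0:
        raise ValueError("no substrate shows exchange")
    return {name: 100.0 * value / best for name, value in exchange.items()}


if __name__ == "__main__":
    # Hypothetical peak intensities (arbitrary units), for illustration only.
    raw = {
        "Ala": percent_exchange(10.0, 90.0),
        "Phe": percent_exchange(60.0, 40.0),
    }
    print(normalize_to_best(raw))
```

Normalizing to the best substrate, as the note describes, makes profiles from different A domains comparable even when their absolute conversions differ.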


Table S1 | Sequencing and assembly statistics of the different HTS technologies and platforms used.

| Technology platform | Run | Total reads | Total bases | Reads w/ paired read | Pair distance | Aligned reads | Aligned bases | Inferred read error [%] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 454 GS-FLX | GW1GXA001 | 278,628 | 103,517,209 | / | / | 199,866 (71.7%) | 71,416,758 (69.0%) | 1.15 |
| 454 GS-FLX | G0BZMZS04 | 365,682 | 62,148,666 | 141,253 | 1,766.1 ± 592.6 | 252,538 (69.1%) | 42,421,695 (68.3%) | 1.79 |
| 454 GS-FLX | G5M2T3U03 | 348,385 | 60,088,308 | 137,315 | 1,765.5 ± 592.6 | 241,724 (69.4%) | 41,164,695 (68.5%) | 1.69 |
| Illumina Miseq | WGS-PE | 4,166,800 | 583,304,284 | 4,166,800 | 537.5 ± 201.6 | 3,335,888 (80.1%) | 457,764,474 (78.5%) | 1.15 |
| Illumina Miseq | MP | 316,113 | 63,051,407 | 316,113 | 4,393.7 ± 1,098.4 | 265,535 (84.0%) | 53,994,815 (85.6%) | 0.52 |
| PacBio RS | Run01 | 21,978 | 28,873,272 | / | / | 11,468 (52.2%) | 14,623,200 (50.7%) | 2.29 |
| Total | | 5,497,586 | 900,983,146 | 4,761,481 | / | 4,307,019 (78.3%) | 681,385,861 (75.6%) | 1.19 |


Table S2 | Assembly statistics of binned sequence data.

| Statistic | TSY1 | TSY2 | Contaminants & small contigs | Total |
| --- | --- | --- | --- | --- |
| Assembled reads | 1,918,908 | 617,323 | 1,770,788 | 4,307,019 |
| Assembled bases | 303,571,246* | 97,660,499* | 280,154,117* | 681,385,861 |
| Number of scaffolds | 563 | 860 | 436 | 1,859 |
| Number of contigs in scaffolds | 1,577 | 2,303 | 703 | 4,583 |
| Number of large contigs (>= 500 bp) | 1,820 | 3,270 | 13,003 | 18,093 |
| Number of all contigs (>= 100 bp) | n.d.† | n.d.† | 77,162 | 82,252 |
| Bases in contigs | 8,894,357 | 8,820,512 | 27,775,048 | 45,489,917 |
| Coverage | 34.13 | 11.07 | 10.09 | 14.98 |
| G+C content [%] | 55.79 | 55.55 | 42.17 | 47.78 |
| Scaffold size, average | 15,346 | 9,229 | 3,565 | 10,605 |
| Scaffold size, largest | 105,049 | 65,735 | 21,890 | 105,049 |
| Contig size, average of large contigs | 5,015 | 2,815 | 756 | 1,556 |
| Contig size, average of scaffolded contigs | 3,524 | 5,923 | 1,476 | 3,960 |
| Contig size, largest | 48,845 | 27,101 | 5,386 | 48,845 |

* Calculated based on an average read length of 158.2 bp (total assembled bases divided by total assembled reads).
† Contigs of less than 500 bp were not subjected to binning.
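The read-length footnote and the coverage row of Table S2 can be checked with a short Python sketch. The constant and function names are illustrative, and coverage is assumed here to be assembled bases divided by bases in contigs, which is consistent with the tabulated values but is not spelled out in the table.

```python
# Values copied from Table S2.
TOTAL_ASSEMBLED_READS = 4_307_019
TOTAL_ASSEMBLED_BASES = 681_385_861

BINS = {
    # bin name: (assembled reads, assembled bases, bases in contigs)
    "TSY1": (1_918_908, 303_571_246, 8_894_357),
    "TSY2": (617_323, 97_660_499, 8_820_512),
    "Contaminants & small contigs": (1_770_788, 280_154_117, 27_775_048),
    "Total": (4_307_019, 681_385_861, 45_489_917),
}


def average_read_length(total_bases: int, total_reads: int) -> float:
    """Average read length: total assembled bases / total assembled reads."""
    return total_bases / total_reads


def estimated_bases(reads: int, read_length: float) -> float:
    """Estimate assembled bases of a bin from its read count."""
    return reads * read_length


def coverage(assembled_bases: float, contig_bases: int) -> float:
    """Assumed definition: assembled bases divided by bases in contigs."""
    return assembled_bases / contig_bases


if __name__ == "__main__":
    read_len = round(average_read_length(TOTAL_ASSEMBLED_BASES,
                                         TOTAL_ASSEMBLED_READS), 1)
    print("average read length [bp]:", read_len)
    for name, (reads, bases, contig_bases) in BINS.items():
        print(name, round(coverage(bases, contig_bases), 2))
```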


Table S3 | Pairwise identities of orthologous phylogenetic markers from "Entotheonella factor" TSY1 and TSY2.

| Gene | % Identity (amino acid) | % Identity (nucleotide) |
| --- | --- | --- |
| ffh | 97.3 | 91.0 |
| gcp | 92.8 | 87.2 |
| infB | 93.7 | 87.0 |
| lepA | 96.7 | 89.2 |
| pheS | 93.0 | 90.3 |
| pheT | 90.5 | 87.5 |
| pyrG | 97.0 | 92.9 |
| rnhB | 87.6 | 82.0 |
| rplA | 94.4 | 87.5 |
| rplB | 94.9 | 87.1 |
| rplC | 95.0 | 89.7 |
| rplD | 94.9 | 88.7 |
| rplE | 98.4 | 91.4 |
| rplF | 93.6 | 91.2 |
| rplJ | 94.8 | 86.0 |
| rplK | 97.2 | 90.1 |
| rplN | 99.2 | 89.4 |
| rplO | 94.3 | 90.0 |
| rplP | 97.1 | 89.3 |
| rplR | 88.9 | 86.3 |
| rplV | 98.5 | 91.4 |
| rplX | 90.4 | 88.5 |
| rpsB | 92.6 | 89.4 |
| rpsC | 97.7 | 89.6 |
| rpsD | 97.7 | 91.7 |
| rpsH | 94.6 | 91.3 |
| rpsI | 92.1 | 91.4 |
| rpsK | 99.2 | 90.6 |
| rpsL | 97.9 | 89.1 |
| rpsM | 95.9 | 91.7 |
| rpsO | 98.9 | 90.4 |
| rpsQ | 97.6 | 91.1 |
| rpsS | 97.9 | 93.3 |
| tgt | 95.8 | 88.5 |
| tpiA | 87.0 | 84.1 |


Table S4 | Identified natural product biosynthetic loci from "Entotheonella" strains TSY1 and TSY2.

| Locus ID | Cluster size [kb] | Biosynthetic type | Proposed or known product |
| --- | --- | --- | --- |
| TSY1_01 | 13 | NRPS | Konbamides (putatively inactive) |
| TSY1_02 | 28 | Hybrid PKS-NRPS | Keramamides/orbiculamides |
| TSY1_03 | 49 | Hybrid PKS-NRPS | Unknown acylated peptide |
| TSY1_04 | 31 | NRPS | Nazumamide A |
| TSY1_05 | 20 | Type III PKS | Unknown aromatic polyketide |
| TSY1_06 | 66 | Type I PKS-NRPS | Onnamides/theopederins |
| TSY1_07 | 52 | Hybrid PKS-NRPS | Cyclotheonamides/Pseudotheonamides |
| TSY1_08 | 9 | NRPS | Unknown peptide |
| TSY1_09 | 14 | Hybrid PKS-NRPS | Unknown acylated threonine derivative |
| TSY1_10 | 5 | Type I PKS | Onnamides/theopederins |
| TSY1_11 | 16 | Ribosomal peptide | Polytheonamides |
| TSY1_12 | 24 | Hybrid PKS-NRPS | Unknown mixed polyketide-peptide |
| TSY1_13 | 10 | NRPS (open)* | Unknown peptide fragment |
| TSY1_14 | 20 | Ribosomal peptide | Unknown proteusin |
| TSY1_15 | 9 | NRPS (open) | Unknown peptide fragment |
| TSY1_16 | 11 | Ribosomal peptide | Unknown proteusin |
| TSY1_17 | 7 | Ectoine | Ectoine |
| TSY1_18 | 14 | Type I PKS-NRPS | Unknown mixed polyketide-peptide |
| TSY1_19 | 17 | NRPS (open) | Unknown peptide |
| TSY1_20 | 14 | Type I PKS (open) | Unknown polyketide fragment |
| TSY1_21 | 9 | Ectoine | Ectoine |
| TSY1_22 | 9 | NRPS | Unknown peptide |
| TSY1_23 | 8 | Type I PKS (open) | Unknown polyketide fragment |
| TSY1_24 | 6 | NRPS (open) | Unknown peptide fragment |
| TSY1_25 | 26 or less | Ribosomal peptide | Unknown proteusin |
| TSY2_01 | 8 | NRPS (open) | Unknown peptide fragment |
| TSY2_02 | 21 | NRPS | Unknown pentapeptide |
| TSY2_03 | 8 | Type III PKS (ortholog of TSY1_05) | Unknown polyketide |
| TSY2_04 | 7 | Type I PKS | Unknown polyketide |
| TSY2_05 | 10 | Ribosomal peptide | Unknown proteusin |
| TSY2_06 | 10 | NRPS (ortholog of TSY1_08) | Unknown peptide |
| TSY2_07 | 9 | NRPS (open) | Unknown peptide fragment |
| TSY2_08 | 3 | NRPS (open) | Unknown peptide fragment |

Pathways for known products are indicated in bold.
* Incomplete biosynthetic loci are designated with "(open)".


Table S5 | NRPS adenylation domain substrate predictions.

| Adenylation domain | Active site code | Nearest neighbor prediction |
| --- | --- | --- |
| TSY1_01_Kon_Orf1_A1 | DVEDIGAVEK | Arg |
| TSY1_01_Kon_Orf1_A2 | DAEDIGSVVK | Lys |
| TSY1_01_Kon_Orf2_A3 | DAFFLGVTFK | Ile |
| TSY1_01_Kon_Orf3_A4 | DAEDIGSVVK | Lys |
| TSY1_01_Kon_Orf4_A5 | DLFNNALTYK | Ala |
| TSY1_01_Kon_Orf5_A6 | DAWFLG----* | Leu |
| TSY1_01_Kon_Orf6_A7 | DAWFLGNVVK | Leu |
| TSY1_01_Kon_Orf7_A8 | DALHVGNMAK | Hpg |
| TSY1_02_KerA_A0 | GIFWLGASGK | n.p.† |
| TSY1_02_KerB_A1 | DAFFLGVTYK | Ile |
| TSY1_02_KerB_A2 | DVSFMGAVMK | Phe |
| TSY1_02_KerC_A3 | DVGEIGSIDK | Orn |
| TSY1_02_KerC_A4 | DVQFIAHVAK | Pro |
| TSY1_02_KerC_A5 | DVYFVGAVIK | Arg |
| TSY1_02_KerE_A6 | DIYNNALTYK | Ala |
| TSY1_02_KerF_A6 | DLYNMSLIWK | Cys |
| TSY1_02_KerH_A7 | DALHVGNMAK | Hpg |
| TSY1_03_Orf2_A1 | LDWVSSLADK | β-Ala |
| TSY1_03_Orf3_A2 | DVSFMGGVLK | Ala |
| TSY1_03_Orf3_A3 | DLKNFGTDIK | Ala |
| TSY1_03_Orf3_A4 | DVQFIAHVIK | Pro |
| TSY1_03_Orf4_A5 | DVSFMGAIMK | Phe |
| TSY1_04_Naz_Orf1_A1 | DVEDIGAITT | Arg |
| TSY1_04_Naz_Orf2_A2 | DVQFIAQVVK | Pro |
| TSY1_04_Naz_Orf2_A3 | DAFFLGVTFK | Ile |
| TSY1_04_Naz_Orf2_A4 | DVYFMGGVIK | Ala |
| TSY1_06_OnnI_A1 | DILQLGLIWK | Gly |
| TSY1_06_OnnJ_A2 | DVLDIGAIDK | Arg |
| TSY1_07_Cth_A1 | DVSFMGGVLK | Ala |
| TSY1_07_Cth_A2 | DIWELTADDK | Ser |
| TSY1_07_Cth_A3 | DVQFIAQVVK | Pro |
| TSY1_07_Cth_A4 | DVEDIGAITS | Arg |
| TSY1_07_Cth_A5 | DAWTIAAVCK | Phe |
| TSY1_07_Cth_A6 | DASTIAAVCK | Tyr |
| TSY1_08_A1 | DMGGIGCLM-‡ | n.p. |
| TSY1_09_A1 | DFWNVGMVHK | Thr |
| TSY1_13_A1 | GLTPLACSWK | 2-Oxoisovaleric acid |
| TSY1_13_A2 | SDQLFSLADK | β-Ala |
| TSY1_15_A1 | DAFFLGVTFK | Ile |
| TSY1_19_A1 | DIWEVAADN-‡ | Phe |
| TSY1_22_A1 | DVSFMGGVLK | Ala |
| TSY1_24_A1 | DVYFIGGVIK | Pro |
| TSY2_01_A1 | TDWQFGIIYK | Gln |
| TSY2_02_Orf1_A1 | DAFWLGGTFK | Val |
| TSY2_02_Orf1_A2 | ?§ | n.p. |
| TSY2_02_Orf1_A3 | DFWNIGMVHK | Thr |
| TSY2_02_Orf1_A4 | DAAKVGQVGK | Asn |
| TSY2_02_Orf2_A5 | DAWMSGAVCK | Phe |
| TSY2_06_A1 | DMGGIGCLM-‡ | Orn |
| TSY2_07_A1 | ADQLFGLADK | β-Ala |
| TSY2_08_A1 | GLTPVAFSWK | Asp |

* ORF insertion truncated the A domain binding pocket.
† Predictive residues lie outside the applicability domain (ref. 22), yielding no prediction (n.p.).
‡ Alignment gap yielding no predictive residue.
§ Assembly gap preventing prediction.


Table S6 | UPLC-HRMS data and eMZed based compound identification of two independent enriched "Entotheonella" samples from T. swinhoei.

| Compound name | Sum formula | Calculated m/z | Detected m/z | Mass error (ppm) | Isotope ratio‡ simulated M0/M1 | Isotope ratio‡ simulated M0/M2 | Isotope ratio‡ found M0/M1 | Isotope ratio‡ found M0/M2 | Retention time [min] standard | Retention time [min] found |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| polytheonamide A*† | C219H376N60O72S | 1677.9181 [M+3H]3+ | 1677.9195 [M+3H]3+ | -0.83 | - | - | - | - | 5.7 | 5.7 |
| polytheonamide B† | C219H376N60O72S | 1677.9181 [M+3H]3+ | 1677.9176 [M+3H]3+ | 0.30 | - | - | - | - | 6.7 | 6.7 |
| onnamide A*† | C39H63N5O12 | 794.45460 [M+H]+ | 794.45341 [M+H]+ | 1.50 | 0.43 | 0.11 | 0.43 | 0.10 | 10.6 | 10.4 |
| cyclotheonamide A*† | C36H45N9O8 | 366.67683 [M+2H]2+ | 366.67661 [M+2H]2+ | 0.60 | 0.41 | 0.09 | 0.34 | 0.05 | 7.9 | 7.9 |
| aurantoside A*† | C36H46N2O15Cl2 | 817.23480 [M+H]+ | 817.23517 [M+H]+ | -0.45 | 0.39 | 0.69 | 0.38 | 0.66 | n.a. | 13.3 |
| aurantoside B*† | C35H44N2O15Cl2 | 803.21915 [M+H]+ | 803.22006 [M+H]+ | -1.13 | 0.38 | 0.69 | 0.36 | 0.60 | n.a. | 12.9 |
| aurantoside E† | C38H48N2O15Cl2 | 843.25045 [M+H]+ | 843.24689 [M+H]+ | 4.22 | 0.41 | 0.69 | 0.37 | 0.60 | n.a. | 13.9 |
| orbiculamide A*† | C46H62N9O10Br | 490.69743 [M+2H]2+ | 490.69679 [M+2H]2+ | 1.30 | 0.50 | 1.05 | 0.49 | 1.12 | n.a. | 12.3 |
| keramamide B† | C54H77N10O12Br | 569.25256 [M+2H]2+ | 569.25132 [M+2H]2+ | 2.18 | 0.58 | 1.09 | 0.60 | 1.19 | n.a. | 13.2 |
| keramamide E or C† | C53H75N10O12Br | 1123.48220 [M+H]+ | 1123.48050 [M+H]+ | 1.51 | 0.57 | 1.08 | 0.51 | 1.01 | n.a. | 12.9 |
| theopederin D* | C26H41NO10 | 528.28032 [M+H]+ | 528.27948 [M+H]+ | 1.59 | 0.28 | 0.06 | - | - | n.a. | 11.8 |

* Enriched "Entotheonella" fraction used for metagenome sequencing.
† "Entotheonella" enriched from fresh sponge specimen.
‡ See Supplementary Fig. 7 for polytheonamides. M0 represents the monoisotopic peak and M1 and M2 the first and second isotopic peak thereof.
n.a. not available.
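The mass errors listed in Table S6 can be reproduced from the calculated and detected m/z values. The Python sketch below is an illustration only: the function name is hypothetical, and the sign convention (calculated minus detected, divided by calculated) is inferred from the tabulated values rather than stated in the table.

```python
def mass_error_ppm(calculated_mz: float, detected_mz: float) -> float:
    """Relative mass error in parts per million.

    Sign convention (calculated - detected) is an assumption chosen to
    match the signs of the tabulated errors.
    """
    if calculated_mz <= 0:
        raise ValueError("calculated m/z must be positive")
    return (calculated_mz - detected_mz) / calculated_mz * 1e6


if __name__ == "__main__":
    # (compound, calculated m/z, detected m/z) copied from Table S6.
    rows = [
        ("polytheonamide A", 1677.9181, 1677.9195),
        ("onnamide A", 794.45460, 794.45341),
        ("aurantoside A", 817.23480, 817.23517),
    ]
    for name, calc, found in rows:
        print(f"{name}: {mass_error_ppm(calc, found):.2f} ppm")
```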


Table S7 | UPLC-HRMS data and eMZed based compound identification of a Theonella swinhoei extract.

| Compound name | Sum formula | Calculated m/z | Detected m/z | Mass error (ppm) | Isotope ratio† simulated M0/M1 | Isotope ratio† simulated M0/M2 | Isotope ratio† found M0/M1 | Isotope ratio† found M0/M2 | Retention time [min] standard | Retention time [min] found |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| polytheonamide A* | C219H376N60O72S | 1677.9181 [M+3H]3+ | 1677.9168 [M+3H]3+ | 0.72 | - | - | - | - | 5.7 | 5.7 |
| polytheonamide B* | C219H376N60O72S | 1677.9181 [M+3H]3+ | 1677.9179 [M+3H]3+ | 0.12 | - | - | - | - | 6.7 | 6.7 |
| onnamide A* | C39H63N5O12 | 794.45460 [M+H]+ | 794.45245 [M+H]+ | 2.71 | 0.43 | 0.11 | 0.45 | 0.10 | 10.6 | 10.6 |
| onnamide B | C37H61N5O12 | 768.43895 [M+H]+ | 768.43787 [M+H]+ | 1.40 | 0.41 | 0.10 | 0.42 | 0.12 | 10.2 | 10.3 |
| onnamide C | C39H61N5O14 | 824.42878 [M+H]+ | 824.42706 [M+H]+ | 2.09 | 0.43 | 0.11 | 0.39 | 0.10 | n.a. | 10.5 |
| onnamide D | C38H63N5O11 | 766.45969 [M+H]+ | 766.45858 [M+H]+ | 1.45 | 0.42 | 0.10 | 0.41 | 0.06 | n.a. | 10.2 |
| onnamide E | C37H59N5O10 | 734.43347 [M+H]+ | 734.43216 [M+H]+ | 1.78 | 0.43 | 0.11 | 0.42 | 0.13 | n.a. | 10.3 |
| pseudoonnamide A | C38H61N5O12 | 780.43895 [M+H]+ | 780.43751 [M+H]+ | 1.84 | 0.42 | 0.11 | 0.40 | 0.07 | n.a. | 10.6 |
| orbiculamide A* | C46H62N9O10Br | 490.69743 [M+2H]2+ | 490.69681 [M+2H]2+ | 1.26 | 0.50 | 1.05 | 0.51 | 1.12 | n.a. | 12.3 |
| keramamide B* | C54H77N10O12Br | 569.25256 [M+2H]2+ | 569.25125 [M+2H]2+ | 2.30 | 0.58 | 1.09 | 0.56 | 1.04 | n.a. | 13.2 |
| keramamide E or C* | C53H75N10O12Br | 562.24474 [M+2H]2+ | 562.24368 [M+2H]2+ | 1.88 | 0.57 | 1.08 | 0.55 | 1.11 | n.a. | 12.9 |
| keramamide D | C52H73O12N10Br | 555.23691 [M+2H]2+ | 555.23572 [M+2H]2+ | 2.14 | 0.56 | 1.08 | 0.52 | 1.06 | n.a. | 12.6 |
| nazumamide A | C28H43N7O8 | 606.32459 [M+H]+ | 606.32620 [M+H]+ | -2.65 | 0.32 | 0.06 | 0.29 | 0.05 | n.a. | 13.5 |
| aurantoside A* | C36H46N2O15Cl2 | 817.23480 [M+H]+ | 817.23289 [M+H]+ | 2.34 | 0.39 | 0.69 | 0.39 | 0.69 | n.a. | 13.3 |
| aurantoside B* | C35H44N2O15Cl2 | 803.21915 [M+H]+ | 803.21784 [M+H]+ | 1.63 | 0.38 | 0.69 | 0.38 | 0.85 | n.a. | 12.9 |
| aurantoside D or C | C37H46N2O15Cl2 | 829.23480 [M+H]+ | 829.23279 [M+H]+ | 2.42 | 0.40 | 0.69 | 0.37 | 0.58 | n.a. | 13.4 |
| aurantoside E* | C38H48N2O15Cl2 | 843.25045 [M+H]+ | 843.24854 [M+H]+ | 2.26 | 0.41 | 0.69 | 0.37 | 0.61 | n.a. | 13.9 |
| cyclotheonamide A* | C36H45N9O8 | 366.67683 [M+2H]2+ | 366.67699 [M+2H]2+ | -0.44 | 0.41 | 0.09 | 0.40 | 0.07 | 7.9 | 7.9 |
| theopederin D* | C26H41NO10 | 528.28032 [M+H]+ | 528.27970 [M+H]+ | 1.17 | 0.28 | 0.06 | 0.22 | 0.06 | n.a. | 11.8 |

* Compounds also present in the enriched "Entotheonella" fraction.
† See Supplementary Fig. 7 for polytheonamides. M0 represents the monoisotopic peak and M1 and M2 the first and second isotopic peak thereof.


Table S8 | Deconvoluted LC-HESI-HRMS masses from gene cluster TSY1_14 coexpression experiments.

| Treatment | Predicted identity | Expected mass* | Precursor only: deconvoluted mass | Precursor only: mass error (%) | Precursor only: relative intensity | Coexpression with LanM-like protein: deconvoluted mass | Coexpression with LanM-like protein: mass error (%) | Coexpression with LanM-like protein: relative intensity |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TCEP (retention time: 5.85-6.1 min) | M-3H2O | 13086.226 | - | - | n.d. | 13086.225 | 0.0001 | 100 |
| | M-2H2O | 13104.237 | - | - | n.d. | 13104.250 | 0.0001 | 43.9 |
| | M-H2O | 13122.247 | 13122.282 | 0.0003 | 3.2 | 13121.242 | 0.0076 | 26.9 |
| | M | 13140.258 | 13139.255 | 0.0076 | 100 | 13138.237 | 0.0154 | 20.0 |
| TCEP and iodoacetamide (retention time: 5.85-6.1 min) | M-3H2O+2CAM | 13200.269 | - | - | n.d. | 13200.299 | 0.0002 | 60.1 |
| | M-2H2O+3CAM | 13275.301 | - | - | n.d. | 13275.311 | 0.0001 | 25.2 |
| | M-H2O+4CAM | 13350.333 | 13351.344 | 0.0076 | 2.7 | 13350.335 | 0.0001 | 19.4 |
| | M-3H2O+2CAM+Gluc | 13378.317 | - | - | n.d. | 13378.354 | 0.0003 | 100 |
| | M+5CAM | 13425.365 | 13425.387 | 0.0002 | 100 | 13425.334 | 0.0002 | 21.8 |
| | M+5CAM+Gluc | 13603.413 | 13603.404 | 0.0001 | 26.8 | 13603.424 | 0.0001 | 0.9 |
| TCEP, iodoacetamide and trypsin (retention time: 6.16-6.46 min) | Mb+CAM | 3225.588 | 3225.598 | 0.0003 | 100 | 3225.593 | 0.0001 | 100 |
| | Ma-3H2O+2CAM | 4415.929 | - | - | n.d. | 4415.936 | 0.0001 | 67.9 |
| | Ma-2H2O+2CAM | 4433.940 | - | - | n.d. | 4433.945 | 0.0001 | 76.1 |
| MS2 fragments of Ma-3H2O+2CAM | y16-3H2O+2CAM | 1654.546 | - | - | n.d. | 1654.543 | 0.0002 | n.a. |
| | y17-3H2O+2CAM | 1767.630 | - | - | n.d. | 1767.628 | 0.0001 | n.a. |
| | y18-3H2O+2CAM | 1880.714 | - | - | n.d. | 1880.705 | 0.0005 | n.a. |
| | y19-3H2O+2CAM | 1993.798 | - | - | n.d. | 1993.790 | 0.0004 | n.a. |
| | y37-3H2O+2CAM | 3710.575 | - | - | n.d. | 3710.567 | 0.0002 | n.a. |

* As calculated using the ChemCalc online tool (ref. 67).
M: His-tagged precursor peptide from TSY1_14 (MH124), lacking Met1.
n.d. not detected within a range of +/- 3 Da.
n.a. not available.
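The expected masses in Table S8 follow from the mass of the unmodified precursor M plus fixed mass shifts per modification. The Python sketch below illustrates this bookkeeping; it is not the ChemCalc calculation used by the authors. The monoisotopic shifts for water loss (18.010565 Da) and carbamidomethylation (CAM, 57.021464 Da) are standard values. Reading "Gluc" as gluconoylation (178.047738 Da) is an assumption, supported only by the 178.048 Da spacing between the corresponding table rows. Function and constant names are hypothetical.

```python
WATER = 18.010565   # monoisotopic mass of H2O lost per dehydration
CAM = 57.021464     # carbamidomethyl adduct from iodoacetamide (C2H3NO)
GLUC = 178.047738   # assumed: gluconoylation (C6H10O6)


def expected_mass(base_mass: float, water_losses: int = 0,
                  cam: int = 0, gluc: int = 0) -> float:
    """Expected mass after dehydrations and alkylation/adduct additions."""
    return base_mass - water_losses * WATER + cam * CAM + gluc * GLUC


def mass_error_percent(expected: float, observed: float) -> float:
    """Absolute mass error as a percentage of the expected mass."""
    return abs(observed - expected) / expected * 100.0


if __name__ == "__main__":
    m = 13140.258  # expected mass of unmodified M from Table S8
    print(round(expected_mass(m, water_losses=3, cam=2), 3))
    print(round(expected_mass(m, cam=5, gluc=1), 3))
    print(round(mass_error_percent(13122.247, 13122.282), 4))
```

Each dehydration introduced by the LanM-like protein lowers the mass by one water and, judging from the paired CAM counts in the table, leaves one fewer free cysteine for alkylation.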


Table S9 | List of sponges investigated for the presence of "Entotheonella" spp.

| Sponge | Location | Detection of "Entotheonella" |
| --- | --- | --- |
| Aaptos ciliata (Suberitidae, Hadromerida) | Oshima-shinsone, Japan 28°52.17' N, 129°33.02' E | + |
| Agelas dilatata (Agelasidae, Agelasida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Agelas nakamurai (Agelasidae, Agelasida) | Kuchinoerabu-jima, Japan 30°47.67' N, 130°18.85' E | + |
| Amphimedon compressa (Niphatidae, Haplosclerida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | - |
| Amphimedon sp. (Niphatidae, Haplosclerida) | Io-jima, Japan 30°48.35' N, 130°19.07' E | + |
| Amphimedon sp. (Niphatidae, Haplosclerida) | Hachijo-jima, Japan 33°12.18' N, 139°70.11' E | - |
| Anthosigmella (Cliona) raromicrosclera (Clionaidae, Hadromerida) | Mitsukue, Japan 33°46.88' N, 132°25.84' E | + |
| Anthosigmella (Cliona) raromicrosclera (Clionaidae, Hadromerida) | Kamikoshiki-jima, Japan 31°81.78' N, 129°90.57' E | - |
| Aplysina aerophoba (Aplysinidae, Verongida) | Rovinj, Croatia 45°7.50' N, 13°39.48' E | + |
| Asteropus simplex (Ancorinidae, Astrophorida) | Shikine-jima, Japan 34°32.15' N, 139°22.07' E | + |
| Axinella sp. (Axinellidae, Halichondrida) | Shikine-jima, Japan 34°31.78' N, 139°21.80' E | - |
| Cacospongia mycofijiensis (Thorectidae, Dictyoceratida) | Mele Bay, Vanuatu 17°43' S, 168°14' E | + |
| Callyspongia vaginalis (Callyspongiidae, Haplosclerida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Ceratopsion sp. (Raspailiidae, Poecilosclerida) | Yaku-shinsone, Japan 29°47.22' N, 130°19.88' E | + |
| Dercitus simplex (Ancorinidae, Astrophorida) | Oshima-shinsone, Japan 28°52.17' N, 129°33.02' E | + |
| Discodermia calyx (Theonellidae, Lithistida) | Nakagi, Japan 34°61.11' N, 138°82.07' E | + |
| Discodermia kiiensis (Theonellidae, Lithistida) | Nakagi, Japan 34°61.11' N, 138°82.07' E | + |
| Dysidea avara (Dysideidae, Dictyoceratida) | Rovinj, Croatia 45°7.50' N, 13°39.48' E | + |
| Dysidea etheria (Dysideidae, Dictyoceratida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Epipolasis sp. (Halichondriidae, Halichondrida) | Nagannu-jima, Japan 26°14.61' N, 127°31.01' E | + |
| Epipolasis sp. (Halichondriidae, Halichondrida) | Hachijo-jima, Japan 33°13.77' N, 139°73.47' E | + |
| Erylus nobilis (Geodiidae, Astrophorida) | Shikine-jima, Japan 34°33.90' N, 139°20.83' E | + |
| Erylus placenta (Geodiidae, Astrophorida) | Hachijo-jima, Japan 33°07.14' N, 139°77.97' E | + |
| Fascaplysinopsis sp. (Thorectidae, Dictyoceratida) | Salary Bay, Madagascar 22°33' S, 43°16' E | - |
| Haliclona digitata (Chalinidae, Haplosclerida) | Ikara-jima, Japan 32°21.48' N, 130°18.98' E | - |
| Hexadella sp. (Ianthellidae, Verongida) | Kuchinoerabu-jima, Japan 30°47.67' N, 130°18.85' E | + |
| Ircinia felix (Irciniidae, Dictyoceratida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Mycale magellanica (Mycalidae, Poecilosclerida) | Toji, Japan 34°64.12' N, 138°91.70' E | + |
| Niphates digitalis (Niphatidae, Haplosclerida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Penares aff. incrustans (Geodiidae, Astrophorida) | Hachijo-jima, Japan 33°13.40' N, 139°80.31' E | + |
| Penares aff. incrustans (Geodiidae, Astrophorida) | Hachijo-jima, Japan 33°13.40' N, 139°80.31' E | - |
| Penares sp. (Geodiidae, Astrophorida) | Uke-shima, Japan 28°05.42' N, 129°21.77' E | + |
| Petrosia volcano (Petrosiidae, Haplosclerida) | Io-jima, Japan 30°48.35' N, 130°19.07' E | + |
| Psammocinia aff. bulbosa (Irciniidae, Dictyoceratida) | Milne Bay, Papua New Guinea 9°32.493' S, 150°16.715' E | + |
| Pseudoceratina purpurea (Pseudoceratinidae, Verongida) | Nakano-shima, Japan 29°83.22' N, 129°85.14' E | + |
| Pseudoceratina purpurea (Pseudoceratinidae, Verongida) | Oshima-shinsone, Japan 28°52.17' N, 129°33.02' E | + |
| Ptilocaulis sp. (Axinellidae, Halichondrida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Stylissa carteri (Dictyonellidae, Halichondrida) | FSAR reef, Thuwal, Saudi Arabia 22°23.09' N, 39°02.86' E | + |
| Stylissa carteri (Dictyonellidae, Halichondrida) | Kuchinoerabu-jima, Japan 30°47.84' N, 130°18.76' E | + |
| Theonella swinhoei W1, misakinolide chemotype (Theonellidae, Lithistida) | Hachijo-jima, Japan 33°13.77' N, 139°73.47' E | + |
| Theonella swinhoei Y1, onnamide chemotype (Theonellidae, Lithistida) | Nakagi, Japan 34°61.11' N, 138°82.07' E | + |
| Topsentia sp. (Halichondriidae, Halichondrida) | Nichinan-Oshima, Japan 31°53.96' N, 131°41.67' E | + |
| Xestospongia muta (Petrosiidae, Haplosclerida) | Little San Salvador, Caribbean Sea 24°34.39' N, 78°58.00' W | + |
| Xestospongia testudinaria (Petrosiidae, Haplosclerida) | FSAR reef, Thuwal, Saudi Arabia 22°23' N, 39°03' E | - |
| Seawater | Rovinj, Croatia 45°7.50' N, 13°39.48' E | + |
| Seawater | Florida, USA 24°57.11' N, 80°27.30' W | + |
| Seawater | FSAR reef, Thuwal, Saudi Arabia 22°23.09' N, 39°02.86' E | + |

+/-: "Entotheonella" detected/not detected (based on the amplicon sequence).


Table S10 | Primers used in this study.

| Primer ID | Sequence | Target gene (cluster) |
| --- | --- | --- |
| onnOF | GTCAGCTGAGAACCTGTCGG | onnamide |
| onnOR | CTTCCAGCCAGAATGCTGCC | onnamide |
| onnIF | TTGCCGTGAATTCCGCTT | onnamide |
| onnIR | AGCGGCTTCCAGATGACC | onnamide |
| poyOF | CAAGAACTCACAGTCGCCGACGTGTT | polytheonamide |
| poyOR | CGCTACGTGGTGAGCATCGAGGATT | polytheonamide |
| poyIF | CCATTCTAACCCAGAAAGGAGTCCACCAT | polytheonamide |
| poyIR | CATTGATATTGCCACCTGCGACCTGATT | polytheonamide |
| kon1OF | AGTTTTGTCCCAACTCCCGTGG | konbamide |
| kon1OR | AGACGACTTGATAGCGGAAGCG | konbamide |
| kon1IF | GCTACCGCTCCGACGGC | konbamide |
| kon1IR | CGTGACGTGAGCCAAATCGTCC | konbamide |
| kon2OF | TCAAGAAGATGTGGTCGTCGGC | konbamide |
| kon2OR | ATCAACGGGGTAGGCAAGAACG | konbamide |
| kon2IF | CACGACCCTGTTTGATTTGTCCG | konbamide |
| kon2IR | TGGTCTTTCAATGCCGTTTGCG | konbamide |
| naz1OF | CTTACGCACCACGTTTCCAACC | nazumamide |
| naz1OR | GCCCAGGAAGAGGGTCAAATCG | nazumamide |
| naz1IF | CAGTCTTACGCTCCCACTGTCG | nazumamide |
| naz1IR | CGATGACGAGATCCTCTTGCCC | nazumamide |
| naz2OF | GCGCAGCTTCACCTGAGTATCG | nazumamide |
| naz2OR | CAACACCCGGACAACCTATCCC | nazumamide |
| naz2IF | GACGTGTAAGAGACGAGCGGG | nazumamide |
| naz2IR | GACCTTCATGCTGGCTGACACC | nazumamide |
| cth1OF | CCATCACGCCATTTACGAAGCG | cyclotheonamide |
| cth1OR | TCTAAGACCTCTCCCGTCAGCC | cyclotheonamide |
| cth1IF | AACTTGCTGGTGGCGTACTTGG | cyclotheonamide |
| cth1IR | CGAGCAACCGGAAGGCATGG | cyclotheonamide |
| cth2OF | GTGCGTCTGGCGCTAATAATGG | cyclotheonamide |
| cth2OR | CTCAAGCCTGTGCCTATCTGGG | cyclotheonamide |
| cth2IF | TCTGGTAAGCCCGTTTGACAGC | cyclotheonamide |
| cth2IR | CTTTTGTGCCACGAGTACCTGC | cyclotheonamide |
| kerOF | TCAGGTGGAACATGACGATGCC | keramamide |
| kerOR | CTCACATGCAAGCACGGTTTCC | keramamide |
| kerIF | ACCTGTATGGCAAGAGCCAAGC | keramamide |
| kerIR | CTAACCGAAACGGGTGAGGTGG | keramamide |
| ptsOF | CTCGCTTATCTGCGTGCAATCG | unknown proteusin |
| ptsOR | GTTTGAAGAGCAACCACGAGCG | unknown proteusin |
| ptsIF | CGGTCGTCTTTAATGCACTCGC | unknown proteusin |
| ptsIR | CTGGCTTTAGGTGTCGAGGAGG | unknown proteusin |
| 16SU27F | AKWGTTTGATCMTGGCTCAG | eubacterial 16S rRNA |
| 16SU1492R | GGHTACCTTGTTACGACTT | eubacterial 16S rRNA |
| EntoIF | GYATTAAGCCKYGGAAACKGT | "Entotheonella" 16S rRNA |
| Ento1290R | GCCCRGCWYVACCCGGTA | "Entotheonella" 16S rRNA |
| Ento271F | GGGAAASGTTCGCBGGTCTG | "Entotheonella" 16S rRNA |
| Ento238F | CCGGTCTGAGATGAGCTTGC | "Entotheonella" 16S rRNA |
| Ento1442R | TCACCCCAATCACCCCGC | "Entotheonella" 16S rRNA |
| KSDPQQF | MGNGARGCNNWNSMNATGGAYCCNCARCANMG | general PKS gene detection |
| KSHGTGR | GGRTCNCCNARNSWNGTNCCNGTNCCRTG | general PKS gene detection |
| kerA5F | GTATCATATGGTTCACACCTTGCCGCTGCT | keramamide A domain 5 |
| kerA5R | CTATCTCGAGTCAGCAATCGTCTTTTCGAGCGC | keramamide A domain 5 |
| cthA2F | GGCAGCCATATGCTCGTCAGTAAGTTGCCTTTGC | cyclotheonamide A domain 2 |
| cthA2R | GTGGTGCTCGAGCTAGTCCAATTCCAACACATCCGCCC | cyclotheonamide A domain 2 |
| Prec-TSY1_14-F | GTGCATATGTCACCGGCTGAAAATCGA | TSY1_14 Precursor |
| Prec-TSY1_14-R | GACAAGCTTTTACCCGCAAGCCCAACAA | TSY1_14 Precursor |
| Lanth-TSY1_14-F | CGGTCTTCATGATTTACAAACCATGGGAAAATT | TSY1_14 LanM-like |
| Lanth-TSY1_14-R | GTACCTCGAGTTAGGATATGCTGCCAAAGACCAG | TSY1_14 LanM-like |
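Several primers in Table S10 contain IUPAC degenerate base codes (for example K, W and M in 16SU27F). As a reading aid, the Python sketch below counts how many distinct oligonucleotides such a primer represents and enumerates them. The mapping is the standard IUPAC nucleotide code; the function names are hypothetical and the sketch is not part of the original methods.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # KeyError for non-IUPAC characters
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence (use only for low degeneracy)."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(degeneracy("AKWGTTTGATCMTGGCTCAG"))  # 16SU27F
    print(expand("GGHTACCTTGTTACGACTT"))       # 16SU1492R
```

For highly degenerate primers such as KSDPQQF, `degeneracy` is the safer call, since `expand` would generate a very large list.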


Supplementary References

22. Rottig, M. et al. NRPSpredictor2 - a web server for predicting NRPS adenylation domain specificity. Nucleic Acids Res. 39, W362-367 (2011).

67. Patiny, L. & Borel, A. ChemCalc: a building block for tomorrow's chemical infrastructure. J. Chem. Inf. Model. 53, 1223-1228 (2013).
