Stochastic event sets for a probabilistic seismic hazard assessment

DIPLOMA THESIS

Author: Patrice Tscherrig

Supervised by:

Prof. Dr. Domenico Giardini

Dr. Martin Mai

Dr. Balz Grollimund

March, 2006

Author:
Patrice Tscherrig
Brüggliweg 3
3113 Rubigen
Tel: +41788533826
E-mail: [email protected]

Supervisors:

Prof. Dr. Domenico Giardini
Institut für Geophysik
Schafmattstr. 30, ETH Hönggerberg, HPP P6.1
8093 Zürich
Tel: +41446332610
E-Mail: [email protected]

Dr. Martin Mai
Schweiz. Erdbebendienst (SED)
Schafmattstr. 30, ETH Hönggerberg, HPP P7.1
8093 Zürich
Tel: +414463334075
E-Mail: [email protected]

Dr. Balz Grollimund
Swiss Reinsurance Company
Alfred-Escher-Str. 82
8093 Zürich
Tel: +41432856758
E-Mail: [email protected]

“Did you do some testing whether your results reflect the real world?”

___________________________

“What is the real world?”

A discussion related to the modelling process for the seismic risk in China, SwissRe internal

Abstract

A necessary step in seismic risk mitigation is an adequately conducted seismic hazard assessment. In this thesis I present a method that is rarely discussed in the literature but represents a powerful tool for probabilistic seismic hazard assessment. The method is based on a sampling-with-replacement technique applied to a given earthquake catalogue and creates synthetic earthquake catalogues that may be representative of the earthquake occurrence in the forthcoming millennia. This approach allows the incorporation of all information used in a conventional probabilistic seismic hazard assessment (after the Cornell approach), such as seismic source delineation and the corresponding parameters, empirical ground motion relationships and a probability analysis for the hazard at all sites of interest. The big advantage of generating synthetic earthquake catalogues is that each ground motion contributing to the seismic hazard is unambiguously related to a single earthquake in the synthetic catalogue, hence the designation “event set”. This is indispensable for a complete risk assessment process, in which the loss to lifeline structures or the loss on a portfolio of values needs to be calculated. Based on the presented method, I have conducted a seismic hazard assessment of the Central and Eastern United States and of mainland China. For the former, a clear correlation between my results and those of the USGS national seismic hazard assessment program has been found. For the latter, deviations of my results from those of the Global seismic hazard assessment program (GSHAP) have been found, in particular for Eastern China, and I emphasize the need for a revision of the ground motions predicted by GSHAP for mainland China. A disadvantage of creating synthetic earthquake catalogues is that each hazard calculation is non-unique.
I have investigated the aleatory variability caused by the use of random numbers in the different sampling steps involved, hence the name “stochastic”, and have found that the resulting hazard estimates at every site of interest are normally distributed. The parameters of the normal distribution vary with the spatial randomness applied to the locations of the earthquakes in the input catalogue. The last part of this thesis presents a general validation of the particular implementation of the sampling-with-replacement method. Thanks to the extraordinarily extensive earthquake catalogue of Eastern China, a direct comparison between the predicted ground motions and the ground motions observed during the past 500 years is possible. It is found that the presented method has certain weaknesses in accounting for the spatial and temporal behaviour of fault zones. Therefore a method is proposed that accounts for an earthquake-cycle model and mimics the spatial occurrence of earthquakes in fault zones. The presented method can easily be adapted to analyze the uncertainties in a seismic hazard assessment. Furthermore, a de-aggregation study (the analysis of which magnitude-distance pairs contribute most to the hazard at a certain site) could easily be incorporated and should form a field of future research.
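The sampling-with-replacement idea summarized above can be sketched in a few lines. This is a minimal illustration only, under assumed conventions (a catalogue of (magnitude, latitude, longitude) tuples, uniform radial epicentre scatter, a flat-earth degree conversion); it is not the thesis' actual implementation.

```python
import math
import random

def synthetic_event_set(catalogue, n_years_obs, n_years_syn, spread_km=10.0, seed=0):
    """Build a synthetic event set by sampling with replacement from an
    observed catalogue, scaling the event count to the synthetic duration
    and adding random spatial scatter to each epicentre.

    catalogue: list of (magnitude, lat, lon) tuples (illustrative format).
    """
    rng = random.Random(seed)
    # Scale the number of events to the synthetic catalogue duration.
    n_events = round(len(catalogue) * n_years_syn / n_years_obs)
    km_per_deg = 111.0  # rough flat-earth conversion (assumption)
    events = []
    for _ in range(n_events):
        mag, lat, lon = rng.choice(catalogue)        # sample with replacement
        r = spread_km * rng.random() / km_per_deg    # uniform radial scatter (assumption)
        theta = 2 * math.pi * rng.random()           # uniform azimuth
        events.append((mag, lat + r * math.sin(theta), lon + r * math.cos(theta)))
    return events
```

A 100-year input catalogue expanded to a 1000-year synthetic duration would thus yield roughly ten times as many events, each a resampled observed event with a perturbed epicentre.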




Zusammenfassung

A necessary step in reducing seismic risk is an adequately conducted seismic hazard analysis. In this diploma thesis I present a method that is little discussed in the literature but constitutes a very powerful tool for probabilistic seismic hazard analyses (PSHA). The method is based on sampling with replacement from a given earthquake catalogue and creates from it a synthetic earthquake catalogue that may be representative of the earthquake process of the next hundred thousand years. The presented method allows the incorporation of all information of a classical PSHA (after the Cornell approach), such as seismic source definitions and the corresponding parameters, empirical ground-motion relationships and a probability analysis of the hazard at all sites of interest. The great advantage of generating synthetic earthquake catalogues is that every ground motion contributing to the seismic hazard at a site is unambiguously linked to a single earthquake from the synthetic catalogue; hence the name “event set”. This is indispensable for a complete risk analysis, in which potential losses on a portfolio of values must be calculated. Based on the presented method, I have carried out a PSHA of the Central and Eastern United States and of China. For the USA, my results correlate clearly with those of the USGS. For China, in particular Eastern China, I found clear deviations from GSHAP (Global seismic hazard assessment program). I recommend that the hazard predicted by GSHAP for China be revised. A disadvantage of generating synthetic earthquake catalogues is that the results are non-unique. These aleatory variabilities result from various processes in which random variables are used in a controlled manner; hence the name “stochastic”. The variabilities of the final hazard levels at each site can be described by a normal distribution. The parameters of the normal distribution depend on the spatial scatter applied to the locations of the earthquakes in the input earthquake catalogue. A final part of this diploma thesis presents a validation of the presented methodology. Thanks to the earthquake catalogue of Eastern China, which reaches several hundred years back, a direct comparison between predicted and actually observed ground motions is possible. In fault zones our methodology appears to have difficulty resolving the spatial and temporal behaviour of the earthquake process. I therefore propose two methods that account for the earthquake cycle and for the spatial distribution of earthquakes along fault zones. Uncertainties in the hazard analysis can easily be represented with the method presented here. Furthermore, a de-aggregation study (the analysis of which magnitude-distance pairs contribute most to the hazard at a site) can very easily be carried out, and this should also constitute a first step of future research.




Contents

Abstract ............................................................................... I
Zusammenfassung ................................................................ II
Contents .............................................................................. III
Figures ................................................................................ VIII
Tables ................................................................................. XI

1 General Introduction .......................................................... 1
    1.1 Why seismic hazard assessment? ................................... 1
    1.2 Probabilistic seismic hazard assessment ......................... 2
    1.3 Stochastic event sets for probabilistic seismic hazard assessment ... 3
    1.4 Goal and contents of this study .................................... 6

2 Method ............................................................................. 9
    2.1 Fundamentals of the method ........................................ 9
    2.2 Catalogue of observed earthquakes ............................... 10
        2.2.1 Catalogue completeness ...................................... 10
    2.3 Seismic zones ........................................................... 11
    2.4 Magnitude distribution ............................................... 12
        2.4.1 Gutenberg-Richter relationship ............................. 12
        2.4.2 Non Gutenberg-Richter relationship: characteristic earthquakes ... 14
    2.5 Empirical ground motion relationship ........................... 16
    2.6 Probability analysis for hazard calculation ..................... 17
    2.7 Implementation ......................................................... 18
        2.7.1 Synthetic catalogue duration, number of generated events and sampling with replacement ... 18
        2.7.2 Magnitude sampling for Gutenberg-Richter distribution ... 19
            Sampling from a distribution ................................... 19
            Latin Hypercube sampling ...................................... 20
        2.7.3 Magnitude sampling for characteristic zones ........... 22
        2.7.4 Spatial distribution ............................................. 24
            Epicentre spread function ....................................... 24
            Drawback ............................................................ 25
        2.7.5 Depth Distributions ............................................ 27
        2.7.6 Faults ............................................................. 27
        2.7.7 Hazard map ...................................................... 28
        2.7.8 Computational requirements ................................ 29

3 Stochastic Event Set for Central and Eastern United States ... 31
    3.1 Motivation for this study area ..................................... 31
    3.2 Catalogue of observed seismicity and catalogue completeness ... 31
    3.3 Seismic zones ........................................................... 32
        3.3.1 Eastern and Western United States ....................... 33
        3.3.2 Wabash Valley .................................................. 33
        3.3.3 Charlevoix ....................................................... 33
        3.3.4 Characteristic zones: Introduction ........................ 33
        3.3.5 New Madrid seismic zone ................................... 34
        3.3.6 Charleston ....................................................... 36
        3.3.7 Summary of seismic zones .................................. 39
    3.4 Epicentre spread function ........................................... 39
    3.5 Faults ..................................................................... 40
    3.6 Attenuation relationship ............................................. 40
    3.7 Results ................................................................... 40
        3.7.1 Stochastic event set ........................................... 40
        3.7.2 Gutenberg-Richter distribution ............................. 42
        3.7.3 Hazard map ...................................................... 44
    3.8 Discussion ............................................................... 46
        3.8.1 Locations of stochastic events .............................. 46
        3.8.2 Magnitude-frequency distribution .......................... 46
        3.8.3 Hazard map ...................................................... 49

4 Stochastic Event Set for Mainland China ............................. 51
    4.1 Introduction ............................................................. 51
    4.2 Catalogue of observed seismicity .................................. 52
        4.2.1 Datasets .......................................................... 52
        4.2.2 Observed Problems and Discussion ....................... 52
        4.2.3 Catalogue completeness ...................................... 53
    4.3 Seismic zones ........................................................... 55
        4.3.1 Background seismic zones ................................... 55
        4.3.2 Characteristic source zones ................................. 58
            Kunlun fault zone ................................................. 59
            Altyn Tagh fault zone ............................................ 60
            Xianshuihe fault zone ............................................ 61
            Anninghe-Zemuhe fault zone ................................... 61
            Xiaojiang fault zone .............................................. 61
    4.4 Epicentre spread function ........................................... 62
    4.5 Faults ..................................................................... 62
    4.6 Attenuation relationship ............................................. 62
        4.6.1 Discussion ....................................................... 64
    4.7 Results ................................................................... 67
        4.7.1 Gutenberg-Richter distribution ............................. 67
        4.7.2 Stochastic event set ........................................... 71
        4.7.3 Hazard Map ..................................................... 71
    4.8 Discussion ............................................................... 74
        4.8.1 Magnitude-frequency distribution .......................... 74
            Deviations in b-values ........................................... 74
            Deviations in a-values ........................................... 74
        4.8.2 Locations of stochastic events .............................. 75
        4.8.3 Hazard Map ..................................................... 76
    4.9 Stochastic event set for Eastern China .......................... 77
        4.9.1 Input parameters ............................................... 78
        4.9.2 Results ........................................................... 79
        4.9.3 Comparison with GSHAP Results .......................... 81
        4.9.4 Discussion and suggested changes to either the GSHAP-model or the model presented in this thesis ... 82
            Tan-Lu fault zone and the 1668 A.D. M8.5 event ........ 82
            Ground motion underestimation due to limited model resolution ... 84
            GSHAP underestimation of ground motion in the Shanxi Rift zone ... 85
            Uncertainty of maximum magnitude in the Shanghai area ... 86
            Revision of ground motions in the GSHAP model to lower values south of the Shanghai area ... 86

5 Variabilities and Uncertainties ........................................... 89
    5.1 Terminology and Scope .............................................. 89
    5.2 Aleatory variability ................................................... 89
        5.2.1 Probability density distribution of the aleatory variability ... 90
        5.2.2 Changes of the probability density distribution of the aleatory variability with different parameter choices ... 92
            Epicentre spread function ....................................... 93
            Year of catalogue completeness ............................... 95
        5.2.3 Summary and Discussion ..................................... 95
    5.3 Epistemic variabilities and uncertainties ........................ 96
        5.3.1 Sensitivity Analysis of epicentre spread function ..... 97
        5.3.2 Statistical test .................................................. 100
        5.3.3 Application of statistical test ............................... 101
            Epicentre spread function ....................................... 101
            Year of catalogue completeness ............................... 104
        5.3.4 Discussion ....................................................... 105
            Epicentre spread function ....................................... 106
            Year of catalogue completeness ............................... 106
    5.4 Conclusion and Outlook ............................................. 109
        5.4.1 Aleatory variability ........................................... 109
        5.4.2 Epistemic variability .......................................... 109

6 General Validation and Improvements of the Method ............. 111
    6.1 Scope and goal ........................................................ 111
    6.2 General validation .................................................... 111
        6.2.1 Input earthquake catalogue construction ................. 112
        6.2.2 Comparison of magnitude and locations of synthetic events versus historical events ... 113
        6.2.3 Comparison of observed historical ground motions versus modelled ground motions ... 116
        6.2.4 Comparisons of magnitude-frequency distributions ... 118
        6.2.5 Discussion ....................................................... 119
            The 1668 A.D. M8.5 event ..................................... 119
            Higher and lower hazard levels than observed in fault zones ... 119
    6.3 Improvements of the random spatial distribution of synthetic events to account for fault-like behaviour ... 120
        6.3.1 Azimuthal earthquake occurrence in fault zones ...... 121
        6.3.2 Implementation of earthquake-cycle ...................... 123

7 Conclusions ..................................................................... 127
    7.1 Conclusions related to the methodology ........................ 127
    7.2 Conclusion of seismic hazard assessments ..................... 128

8 Outlook .......................................................................... 131



A clarification on used Intensity scale .................................... 135
    Modified Mercalli Intensity scale ..................................... 135

Abbreviations ..................................................................... 138

Curriculum Vitae ................................................................. 139

Acknowledgements ............................................................... 140

Bibliography ...................................................................... 142



Figures

Figure 1-1: The elements of a probabilistic seismic hazard assessment ... 2
Figure 2-1: Truncated GR law for an arbitrary source zone and dataset ... 13
Figure 2-2: Hazard curve for an arbitrary site with an arbitrary hazard ... 17
Figure 2-3: Flowchart showing the operation of bootEQ ... 18
Figure 2-4: Graphical representation of the LHC algorithms ... 21
Figure 2-5: Truncated GR distribution for a characteristic zone ... 23
Figure 2-6: Different characteristic models and their corresponding parameters ... 23
Figure 2-7: Visualisation of the epicentre spread function ... 26
Figure 2-8: Graphical representation of the algorithm by Smith (1995) ... 28
Figure 3-1: Seismic source zones for the CEUS region and the input earthquake catalogue ... 32
Figure 3-2: Seismic hazard map in PGA for CEUS for a return period of 475 years ... 34
Figure 3-3: Magnitude frequency distribution in the NMSZ ... 37
Figure 3-4: Same as Figure 3-3 but for a υ-value of 0.1, a Mmin of M6.4 and a Mmax of M7.5 ... 38
Figure 3-5: Final event set for the CEUS region ... 41
Figure 3-6: Magnitude frequency distributions for the different source regions used in this study ... 43
Figure 3-7: Magnitude frequency distribution for the Eastern Zone and Wabash Valley ... 44
Figure 3-8: Hazard map for the CEUS region for a return period of 475 years ... 45
Figure 3-9: Magnitude frequency distribution for the Charlevoix seismic zone ... 47
Figure 4-1: Estimation of the year of completeness of the SSB catalogue by plotting the cumulative sum of events in different magnitude bins versus time ... 54
Figure 4-2: Seismic source zones of China and the GSHAP catalogue ... 56
Figure 4-3: Comparison of different attenuation relationships ... 65
Figure 4-4: Same as Figure 4-3 for 4 different considered earthquakes ... 66
Figure 4-5: Magnitude-frequency distribution for the source zones 12, 15 to 18 and 22 ... 68
Figure 4-6: Same as Figure 4-5 for the source zones 1 to 11 ... 69
Figure 4-7: Same as Figure 4-6 but for the source zones 12 to 22 ... 70
Figure 4-8: Event set for the mainland of China ... 72
Figure 4-9: Hazard map for the mainland of China for a return period of 475 years ... 73
Figure 4-10: Hazard Map in PGA for China for a return period of 475 years (after Zhang Peizhen et al., 1999) ... 76
Figure 4-11: Time distribution of the earthquakes in Eastern China ... 79
Figure 4-12: Same as Figure 4-5 for the source zones that encompass the area of eastern China ... 80
Figure 4-13: Hazard map for Eastern China for a return period of 475 years ... 81
Figure 4-14: Hazard map in PGA for North-Eastern and South-Eastern China for a return period of 475 years ... 83
Figure 5-1: Above: histogram for 100 processed hazard maps with 475 years return period. Below: corresponding normal plots ... 91
Figure 5-2: Standard deviation in units of MMI intensity for 100 processed hazard maps of a return period of 475 years ... 92
Figure 5-3: Aleatory variability distributions for two parameter sets at four different points where the hazard was calculated ... 93
Figure 5-4: Standard deviations for a parameter with a smaller spread and larger spread ... 94
Figure 5-5: Upper panels: histograms for two parameter sets. Lower panel: comparison of the empirical cumulative density distribution of the observed diffused radii with a gamma distribution ... 98
Figure 5-6: Mean and variance of the gamma distribution for different parameter sets ... 100
Figure 5-7: Squares and triangles show the locations where the Null Hypothesis of the statistical test is accepted ... 103
Figure 5-8: Difference in calculated MMI levels for four different years of catalogue completeness ... 105
Figure 5-9: Comparison of magnitude frequency distribution for two parameter sets of the spread function and event set for the Ordos source zone ... 107
Figure 5-10: Earthquake rate for the SSB catalogue of Eastern China ... 108
Figure 6-1: Comparison of locations and magnitudes generated by bootEQ and the historical SSB-catalogue since catalogue completeness ... 114
Figure 6-2: Same as Figure 6-1 for South-Eastern China ... 115
Figure 6-3: Difference in ground motions between the observed ground motions and the predicted values of bootEQ ... 117
Figure 6-4: Magnitude-frequency distribution for the source zones that encompass the region of Eastern China ... 118
Figure 6-5: Representation of the anisotropic spread function ... 122
Figure 6-6: Visualization of the algorithm that accounts for an earthquake cycle ... 124
Figure 8-1: Illustration for a Monte Carlo approach for uncertainty analysis ... 132
Figure 8-2: De-aggregation for the city of Tangshan and Hong Kong ... 133



Tables

Table 2-1: LHC algorithms for a truncated GR distribution ... 22
Table 3-1: Summary of the seismic zones for CEUS used in this study ... 39
Table 4-1: Summary of the background seismicity zones for the mainland of China ... 57
Table 4-2: Summary of the parameters for the characteristic source zones used for the mainland of China ... 62
Table 5-1: Summary of the four different parameter sets to test the influence of the epicentre spread function ... 102



Chapter 1: General Introduction

1 General Introduction

1.1 Why seismic hazard assessment?

In 2005, 149 natural phenomena associated with earthquakes, floods or storms caused 97,000 fatalities and 230 billion U.S. dollars in financial losses (Sigma, 2006). With Earth's growing population and its concentration in a few densely populated areas, the alertness of modern society to the impact of disastrous natural events is increasing. The threat posed by any natural phenomenon depends on the size of the population and the value exposed to that event. The exposed value can be buildings or another quantity of measurable property, or the loss of life, which is far more difficult to rate. How a natural phenomenon influences society and economy is expressed by the term risk. Risk in the context of hazard assessment is defined as the hazard times the vulnerability of an exposed value times the social or economic elements involved (Alexander, 1993). Hazard characterizes the probability of exceeding, at a specific site during a time period of interest, a physical property characterizing the natural phenomenon or its consequences. Vulnerability is a measure of the fragility of the exposed social or economic elements and defines the degree of loss resulting from a given level of hazard. The degree of loss is expressed on a scale from 0 (no loss) to 1 (complete loss). Society has only two ways to reduce its exposure to a given risk: either not to place goods at locations where they could be exposed to a natural phenomenon, or to reduce the vulnerability of the goods that could be affected. The hazard due to a specific peril is given by nature itself and cannot be influenced by mankind. Nevertheless, it is essential to describe the hazard as exactly as possible. Only with a comprehensive understanding of the hazard can the vulnerability to this threat be reduced by means that are socially and economically feasible. This study deals with the evaluation of the seismic hazard.
Seismic hazard assessment is the quantitative estimation of the potential for dangerous earthquake-related natural phenomena such as ground shaking, ground failure or surface faulting. The classical output of a seismic hazard assessment is a seismic hazard map that expresses the seismic hazard as a probability of exceeding some ground-motion intensity measure during a time period of interest. Such maps are of interest to civil engineers and governments to guide the design of buildings and to define building codes. The proper design of constructions and adequately defined building codes reduce the vulnerability of the buildings that are ultimately affected by ground shaking due to earthquakes. The reduction of the vulnerability eventually leads to the mitigation of the seismic risk.
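The risk definition above (hazard times vulnerability times exposed value, after Alexander, 1993) can be written as a one-line calculation. The function name and all numbers below are purely illustrative, not taken from the thesis.

```python
def risk(hazard, vulnerability, exposed_value):
    """Risk = hazard x vulnerability x exposed value (Alexander, 1993).

    hazard: probability of exceeding a given ground-motion level (0..1)
    vulnerability: degree of loss at that level, 0 (no loss) to 1 (complete loss)
    exposed_value: the value of the exposed goods
    """
    return hazard * vulnerability * exposed_value

# Illustrative numbers only: 10% exceedance probability and a 30% loss
# ratio on an exposed value of 1,000,000 give an expected loss of about 30,000.
print(risk(0.10, 0.30, 1_000_000))
```

The formula also makes the two mitigation options in the text explicit: with the hazard fixed by nature, risk can only be lowered by reducing the exposed value or the vulnerability.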


1.2 Probabilistic seismic hazard assessment

There are two primary approaches to estimating the seismic hazard at a site by quantifying the ground motions at this site. The quantification of ground motions can be in terms of simple scalar values (e.g. peak ground acceleration, spectral acceleration, macroseismic intensities) or in terms of time series of acceleration, velocity and displacement. This work considers only scalar values, and "ground motion" will therefore from now on always refer to a scalar ground-motion intensity value. The first approach is the deterministic approach, which quantifies the ground motions at the site of interest from one or more earthquakes of specified location and magnitude. The deterministic approach is scenario-based and estimates the upper limit of ground motion at a specific site, as it usually considers only the largest possible earthquake at the nearest possible distance to the site of interest. The advantage of this approach is that it can easily be understood, even by a layperson. However, it does not consider all possible earthquakes that can affect the seismic hazard during a time period of interest. Therefore, probabilistic seismic hazard assessment (PSHA) is the most commonly used approach to assess earthquake hazard (Cornell, 1968). This approach takes all possible earthquakes from all possible magnitude-distance pairs into consideration and calculates the probability of various levels of ground shaking at the site. There are four basic steps in a PSHA, outlined schematically in Figure 1-1 (TERA, 1980). The first element of a PSHA (upper left panel of Figure 1-1) is to identify and specify where earthquakes can potentially occur; either areal seismic sources or line sources (faults) are distinguished. The description of the earthquake occurrence in space leads to the probability density distribution fR(r|m), describing the probability that the map distance to the site of interest is r, for an earthquake of magnitude m originating in the potential seismic source i (notation for this and the following equations taken from McGuire and Arabasz, 1990).

[Figure 1-1: The elements of a probabilistic seismic hazard assessment (TERA, 1980)]

The second element of a PSHA (upper right panel of Figure 1-1) describes the probability distribution of earthquake sizes and their corresponding rates. A magnitude-frequency distribution gives the probability density fM(m) and the rate υi, describing the probability that an event of magnitude m occurs, given that υi earthquakes above a magnitude m0 occur per unit time in seismic source i. These first two steps constitute a complete seismic source model. The third step of every PSHA (lower left panel of Figure 1-1) is to use mathematical equations to estimate the ground-motion intensity measure at the site of interest as a

2



Chapter 1: General Introduction

function of the magnitude and location (and possibly other free parameters). This step leads again to the construction of a probability distribution GA|m,r(a*), describing the probability that for an earthquake of magnitude m at distance r, the ground motion will exceed some ground-motion intensity measure a*. The final step is to relate all three ingredients through the total probability theorem to compute the probability of exceeding the ground-motion intensity measure a* during a certain time period t: P[ A > a* in time t ] / t ≅

N source

∑ υi i

∞ mmax

∫ ∫

GA|m,r ( a*) f M ( m ) f R ( r | m )* dmdr

(1.1)

r =0 m=m0

This formula signifies that the annual probability that a ground-motion intensity measure A exceeds a* (left side of equation (1.1)) is equal to the probability of a* being exceeded for a given magnitude and distance (GA|m,r(a*)), multiplied by the probability of an earthquake of that magnitude and distance (fR(r|m) fM(m)), integrated over all possible magnitudes and distances (∫ ∫*dmdr), taking all possible seismic sources (∑i υi) into consideration. The inverse of the annual probability of exceedance is called the return period. Thus, the PSHA presents a composite picture of the probability that a certain level of ground motion will be exceeded during a time of interest, considering all possible seismic events in the region.
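To make the integral concrete, the following is a minimal numerical sketch of equation (1.1) for a single toy source. The uniform-disc distance distribution, the lognormal attenuation relation and all coefficients are illustrative assumptions, not values from this thesis.

```python
import math

# Toy single-source model (all numbers are illustrative).
nu = 0.4             # rate of events with M >= m0 per year
m0, mmax = 5.0, 7.5  # magnitude integration bounds
beta = math.log(10) * 0.9  # from an assumed b-value of 0.9

def f_M(m):
    """Truncated exponential (Gutenberg-Richter) magnitude density."""
    k = 1.0 / (1.0 - math.exp(-beta * (mmax - m0)))
    return k * beta * math.exp(-beta * (m - m0))

def f_R(r):
    """Toy distance density: epicentres uniform on a disc of radius 100 km."""
    R = 100.0
    return 2.0 * r / R**2 if 0.0 <= r <= R else 0.0

def G_exceed(a_star, m, r):
    """P[A > a* | m, r] from a toy lognormal attenuation relation."""
    ln_med = -1.0 + 0.9 * m - 1.3 * math.log(r + 10.0)  # ln median PGA [g]
    sigma = 0.6
    z = (math.log(a_star) - ln_med) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def annual_exceedance(a_star, nm=200, nr=200):
    """Midpoint-rule evaluation of the double integral in equation (1.1)."""
    dm, dr = (mmax - m0) / nm, 100.0 / nr
    total = 0.0
    for i in range(nm):
        m = m0 + (i + 0.5) * dm
        for j in range(nr):
            r = (j + 0.5) * dr
            total += G_exceed(a_star, m, r) * f_M(m) * f_R(r) * dm * dr
    return nu * total

print(annual_exceedance(0.1))  # annual rate of exceeding PGA = 0.1 g
```

The annual exceedance rate is bounded above by the source's activity rate υ and decreases monotonically with the target ground-motion level a*, as expected from the equation.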

1.3 Stochastic event sets for probabilistic seismic hazard assessment

In today's economy the insurance industry provides, among other stakeholders, for the replacement of values affected by a destructive earthquake. For the earthquake hazard assessment of an insurance company, the classical form of PSHA as derived by Cornell (1968) is not applicable. The great advantage of this approach, i.e. the fact that it takes all possible source locations, magnitudes and distances into account, is also its weak point: the predicted ground motion during the time of interest is not assuredly related to one single earthquake, since that ground motion corresponds to many potential earthquakes around the site. A composite picture of the seismic hazard gives no information about the number of sites affected by one earthquake and is therefore not applicable for calculating losses on a portfolio of properties.

For instance, imagine two hazard maps of two sites of interest which appear identical. In one hazard map, the main contributors to the hazard are many small earthquakes occurring frequently at close range of the site. The main contributor to the hazard of the second site is a single large earthquake that occurs on average every 500 years in the region of interest. The causative events have completely different effects on the values that are exposed in these two areas. Buildings at the first site may frequently suffer minor damage from the numerous earthquakes, while buildings at the second site may experience damage only once: structural damage leading to their total destruction. The losses in terms of fatalities as well as economic loss will therefore be completely different.


Consequently, it is essential for an insurance company's earthquake hazard and risk assessment, which has to consider the potential losses on a portfolio of properties, to apply a form of PSHA that considers all sites simultaneously and also relates each predicted ground motion clearly to one single earthquake. Each event that contributes to the hazard at a given site has to be characterized by its temporal and spatial occurrence and can therefore be directly regarded as one possible scenario of earthquake occurrence (hence the name "event set": a set of possible events). In recent years a method became available which not only corresponds to this need but also compensates for the limitation of the Cornell (1968) approach: the generation of stochastic or synthetic earthquake catalogues.

Every PSHA is interested in time spans of several hundred or even thousands of years, in which a certain value a* of a ground-motion intensity measure is exceeded. Unfortunately, our instrumental and, at best, historical perception of the earthquake process covers only some decades to centuries. Hence, it is necessary to create stochastic earthquake catalogues whose information content might be representative for the earthquake occurrence of the next several tens to hundreds of thousands of years. This long duration gives an unbiased estimate of the hazard at all sites of interest. In essence, the generation of such an amount of data in the stochastic earthquake catalogue only requires a model describing the temporal and spatial distribution of the earthquake hazard: a seismic source model. This is identical to the first two steps of the classical Cornell (1968) approach (Figure 1-1). From this model a synthetic earthquake catalogue can be constructed by means of a Monte Carlo process (the controlled use of random numbers, hence the name "stochastic"). The stochastic earthquake catalogue can be constructed such that it obeys the seismic source model and is consistent with the past behaviour of seismicity.

The method presented in this thesis is based on a sampling-with-replacement technique applied to available earthquake catalogues; in a random sample with replacement, each observation in the input catalogue has the same chance of being selected and can be selected several times. This technique was first presented by Ebel and Kafka (1999). In essence, an equivalent number of earthquakes is re-sampled with replacement from a given input earthquake catalogue. Each of the events drawn by this procedure is assigned a magnitude from a magnitude list that follows predefined statistical properties and is consistent with the seismic source model. The assignment of the magnitudes is undertaken by means of a sampling-from-a-distribution technique. The locations of the epicentres in the re-sampled catalogue are varied randomly, while maintaining the overall relative density of epicentres in space. In order to obtain an unbiased estimate of the seismic hazard, the process described above is repeated several thousand times.

A simple example illustrates the feasibility of this approach. Assume an earthquake catalogue covering 200 years of data. In order to estimate the ground-motion value a* (on any measurement scale) that occurs on average every 500 years (after a Poisson process), several hundred thousand years of data are needed to obtain a reliable estimate of this annual probability. For this purpose, the process described above (drawing with replacement; magnitude assignment; random variation of the epicentre location) is repeated several thousand times. Each of these simulations represents a possible outcome of an earthquake catalogue of the next 200 years and is consistent with both the seismic source model and the seismicity behaviour of the past. 1,500 of these simulations give the effect of 300,000 years of data. One only needs to relate the magnitudes and locations in this stochastic earthquake catalogue containing 300,000 years of data to the ground-motion values at the site of interest (lower left panel of Figure 1-1) and sort the possible outcomes by size. The 600th-largest annual value in the sorted list (300,000/500) gives the ground-motion value a* with an annual probability of exceedance of 0.002, corresponding to a return period of 500 years.

There are numerous advantages of such an approach†. First of all, the output may be used directly to calculate the expected losses on a portfolio of properties and, in addition, the expected losses of lifeline structures such as power or water distribution systems, critical traffic systems and associated infrastructure. Each event in the stochastic catalogue is characterized by a frequency of occurrence, a magnitude and its location of occurrence. After the application of the third step of the PSHA (lower left panel of Figure 1-1), and if vulnerability functions are available (relating the vulnerability on a scale from 0 to 1 to the ground shaking), simulating the loss of buildings is straightforward.

This leads to the second large advantage of this method: the process of de-aggregation is extremely simple. De-aggregation of hazard is required to understand which magnitude-distance combinations contribute most to a specific ground-motion value at a site of interest; it is not trivial with a Cornell approach (Bazzurro and Cornell, 1999), but in an event-set-based approach it is straightforward. As each event in a stochastic event set is spatially and temporally well defined, it is straightforward to evaluate which magnitude-distance combinations lead to a ground-motion value of interest at a specific site.
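The sorted-list lookup described above can be sketched directly. The ground-motion values below are random placeholders standing in for the one-value-per-simulated-year output of a real event set; only the ranking logic is the point.

```python
import random

random.seed(1)

years = 300_000  # total simulated time span

# Placeholder: one maximum ground-motion value (in g) per simulated year.
annual_maxima = [random.lognormvariate(-3.0, 1.0) for _ in range(years)]

# The value with return period T years is approximately the (years/T)-th
# largest annual value: it is exceeded about once every T years on average.
T = 500
rank = years // T  # 600 for 300,000 years of data
a_star = sorted(annual_maxima, reverse=True)[rank - 1]
print(f"PGA with ~{T}-year return period: {a_star:.3f} g")
```

The same lookup works for any return period shorter than the simulated time span; the longer the synthetic catalogue, the more stable the estimate at long return periods.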
The third advantage is that uncertainties in the input parameters for the PSHA can be dealt with quite easily. Each input parameter can be entered as a distribution function with an observed mean and standard deviation, and a different value of this distribution function can be sampled for every simulation of a stochastic earthquake catalogue. Conventional logic trees can also be used, but their disadvantage is that the choice of weights for each branch in the tree tends to be subjective.

The last, and possibly trivial-appearing, advantage is that the method is conceptually very intuitive. The seismologist creates an earthquake catalogue that might be representative for the upcoming several hundred thousand years. It is very easy to communicate to all parties involved in a risk assessment, such as disaster planners, civil engineers, politicians and others, how this catalogue is created and how it is used to predict the earthquake hazard.

Despite the above-mentioned benefits of a method that creates a stochastic earthquake catalogue, it is barely represented in the literature, although it is not a new method. Rosenhauer (1983) applied it to the seismic hazard assessment for nuclear power plants in the Lower Rhinegraben; Shapira (1983) to the seismic hazard assessment of Jerusalem; and Ahorner and Rosenhauer (1993) to a hazard assessment of Germany, these authors being the first to incorporate multiple source zones in their model. More recently this approach was also incorporated in the complete risk assessment of insurance companies (Schmid and Schaad, 1994) and in the evaluation of possible earthquake scenarios (Baranov et al., 2002). Most recently, the method found its application in the national seismic hazard assessment programs for the United Kingdom (Musson, 2000) and Switzerland (Giardini et al., 2004).

† Part of the information in this and the following paragraphs is taken from Musson (2000).

1.4 Goal and contents of this study

The first part of this work (Chapter 2) presents a detailed description of how stochastic event sets are generated, as introduced by Dr. Mariagiovanna Guatteri, SwissRe New York. A thorough documentation of this otherwise underrepresented method is required. All background information necessary to understand the input model for the stochastic event set generation is outlined. A later section of this chapter provides the mathematical and statistical details of the algorithm and its implementation.

The method used in this thesis was applied in a first step to a seismic hazard assessment of the Central and Eastern United States (Chapter 3). This is a well-examined region in terms of seismic hazard assessment, so a comparison of the method presented in this thesis with the results of other authors is possible. As numerous studies are available for this region, a thorough literature research was conducted to obtain the most credible inputs for the underlying hazard model. I did not account for the uncertainties in the input parameters and therefore only determined the most credible input parameters. The goal is to examine whether the results obtained by the method presented in this study correlate with the results of other authors. The resulting hazard assessment of this region is preparatory work for a complete risk assessment modelling process of the Swiss Reinsurance Company for the same region.

The second region for which I carried out a seismic hazard assessment is China, with a focus on Eastern China (Chapter 4). Only few studies examining the seismic hazard of China exist. I performed an in-depth literature research to determine all possible inputs for a hazard model of this region. For Eastern China I carried out a direct comparison with the Global Seismic Hazard Assessment Program (Giardini et al., 1999). The goal was to determine whether or not our results are comparable to those attained by the previously mentioned authors.
Deviations between our results and the Global Seismic Hazard Assessment are discussed in detail, and reasons are outlined why the obtained results deviate from the previous studies. Furthermore, specific suggestions are made on which points future geophysical and geological research in China should focus. The results of the hazard assessment of this region have been directly incorporated by the Swiss Reinsurance Company into a complete seismic risk modelling process for China.

Despite the numerous advantages of generating stochastic event sets for probabilistic seismic hazard assessment, there are also some disadvantages. One drawback is examined in the fifth chapter of this thesis: as the creation of the synthetic earthquake catalogue is stochastic by nature, the results deviate among each other and are non-unique. The goal of the fifth chapter is therefore to examine the distribution of, and to quantify, the deviations in the final hazard calculation. The focus was set on a return period of interest of 475 years. Furthermore, I examined the sensitivity of the final hazard calculation to different input parameters, with the focus on the input parameters that determine the spatial occurrence of the stochastic events. A statistical test was developed for this purpose.

The final Chapter 6 uses the findings of the previous chapters in an effort to carry out a general validation of the presented method for stochastic event set generation. Thanks to the tradition of meticulous Chinese historiography, the earthquake catalogue of Eastern China permitted the comparison of the actually observed hazard of the past five decades with the hazard predicted by our model. Based on the deviations found, methods are proposed that better mimic the spatial and temporal occurrence of earthquakes in the synthetic earthquake catalogue, compared with the long-term behaviour of seismicity.




Chapter 2: Method

2 Method

2.1 Fundamentals of the method

In order to generate a stochastic event set, assumptions have to be made about how seismicity will behave in the future in terms of its spatial and temporal distribution. The only information available to constrain such a simulated earthquake catalogue is observed earthquake data. One possible assumption is that earthquakes will behave in the future in the same way as they did in the past.

In this chapter I introduce the concept and implementation of a novel method for generating stochastic earthquake catalogues. Dr. Mariagiovanna Guatteri of Swiss Reinsurance Company wrote a Matlab code called bootEQ. Its basic mode of operation is to apply a sampling-with-replacement technique to a given input earthquake catalogue and create a synthetic catalogue (from now on also called event set or stochastic earthquake catalogue). The method takes all the data from the observed seismicity into account but additionally accounts for the randomness of earthquake occurrence. The basic idea of sampling with replacement from a given input earthquake catalogue was first presented by Ebel and Kafka (1999). A brief summary of its working procedure: after drawing with replacement an equivalent number of events from the input earthquake catalogue (which can contain both instrumental and historical data), it assigns a magnitude to each of the chosen events. The magnitudes are chosen from a magnitude list in which the probability of selection for each magnitude value is determined from a Gutenberg-Richter recurrence relation. The locations of the synthetic events are determined from the locations of the events in the input catalogue, to which further randomness may be applied. This implies that a set of stochastically generated events can be treated independently in space and time from the input earthquake catalogue: the magnitudes of earthquakes in the synthetic catalogue are independent of the observed magnitudes of the earthquakes in the input catalogue. The locations of earthquakes in the synthetic catalogue depend solely on the applied spatial randomness, and their rate of recurrence is determined from a Gutenberg-Richter recurrence relation. An empirical ground motion relationship relates the magnitudes in the synthetic catalogue to the ground motions that can be expected at all sites of interest.

In order to obtain an unbiased estimate of the annual probability of exceeding a certain ground motion at a site of interest, one needs a very large number of events in the synthetic catalogue. Therefore the process of re-sampling an input earthquake catalogue, assigning a magnitude value to each event in this re-sampled catalogue, accounting for the spatial randomness of earthquake occurrence, and calculating the ground motions at all sites of interest is repeated n times, and for each simulation run a synthetic event set is generated that contains the same number of events as the input earthquake catalogue. This leads to a very large number of synthetic events and therefore gives an unbiased estimate of the annual probability of exceeding a certain ground motion. As a simple example, 750 simulations of 200 years of seismicity in the input earthquake catalogue give the effect of 150,000 years of data. When the ground-motion value from each of these 150,000 years is sorted by size, one can determine the ground-shaking value with a 0.002 annual probability of being exceeded by picking the 300th-largest value (150,000/500) in the sorted list (cf. Musson, 2000).

In this chapter I first introduce the background information necessary to understand the working procedure of bootEQ. In section 2.7 the specific implementation in bootEQ is outlined.
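One simulation run of the procedure just described can be sketched as follows. bootEQ itself is a Matlab code and is not reproduced here; the four-event catalogue, the perturbation scale and the magnitude bounds below are illustrative assumptions.

```python
import random

random.seed(42)

# Illustrative input catalogue: epicentres (lat, lon) of observed events above
# the completeness magnitude; observed magnitudes are deliberately NOT reused.
input_catalogue = [(37.2, -89.6), (36.9, -89.1), (38.1, -90.3), (35.8, -90.0)]

m_min, m_max = 5.0, 7.5  # magnitude bounds of the source model (assumed)
sigma_loc = 0.2          # std. dev. of the location perturbation in degrees

def draw_magnitude():
    # Placeholder: bootEQ samples from the truncated Gutenberg-Richter CDF
    # (section 2.4); a uniform draw keeps this sketch self-contained.
    return random.uniform(m_min, m_max)

def simulate_event_set():
    """One run: resample with replacement, perturb locations, assign magnitudes."""
    events = []
    for _ in range(len(input_catalogue)):          # same number of events
        lat, lon = random.choice(input_catalogue)  # sampling WITH replacement
        lat += random.gauss(0.0, sigma_loc)        # spatial randomness
        lon += random.gauss(0.0, sigma_loc)
        events.append((lat, lon, draw_magnitude()))
    return events

# n repeated runs stack up to one long synthetic catalogue.
event_sets = [simulate_event_set() for _ in range(750)]
print(len(event_sets), len(event_sets[0]))  # 750 4
```

Each run preserves the number of events (and hence the seismicity rate) of the input catalogue while decoupling the synthetic magnitudes and exact locations from the observed ones.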

2.2 Catalogue of observed earthquakes

The first step in each probabilistic seismic hazard assessment (PSHA) is the assembly of an earthquake catalogue. As outlined in the introduction to this chapter, bootEQ re-samples a given input earthquake catalogue and creates a synthetic catalogue on the basis of the observations. The input earthquake catalogue can contain historical and instrumental data. Each earthquake has to be characterized by a magnitude and a location. It is necessary that the magnitude scale in the earthquake catalogue be homogenized; if the observed catalogue contains different magnitude scales, conversion schemes may be applied. This may sound like a contradiction to the statement made in the introduction of this chapter, where it was mentioned that only the spatial information of the input catalogue is needed and the magnitudes are assigned from a magnitude list with predefined statistical properties. But in order to know how many synthetic events shall be generated, one needs to rely on the seismicity rate (number of events per unit time above a certain threshold magnitude) in the input earthquake catalogue (besides the fact that the statistical properties of the magnitude list will also be determined from this input catalogue; see section 2.3). To accurately determine this seismicity rate, the magnitude information of the input earthquake catalogue is needed. Why this information is necessary is outlined in the next sub-section.

2.2.1 Catalogue completeness

Each earthquake catalogue contains a threshold in time above which 100% of the events above a threshold magnitude Mc are completely represented in the catalogue. Below Mc, a fraction of events is missed by the seismic network (1) because the network does not have enough stations to accurately and reliably determine the magnitudes and locations of these events; (2) because network operators decide that events below a certain threshold are not of interest; or (3), in the case of an aftershock sequence, because they are too small to be detected within the coda of larger events. For example, the seismic network of Switzerland has an Mc of M1.8 (Giardini et al., 2004). When considering time intervals before instrumental recordings, this threshold magnitude Mc is larger and depends on population density; in uninhabited places even the largest events pass undetected. In the western United States, for instance, there is no complete historical record of earthquakes greater than M5 before the 18th century: numerous earthquakes above M5 happen there every year, but the population density before the 18th century was extremely low, so the events passed unnoticed or a historical record of these events is missing. Hence it follows that Mc varies temporally and spatially.

The completeness in time of a historical earthquake catalogue is determined by analyzing the rate of seismicity versus time within different magnitude intervals for the whole study area (Mulargia et al., 1987). This analysis yields the year since which the input earthquake catalogue is complete for earthquakes with magnitudes above the threshold magnitude Mc. I will refer to that point in time as the year of catalogue completeness. The knowledge of Mc, respectively of the year of catalogue completeness, determines the rate of seismicity (number of events per unit time in the whole area of interest) of the input earthquake catalogue, and it is assumed that the stochastic event set has exactly the same seismicity rate. It therefore follows that bootEQ assumes that earthquakes can be treated as temporally independent. More explicitly, the seismicity rate determines the number of synthetic events that have to be generated for a given simulation time (section 2.7.1). For example, if the input earthquake catalogue spans 200 years since the year of catalogue completeness and 80 earthquakes greater than the threshold magnitude Mc were observed during this time, the seismicity rate is 0.4 earthquakes per year (80/200). If we want to generate an event set of 150,000 years, we need 60,000 events in our stochastic event set (same ratio).
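The arithmetic of this example can be written out directly:

```python
# Worked version of the example above (numbers from the text).
catalogue_years = 200      # span since the year of catalogue completeness
observed_events = 80       # events with M >= Mc in that span

rate = observed_events / catalogue_years   # 0.4 earthquakes per year

target_years = 150_000     # desired length of the stochastic event set
n_synthetic = round(rate * target_years)   # events to generate

print(rate, n_synthetic)   # 0.4 60000
```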

2.3 Seismic zones

The second step of each PSHA is the identification of possible seismic sources (see the general introduction). A seismic zone defines an area with a similar style of deformation and homogeneous seismicity. Seismic zones are defined based on information about active faults, observed seismicity from historical and instrumental seismic catalogues, and tectonic and geological data. In a seismic hazard analysis, source zones are discriminated into background seismicity zones and characteristic source zones. Background seismicity zones are areal sources that obey the same statistical properties of earthquake occurrence in time (see the next subchapter) throughout the whole source zone: the parameters of the Gutenberg-Richter law (or any of its derivations, see the next subchapter) are constant inside the whole zone. It follows that anywhere in such a source zone, the average period of time between occurrences of earthquakes of a given size is similar. The same holds for characteristic seismic source zones. Characteristic source zones produce, at more or less regular intervals, earthquakes near or at their maximum magnitude, and are mostly associated with faults (see section 2.4.2). It is beyond the scope of this study to define these source zones; in this work the source zone definitions for the regions of interest were obtained from the available literature.


2.4 Magnitude distribution

In each PSHA, the description of earthquake occurrence in space (see the previous section) is directly linked to the description of the probability distribution of earthquake sizes in time. Each seismic zone obeys the same probability distribution of earthquake sizes in time. Examining an earthquake catalogue, one sees that the size of earthquakes and their frequency of occurrence are correlated: since the early days of observational seismology it has been observed that the number of earthquakes decreases with increasing magnitude. This was first recognized in Japan by Ishimoto and Iida (1939) and in California by Gutenberg and Richter (1944). This subchapter describes and characterizes the properties of the probability distribution of earthquake sizes in time. The procedure by which bootEQ recreates a given magnitude-frequency relationship is outlined in section 2.7.2.

2.4.1 Gutenberg-Richter relationship

Gutenberg and Richter (1944) formulated the relationship between the frequency of occurrence and the size M of earthquakes as:

\[
\log_{10} N(M) = a - bM \tag{2.1}
\]

where N is the cumulative number of earthquakes with magnitude greater than or equal to M. Plotting the logarithm of the number of events, log10 N(M), versus the magnitude M yields a straight line. The a- and b-parameters in equation (2.1) are generally obtained by regression on a database of recorded earthquakes in the source zone of interest; the magnitude reporting in the zone of interest therefore needs to be homogeneous (section 2.2). 10^a is the mean yearly number of earthquakes of magnitude greater than or equal to zero. In a plot of log10 N(M) versus the magnitude M, the intercept with the y-axis defines the a-parameter (arrow in Figure 2-1). The a-parameter can be considered the seismic activity rate in a certain area: the higher the a-parameter, the more events are expected to occur in that area during one year. The b-parameter describes the relative likelihood of larger versus smaller magnitude earthquakes; in the same plot it is given by the slope of the straight line (Figure 2-1). Equation (2.1) can be rewritten as:

\[
\lambda_M = 10^{a - bM} = e^{\alpha - \beta M} \tag{2.2}
\]

where α = ln(10)·a and β = ln(10)·b. λ_M can be considered a mean annual rate of exceedance, i.e. the frequency with which an event of magnitude greater than or equal to M takes place during one year. This equation is called the standard Gutenberg-Richter (GR) law. To define a reliable GR law, one also needs to rely on historical data, because large earthquakes can have a very low mean annual rate of occurrence. Theoretically this law covers an infinite range of magnitudes, from -∞ to +∞. However, as mentioned in section 2.2.1, seismic networks detect only magnitudes down to Mc. As a fraction of the events below Mc is missed, the plot of log10(λ_M) versus M no longer yields a straight line: at magnitudes below Mc, log10(λ_M) deviates from the straight line towards lower values of λ_M (see Figure 2-1 for a graphical representation).

For an analysis of seismic hazard, the effect of earthquakes smaller than M4 or M5 is often negligible, as these earthquakes seldom cause any damage. If earthquakes smaller than a threshold magnitude Mmin are ignored, the mean annual rate of exceedance can be written as (McGuire and Arabasz, 1990):

\[
\lambda_M = \upsilon \, e^{-\beta (M - M_{min})} \tag{2.3}
\]

where υ = e^(α − β·Mmin). In most seismic hazard approaches, Mmin is set to values between M4.0 and M5.0.
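As a short illustration (with assumed values a = 4.0 and b = 0.9, not taken from any catalogue in this thesis), the rate of equation (2.2) can be evaluated and the b-value recovered from data with Aki's (1965) maximum-likelihood estimator:

```python
import math
import random

random.seed(0)

a, b = 4.0, 0.9          # illustrative Gutenberg-Richter parameters
beta = math.log(10) * b
m_min = 5.0

def rate_above(m):
    """Mean annual rate of events with magnitude >= m (equation 2.2)."""
    return 10 ** (a - b * m)

# Synthetic magnitudes drawn from the exponential GR law above m_min,
# just to demonstrate recovering b from data.
mags = [m_min + random.expovariate(beta) for _ in range(20_000)]

# Aki (1965) maximum-likelihood b-value estimator. (With binned magnitudes,
# m_min should additionally be reduced by half the bin width.)
b_hat = math.log10(math.e) / (sum(mags) / len(mags) - m_min)

print(rate_above(6.0), b_hat)
```

With 20,000 samples the estimate falls close to the input b-value of 0.9, and rate_above(6.0) returns 10^(4 − 0.9·6) ≈ 0.04 events per year.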

Figure 2-1: Truncated GR law for Mmin = 4, Mmax = 6, Mc = 3.2 and a b-value of 0.9 for an arbitrary source zone and dataset, plotted on a semi-logarithmic scale. Note that the part of the blue line for magnitudes below Mmin (lighter blue part of the curve) is actually not defined by equation (2.5) but is drawn here to illustrate that the straight part of the blue line continues down to Mc. Below Mc the λ_M-values are shifted towards lower values. The intersection of the graph with the y-axis at a magnitude of zero gives 10^a (arrow). The slope of the straight part of the blue line is the b-value. The red square marks the observation that certain events take place with a higher frequency than expected from the GR law near their characteristic magnitude Mchar (section 2.4.2).


Equation (2.2) shows that the GR law implies that earthquake magnitudes are exponentially distributed†. The corresponding cumulative distribution function (CDF) is given by:

\[
P(M < m \mid M > M_{min}) = \frac{\lambda_{M_{min}} - \lambda_{m}}{\lambda_{M_{min}}} = 1 - e^{-\beta (m - M_{min})} \tag{2.4}
\]

where P(M < m | M > Mmin) is the probability that the random variable M is smaller than a value m under the condition that M is larger than Mmin. This formula therefore gives the probability that magnitude M will not exceed m.

At the upper end of the magnitude scale, the standard GR law predicts nonzero mean rates of exceedance for magnitudes up to infinity, but an earthquake of magnitude greater than M9.5 has never been observed (Kanamori, 1977). It therefore makes sense to introduce some maximum magnitude Mmax. Again, a mean annual rate of exceedance can be constructed (McGuire and Arabasz, 1990):

\[
\lambda_M = \upsilon \, \kappa \left[ e^{-\beta (M - M_{min})} - e^{-\beta (M_{max} - M_{min})} \right] \tag{2.5}
\]

where κ = (1 − e^(−β(Mmax − Mmin)))^(−1). This equation is defined on magnitude intervals from Mmin to Mmax and is often called the truncated GR law. From this equation the corresponding CDF can be constructed:

\[
P(M < m \mid M_{min} \le M \le M_{max}) = \kappa \left[ 1 - e^{-\beta (m - M_{min})} \right] \tag{2.6}
\]

Again this is the probability that the random variable M is smaller than a value m, but this time under the condition that M is larger than Mmin and smaller than Mmax. For a graphical representation of this CDF and how it is used to construct the magnitude values of the synthetic catalogue, refer to section 2.7.2.
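A truncated-GR CDF of this form can be used to draw magnitudes by inverse-transform sampling, i.e. by solving u = CDF(m) for m with u uniform on (0, 1). The parameter values below are illustrative; the actual implementation in bootEQ is described in section 2.7.2.

```python
import math
import random

random.seed(7)

beta = math.log(10) * 0.9   # from an assumed b-value of 0.9
m_min, m_max = 5.0, 7.0
kappa = 1.0 / (1.0 - math.exp(-beta * (m_max - m_min)))

def cdf(m):
    """Truncated GR CDF, equation (2.6)."""
    return kappa * (1.0 - math.exp(-beta * (m - m_min)))

def sample_magnitude():
    """Invert equation (2.6): m = m_min - ln(1 - u/kappa) / beta."""
    u = random.random()
    return m_min - math.log(1.0 - u / kappa) / beta

mags = [sample_magnitude() for _ in range(50_000)]

# Empirical check: the fraction of samples below m should match the CDF.
m_test = 5.5
frac = sum(1 for m in mags if m < m_test) / len(mags)
print(frac, cdf(m_test))   # the two values should agree closely
```

Setting u = 1 in the inversion recovers m = Mmax and u = 0 recovers m = Mmin, so every sampled magnitude stays inside the truncation bounds by construction.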

2.4.2 Non Gutenberg-Richter relationship: characteristic earthquakes

Since a long time it has been questioned that the GR law (or any of its deviations) is able to represent the behavior of a single source (like a fault) (Schwartz and Coppersmith, 1984). The same study showed that paleoseismic evidence indicates that individual points on faults and fault segments tend to move by approximately the same distance in each earthquake. This has been interpreted that individual faults repeatedly generate earthquakes of similar size, known as characteristic earthquakes, at or near their maximum magnitude Mchar. By dating the characteristic earthquakes their historical rate of recurrence can be estimated. Geologic observations also indicate that characteristic earthquakes occur more frequently than would be implied by extrapolation of the GR law (see red square in Figure 2-1). There are four possibilities to estimate the characteristic magnitude and/or recurrence interval of a single source. The first estimate comes from an interpretation † The probability density function of an exponential distribution has the form for x≥0 where λ is the shape parameter. λ is often called the rate parameter as in a Poisson process in which an object initially in state A can change to state B with a constant probability per unit time λ. The radioactive decay is an example of such a process.

14



Chapter 2: Method

of the magnitude-frequency distribution of earthquakes. If characteristic events follow a GR distribution, the b-value of the distribution can be used to infer the recurrence of large earthquakes. As mentioned above, this approach has been called into question. Nevertheless, monitoring today's distribution of seismicity can be very useful to illuminate principal fault zones.

The second approach is the interpretation of paleoliquefaction features preserved in sediments. One such feature is sand blows: patches of sand that erupt onto the ground when waves from a large earthquake pass through wet, loose sand. The passing seismic waves increase the water pressure and cause the sand to liquefy, and the liquefied sand emerges at the surface (e.g. USGS Fact Sheet, 2002). If the age of material buried by erupted sand can be dated with radiogenic isotope dating techniques, then the maximum age of the earthquake can be determined. If such paleoliquefaction features are found numerous times in a seismic zone, it is possible to use them as a basis to estimate a recurrence interval. Magnitude estimates can be obtained, e.g., by measuring riser offsets or other offsets of points that were connected before the earthquake in the field. The disadvantage of these studies is that they can either over- or underestimate the recurrence interval of characteristic events, depending on whether all pre-historic earthquakes are preserved in the sediments, whether each event is detected and interpreted as a single event, whether multiple observations of the same event are interpreted as a single event or mistakenly as multiple earthquakes, and how accurately the events can be dated.

The third kind of data available to estimate recurrence intervals and characteristic magnitudes are displacement-rate measurements in a seismic zone. In such a study one measures the displacement rates along and near a fault, e.g. by GPS. Knowing the fault geometry, the strain level that causes the fault to rupture can be estimated, and from the measured displacement rates the time needed to accumulate that strain level can be derived. Such displacement measurements always give the minimum recurrence interval of earthquakes on a single source, as the whole displacement is assumed to be released as seismic energy; it has been shown that a significant portion of the total deformation can be released in aseismic deformation (Burgmann et al., 2001). Furthermore, the uncertainties in current GPS data are still large.

The last and most frequently applied approach to determine Mchar is the interpretation of first-hand reports of the shaking and/or damage caused by the earthquakes. In such a study, all the damage or felt effects of an earthquake are reported. It is then possible to apply a "log(isoseismal area A)" method of macroseismology (e.g. Hanks et al., 1975; Toppozada, 1975), which relates the area of a given intensity to a magnitude†. There is often considerable uncertainty around the estimated recurrence interval and magnitude of characteristic events.

In order to model the characteristic earthquakes, Youngs and Coppersmith (1985) developed a magnitude-frequency density function that combined an exponential magnitude distribution at lower magnitudes (the same

† e.g. Johnston (1996) related the seismic moment M0 to the area AMMI of an assigned intensity via equations of the form $\log(M_0) = a + b \log(A_{MMI}) + c\,A_{MMI}$. The coefficients of these equations were obtained by a regression on a database of observed intensity values.


as outlined in section 2.4.1) with a uniform distribution in the vicinity of the characteristic earthquake, to account for the uncertainties in the magnitude of characteristic events. Other models, such as the time-dependent model of Wu et al. (1995), have also been proposed. In this study another approach was applied; it is outlined in section 2.7.3.

2.5 Empirical ground motion relationship

To calculate the ground motion intensity value that contributes to the seismic hazard at a site, an equation is needed that relates a physical ground motion parameter or a macroseismic parameter to a few independent parameters, such as source-to-site distance, magnitude of an earthquake or style of deformation. Such a relationship is called an attenuation relationship. The coefficients in the equation are found by a regression analysis that relates strong motion recordings (or macroseismic observations) to the independent variables. With a given attenuation relationship, one is able to calculate dependent variables, such as physical ground motion parameters or the macroseismic parameter, for one earthquake in an earthquake catalogue at a given location. One possible mathematical form of such an equation is:

$$MMI = a_1 + a_2 M - a_3 \ln(\Delta) - a_4 \Delta \qquad (2.7)$$

where MMI is the intensity on the Modified Mercalli Intensity scale, ∆ the source-to-site distance (which can be hypo- or epicentral distance), M the magnitude of the event and a1...a4 coefficients found by the mentioned regression analysis. However, there exists no standard mathematical expression relating the independent parameters to a macroseismic parameter. Workers are often forced to choose their techniques based on the available data, which vary greatly depending on the geographical region. It is therefore vital to use an attenuation relationship that is calibrated for the region of interest. A detailed review of different attenuation relationships and the techniques used can be found in Douglas (2003). For the reasons mentioned in the Appendix, insurance companies often prefer intensity values for their hazard assessment; for them it is therefore often more favourable to choose attenuation relationships relating distance, magnitude and intensity. Furthermore, for historical data only the macroseismic parameters of the ground motions are known (section 2.4.2); relating first-hand reports of the shaking to other ground motion parameters is difficult and introduces a great deal of uncertainty. Note that nowadays attenuation relationships are also described in a probabilistic sense: the standard deviation of the predicted ground motion parameter is computed as well, which accounts for the uncertainty in the attenuation relationship. For most attenuation relationships relating macroseismic intensity to the independent parameters, the standard deviation is not computed. This does not mean that such relationships are free from errors, only that this error bound is mostly not calculated. In a probabilistic sense I therefore only compute the median of the predicted ground motions.
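The functional form of equation (2.7) can be sketched in a few lines of Python. The coefficients used below are pure placeholders, not a calibrated relationship for any region:

```python
import math

def mmi_attenuation(magnitude, distance_km, a1, a2, a3, a4):
    """Predicted MMI following the functional form of equation (2.7).
    In practice a1..a4 come from a regional regression analysis."""
    return a1 + a2 * magnitude - a3 * math.log(distance_km) - a4 * distance_km

# Purely illustrative coefficients (NOT a calibrated relationship):
mmi = mmi_attenuation(6.5, 30.0, a1=3.0, a2=1.0, a3=1.0, a4=0.002)
print(round(mmi, 2))  # 6.04
```

As required of any attenuation relationship, the predicted intensity decays monotonically with distance and grows with magnitude.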


2.6 Probability analysis for hazard calculation

With the steps described in the previous sections, a complete model is defined, as needed for each probabilistic seismic hazard assessment (PSHA): seismic sources, their corresponding parameters and an attenuation relationship (see also the general introduction for the necessary steps in a PSHA). The last piece of information needed is the spatial description of earthquake occurrence. This last step is described in section 2.7.4, as the "bootEQ approach" differs considerably from the approaches commonly used in PSHAs. This model now has to be incorporated into a probability analysis that calculates the probability per unit time that a given amplitude of ground motion or a macroseismic parameter will be exceeded at one or all sites of interest. The output of this calculation is a seismic hazard curve (e.g. McGuire and Arabasz, 1990) for each site of interest. As mentioned in the introduction, this probability analysis is straightforward if we have a synthetic earthquake catalogue (event set) that covers several hundred thousand years of data: one simply sorts the expected ground motion intensity levels at the sites of interest in decreasing order, which yields the seismic hazard curve (Figure 2-2). If one is interested in an annual probability of exceedance of 0.002, it is only necessary to pick the 501st value in the sorted list, i.e. the 501st worst outcome. Note that an annual probability of exceedance of 0.002 corresponds to a return period of 500 years. The creation of a hazard map is then simply done by picking the specific ground motion value that corresponds to the return period of interest.
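The sorting step above can be sketched as follows. This is an illustrative Python sketch (not the bootEQ implementation): intensities at one site are ranked in decreasing order, and the value whose exceedance rate matches the chosen return period is read off:

```python
import numpy as np

def hazard_value(intensities, sim_years, return_period):
    """Intensity with the given return period, read off a sorted
    event-set list: the k-th largest value is exceeded roughly k times
    in sim_years, i.e. once every sim_years/k years on average."""
    ranked = np.sort(np.asarray(intensities))[::-1]   # decreasing order
    k = int(sim_years / return_period)                # rank of interest
    return ranked[k - 1]

# Toy event set: 400 events in a 1000-"year" simulation,
# with arbitrary intensities between 4 and 9.
rng = np.random.default_rng(0)
events = rng.uniform(4.0, 9.0, size=400)
print(hazard_value(events, sim_years=1000, return_period=100))
```

For a 150,000-year event set and a 500-year return period, the rank of interest would be 150,000/500 = 300.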

Figure 2-2: Hazard curve for an arbitrary site with an arbitrary hazard. The hazard curve relates the return period to a ground motion value or macroseismic parameter (as in this example). The return period corresponds to an annual probability of exceeding a certain ground motion (the inverse of the return period). The creation of a hazard map is simply done by picking the ground motion value corresponding to the return period of interest. This is shown with the green lines, which mark the MMI value that would appear on a seismic hazard map for a 500-year return period. The yellow line segment denotes a feature that is discussed in section 2.7.7.


2.7 Implementation

In the following subsections I present the specific implementation of the working procedure of bootEQ. A flowchart of this procedure can be found in Figure 2-3, graphically displaying the different steps for defining the quantities needed for the final hazard calculation. One specific example will guide the reader through this subchapter.

Figure 2-3: Flowchart showing the operation of bootEQ. Each step is outlined in the following paragraphs.

2.7.1 Synthetic catalogue duration, number of generated events and sampling with replacement

The simulation time of the synthetic catalogue determines the number of events to be generated in it. The assumption is that earthquakes can be treated as temporally independent; no time dependency of the magnitude-frequency distribution is allowed (as proposed by e.g. Wiemer and Wyss, 2002). For example, if the input earthquake catalogue spans 200 years since the year of catalogue completeness (section 2.2.1), and 80 of a total of 120 earthquakes greater than the threshold magnitude Mc were observed during this time, we have an average of 80 earthquakes per 200 years (0.4 events per year). Note that we normally fix Mc in a given space volume to Mmin (the minimum magnitude of interest), and the year of catalogue completeness determines the time volume in which this particular Mc is obtained. If we wanted to generate an event set of 150,000 years we would need


60,000 events in our event set (same ratio). That means our final stochastic earthquake catalogue has to contain 750 (150,000/200) times the amount of data of the input earthquake catalogue, which is equivalent to creating 500 times as many earthquakes as the complete input earthquake catalogue contains (60,000/120). In order to create the full event set, bootEQ therefore needs to repeatedly draw, 500 times, an amount of earthquakes equivalent to the input earthquake catalogue (120 earthquakes in this example). The "drawing" is done by a sample-with-replacement technique: in a random sample with replacement, each observation in the input catalogue has the same chance of being selected and can be selected several times. For each of these 500 samples a magnitude list is created that follows a GR distribution (see next section), and the locations are randomly varied around the locations of the input earthquake catalogue (see section 2.7.4). This process is illustrated by the arrow "n repetitions" in Figure 2-3.
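The bookkeeping of this example, together with the sample-with-replacement step, can be sketched in Python (an illustration only; bootEQ itself is written in Matlab):

```python
import numpy as np

# To simulate 150,000 years from a 200-year catalogue with 120 events
# (80 of them above Mc), the input catalogue is redrawn n_rep times,
# every event being equally likely and selectable more than once.
rng = np.random.default_rng(42)

catalogue_years = 200
n_events        = 120          # events in the input catalogue
n_above_mc      = 80           # events above Mc -> 0.4 events/year
sim_years       = 150_000

n_target = int(sim_years * n_above_mc / catalogue_years)   # 60,000 events
n_rep    = n_target // n_events                            # 500 redraws

# Each redraw picks 120 event indices with replacement:
samples = [rng.choice(n_events, size=n_events, replace=True)
           for _ in range(n_rep)]
print(n_target, n_rep, len(samples) * n_events)   # 60000 500 60000
```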

2.7.2 Magnitude sampling for Gutenberg-Richter distribution

As mentioned in section 2.3 and in the introduction to this chapter, bootEQ selects for each synthetically generated event a magnitude from a magnitude list, where the probability of selection is determined by a truncated GR law. In order to generate magnitudes that have the same statistical properties as the observed catalogue, a sampling-from-a-distribution technique is applied. How this procedure works and how it is applied in bootEQ is outlined in the next two subchapters.

Sampling from a distribution

Any cumulative distribution function (CDF) can be used to map a random variable (in our case magnitudes) to a uniformly distributed variable on the interval 0 to 1 (from now on referred to as a uniform[0,1] variable). Since any CDF (including the truncated GR distribution described by equation (2.6)) takes values between 0 and 1, it maps every value of the random variable to a number between 0 and 1. In more detail: randomly drawing numbers from the random variable and calculating the corresponding value of the CDF (e.g. via equation (2.6)) always results in a number between 0 and 1, and these numbers are uniformly distributed on [0,1] (the probability integral transform). The process can therefore be considered as a mapping from the random variable to a uniform[0,1] variable. The problem, as outlined in the introduction to this subchapter, is the opposite one: to generate random variables that follow the statistical properties of a given CDF (given by equation (2.6)). This is exactly the inverse of the process described above. Instead of mapping a random variable to a uniform[0,1] variable, a uniform[0,1] random variable has to be mapped to a random variable that follows the truncated GR distribution. To achieve that, equation (2.6) can simply be solved for the magnitude, which leads to the following formula:


$$M = M_{min} - \frac{1}{\beta} \ln\!\left(1 - \frac{U_{[0,1]}}{\kappa}\right) \qquad (2.8)$$

where again M is the random variable and $U_{[0,1]}$ is the uniform[0,1] variable. The other parameters are the same as in equations (2.4) through (2.6). I will call equation (2.8) the inverse truncated GR distribution function.
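The inverse transform of equation (2.8) is readily sketched in Python (illustrative only; Mmin, Mmax and b are arbitrary example values):

```python
import math, random

def sample_truncated_gr(u, m_min, m_max, b):
    """Map a uniform[0,1] number u to a magnitude via the inverse
    truncated GR distribution function, equation (2.8)."""
    beta = b * math.log(10.0)
    kappa = 1.0 / (1.0 - math.exp(-beta * (m_max - m_min)))
    return m_min - math.log(1.0 - u / kappa) / beta

random.seed(1)
mags = [sample_truncated_gr(random.random(), 4.0, 7.5, 1.0)
        for _ in range(10_000)]
# All samples land inside [Mmin, Mmax], small magnitudes dominating:
print(min(mags) >= 4.0, max(mags) <= 7.5)   # True True
```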

Latin Hypercube sampling

A conventional approach to generate the magnitude samples via equation (2.8) is Monte Carlo sampling (as applied by Ebel and Kafka, 1999), in which the random variable is likewise calculated via equation (2.8) (or any other distribution). Such an approach leads to a reasonable estimate of the magnitude distribution if the number of uniform[0,1] random variables is large. However, generating a large number of random variables and calculating their corresponding values via an inverse CDF requires substantial computational power. Furthermore, with Monte Carlo sampling there is the risk that, if the number of random numbers is not chosen large enough, only a part of the CDF is sampled. To avoid these two problems a different approach is applied in bootEQ. This alternative approach, which can lead to better estimates of the distribution of random variables, was presented by McKay et al. (1979): Latin Hypercube Sampling (LHC). LHC selects n different values for each of k variables; the k variables are, in the case of this study, k magnitude values within a certain area. The procedure is implemented in the following manner. The range of each variable is divided into n non-overlapping intervals on the basis of equal probability, i.e. the CDF is sliced into n non-overlapping intervals, each with equal probability. In the case of n = 5, each interval corresponds to a 20% probability (Figure 2-4). The next step is to randomly select an observation within each of the n intervals. This selection is not done uniformly within the intervals, but relative to the probability density function (pdf) being sampled. For example, in the [Mmin, A] interval (Figure 2-4), values closer to A will have a higher probability of selection than those in the tail that extends to Mmin. To make this relative selection according to the pdf, n random numbers from a standard uniform distribution have to be picked. Let us denote these as Um, where m is an integer from 1 to n (for k variables, Um becomes an (n × k) matrix). To accomplish the relative selection, the random numbers Um have to be scaled to obtain corresponding cumulative probabilities Pm, such that each Pm lies within the mth interval. This is done by the following scaling function:

$$P_m = \frac{1}{n}\,U_m + \frac{m-1}{n} \qquad (2.9)$$

where Um are random numbers from a standard uniform distribution and n is the number of non-overlapping intervals. This scaling function ensures that exactly one value of Pm falls within each of the n equal-probability intervals. The last step is to use the values of Pm with the inverse truncated GR distribution function (equation (2.8)) to produce the sampling values of each of the k variables, as outlined in section 2.4.1. Writing the formula out again:


$$M = M_{min} - \frac{1}{\beta} \ln\!\left(1 - \frac{P_m}{\kappa}\right) \qquad (2.10)$$

Note that due to the LHC sampling technique, bootEQ only has to resample the input earthquake catalogue 1/n times as often as would be required by the ratio of the number of events in the input earthquake catalogue to the needed simulation time. Coming back to the example of section 2.7.1 and the example made here (with the number n of non-overlapping intervals chosen as 5), this means that the input earthquake catalogue only has to be resampled 500/5 = 100 times. As the magnitude samples created by the LHC algorithm follow a truncated GR distribution (whose parameters are set by the user), the frequency of each synthetically created event is simply given by the inverse of the simulation time. In our example the frequency of each event will be 1/150,000 a⁻¹.
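Equations (2.9) and (2.10) combine into a compact LHC sampler. The following is an illustrative Python sketch with arbitrary parameters, not the bootEQ code itself:

```python
import numpy as np

def lhc_magnitudes(n, m_min, m_max, b, rng):
    """Latin Hypercube sample of n magnitudes from the truncated GR law:
    slice the CDF into n equal-probability intervals, draw one scaled
    probability P_m per interval (eq. 2.9), invert with eq. (2.10)."""
    beta = b * np.log(10.0)
    kappa = 1.0 / (1.0 - np.exp(-beta * (m_max - m_min)))
    u = rng.random(n)                  # U_m, standard uniform numbers
    p = (np.arange(n) + u) / n         # P_m = U_m/n + (m-1)/n
    return m_min - np.log(1.0 - p / kappa) / beta

rng = np.random.default_rng(0)
mags = lhc_magnitudes(5, 4.0, 7.5, 1.0, rng)
print(np.round(mags, 4))
```

By construction, exactly one magnitude falls in each of the n equal-probability slices of the CDF, so the whole distribution is covered even for small n.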

Figure 2-4: Graphical representation of the LHC algorithm for a truncated GR CDF (same parameter values as Figure 2-1). The blue line represents the CDF of the truncated GR distribution. The black lines represent the n non-overlapping intervals chosen by the LHC. Each of these n intervals has the same probability (in this case 20%). The red lines represent the Pm values; note that each Pm value lies within the mth interval. The specific values can be obtained from Table 2-1.


Table 2-1: LHC algorithm for a truncated GR distribution with the same parameters as in Figure 2-1. For the graphical representation of the algorithm and the numbers in this table refer to Figure 2-4 (Table after Wyss and Jorgensen, 1998).

Interval number m | Uniform[0,1] random number Um | Scaled probability within the interval Pm | Corresponding observation within truncated GR distribution
1 | 0.4447 | 0.0889 | 4.0442
2 | 0.7621 | 0.3524 | 4.2055
3 | 0.0185 | 0.4037 | 4.2443
4 | 0.4565 | 0.6913 | 4.5503
5 | 0.8214 | 0.9643 | 5.4360

2.7.3 Magnitude sampling for characteristic zones

As mentioned in section 2.4.2, several models exist that account for the uncertainties in the magnitude of characteristic events and their recurrence interval. All of these models are inconsistent with the idea that small-magnitude events should have a higher probability of occurrence than large-magnitude events. bootEQ therefore models the characteristic seismic zones with a truncated GR distribution at lower magnitudes and combines it with an exponential distribution in the vicinity of the characteristic earthquake. This procedure is similar to the model by Youngs and Coppersmith (1985), but instead of combining a truncated GR distribution with a uniform distribution (as proposed by those authors), bootEQ combines it with an exponential distribution. The chosen exponential distribution is again a truncated GR distribution (equation (2.6)), with a very low b-parameter in order to give enough weight to the large events. Note that the term b-parameter in this context has nothing to do with observed seismicity; to avoid confusion, the b-parameter of the exponential distribution around the characteristic magnitude Mchar will from now on be called the υ-value. To account for the uncertainty in the recurrence interval of Mchar, bootEQ samples the recurrence intervals from a lognormal distribution with a user-set standard deviation. The fact that the characteristic portion of the model is described by a truncated GR distribution implies that the user first has to define its parameters Mmin, Mmax and the υ-value. Mmin is in this context the minimum estimate for the characteristic event and Mmax the maximum estimate; together they define the boundaries of the characteristic portion of the model. The user sets the recurrence interval for Mmin. By a trial-and-error procedure the υ-value is found such that the recurrence interval at the desired Mchar is obtained. Finally, the user can set the shape parameter of a lognormal distribution to account for the uncertainty in the recurrence interval of Mchar. For a graphical representation of this procedure refer to Figure 2-5.
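The trial-and-error search for the υ-value could also be automated, e.g. by bisection. The sketch below is mine and rests on an assumed bookkeeping, namely that the recurrence interval of magnitude ≥ M scales as T(Mmin)/P(M ≥ m) under the truncated GR law; the actual bootEQ procedure and the numbers in Figure 2-5 may differ, and the target values here are purely illustrative:

```python
import math

def recurrence(m, m_min, m_max, nu, t_min):
    """Recurrence interval of events >= m, ASSUMING T(m) = T(m_min) /
    P(M >= m) under a truncated GR law with b-like parameter nu.
    (Assumed bookkeeping -- the thesis finds nu by trial and error.)"""
    beta = nu * math.log(10.0)
    kappa = 1.0 / (1.0 - math.exp(-beta * (m_max - m_min)))
    cdf = kappa * (1.0 - math.exp(-beta * (m - m_min)))
    return t_min / (1.0 - cdf)

def solve_nu(m_char, t_char, m_min, m_max, t_min, lo=1e-4, hi=5.0):
    """Bisection for the nu-value giving recurrence t_char at m_char.
    recurrence() grows monotonically with nu, so [lo, hi] brackets it."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if recurrence(m_char, m_min, m_max, mid, t_min) < t_char:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative target: T(Mmin=6.5) = 350 a, and T = 650 a at Mchar = 6.9.
nu = solve_nu(m_char=6.9, t_char=650.0, m_min=6.5, m_max=7.5, t_min=350.0)
print(round(recurrence(6.9, 6.5, 7.5, nu, 350.0), 1))   # 650.0
```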


Figure 2-5: Truncated GR distribution for a characteristic zone with Mmin=6.5, Mmax=7.5 and υ=0.1. At Mmin the recurrence interval is 350 years. By trial and error the υ-value is found such that at Mchar the recurrence interval is 500 years. In this example an Mchar of 6.9 has a recurrence interval of 500 years.

Figure 2-6: Different characteristic models and their corresponding parameters. In each panel only one parameter is allowed to vary: in the upper left panel the υ-value, in the upper right panel the recurrence interval of Mmin, and in the lower left panel the standard deviation of the lognormal distribution of the recurrence interval (not used in this study). The basic model is a characteristic model with Mmin=7, Mmax=8, υ-value=0.001, a recurrence interval for Mmin of 250 years and no distribution of the recurrence interval.


Figure 2-6 shows a set of different parameters for this characteristic model. In each panel only one parameter is allowed to vary while the others are kept constant; the basic model is a characteristic model with Mmin=7, Mmax=8, υ-value=0.001, a recurrence interval for Mmin of 250 years and no distribution of the recurrence interval. It is obvious that slight variations in these parameters cause the recurrence intervals of specific magnitudes to vary considerably. Note that in this study no distribution of the recurrence interval was allowed. The user has to set up the characteristic model quite carefully and verify it for several sets of parameters. As said, this is a trial-and-error procedure and quite time consuming. Instead of applying trial and error it would be possible to invert for the υ-value; this was omitted in this study due to the high degree of non-linearity of the problem. It is beyond the scope of this study to develop a faster method for this specific characteristic model, but it would certainly be worth a thorough investigation.

2.7.4 Spatial distribution

bootEQ samples a given input earthquake catalogue for epicentre locations. As mentioned in section 2.2, the locations in the catalogue can be obtained either from historical or from instrumental data. Each location serves as a spatial epicentre around which a spread function is applied (see subchapter below). This ensures that each epicentre of the stochastic event set is placed at a slightly different location than in the input earthquake catalogue. Thanks to the epicentre spread function it is only necessary to choose a b-value and a maximum magnitude Mmax to characterize the truncated GR distribution of each source zone. There is no need to compute the a-value, which is the great advantage of this approach. Nevertheless, not needing to compute an a-value does not mean that knowledge of a-values in specific source zones cannot be incorporated: one can easily increase or decrease the frequencies in a specific zone (and thereby increase or decrease the a-value) by implementing a characteristic source zone with the desired parameters. The a-value is then simply controlled by the Mmin parameter and its corresponding recurrence interval (see previous section). The reason for not needing to compute a-values is that the relative epicentre density in different areas is automatically reflected in the stochastic event set through the applied spread function. I want to stress that this is the basic assumption of the algorithm: the spatial occurrence of seismicity in the historic catalogue will also be reflected in the stochastic event set, while the temporal occurrence is determined by the seismic source model. Seismic zones that showed high seismicity in the past will also show high seismicity in the synthetic catalogue. This leads to the assumption that the preferred locations for large-magnitude events near or at the maximum magnitude of each source zone are those where a lot of seismicity has been observed in the input earthquake catalogue.

Epicentre spread function

The epicentre spread function works in the following way. For each event bootEQ calculates the radius R of a circular area which is centred at the event itself and


contains "target" other events of the chosen earthquake catalogue. The "target" is an integer chosen by the user. The calculated distance R for each event is taken as its characteristic radius; see Figure 2-7 for a graphical representation of this algorithm. To ensure that epicentres are not spread too far away from the epicentres in the input catalogue, it is possible to set a maximum value for the characteristic radius, usually set to 200 km. Furthermore there is the option to set a minimum characteristic radius, usually taken as 8 km. These and the other values mentioned in the following were chosen by Dr. Mariagiovanna Guatteri on the basis of experience when writing bootEQ; it remains to be demonstrated in a later part of this thesis whether there are physical or statistical limits for these values. The characteristic radius R of each event is then spread according to a Rayleigh distribution†. The reason for choosing a Rayleigh distribution is the following: if future events take place at approximately the same locations as previous events, but a random distance away from these locations, then it is a reasonable assumption that the randomness in latitude and longitude is normally distributed; e.g. Ebel and Kafka (1999) chose the same approach. If two independent variables y1 and y2 are normally distributed with equal variance, then the variable $x = (y_1^2 + y_2^2)^{1/2}$ follows a Rayleigh distribution, and this variable x is identical to the randomness in distance of future events. The shape parameter λ of the Rayleigh distribution for each event is the characteristic radius divided by "nsigma", a number set by the user.

The azimuths of the event locations are randomly sampled from a uniform distribution. In polar coordinates the distance therefore follows a Rayleigh distribution and the azimuth a uniform distribution. These polar coordinates of the generated events are then transformed into a Cartesian coordinate system, resulting in the longitudes and latitudes of the synthetic catalogue events. Note that this algorithm implies that the minimum and maximum radii are also spread according to the Rayleigh distribution. An example: if the maximum radius is set to 200 km, "nsigma" to a value of 4, and some events do indeed show a characteristic radius of 200 km or greater, then this characteristic radius of 200 km is spread according to a Rayleigh distribution with a shape parameter λ of 50.

Drawback

The attentive reader will have noted that the epicentre spread function as described above is quite user dependent. The user has to set two numbers, "target" and "nsigma", and the resulting spread changes significantly depending on how these two numbers are chosen. It is one goal of this study to quantify this step, or at least to give the user a guideline on how these numbers influence the results. This is done in the fifth chapter of this thesis, where all the different parameter choices are discussed in detail.

† The Rayleigh distribution has the probability density function $f(x) = (x/\lambda^2)\exp(-x^2/2\lambda^2)$ for the random variable x, and has only one parameter, the shape parameter λ. The mean of the random variable x is $\lambda\sqrt{\pi/2}$ and the variance $(4-\pi)\lambda^2/2$.
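The perturbation of one epicentre, as described above, can be sketched in Python. The flat-earth degree conversion (111 km per degree of latitude) is my simplification for illustration; bootEQ's actual coordinate transformation may differ:

```python
import numpy as np

def spread_epicentre(lon, lat, radius_km, nsigma, rng):
    """Perturb one input-catalogue epicentre: distance drawn from a
    Rayleigh distribution with shape parameter lambda = R/nsigma,
    azimuth drawn from a uniform distribution on [0, 2*pi).
    Degree conversion uses a simple flat-earth approximation."""
    dist = rng.rayleigh(scale=radius_km / nsigma)
    azi  = rng.uniform(0.0, 2.0 * np.pi)
    dlat = (dist * np.cos(azi)) / 111.0                       # km -> deg
    dlon = (dist * np.sin(azi)) / (111.0 * np.cos(np.radians(lat)))
    return lon + dlon, lat + dlat

rng = np.random.default_rng(3)
# Characteristic radius R = 200 km and nsigma = 4 give lambda = 50:
new_lon, new_lat = spread_epicentre(8.5, 47.0, 200.0, 4.0, rng)
print(round(new_lon, 3), round(new_lat, 3))
```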


Figure 2-7: Visualisation of the epicentre spread function for an arbitrary input earthquake catalogue and the generated event set. Upper panel: for each event, the radius R of a circular area that is centred at the event itself and contains "target" other events of the input earthquake catalogue is calculated; "target" is in this example set to 3. In zones where the event density is low (e.g. the zone highlighted by the left green circle) the radius R is much larger than in zones where the event density is high (right green circle). Each of these characteristic radii R is then spread according to a Rayleigh distribution with a shape parameter λ equal to R over "nsigma". In zones with a low event density the corresponding probability density distribution (pdf) has a much larger mean and standard deviation than in zones with a high event density (indicated by the green pdfs overlain on the two zones). The azimuths are sampled from a uniform distribution, whose pdfs are indicated by the green circles. Lower panel: synthetic event set generated from this arbitrary input earthquake catalogue, containing four times as many events. Note that due to the epicentre spread function the relative event density (number of events per area) in each of the two zones remains constant.


2.7.5 Depth Distributions

bootEQ has the option to account for the depth distribution of earthquakes in the synthetic catalogue. The depths can either be sampled from a lognormal distribution with a user-set mean and standard deviation around one fixed depth, or a matrix can be built for each seismic zone containing different depth limits, with the depth values distributed around these limits with a user-set standard deviation. The latter option is particularly important when building an event set in subduction-zone environments. For the scope of this study I did not use this option of bootEQ, for several reasons. First of all, the studied regions either have insufficient depth information in the historical catalogue (as in the case of China), or the depth assignments are unlikely to be precisely resolved, as the depths show only minor scatter within a small depth interval (as in the case of the Central and Eastern United States). In both cases, testing the depth information would either not be possible (no reliable data, as in the former case) or require an extremely detailed study (as in the latter case). But even when the depths are considered, the final hazard calculation does not change, as the attenuation relationships used only take into account the epicentral distance and neglect the depths of the events. For all these reasons this option of bootEQ was not evaluated or applied in this study.

2.7.6 Faults

If an earthquake deviates significantly from a point source, the patterns of the ground motion observations will deviate clearly from a simple circular area centred on the event location itself. Such an effect can especially be seen if an earthquake ruptures a significant length of a fault. To account for this fault finiteness, bootEQ allows the implementation of fault lines and changes the intensity patterns on a regional scale accordingly. Each earthquake in a zone in which a fault is implemented is assigned a strike, whose value is sampled from a uniform distribution around a given mean and standard deviation. For each of these events a fault length is calculated according to the scaling relationships† of Mai and Beroza (2000); other scaling relations, such as those presented by Wells and Coppersmith (1994), could also be implemented. The algorithm ensures that these arbitrary faults do not exceed spatial limits determined by the user. The arbitrary fault is then used as a line source in the simple approach by Smith (1995) to modify the intensity patterns on a regional scale. In this approach, the earthquake source is modelled as a line source (Figure 2-8). In order to capture the spatial effect of the line source, it is segmented into a number of elements, and each element of the fault acts as a point source. The total effect of an earthquake at a certain point is then calculated as the weighted sum of the effects of all point sources on the fault, scaled to the length of the fault. With the approach by Smith (1995), the same attenuation relationship as for the whole event set can be used.

† Scaling relations relate e.g. the seismic moment of an earthquake to its surface rupture length.
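The line-source idea can be sketched as follows. This is my simplified illustration of the approach of Smith (1995): equal weights for all segments are an assumption here (Smith scales the weights to the fault length), and the attenuation coefficients are placeholders in the spirit of equation (2.7):

```python
import math

def line_source_intensity(mag, fault_p1, fault_p2, site, n_seg, atten):
    """Slice the fault (a line from fault_p1 to fault_p2, km coords)
    into n_seg elements, evaluate the point-source attenuation
    atten(mag, d) at each element midpoint, and average the
    contributions (equal weights assumed for illustration)."""
    (x1, y1), (x2, y2) = fault_p1, fault_p2
    total = 0.0
    for i in range(n_seg):
        t = (i + 0.5) / n_seg                   # element midpoint on fault
        ex, ey = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
        d = math.hypot(site[0] - ex, site[1] - ey)
        total += atten(mag, d)
    return total / n_seg

# Placeholder attenuation of the form of eq. (2.7), not calibrated:
atten = lambda m, d: 3.0 + 1.0 * m - 1.0 * math.log(max(d, 1.0)) - 0.002 * d

# A 40 km fault; the site sits 20 km off the fault's midpoint.
mmi = line_source_intensity(7.0, (0.0, 0.0), (40.0, 0.0), (20.0, 20.0), 10, atten)
print(round(mmi, 2))
```

With a single segment the scheme reduces to a point source at the fault midpoint; with many segments the elongated intensity pattern along the fault emerges.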




Chapter 2: Method

Figure 2-8: Graphical representation of the algorithm by Smith (1995). The fault is modelled as a line source with length L and sliced into N segments. Each of the N segments contributes, via a weighted sum, to the intensity I at point P. (Graphic taken from Steimen, 2004.)

2.7.7 Hazard map

The basic approach to generating a hazard map was already outlined in section 2.6. In this section a more explicit mathematical explanation of how the hazard maps are generated is given, and one possible problem is mentioned. For each event k in the synthetic catalogue, the distance to all n sites of interest is calculated. For each event k, its intensity at each site of interest is derived from the attenuation relationship. This implies that intensity values are treated as continuous numbers, which by definition they are not; for a clarification of this matter refer to the section "Clarification on used Intensity scale" in the Appendix. Intensities below a value of MMI4.5 are ignored, as these seldom cause any damage (even when site amplification is considered). This results in n hazard curves, one for each of the n sites of interest; each hazard curve consists of one intensity value per event k in the synthetic catalogue. The final step is to choose one fixed return period (green line in Figure 2-2), which is equivalent to choosing one fixed annual probability of exceedance in a certain exposure time under a Poisson process. As the hazard curves generated by bootEQ are discrete in nature, a linear interpolation has to be applied between the fixed return period and its neighbouring values on the hazard curve. The user has to ensure that this interpolation does not introduce a significant underestimation of the intensity on the hazard map (see yellow line segment in Figure 2-2). Due to the discrete nature of the hazard curves, it also has to be ensured that the synthetic catalogue contains sufficient data to properly represent the intensity value at the return period of interest. A hazard map for a specific return period is then simply a matter of plotting the obtained intensity values on a grid spanned by the n sites of interest; between the grid points a linear interpolation is applied.
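The interpolation step for one site can be sketched as follows. This is a minimal sketch: the function and variable names are mine, and assigning the k-th largest intensity an empirical return period of catalogue_years/k is my assumption about how the discrete hazard curve is built.

```python
import numpy as np

def intensity_at_return_period(site_intensities, catalogue_years, rp):
    """Linear interpolation on a discrete hazard curve. The k-th
    largest intensity observed at the site is exceeded about k times
    in catalogue_years, so it gets an empirical return period of
    catalogue_years / k; the intensity at the requested return period
    rp then follows by linear interpolation between neighbours."""
    vals = np.sort(np.asarray(site_intensities, dtype=float))[::-1]
    periods = catalogue_years / np.arange(1, len(vals) + 1)
    # np.interp expects increasing x values, hence the reversals
    return float(np.interp(rp, periods[::-1], vals[::-1]))
```

For example, with a 1'000-year synthetic catalogue whose five largest intensities at a site are 9, 8, 7, 6 and 5, the interpolated intensity at a 750-year return period lies halfway between the two largest values.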

2.7.8 Computational requirements

In order to reduce the required computational time, a down-sampling algorithm is introduced. The number of total simulated events can be reduced by splitting the magnitude sampling over two magnitude ranges. The magnitude threshold that controls the split is calculated as the quantile q of the truncated GR CDF; the frequencies of the events have, of course, to be changed accordingly. Usually I chose q to be the 95% quantile. On a standard two-CPU PC with 2GB RAM, bootEQ takes approximately 60 seconds to process an input catalogue of 4'500 events and generate 80'000 synthetic events. Creating the hazard map (section 2.7.7) takes approximately 40 minutes for 2'000 sites of interest. If one fault line is implemented, the time to create the hazard map triples.
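The split threshold can be computed by inverting the truncated GR CDF analytically. The sketch below assumes the standard truncated exponential form F(m) = (1 − 10^(−b(m − Mmin))) / (1 − 10^(−b(Mmax − Mmin))); whether bootEQ uses exactly this normalisation is not stated here, and the function name is mine.

```python
import math

def gr_split_magnitude(q, m_min, m_max, b):
    """Magnitude at quantile q of a truncated Gutenberg-Richter CDF,
    F(m) = (1 - 10**(-b*(m - m_min))) / (1 - 10**(-b*(m_max - m_min))),
    obtained by solving F(m) = q for m."""
    span = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return m_min - math.log10(1.0 - q * span) / b

# e.g. the 95% quantile used as the split threshold in the text:
# m_split = gr_split_magnitude(0.95, 5.0, 7.5, 0.95)
```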




3 Stochastic Event Set for Central and Eastern United States

3.1 Motivation for this study area

This chapter describes the application of the procedure outlined in the chapter "Method" to the Central and Eastern United States (CEUS). In the upcoming sections I present the seismic source model and other related input parameters, the resulting event set and the obtained hazard map for the CEUS region. The reasons for choosing the CEUS region as a first application of bootEQ are manifold. First, the CEUS region is well examined in terms of seismic hazard assessment: the source zones and their corresponding parameters are well defined and investigated, and the earthquake catalogue is homogeneous in terms of the reported magnitudes and locations. The generated hazard map can be compared with the hazard map created by the national seismic hazard assessment program (Frankel et al., 1996; Frankel et al., 2002). In addition, bootEQ was specifically written for a hazard assessment of the New Madrid seismic zone. It was therefore obvious to extend the hazard assessment to the whole CEUS region and evaluate whether bootEQ can be applied to a larger zone of interest.

3.2 Catalogue of observed seismicity and catalogue completeness

The catalogue used spans the years from 1701A.D. to 2004A.D. and includes both historic and instrumental data. The basis was the catalogue of Frankel et al. (2002). For the CEUS region I included or updated events according to the catalogue presented by Bakun and Hopper (2004a)†. Bakun et al. (2003) developed a model specifically designed to estimate the locations and magnitudes of historical earthquakes in eastern North America. This method was applied by Bakun and Hopper (2004a) to compile a list of all known historical Central United States events with magnitudes greater than M5. The catalogue magnitudes were converted to moment magnitude using the U.S. Geological Survey (USGS) relationships between body wave magnitude, surface wave magnitude and moment magnitude (Mueller et al., 1996). From now on, magnitude values referred to as M denote moment magnitude unless mentioned otherwise. The final catalogue contains 3970 events with magnitudes equal to or greater than M2. See Figure 3-1 for a map projection of this catalogue.

† The final catalogue can be obtained from me upon request


Figure 3-1: Seismic source zones used for the stochastic event set generation of the CEUS region, together with the input earthquake catalogue. The dots represent the epicentres of the input catalogue for different magnitude ranges: blue dots for moment magnitudes of M2 (the minimum magnitude in the input catalogue) to M4, green dots for magnitudes of M4 to M6, and red dots for magnitudes of M6 to M7.8 (the maximum magnitude in the input catalogue). The black line represents the Western United States source zone. The blue line represents the Eastern United States background seismicity zone as outlined in section 3.3.1. Other background seismicity zones are the Wabash Valley source zone (green line, see section 3.3.2) and the Charlevoix source zone (yellow line, see section 3.3.3). The red lines represent the characteristic source zones for New Madrid (section 3.3.5) and the purple line the characteristic source zone for Charleston (section 3.3.6).

Bakun and Hopper (2004a) claim that their catalogue for the CEUS region is complete since 1850A.D. for magnitudes greater than Mc = M5. I used these two values as input parameters for the completeness threshold in magnitude, Mc, and in time (see section 2.2.1).
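In code, applying these two completeness thresholds amounts to a simple filter of the input catalogue. This is a sketch; the (year, magnitude) tuple representation is mine, not bootEQ's.

```python
def completeness_filter(events, mc=5.0, year_complete=1850):
    """Keep only events usable for the resampling: magnitude at or
    above the completeness magnitude mc, occurring in or after the
    completeness year."""
    return [(y, m) for (y, m) in events if m >= mc and y >= year_complete]
```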

3.3 Seismic zones

I used the same seismic zones in the CEUS to generate the stochastic event set as the USGS used for their national seismic hazard maps in 1996 and 2002 (Frankel et al., 1996; Frankel et al., 2002). Figure 3-1 shows the different background regions and characteristic zones used as the seismic source model for this event set generation. The next sections outline the characteristics of each background seismicity zone. The minimum magnitude for the hazard calculation, Mmin, was set to M5 for the whole region of interest.

3.3.1 Eastern and Western United States

The black line in Figure 3-1 represents the Western United States source zone. The geographical location of this zone is the same as that used for the US Geological Survey hazard map 2002 (Frankel et al., 2002). I also used the same parameters for the truncated GR distribution as in that hazard assessment program: a maximum magnitude of M7 and a b-value of 0.95 for this source zone. The same b-value was assigned to the Eastern United States source zone (blue line in Figure 3-1), whose maximum magnitude was set to M7.5 as used by Frankel et al. (2002).

3.3.2 Wabash Valley

A small background seismicity zone was assigned to Wabash Valley (green line in Figure 3-1), north of the New Madrid seismic zone. Again this is in agreement with Frankel et al. (2002), who followed suggestions of Wheeler and Cramer (2002) to extend the source region of the Western United States to this small zone.

3.3.3 Charlevoix

The Charlevoix seismic zone is an area of about 40 km by 70 km, roughly elliptical in shape and situated in southeastern Canada (yellow line in Figure 3-1). It is the seismically most active region in southeastern Canada. Since the settlement by Europeans in the seventeenth century, this area has experienced several large earthquakes (Johnston et al., 1994). The most prominent event occurred on 28 February 1925, with a magnitude of M6.2 and a maximum intensity of MMI VIII. Adams et al. (1996) suggested a b-value of 0.76 for this region of seismicity. This value was accepted by Frankel et al. (2002), and I used the same value as input for bootEQ. The maximum magnitude of this zone is the same as for its surrounding Eastern United States source zone (M7.5).

3.3.4 Characteristic zones: Introduction

Looking at a seismic hazard map of the CEUS (Figure 3-2), one clearly sees that one zone dominates the seismic hazard of the states of Illinois, Indiana, Missouri, Arkansas, and Tennessee. This zone is called the New Madrid seismic zone (NMSZ). Because the recurrence interval of large events in this source zone is much shorter than expected from a GR distribution, it can be treated as a characteristic source zone as outlined in section 2.4.2. There is another zone with such properties: the Charleston seismic zone, which dominates the seismic hazard in South Carolina. In the following paragraphs I outline the characteristics of these source zones and how I set the input parameters for bootEQ according to the implementation of characteristic source zones outlined in section 2.7.3.


Figure 3-2: Seismic hazard map in PGA (in %g) for the CEUS for 10% probability of exceedance in 50 years, corresponding to a return period of 475 years. (Data source: Frankel et al., 2002.)

3.3.5 New Madrid seismic zone

The area of New Madrid has experienced several large shocks since the settlement by Europeans. Three principal events were reported during the winter of 1811-1812A.D.: approximately 0215 local time (LT) on December 16, 1811; around 0800 LT on January 23, 1812; and approximately 0345 LT on February 7, 1812. I will call these events NM1, NM2 and NM3, consistent with Johnston (1996b). Additionally, a large aftershock to NM1 occurred near dawn on December 16, 1811. The ground motions of the three principal events were felt in places as far away as Canada, New England and the coast of South Carolina (Mitchill, 1815; Bradbury, 1819; Fuller, 1912; Nuttli, 1973; Penick, 1981; Johnston and Schweig, 1996). Considering all methods available to estimate the recurrence intervals and magnitudes of characteristic events (section 2.4.2), the investigator finds a vast range of estimates for the characteristic magnitude and recurrence interval of New Madrid-type events. In the following paragraphs I present the whole range of values discussed in the literature and state which values were chosen for this study. Newman et al. (1999) estimated the recurrence interval of this type of event by extrapolating the magnitude-frequency distribution of the GR law. They derive recurrence intervals of 1'400 ± 600 years for M7.0 earthquakes and 14'000 ± 7'000 years for M8.0 earthquakes. In spite of the difficulties of using the present-day distribution of seismicity to infer recurrence intervals, this approach can still be used to illuminate principal fault zones in the NMSZ (e.g. Gomberg, 1993; Johnston, 1996b). From displacement rate measurements in the region, Newman et al. (1999) argue that large earthquakes occurring every 500-1'000 years are likely to be in the low M7 magnitude range. The last and most frequently applied approach to estimate the characteristic magnitude is the interpretation of first-hand reports of the shaking


and/or damage caused by the events over the CEUS states (e.g. Nuttli, 1973; Wells and Coppersmith, 1994; Johnston, 1996b; Hough et al., 2000; Bakun and Hopper, 2004b). There is a wide range of magnitude assignments resulting from this method: for NM1, for example, Johnston (1996b) assigned a moment magnitude of M8.1, Bakun and Hopper (2004b) M7.6 and Hough et al. (2000) M7.2-7.3. These uncertainties in the magnitude estimates arise for numerous reasons. Foremost, the 1811-1812A.D. events occurred in the early times of settlement of the North American mid-continent. As a result of the low population density there are not enough detailed, credible accounts of the effects of the earthquakes, and it is therefore not straightforward to apply the established "log(isoseismal area A)" methods of macroseismology (e.g. Hanks et al., 1975; Toppozada, 1975). The second major aspect of the magnitude uncertainty arises from the low rate of seismic activity in the CEUS. The largest event of the past 100 years in the CEUS was the M5.8 earthquake near Massena, New York, in 1944. This means that no large earthquakes have been recorded with modern seismographs, and a profound calibration of the intensity attenuation relationship is lacking. The third aspect of the magnitude uncertainties is the potential bias in reported intensity values due to site response effects. Site effects can dramatically influence the observed ground motions at a given site: in areas with soft soil deposits the local intensity can be overestimated by as much as 2 to 3 units. It is therefore imperative to include intensity site corrections before applying the "log(isoseismal area A)" method. For the CEUS region this was done by Hough et al. (2000), and these values were used by Bakun and Hopper (2004b) to derive the attenuation relationship given in equation (3.1).
Several of the above-mentioned authors and others studied these uncertainties in the magnitude estimates (Johnston, 1996b; Cramer, 2001; Newman et al., 2001; Bakun and Hopper, 2004b). Including those uncertainties could lead to magnitudes as large as M8.4 or as small as M7.0 for the NM1 event. These are enormous uncertainties in the magnitude estimate. I limited the range of values based on the following considerations. All reinterpretations of first-hand reports of damage and/or shaking are essentially based on only two datasets, presented by Nuttli (1973) and Hough et al. (2000). As mentioned above, it is essential to include intensity site corrections before estimating magnitudes. As this was not done by Nuttli (1973), all magnitude estimates based on that dataset are suspected of overestimating the magnitudes. I therefore chose to discard the magnitude estimates presented by Johnston (1996b), who had only the dataset of Nuttli (1973) available. A second point supports this decision. Studies of the fault geometry and estimates of slip rates introduce scaling relationships between fault geometry (e.g. the surface rupture length of the fault) and the size of earthquakes on these faults (e.g. Wells and Coppersmith, 1994; Mai and Beroza, 2000). Hough et al. (2000) applied such a technique to the NMSZ and favor magnitude estimates no larger than M7.3. The estimate of M7.2-7.3 by Hough et al. (2000) seems the most favorable, as it is based on their new set of intensity assignments. Nevertheless, Hough et al. (2000) still use the "log(isoseismal area A)" method presented by Johnston (1996b) and its corresponding parameters. Furthermore, scaling relationships show significant scatter and do not yield a unique earthquake size for a given scaling parameter (Wells and Coppersmith, 1994). As I chose to use the attenuation relationship presented by Bakun et al. (2003), it is, in my view, most consistent


to use the magnitude assignment for the NM1 event of M7.6 presented by Bakun and Hopper (2004b). For all the above-mentioned reasons, the preferred magnitude for the NM1 event is M7.6. By interpreting paleoliquefaction features preserved in the sediments of the Mississippi embayment, several authors have tried to infer a recurrence interval for the New Madrid type of event (e.g. Kelson et al., 1996; Tuttle and Schweig, 1996; Tuttle et al., 2002). Tuttle et al. (2002) estimate the recurrence interval to be 500 ± 300 years for M7-8 events. The value of 500 years for the recurrence interval was accepted by Frankel et al. (2002). From these values I constructed the characteristic portion of the model for the NMSZ by setting Mmin to M7.0 with a recurrence interval of 200 years and Mmax to M8.1. By trial and error a υ-value of 0.001 was found (section 2.7.3). This ensures that at a recurrence interval of 500 years the estimated size of the characteristic event Mchar is M7.6. Such a distribution allows for events with magnitudes greater than M7.6, but with a very low frequency. The upper panel in Figure 3-3 shows the resulting magnitude-frequency distribution for the characteristic portion of the model in the NMSZ. The lower panel in the same figure shows the magnitude-frequency distribution in the NMSZ combining the characteristic portion of the model with the magnitude-frequency distribution at lower magnitudes. The hyperbolic shape of the magnitude-frequency distribution in the lower panel of Figure 3-3 results from the summation of the magnitude-frequency distribution of the background zone events with the characteristic portion of the model.

3.3.6 Charleston

The largest earthquake in the southeastern United States occurred near Charleston in 1886A.D. Again, as for the NMSZ, there is considerable uncertainty about the magnitude of this event. Furthermore, the location of this event is not well constrained (Bakun and Hopper, 2004b); it is uncertain whether it took place off-shore or on land. A thorough investigation of the input parameters for bootEQ is further complicated by the fact that much less scientific work has been done in this area than in the NMSZ. The magnitude estimates range from M7.3 ± 0.26 (Johnston, 1996b) to M6.4-7.2 with a preferred estimate of M6.9 (Bakun and Hopper, 2004b). Again I use the magnitude assignment of Bakun and Hopper (2004b), who accounted for the reported site effects of Hough et al. (2000). Recent paleoliquefaction studies proposed a recurrence interval of 500-600 years for this characteristic source zone (Talwani and Schaeffer, 2001). The same value was used by Frankel et al. (2002) for the U.S. Geological Survey hazard maps, and I adopt this estimate for my study. No other estimates for the recurrence interval were found at the time of writing. For these reasons, the preferred magnitude for the Charleston-type event is M6.9, with a recurrence interval of 550 years. I modelled the Charleston seismic area with a truncated GR distribution with a υ-value of 0.1, an Mmin of M6.4 and an Mmax of M7.5. Such a distribution ensures that at a recurrence interval of 500 years the assigned magnitude is M6.9 (Figure 3-4).


Figure 3-3: Magnitude-frequency distribution in the NMSZ. Blue line: historic catalogue since catalogue completeness (1850A.D.). Red line: magnitude-frequency distribution of the synthetic catalogue. Above: truncated GR distribution to model the characteristic portion of the NMSZ with a υ-value of 0.001, an Mmin of M7.0 and an Mmax of M8.1. At the 500-year recurrence interval, Mchar is M7.6. Below: combination of the magnitude-frequency distribution of the non-characteristic portion of the model (magnitudes below Mmin) with its characteristic portion (magnitudes above Mmin).


Figure 3-4: Same as Figure 3-3 but for a υ-value of 0.1, an Mmin of M6.4 and an Mmax of M7.5. At the 500-year recurrence interval, Mchar is M6.9.

3.3.7 Summary of seismic zones

Table 3-1 gives an overview of the seismic source model used for this study and the corresponding parameters of each source zone as outlined in the previous sections.

Table 3-1: Summary of the seismic zones for the CEUS used in this study and their corresponding parameters. For the background seismicity zones the only input parameters are the b-value and the maximum magnitude. For the characteristic zones the input parameters are the υ-value, the recurrence interval of Mmin, and Mmax (section 2.7.3). The recurrence interval of the characteristic event Mchar is found by trial and error. Note that the minimum magnitude Mmin for the hazard calculation was set to M5 for all source zones.

Background seismicity zone    b-value    Maximum magnitude Mmax
Western United States         0.95       7
Eastern United States         0.95       7.5
Wabash Valley                 0.95       7.5
Charlevoix                    0.76       7.5

Characteristic zone    υ-value    Mmin    Recurrence interval of Mmin [years]    Mchar    Recurrence interval of Mchar [years]    Mmax
New Madrid             0.001      7       250                                    7.6      500                                     8.1
Charleston             0.1        6.4     250                                    6.9      500                                     7.5

3.4 Epicentre spread function

The parameters of the epicentre spread function (section 2.7.4) for this event set are as follows: "target" was set to 2 and "nsigma" to 4. As a reminder: "target" is an integer describing how many events are contained in a circular area of radius R around each event, and "nsigma" determines the shape parameter of the Rayleigh distribution with which each radius R is spread; the Rayleigh parameter is the radius R divided by "nsigma". The minimum and maximum radius are set to 8 km and 200 km, respectively. These parameters have not yet been tested systematically. Nevertheless, I used different values for them and evaluated their impact on the stochastic event set. The impact was as expected: the smaller the "nsigma" value, the more spread is applied, and a larger "target" has the same effect. Which spread can be considered optimal (or whether such a value exists) is discussed in the fifth chapter of this thesis.
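The spread of a single resampled epicentre can be sketched as follows. The function name, argument layout and the flat-earth degree conversion are my own simplifications, not bootEQ's internals; only the Rayleigh scale R/nsigma, the radius clamping and the uniform azimuth follow the description above.

```python
import math
import random

def spread_epicentre(lon, lat, r_km, nsigma=4, r_min=8.0, r_max=200.0):
    """Perturb one resampled epicentre: r_km is the radius of the
    circle containing 'target' catalogue events around the sampled
    event; the offset distance is drawn from a Rayleigh distribution
    with scale r_km / nsigma and the azimuth is uniform."""
    r = min(max(r_km, r_min), r_max)
    # Rayleigh sample via inverse transform: scale * sqrt(-2 ln U)
    dist = (r / nsigma) * math.sqrt(-2.0 * math.log(1.0 - random.random()))
    az = random.uniform(0.0, 2.0 * math.pi)
    dlat = (dist / 111.2) * math.cos(az)
    dlon = (dist / 111.2) * math.sin(az) / math.cos(math.radians(lat))
    return lon + dlon, lat + dlat
```

A smaller `nsigma` enlarges the Rayleigh scale and thus the typical offset, which reproduces the behaviour noted above.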

3.5 Faults

The New Madrid seismic zone shows, for the NM1, NM2 and NM3 type of event, three plausible faulting scenarios on three different faults (Johnston and Schweig, 1996). In bootEQ it is not possible to model several faults in the same characteristic source zone. Furthermore, the computational time increases dramatically when faults are included in the calculation (section 2.7.6). Therefore I chose to implement only one strike-slip fault in the NMSZ, extending from 35.8°N to 36.9°N with a strike of 50 degrees and a variance of this strike of 10 degrees. The latitudinal extent of this arbitrary fault zone is consistent with the data presented by Johnston and Schweig (1996). It should be stressed that such a fault represents an arbitrary fault zone, but it mimics the regional fault appearance rather well.
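An illustrative version of this per-event fault assignment is sketched below. All names are mine; I treat the quoted 10 degrees as a standard deviation of a normal distribution, use the Wells and Coppersmith (1994) all-event relation log10(SRL) = −3.22 + 0.69·M as a stand-in for the Mai and Beroza (2000) scaling used in bootEQ, place the event at the fault midpoint, and omit the clamping to the user-set spatial limits.

```python
import math
import random

def sample_fault(epi_lat, epi_lon, M, mean_strike=50.0, sigma_strike=10.0):
    """Draw a strike around the given mean, compute a surface rupture
    length (SRL) from magnitude, and return the two fault endpoints
    centred on the epicentre (flat-earth geometry)."""
    strike = random.gauss(mean_strike, sigma_strike)          # degrees
    length_km = 10.0 ** (-3.22 + 0.69 * M)                    # SRL in km
    half_deg = (length_km / 2.0) / 111.2
    dlat = half_deg * math.cos(math.radians(strike))
    dlon = half_deg * math.sin(math.radians(strike)) / math.cos(math.radians(epi_lat))
    return (epi_lat - dlat, epi_lon - dlon), (epi_lat + dlat, epi_lon + dlon)
```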

3.6 Attenuation relationship

I used the attenuation relationship presented in Bakun and Hopper (2004b), who developed a relationship between magnitude and modified Mercalli intensities (MMI) based on a method presented by Bakun et al. (2003). These authors include empirical intensity site corrections that account for systematic bias in the intensity assignments in a given area; these corrections are consistent with the site response effects described by Hough et al. (2000). The attenuation relationship is given by:

MMI = 2.89 + 1.36 M − 0.912 ln Δ − 0.00277 Δ    (3.1)

where Δ is the epicentral distance in kilometres, MMI is the modified Mercalli intensity and M is the moment magnitude.
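For reference, equation (3.1) can be evaluated directly (the function name is mine; note that ln is the natural logarithm):

```python
import math

def mmi_bakun_hopper(M, delta_km):
    """Equation (3.1): predicted modified Mercalli intensity at
    epicentral distance delta_km for an event of moment magnitude M
    (Bakun and Hopper, 2004b)."""
    return 2.89 + 1.36 * M - 0.912 * math.log(delta_km) - 0.00277 * delta_km
```

For a NM1-type M7.6 event, for example, this predicts roughly MMI 8.7 at 100 km epicentral distance.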

3.7 Results

In this section I present the obtained event set for the CEUS region. The synthetic catalogue spans 100'000 years and samples magnitudes of M5 or larger. I highlight certain features in the results that are worth a detailed discussion (section 3.8).

3.7.1 Stochastic event set

The locations and magnitudes of the synthetic catalogue for the CEUS region are shown in Figure 3-5. The reader should be aware that Figure 3-5 contains no information about the frequencies of the events: a region with a high density of events (like the region in western Tennessee) is not necessarily a region contributing strongly to the hazard. For the frequency information refer to Figure 3-6.




Figure 3-5: Final event set for the CEUS region. The source regions are the same as in Figure 3-1. Three different magnitude bins are plotted (see legend).


Wherever there is an epicentre in the historic catalogue (Figure 3-1), synthetic events are also created. This is exactly what is expected from the working procedure of bootEQ (Chapter 2). Conversely, there are regions in which no synthetic seismicity is generated because none was observed historically (e.g. nearly the whole state of Iowa and the western part of Wisconsin). A striking feature is also the discrimination in the maximum magnitude for each source zone: the Western source zone shows only earthquakes with a maximum magnitude of M7, whereas other source zones show earthquakes of larger size. Again, this is simply proof that bootEQ works exactly as expected, as the maximum magnitude Mmax is set by the user. The effect of the epicentre spread function (section 2.7.4) is also worth some consideration. In regions with low historical seismicity (e.g. northern Wisconsin), the spread function distributes the generated events in the synthetic catalogue far away from the locations of historical seismicity. In regions with a high historical density of seismicity (e.g. around 47°N, 78°W) the generated events remain much closer to the locations of historical seismicity. Again, this is exactly what is expected from the working procedure of bootEQ. The epicentre spread function also leads to a potential drawback. If regions with low historical seismicity are close to a boundary between different seismic source zones, it is very likely that synthetic events are spread into another source zone. In Figure 3-5 such an observation can be made in the westernmost part of the Eastern source zone. If the two seismic zones have different source parameters (b-value and Mmax), this spread from one source zone to another can influence the magnitude-frequency distributions of the two zones: the maximum magnitude Mmax in one source zone is eventually exceeded when another source zone with a higher Mmax diffuses events into it. If this happens, it can be recognized in the magnitude-frequency distribution of the source zone with the lower Mmax.

3.7.2 Gutenberg-Richter distribution

Although the actual testing of bootEQ is carried out in a later part of this thesis, I already implemented at this point a graphical measure for the quality of the stochastic event set: the magnitude-frequency distribution of each seismic zone can now be displayed in bootEQ (before this implementation only the whole zone in which the event set is created could be displayed). The magnitude-frequency relationships of the synthetic catalogue and the corresponding historical events for all the background seismicity zones (sections 3.3.1 to 3.3.3) are shown in Figure 3-6. For the historical catalogue only the data since catalogue completeness are shown. The magnitude-frequency distributions for the characteristic zones were already shown in the lower panels of Figure 3-3 and Figure 3-4. The plots are generated by dividing the magnitude range into bins with a spacing of 0.05 magnitude units and counting the events in each bin. The cumulative frequency plotted at the centre of each magnitude bin is then the sum of the frequencies of all bins at or above that magnitude. Note that such an algorithm also leads to cumulative errors: if the frequencies at higher magnitudes are for some reason elevated, then all the frequencies will be shifted to higher values.




Figure 3-6: Magnitude-frequency distributions for the different source regions used in this study. Blue lines are the data from the input catalogue (section 3.2) since catalogue completeness (1850A.D.). The red lines are the magnitude-frequency distributions of the synthetic catalogue. The input parameters of each source zone are shown in the upper right corner of each panel. Panel (a) is the sum of the magnitude-frequency distributions of panels (b) to (e).


Looking at the magnitude-frequency distribution of the Western seismic zone (panel (e) in Figure 3-6), one sees that the largest generated magnitude (M7.5) exceeds the input parameter Mmax (M7.0) of this source zone. This is exactly the feature generated by the epicentre spread function mentioned in the fourth paragraph of section 3.7.1. In the Eastern and Wabash Valley source zones a similar effect can be observed, but for a different reason: the characteristic seismic zones Charlevoix and New Madrid overlap these source zones and therefore exceed the maximum magnitude attributed to them. This is also the reason why the red lines in these source zones appear discontinuous (e.g. at M7.0 in panel (b)). For a simpler interpretation of the source zones that overlap characteristic areas, the magnitude-frequency plots are shown again in Figure 3-7, this time excluding the characteristic zones.

Figure 3-7: Magnitude frequency distribution for the Eastern Zone and Wabash Valley excluding earthquakes generated in the characteristic source zones. Panel (a) is again the sum of all source zones in the area of interest.

For the whole CEUS zone in which the event set was calculated, a remarkably good agreement between the magnitude-frequency distributions of the historical and the synthetic catalogue is obtained (panel (a) in Figure 3-6). The same holds true for the Eastern (panel (b)) and Western (panel (e)) seismic source zones. For the Wabash Valley (panel (c)) and Charlevoix (panel (d)) source zones clear deviations are apparent: the synthetic catalogue shows a significantly reduced frequency (and therefore a-value) compared to the historical catalogue, and in the case of the Charlevoix source zone a deviation in the b-value also seems apparent.

3.7.3 Hazard map

The resulting hazard map for the CEUS region is shown in Figure 3-8 for a return period of 475 years on neutral soil. The hazard was calculated at 2116 grid points, equivalent to one grid point every 0.6 degrees (see section 2.7.7 for the method used to produce the hazard map). Note that intensities below MMI 4.5 were set to zero in order to reduce computational time and are therefore not shown. As already mentioned, the intensity data are treated as continuous numbers (see "Intensity scale" in the Appendix for an explanation). Soil quality correction factors for this study region are available to the Swiss Reinsurance Company but were not included in the hazard calculation of this thesis. Figure 3-8 nicely shows the elliptical shape of the intensity pattern in the NMSZ resulting from the implemented fault zone.

Figure 3-8: Hazard map for the CEUS region for a return period of 475 years, excluding soil conditions. Only hazard levels above an intensity of MMI 4.5 are shown. The colour scale is in units of MMI intensity.

Chapter 3: Stochastic Event Set for Central and Eastern United States
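As an illustration of how a return-period intensity is read directly from a stochastic event set, the following sketch (in Python; bootEQ itself is Matlab code, and all names here are illustrative) picks, for one site, the intensity whose exceedance rate matches the chosen return period:

```python
def return_period_intensity(site_intensities, catalogue_years, return_period):
    """Intensity with the given return period at one site, read directly
    from a stochastic event set (an illustrative sketch, not bootEQ code).

    site_intensities: intensity felt at the site, one value per event.
    catalogue_years:  duration of the synthetic catalogue in years.
    return_period:    e.g. 475 years -> annual exceedance rate 1/475.
    """
    # number of events allowed to exceed the sought intensity level
    n_allowed = int(catalogue_years / return_period)
    if n_allowed < 1:
        raise ValueError("catalogue shorter than the return period")
    ranked = sorted(site_intensities, reverse=True)
    return ranked[n_allowed - 1]  # the n-th largest intensity
```

For a 100'000-year synthetic catalogue and a 475-year return period, this simply means reading off the 210th-largest intensity at each grid point (210 = floor(100000/475)).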

3.8 Discussion

The synthetic catalogue and the resulting hazard map show several features that are exactly as expected from the description of the working procedure of bootEQ. Others are somewhat more surprising or need further consideration. In the next few sections I discuss those features and provide possible solutions.

3.8.1 Locations of stochastic events

As mentioned in section 3.7.1, there are regions in the synthetic catalogue showing no seismicity during the whole simulation period (e.g. nearly the whole state of Iowa and the western part of Wisconsin). To avoid such empty regions, bootEQ allows the user to set up an artificial grid, each grid point of which serves as a spatial centre around which synthetic events (from now on called extra catalogue events) are created. But that would place epicentres in the synthetic catalogue at locations where no epicentre has ever been observed in the input earthquake catalogue. Certainly it is doubtful to assume that no event will happen during the next 100'000 years (or any other time span) in regions where no historical seismicity has been observed (such as the state of Iowa). On the other hand, choosing locations in seismically inactive regions where future earthquakes could nucleate would be an extremely questionable step. The extra catalogue events should at least be chosen according to the available scientific information, such as geological data (fault databases) or geophysical information (e.g. on crustal stress). Unfortunately, studies examining regions with low seismicity are rare, and for the CEUS region none could be found. There is a second reason why I chose not to include extra catalogue events: including them would shift the magnitude-frequency distribution of the synthetic catalogue to higher frequency values. For this first application of bootEQ it is more fruitful to completely understand its working procedure and therefore not to introduce this shift into the magnitude-frequency relationships.

3.8.2 Magnitude-frequency distribution

The fact that the maximum magnitude in a source zone can exceed the input parameter Mmax of that source zone, due to the diffusion of events generated in neighbouring source zones into the source zone under consideration (as outlined in the fourth paragraph of section 3.7.1), is not thought to significantly influence the magnitude-frequency distributions. An example of this effect can be seen in panel (e) of Figure 3-6 at magnitudes greater than M7.0. The user has to be aware that the whole magnitude-frequency curve will be shifted to higher frequency values if such "in-diffusion" takes place. Nevertheless, the amount of this shift is negligible: at magnitudes below the input parameter Mmax some 10'000 events are present in each source zone, whereas above that value there are only some tens of events. I checked this statement by reducing the maximum magnitude of the neighbouring source zones (assuming that the number of events diffused into the source zone of interest equals the number of events diffused out of it); by visual inspection, the magnitude-frequency curve was not shifted to significantly lower frequency values. For very small source zones, however, containing only some hundreds of synthetically generated events and having a significantly lower Mmax than their neighbouring source zones, one should be aware that this effect could significantly influence the results. If this is the case, it should be safe to simply neglect the events in the magnitude-frequency curve that exceed the input value of Mmax for each source zone. The exact impact of this is extensively discussed in the fifth chapter of this thesis.

The Wabash Valley and Charlevoix source zones show a significantly reduced frequency (and therefore a-value) in the synthetic catalogue compared to the historical catalogue. An obvious error source might be that the epicentre spread function diffuses events out of these zones, resulting in a reduced number of events in the synthetic catalogue and therefore a reduced frequency. I extended the Charlevoix source zone by a few kilometres and plotted the magnitude-frequency distribution again. The results were, by visual inspection of the magnitude-frequency curve, exactly the same, and the a-value was not found to be higher. The epicentre spread function can therefore not be the reason for the significantly reduced a-value.
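The per-zone magnitude-frequency curves discussed throughout this section can be reproduced in a few lines. The sketch below (Python, illustrative; not the bootEQ implementation) computes cumulative annual rates N(≥M) for a zone's catalogue and optionally neglects events above the zone's input Mmax, as suggested above:

```python
def cumulative_rates(magnitudes, years, mmax=None, dm=0.1, m_min=5.0):
    """Cumulative annual rates N(>=M) for a magnitude-frequency plot.

    If mmax is given, events above the zone's input Mmax (e.g. those
    diffused in from neighbouring zones) are neglected first.
    Returns a list of (magnitude, annual_rate) pairs.
    """
    mags = [m for m in magnitudes if mmax is None or m <= mmax]
    top = max(mags)
    points, m = [], m_min
    while m <= top + 1e-9:
        # count events with magnitude >= m and normalize by duration
        points.append((round(m, 1), sum(1 for x in mags if x >= m) / years))
        m += dm
    return points
```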
The reason for the reduced frequency is rather to be sought in the seismicity that occurred in these source zones according to the input earthquake catalogue. The Charlevoix source zone experienced only 94 earthquakes since catalogue completeness (1850 A.D.) according to the catalogue used. Such a number of events is certainly too small to determine a magnitude of completeness Mc, as roughly 200 events would be needed to accurately determine this threshold value (Woessner and Wiemer, 2005). Furthermore, before 1920 A.D. only some tens of events were reported in this region, again a number certainly too low to accurately determine the thresholds in magnitude and time for catalogue completeness. It is therefore impossible to determine an accurate magnitude-frequency distribution for the historic events in this region, at least with the given input catalogue. Nevertheless, I reevaluated the year of catalogue completeness by plotting the cumulative number of events in this region versus time and obtained a completeness time of 1950 A.D. The resulting magnitude-frequency distributions of the historic events and the synthetic catalogue are shown in Figure 3-9.

Figure 3-9: Magnitude-frequency distribution for the Charlevoix seismic zone for a year of catalogue completeness of 1950 A.D. Blue line: historic catalogue. Red line: synthetic catalogue. The lower left corner shows the number of events in the historic and the synthetic catalogue, the upper right corner the seismicity parameters of this source zone.

It is obvious that the agreement between the historic catalogue and the synthetic one is now much better. The discrepancy between the frequencies of the historic and the synthetic catalogue for the Charlevoix source zone (Figure 3-6) must therefore be attributed to the fact that the historic magnitude-frequency distribution is not accurately determined. The same points hold true for the Wabash Valley source zone. The conclusion of this discussion is that the magnitude-frequency distributions must be interpreted very carefully, always keeping in mind that the historic a-value of each source zone might be shifted to higher or lower values, depending on whether the time since catalogue completeness is representative for the respective source zone. The statements "remarkably good agreement" in section 3.7.2 and "significantly reduced frequency" above are based solely on visual inspection of the magnitude-frequency plots. It is extremely difficult to assess the fit between the historical magnitude-frequency distribution and that of the synthetic catalogue in a more quantitative way. One criterion for the goodness of fit could be the comparison between the b-values of the historical and the synthetic catalogue. However, such a criterion is rather inconclusive: bootEQ reproduces exactly the b-value used as input, provided enough samples are generated. As the b-values for the CEUS are taken from the literature, this criterion would rather lead to a reevaluation of the published b-values, which is far beyond the scope of this study. Another criterion for the goodness of fit could be the comparison of the a-values of the historical and the synthetic catalogue.
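Both goodness-of-fit criteria can be made concrete with a few lines of code. The sketch below (Python, illustrative; the estimator choice is mine, not the thesis tool's) uses the well-known maximum-likelihood b-value estimator of Aki (1965) and the annual event rate above the completeness magnitude as a proxy for the a-value comparison:

```python
import math

def gr_parameters(magnitudes, years, m_c):
    """Maximum-likelihood b-value (Aki, 1965) and annual rate above the
    completeness magnitude m_c, for comparing a historical and a
    synthetic catalogue. Illustrative sketch with no binning correction.
    """
    mags = [m for m in magnitudes if m >= m_c]
    # Aki (1965): b = log10(e) / (mean(M) - m_c)
    b = math.log10(math.e) / (sum(mags) / len(mags) - m_c)
    rate = len(mags) / years  # proxy for the a-value comparison
    return b, rate
```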
In my view, such a criterion could be far more conclusive than the previously mentioned one. For the characteristic zone of the NMSZ (lower panel of Figure 3-3), the synthetic catalogue appears to overestimate the frequency compared to the historical catalogue for the non-characteristic portion of the model (magnitudes below M7.0). Nevertheless, this is difficult to assess graphically, as the historical catalogue since catalogue completeness contains only three events above M5.0 and the synthetic catalogue is calculated only from that magnitude upwards. The reader should also keep in mind that the cumulative frequency is plotted, so the frequency value at M5.0 results from the summation of the characteristic and the non-characteristic portions of the model. Here the effect of this summation is significant, as the characteristic portion of the model has greatly reduced recurrence intervals for the characteristic events and a large number of characteristic events is generated. For the Charleston seismic zone a rather good agreement can be seen (Figure 3-4); the straight part of the blue line in that figure results from the Charleston earthquake of 1886 A.D. But again I want to stress that such a graphical assessment is difficult, especially in source zones where only few earthquakes happened in historic times. In my view one should not concentrate on the magnitude-frequency distributions of the characteristic zones. The events in these locations have very long recurrence intervals, and their frequency in the historical data is therefore likely not representative. By definition, characteristic source zones do not follow the GR law expected from extrapolating the current rate of seismicity. For all these reasons one should concentrate on including all available information about characteristic recurrence intervals in the model instead of comparing the magnitude-frequency distributions of characteristic source zones in the historical data with the model. If a criterion for the goodness of fit between the GR distribution of the historical data and the model is introduced, it should not be applied to characteristic source zones.

3.8.3 Hazard map

Comparing Figure 3-8 with Figure 3-2, the pattern of the hazard compares nicely to the results of the national seismic hazard assessment program of the U.S. Geological Survey (Frankel et al., 2002). Using relationships between intensity values and peak ground motion, it is possible to convert intensity values to peak ground motion values (e.g. Neumann, 1954; Gutenberg and Richter, 1956; Trifunac and Brady, 1975; Murphy and O'Brien, 1977). Doing so also leads to a very good correlation. However, such conversions show a high scatter, or, in Peter Bormann's (2002) words:

"However, they [attempts to equate intensity with peak ground acceleration] can not be relied on; work in the 1970s (Trifunac and Brady, 1975) demonstrated that intensity and peak ground acceleration correlate very poorly, and any attempt to relate the two suffers from such severe scattering as to be practically useless."
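For reference, one of the cited relations, the Gutenberg and Richter (1956) form log10(a) = I/3 − 1/2 with a in cm/s², can be sketched as below. Keeping Bormann's caveat in mind, such a conversion is at best a rough pattern-comparison aid, never a precise value:

```python
import math

def mmi_to_pga(mmi):
    """Peak ground acceleration in cm/s^2 from MMI intensity, using the
    Gutenberg and Richter (1956) relation log10(a) = I/3 - 1/2.
    The scatter of this relation is very large (see the Bormann quote)."""
    return 10.0 ** (mmi / 3.0 - 0.5)

def pga_to_mmi(pga_cms2):
    """Inverse relation, I = 3 * (log10(a) + 1/2)."""
    return 3.0 * (math.log10(pga_cms2) + 0.5)
```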

One should therefore concentrate on comparing the patterns of the hazard and their relative magnitude from location to location in Figure 3-2 and Figure 3-8, rather than examining the absolute values. These patterns show a remarkably similar appearance. Not only are the two zones contributing most to the hazard (Charleston and New Madrid) recognizable, but also features with a lower hazard level such as Western Tennessee, Charlevoix, Central Virginia, New Jersey, Northern New York (state of New York), Northeastern Maine and Western New Hampshire. On the other hand, there are also some obvious deviations. Looking at the NMSZ in the USGS hazard map, a z-shaped pattern is recognizable. The reason for this pattern is that the USGS implemented several fault lines as suggested by Johnston and Schweig (1996). As outlined in section 3.5, only one fault zone was implemented in our model, and it is therefore not possible to reconstruct that z-shaped form. Along the 110°W meridian our model appears much more spotted than the U.S. Geological Survey hazard map. Furthermore, considering the relative hazard values, our model seems to underpredict the hazard in this zone: Frankel et al. (2002) suggest hazard values along this meridian similar to those in the northernmost portion of the state of New York, whereas our model shows significantly lower hazard values there. These two observations can be explained as follows. First, the USGS implemented several fault zones in this region, whereas I did not include any faults there; the hazard values are therefore concentrated in areal zones and do not show the elongated form seen on the USGS map. Second, the underestimation of the hazard values along the 110°W meridian in our model can be attributed to the fact that we did not extend our source model west of this line.
The source model of the USGS incorporates seismically very active sources west of this line, and these sources contribute substantially to the seismic hazard along the 110°W meridian.





4 Stochastic Event Set for Mainland China

4.1 Introduction

During its history, the mainland of China experienced several large and destructive earthquakes. The deadliest earthquake ever reported happened in 1556 A.D. in Eastern China and caused 830'000 fatalities (e.g. Shi et al., 2003). Today, mainland China is a densely populated area undergoing enormous economic growth. A carefully conducted seismic hazard assessment of this region is essential to properly plan countermeasures, such as the introduction of appropriate building codes, in order to reduce the social and economic impact of a possible destructive earthquake. In spite of the tremendous impact a 1556 A.D.-type earthquake would have nowadays, only few studies have addressed the seismic hazard in this region; the only detailed seismic hazard analysis was conducted by the Global Seismic Hazard Assessment Program (GSHAP) (Giardini et al., 1999). This chapter contains a detailed analysis of the seismic hazard of mainland China. A seismic hazard assessment of China poses the problem that only sparse literature is available to the non-Chinese speaking investigator. Hence, a thorough literature search had to be conducted to determine the few sources available to the non-indigenous researcher, and this literature search is documented in this chapter. The study area of mainland China could also serve to test the main assumption of bootEQ, namely that the locations of present-day seismicity can be assumed to be the locations of future earthquakes. An earthquake catalogue of much longer duration is available for China, listing events dating as far back as 2300 B.C. This allows a synthetic catalogue, based on instrumental recordings and extended to the duration of the complete historical catalogue, to be compared with the complete historical catalogue, which could indicate whether bootEQ is able to reconstruct the sizes and locations of large earthquakes.
The third reason for choosing China as a target region for applying bootEQ is that it has a completely different tectonic setting compared to the Central and Eastern United States. The seismicity in the former can be described as active tectonics in a collision regime, including complex faulting and deformation. In the latter, the seismicity occurs in more diffuse patterns and the region can be defined as a Stable Continental Region (SCR) (according to the definition of Johnston, 1994; or Johnston, 1996a)†. It is critical for future applications of seismic hazard assessment studies to know if a tool that creates stochastic event sets can also be applied to more active tectonic regions.

† The term SCR is a subset of the term intraplate in that it excludes oceanic crust and intraplate regions that are either currently tectonically active or have undergone activity in the geologically recent past. The seismic moment release differs by a factor of nearly 50 compared to more active regions.

4.2 Catalogue of observed seismicity

Two catalogues were available to this study. In this section I outline their underlying datasets and mention some inconsistencies, as well as the catalogue completeness, which is needed as an input parameter for bootEQ.

4.2.1 Datasets

The first catalogue comes from the State Seismological Bureau of China (SSB); from now on I refer to it as the SSB-catalogue. The data are published in Shi et al. (2003) and the electronic data are available to the Swiss Reinsurance Company (they cannot be reproduced in this work for copyright reasons). The catalogue contains 6700 events covering the whole of China, with magnitudes equal to or greater than M4.7, from 2300 B.C. to 2004 A.D. The year 2002 A.D. was not available at the time of writing and the catalogue was therefore cut at 2001 A.D. The magnitude scale is surface wave magnitude (SSB, personal communication, 16 November 2005, via Dr. Junhua Zhou‡). The second available catalogue is taken from Zhang Peizhen et al. (1999), who collected regional and local catalogues of Mongolia, Kazakhstan, Tadzhikistan, Kirghizstan, Korea, India, Myanmar (Burma), Vietnam, Nepal, Pakistan, Laos, Bangladesh and China; from now on I refer to it as the GSHAP-catalogue. This catalogue contains 7081 events from 7670 B.C. to 1997 A.D. with magnitudes equal to or greater than M4.7, also reported in surface wave magnitude.

4.2.2 Observed Problems and Discussion

The authors of the GSHAP-catalogue claim that duplicate records from different sources were removed using a computer program that deletes events whose origin times differ by less than two minutes, followed by a manual examination. This should mean that all earthquakes of the SSB-catalogue within China are also included in the GSHAP-catalogue. A quick examination shows that this is not the case: some events of the SSB-catalogue (before 1997 A.D., of course) are not included in the GSHAP-catalogue. Conversely, some events of the GSHAP-catalogue are not included in the SSB-catalogue, even for locations well inside mainland China; such events are unlikely to have been reported by neighbouring seismological observatories, as they lie well beyond their detection range. These two facts are interpreted to mean that the GSHAP-catalogue contains several inconsistencies from the merging of the catalogues. Therefore I chose to treat the SSB-catalogue and the GSHAP-catalogue as two separate catalogues that are independent of each other. Constructing a merged catalogue from the two is hardly possible, as the origin times of the events were not available to this study.

‡ Swiss Reinsurance Company, Beijing

In contrast to the CEUS region, the reported magnitude scale is surface wave magnitude. Several conversion schemes exist to relate the reported surface wave magnitudes in the two catalogues to moment magnitude (e.g. Yang, 1997; Rong, 2002). Such conversion schemes involve multiple steps and are complicated by the fact that the SSB reported magnitudes before 1991 in Beijing surface-wave magnitude and thereafter in "Ms 7" (see Zhang Peizhen et al., 1999 for further details on this magnitude scale). I chose not to apply such a conversion scheme; the arguments against doing so are nicely expressed by Zhang Peizhen et al. (1999):

"Two major problems prevented us from using the multi-step method [...] to convert the various magnitudes into moment magnitude [...]. First, the available data are inadequate to establish robust relationships between different magnitude scales [...]. Second, some magnitude data provided by local catalogs already are converted values rather than directly measured values. [...] Errors in magnitude estimation increase with each scale conversion, in part because the dispersion associated with each relationship is large. Any multi-step conversion scheme may result in final magnitude values that are non-uniform and have large associated errors."

Furthermore, for most earthquakes Ms ≈ Mw (see Chapter 1 of Lay and Wallace, 1995).

Moreover, the conversion to moment magnitude is unnecessary as long as an attenuation relationship exists that relates surface wave magnitude to the modified Mercalli intensities needed for the final hazard calculation. From now on, magnitude values referred to as M denote surface wave magnitude unless mentioned otherwise.

4.2.3 Catalogue completeness

There are no agreed catalogue completeness parameters reported in the literature; either the values are not available to the non-Chinese speaking reader, or they are highly variable depending on the source. Huang et al. (1994) (as cited by Ma et al., 2005) report 1480 A.D. as the year of completeness for M≥6.0 in North China and 1900 A.D. for other regions. Jenny (2004) reports 1915 A.D. as the year of completeness for moment magnitudes greater than or equal to M6.0 for North China, a discrepancy of more than 400 years between these authors. To complete the list of available estimates, Rong (2002) estimated the catalogue completeness to be 1950 A.D. for M5.4, and Zhang Peizhen et al. (1999) 1500 A.D. for M5 in Eastern China and 1700 A.D. in Western China. Further estimates of the completeness level of the earthquake catalogue were not found. Because the discrepancies between the sources are so extreme, I chose to estimate the year of completeness of the SSB-catalogue and the GSHAP-catalogue myself for various magnitude bins. The applied method compares the cumulative sum of reported events in different magnitude bins to the time window in which they are reported (e.g. Mulargia et al., 1987): in a plot of time versus the cumulative number of reported events, the slope of the line becomes constant once the completeness year is reached. I used a Microsoft Excel tool written by Martin Bertogg† for the specific implementation. The results of this approach are shown in Figure 4-1. Surprisingly, the cumulative sum of events in the magnitude bin M4-5 is actually smaller than in the bin M5-6. The reason is that the analyzed catalogue contains only magnitudes greater than M4.7 (with very few exceptions); many more events with magnitudes between M4.0 and M5.0 are reported only from 1990 A.D. onwards, presumably because an upgraded seismic network was able to detect events down to lower magnitudes. The plot shows that from 1930 A.D. on, the slope for magnitudes in the range M5-6 is constant. I therefore chose 1930 A.D. as the year of catalogue completeness and the threshold magnitude Mc = M5.0 as input values for bootEQ.

Figure 4-1: Cumulative number of earthquakes by magnitude range for China (y-axis: cumulative number of events).
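The completeness check described above (a constant slope of the cumulative event count versus time) can be sketched as follows. This is an illustrative Python criterion of my own, not the Excel tool mentioned in the text: a year is accepted as the completeness year if the event rate in the first half of the remaining interval matches the rate in the second half within a relative tolerance.

```python
def completeness_year(event_years, end_year, tol=0.3):
    """Estimate the year of catalogue completeness as the earliest year
    from which the event rate is roughly constant. The rate in the first
    half of [y, end_year] must match the rate in the second half within
    a relative tolerance tol (an illustrative criterion)."""
    for y in sorted(set(event_years)):
        mid = y + (end_year - y) / 2.0
        first = sum(1 for t in event_years if y <= t < mid)
        second = sum(1 for t in event_years if mid <= t <= end_year)
        if first and second and abs(first - second) / second <= tol:
            return y
    return None  # no stable interval found
```

For a catalogue with two isolated early events and a steady rate from 1930 onwards, the criterion recovers 1930 as the completeness year.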