A simple statistical estimation procedure for Monte ... - Springer Link

A Simple Statistical Estimation Procedure for Monte Carlo Inversion in Geophysics. II" Efficiency and Hempel's Paradox By R. S. ANDERSSEN and E. SI/NETA1)

Summary - The major drawback to the use of the statistical estimation procedure proposed recently by Anderssen and Seneta for Monte Carlo Inversion is the necessity to generate and test an excessive number of random models before the required M successive successful acceptable solutions are obtained. In this paper, it is shown that this difficulty can be alleviated in situations where the occurrences of a non-successful non-acceptable solution can be regarded to be equivalent in some sense to the occurrence of a successful acceptable solution as far as support for the validity of a given refinement is concerned. However, the problem of using this equivalence to gain greater efficiency is a difficult one, since plausible reasoning to this end results in a manifestation of Hempel's paradox. 1. Introduction In a recent paper, ANDERSSEN and SENETA [1] z) proposed a statistical estimation procedure which formalizes the implementation of, and interpretation o f results obtained from, M o n t e Carlo Inversion (MCI). In order to list the basic steps in their procedure, it is assumed that the following has been given {see [1 ; w : (i) The data in the f o r m o f tests {T}, which divide naturally into the following subsets o f tests (direct tests)

/{f}

(indirect

t e s t s ) S precise { T},

\

n o n precise { ~ } ,

where we associate with the non-precise {~} upper and lower bounds, called N P T bounds, which define their degree o f non-uniqueness. (ii) The unknown, say u, with respect to which the tests {T} are defined. W i t h o u t loss o f generality, we shall assume that it is a function of a single variable x, viz. u = u ( x ) , defined on the interval [0, 1]. (iii) U p p e r and lower bounds, called a priori u(x)-bounds, which define the degree o f non-uniqueness within the unknown. A suitable basis for specifying such bounds is: the bounds are such that they contain all possible realistic solutions. We say that a r a n d o m solution (model) ~ = f i ( x ) o f u=u(x), lying between the 1) Australian National University, P.O. Box 4, Canberra A.C.T. 2600, Australia. 2) Numbers in brackets refer to References, page 14.

6

R.S. Anderssen and E. Seneta

(Pageoph,

u(x)-bounds, is acceptable if it satisfies all the tests iT}. Let {A} denote the set of regions lying between the u(x)- and NPT-bounds. We are now in a position to define the basic steps in tbe implementation of the above mentioned statistical estimation procedure. They are: 1. Determine a set of N acceptable solutions {fi} using the following sequence of steps: a. Generate a uniformly distributed random solution 4, lying between the u(x)-bounds. b. Is fi acceptable? If yes, accept ~ into {12}; if no, reject fi, and return to a. c. Does {~} contain N acceptable solutions? If yes, stop; if no, return to a. NOTE 1.1. The choice of N depends heavily on how quickly a refinement of {A} can be deduced from {z~}.Consequently, it is problem dependent, and thus, in general, difficult to automate. One exception is unconstrained optimization - see BROOKS

[2]. 2. On the basis of the information contained in {z~}, define a refinement of {A} and denote it by {A1}: where possible, sharper upper and lower bounds {than the u(x)- and NPT-ones} which just contain {t2} and the corresponding values for the non-precise tests {~} are specified, with the original u(x)- and NPTbounds being retained as the sharper bounds in those cases where sharper bounds are not discernable. Consequently, {A1} denotes the set of subregions lying between this set of sharper bounds. NOTE 1.2. Because of the nature of MCI, it is unrealistic not to allow for a situation where some of the sharper bounds will coincide with the corresponding u(x)- and NPT-bounds. In fact, except to require that the u(x)- and NPT-bounds be consistent, nothing has been assumed which ensures that sharper bounds than the a priori u(x)and NPT-bounds will be found. 3. Define a successful acceptable solution as one which, with its non-precise tests {T}, falls in the refinement {Ai} ( = {A1}, initially). Commence a search for M successive successful acceptable solutions, where M is the smallest integer consistent with M ~ loges/log(1 - fi), where a and fl are the confidence levels mentioned in [1; w sequence of steps is used:

The following

a. m = 0 , i = 1 .

b. Generate a uniformly distributed random solution z~, lying between the original a priori u(x)-bounds.

Vol. 96, 1972/IV)

Monte Carlo Inversion in Geophysics. II

7

e. Is ~ acceptable? If yes, go to d.; if no, go to b. d. Is fi successful ? If yes, set m = m + 1 and go to f.; if no, set i = i + 1 and m = 0, add this non-successful acceptable solution into {~7},generate a new refinement {Ai} in the sense of 2, and go to e. e. Does {A i} = {A } ? If yes, STOP as a refinement of {A } is not possible; if no, return to b. f. Is m = M ? If yes, STOP as a refinement of {A} is possible upto the confidence level specified by a and fl; if no, return to b. Thus, the whole basis of the estimation procedure of Anderssen and Seneta is to ascertain whether a refinement of {A } in the sense of 2 up to confidence level specified by a and fl is possible. It is clear from the above that the estimation procedure is inefficient, if it is necessary to generate many random solutions a before an acceptable one is obtained. Such a situation arises in density modelling. In fact, because of the nature of MCI, one would expect this to be the rule rather than the exception. Consequently, it is important to examine ways to improve its efficiency. In w it will be argued that information which supports the validity of a given refinement {A i} is contained in the class of non-acceptable solutions, when the occurrence of a non-successful non-acceptable solution can be regarded as equivalent in some sense to the occurrence of a successful acceptable one as far as support for the validity of {Ai} is concerned, and therefore, should be used. Some basic facts about Hempel's paradox are developed in w in order to illustrate the nature of the problem associated with deciding whether a non-successful non-acceptable solution is equivalent in some sense to the occurience of a successful acceptable one. To make full use of this additional confirming evidence, changes must be made to the manner in which the estimation procedure listed above is implemented. These are presented in w For clarification, we note tbat, in view of the definition of a successful acceptable solution, a non-successful non-acceptable one is a ~ which not only falls outside the refinement set {A,}, but also fails to satisfy at least one of the tests {T}. It is reasonable to expect that this 'double negative', when it occurs, also lends some support to the current refinement set {Ai}.

2. Hempel's paradox From the Introduction, it is clear that the next step is to investigate the validity of using 'non-successful non-acceptable' solutions as confirming evidence for the hypothesis

hi:

The refinement {Ai} contains all the information contained in the u(x)-bounds and non-precise tests,

in conjunction with 'successful acceptable solutions' as hitherto.

a-priori

8


(Pageoph,

Within the broad realm of the logical foundations of probability and induction, justification for (and agreement with, the use of a procedure of this kind in a very general context is still unresolved {see SCHEFFLER[3 ; Part III] and CARNAP [4; w167 86-89]}. The inherent difficulties are embodied in what is known as Hempel's paradox, and may be explained as follows. Let ~C'

stand for

'is a crow'

and 'B'

stand for 'is black',

and consider the hypothesis Ho: (x) (Cx -+ Bx). We interpret Ho as: 'for everything x, if x is a crow, then x is black'. Now, under Nieod's (sufficient) criterion {see [3; Part III]}, a way to verify H o is to use the existence of an y such that the observational report about y asserts that y is a crow and is black, viz.

C y . B y, as confirming evidence for H0, where ' . ' - ' a n d ' . Another plausible principle which Hempel reasons should be used with Nicod's criterion is the equivalence condition which demands that anything confirming a hypothesis confirms every logically equivalent hypothesis. Consequently, since the hypothesis

HI:(x)(,.~ B x - + ~ C x ) is equivalent to Ho, where ' ~ ' = ' n o t ' , it follows that an observational report about the existence of an object y for which

~Cy.

NBy

represents confirming evidence for H o. This conclusion, reached with the aid of two seemingly innocent logical principles, is intuitively unacceptable, since on this basis the observation of a white shoe, red kangaroo, etc. would tend to confirm Ho. Let us go a little further to emphasize this intuitive unacceptability. Also equivalent to H o is the hypothesis H~:(x)((C x v - C x ) - ~ ( ~ C x v B x ) ) , where ' v ' -= 'or', thus, the observation of a y satisfying either

~Cy

or B y

also confirms H o. Now, repeating the reasoning from the beginning, b u t putting ',~B' in place of 'B' throughout, it follows in particular that ~ C y also confirms ~ Bx).

Vol. 96, 1972/IV)


9

This hypothesis contradicts H o, but also is confirmed by a white shoe, a red kangaroo; and, indeed by a Western Australian black swan. This is HEMVEL'S paradox: plausible reasoning leads to intuitively paradoxical results. The question arises: Is intuition correct and thus the above basis for confirming evidence wrong, or is intuition at fault? HEMPEL argues the case for the latter, and has been given strong support by some, such as CARNAV [4; w86--89] and SCHEFFLER [3; Part III]. A detailed account of the situation can be found in the last mentioned. The essence of HEMVEL'Sargument seems to be that the impression of a paradoxical situation is not objectively founded, but is, rather, a psychological illusion, resulting from a faulty view concerning the reference of universal conditionals, and from the intrusion of extra information (ScHEFFLER [3; Part III, w167 and HEMP~L [5]). Another line of defence of the reasoning leading to the paradox (i.e., that there is no real paradox), although not acknowledged as always adequate by HEMPEL and his supporters, seems particularly well suited to, and adequate ,for, the framework of our present needs. This arose out of an attempt by HOSlASSON-LINDENBAUM [6] to quantify the situation by speaking of degrees of confirmation. The paradox is explained by reference to 'class-size': there are many more non-black objects than there are crows, hence the degree of confirmation for H o provided by an observation of a nonblack non-crow is much less than that provided by the observation of a black crow. Similar arguments are advanced by PEARS [7], who notes in essence that the fact that the hypothesis H o ran the risk of falsification on the basis of an observation of a white shoe, and was not falsified, tends to confirm it (even if it tends to confirm the contradictory hypothesis also). What is particularly attractive about the HOSIASSON-LINDENBAUMapproach, in a physico-scientific (as distinct from a philosophical) context, is that: a) The paradox does not arise if the structure of the universe of observable objects of interest is sufficiently well defined so that estimates for the degree of Confirmation of different types of confirming evidence can be generated. b) It is possible to prove mathematical theorems, essentially probabilistic in content, which support the quantitative conclusion of HEMPEL [5], o n the basis of a set of (essentially probabilistic) axioms. A later mathematical approach of the same kind, arriving at essentially the same conclusion, can be found in the first part of a paper of GOOD [8] (although the second part contains an error, [9]). In addition, GOOD introduces the notion of 'stoogian observation', which is the kind that needs to be made to ensure the validity of such arguments, and is made in our experimental situation. Consequently it seems to us that, within the limited universe in which we formulate the statistical estimation procedure for the Monte Carlo Inversion of geophysical data, one is justified in using 'non-successful non-acceptable' solutions to support the hypothesis h , so long as one takes into account, in some satisfactory quantitative way, that the degree of confirmation provided by any one of them is much less than that provided by a 'successful acceptable' solution. In fact, in the next section, we shall

l0


(Pageoph,

indicate how changes in the implementation of the estimation procedure presented in w1 lend further support to this conclusion.

3. Efficiency It follows from the formulation for the statistical estimation procedure in w1 that there are four possibilities which can occur for any randomly generated solution lying between the a priori u(x)-bounds: Pl: An acceptable solution falls in {A~}, and thus, is successful. P2: An acceptable solution does not fall in {Ai}, and thus, is non-successfuL /~3: A non-acceptable solution is successful. That is, there exists a random solution which lies in {A~}, but does not satisfy all the tests {T}. fig: A non-acceptable solution is non-successful. Now, when the tests (T} are only non-precise tests, the four possibilities reduce to three :pl and P2, as above, with P3: A non-acceptable solution, which is automatically non-successful. From the point of view of the present investigation, this special situation has certain advantages. It eliminates the necessity to decide whether the occurrence of successful non-acceptable solutions represent confirming evidence for h~. In fact, this desirable situation can be achieved by simply changing the manner in which the statistical estimation procedure is implemented. The basis for this change is a conflict which, at first sight, appears to exist between the implementation of w1 and the need to ensure that we operate within a framework which rules out the possible manifestation of Hempel's paradox. It is clear from w that we can justify the occurrence of non-successful non-acceptable solutions as confirming evidence for the hypothesis h~ if the universe of solutions over which we operate is sufficiently well defined. This does not appear to be the case for the implementation defined in w1, because we define all our operations relative to uniformly distributed random solutions a lying between the apriori u(x)-bounds. In actuality, we work within the universe of uniformly distributed random solutions u which satisfy the direct tests {T}, since the notion of direct tests was introduced in [1] to ensure that only realistic solutions from the class of all possible uniformly distributed random ones are used to generate and test refinements {A~}. For example, the direct tests should ensure that use is not made of random solutions which oscillate too wildly or have a basic form which contradicts known (physical) reality. Therefore, we can reformulate our estimation procedure with respect to (the universe ~R of) uniformly distributed random realistic solutions, r = r (x), say. In doing this, we make the tacit assumption that we generate such solutions by simply generating uniformly distributed solutions a and accepting them as r if they satisfy the direct tests {~r}.

Vol. 96, 1972/IV)


11

Thus, it is now necessary to redefine acceptability and success, relative to 91 and the indirect precise and non-precise tests {~}, in an analogous way to that of w However, if, in doing this, we drop the distinction between precise and non-precise, and interpret all indirect test {~} as non-precise, then we obtain a reformulation for the implementation of the estimation procedure which allows, relative to 9t and the new definitions for acceptability and success, only the three possibilities pa, P2 and P3 cited above. Consequently, a precise test becomes a non-precise test for which a refinement of its domain of non-uniqueness, lying between its upper and lower a priori uniqueness bounds, is virtually impossible. This lack of refinement can arise because either (i) The test is known to an accuracy equivalent to the rounding error of the calculations it defines, or (ii) the non-uniqueness in the test coincides with the non-uniqueness in the values calculated when applying it to acceptable solutions. NOTE 3.1. It is important to stress that these above changes represent only changes in the implementation and interpretation of the estimation procedure and not a change in the procedure itself. We defer defining the basic steps of the reformulated implementation, along with the new definitions of acceptability and success, until the next section, and now turn to an examination of efficiency for this reformulation. In [1], it was assumed that only the occurrence of pl supports the hypothesis h i. Consequently, the method is inefficient in situations, such as density modelling, where random solutions satisfying P3 occur much more often than those satisfying Pl. Thus, inefficiency, when it occurs, is a direct consequence of neglecting information contained in P3. In fact, this is information which is logically equivalent to that contained inpx. Thus, efficiency is improved if some weight is given to the information contained in P3 relative to that contained inp~ so that occurrences of non-successful non-acceptable solutions can be counted, with an appropriate weighting, along with the successful acceptable ones. As we have shown in w justification for this can be based on the work of HOSlASSoN-LI~DZNBAUM [6], PZARS [7] and GOOD [8] as long as some satisfactory estimate of the degree of confirmation of p3 relative to pa is determined. Thoughp3 also confirms P2, this poses no difficulty here because of the nature of the refinement testing phase of the estimation procedure, which is the stage at which we wish to make use of the occurrence of realistic solutions which satisfy P3 as well as pl. In fact, though it is clear that the occurrence of a random realistic solution which satisfies P3 supports the existence of random realistic solutions which satisfy p~ and Pz, we do not have a situation where the 'immediate inference', P3, supports contradictory possibilities, which is the case in more general contexts as shown in w In our case, we have a situation where the validity of the hypothesis, hl, rules out the existence o f p z , while the first occurrence of a P2 is regarded as contradicting the validity of h i, and thus, is used

t2


(Pageoph,

as a basis for reformulating hv Consequently, the more reliable the hypothesis hi the greater the likelihood that P3 only supports h i, and thus, the higher the degree of confirmation the occurrence of a P3 should possibly be given relative to p~. Thus, as well as the use of the concept of degree of confirmation, we have strong independent justification that Hempel's paradox can be ignored in the present context, since we have implemented our procedure in such a way as to block the more onerous manifestation of the paradox. In fact, we have eliminated from consideration, the selective confirmation concept of GOODMAN {see [3 ; Part III,w 6] }, which he showed to be a basic reason for the manifestation of the paradox. Now, we turn to the determination of an estimate of the degree of confirmation, c, of p3 relative to Pl ; viz., c(p3 ; pa). Our basis for this will be that the universe o f random realistic solutions ~l is sufficiently well-defined for an estimate of the ratio of non-acceptable to acceptable solutions over sufficiently many random realistic ones to be applicable. In the above estimation procedure, such an estimate is available at the end of the initial search for the N acceptable realistic solutions required for the generation of the first refinement {A1}: viz., (3.1) where Ntot equals the total number of random realistic solutions generated before the N acceptable ones are found, and ' ~ ' denotes 'is an approximate estimate for'. Further, by updating N and Ntot during the refinement phase of the estimation procedure, an improved estimate for c(p3;Pl) based on (3.1) is available when required. NOTE 3.2. The above estimate is conservative. There is nothing to stop an experimenter deciding, on the basis of suitable evidence, experience or knowledge, that c(P3;Pl) should have a larger value that that given by (3.1), and specifying it. As pointed out above, the more reliable the hypothesis hi, the higher the degree of confirmation the occurrence of a P3 should possibly be given relative to Pl. We pause to note that estimating c(p3; pl) in this way ensures that criticism relating to the loose use of Bayes' Postulate regarding prior probabilities is largely countered. It follows from the above discussion that the scientist is only interested in testing certain basic possibilities (e.g., pa, P2 and P3 above) within a well defined framework (e.g., ~R above), on the basis of which he wishes to check some hypothesis (e.g., hl above) about the given problem, and is not interested in the full range of possibilities within a framework-free context. It is this fact, which (i) forces the conclusion that Hempel's paradox does not operate in the present context (ii) allows an estimate for c(P3;pl) to be derived, and (iii) yields a basis for improving the efficiency of the estimation procedure. Before we turn to a discussion of (iii) in the next section, we only pause to note that justification for the use of information contained in random realistic solutions

Vol. 96, 1972/IV)


13

which satisfy P3 has been obtained in just this way by formulating the problem in terms of only three possibilities Pt, P2 and P3 over N.

4. Reformulation of the implementation of the statistical estimation procedure On the basis of the preceding discussion, it is only necessary to ensure that we now operate within the universe of random realistic solutions 9~ in the sense of w and count the occurrence of non-successful non-acceptable solutions with a weight of c(P3;Pl), in order to ensure greater efficiency for the estimation procedure. A random solution ~ lying between the a priori u(x)-bounds, is a random realistic solution, r, if it satisfies all the direct tests {St }. A realistic solution is acceptable if it satisfies all the indirect tests {T}, which are now interpreted as non-precise tests. We denote by {A} the set of regions lying between the u(x)- and NPT-bounds. The basic steps of the reformulation are: Determine a set of N acceptable realistic solutions {?} and an initial estimate ct for r using the following sequence of steps: a. Set I = 0 and Ntot =0. b. Generate a uniformly distributed random solution ft. e. Is fi realistic? If yes, set P=fi and Ntot=Xtot-]-l, and go to d; if no, reject ~, and return to b. d. Is P acceptable? If yes, accept ? into {?}, set I = I + 1 , and go to e; if no, reject P, and return to b. e. Does I=N? I f yes, set q = N / ( N t o t - N ), and STOP; if no, return to b. II. On the basis of the information contained in {t~}, define a refinement of {A} in the sense of 2, w1, and denote it by {A t } III. Define a successful acceptable (realistic) solution as one which, with its nonprecise tests { T }, falls in the refinement {A i} ( = {A t }, initially). Commence a search for confirming evidence which is equivalent to the occurrence of M successive successful solutions, where the occurrence of a successful solution is counted with weight 1 and that of a non-successful non-acceptable one with weight % and where M is the smallest integer consistent with M ~ log~t/log(1 - fl), with c~ and fi the confidence levels cited in [1; w The following sequence of steps are used: a. m = 0 , i = 1 , I = N and Nto~ retains its value obtained in I. b. Generate a uniformly distributed random solution ~. e. Is ~ realistic? If yes, set P =fi and Ntot =Ntot + 1, and go to d.; if no, reject fi, and return to b. d. Is P acceptable? I f yes, set I = I + 1 and go to e., ; if no, m = m + ci, and go to g. e. Is ~ successful? If yes, set m = m + l and go to g.; if no, set i = i + 1 , m = 0

14

R.S. Anderssen and E. Seneta and c i = I / ( N t o t - I ) , add this non-successful acceptable solution into {~}, generate a new refinement {Ai) in the sense of II, and go to f. f. Does {A/} = {A}? I f yes, STOP as a refinement of {A} is not possible; if no, return to b. g. Is m > M ? I f yes, STOP as a refinement o f {A} is possible upto the confidence level specified by a and fi; if no, return to b. Acknowledgement

Both authors wish to acknowledge valuable discussions with RICHARD ROUTLEY o f the Philosophy Department, Australian National University. REFERENCES [1] R. S. ANDERSSENand E. SENETA, A simple statistical estimation procedure for Monte Carlo Inversion in geophysics, Pure and Applied Geophysics 91 (1971/VHI), 5014. [2] S. H. BROOKS,Discussion of random methods for locating surface maxima, Operations Research 6 (1958), 244-251. [3] ISRAELSCHEFFLER,The anatomy of inquiry (Alfred A. Knopf, New York, 1963). [4] RUDOLF CARNAP, Logical foundations of probability, Second Edition (The University of Chicago Press, Chicago, 1962). [5] CARLG. HEMPEL, Studies in the logic of confirmation. Mind, N.S., LIV(1945), (I) 1-26; (II) 97-121. [6] JANINA HOSIASSON-LINDENBAUM,On confirmation. J. Symbolic Logic V (1940), 133-148. [7] DAVIDPEARS,Hypotheticals, Analysis X (1950), 49-63. [8] I. J. GOOD, The paradox of confirmation, I, British J. for the Philosophy of Science 11 (1960), 145-148. [9] I. J. GOOD, The paradox of confirmation, II, British J. for the Philosophy of Science 12 (1962), 63-64.

(Received 16th August 1971)