A linear programming approach to fitting an upper quadratic boundary

0 downloads 0 Views 615KB Size Report
Nov 22, 2012 - methods by using simulated data and shown to perform better. ... Journal of the National Science Foundation of Sri Lanka 41 (1) variables ... graphical methods like scatter plots and box plots can ..... gbonte/ecares/boot1.pdf, Accessed 08 November 2012. 2. ... Tang Z., Chambers J.L. & Barnett J.P. (1999).
J.Natn.Sci.Foundation Sri Lanka 2013 41 (1): 13-20

RESEARCH ARTICLE

$ OLQHDU SURJUDPPLQJ DSSURDFK WR ¿WWLQJ DQ XSSHU TXDGUDWLF boundary line to natural rubber data B.M.S.G. Banneheka1*, M.P. Dhanushika2, Wasana Wijesuriya3 and Keminda Herath4 1

Department of Statistics and Computer Science, Faculty of Applied Sciences, University of Sri Jayewardenepura, Gangodawila, Nugegoda. Department of Computational Mathematics, Faculty of Information Technology, University of Moratuwa, Katubedda, Moratuwa. 3 %LRPHWU\6HFWLRQ5XEEHU5HVHDUFK,QVWLWXWHRI6UL/DQND'DUWRQ¿HOG$JDODZDWWD 4 Department of Agribusiness Management, Faculty of Agriculture and Plantation Management, Wayamba University of Sri Lanka, Makandura.

2

Revised: 22 November 2012; Accepted: 20 December 2012

Abstract: ,Q WKH FRQWH[W RI ELRORJ\ WKH SUREOHP RI ¿QGLQJ the maximum or minimum response in a cause and effect UHODWLRQVKLSLVUHFRJQL]HGDVERXQGDU\OLQH¿WWLQJ)UHTXHQWO\ XVHG PHWKRGV LQYROYH ¿WWLQJ FXUYHV WKURXJK WKH ERXQGDU\ SRLQWVXVLQJWKHOHDVWVTXDUHSULQFLSOH+RZHYHULGHQWL¿FDWLRQ of boundary points especially when there are not any multiple responses at values of the explanatory variable is ad hoc. Existing methods involve dividing the range of explanatory variable into bins and considering points with the maximum response or the response above some value in each bin. However, the results depend heavily on the way of dividing into bins and the number of bins. There is no agreement on the best way of dividing or the number of bins. Furthermore, the least square line is not consistent with the theory of limiting response because it goes through the points rather than going above all the points or below all the points, representing the boundary. This paper presents a new method that avoids all the above drawbacks. It involves the theory of linear programming. The proposed method has been compared with commonly used methods by using simulated data and shown to perform better. The method is illustrated by applying to experimental data on the response of latex yield of natural rubber to leaf nutrient concentrations. Keywords: %LRORJLFDO GDWD ERRWVWUDS FRQ¿GHQFH LQWHUYDOV limiting response, linear programming.

INTRODUCTION Biological data are often generated as a consequence of a biological response of a dependent variable against one or more covariates, which explain the variability of the response in a biological system or a process. This can be further interpreted as a set of data generated as a result of a cause and effect relationship in a biological process. *

Corresponding author ([email protected])

In this study, the leaf nutrient concentrations of the major elements, N, P, K and Mg used in the determination of fertilizer recommendations for rubber was considered. The critical levels for these elements are decided by their response to the latex yield of rubber. These kind of relationships need to be built in a reliable way for the routing exercise of fertilizer recommendations to the rubber sector. The cause and effect relationships in a biological process are of simple to very complex in nature and of greater interest to biologists and researchers in understanding behavioural and functional features of a SURFHVV$YDULHW\RIVWDWLVWLFDOPRGHOVDUHXVHGWR¿JXUH out the cause and effect relationships found in various biological processes. Most of the underlying statistical models; viz. linear regression, non linear regression and smoothing techniques, basically explain the average response of a variable against a covariate or a set of covariates. Even if it is statistically sound for prediction purposes, it is not biologically meaningful unless the dependent variable can be treated as an additive combination of all explanatory variables in a biological context (Milne et al., 2006). The maximum, minimum or limiting response under given conditions are generally important in the context of biology. Webb (1972) suggested that the upper boundary or the lower boundary of a plot that indicates the biological response may be of greater biological LQWHUHVWWKDQWKHOLQHRIWKHEHVW¿WWKURXJKWKHVHWRIGDWD Further, he proposed the boundary line as a conceptual model for the interpretation of a plot of two biological

14

variables where the dependent variable is some biological response to some set of independent variables, one of which is represented on the abscissa of the plot (Milne et al., 2006). The upper or lower edges of a set of data in a plot are considered as the upper boundary line or the lower boundary line, respectively. The upper boundary comprised the maximum responses of the dependent variable at particular levels of a covariate, while the lower boundary includes the set of minimum observations of the dependent variable at respective levels of the covariate. Biologists have found that boundary line model is persuasive and of practical interest in various contexts. Casanova et al. (1999); Schmidt et al. (2000); Shatar and McBratney (2004); Gunarathna et al. (2004); Lewandowski and Schmidt (2006) and Feiziasl et al. (2010) have applied boundary line method to assess the cause-effect relationships in biological data in crop growing environments. The boundary line may also represent the limiting response to available water (independent variable) in a multiplicative limiting response model, which has been used to trace gas emissions (dependent variable) from soil (Rolston et al., 1984; Milne et al., 2005; Pringle & Lark, 2006). The boundary line approach has also been used for modelling ZLWKLQ¿HOG\LHOGYDULDWLRQDVDEDVLVIRUUHFRPPHQGDWLRQV RQ VLWHVSHFL¿F PDQDJHPHQW 6KDWDU  0F%UDWQH\ 2004) and in specifying nutrient standards of durian (Poovarodom & Chatuepot, 2002). However, in most of the applications, methods of estimating the boundary line are ambiguous and lacking an underpinning statistical model. Further, Milne et al. (2006) have highlighted that the algorithms proposed by Webb (1972); Schnug et al. (1996); Kitchen et al. (1999); Tang et al. (1999); Schmidt et al. (2000) and Shatar and McBratney (2004) have not been widely adopted due to their impromptu nature.  ,QJHQHUDOWKHH[LVWLQJPHWKRGVWR¿WWKHERXQGDU\ lines are based on the least squares regression. Detection RI RXWOLHUV LGHQWL¿FDWLRQ RI SODXVLEOH SRLQWV RQ WKH ERXQGDU\DQGFXUYH¿WWLQJDUHWKUHHPDMRUVWHSVWKDWDUH LQYROYHGLQWKHOHDVWVTXDUHVPHWKRGRI¿WWLQJERXQGDU\ line. The measurement errors are expected to be removed during the outlier detection. Various methods to identify RXWOLHUV EHIRUH ¿WWLQJ D ERXQGDU\ OLQH WR D GDWDVHW FDQ be found in literature. Calculating Mahalanobis distance is a popular method in the literature to remove outliers. Shatar and McBratney (2004) have removed data points with the largest 5 % of Mahalanobis distances before ¿WWLQJ RI WKH ERXQGDU\ OLQH 6RPH UHVHDUFKHUV FRXQW the near neighbours of each observation and remove the observations, which have a fewer neighbours than

March 2013

B.M.S.G. Banneheka et al.

some threshold. A drawback of these methods is the possible removal of a data point that may not actually be an outlier. Therefore, a useful data point also can be wrongly deleted as an outlier (Milne et al., 2006). Simple graphical methods like scatter plots and box plots can also be used to identify outliers. However, information obtained from these graphs can be subjective and therefore, understanding of the dataset is essential to discard the outliers of the practical dataset. In the process of identifying points that fall on the boundary, data are grouped into subdivisions (bins) with respect to the independent variable (x-axis). However, the number of subdivisions, sizes of subdivisions and underlying method of grouping have not been clearly GH¿QHG \HW $V 6KDWDU DQG 0F%UDWQH\   KDYH pointed out, the method chosen to partition the data into bins affects the results. If the bin width is too narrow, WKHVHWRIELQPD[LPDZLOOEHQRLV\DQGWKH¿WWHGPRGHO will tend towards simple regression. With too wide bin sizes, the form of the boundary response will be poorly represented. Location of bins also affects the results.  7KUHHFRPPRQO\XVHGPHWKRGVWR¿WWKHERXQGDU\ OLQH DUH EULHÀ\ PHQWLRQHG EHORZ )RU IXOO GHWDLOV WKH reader may refer to Kitchen et al. (1999); Schmidt et al. (2000); and Shatar and McBratney (2004). All three methods involve dividing the range of x values into bins and least square estimation. The recommended number of bins in the literature was 8 to 10. However, WKHUREXVWQHVVRIWKHVHPHWKRGVRQWKH¿WWHGOLQHKDVQRW yet been studied. The methods are described assuming that the x values have been divided into 10 bins. 1.

2.

3.

Find the maximum y within each bin. Let them be denoted by v1, v2, . . . , v10. Let u1, u2, . . , u10 be the x values corresponding to those v1, v2, . . . , v10, respectively. Fit the least square boundary line using the points (u1, v1), (u2, v2), . . . , (u10, v10). We call this method ‘LS1’. Let u1, u2, . . . , u10 be the middle x values of the 10 bins. Find the 99th percentile of the y values within each bin. Let them be denoted by v1, v2, . . . , v10. Fit the least square boundary line using the points (u1, v1), (u2, v2), . . . , (u10, v10). We call this method ‘LS2’. Find the 95th percentile of the y values within each bin. For each bin, consider the points for which y values are equal to or above the 95th percentile. Fit the least squares boundary line using all such points in the 10 bins. We call this method ‘LS3’.

We compare our method with the above three methods.

Journal of the National Science Foundation of Sri Lanka 41 (1)

$PHWKRGWR¿WXSSHUTXDGUDWLFERXQGDU\OLQHPRGHOIRUWKHPD[LPXPUHVSRQVH

METHODS AND MATERIALS











additional parameters in fX,Y . A much simpler solution is presented in this paper.

Even though a considerable amount of literature is Suppose there exists a boundary line of the DYDLODEOHRQWKHERXQGDU\OLQH¿WWLQJQRQHRIWKHPKDYH IRUP J [ ȕ   ȕ0  ȕ1[  ȕ2x2 and we have a set of n D FRQFUHWH PDWKHPDWLFDO GH¿QLWLRQ IRU WKH ERXQGDU\ independent pairs of observations (x1, y1), (x2, y2), . . . , OLQH7KHUHIRUH ZH VWDUW E\ RIIHULQJ D GH¿QLWLRQ IRU DQ (xn, yn )XUWKHUVXSSRVHWKHUHDUHP P”Q GLIIHUHQW[ upper boundary line. Suppose X is some explanatory values among x1, x2, . . . , xn and let them be denoted by (independent) variable, Y is a dependent variable and X Ⱦሻȁ šሻ of ൌ yͳ‫š׊‬ ‫š א‬ u1, u൑ , .‰ሺšǢ . . umȾሻȁ .”ሺ Letൌ v൑ be‰ሺšǢ theͳ‫š׊‬ maximum values when X ”ሺ šሻ ൌ ‫א‬ൌš 2 i and Y are jointly distributed with some density function X . We use the subset (u , v ), (u , v ), . . . , (u , vm) for ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ‫א‬2 š i 1 ͳ‫š׊‬ 1 2 m fX,Y(x, y) on the domain DX,Y . Further, suppose that there our method. not ൌ a waste of information, because ”ሺ ൑This ‰ሺšǢisȾሻȁ šሻ ൌ ͳ‫š׊‬ ‫š א‬ H[LVWVVRPHIXQFWLRQJ [ȕ LQWKHGRPDLQ'X such that ൑ൌ ‰ሺšǢ Ⱦሻ െ ɂȁpoints ൌ š šሻdo ൏not ͳ‫š׊‬ ‫š א‬any these points, allšሻ the contain ”ሺgiven ൑ ‰ሺšǢ Ⱦሻ”ሺ െ ɂȁ ൏other ͳ‫š׊‬ ‫א‬ further information about the upper boundary. This is ”ሺ ൑൑ ‰ሺšǢ ȾሻȾሻȁ െ ɂȁൌൌšሻšሻൌ൏ͳ‫š׊‬ ͳ‫š׊‬ ”ሺ ‰ሺšǢ ‫אא‬ šš ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š א š׊‬ ...(01) ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ መൌ šሻ ൌ ͳ‫א š׊‬መ š because know that the upper boundary must go above ߚ ”ሺ we ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š׊‬ ‫א‬ š ߚ଴ ଴ ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š א š׊‬ መ these”ሺ points is enough for us that the boundary line ൑ and ‰ሺšǢitȾሻȁ ൌ šሻ ൌ ͳ‫š׊‬ ‫א‬ š ߚ଴ DQGIRUDQ\İ! መ መ goes as close as possible to these points. መ ߚଵ ‫š א‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െߚଵɂȁ ൌߚ଴šሻ ൏ ͳ‫š׊‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š א š׊‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ መଵšሻ ൏ ͳ‫š א š׊‬ ߚ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š א š׊‬ ...(02) ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š א š׊‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻ ൌ šሻ ൏ ‫א‬ൌš ”ሺ ൑ Ⱦሻȁ ൌ šሻminimize ͳ‫š א š׊‬ ߚመଶ ͳ‫š׊‬ We suggest to use , ߚመ‰ሺšǢ that ߚመଶെߚመ଴ɂȁ ଵ and ߚመ଴ መ ߚ଴ ߚመൌ šሻ ൌ ͳ‫š א š׊‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ߚመ଴FRQVLGHUHG DV DQ XSSHU ERXQGDU\ 7KHQ J [ ȕ  FDQ EH ௠ଶ ߚመመ ௠ ଴መଶ ଶൌ šሻ ”ሺ ൑መ෍ ‰ሺšǢߚመଵȾሻȁ ͳ‫š׊‬ š ˜ ሻ መ መߚ ሺߚ —ൌ ߚመଶെ —‫א‬ଶ୧ɂȁ െ line. The conditionߚመ(1) simply states the requirement that ”ሺ ൑ߚመଵ˜‰ሺšǢ ൌ୧ šሻ ...(03) ൏ ͳ‫š א š׊‬ ෍ ௠ ሺߚ଴ ൅”ሺ ߚଵߚመ—ଵ୧ ൑ ൅ ‰ሺšǢ ߚ ଴ ୧൅െ ୧ ൅Ⱦሻ ଵ ଶ— ୧ሻ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š׊‬ ‫š א‬ ௜ୀଵ ଶ ௜ୀଵ መ መ መ መ all the values of the response variable for a given value of ߚଵ ൅መ šሻ ߚ— െ  ˜୧ ሻ‫š א‬ ௠ ”ሺ෍ ൑ ‰ሺšǢሺߚ Ⱦሻ െ ߚɂȁ ൌ ൏ ͳ‫š׊‬ ଴൅ ଵ— ”ሺ௜ୀଵ ൑ ‰ሺšǢ Ⱦሻȁ šሻଶ൅ ൌ୧ߚͳ‫š׊‬ ‫š א‬ ߚመଶ୧ൌ መመ଴ ൅ መߚଵଵ— መଶ —ଶ୧ െ X must be below the ෍ ሺߚ ߚ ˜୧ ሻ ߚመ଴ ߚመଶ upper boundary line. The condition ୧ መ ߚ subject to ൑ ଶ ߚ ”ሺ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ଴ ͳ‫š א š׊‬ ௜ୀଵ መଶ (2) eliminates all theߚother curves that are above the real ଶ ߚመመ଴ ǡߚመ‹൅ൌߚመͳǡʹǡ መ ଶ ߚ መଵ —௠୧ ൅ ߚመߚ ˜ ǡ ‹ ൌ ͳǡʹǡ ǤǤǡ ߚመ଴ ൅ ߚ௠ Ǥǡ ௠ ୧ ଶ መ ଶ— ଶ ୧ Ǥ൒ ୧൅ መ଴ଶ —଴൅୧൅ߚ൒ መଶଵ —ଵ˜— ERXQGDU\OLQHVWLOOVDWLVI\LQJ  +HUHȕLVDYHFWRURI ෍ ሺߚ ߚ — െ  ˜መ ୧ ሻ୧ ߚመଵ ଶ ୧ ଶ ୧ መ መ መ መ ଶ መ መ መ ෍ ሺߚ ൅ ߚ — ൅ ߚ — െ  ˜ ሻ ߚ ߚ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š׊‬ ‫א‬ š መ መ መ ...(04) ௠ ଴ ߚ଴ ൅ ߚሺߚ ൅ ߚߚଵଶ——୧୧ ൅൒ߚ˜ଶ ୧—ǡ ୧‹଴ൌ ଵ ୧ ଶ ୧ ୧ ෍ െ ͳǡʹǡ ˜୧ ሻ ǤଵǤ ǡ  ௜ୀଵ ଵ —଴௠ ୧൅ XQNQRZQFRHI¿FLHQWV)RUH[DPSOHIRUWKHVWUDLJKWOLQH ௜ୀଵ ሺߚመ ൅ ߚመ — ൅ ߚመ —ଶ െ  ˜ ሻ ߚመመଵ—ଶ ൒ ˜ ǡ ‹ ൌ ௜ୀଵ ෍ ߚመ଴෍ ൅ ߚመଵ —ሺߚ ߚ ͳǡʹǡ Ǥ Ǥ ǡ  ଴ȕ  ଵȕ ୧  ȕ [ ଶ ZLWK ୧  ȕ  ȕ )T, and for ୧ ୧መ൅൅ ଶߚመ ୧— ൅ ߚ ୧መ —ଶ െ ERXQGDU\ J [ ȕ  ˜୧ ሻ ߚመଶ ଴ ଵ ୧ ଶ ௠୧ 0 1 0 1 ௠ ௜ୀଵ መଶ , respectively. ߚመ଴ , ߚመଵ෍ ௜ୀଵ as the estimates of˜ and ߚ˜ These WKHTXDGUDWLFERXQGDU\IXQFWLRQJ [ȕ  ȕ0ȕ1[ȕ2x2, ෍ ୧ መ ௠ ୧ ߚመ଴ ൅ ߚመଵ —୧ ൅ ߚመଶଶ௜ୀଵ —ଶ୧ߚଶ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ ௜ୀଵ ௠ ǤǤǡ ଶ)T7KHERXQGDU\OLQHJ [ȕ FDQDOVR WZRHTXDWLRQVIRUFHWKH¿WWHGOLQHWRJRDERYHDOOWKHGDWD ZLWKȕ  ȕ ȕ ȕ መ መ መ ߚ଴ ൅ ߚଵ —୧ ൅ 0 ߚ1ଶ —୧2 ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ୧ መ ͳǡʹǡ Ǥ Ǥ ǡ መ ߚመ ൅ ߚመଵ —୧ ൅ ߚመଶ —୧෍ ൒ ˜ߚመ୧௠ ǡ˜ ‹௠ൌ ሺߚ ൅ ߚመଵthe — ൅ ߚመଶ —ଶ୧ െ  ˜୧ ሻ ߚ˜ ௜ୀଵ while being෍ asଵଶclose as possible be ߚany of order ଴points መଶ଴෍ መ଴ ൅other ௠ ෍ ሺߚ ߚመଵ —୧ ଴൅ ߚመto —ଶ୧ ୧െmaximum  ˜୧ ሻ ߚመଵ —୧ suitable ൅ ߚመଶ —ଶ୧ function ൒ ˜୧ ǡ ‹ ൌlike ͳǡʹǡa Ǥpolynomial Ǥǡ ୧ଶ൅௜ୀଵ ଶ መ଴ ൅ ߚመଵመ—୧ ൅ መߚመଶ —୧ ௜ୀଵ ߚ ൒ ˜ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ෍ ሺߚ଴୫൅ ߚଵ௠—୧௜ୀଵ ൅ ߚመଶ —୧୫୧ െ  ˜୫୧ ሻ 3 or more or a non linear function. points. ୫ ௠ ௜ୀଵ መଶൌ መ෍ —መ ǡ ƒ†ƒ ଶ ௠ ߚ ଶ —ଶ ଶ ൌ ෍ ௠ ǡ ƒ — መ ƒ଴ ൌ ǡ ƒଵ ൌ ƒ෍ — ǡ ƒ†ƒ ൌ ෍ ˜ ଴ ൌ෍ ଵ ୧ ෍ ሺߚ ൅ ߚ — ൅ ߚ — െ  ˜ ሻ ୫୧ ଴ ൌ୧ šሻ ൌ ͳ‫š א š׊‬ ୧ଵ ଶ୧ ଶ”ሺ ୧ ୫୧൑ ୧‰ሺšǢ Ⱦሻȁ ෍ ௠˜୧ ୧ୀଵ ୧ୀଵ ଶ ୧ୀଵ ୧ୀଵ ෍ ˜  $ORZHUERXQGDU\OLQHFDQEHGH¿QHGDQDORJRXVO\ ௜ୀଵ ଶ ௜ୀଵ inߚመ (3) can beመ ignored without ƒ଴Note ൌ ǡ that ƒଵ௠ൌ ෍መ —୫ ௠ƒ†ƒ —෍ —୫ ୧ ୧ǡ ଶመ ߚመൌ ௜ୀଵ ˜ ୧୧ ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ଴൅൅ߚ ଵଶ ୧ ൅˜ߚǡଶ— ߚ௜ୀଵ ൅ ߚመଵ——୧˜ ͳǡʹǡ ଶ ǤǤǡ ୧ ୧ୀଵ ୧ୀଵ The discussion in෍ this paper is restricted to estimating ଴ ଶ —୧ ൒ ୧ ‹ൌ— ෍ ƒ ൌ ǡ ƒ ൌ ෍ ǡ ƒ†ƒ ൌ ෍ መ መ መ ଶ ୧ ଴ ଵ ୧ ଶ ୧šሻ ൌ ሺߚsolution, —୧˜൅ Ǥa˜Ǥ ୧ǡconstant ሻ ௜ୀଵ changing since is‰ሺšǢ a given Ⱦሻȁ ൌ for ͳ‫š א š׊‬ ߚመ଴ ൅෍ ߚመଵ —the ߚመଶ൅—ߚ ǡ ‹ߚൌ ͳǡʹǡ ଶ —it ୧൑െ ୧ ൅଴ ୧”ሺ ୧ ଵ൒ ௜ୀଵ ୧ୀଵ ୧ୀଵ a quadratic upper boundary line. However, the methods ௜ୀଵ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ൏ ͳ‫š א š׊‬ ୫ ୫ መ መ መ dataset. Letting ଶ መ መ መ  ƒ଴ƒߚଶ ଴ߚଶ൅ ƒଵ ߚଵߚመ൅ ୫ൌ for ƒ଴ ߚan ƒ ൅ ൅ƒଶ෍ ߚመߚଵଶ—୧ ൅— ߚመǡଶƒ†ƒ —୧ ൒ ˜୧ൌ ǡ‹ ൌ Ǥ ଶǤ ௠ ǡ ଴ ൅upper ଵ ߚൌ ଵ or proposed here ୫can be easily adapted ୫ ୫ ͳǡʹǡ— ƒ଴ ൌመ ǡ ƒଵ଴ ൌ ෍ ଶ ௠ ୧ ଶ ୧ ”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š׊‬ ‫א‬ ଶ መ መ ƒ଴ lower ൌ ǡ ƒpolynomial — ൌ of ෍order ୫—ൌ ƒ଴more. ߚ଴ ൅ ƒଵ ߚƒଵ଴൅ൌƒǡ ෍ ˜୧ መ š ଵ ൌ ෍ ୫ ୧ ǡ ƒ†ƒ ଶ ୧ 3 or boundary line —୧ୀଵ ƒ†ƒଶ ൌ෍ ෍ ˜ —୧ୀଵ ଶ ߚଶƒଵ ൌ ෍ ୧ ǡ௠ ୧୧ ୫ ୫ ଶ ୧ୀଵ ୧ୀଵ ߚ଴ ͳ‫š א š׊‬ ”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ௜ୀଵൌ šሻ ൏ መ መ መ ଶ ୧ୀଵ ƒ଴ ൌ ǡ ƒଵ ൌ ෍ —୧ ǡ ƒ†ƒଶ ൌ ෍ —୧ ൌ  ƒ଴ ߚ଴ ൅ ƒଵ ߚଵ ൅ߚመƒଶ൅ መଶ —୧ ൒ ଶߚመଵƒ—୧ ୧ୀଵ ൅෍ ߚ ‹ଶൌ ͳǡʹǡ ௜ୀଵǤ Ǥ ǡ  —ଶ ˜ ƒ଴ ଴ൌߚǡ ෍ —୧˜୧ǡ௠ ୧ ǡƒ†ƒ ଵ ଶൌ ଶ ൌ෍ ୧ ୧ୀଵ ୧ୀଵ ...(05) መ መ መ መ መ መ ˜୧ ǡ ‹ ൌ୧ୀଵ ͳǡʹǡ Ǥ Ǥ ǡ  ௜ୀଵ ߚ଴ ൅ —୧ ߚଵ ൅ —ߚ ǡ ‹൅ൌ—ͳǡʹǡ Ǥǡ ୧ୀଵ ଴ ଶ൅൒— ୧˜ߚ୧ଵ Fitting the boundary line ୧ ߚଶ Ǥ ൒ ୧ߚ ෍ ˜Ⱦሻ መ଴ ൏ ͳ‫š׊‬ ߚመଵ ‫ š א‬୫ ൑ǡ ‰ሺšǢ šሻ ଶ ”ሺ ୧ െ ɂȁ ൌ ߚ መ መ መ መ መ መ  ൌ  ƒ ߚ ൅ ƒ ߚ ൅ ƒ ߚ ߚ ൅ — ߚ ൅ — ߚ ൒ ˜ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ୫ ଴ ଴ ଵ ଵ ଶ ଴ଶ ୧ ଵ ୧ ௜ୀଵ ୧ ଶ ƒ଴ ߚመ଴ ൅ ƒଵ ߚመଵ ൅ ƒଶ ߚመଶ መ መ መ ୫ ୫ ଶ  ൌ  ƒ ߚ ൅ ƒ ߚ ൅ ƒ ߚ ௠ it is easy to see that the above is a simple linear መ መ መ ଵ ଵ ଶ ଶ ߚ଴ ൅ —୧ ߚଵ ୫ ൅ —୧ ߚ ˜୧ ǡ ‹ƒଵൌൌͳǡʹǡ ƒ ൒ ൌ ǡ ෍Ǥ Ǥ ǡ —୧ ǡ ƒ†ƒ ൌଶ ෍ —ଶ୧ of least square and the method଴ of଴ maximum ଶ— ൌ  ƒ଴ ߚመ଴ ൅ ƒଵ ߚመଵ ൅The ƒଶ ߚመmethod ƒ଴ problem ൌ෍ ǡ ƒଵ଴ଶ˜ ൌ୧ determine ෍ —୧ ǡ୫ ƒ†ƒ ൌ ෍ መ መ ଶ ଶ መ ୧ୀଵ ୧ୀଵ መ መ መ programming to , and so ߚ ߚ ൌ  ƒ଴ ߚ଴ ൅for ƒ ߚ ൅ൌƒǡ ଶ ୧ ଶ ߚଶƒଵ ൌ ෍ likelihood are the most commonly used methods ଵ଴଴ —௜ୀଵ ୧ୀଵ ƒ†ƒଶ ୧ୀଵ ൌ ෍ ଴ —ଶ୧ ߚଵ ୧ ǡ୫ ͳ ଶ ͳ ଵ ƒଵ଴ଵ଴଴ ଶ ୫ ଶ ଶ that ଶ ሻ൧ ෠୧ ǡଵ‹™ መଵ଴଴ ෠መଵଵ෍ ෠መ෠ଶଶ଴™ ෠ඨ ሺȾ

ߚ ൌ Ⱦ෠ȾଶǤଵ™ െ Ⱦଵ ™୧ ൅ Ⱦଶ ™୧ଶ ሻ൧ ൣȾ ൅ —୧ ߚ ൅ଶ୧ —൅୧ୀଵ ߚ ൒൅୧ଶ˜Ⱦ ൌ୧଴൅ ͳǡʹǡ Ǥ™ ǡ୧୧ୀଵ  ඨ ¿WWLQJ FXUYHV PHWKRG RI OHDVW VTXDUH LVൌVXLWDEOH ሺȾ ൅ Ⱦ ™ Ⱦ െ ൅ ൅ Ⱦ ෍ ൣȾ ଶ7KH ଶ ଴™൅ ଴ ୧ ଴ ୧ ଶ መ መ መ ௠ ୧ ߚ଴ ൅ —୧ ߚଵ ൅ —୧ ߚଶ ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ͳ መ୧ୀଵ ƒ଴ߚ ǡ ƒ†ƒ ෍መ —୧ ଶ ଶ መ෠ଵൌ൅ǡ—෠୧ƒߚ ͳͲͲ ͳͲͲ ଵመଶൌ൒෍ ୧ୀଵ ˜෠୧ ǡ ‹™ൌଶ—୧െ ͳǡʹǡ Ǥ Ǥ ǡ ଶȾൌ ߚመ™ ଴ ൅ —୧ൣȾ ଵ ߚ መଵ —୧ ൅ ߚመଶ —ଶ୧ െ  ˜୧ ሻ when the line of interest goes through the data so that ߚ෍ ଶመȾଶ൅ ൌඨ ൅ Ⱦ൅ଵ ™ ൅ ሺߚ ଵ଴଴ ୧ୀଵ ଴ߚ ୧ߚ ଶ୧ୀଵ ଵ ୧൅ መ൑ መଵଶȾ ଴ ™୧ߚሻ൧ ୧ߚመ ሺȾ଴ ൅ ෍ ͳመ ൌ”ሺ ߚመ଴ ൅ —୧ ߚመଵ ൅ —ଶ୧ ߚመଶ ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ   ƒ ƒ ൅ ƒ ...(06) ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š׊‬ ‫א‬ š ୫ ୫ ଴ ଴ ଵ ଶ ଶ ͳͲͲ መ መ መ ଶ ଶ ଶ ෠ ௜ୀଵ —෠൅ ߚ ൒ ˜ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ   ƒ଴ ߚ଴୧ୀଵ ൅ߚƒ଴ଵ൅ ߚመଵൣȾ ƒ൅ଶ൅ߚ roughly equal number of observations is above and ሺȾ ൌൌඨ Ⱦ෠መଶଵ—™ ൅ Ⱦ ™ െ ൅ Ⱦ ™ ൅ Ⱦ ෍ ୧଴ߚଵ ଶ ୧ ୧ ୧ ଶ ୧ ௠ ଴ ଵଶ ୧ ଶ ™୧ ሻ൧ መ መ መ ƒ ൌ ǡ ƒ ൌ ෍ — ǡ ƒ†ƒ ൌ ෍ — ͳͲͲ መ  ൌ  ƒ ߚ ൅ ƒ ߚ ൅ ߚ ଴ ଵ ୧ ଶ ߚ୧ୀଵ ଴ ଴ ଵ ଵ ଶ ଶ ୧ୀଵ below the line. A dataset consistent with requirements ଶ መ ୧ ଵ଴଴ ୧ୀଵ ෍‫š א‬ ሺߚመ଴ ൅ ߚଵ —୧ ൅ ߚመଶ —ଶ୧ଶ െ  ˜୧ ሻ ͳ ”ሺ is minimized subject to constraints ଵ଴଴ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š׊‬ ଶ መ መ መ ͳ ෠ ෠ ෠ ଵ଴଴ (1) and (2) should contain all the points below the actual ଶ ௜ୀଵ ൌ ଶ ƒൌ଴ ߚ ƒଶ଴ߚଶ൅ Ⱦଵ ™୧ ൅ Ⱦଶ ™୧መ௠െ ሺȾ଴መ൅ Ⱦଵ ™ ඨ ൅ ƒ෍ Ⱦଶଶ™୧ଶଶ ሻ൧ ଵ ߚଵ ൅ ൣȾ ୧መ൅൒ ෠ ଶ ሺȾ଴ ൅ Ⱦଵ ™୧ ൅ Ⱦଶ ™ ଶ ߚൌ ሻ൧ ͳ଴ͳͲͲ ൅ —൅ —ߚ ߚ ˜መଶ୧ଶǡ—‹ ଶ୧ൌ൒ͳǡʹǡ ǡ ͳǡʹǡ Ǥ Ǥ ǡ  መଵ଴൅ መଶ୧‫א‬

ൌ ඨ ͳ෍ Ⱦ෠ଵ ™Therefore, ൣȾ෠଴ ൅line. ෠൅ ෠଴ ൅ ”ሺ ൑Ⱦ෠ଵ‰ሺšǢ Ⱦሻ െ™—ɂȁ šሻ ൏ ͳ‫š׊‬ ଶ ଵ଴଴ ଴ሺȾ ୧መߚ ଶš ୧ ൅ Ⱦଶ ™୧ െ ߚ ൅ — ൅ ߚ ˜ ǡ ‹Ǥ Ǥൌ ୧ඨ

ൌ ™ ൅ Ⱦ െ Ⱦ ™ ൅ Ⱦ ™ ෍ ൣȾ ୧ୀଵ መ መ ଵ ୧ boundary the least square method is not ଶ ୧ ଶ ଴ ଵ ୧ ଶ ߚ ߚ ൅ — ߚ ൒ ˜ ǡ ‹ ൌ ͳǡʹǡ ǤെǤ ǡ ୧෍ ୧ ሻ൧ ୧ ଵ ୧ୀଵ ൣȾ ሺߚ୧መ଴ ଶ൅ ߚመଵ —୧ ୧ ൅ ߚመଶ — ˜ଶ୧ ሻ ଶ୧ ͳ୧ୀଵߚመ ൅ଵ଴଴ ෠ ଴ ൅ Ⱦ෠ଵ ™୧ ൅ Ⱦ෠ଶ ™୧ଶ െ ሺȾ଴ ൅ Ⱦଵ ™୧ ൅ Ⱦଶ ™୧ଶͳͲͲ ଶ଴መ ඨ ୧ ሻ൧

ൌ ͳͲͲ ෍ መ ଶ —୧ൣȾ ߚ෠ଵ଴ൌ൅൅šሻ —Ⱦ෠୧ ଵൌ ߚ™ ൒ ˜෠ǡ‫™‹א‬ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ൑ ƒ‰ሺšǢ š ...(07) ଴ߚመ Ⱦሻȁ ଶͳ‫š׊‬ መଵ ൅

”ሺ ෍  ൌ  ƒ଴ ߚመ଴ ൅ ƒൌଵ ߚඨ ͳͲͲ VXLWDEOHIRUHVWLPDWLQJDERXQGDU\OLQHGH¿QHGDVDERYH ୧ ൅ Ⱦ୧ ଶ௜ୀଵ ୧ െ ሺȾ଴ ൅ Ⱦଵ ™୧ ൅ Ⱦଶ ™୧ ሻ൧ ୧ୀଵ ଶ ଶ ͳͲͲ൑ଶ ‰ሺšǢ ୧ୀଵȾሻ መ መ መ ”ሺ െ ɂȁ ൌ šሻ ͳ‫š׊‬ ‫א‬ š ଶߚመ൏ ଴ ߚ ൅ ߚ — ൅ ߚ — ൒ ˜ ǡ ‹ ͳǡʹǡ Ǥ Ǥ ǡ  መ መ መ The method of maximum likelihood can be used only ୧ ͳǡʹǡ ଶ Ǥ Ǥ୧ ǡ  ୧ ൌ ߚ଴ ൅ —୧ ߚଵ ൅ —୧ ߚଶ଴൒ ˜୧ ǡଵ‹ ൌ ௠ ଵ଴଴ This linear programming problem can be solved very if a parametric model is available for the joint density ෍ ˜୧ ଵ଴଴ͳ ଶ ෠൅଴ ߚ ͳ the መଵany ሺȾ ൌߚመඨ ൅ Ⱦ෠ଶଵ ™ Ⱦ෠‹ ଶൌ ™୧ͳǡʹǡ െ ൅ Ⱦ ™ ൅ଶ Ⱦ ™ ଶ ሻ ෍ ൣȾ መ෠଴of መଶ෠— ߚ ”ሺ ൑ ‰ሺšǢ െ ɂȁ šሻ ൏ ͳ‫š׊‬ š ଶ୧ ൅ with help that linear ߚ ൅ ߚመǡȾ෠ଵ‹‫א‬ —software ൒ ˜support ǡሺȾ Ǥ™Ǥ ǡ୧଴ ௜ୀଵ fX,Y . Finding a suitable model for fX,YLVXVXDOO\GLI¿FXOW ଴ଶͳͲͲ መ መ ୧ ୧ ඨ ୧

ଵ଴଴ ൌ ൅ ™ ൅ Ⱦ ™ െ ൅ Ⱦ ൅ Ⱦଶଵ™୧ଶ୧ሻ൧ ଶ ୧ ෍ ൣȾ ߚመȾሻ ൅ — ߚ ൅ — ߚ ൒ ˜ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  ͳ easily ଶ ଴ ଵ ୧ ଶ ଴ ଵ ଴ ୧ ଵ ୧୧ୀଵ ୧ ௠ ୧ ଶ ଶ ଶ programming. package 2.11.1 ൅ Ⱦ෠ଵ ™We ൅ used Ⱦ෠ଶ ™୧ the െ ሺȾ ™ ሻ൧ to ൌඨ ෍ ൣȾ෠ ͳͲͲ and estimation of the boundary line using the method ୧୧ୀଵ ଴ ൅ Ⱦଵ ™ ୧ ൅ Ⱦଶ (2010) ෍ ˜୧ ୧ ͳͲͲ ͳ୧ୀଵ ଴ଵ଴଴ ଶ መ መ଴ ෠, ߚመଵ and . We call this method ߚ ߚ ଶ RI PD[LPXP OLNHOLKRRG FDQ EH PRUH GLI¿FXOW GXH WRൌ ඨcalculate Ⱦଵ ™୧‘LP’. ൅ Ⱦଶ ™୧ଶ ሻ൧ ෍ ൣȾ଴ ൅ Ⱦ෠ଵ ™୧ ଶ൅ Ⱦ෠ଶ ™୧ െ ሺȾ௠଴ ൅ ௜ୀଵ ୫ ୫ ͳͲͲ ୧ୀଵ ƒ଴ ൌ෍ ǡ ƒଵ˜ൌ୧ ෍ —୧ ǡ ƒ†ƒଶ ൌ ෍ —ଶ୧ ଵ଴଴ ௠ ͳ መଵ ෠ ߚመመଶ ୧ୀଵଶ ଶ ୧ୀଵ ଶ መ ሺȾଶ ෠ ଴ ߚ൅ Ⱦሺߚ ൅ߚመȾ෠ଵଶ—™ Ⱦ୧ଵሻ୫™୧ ൅ ȾMarch

ൌඨ ෍ ൣȾ ෍ ߚଶ —୧ ଴െ൅ ˜௜ୀଵ ଵ™ ଶ ™୧ ሻ൧2013 ୫ ଴ ୧൅ ୧൅ Journal of the National Science Foundation of Sri Lanka 41 (1) ୧ െ ͳͲͲ ୧ୀଵ ௜ୀଵ ƒ଴ ൌ ǡ ƒଵ ൌ ෍ —୧ ǡ ƒ†ƒଶ ൌ ෍ —ଶ୧ ௠ መଶ መ ߚ ଶ ୧ୀଵ ୧ୀଵ መ ෍ ሺߚ଴൅ൌߚଵƒ—୧ ߚ൅ ୫ መ ߚመଶ —୧ െ መ  ˜୧ ሻ ୫ መ ଴ ଴ ൅ ƒଵ ߚଵ ൅ ƒ ଶ ߚଶ ଶ ௜ୀଵ ƒ ൌ ǡ ƒ ൌ ෍ —୧ ǡ ƒ†ƒଶ ൌ ෍ —୧ ௠ ߚመ଴ ൅መ ߚመଵ —୧ ൅መ ଴ߚመଶଶ—ଶ୧ ൒ ଵ˜୧ ǡ ‹ ൌ ͳǡʹǡ ୧ୀଵ Ǥ Ǥ ǡ  ୧ୀଵ ෍ ሺߚመ ൅ ߚ — ൅ ߚ — െ˜ ሻ





ϭϱ

 16

B.M.S.G. Banneheka et al.

Illustrations of methods This study employs both simulated and experimental data to describe the appropriateness of the proposed methods. The experimental data were collected from the survey   ï SURJUDPPH RI VLWH VSHFL¿F IHUWLOL]HU UHFRPPHQGDWLRQ based on soil and foliar nutrients conducted by the Soils   ï and Plant Nutrition Department of the Rubber Research Institute of Sri Lanka (RRISL). Leaf nitrogen and potassium concentrations (%) together with latex yields NJKD\HDU  RI GLIIHUHQW ¿HOGV VXUYH\HG LQ  ZHUH used in the study.  ,Q WKLV VWXG\ DV DQ LQLWLDO VWHS ZH LGHQWL¿HG DQG removed outliers by simply using graphical methods. Graphical representations of real data obtained from the Soils and Plant Nutrition Department of RRISL are shown in Figure 1. Box plot of yield data, which is shown in Figure 1 (a) was simply used to identify the outliers. When detecting outliers it is not enough to rely only on the information given by the box plot. Therefore a scatter plot, which shows the relationship between latex yields (kg/ha) and leaf nitrogen concentration (%) was drawn as shown in Figure 1 (b). Two unusual observations, which have been circled could be seen in the scatter plot but only one point was marked as an outlier in the box SORW ,Q WKLV VLWXDWLRQ NQRZOHGJH RI WKH ¿HOG VLWXDWLRQ was necessary to take a decision. Therefore under the JXLGDQFHRIH[SHUWVLQWKHUXEEHU¿HOGWKHVLWXDWLRQZDV carefully considered and the decision that both points were outliers was taken.

Since the actual boundary line is unknown for the real dataset, it is not possible to illustrate the overall closeness RIWKH¿WWHGERXQGDU\OLQHVWRWKHDFWXDOERXQGDU\OLQH Therefore, we simulated data with a known upper boundary line. Figure 3 shows the true (known) upper ERXQGDU\ OLQH DQG WKH ¿WWHG XSSHU ERXQGDU\ OLQHV IRU a simulated dataset from a known bivariate normal distribution with a known boundary line. The data have been generated to be similar to the actual dataset used in )LJXUH)LJXUHFOHDUO\VKRZVWKDWWKHOLQH¿WWHGXVLQJ the ‘LP’ method is the closest to the true boundary line. 7KLV¿QGLQJIXUWKHUFRQ¿UPVWKHDSSOLFDELOLW\RIWKHµ/3¶ PHWKRGIRU¿WWLQJWKHERXQGDU\OLQH

(b) Yield of latex (kg/ha)

Yield of latex (kg/ha)

(a)

There are 117 observations in this dataset after removing outliers. Figure 2 illustrates the above four methods. The variables X and Y of the dataset are the leaf nitrogen concentration (%) and the yield of latex (kg/ha), respectively. According to the scatter plot of the data, it appears reasonable to assume that there exists a quadratic upper boundary line. The bins used in ‘LS1’, ‘LS2’, and ‘LS3’ are shown by the dotted vertical lines. In ‘LS1’, µ/6¶DQGµ/3¶WKHSRLQWVXVHGWR¿WWKHERXQGDU\OLQH are marked with circles. These are subsets of the original VDPSOH ,Q µ/6¶ WKH SRLQWV XVHG WR ¿W WKH ERXQGDU\ are marked with ‘x’. This is a new set of points since x values are the mid-points of the bins. It is clear that the ‘LP’ method uses more observations than the other PHWKRGV WR ¿W WKH ERXQGDU\ OLQH +HQFH WKLV PHWKRG assures preservation of more information in the original sample compared to the other methods.

;ĂͿ

;ďͿ ;ĂͿ

 ;ďͿ



Rubber leaf N concentratin (%)





Figure 1: (a) Box plot of latex yield (kg/ha) and (b) scatter plot of latex yield (kg/ha)

Figure 1: (a) Box plot of latex yield (kg/ha) and (b) scatter plot of latex yield (kg/ha) vs N concentration (%). The SRLQWVFLUFOHGLQ E DUHWKHLGHQWL¿HGRXWOLHUV

March 2013





Journal of the National Science Foundation of Sri Lanka 41 (1)











Latex yield (kg/ha)

Latex yield (kg/ha)

$PHWKRGWR¿WXSSHUTXDGUDWLFERXQGDU\OLQHPRGHOIRUWKHPD[LPXPUHVSRQVH

Latex yield (kg/ha)

Rubber leaf N concentratin (%)

Latex yield (kg/ha)

Rubber leaf N concentratin (%)

Rubber leaf N concentratin (%)

Rubber leaf N concentratin (%)



Figure 2: Fitted boundary lines for data on N concentration (%) and latex yield (kg/ha) using four methods. Vertical lines are the bines. The points circled in LS1, LS3 and LP and mid points indicated by X in LS2 DUHXVHGIRU¿WWLQJWKHERXQGDU\

Evaluation of method



We compared our method ‘LP’ with the 3 methods ‘LS1’, ‘LS2’ and ‘LS3’ described above, using data generated from two known truncated bivariate normal GLVWULEXWLRQV,QWKH¿UVWVLPXODWLRQVWXG\WKHELYDULDWH normal distribution and boundary line were selected so that the simulated data are similar to our real data on rubber leaf nitrogen concentration (X) and yield of latex (Y). In the second simulation study, they were selected so that the simulated data are similar to a real dataset on rubber leaf potassium concentration (X) and yield of latex (Y). Sample sizes were 117 and 106, which were the same as our actual datasets. Parameters of the bivariate normal distribution and the boundary line used to generate data are given in Table 1.

Figure 3: Fitted boundary lines and the true boundary line for a simulated data set

Journal of the National Science Foundation of Sri Lanka 41 (1)

March 2013

18

B.M.S.G. Banneheka et al. Parameters of the bivariate normal distributions and respective boundary lines used in simulation studies

Table 1:

Parameters

Simulation study 1

”ሺ ൑ ‰ሺšǢ Ⱦሻȁ ൌ šሻ ൌ ͳ‫š א š׊‬

Simulation study 2

Bivariate normal distribution

”ሺ ൑ ‰ሺšǢ Ⱦሻ െ ɂȁ ൌ šሻ ȝ൏x ͳ‫š א š׊‬    

ȝy ı 2x ߚመ଴ ı2y ߚመଵ ȡxy n ߚመଶ Boundary line  ȕ0 ௠ ෍ ሺߚመ଴ ൅ ߚመଵ —୧ ൅ ߚመଶ —ȕ1ଶ୧ െ  ˜୧ ሻ ௜ୀଵ  ȕ2

3.22 767.68 0.12 128838.6 - 0.17 117

0.90 821.40 0.02 146851 0.24 106

- 15163 10264 - 1575

- 4979 138879 7262

ߚመ଴ ൅ ߚመଵ —୧ ൅ ߚመଶ —ଶ୧ ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ 

)RU D ¿[HG NQRZQ ELYDULDWH ௠ QRUPDO GLVWULEXWLRQ ZLWK D ෍procedure ˜୧ known boundary line, the was as follows: ௜ୀଵ

1.

create a sequence of 100 equally spaced x values ୫ ୫ spanning theƒ domain D—Xǡand name them—w ଶ ,w , . . ., 2 ƒ଴ ൌ ǡ ଵ ൌ ෍ ୧ ƒ†ƒ ଶ ൌ ෍ ୧ 1 ୧ୀଵ ୧ୀଵ w100. 2. generate a dataset from the known truncated መ መnormal distribution with a known quadratic  ൌ  ƒ଴ ߚመ଴ ൅ ƒbivariate ଵ ߚଵ ൅ ƒ ଶ ߚଶ boundary line. 3. estimate the boundary line using each of LS1, LS2, ଶ መ መ ൅ —methods. LS3 andߚመ଴LP ୧ ߚଵ ൅ —୧ ߚଶ ൒ ˜୧ ǡ ‹ ൌ ͳǡʹǡ Ǥ Ǥ ǡ  4. calculate

ൌඨ

ͳ ଶ ෍ ൣȾ෠଴ ൅ Ⱦ෠ଵ ™୧ ൅ Ⱦ෠ଶ ™୧ଶ െ ሺȾ଴ ൅ Ⱦଵ ™୧ ൅ Ⱦଶ ™୧ଶ ሻ൧ ͳͲͲ ୧ୀଵ ଵ଴଴

...(08)



DVDPHDVXUHRIJRRGQHVVRI¿WXQGHUHDFKPHWKRG * LV D PHDVXUH RI RYHUDOO FORVHQHVV RI WKH ¿WWHG boundary line to the actual boundary line. Note that even though the data are different from sample

Table 2:

Values of

5. 6.

to sample, w1,w2, ..., w100 DUH ¿[HG WKURXJKRXW WKH procedure. repeat steps 2 to 4, 1000 times. calculate the mean ( ) of the 1000 values of G, for each method.

RESULTS obtained from the simulation studies The values of described in Table 1 are given in Table 2. 7KH ¿WWHG ERXQGDU\ OLQH XVLQJ WKH µ/3¶ PHWKRG IRU WKH real data on leaf nitrogen concentration (%) and the yield of latex (kg/ha) is: șѼ[  [±[ 7KH¿WWHGYDOXHVIRUWKHERXQGDU\OLQHDW[  3.25 and 3.75 are given in Table 3. These results are shown in Figure 4. Based on the results in Figure 4, the maximum yields are obtained at 3.25 leaf nitrogen concentration (%), when other factors are not limiting.

Table 3:

from two simulation studies

Method

Simulation study 1

Simulation study 2

LS1 LS2 LS3 LP

192.96 199.08 198.76 96.96

237.27 252.29 238.83 121.58

Fitted boundary values by LP method at different x values

x (leaf nitrogen concentration)(%)

Fitted boundary value (x), latex yield (kg/ha)

2.75 3.00 3.25 3.50

1328.1 1523.4 1572.3 1474.7

Note: Each simulation study is based on 1000 simulations

March 2013

Journal of the National Science Foundation of Sri Lanka 41 (1)

$PHWKRGWR¿WXSSHUTXDGUDWLFERXQGDU\OLQHPRGHOIRUWKHPD[LPXPUHVSRQVH











Yield of latex (kg/ha)

investigated the possibility of using bootstrap methods for this purpose. Even though we were able to adjust the ‘basic bootstrap method’ to give coverage probability FORVH WKH QRPLQDO FRQ¿GHQFH OHYHOV WKH OHQJWKV RI WKH intervals were too large for practical use. According to Bontempi (n.d.), the ‘smoothness condition’ required for the bootstrap methods to work is not true for extreme order statistics like the minimum and the maximum values. This may be the reason for failure of bootstrap PHWKRGVLQWKLVSUREOHP$VVXFKWKHSUREOHPRI¿QGLQJ FRQ¿GHQFHLQWHUYDOVLVRSHQIRUUHVHDUFK Acknowledgement

Nitrogen concentration (%)



Figure 4: )LWWHGERXQGDU\OLQHE\/3PHWKRGIRUGDWDRQ; QLWURJHQ FRQFHQWUDWLRQ  DQG