Nonparametric Regression Using Clusters

Fred Viole
Fordham University, Dept. of Economics
[email protected]
December 5, 2017
Fred Viole (Fordham University)
Nonparametric Regression Using Clusters
December 5, 2017
1 / 28
Objective & Achievements
This paper evaluates a newer and fundamentally distinct alternative to nonparametric curve fitting, with direct comparison to kernel regressions.

Main Achievements:
* Derivative estimation
* Interpolation
* Out-of-sample forecasting

Future Analysis:
* Multivariate case
Motivation - Behavioral Finance
In behavioral finance, the concept of loss aversion has been modeled by studying lower partial moments of partitioned densities since Bawa (1975); see also Vinod and Reagle (2005). Viole and Nawrocki (2012a) prove that aggregating all partial moment matrices yields the covariance matrix, providing far more disaggregated and nuanced information than traditional summary statistics allow.
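The aggregation result can be illustrated with a minimal Python sketch (degree-1 co-partial moments about the means; illustrative only, not the authors' R implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)
tx, ty = x.mean(), y.mean()

def co_pm(x_above, y_above):
    """Degree-1 co-partial moment about the means for one quadrant."""
    dx = np.maximum(x - tx if x_above else tx - x, 0)
    dy = np.maximum(y - ty if y_above else ty - y, 0)
    return (dx * dy).mean()

cupm = co_pm(True, True)    # X and Y both above their targets
clpm = co_pm(False, False)  # X and Y both below their targets
dupm = co_pm(False, True)   # X below, Y above (divergent)
dlpm = co_pm(True, False)   # X above, Y below (divergent)

# Aggregating the quadrants recovers the covariance.
cov = np.cov(x, y, ddof=0)[0, 1]
print(cupm + clpm - dupm - dlpm, cov)
```

The identity holds because (x − tx)(y − ty) decomposes exactly into the four signed quadrant products at every observation.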
Partitioning - Partial Moment Quadrants
We use a hierarchical, partition-based clustering method using partial moment quadrants.

Definition of Partial Moment Quadrants:

X ≤ target, Y ≤ target → CLPM
X ≤ target, Y > target → DUPM
X > target, Y ≤ target → DLPM
X > target, Y > target → CUPM
Partitioning - Partial Moment Quadrants

[Figure: four scatter panels of Y against X, one per partial moment quadrant — CUPM, DLPM, DUPM, and CLPM.]
Partitioning - Iterative Orders

Below is the same partitioning based on partial moment quadrants, now iterated on each subquadrant according to the order parameter. The red dots are the means of each partial moment subquadrant.

[Figure: three panels of Y against X for NNS Order = 1, 2, and 3, showing progressively finer subquadrant partitions.]
Clusters

k-means clustering objective: minimize the squared distance of each vector from its centroid, summed over all vectors (k is predetermined):

RSS_k = \sum_{\vec{x} \in \omega_k} \lvert \vec{x} - \vec{\mu}(\omega_k) \rvert^2

\min \Big\{ \sum_{k=1}^{K} RSS_k \Big\}

NNS clustering objective: minimize the within-cluster sum of squares for a given cluster (partial moment quadrant), not the overall sum of squares:

RSS_{CLPM} = \sum_{\vec{x} \in \omega_{CLPM}} \lvert \vec{x} - \vec{\mu}(\omega_{CLPM}) \rvert^2
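The contrast between the two objectives can be sketched in Python (mean centroids assumed; the data and cluster choice are illustrative):

```python
import numpy as np

def rss(points):
    """Within-cluster sum of squared distances from the cluster centroid."""
    centroid = points.mean(axis=0)
    return float(((points - centroid) ** 2).sum())

rng = np.random.default_rng(1)
xy = rng.normal(size=(200, 2))

# k-means would minimize sum(rss(cluster_k) for k in 1..K) jointly over
# all cluster assignments; NNS instead evaluates the RSS within each
# fixed partial moment quadrant, e.g. the CLPM quadrant:
tx, ty = xy.mean(axis=0)
clpm_points = xy[(xy[:, 0] <= tx) & (xy[:, 1] <= ty)]
rss_clpm = rss(clpm_points)
print(rss_clpm)
```

Because the quadrant boundaries are fixed by the targets, no iterative reassignment of points is needed, which is what makes the NNS partition deterministic.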
k-means and NNS Visualization

k-means (non-deterministic):

[Figure: four panels of y against x, each titled "k-means (k = 14)", showing different cluster assignments across repeated runs on the same data.]
k-means and NNS Visualization

NNS (deterministic):

[Figure: four panels of Y against X, each titled "NNS Order = 3", showing identical partitions across repeated runs.]
Connecting the Dots...

Connecting the subquadrant means generates a sequence of line segments comprising an approximation to the nonlinear curve.

[Figure: sine wave fits at three partition orders —
NNS Order = 1: Segments = 2, R² = 0.07607;
NNS Order = 2: Segments = 5, R² = 0.9855;
NNS Order = 3: Segments = 17, R² = 0.9991.]
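The segment-connecting idea can be mimicked with a simple Python sketch: partition x into bins (a crude stand-in for the NNS subquadrants), take each bin's mean point, and join the means with line segments:

```python
import numpy as np

x = np.linspace(0, 4 * np.pi, 1000)
y = np.sin(x)

# Stand-in for NNS partitioning: equal-width bins on x, one mean point per bin.
n_bins = 17
edges = np.linspace(x.min(), x.max(), n_bins + 1)
idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
mx = np.array([x[idx == i].mean() for i in range(n_bins)])
my = np.array([y[idx == i].mean() for i in range(n_bins)])

# Piecewise-linear fit through the bin means.
y_hat = np.interp(x, mx, my)
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(r2)
```

With enough segments the piecewise-linear approximation tracks the sine wave closely, echoing how the fit improves with higher NNS orders.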
Partial Derivatives
Since we are using linear segments, the partial derivatives are easily recovered. Two methods:

1. Local linear coefficient
2. Finite step difference
Partial Derivatives - Local Linear Coefficient
We can see the series of the line coefficients and their respective values of x in the following truncated regression output using our sine wave example:
    Coefficient   X.Lower.Range   X.Upper.Range
1        0.9575          5.9879          5.9926
2        0.9590          5.9926          5.9989
3        0.9607          5.9989          6.0052
4        0.9622          6.0052          6.0099
Our coefficient (0.9607) for x = 6 is quite close to the known derivative cos(6) = 0.9602.
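The first method can be sketched in Python: fit a least-squares line over a narrow window and read off its slope (here we fit the raw sine values directly, rather than the NNS segment endpoints):

```python
import numpy as np

x = np.linspace(0, 4 * np.pi, 10_000)
y = np.sin(x)

# The slope of a least-squares line over a narrow window around x = 6
# plays the role of the local linear segment coefficient.
window = (x > 5.99) & (x < 6.01)
slope, intercept = np.polyfit(x[window], y[window], 1)
print(slope)  # close to cos(6) = 0.9602
```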
Partial Derivatives - Finite Step
f'(x) ≈ (f(x + h) − f(x − h)) / (2h)

This method depends on the accuracy of our estimates of f(x − h) and f(x + h).

             NNS Estimate   Known Value
sin(5.99)         -0.2890       -0.2890
sin(6.01)         -0.2698       -0.2698

Our estimates match the known values of sin(5.99) and sin(6.01), which we use to estimate the derivative of sin(x) at x = 6.
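A Python sketch of the finite step method (the function, point, and step size here are illustrative):

```python
import numpy as np

def central_difference(f, x, h=0.01):
    """Two-sided finite step estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

est = central_difference(np.sin, 6.0)
print(est)  # close to the known derivative cos(6) = 0.9602
```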
Experiments
We performed three sets of 6 experiments with varying regressor types and nonlinearities, comparing:
(a) the goodness-of-fit (R²) values;
(b) the estimated regression coefficients as partial derivatives of the conditional expectation function with respect to the (noiseless) regressor;
(c) the estimated regression coefficients as partial derivatives with increasing orders of noise in the regressor.
Results - R²

It is imperative to note that while NNS can achieve R² = 1 for any f(x), it properly compensates for noise by lowering the order of partitions and reducing its fit.
Results - Partial Derivatives

Note the "NNS MAPE" columns versus the "np MAPE" columns and the actual dy/dx.
Out-of-Sample Predictions

An important distinguishing feature of NNS over `np` is the ability to obtain out-of-sample predictions well beyond the observed range, if needed.

            NNS Estimate   Actual Value
sin(13)           0.4336         0.4202

[Figure: sine wave fit with NNS Order = 10 (Segments = 3198, R² = 1), used to extrapolate beyond the observed range.]
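One way such extrapolation can work is sketched below in Python (extending the slope of the final fitted segment past the observed range; a hypothetical illustration, not the NNS algorithm itself):

```python
import numpy as np

x = np.linspace(0, 4 * np.pi, 1000)   # observed range ends at 4*pi = 12.57
y = np.sin(x)

# Extend the slope of the last line segment beyond the observed range.
slope = (y[-1] - y[-2]) / (x[-1] - x[-2])
x_new = 13.0
y_pred = y[-1] + slope * (x_new - x[-1])
print(y_pred, np.sin(x_new))
```

As on the slide, the linear extension lands near the true value sin(13) = 0.4202, with the error growing the farther the prediction point lies from the observed range.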
Alternative Methods of Curve Fitting
Multivariate Case

NNS works for multivariate regressions as well. We have a working paper describing the technique and plan to extend the simulations and experiments to comparisons with other nonparametric multivariate regressions.
References

Vinod, H.D. and Viole, F. "Nonparametric Regressions Using Clusters." Computational Economics, 2017. https://doi.org/10.1007/s10614-017-9713-5

Viole, F. and Nawrocki, D. "Cumulative Distribution Functions and UPM/LPM Analysis." SSRN Working Paper, 2012. https://ssrn.com/abstract=2148482
Thank you for your attention!
PS - This entire presentation was written in R, if you’d like to learn how, please attend the R seminar next spring!
Appendix: NNS Dependence
An obvious question is: "How does NNS determine the dependence used to reduce the partition order?"

Answer: using partial moment quadrants:

η(x, y) = |ρ_CLPM| + |ρ_CUPM| + |ρ_DUPM| + |ρ_DLPM|

where, for example,

ρ_CLPM = (CLPM_CLPM + CUPM_CLPM − DUPM_CLPM − DLPM_CLPM) / (CLPM_CLPM + CUPM_CLPM + DUPM_CLPM + DLPM_CLPM)
Appendix: NNS Dependence - Examples

Correlation & Dependence

[Figure: scatter plot of Y against X for a cubic relationship, with NNS Order = 3 partitions.]
Appendix: NNS Dependence - Examples

Correlation & Dependence R-code:

> x = rnorm(1000); y = x^3
> cor(x, y)
[1] 0.7844
> NNS.dep(x, y)
$Correlation
[1] 0.9958
$Dependence
[1] 0.9958
Appendix: NNS Dependence - Examples

NO Correlation & Dependence

[Figure: scatter plot of Y against X for a sine wave over [0, 4π], with NNS Order = 3 partitions.]
Appendix: NNS Dependence - Examples

NO Correlation & Dependence R-code:

> x = seq(0, 4*pi, pi/1000); y = sin(x)
> cor(x, y)
[1] -0.3897
> NNS.dep(x, y)
$Correlation
[1] 0.0002499
$Dependence
[1] 0.999
Appendix: NNS Dependence - Examples

NO Correlation & Dependence

[Figure: scatter plot of Y against X with NNS Order = 3 partitions; both axes range from −1 to 1.]
Appendix: NNS Dependence - Examples

NO Correlation & Dependence R-code:

> set.seed(123)
> df