Nonparametric Regression Using Clusters
Fred Viole
Fordham University, Dept. of Economics
[email protected]

December 5, 2017

Fred Viole (Fordham University)

Nonparametric Regression Using Clusters

December 5, 2017

1 / 28

Objective & Achievements

This paper evaluates a newer and fundamentally distinct alternative to nonparametric curve fitting, with direct comparison to kernel regressions.

Main Achievements:
* Derivative estimation
* Interpolation
* Out-of-sample forecasting

Future Analysis:
* Multivariate case


Motivation - Behavioral Finance

In behavioral finance, the concept of loss aversion has been modeled by studying lower partial moments of partitioned densities since Bawa (1975) and Vinod and Reagle (2005). Viole and Nawrocki (2012a) prove that the aggregation of all partial moment matrices equals the covariance matrix, providing far more disaggregated and nuanced information than is possible with traditional summary statistics.


Partitioning - Partial Moment Quadrants

We use a hierarchical partition-clustering method based on partial moment quadrants.

Definition of Partial Moment Quadrants:
X ≤ target, Y ≤ target → CLPM
X ≤ target, Y > target → DUPM
X > target, Y ≤ target → DLPM
X > target, Y > target → CUPM
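The quadrant definitions above are easy to state in code. A minimal Python sketch (the deck's own examples are in R; the choice of targets is an assumption here, with the sample means being a common default):

```python
def pm_quadrant(x, y, x_target, y_target):
    """Label an (x, y) observation by its partial moment quadrant."""
    if x <= x_target and y <= y_target:
        return "CLPM"   # co-lower: both at or below target
    if x <= x_target and y > y_target:
        return "DUPM"   # divergent-upper: x below target, y above
    if x > x_target and y <= y_target:
        return "DLPM"   # divergent-lower: x above target, y below
    return "CUPM"       # co-upper: both above target

# Example with both targets at the origin
print(pm_quadrant(-1, -1, 0, 0))  # CLPM
print(pm_quadrant(2, 3, 0, 0))    # CUPM
```

Applying this labeling to every observation reproduces the four-quadrant partition shown on the next slide.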

Partitioning - Partial Moment Quadrants

[Figure: four scatter panels in the (X, Y) plane showing the CUPM, DLPM, DUPM, and CLPM quadrants of a sample.]

Partitioning - Iterative Orders

Below is the same partitioning based on partial moment quadrants, now iterated on each subquadrant according to the order parameter. The red dots are the means of each partial moment subquadrant.

[Figure: three panels of the sine-wave sample over X ∈ [0, 12], showing the partitions for NNS Order = 1, 2, and 3, with red dots at the subquadrant means.]

Clusters

k-means clustering objective: minimize the squared distance of each vector from its centroid, summed over all vectors (k is predetermined):

$$RSS_k = \sum_{\vec{x} \in \omega_k} |\vec{x} - \vec{\mu}(\omega_k)|^2$$

$$\min \Big\{ \sum_{k=1}^{K} RSS_k \Big\}$$

NNS clustering objective: minimize the within-cluster sum of squares for a given cluster (partial moment quadrant), not the overall sum of squares:

$$RSS_{CLPM} = \sum_{\vec{x} \in \omega_{CLPM}} |\vec{x} - \vec{\mu}(\omega_{CLPM})|^2$$

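Both objectives reduce to the same within-cluster sum of squares, applied either over all K clusters (k-means) or to a single quadrant (NNS). A minimal Python sketch, where the cluster assignments and the quadrant targets are assumed inputs rather than the package's own partitioning:

```python
import numpy as np

def rss(cluster):
    """Sum of squared distances of each vector from the cluster centroid."""
    cluster = np.asarray(cluster, dtype=float)
    centroid = cluster.mean(axis=0)
    return float(((cluster - centroid) ** 2).sum())

def kmeans_objective(clusters):
    """Total RSS summed over a predetermined set of K clusters."""
    return sum(rss(c) for c in clusters)

# NNS-style: RSS of the CLPM quadrant alone, with both targets at 0
pts = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
clpm = pts[(pts[:, 0] <= 0) & (pts[:, 1] <= 0)]
print(rss(clpm))  # RSS of the lower-left quadrant only
```

The key difference on the slide is not the formula but its scope: NNS never optimizes a global objective across clusters, so no iterative reassignment step is needed.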

k-means and NNS Visualization

k-means (non-deterministic):

[Figure: four runs of k-means (k = 14) on the same (x, y) sample, each producing a different clustering.]

k-means and NNS Visualization

NNS (deterministic):

[Figure: four runs of NNS (Order = 3) on the same (X, Y) sample, each producing the identical partition.]

Connecting the Dots...

Connecting the subquadrant means generates a sequence of line segments comprising an approximation to the nonlinear curve.

[Figure: NNS fits of the sine-wave sample over X ∈ [0, 12] for Orders 1, 2, and 3, yielding 2, 5, and 17 segments with R² = 0.07607, 0.9855, and 0.9991, respectively.]

Partial Derivatives

Since we are using linear segments, the partial derivatives are easily recovered. Two methods:
1. Local linear coefficient
2. Finite step difference


Partial Derivatives - Local Linear Coefficient

The series of line coefficients and their respective ranges of x appear in the following truncated regression output for our sine wave example:

    Coefficient   X.Lower.Range   X.Upper.Range
1   0.9575        5.9879          5.9926
2   0.9590        5.9926          5.9989
3   0.9607        5.9989          6.0052
4   0.9622        6.0052          6.0099

Our coefficient (0.9607) when x = 6 is quite close to the known derivative cos(6) ≈ 0.9602.

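The table above is just a slope lookup: the derivative at x is the coefficient of the segment whose x-range contains it. A Python sketch, where a dense sine sample stands in for the fitted NNS segments (an assumption for illustration only):

```python
import numpy as np

def segment_slope(kx, ky, x0):
    """Slope of the linear segment whose x-range contains x0."""
    i = np.searchsorted(kx, x0, side="right") - 1
    i = min(max(i, 0), len(kx) - 2)   # clamp to a valid segment
    return (ky[i + 1] - ky[i]) / (kx[i + 1] - kx[i])

# Dense grid standing in for the regression's segment endpoints
kx = np.linspace(0, 4 * np.pi, 4000)
ky = np.sin(kx)
print(float(segment_slope(kx, ky, 6.0)))  # compare with cos(6) ≈ 0.9602
```

Because each segment is linear, no smoothing-bandwidth choice enters the derivative estimate; only the segment containing x matters.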

Partial Derivatives - Finite Step

$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

This method depends on the accuracy of our estimates of f(x − h) and f(x + h):

            NNS Estimate   Known Value
sin(5.99)   -0.2890        -0.2890
sin(6.01)   -0.2698        -0.2698

Our estimates are fairly close to the known values of sin(5.99) and sin(6.01) used when estimating the derivative of sin at x = 6.

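The central-difference formula above is direct to apply. A Python sketch, with the true sine standing in for the regression estimates of f(x − h) and f(x + h):

```python
import math

def central_diff(f, x, h):
    """Central finite difference: (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Derivative of sin at x = 6 with h = 0.01, as in the slide's table
est = central_diff(math.sin, 6.0, 0.01)
print(est)  # compare with cos(6) ≈ 0.9602
```

The truncation error is O(h²), so any error in the two regression estimates, not the step size, dominates in practice.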

Experiments

We performed three sets of 6 experiments with varying regressor types and nonlinearities, comparing:
(a) the goodness-of-fit or R² values;
(b) estimated regression coefficients as partial derivatives of the conditional expectation function with respect to the (noiseless) regressor;
(c) estimated regression coefficients as partial derivatives with increasing orders of noise in the regressor.


Results - R²

It is imperative to note that while NNS can achieve R² = 1 for any f(x), it properly compensates for noise by lowering the order of partitions and reducing its fit.


Results - Partial Derivatives

Note the “NNS MAPE” columns versus the “np MAPE” columns and the actual dy/dx.

Out-of-Sample Predictions

An important distinguishing feature of NNS over `np` is the ability to obtain out-of-sample predictions well beyond the observed range, if needed.

          NNS Estimate   Actual Value
sin(13)   0.4336         0.4202

[Figure: NNS fit (Order = 10, R² = 1, 3198 segments) of the sine-wave sample over X ∈ [0, 12].]

Alternative Methods of Curve Fitting


Multivariate Case

NNS works for multivariate regressions as well. We have a working paper describing the technique and look to extend the simulations and experiments versus other nonparametric multivariate regressions.


References

Vinod, H.D. and Viole, F. “Nonparametric Regressions Using Clusters.” Computational Economics, 2017. https://doi.org/10.1007/s10614-017-9713-5

Viole, F. and Nawrocki, D. “Cumulative Distribution Functions and UPM/LPM Analysis.” SSRN Working Paper, 2012. https://ssrn.com/abstract=2148482

Thank you for your attention!

PS - This entire presentation was written in R; if you’d like to learn how, please attend the R seminar next spring!


Appendix: NNS Dependence

An obvious question is “How does NNS determine dependence to reduce the partition order?”

Answer: Using partial moment quadrants:

$$\eta(x, y) = |\rho_{CLPM}| + |\rho_{CUPM}| + |\rho_{DUPM}| + |\rho_{DLPM}|$$

where

$$\rho_{CLPM} = \frac{CLPM_{CLPM} + CUPM_{CLPM} - DUPM_{CLPM} - DLPM_{CLPM}}{CLPM_{CLPM} + CUPM_{CLPM} + DUPM_{CLPM} + DLPM_{CLPM}}$$

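A degree-0 (count-based) Python sketch of this measure: within each quadrant, re-partition about that quadrant's own means, form ρ from the concordant-minus-discordant counts, and combine the |ρ| values. Averaging over the occupied quadrants is an assumption made here so the result lands in [0, 1]; the NNS package's exact normalization may differ:

```python
import numpy as np

def quadrant_rho(x, y):
    """Concordance ratio of a sample about its own means (degree-0 sketch)."""
    tx, ty = x.mean(), y.mean()
    clpm = np.sum((x <= tx) & (y <= ty))   # concordant low
    cupm = np.sum((x > tx) & (y > ty))     # concordant high
    dupm = np.sum((x <= tx) & (y > ty))    # discordant
    dlpm = np.sum((x > tx) & (y <= ty))    # discordant
    total = clpm + cupm + dupm + dlpm
    return (clpm + cupm - dupm - dlpm) / total if total else 0.0

def nns_dep_sketch(x, y):
    """Average |rho| over the four partial moment quadrants (assumed norm)."""
    tx, ty = x.mean(), y.mean()
    masks = [(x <= tx) & (y <= ty), (x <= tx) & (y > ty),
             (x > tx) & (y <= ty), (x > tx) & (y > ty)]
    rhos = [quadrant_rho(x[m], y[m]) for m in masks if m.sum() > 1]
    return float(np.mean([abs(r) for r in rhos])) if rhos else 0.0

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
print(nns_dep_sketch(x, x))  # perfectly dependent data
```

Taking |ρ| quadrant by quadrant is what lets the measure detect nonlinear, non-monotonic dependence (e.g. the sine wave on the following slides) that Pearson correlation misses.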

Appendix: NNS Dependence - Examples

Correlation & Dependence

[Figure: scatter of the (X, Y) sample with its NNS Order = 3 partition, X ∈ (−2, 3), Y ∈ (−10, 20).]

Appendix: NNS Dependence - Examples

Correlation & Dependence R-code:

> x = rnorm(1000); y = x^3
> cor(x, y)
[1] 0.7844
> NNS.dep(x, y)
$Correlation
[1] 0.9958
$Dependence
[1] 0.9958


Appendix: NNS Dependence - Examples

NO Correlation & Dependence

[Figure: scatter of the sine-wave sample over X ∈ [0, 12] with its NNS Order = 3 partition.]

Appendix: NNS Dependence - Examples

NO Correlation & Dependence R-code:

> x = seq(0, 4*pi, pi/1000); y = sin(x)
> cor(x, y)
[1] -0.3897
> NNS.dep(x, y)
$Correlation
[1] 0.0002499
$Dependence
[1] 0.999


Appendix: NNS Dependence - Examples

NO Correlation & Dependence

[Figure: scatter of an (X, Y) sample on [−1, 1] × [−1, 1] with its NNS Order = 3 partition.]

Appendix: NNS Dependence - Examples

NO Correlation & Dependence R-code:

> set.seed(123)
> df
