Package ‘PairwiseD’

Title: Pairing up units and vectors in panel data setting

Description: PairwiseD is a package for R that pairs observations according to a chosen formula and facilitates bilateral analysis of panel data. Pairing is possible for single observations, as well as for vectors of observations ordered with respect to time.

Imports: tcltk2, xlsx, openxlsx, rJava, xlsxjars

Authors: Krzysztof Beck, Marcin Stryjek
Main idea: The user provides a csv, xlsx or xls file of the following form:

Table 1. Input file for PairwiseD package

Time | Unit 1 | Unit 2 | Unit 3 | … | Unit N-1 | Unit N
year 1 | x11 | x21 | x31 | … | x(N-1)1 | xN1
year 2 | x12 | x22 | x32 | … | x(N-1)2 | xN2
year 3 | x13 | x23 | x33 | … | x(N-1)3 | xN3
year 4 | x14 | x24 | x34 | … | x(N-1)4 | xN4
year 5 | x15 | x25 | x35 | … | x(N-1)5 | xN5
year 6 | x16 | x26 | x36 | … | x(N-1)6 | xN6
year 7 | x17 | x27 | x37 | … | x(N-1)7 | xN7
year 8 | x18 | x28 | x38 | … | x(N-1)8 | xN8
year 9 | x19 | x29 | x39 | … | x(N-1)9 | xN9
year 10 | x110 | x210 | x310 | … | x(N-1)10 | xN10
year 11 | x111 | x211 | x311 | … | x(N-1)11 | xN11
year 12 | x112 | x212 | x312 | … | x(N-1)12 | xN12
year 13 | x113 | x213 | x313 | … | x(N-1)13 | xN13
: | : | : | : | : | : | :
year T | x1T | x2T | x3T | … | x(N-1)T | xNT
where the Time column contains consecutive units of time (annual, quarterly, monthly, weekly, daily, hourly, or of even higher frequency), and every other column contains the observations for one of the analyzed cross-sections (e.g. countries, companies, customers).

Package PairwiseD provides two functions. The first one, “pairwise”, pairs up all the observations xit and returns a table of the following form:
Table 1. Form of the output file of function “pairwise”

Time | Identification | ID 1 | ID 2 | x+y | … | custom(z,g)
year 1 | Unit 1_Unit 2 | Unit 1 | Unit 2 | sum(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
year 1 | Unit 1_Unit 3 | Unit 1 | Unit 3 | sum(Unit 1 and Unit 3) | … | custom(Unit 1 and Unit 3)
year 1 | Unit 1_Unit 4 | Unit 1 | Unit 4 | sum(Unit 1 and Unit 4) | … | custom(Unit 1 and Unit 4)
: | : | : | : | : | … | :
year 1 | Unit 1_Unit N-1 | Unit 1 | Unit N-1 | sum(Unit 1 and Unit N-1) | … | custom(Unit 1 and Unit N-1)
year 1 | Unit 1_Unit N | Unit 1 | Unit N | sum(Unit 1 and Unit N) | … | custom(Unit 1 and Unit N)
year 1 | Unit 2_Unit 3 | Unit 2 | Unit 3 | sum(Unit 2 and Unit 3) | … | custom(Unit 2 and Unit 3)
year 1 | Unit 2_Unit 4 | Unit 2 | Unit 4 | sum(Unit 2 and Unit 4) | … | custom(Unit 2 and Unit 4)
: | : | : | : | : | … | :
year 1 | Unit 2_Unit N | Unit 2 | Unit N | sum(Unit 2 and Unit N) | … | custom(Unit 2 and Unit N)
year 1 | Unit 3_Unit 4 | Unit 3 | Unit 4 | sum(Unit 3 and Unit 4) | … | custom(Unit 3 and Unit 4)
: | : | : | : | : | … | :
year 1 | Unit N-2_Unit N-1 | Unit N-2 | Unit N-1 | sum(Unit N-2 and Unit N-1) | … | custom(Unit N-2 and Unit N-1)
year 1 | Unit N-1_Unit N | Unit N-1 | Unit N | sum(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
year 2 | Unit 1_Unit 2 | Unit 1 | Unit 2 | sum(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
: | : | : | : | : | … | :
year 2 | Unit N-1_Unit N | Unit N-1 | Unit N | sum(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
: | : | : | : | : | … | :
year T | Unit 1_Unit 2 | Unit 1 | Unit 2 | sum(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
year T | Unit 1_Unit 3 | Unit 1 | Unit 3 | sum(Unit 1 and Unit 3) | … | custom(Unit 1 and Unit 3)
: | : | : | : | : | … | :
year T | Unit N-2_Unit N-1 | Unit N-2 | Unit N-1 | sum(Unit N-2 and Unit N-1) | … | custom(Unit N-2 and Unit N-1)
year T | Unit N-1_Unit N | Unit N-1 | Unit N | sum(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
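The layout of the “pairwise” output can be reproduced on a toy panel with a few lines of base R. This is only a sketch under stated assumptions: “pair_units” and the column names “ID1”, “ID2” and “value” are hypothetical names introduced here, the data are made up, and the code is not meant to show how PairwiseD itself is implemented.

```r
# Toy wide panel: one Time column plus one column per unit (made-up values).
panel <- data.frame(
  Time   = c("year 1", "year 2"),
  Unit.1 = c(1, 2),
  Unit.2 = c(10, 20),
  Unit.3 = c(100, 200),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of PairwiseD): for every period, apply
# `measure` to each unordered pair of units.
pair_units <- function(data, measure = function(x, y) x + y) {
  units <- names(data)[-1]        # every column except Time
  pairs <- combn(units, 2)        # columns are pairs: (1,2), (1,3), (2,3)
  rows <- lapply(seq_len(nrow(data)), function(t) {
    x <- unlist(data[t, pairs[1, ]], use.names = FALSE)  # first member of each pair
    y <- unlist(data[t, pairs[2, ]], use.names = FALSE)  # second member of each pair
    data.frame(
      Time           = data$Time[t],
      Identification = paste(pairs[1, ], pairs[2, ], sep = "_"),
      ID1            = pairs[1, ],
      ID2            = pairs[2, ],
      value          = measure(x, y),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)            # stack the per-period blocks
}

result <- pair_units(panel)
print(result)
```

With N units and T periods the result has T * N * (N - 1) / 2 rows, which is why the output grows quickly for wide panels.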
The table can contain as many measures as the user specifies. Some of them are already built in, but the user can also specify any measure using the custom option.

The second function, “pairwise2”, operates on time-ordered vectors of observations xit and returns a table of the following form (in this example a window length of six years, i.e. a vector of six consecutive years, was chosen):
Table 2. Form of the output file of function “pairwise2” (six years window length)

Time | Identification | ID 1 | ID 2 | cor(z,g) | … | custom(z,g)
year 1_year 6 | Unit 1_Unit 2 | Unit 1 | Unit 2 | cor(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
year 1_year 6 | Unit 1_Unit 3 | Unit 1 | Unit 3 | cor(Unit 1 and Unit 3) | … | custom(Unit 1 and Unit 3)
: | : | : | : | : | … | :
year 1_year 6 | Unit 2_Unit 3 | Unit 2 | Unit 3 | cor(Unit 2 and Unit 3) | … | custom(Unit 2 and Unit 3)
year 1_year 6 | Unit 2_Unit 4 | Unit 2 | Unit 4 | cor(Unit 2 and Unit 4) | … | custom(Unit 2 and Unit 4)
: | : | : | : | : | … | :
year 1_year 6 | Unit N-2_Unit N-1 | Unit N-2 | Unit N-1 | cor(Unit N-2 and Unit N-1) | … | custom(Unit N-2 and Unit N-1)
year 1_year 6 | Unit N-1_Unit N | Unit N-1 | Unit N | cor(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
year 2_year 7 | Unit 1_Unit 2 | Unit 1 | Unit 2 | cor(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
year 2_year 7 | Unit 1_Unit 3 | Unit 1 | Unit 3 | cor(Unit 1 and Unit 3) | … | custom(Unit 1 and Unit 3)
: | : | : | : | : | … | :
year 2_year 7 | Unit 2_Unit 3 | Unit 2 | Unit 3 | cor(Unit 2 and Unit 3) | … | custom(Unit 2 and Unit 3)
year 2_year 7 | Unit 2_Unit 4 | Unit 2 | Unit 4 | cor(Unit 2 and Unit 4) | … | custom(Unit 2 and Unit 4)
: | : | : | : | : | … | :
year 2_year 7 | Unit N-2_Unit N-1 | Unit N-2 | Unit N-1 | cor(Unit N-2 and Unit N-1) | … | custom(Unit N-2 and Unit N-1)
year 2_year 7 | Unit N-1_Unit N | Unit N-1 | Unit N | cor(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
: | : | : | : | : | … | :
year T-5_year T | Unit 1_Unit 2 | Unit 1 | Unit 2 | cor(Unit 1 and Unit 2) | … | custom(Unit 1 and Unit 2)
year T-5_year T | Unit 1_Unit 3 | Unit 1 | Unit 3 | cor(Unit 1 and Unit 3) | … | custom(Unit 1 and Unit 3)
: | : | : | : | : | … | :
year T-5_year T | Unit 2_Unit 3 | Unit 2 | Unit 3 | cor(Unit 2 and Unit 3) | … | custom(Unit 2 and Unit 3)
year T-5_year T | Unit 2_Unit 4 | Unit 2 | Unit 4 | cor(Unit 2 and Unit 4) | … | custom(Unit 2 and Unit 4)
: | : | : | : | : | … | :
year T-5_year T | Unit N-2_Unit N-1 | Unit N-2 | Unit N-1 | cor(Unit N-2 and Unit N-1) | … | custom(Unit N-2 and Unit N-1)
year T-5_year T | Unit N-1_Unit N | Unit N-1 | Unit N | cor(Unit N-1 and Unit N) | … | custom(Unit N-1 and Unit N)
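The rolling-window layout of the table above can be illustrated with a short base-R sketch. It assumes that windows advance by one period at a time, as the labels “year 1_year 6”, “year 2_year 7”, … suggest. The helper “pair_windows”, its argument names and the toy data are hypothetical and are not taken from the PairwiseD source.

```r
# Toy panel with made-up values: 8 periods, 3 units.
panel <- data.frame(
  Time   = paste("year", 1:8),
  Unit.1 = c(1, 2, 3, 4, 5, 6, 7, 8),
  Unit.2 = c(2, 4, 6, 8, 10, 12, 14, 16),  # perfectly correlated with Unit.1
  Unit.3 = c(8, 7, 6, 5, 4, 3, 2, 1),      # perfectly anti-correlated with Unit.1
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of PairwiseD): apply `measure` to each pair of
# unit vectors inside every window of `window` consecutive periods.
pair_windows <- function(data, window = 6, measure = cor) {
  units  <- names(data)[-1]
  pairs  <- combn(units, 2)
  starts <- seq_len(nrow(data) - window + 1)   # first period of each window
  rows <- lapply(starts, function(s) {
    idx <- s:(s + window - 1)                  # rows belonging to this window
    data.frame(
      Time           = paste(data$Time[s], data$Time[s + window - 1], sep = "_"),
      Identification = paste(pairs[1, ], pairs[2, ], sep = "_"),
      ID1            = pairs[1, ],
      ID2            = pairs[2, ],
      value          = mapply(function(a, b) measure(data[[a]][idx], data[[b]][idx]),
                              pairs[1, ], pairs[2, ], USE.NAMES = FALSE),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

windows <- pair_windows(panel, window = 6)
print(windows)
```

With T periods and a window of length w there are T - w + 1 windows, so the toy panel above yields three windows of three pairs each.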
Helpful tips:
1) For bigger files the csv format is recommended, as the OS might not handle xls and xlsx output files properly.
2) Dropping the “custom” option greatly improves the speed of computation.
3) Tested on Windows, Linux and Mac OS.
pairwise        pairing up observations of variables

Description

Function for pairing up observations in a panel data setting according to a built-in or user-specified formula.

Usage

pairwise(data, exp.multiplier = 0, multiplier = 1, no.massage = FALSE,
         measure = c("sum", "sumabs", "absdiff", "diffabs", "prod", "prodabs",
                     "absprod", "abslogdiff", "abslogsum", "logdiffabs",
                     "logsumabs", "if1", "if2", "custom"),
         dec = ",", sep = ",",
         save.as = "path_to_the_destination/name_of_the_output_file.format")

Arguments

data
    a data.frame, list or path to the input file in csv, xlsx or xls format.

exp.multiplier
    default value set to 0. Changes the position of the decimal mark in all the observations in the input file. Useful in cases when, for example, the natural logarithm must be taken of a number that is very close to 0.

multiplier
    default value set to 1. Multiplies all the observations in the input file.

measure
    operation that should be applied to every pair of observations. The built-in options are listed below (x denotes the first observation and y denotes the second observation):
    “sum”: x + y
    “sumabs”: |x| + |y|
    “absdiff”: |x − y|
    “diffabs”: |x| − |y|
    “prod”: x ∗ y
    “prodabs”: |x| ∗ |y|
    “absprod”: |x ∗ y|
    “abslogdiff”: |ln(x ÷ y)|
    “abslogsum”: |ln(x ∗ y)|
    “logdiffabs”: ln(|x| ÷ |y|)
    “logsumabs”: ln(|x| ∗ |y|)
    “if1”: 1 if x + y = 1, and 0 if x + y ≠ 1
    “if2”: 1 if x + y = 2, and 0 if x + y ≠ 2
    Moreover, the option “custom” allows the user to specify any function of x and y.

save.as
    in this argument the user can specify the path to the destination folder, the name of the file and the desired format. The argument is given in the following form: “path/name.format”. For example: “C:/Myfiles/Rfiles/pairwised_data.csv”. If the argument is not specified, the default is “current R directory/name_of_the_input_file_paiwise_result.format_of_input_file”. The user can specify csv, xls and xlsx as the output format. If the user does not specify the file format, the format will be the same as the input format, or xlsx in the case of data.frame/list input. If the argument is equal to “none”, no output file is generated.

no.messages
    default value is FALSE. If FALSE, the package will show messages about finishing the calculations, the destination file, potential errors and a wrong format of the input file.

dec
    symbol to be used as the decimal separator in the output file. Default value is “.”

sep
    symbol to be used as the value separator in the output csv file (works only if the output file is set to be a csv file). Default value is “.”
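To make the measure names concrete, the built-in options can be restated as plain R functions. This is a hedged reading rather than the package's own code: the placement of the absolute values is inferred from the measure names (a leading “abs” is taken to wrap the whole expression, a trailing “abs” to wrap each operand), and “measures” and “custom_measure” are hypothetical names introduced for this sketch only.

```r
# Illustrative lookup table of the built-in measures (assumed definitions,
# not PairwiseD's internal code).
measures <- list(
  sum        = function(x, y) x + y,
  sumabs     = function(x, y) abs(x) + abs(y),
  absdiff    = function(x, y) abs(x - y),
  diffabs    = function(x, y) abs(x) - abs(y),
  prod       = function(x, y) x * y,
  prodabs    = function(x, y) abs(x) * abs(y),
  absprod    = function(x, y) abs(x * y),
  abslogdiff = function(x, y) abs(log(x / y)),
  abslogsum  = function(x, y) abs(log(x * y)),
  logdiffabs = function(x, y) log(abs(x) / abs(y)),
  logsumabs  = function(x, y) log(abs(x) * abs(y)),
  if1        = function(x, y) as.numeric(x + y == 1),
  if2        = function(x, y) as.numeric(x + y == 2)
)

# Hypothetical stand-in for the "custom" option: turn a formula given as a
# string in x and y into a function of two arguments.
custom_measure <- function(expr) {
  parsed <- parse(text = expr)
  function(x, y) eval(parsed, list(x = x, y = y))
}

x <- -3
y <- 2
print(vapply(measures[c("sum", "sumabs", "absdiff", "diffabs")],
             function(f) f(x, y), numeric(1)))
print(custom_measure("x^2+y^2")(3, 4))
```

The log-based measures return NaN or infinite values when their argument is negative or zero, which is the situation the exp.multiplier and multiplier arguments are described as helping with.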
Examples

pairwise(data = "~/Desktop/PairwiseD/GDPgrowth.xlsx",
         measure = c("absdiff", "sumabs", "logsumabs", "if1", "x^2+y^2"),
         multiplier = 1, exp.multiplier = 0, no.massage = FALSE)

GDPgrowth