FuzzyPovertyR is a package for estimating fuzzy poverty indexes. Broadly speaking, a fuzzy poverty index is an index that takes values in the set \(Q=[0,1]\) (Alkire et al. (2015); Silber (2023)). A fuzzy index assigns to each statistical unit a value in this interval according to a given membership function (mf) \(\mu(x_i)\in Q\), where \(x\) is a poverty predicate. The higher the value of \(\mu\), the more the individual is regarded as “poor” with respect to the poverty predicate \(x\). In socio-economic surveys \(x\) may be the equivalised disposable income or expenditure. In principle, however, \(x\) may be any poverty predicate that the researcher needs to analyse, for example a variable that relates to access to transport, services and other facilities.
Below we distinguish between monetary and supplementary poverty indexes (or measures). A fuzzy monetary poverty measure is calculated over a numeric vector of length \(n\) (the available sample size). A supplementary poverty index is calculated on a data.frame of questionnaire items that relate to dimensions of poverty other than the monetary one.
The examples below use the eusilc dataset that comes with the package, which must first be loaded in the environment.
The package lets the user choose among different membership functions through the argument fm of the fm_construct function. The membership functions available are:
fm="TFR"
(see Cheli and Lemmi
(1995))\[ \mu_i=\left(1-F_X(x_i)\right)^{\alpha-1} = \left(\frac{\sum_{j=i+1} w_j|x_j> x_i}{\sum_{j\ge 2} w_j|x_j> x_1}\right)^{\alpha-1} \]
where \(F_X\) is the empirical cumulative distribution function of \(X\) calculated for the \(i\)-th individual and \(w_i\) is the sampling weight of statistical unit \(i\). The parameter \(\alpha\ge 2\) is chosen so that the average of the membership function equals the head count ratio (i.e. the proportion of units whose equivalised disposable income falls below the poverty line) of the whole population.
fm="verma1999"
(see Betti and
Verma (1999))\[ \mu_i=(1-L_X(x_i))^{\alpha-1} = \left(\frac{\sum_{j=i+1} w_jx_j|x_j> x_i}{\sum_{j\ge 2} w_jx_j|x_j> x_1}\right)^{\alpha-1} \]
where \(L_X\) is the Lorenz curve of \(X\) calculated for the \(i\)-th individual and \(w_i\) is the sampling weight of statistical unit \(i\). Again, the parameter \(\alpha\ge 2\) is chosen so that the average of the membership function equals the head count ratio of the whole population or its estimate.
fm="verma"
(see Betti and Verma
(2008))\[ \begin{split} \mu_i&=\left(1-F_X(x_i)\right)^{\alpha-1}\left(1-L_X(x_i)\right)=\\ &=\left(\frac{\sum_{j=i+1} w_j|x_j> x_i}{\sum_{j\ge 2} w_j|x_j> x_1}\right)^{\alpha-1} \left(\frac{\sum_{j=i+1} w_jx_j|x_j> x_i}{\sum_{j\ge 2} w_jx_j|x_j> x_1}\right) \end{split} \]
Once more, the parameter \(\alpha\ge 2\) is chosen so that the average of the membership function equals the head count ratio of the whole population or its estimate.
fm="chakravarty"
(see Chakravarty (2006))\[ \mu_i = \begin{cases} 1 & x_i = 0\\ \frac{z - x_i}{z} & 0 \le x_i < z\\ 0 & x_i \ge z \end{cases} \] where \(z\) is a threshold to be chosen by the researcher.
fm="cerioli"
(see Cerioli and
Zani (1990))\[ \mu_i = \begin{cases} 1 & 0<x_i<z_1\\ \frac{z_2-x_i}{z_2-z_1} & z_1\le x_i<z_2\\ 0 & x_i\ge z_2 \end{cases} \]
where the values \(z_1\) and \(z_2\) have to be chosen by the researcher.
fm="ZBM"
(see Zedini and Belhadj
(2015) and Belhadj and Matoussi
(2010))\[ \mu_i = \begin{cases} 1 & a \le x_i < b\\ \frac{-x_i}{c-b} + \frac{c}{c-b} & b \le x_i < c\\ 0 & x_i < a \text{ or } x_i \ge c\\ \end{cases} \] where \(a, b, c\) are percentiles estimated via the bootstrap technique.
fm="belhadj2015"
(see Besma
(2015))\[ \mu_i = \begin{cases} 1 & x_i < z_1\\ \mu^1 = 1-\frac{1}{2}\left(\frac{x_i-z_1}{z_1}\right)^b & z_1 \le x_i < z\\ \mu^2 = \frac{1}{2}\left(\frac{z_2 - x_i}{z_2}\right)^b & z \le x_i < z_2\\ 0 & x_i \ge z_2 \end{cases} \] where \(z\) is the flex point of \(\mu_i\), i.e. the point where \(\mu^1\) and \(\mu^2\) meet, \(z_1\) and \(z_2\) have to be chosen by the researcher, and \(b\ge 1\) is a shape parameter ruling the degree of convexity of the function. In particular, when \(b=1\) the trend is linear.
fm="belhadj2011"
(see Belhadj
(2011))\[ \mu_i = \begin{cases} 1 & 0 < x_i < z_{\min} \\ \frac{-x_i}{z_{\max} - z_{\min}} + \frac{z_{\max}}{z_{\max} - z_{\min}} & z_{\min} \le x_i < z_{\max}\\ 0 & x_i \ge z_{\max} \end{cases} \] where \(z_{\min}\) and \(z_{\max}\) have to be chosen by the researcher.
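To make the threshold-based definitions above concrete, here is a minimal base-R sketch of two of them (Chakravarty and Cerioli-Zani). The function names `mu_chakravarty` and `mu_cerioli` and the incomes are made up for illustration; this is not the package's internal code, and values outside the ranges covered by the formulas (e.g. negative incomes) are simply clipped to \([0,1]\) as an assumption.

```r
# Illustrative base-R versions of two threshold-based membership functions.
# Hypothetical helper names; not part of FuzzyPovertyR.

# Chakravarty (2006): linear decline from 1 at x = 0 to 0 at the threshold z.
mu_chakravarty <- function(x, z) {
  pmin(1, pmax(0, (z - x) / z))
}

# Cerioli and Zani (1990): 1 below z1, linear decline on [z1, z2), 0 from z2 on.
mu_cerioli <- function(x, z1, z2) {
  pmin(1, pmax(0, (z2 - x) / (z2 - z1)))
}

income <- c(0, 5000, 30000, 60000, 90000)  # made-up incomes
mu_chakravarty(income, z = 60000)
mu_cerioli(income, z1 = 10000, z2 = 70000)
```

Both functions are vectorised, so they can be applied to a whole income column at once.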
For each of the functions below the breakdown argument can be specified in case the user wants to obtain estimates for given sub-domains.
fm=verma, fm=verma1999 and fm=TFR
The computation of a fuzzy poverty index that uses the fm="verma" argument goes through the following steps:
FuzzyPovertyR provides the function HCR to estimate the Head Count Ratio from data. It outputs a list of three elements: a classification of units into poor and not poor, the poverty line, and the value of the Head Count Ratio itself.
hcr = HCR(predicate = eusilc$eq_income, weight = eusilc$DB090, p = 0.5, q = 0.6)$HCR # add poverty threshold
If needed, the package has a built-in function eq_predicate to calculate the equivalised disposable income using some equivalence scales.
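As an illustration of what a weighted head count ratio involves, the sketch below computes one in base R. It assumes that the poverty line is q times the weighted p-quantile of the predicate (60% of the median for p = 0.5 and q = 0.6); that reading of p and q, the helper names `weighted_quantile` and `hcr_sketch`, and the data are assumptions made for this example and may differ from what HCR does internally.

```r
# Weighted head count ratio: weighted share of units below a relative poverty line.
# Assumption: poverty line = q * (weighted p-quantile of the predicate).
weighted_quantile <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]  # smallest x whose cumulative weight share reaches p
}

hcr_sketch <- function(x, w, p = 0.5, q = 0.6) {
  line <- q * weighted_quantile(x, w, p)
  poor <- x < line
  list(poor = poor, poverty_line = line, HCR = sum(w[poor]) / sum(w))
}

income <- c(4000, 9500, 12000, 15000, 20000, 26000, 40000)  # made-up incomes
weight <- c(1, 2, 1, 1, 2, 1, 1)                            # made-up sampling weights
hcr_sketch(income, weight)
```

The returned list mirrors the three elements described above: the poor/not poor classification, the poverty line and the ratio itself.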
#> [1] "verma"
#> [1] "FuzzyMonetary"
#> Fuzzy monetary results:
#>
#> Summary of verma membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.004 0.036 0.194 0.528 1
#>
#> Estimate(s):
#>
#> [1] 0.246
#>
#> Parameter(s):
#>
#> alpha
#> 3.787
When alpha = NULL (the default) this function solves a non-linear equation, searching the range given by the interval argument for the value of \(\alpha\) that equates the expected value of the poverty measure to the Head Count Ratio calculated above. This can be avoided by specifying a numeric value of \(\alpha\).
verma = fm_construct(predicate = eusilc$eq_income, fm = "verma", weight = eusilc$DB090, ID = NULL, interval = c(1,10), alpha = 2)
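The search for \(\alpha\) described above can be sketched in base R with `uniroot`. The block below implements the fm="verma" formula from this vignette on simulated data and then finds the \(\alpha\) whose weighted mean membership equals a target head count ratio. The function `mu_verma`, the simulated incomes and weights, and the target value 0.15 are all assumptions made for illustration; the package's own implementation may differ, for instance in how ties are handled.

```r
# Sketch of the fm = "verma" membership function, written from the formula
# in this vignette. Hypothetical helper; not the package's internal code.
mu_verma <- function(x, w, alpha) {
  ord <- order(x)
  xs <- x[ord]
  ws <- w[ord]
  # Sum of the elements that come after each position (0 for the last one).
  sum_after <- function(v) c(rev(cumsum(rev(v)))[-1], 0)
  # Assumes no tied values, so "richer than unit i" means "after i when sorted".
  one_minus_F <- sum_after(ws) / sum_after(ws)[1]
  one_minus_L <- sum_after(ws * xs) / sum_after(ws * xs)[1]
  mu <- one_minus_F^(alpha - 1) * one_minus_L
  mu[order(ord)]  # return values in the original order of x
}

set.seed(1)
income <- rlnorm(200, meanlog = 10, sdlog = 0.6)  # simulated incomes
weight <- runif(200, 0.5, 1.5)                    # simulated sampling weights
hcr    <- 0.15                                    # assumed target head count ratio

# Difference between the weighted mean membership and the target, as a function of alpha.
gap <- function(alpha) weighted.mean(mu_verma(income, weight, alpha), weight) - hcr
alpha_hat <- uniroot(gap, interval = c(1, 10))$root
alpha_hat
```

Because the mean membership decreases as \(\alpha\) grows, the equation has at most one root in the search interval; `uniroot` fails with an error when the interval does not bracket it.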
The result of fm_construct using fm="verma" is a list containing a data.frame of individuals’ membership function values sorted in descending order (i.e. from most poor to least poor):
head(verma$results)
#> ID predicate weight mu
#> 1 44 3.225806 465.3585 1.0000000
#> 2 450 6.666667 1010.5060 0.9946732
#> 3 372 12.903226 304.7067 0.9930663
#> 4 490 17.857143 1177.4120 0.9868551
#> 5 245 20.000000 853.0255 0.9823546
#> 6 130 42.424242 1141.3410 0.9763241
With breakdown=NULL the index is estimated on the whole sample. However, one can obtain estimates for sub-domains using the breakdown argument as follows:
verma.break = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, ID = NULL, HCR = hcr, interval = c(1,10), alpha = NULL, breakdown = eusilc$db040, fm="verma")
summary(verma.break)
#> Fuzzy monetary results:
#>
#> Summary of verma membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.004 0.036 0.194 0.528 1
#>
#> Estimate(s):
#>
#> a b c d e f g h i j k l m
#> 0.189 0.165 0.166 0.268 0.217 0.178 0.363 0.294 0.186 0.155 0.157 0.272 0.327
#> n o p q r s t u v w x y z
#> 0.247 0.237 0.255 0.356 0.163 0.375 0.076 0.509 0.242 0.206 0.271 0.313 0.213
#>
#> Parameter(s):
#>
#> alpha
#> 3.787
verma.break$estimate
#> a b c d e f g
#> 0.18866339 0.16538025 0.16562162 0.26821706 0.21695323 0.17829490 0.36323070
#> h i j k l m n
#> 0.29418642 0.18600937 0.15484057 0.15722022 0.27162839 0.32717227 0.24665062
#> o p q r s t u
#> 0.23685247 0.25459850 0.35638295 0.16349880 0.37543084 0.07613034 0.50897058
#> v w x y z
#> 0.24174548 0.20595260 0.27131691 0.31315140 0.21313753
The returned list also contains the alpha parameter.
With an almost identical procedure it is also possible to compute the index for fm="verma1999" and fm="TFR".
#> [1] "verma1999"
#> [1] "FuzzyMonetary"
#> Fuzzy monetary results:
#>
#> Summary of verma1999 membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0 0.006 0.131 0.591 1
#>
#> Estimate(s):
#>
#> [1] 0.246
#>
#> Parameter(s):
#>
#> alpha
#> 12.067
#> [1] "TFR"
#> [1] "FuzzyMonetary"
#> Fuzzy monetary results:
#>
#> Summary of TFR membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.005 0.042 0.199 0.52 1
#>
#> Estimate(s):
#>
#> [1] 0.246
#>
#> Parameter(s):
#>
#> alpha
#> 3.062
fm=belhadj2015 and fm=cerioli
The construction of a fuzzy index using the membership function of Besma (2015) or Cerioli and Zani (1990) is obtained by specifying fm="belhadj2015" or fm="cerioli".
Let us begin with fm="belhadj2015". For this mf the arguments z1, z2 and b must be specified by the user. The value z, which corresponds to the flex point of the mf (the point where its two branches meet), is calculated by the function. The parameter b \((\ge 1)\) rules the shape of the membership function (set b=1 for linearity).
z1 = 20000; z2 = 70000; b = 2
belhadj = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "belhadj2015", z1 = z1, z2 = z2, b = b)
#> Fuzzy monetary results:
#>
#> Summary of belhadj2015 membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.991 1 1 1 1
#>
#> Estimate(s):
#>
#> [1] 0.939
#>
#> Parameter(s):
#>
#> z1 z2 z b
#> 20000.00 70000.00 47547.17 2.00
For fm="cerioli" we again have to set the values of z1 and z2 as follows:
z1 = 10000; z2 = 70000
cerioli = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "cerioli", z1 = z1, z2 = z2)
#> Fuzzy monetary results:
#>
#> Summary of cerioli membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.788 0.935 1 1 1
#>
#> Estimate(s):
#>
#> [1] 0.895
#>
#> Parameter(s):
#>
#> z1 z2
#> 10000 70000
fm=chakravarty and fm=belhadj2011
The Chakravarty (2006) fuzzy index is obtained by setting fm = "chakravarty". The argument z must be specified by the user as follows:
z = 60000
chakravarty = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "chakravarty", z = z)
#> Fuzzy monetary results:
#>
#> Summary of chakravarty membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.622 0.768 0.84 0.903 1
#>
#> Estimate(s):
#>
#> [1] 0.761
#>
#> Parameter(s):
#>
#> z
#> 60000
Again, it is possible to specify the breakdown argument to obtain estimates for sub-domains.
chakravarty.break = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "chakravarty", z = z, breakdown = eusilc$db040)
knitr::kable(data.frame(verma.break$estimate, chakravarty.break$estimate), col.names = c("Verma", "Chakravarty"), digits = 4)
Sub-domain | Verma | Chakravarty |
---|---|---|
a | 0.1887 | 0.7041 |
b | 0.1654 | 0.7340 |
c | 0.1656 | 0.6114 |
d | 0.2682 | 0.7740 |
e | 0.2170 | 0.7886 |
f | 0.1783 | 0.6871 |
g | 0.3632 | 0.8167 |
h | 0.2942 | 0.8128 |
i | 0.1860 | 0.7439 |
j | 0.1548 | 0.7600 |
k | 0.1572 | 0.7827 |
l | 0.2716 | 0.8143 |
m | 0.3272 | 0.7918 |
n | 0.2467 | 0.7797 |
o | 0.2369 | 0.7523 |
p | 0.2546 | 0.7596 |
q | 0.3564 | 0.8127 |
r | 0.1635 | 0.7804 |
s | 0.3754 | 0.8489 |
t | 0.0761 | 0.7434 |
u | 0.5090 | 0.8221 |
v | 0.2417 | 0.7170 |
w | 0.2060 | 0.6673 |
x | 0.2713 | 0.8170 |
y | 0.3132 | 0.7286 |
z | 0.2131 | 0.7966 |
For fm = "belhadj2011", instead, the values of \(z_{\min}\) and \(z_{\max}\) must be specified by the user as follows:
zmin = 5000; zmax = 60000
belhadj2011 = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "belhadj2011", z_min = zmin, z_max = zmax)
#> Fuzzy monetary results:
#>
#> Summary of belhadj2011 membership function:
#>
#> Quantiles:
#>
#> 0% 20% 40% 60% 80% 100%
#> 0 0.678 0.838 0.916 0.985 1
#>
#> Estimate(s):
#>
#> [1] 0.823
#>
#> Parameter(s):
#>
#> z_min z_max
#> 5000 60000
fm=ZBM
As previously shown, computing the index related to this mf requires three parameters \(a, b, c\), which are estimated with bootstrap techniques (Zedini and Belhadj (2015)). Moreover, the definition of the index requires knowledge of the household size. The index is obtained as follows:
ZBM = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "ZBM", hh.size = eusilc$ncomp)
The package also includes a multidimensional fuzzy poverty index known as the Fuzzy Supplementary (FS) index (see Betti and Verma (2008) and Betti, Gagliardi, and Verma (2018)). This index is defined in multiple steps. The package has an ad-hoc function for each step (excluding the third one). The steps are:
Step 1 - Identification
Step 2 - Transformation
Step 3 - Factor analysis to identify dimensions of poverty
Step 4 - Calculation of weights
Step 5 - Calculation of scores in dimensions
Step 6 - Calculation of the \(\alpha\) parameter
Step 7 - Construction of the FS measure for each dimension
This step is pretty simple. The user has to select the columns of the data that correspond to the items to be kept in the analysis. The selected columns are referred to as step1 in the code below.
If the items are not “ordered” in the right way, i.e. if the highest values represent the highest deprivation, it is possible to invert them using the function fs_order:
# Create a data frame in which the variable X is not ordered in the right way:
data = data.frame("X" = rep(c(1,2,3,4), 20), "Y" = rep(c(7,8,9,1), 20))
# vec_order holds one logical value per column: TRUE if the order of that variable is to be inverted, FALSE otherwise
vec_order = c(TRUE, FALSE)
head(fs_order(data=data, vec_order))
#> X Y
#> 1 4 7
#> 2 3 8
#> 3 2 9
#> 4 1 1
#> 5 4 7
#> 6 3 8
In this step the items are mapped from their original space to the \([0,1]\) interval (see Betti, Gagliardi, and Verma (2018) and Betti and Verma (2008)). For each item, a positive score \(s_{ij}\) is determined as follows
\[ s_{ij}= 1- d_{ij} = 1-\frac{1-F(c_{ij})}{1-F(1)},\quad i=1,\dots,n \quad \text{and}\quad j=1,\dots,k \]
where \(c_{ij}\) is the category of the \(j\)-th item for the \(i\)-th household and \(F(c_{ij})\) is the value of the cumulative distribution function of the \(j\)-th item for the \(i\)-th household. This step is done using the function fs_transform as follows:
step2 = fs_transform(step1, weight = eusilc$DB090, ID = eusilc$ID); class(step2)
#> [1] "FuzzySupplementary"
summary(step2$step2)
#> ID HS040 HS050 HS060
#> Min. : 1.0 Min. :0.000 Min. :0.000 Min. :0.000
#> 1st Qu.:125.8 1st Qu.:0.000 1st Qu.:1.000 1st Qu.:0.000
#> Median :250.5 Median :1.000 Median :1.000 Median :1.000
#> Mean :250.5 Mean :0.608 Mean :0.976 Mean :0.538
#> 3rd Qu.:375.2 3rd Qu.:1.000 3rd Qu.:1.000 3rd Qu.:1.000
#> Max. :500.0 Max. :1.000 Max. :1.000 Max. :1.000
#> HS070 HS080 HS090 HS100
#> Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
#> 1st Qu.:1.0000 1st Qu.:1.0000 1st Qu.:0.0000 1st Qu.:1.0000
#> Median :1.0000 Median :1.0000 Median :1.0000 Median :1.0000
#> Mean :0.9929 Mean :0.9874 Mean :0.7279 Mean :0.9758
#> 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
#> Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
#> HS110 HS120 HS160 HS170
#> Min. :0.0000 Min. :0.0000 Min. :0.00 Min. :0.000
#> 1st Qu.:1.0000 1st Qu.:0.1201 1st Qu.:0.00 1st Qu.:0.000
#> Median :1.0000 Median :0.3791 Median :0.00 Median :0.000
#> Mean :0.8274 Mean :0.4420 Mean :0.05 Mean :0.122
#> 3rd Qu.:1.0000 3rd Qu.:0.7960 3rd Qu.:0.00 3rd Qu.:0.000
#> Max. :1.0000 Max. :1.0000 Max. :1.00 Max. :1.000
#> HS180 HS190 HH010 HH020
#> Min. :0.000 Min. :0.000 Min. :0.00000 Min. :0.0000
#> 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.08545 1st Qu.:0.6244
#> Median :0.000 Median :0.000 Median :0.08545 Median :1.0000
#> Mean :0.058 Mean :0.086 Mean :0.45947 Mean :0.8080
#> 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:1.00000 3rd Qu.:1.0000
#> Max. :1.000 Max. :1.000 Max. :1.00000 Max. :1.0000
#> HH040 HH050 HH081 HH091
#> Min. :0.000 Min. :0.000 Min. :0.0000 Min. :0.00
#> 1st Qu.:0.000 1st Qu.:1.000 1st Qu.:1.0000 1st Qu.:1.00
#> Median :0.000 Median :1.000 Median :1.0000 Median :1.00
#> Mean :0.128 Mean :0.954 Mean :0.9865 Mean :0.99
#> 3rd Qu.:0.000 3rd Qu.:1.000 3rd Qu.:1.0000 3rd Qu.:1.00
#> Max. :1.000 Max. :1.000 Max. :1.0000 Max. :1.00
#> HX040
#> Min. :0.0000
#> 1st Qu.:0.1537
#> Median :0.5762
#> Mean :0.5511
#> 3rd Qu.:1.0000
#> Max. :1.0000
# step2.1 = fs_transform(step1, weight = eusilc$DB090, ID = eusilc$ID, depr.score = "d")
which outputs
ID | HS040 | HS050 | HS060 | HS070 | HS080 | HS090 | HS100 | HS110 | HS120 | HS160 | HS170 | HS180 | HS190 | HH010 | HH020 | HH040 | HH050 | HH081 | HH091 | HX040 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 1 | 0 | 1 | 1 | 1.000 | 1 | 1 | 1.000 | 0 | 0 | 0 | 0 | 1.000 | 1.000 | 0 | 1 | 1 | 1 | 0.576 |
2 | 0 | 0 | 0 | 1 | 1 | 0.776 | 1 | 1 | 0.796 | 0 | 0 | 0 | 0 | 0.085 | 0.080 | 0 | 0 | 1 | 1 | 0.154 |
3 | 0 | 1 | 1 | 1 | 1 | 1.000 | 1 | 1 | 0.796 | 0 | 0 | 0 | 0 | 0.085 | 0.080 | 1 | 1 | 1 | 1 | 0.154 |
4 | 1 | 1 | 1 | 1 | 1 | 0.000 | 0 | 0 | 0.120 | 0 | 1 | 0 | 1 | 0.005 | 0.624 | 0 | 1 | 1 | 1 | 1.000 |
5 | 1 | 1 | 0 | 1 | 1 | 1.000 | 1 | 1 | 0.379 | 0 | 0 | 0 | 0 | 0.005 | 0.624 | 0 | 1 | 1 | 1 | 1.000 |
6 | 0 | 1 | 0 | 1 | 1 | 1.000 | 1 | 1 | 0.379 | 0 | 0 | 0 | 0 | 0.085 | 0.080 | 0 | 1 | 1 | 1 | 0.040 |
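The transformation formula of this step can be checked by hand on a small example. The base-R sketch below computes \(s_{ij}\) for a single ordinal item from its weighted cumulative distribution. The helper name `score_item` and the data are made up, and the sketch assumes the item has at least two observed categories; it is not the code used by fs_transform.

```r
# Score of one ordinal item: s = 1 - (1 - F(c)) / (1 - F(1)),
# where F is the weighted cumulative distribution over the item's categories.
# Hypothetical helper; assumes at least two observed categories.
score_item <- function(item, w) {
  cats  <- sort(unique(item))
  share <- sapply(cats, function(k) sum(w[item == k])) / sum(w)
  Fc    <- cumsum(share)              # F evaluated at each category
  d     <- (1 - Fc) / (1 - Fc[1])     # equals 1 in the lowest category, 0 in the highest
  s     <- 1 - d
  s[match(item, cats)]                # one score per unit
}

item <- c(1, 1, 2, 2, 2, 3, 3, 3, 3, 4)  # made-up answers on a 4-category item
w    <- rep(1, 10)                       # equal weights for simplicity
score_item(item, w)
```

Binary items only ever receive the scores 0 and 1 under this formula, which is consistent with the many 0/1 columns in the table above.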
This fuzzy supplementary measure uses factor analysis to uncover latent dimensions in the data. There are multiple approaches to factor analysis in R, which we do not cover in this vignette; the user can check, for example, the lavaan package. However, factor analysis is not mandatory and the user may wish to undertake a different approach to uncover a latent structure in the data. Indeed, it is possible to skip factor analysis or to use a personal assignment of columns to dimensions.
Regardless of the chosen method, to go through Step 3 the user has to specify a numeric vector, with one element per item selected in Step 1, that assigns each column to a given dimension.
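As an example, the assignment vector can be written out by hand. The sketch below reproduces the grouping reported in the weights table later in this vignette (20 items in five dimensions); the item names are taken from the Step 2 output, and the helper data frame `assignment` exists only to make the mapping easy to inspect.

```r
# Item-to-dimension assignment matching the weights table in this vignette.
items <- c("HS040", "HS050", "HS060", "HS070",           # dimension 1
           "HS080", "HS090", "HS100", "HS110", "HS120",  # dimension 2
           "HS160", "HS170", "HS180", "HS190",           # dimension 3
           "HH010", "HH020", "HH040", "HH050",           # dimension 4
           "HH081", "HH091", "HX040")                    # dimension 5

# One entry per item, in the same order as the selected columns.
dimensions <- c(rep(1, 4), rep(2, 5), rep(3, 4), rep(4, 4), rep(5, 3))

# Side-by-side view for checking the mapping (illustrative only).
assignment <- data.frame(Item = items, Dimension = dimensions)
table(assignment$Dimension)
```

The order of the entries must follow the order of the columns selected in Step 1, since the vector is matched to them by position.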
In this step, the weights to be assigned to each item belonging to a given dimension \(h\) are determined separately within each dimension. The weighting procedure takes into account two different aspects: the dispersion of the deprivation indicator and its correlation with the other deprivation indicators in the given dimension. The weight of item \(j\) belonging to dimension \(h\) is taken as
\[ \omega_{hj}=\omega^a_{hj}\times \omega^b_{hj}, \quad h=1,\dots,H \quad \text{and}\quad j=1,\dots,k_h \]
\(\omega_{hj}^a\) is taken as proportional to the coefficient of variation of the complement to one of the positive score for the variable concerned
\[ \omega^a_{hj}\propto \frac{\sigma_{s_{hj}}}{1-\bar{s}_{hj}} \]
where \(\sigma_{s_{hj}}\) is the standard deviation of the deprivation score \(s\) for item \(j\) in dimension \(h\) and \(\bar s_{hj}\) its sample mean. The second factor is
\[ \omega^b_{hj}\propto\Biggl(\frac{1}{1+\sum_{j=1}^{k_h}\rho_{e_{hjhj^{\ast}}}|\rho_{e_{hjhj^{\ast}}}<r^\ast_{e_{hj}}}\Biggr)\times \Biggl(\frac{1}{1+\sum_{j=1}^{k_h}\rho_{e_{hjhj^{\ast}}}|\rho_{e_{hjhj^{\ast}}}\ge r^\ast_{e_{hj}}}\Biggr) \]
where \(\rho_{e_{hjhj^{\ast}}}\) is Kendall’s correlation coefficient between the deprivation indicators corresponding to items \(j\) and \(j^{\ast}\) in the \(h\)-th dimension and \(r^\ast_{e_{hj}}\) is a critical value. Aggregation over a group of items in a particular dimension is given by a weighted mean taken over the items in that dimension, \(s_{hi}=\sum_j w_{hj} s_{hj,i}/\sum_j w_{hj}\), where \(w_{hj}\) is the weight of the \(j\)-th deprivation item in the \(h\)-th dimension (the weight \(\omega_{hj}\) defined above). An overall score for the \(i\)-th individual is calculated as the unweighted mean over the dimensions:
\[
s_i=\frac{\sum_h s_{hi}}{H}
\]
Steps 4 and 5 are implemented in the package with the function fs_weight. It requires a value \(\rho\), the critical value to be used for the calculation of weights in the Kendall correlation matrix. If rho is NULL, i.e. not defined, it is set equal to the point of largest gap in the ordered set of correlation values encountered (see Betti and Verma, 2008).
#> [1] "FuzzySupplementary"
#> # A tibble: 20 × 5
#> Dimension Item w_a w_b w
#> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 HS040 0.804 0.483 0.388
#> 2 1 HS050 0.157 0.688 0.108
#> 3 1 HS060 0.928 0.508 0.471
#> 4 1 HS070 0.0807 0.504 0.0407
#> 5 2 HS080 0.111 0.611 0.0677
#> 6 2 HS090 0.597 0.586 0.350
#> 7 2 HS100 0.157 0.478 0.0751
#> 8 2 HS110 0.419 0.578 0.242
#> 9 2 HS120 0.759 0.883 0.670
#> 10 3 HS160 4.36 0.807 3.52
#> 11 3 HS170 2.69 0.630 1.69
#> 12 3 HS180 4.03 0.686 2.77
#> 13 3 HS190 3.26 1.32 4.29
#> 14 4 HH010 0.986 0.929 0.916
#> 15 4 HH020 0.427 0.900 0.384
#> 16 4 HH040 2.61 0.933 2.44
#> 17 4 HH050 0.220 0.980 0.215
#> 18 5 HH081 0.105 0.553 0.0583
#> 19 5 HH091 0.101 0.590 0.0593
#> 20 5 HX040 0.624 0.903 0.563
The output is a long-format data frame that contains the weights \(w_a, w_b, w = w_a\times w_b\), the deprivation score \(s_{hi}\) for unit \(i\) and dimension \(h\), and the overall score \(s_i\) (the average over dimensions).
ID | Item | s | Dimension | w_a | w_b | w | s_hi | s_i |
---|---|---|---|---|---|---|---|---|
1 | HS040 | 0 | 1 | 0.8038 | 0.4828 | 0.3881 | 0.1476 | 0.4536 |
2 | HS040 | 0 | 1 | 0.8038 | 0.4828 | 0.3881 | 0.0404 | 0.2703 |
3 | HS040 | 0 | 1 | 0.8038 | 0.4828 | 0.3881 | 0.6149 | 0.5334 |
4 | HS040 | 1 | 1 | 0.8038 | 0.4828 | 0.3881 | 1.0000 | 0.4972 |
5 | HS040 | 1 | 1 | 0.8038 | 0.4828 | 0.3881 | 0.5327 | 0.4558 |
6 | HS040 | 0 | 1 | 0.8038 | 0.4828 | 0.3881 | 0.1476 | 0.2527 |
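The dimension score can be reproduced by hand from the numbers printed in this vignette. The sketch below takes the weights \(w\) of the four items in dimension 1 from the weights table and the transformed scores of unit 1 from the Step 2 output, and computes their weighted mean; the inputs are rounded as printed, so the result matches the reported 0.1476 only up to rounding.

```r
# Weights w of the items in dimension 1, as printed in the weights table (rounded).
w_dim1 <- c(HS040 = 0.3881, HS050 = 0.108, HS060 = 0.471, HS070 = 0.0407)

# Transformed scores of unit 1 on the same items, from the Step 2 output.
s_id1 <- c(HS040 = 0, HS050 = 1, HS060 = 0, HS070 = 1)

# Dimension score: weighted mean of the item scores within the dimension.
s_h1 <- sum(w_dim1 * s_id1) / sum(w_dim1)
s_h1
```

Repeating the calculation for every dimension and averaging the resulting scores without weights gives the overall score \(s_i\).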
This step is equivalent to that discussed in the Fuzzy Monetary section when fm="verma"; in fact, the mf is defined as:
\[
\mu_i=(1-F_S(s_i))^{\alpha-1}(1-L_S(s_i)).
\]
The function fs_equate is used to find the value of \(\alpha\) such that the expected value of the mf equals the HCR.
alpha = fs_equate(steps4_5 = steps4_5, weight = eusilc$DB090, HCR = hcr, interval = c(1,10))
#> trying with alpha: 1 Expected Value: 0.5559
#> trying with alpha: 10 Expected Value: 0.0926
#> trying with alpha: 7.0204 Expected Value: 0.1285
#> trying with alpha: 4.0102 Expected Value: 0.2098
#> trying with alpha: 2.951 Expected Value: 0.2691
#> trying with alpha: 3.364 Expected Value: 0.2424
#> trying with alpha: 3.3089 Expected Value: 0.2457
#> trying with alpha: 3.3038 Expected Value: 0.246
#> trying with alpha: 3.3039 Expected Value: 0.246
#> trying with alpha: 3.3038 Expected Value: 0.246
#> Done.
(Alternatively, a user-defined value of the alpha argument can be used.)
The estimated parameter \(\alpha\) is used to calculate the fuzzy supplementary mf for each dimension of deprivation separately as follows. The FS indicator for the \(h\)-th deprivation dimension and the \(i\)-th individual is defined as a combination of the \((1-F_{(S),hi})\) indicator and the \((1-L_{(S),hi})\) indicator:
\[
\begin{split}
\mu_{hi}&=\Biggl(1-F_{S_{h}}(s_{hi})\Biggr)^{\alpha-1}\Biggl(1-L_{S_h}(s_{hi})\Biggr)=\\
&=\Biggl[\frac{\sum_{\gamma=i+1}w_{h\gamma}|s_{h\gamma}>s_{hi}}{\sum_{\gamma\ge 2}w_{h\gamma}|s_{h\gamma}>s_{h1}}\Biggr]^{\alpha-1}\Biggl[\frac{\sum_{\gamma=i+1}w_{h\gamma}s_{h\gamma}|s_{h\gamma}>s_{hi}}{\sum_{\gamma\ge 2}w_{h\gamma}s_{h\gamma}|s_{h\gamma}>s_{h1}}\Biggr]
\end{split}
\]
The function fs_construct is used to compute the FS for each dimension and overall as follows:
#> Fuzzy supplementary results:
#>
#> Summary of the membership functions:
#>
#> Quantiles:
#>
#> 0% 25% 50% 75% 100%
#> FS1 0 0.0000 0.1028 0.3146 1
#> FS2 0 0.0234 0.1761 0.4886 1
#> FS3 0 1.0000 1.0000 1.0000 1
#> FS4 0 0.0049 0.1082 0.1082 1
#> FS5 0 0.0000 0.0113 0.3381 1
#> Overall 0 0.0074 0.0975 0.3643 1
#>
#> Estimate(s):
#>
#> FS1 FS2 FS3 FS4 FS5 Overall
#> 0.161 0.224 0.829 0.200 0.191 0.246
#>
#> Parameter(s):
#>
#> Alpha:
#>
#> [1] 3.304
#> FS1 FS2 FS3 FS4 FS5 Overall
#> a 0.16017778 0.15721420 0.7504151 0.11843430 0.13690382 0.17803697
#> b 0.17203766 0.13353783 0.8603701 0.18226841 0.35306548 0.16150876
#> c 0.17790861 0.20949808 0.8747891 0.23864519 0.22568769 0.30225670
#> d 0.18430484 0.38429991 0.8634134 0.17047030 0.06218540 0.24525107
#> e 0.13157099 0.21602939 0.9432241 0.35550492 0.30772389 0.45379840
#> f 0.11635064 0.25468798 0.7003120 0.18457563 0.15552302 0.18651992
#> g 0.19789906 0.19517219 0.7868058 0.16063463 0.08637581 0.27243247
#> h 0.10378979 0.19398696 0.8526735 0.21615485 0.22137989 0.18955852
#> i 0.19944239 0.10733043 0.9561498 0.16923227 0.29106818 0.21228732
#> j 0.10950019 0.22043135 0.8774224 0.05680170 0.18878432 0.19766477
#> k 0.20243807 0.07183648 0.9791507 0.09036923 0.27431162 0.32986760
#> l 0.18773841 0.23105579 0.9185245 0.15254579 0.12587426 0.16756316
#> m 0.19459905 0.08661289 0.8812583 0.19715665 0.19712955 0.16909045
#> n 0.08403138 0.23751028 0.8101947 0.24143510 0.05576150 0.08021327
#> o 0.10717430 0.15153461 0.6907420 0.07810733 0.12316342 0.09665732
#> p 0.12462294 0.26074356 0.7678514 0.28953421 0.29440486 0.21144141
#> q 0.10731824 0.28134009 0.8083268 0.16137948 0.26594294 0.22523560
#> r 0.13292525 0.13075056 0.9532498 0.36621517 0.41196933 0.38421113
#> s 0.26416295 0.28100851 0.9387913 0.26507953 0.21010476 0.41753517
#> t 0.16740211 0.41950301 0.8725724 0.11691019 0.04229702 0.35450250
#> u 0.22746666 0.22160734 0.6666813 0.11469800 0.21659332 0.22168511
#> v 0.17997018 0.20148115 0.7284268 0.18935129 0.22035156 0.24724376
#> w 0.19114979 0.24834051 0.7252039 0.37794904 0.13056837 0.24049684
#> x 0.15863892 0.23342891 0.8680189 0.15527000 0.14885089 0.24134930
#> y 0.11294887 0.29472989 0.8436033 0.14557032 0.09870329 0.25374784
#> z 0.21039040 0.30499191 0.7309410 0.25553623 0.20458509 0.44289583
The output of the fs_construct function is a list containing:
membership: a list containing the FS measures for each statistical unit in the sample, with results available for each dimension
estimate: the average of the membership function for each dimension
FS$estimate
#> FS1 FS2 FS3 FS4 FS5 Overall
#> a 0.16017778 0.15721420 0.7504151 0.11843430 0.13690382 0.17803697
#> b 0.17203766 0.13353783 0.8603701 0.18226841 0.35306548 0.16150876
#> c 0.17790861 0.20949808 0.8747891 0.23864519 0.22568769 0.30225670
#> d 0.18430484 0.38429991 0.8634134 0.17047030 0.06218540 0.24525107
#> e 0.13157099 0.21602939 0.9432241 0.35550492 0.30772389 0.45379840
#> f 0.11635064 0.25468798 0.7003120 0.18457563 0.15552302 0.18651992
#> g 0.19789906 0.19517219 0.7868058 0.16063463 0.08637581 0.27243247
#> h 0.10378979 0.19398696 0.8526735 0.21615485 0.22137989 0.18955852
#> i 0.19944239 0.10733043 0.9561498 0.16923227 0.29106818 0.21228732
#> j 0.10950019 0.22043135 0.8774224 0.05680170 0.18878432 0.19766477
#> k 0.20243807 0.07183648 0.9791507 0.09036923 0.27431162 0.32986760
#> l 0.18773841 0.23105579 0.9185245 0.15254579 0.12587426 0.16756316
#> m 0.19459905 0.08661289 0.8812583 0.19715665 0.19712955 0.16909045
#> n 0.08403138 0.23751028 0.8101947 0.24143510 0.05576150 0.08021327
#> o 0.10717430 0.15153461 0.6907420 0.07810733 0.12316342 0.09665732
#> p 0.12462294 0.26074356 0.7678514 0.28953421 0.29440486 0.21144141
#> q 0.10731824 0.28134009 0.8083268 0.16137948 0.26594294 0.22523560
#> r 0.13292525 0.13075056 0.9532498 0.36621517 0.41196933 0.38421113
#> s 0.26416295 0.28100851 0.9387913 0.26507953 0.21010476 0.41753517
#> t 0.16740211 0.41950301 0.8725724 0.11691019 0.04229702 0.35450250
#> u 0.22746666 0.22160734 0.6666813 0.11469800 0.21659332 0.22168511
#> v 0.17997018 0.20148115 0.7284268 0.18935129 0.22035156 0.24724376
#> w 0.19114979 0.24834051 0.7252039 0.37794904 0.13056837 0.24049684
#> x 0.15863892 0.23342891 0.8680189 0.15527000 0.14885089 0.24134930
#> y 0.11294887 0.29472989 0.8436033 0.14557032 0.09870329 0.25374784
#> z 0.21039040 0.30499191 0.7309410 0.25553623 0.20458509 0.44289583
alpha: the parameter \(\alpha\) estimated from data.
Again, it is possible to obtain results for sub-domains by specifying the breakdown argument.
Sub-domain | FS1 | FS2 | FS3 | FS4 | FS5 | Overall |
---|---|---|---|---|---|---|
a | 0.1602 | 0.1572 | 0.7504 | 0.1184 | 0.1369 | 0.1780 |
b | 0.1720 | 0.1335 | 0.8604 | 0.1823 | 0.3531 | 0.1615 |
c | 0.1779 | 0.2095 | 0.8748 | 0.2386 | 0.2257 | 0.3023 |
d | 0.1843 | 0.3843 | 0.8634 | 0.1705 | 0.0622 | 0.2453 |
e | 0.1316 | 0.2160 | 0.9432 | 0.3555 | 0.3077 | 0.4538 |
f | 0.1164 | 0.2547 | 0.7003 | 0.1846 | 0.1555 | 0.1865 |
g | 0.1979 | 0.1952 | 0.7868 | 0.1606 | 0.0864 | 0.2724 |
h | 0.1038 | 0.1940 | 0.8527 | 0.2162 | 0.2214 | 0.1896 |
i | 0.1994 | 0.1073 | 0.9561 | 0.1692 | 0.2911 | 0.2123 |
j | 0.1095 | 0.2204 | 0.8774 | 0.0568 | 0.1888 | 0.1977 |
k | 0.2024 | 0.0718 | 0.9792 | 0.0904 | 0.2743 | 0.3299 |
l | 0.1877 | 0.2311 | 0.9185 | 0.1525 | 0.1259 | 0.1676 |
m | 0.1946 | 0.0866 | 0.8813 | 0.1972 | 0.1971 | 0.1691 |
n | 0.0840 | 0.2375 | 0.8102 | 0.2414 | 0.0558 | 0.0802 |
o | 0.1072 | 0.1515 | 0.6907 | 0.0781 | 0.1232 | 0.0967 |
p | 0.1246 | 0.2607 | 0.7679 | 0.2895 | 0.2944 | 0.2114 |
q | 0.1073 | 0.2813 | 0.8083 | 0.1614 | 0.2659 | 0.2252 |
r | 0.1329 | 0.1308 | 0.9532 | 0.3662 | 0.4120 | 0.3842 |
s | 0.2642 | 0.2810 | 0.9388 | 0.2651 | 0.2101 | 0.4175 |
t | 0.1674 | 0.4195 | 0.8726 | 0.1169 | 0.0423 | 0.3545 |
u | 0.2275 | 0.2216 | 0.6667 | 0.1147 | 0.2166 | 0.2217 |
v | 0.1800 | 0.2015 | 0.7284 | 0.1894 | 0.2204 | 0.2472 |
w | 0.1911 | 0.2483 | 0.7252 | 0.3779 | 0.1306 | 0.2405 |
x | 0.1586 | 0.2334 | 0.8680 | 0.1553 | 0.1489 | 0.2413 |
y | 0.1129 | 0.2947 | 0.8436 | 0.1456 | 0.0987 | 0.2537 |
z | 0.2104 | 0.3050 | 0.7309 | 0.2555 | 0.2046 | 0.4429 |
The package also contains a function named fs_construct_all, which constructs the fuzzy supplementary poverty measure in a single call, without going through the step-by-step functions.
The variance of each Fuzzy Monetary measure can be estimated either via Bootstrap (naive or calibrated) or via Jackknife Repeated Replications. We recommend the former whenever the user has no knowledge of the sampling design, and the Jackknife when there is full information on the design and on the PSUs (see Betti, Gagliardi, and Verma (2018)). In the following, for the unidimensional indices, we report only the examples for fm=verma. For the other specifications of the mf the function call is identical; the only changes concern the parameters required by the mf (see fm_construct).
fm=verma
In the case of fm="verma", we recommend using the value of alpha obtained from the function fm_construct. It is possible to specify different values of the parameter (e.g. alpha=2). We do not recommend leaving the argument alpha=NULL for the computation of the variance.
alpha = fm_construct(predicate = eusilc$eq_income, weight = eusilc$DB090, ID = NULL, HCR = 0.12, interval = c(1,10), alpha = NULL)$alpha
boot.var = fm_var(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "verma", type = "bootstrap_naive", HCR = .12, alpha = alpha, verbose = FALSE, R = 10)
# plot(boot.var)
fm_var(predicate = eusilc$eq_income, weight = eusilc$DB090, fm = "verma", type = "jackknife", HCR = .12, alpha = 9, stratum = eusilc$stratum, psu = eusilc$psu, verbose = FALSE)
#> $variance
#> [1] 5.324184e-06
#>
#> $type
#> [1] "jackknife"
#>
#> attr(,"class")
#> [1] "FuzzyMonetary"
which gives the bootstrap estimate or the jackknife estimate.
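For intuition about the naive bootstrap, the sketch below resamples units with replacement and takes the variance of the replicate estimates. It is a simplified illustration on simulated data: the membership values are held fixed across replicates, whereas a full implementation would typically recompute them in each resample, and nothing here is taken from the internals of fm_var.

```r
# Naive bootstrap variance of a weighted mean membership (simplified sketch).
set.seed(42)
n  <- 300
mu <- rbeta(n, 0.5, 1.5)   # simulated stand-in for membership values in [0, 1]
w  <- runif(n, 0.5, 1.5)   # simulated sampling weights
R  <- 500                  # number of bootstrap replicates

boot_est <- replicate(R, {
  idx <- sample.int(n, n, replace = TRUE)   # resample units with replacement
  weighted.mean(mu[idx], w[idx])            # replicate estimate of the index
})

boot_var <- var(boot_est)  # bootstrap variance estimate
boot_var
```

A small R, like the R = 10 used above to keep the vignette fast, gives a noisy variance estimate; larger values are usually preferable in applications.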
If variance estimates are needed for multiple sub-domains or sub-populations, the user can pass the grouping variable to the breakdown argument of the function fm_var. For example:
#> Variance of Fuzzy monetary results:
#>
#> Type of estimator:
#>
#> bootstrap_naive
#>
#> Estimate(s):
#>
#> a b c d e f g h
#> 0.004706 0.003829 0.007216 0.009812 0.004559 0.003746 0.009595 0.007760
#> i j k l m n o p
#> 0.003195 0.004499 0.007071 0.004369 0.010124 0.004262 0.008497 0.004778
#> q r s t u v w x
#> 0.014909 0.005027 0.004747 0.000420 0.005187 0.007698 0.007066 0.004625
#> y z
#> 0.006531 0.005011
#> Variance of Fuzzy monetary results:
#>
#> Type of estimator:
#>
#> jackknife
#>
#> Estimate(s):
#>
#> a b c d e f g h
#> 0.006067 0.009761 0.009576 0.012358 0.013549 0.003032 0.014162 0.024544
#> i j k l m n o p
#> 0.005101 0.006252 0.014782 0.003836 0.016164 0.002989 0.008660 0.007709
#> q r s t u v w x
#> 0.022870 0.008810 0.007936 0.000672 0.008556 0.010271 0.007848 0.007644
#> y z
#> 0.009668 0.005607
variance = fs_var(data = eusilc[,4:23], weight = eusilc$DB090, ID = NULL, dimensions = dimensions, breakdown = NULL, HCR = 0.12, alpha = 2, rho = NULL, type = 'bootstrap', M = NULL, R = 2, verbose = F)
summary(variance)
The following uses the Jackknife:
summary(fs_var(data = eusilc[,4:23], weight = eusilc$DB090, ID = NULL, dimensions = dimensions,
stratum = eusilc$stratum, psu = eusilc$psu, verbose = FALSE, f = .01,
breakdown = eusilc$db040, alpha = 3, rho = NULL, type = "jackknife"))
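The idea behind Jackknife Repeated Replications can be sketched in base R for a simple weighted mean. The block below uses the textbook delete-one-PSU scheme: one PSU at a time is dropped, the remaining PSUs of its stratum are reweighted, and the squared deviations of the replicate estimates are summed by stratum. The function `jrr_variance`, the simulated design and the interpretation of `f` as a sampling fraction are assumptions made for illustration; the package's fs_var and fm_var may differ in the details.

```r
# Delete-one-PSU jackknife variance of a weighted mean (textbook sketch).
# Hypothetical helper; f is treated as a common sampling fraction.
jrr_variance <- function(y, w, stratum, psu, f = 0) {
  total <- 0
  for (h in unique(stratum)) {
    psus_h <- unique(psu[stratum == h])
    a_h <- length(psus_h)
    if (a_h < 2) next  # a stratum with one PSU cannot contribute
    theta_hj <- sapply(psus_h, function(j) {
      w_rep <- w
      w_rep[stratum == h & psu == j] <- 0   # drop PSU j of stratum h
      keep <- stratum == h & psu != j
      w_rep[keep] <- w_rep[keep] * a_h / (a_h - 1)  # reweight the rest of the stratum
      weighted.mean(y, w_rep)
    })
    total <- total + (1 - f) * (a_h - 1) / a_h * sum((theta_hj - mean(theta_hj))^2)
  }
  total
}

set.seed(7)
design <- expand.grid(unit = 1:10, psu = 1:5, stratum = 1:4)  # 4 strata x 5 PSUs x 10 units
y <- rbeta(nrow(design), 0.5, 1.5)   # simulated membership values
w <- runif(nrow(design), 0.5, 1.5)   # simulated sampling weights
v <- jrr_variance(y, w, design$stratum, design$psu, f = 0.01)
v
```

PSU labels are only required to be unique within a stratum here, because every comparison is made jointly on stratum and PSU.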