ON ESTIMATING THE COST EFFICIENCY OF THE BRAZILIAN ELECTRICITY DISTRIBUTION UTILITIES USING DEA AND BAYESIAN SFA MODELS

Marcus Vinicius Pereira de Souza
PUC-RJ – Pontifícia Universidade Católica do Rio de Janeiro
Departamento de Engenharia Industrial
Rua Marquês de São Vicente 225 – Gávea
22451-041 – Rio de Janeiro – RJ
[email protected]

Reinaldo Castro Souza
PUC-RJ – Pontifícia Universidade Católica do Rio de Janeiro
Departamento de Engenharia Elétrica
Rua Marquês de São Vicente 225 – Gávea
22451-041 – Rio de Janeiro – RJ
[email protected]

Tara Keshar Nanda Baidya
PUC-RJ – Pontifícia Universidade Católica do Rio de Janeiro
Departamento de Engenharia Industrial
Rua Marquês de São Vicente 225 – Gávea
22451-041 – Rio de Janeiro – RJ
[email protected]

ABSTRACT

The purpose of this study is to evaluate efficiency indices for 60 Brazilian electricity distribution utilities. These scores are obtained by DEA (Data Envelopment Analysis) and Bayesian Stochastic Frontier Analysis models, two techniques that can reduce information asymmetry and improve the regulator's ability to compare the performance of the utilities, a fundamental aspect of incentive regulation schemes. In addition, this paper addresses the problem of identifying outliers and influential observations in deterministic nonparametric DEA models.

Keywords: Data envelopment analysis; Bayesian stochastic frontier analysis; Economic regulation; Outlier identifiers; Influential observations.

1. Introduction

In the Brazilian Electrical Sector (SEB, for short), energy supply tariffs are periodically revised, every 4 to 5 years depending on the distribution utility contract. In the year of the periodic revision, the tariffs are reset to levels compatible with the utility's operational costs and with an adequate return on the investments made by the utility, thereby maintaining its Economic and Financial Equilibrium (EEF, for short).
Over the period between two revisions, the tariffs are annually readjusted by an index named IRT, given by:

IRT = [VPA_1 + VPB_0 (IGPM − X)] / RA_0    (1)

where VPA_1 stands for the utility non-manageable costs (energy purchases and electrical sector taxes) at the date of the readjustment; RA_0 stands for the utility annual revenue estimated with the existing tariff (free of the ICMS tax) at the previous reference date; IGPM is the market price index; and VPB_0 stands for the utility manageable costs (labor, third-party contracts, depreciation, adequate return on invested assets and working capital) at the previous reference date (VPB_0 = RA_0 − VPA_0).

XLI SBPO 2009 - Pesquisa Operacional na Gestão do Conhecimento

As shown in (1), the non-manageable costs (VPA) are entirely passed through to the final tariffs, while the amount related to the manageable costs (VPB) is updated by the IGPM index discounted by the X factor. This factor applies only to the manageable costs and is the mechanism whereby the productivity gains of the utilities are shared with the final consumers through the tariff reduction it introduces. The National Electrical Energy Agency (ANEEL) resolution 55/2004 defines the X factor as the combination of three components (X_E, X_A and X_C), according to the expression below:

X = (X_E + X_C)(IGPM − X_A) + X_A    (2)

The component X_A accounts for the effects of the application of the IPCA index (consumer price index) on the labor component of the VPB. The X_C component is related to the consumer-perceived quality of the utility service, and the X_E component accounts for the expected productivity gains of the utility due to the natural growth of its market.
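As a worked illustration of (1), the sketch below computes the readjustment index from made-up figures; all values, including the accumulated IGPM index and the X factor, are hypothetical assumptions, not data from any utility.

```python
# Hypothetical readjustment computation following equation (1).
# All figures are illustrative, in R$ millions.
VPA1 = 68.0   # non-manageable costs at the readjustment date
VPB0 = 35.0   # manageable costs at the previous reference date
RA0 = 100.0   # annual revenue at the previous reference date (VPA0 + VPB0)
IGPM = 1.08   # accumulated IGPM index over the year (8% inflation)
X = 0.02      # X factor

IRT = (VPA1 + VPB0 * (IGPM - X)) / RA0   # -> 1.051, a 5.1% readjustment
```

Note how the non-manageable part (VPA_1) enters at full value while only the manageable part is inflated by IGPM net of X.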
The X_E component is the most important of the three. Its definition is based on a forward-looking discounted cash flow, set up so that the present value of the utility cash flow over the revision period, plus its residual value, equals the utility asset base at the beginning of the revision period. In summary:

A_0 = Σ_{t=1}^{N} [ (RO_t (1 − X_E)^{t−1} − T_t − OM_t − d_t)(1 − g) + d_t − I_t ] / (1 + r_WACC)^t + A_N / (1 + r_WACC)^N    (3)

where N is the period, in years, between the two revisions; A_0 is the value of the utility assets at the date of the revision; A_N is the utility asset value at the end of the revision period; g is the combined rate of income tax and compulsory social contribution applied to the utility net profit; r_WACC is the weighted average cost of capital; RO_t is the utility operational revenue; T_t represents the various taxes (PIS/PASEP, COFINS and P&D); OM_t are the utility operation and maintenance costs; I_t is the amount of investments realized; and d_t is the depreciation, all referred to year t. The quantities forming the cash flow in (3) are projected according to the criteria proposed by ANEEL resolution 55/2004. For example, the projected operational revenue is obtained as the product of the predicted market and the average updated tariff, while the operational costs (operation, maintenance, administration and management costs) are projected based on the costs of the "Reference Utility", all referred to the date of the tariff revision. To avoid the complexity of the "Reference Utility" approach and to provide an objective way to obtain efficient operational costs, ANEEL envisages the possibility of using benchmarking techniques, among them the efficient frontier method, already adopted by ANEEL to quantify the efficient operational costs of the Brazilian transmission utilities (ANEEL, 2007).
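Since (3) defines X_E only implicitly, it must be found numerically. The sketch below solves for the X_E that equates the present value of the cash flow to the initial asset base by bisection; the 4-year projection is entirely made up (all monetary figures, the tax rate and the WACC are illustrative assumptions, not regulatory data).

```python
# Hypothetical sketch: find the X_E satisfying equation (3) by bisection.
# All monetary figures are made-up, in R$ millions.
N = 4
RO = [120.0, 125.0, 130.0, 135.0]   # projected operational revenue
T  = [6.0, 6.2, 6.4, 6.6]           # PIS/PASEP, COFINS, P&D charges
OM = [40.0, 41.0, 42.0, 43.0]       # operation and maintenance costs
I  = [15.0, 15.0, 15.0, 15.0]       # investments
d  = [10.0, 10.5, 11.0, 11.5]       # depreciation
g = 0.34                            # income tax + social contribution rate
r = 0.10                            # weighted average cost of capital
A0, AN = 250.0, 260.0               # asset base at start / end of the cycle

def pv(xe):
    """Present value of the regulatory cash flow in (3) for a given X_E."""
    total = AN / (1 + r) ** N
    for t in range(1, N + 1):
        ebit = RO[t-1] * (1 - xe) ** (t - 1) - T[t-1] - OM[t-1] - d[t-1]
        cash = ebit * (1 - g) + d[t-1] - I[t-1]
        total += cash / (1 + r) ** t
    return total

# pv is strictly decreasing in X_E, so bisect pv(xe) = A0 on a bracket.
lo, hi = -0.5, 0.5
for _ in range(60):
    mid = (lo + hi) / 2
    if pv(mid) > A0:
        lo = mid
    else:
        hi = mid
xe = (lo + hi) / 2
```

The same scheme works for any monotone relation between X_E and the present value, which is the practical content of the "forward looking" definition.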
The frontier is the geometric locus of optimal production. The straightforward comparison of the frontier with the position of each utility quantifies the improvement each utility should pursue in order to catch up with its peers. The international review conducted by Jamasb and Pollitt (2000) shows that the most important benchmarking approaches used in the regulation of electricity services provided by utilities are based on Data Envelopment Analysis (DEA, Cooper et al., 2000) and Stochastic Frontier Analysis (SFA, Kumbhakar and Lovell, 2000). As noted in Souza (2008), the first method is founded on linear programming, while the second relies on econometric models. Studying cases of the SEB, authors such as Pessanha et al. (2004) and Sollero and Lins (2004) have used different DEA models to evaluate the efficiency of the Brazilian distribution utilities, and Arcoverde et al. (2005) have obtained efficiency indices for the Brazilian distribution utilities using SFA models. Recently, Souza (2008) proposed gauging cost efficiency using a Bayesian Markov Chain Monte Carlo (MCMC) algorithm. DEA and SFA rest on distinct assumptions and present pros and cons depending on the specific application; therefore, no single frontier analysis method can be called "the best" overall. In order to measure efficiency (rather than inefficiency), and to draw meaningful comparisons of efficiency across comparable firms, it is advisable to investigate efficiency indices obtained by several methods on the same data set, as carried out in the present work, where DEA and Bayesian SFA (BSFA hereafter) models are used to evaluate the operational cost efficiency of 60 Brazilian distribution utilities. The paper is organized as follows.
The next section discusses the basic concepts of the DEA and BSFA formulations, together with the Returns to Scale (RTS) question, the problem of detecting outliers and influential observations, and the Gibbs sampler (MCMC) method. Section 3 comments on the results. Conclusions are given in Section 4. The appendix provides the main results obtained by the DEA and BSFA methodologies, summarized in Tables 2 and 3.

2. Methodology and Mathematical Models

2.1 The Deterministic DEA Approach

Data Envelopment Analysis is a mathematical programming approach for assessing the comparative efficiency of a set of organisational units that perform similar tasks and for which inputs and outputs are available. In the DEA terminology, those entities are called Decision Making Units (DMUs). The survey by Allen et al. (1997) reports that DEA was originally proposed by Farrell (1957) and developed, operationalised and popularised by Charnes et al. (1978). Ever since, this technique has been applied in a wide range of empirical work, such as education, banking, health care, public services, military units, electrical energy utilities, and other institutions. Zhu (2003) argues that one of the reasons for this popularity is that DEA can measure relative "technical efficiency" in a multiple-input, multiple-output setting, without the usual information on market prices. Consider the case where there are n DMUs to be evaluated. Each DMU_j (j = 1,...,n) consumes varying amounts of m different inputs x_j = [x_1j ... x_mj]^T ∈ R_+^m to produce s different outputs y_j = [y_1j ... y_sj]^T ∈ R_+^s. The set of feasible combinations of input and output vectors composes the Production Possibility Set (PPS, for short), defined by:

T = { (x, y) ∈ R_+^{m+s} : x can produce y }    (4)

It is informative, here, to recall the study developed by Banker et al. (1984).
In short, they postulated the following properties for the PPS: Postulate 1, Convexity; Postulate 2, Inefficiency; Postulate 3, Ray Unboundedness; Postulate 4, Minimum Extrapolation. After some algebraic manipulation under these four postulates, it is possible to show that the PPS T is given by:

T = { (x, y) : x ≥ Xλ, y ≤ Yλ, λ ≥ 0 }    (5)

where X is the (m × n) input matrix, Y is the (s × n) output matrix and λ is a semipositive vector in R^n. If Postulate 3 is removed from the properties of the PPS, it can be verified that:

T = { (x, y) : x ≥ Xλ, y ≤ Yλ, 1λ = 1, λ ≥ 0 }    (6)

where 1 is the (1 × n) unit vector. A complete presentation of this demonstration, worth reading, can be found in Forni (2002). Such results lead directly to two seminal DEA models. The first invokes the assumption of Constant Returns-to-Scale (CRS) and convex technology (Charnes et al., 1978); the second assumes the hypothesis of Variable Returns-to-Scale (VRS) (Banker et al., 1984). The following section presents methods for measuring the Returns to Scale (RTS) of the technology.

2.2 Returns to Scale

As pointed out in Simar and Wilson (2002), it is very important to examine whether the underlying technology exhibits non-increasing, constant or non-decreasing RTS, and a large literature has developed on the problem of testing hypotheses regarding RTS. For example, Färe and Grosskopf (1985) suggested an approach for determining local RTS in the estimated frontier which compares DEA efficiency estimates obtained under the alternative assumptions of constant, variable, or non-increasing RTS, but did not provide a formal statistical test of returns to scale. Simar and Wilson (2002), in turn, discussed various test statistics and presented bootstrap estimation procedures.
In some situations, it can be convenient to settle the RTS question by estimating the total elasticity (e). Following Coelli et al. (1998), this estimate, attractive for its simplicity, can be computed from the partial elasticity estimates (E_i); note, however, that this approach breaks down in a fully general multi-output, multi-input setting. The partial elasticity E_i is given by:

E_i = (∂y/∂x_i) · (x_i/y)    (7)

From this definition, the total elasticity (e) is expressed as:

e = E_1 + E_2 + ... + E_m    (8)

Once the total elasticity (e) is measured, the returns-to-scale type follows immediately. Following Coelli et al. (1998), three cases are associated with (8): e = 1 ⇒ Constant Returns-to-Scale (CRS); e > 1 ⇒ Non-Decreasing Returns-to-Scale (NDRS); e < 1 ⇒ Non-Increasing Returns-to-Scale (NIRS). The next section focuses on how to choose a feasible DEA model based on the resulting total elasticity.

2.3 DEA Models regarding Returns to Scale

As above, the type of DEA best-practice frontier can be determined through (e). In this context, let the CRS and VRS DEA models be defined in (9) and (10), respectively:

Min { θ : y_0 ≤ Yλ, θ x_0 ≥ Xλ, λ ≥ 0 }    (9)

Min { θ : y_0 ≤ Yλ, θ x_0 ≥ Xλ, 1λ = 1, λ ≥ 0 }    (10)

where λ is an (n × 1) vector of weights to be computed, x_0 is an (m × 1) vector of inputs of DMU_0 and y_0 is an (s × 1) vector of outputs of DMU_0. By inspection of (9) and (10), note that the VRS model (BCC model) differs from the CRS model (CCR model) only in the addition of the condition 1λ = 1. Cooper et al. (2000) point out that this condition, together with the condition λ_j ≥ 0, ∀j, imposes a convexity condition on the allowable ways in which the n DMUs may be combined.
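The envelopment programs (9) and (10) are ordinary linear programs and can be solved with any LP routine. The sketch below uses scipy.optimize.linprog on a made-up single-input, single-output data set (all numbers are illustrative, not the paper's data); the rts argument selects the CRS model (9), the VRS model (10), or the NIRS variant obtained with 1'λ ≤ 1.

```python
# Minimal sketch of the input-oriented envelopment models, solved as LPs.
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 4.0, 3.0, 5.0]])   # (m x n) input matrix, made-up data
Y = np.array([[1.0, 2.0, 1.5, 2.0]])   # (s x n) output matrix, made-up data

def efficiency(j0, rts="crs"):
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]        # decision vector (theta, lambda)
    # Constraints y0 <= Y lambda and X lambda <= theta x0, as "<=" rows.
    A_ub = np.vstack([np.c_[np.zeros((s, 1)), -Y],
                      np.c_[-X[:, [j0]], X]])
    b_ub = np.r_[-Y[:, j0], np.zeros(m)]
    A_eq = b_eq = None
    if rts == "vrs":                    # 1'lambda = 1
        A_eq, b_eq = [np.r_[0.0, np.ones(n)]], [1.0]
    elif rts == "nirs":                 # 1'lambda <= 1
        A_ub = np.vstack([A_ub, np.r_[0.0, np.ones(n)]])
        b_ub = np.r_[b_ub, 1.0]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.x[0]                     # optimal theta for DMU j0

crs_scores = [efficiency(j) for j in range(4)]
```

On this toy data the first three DMUs attain θ* = 1 under CRS, while the fourth is dominated and scores below 1; one LP is solved per DMU, exactly as stated in the text.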
Based on these comments, it may be found in Zhu (2003) that replacing 1λ = 1 with 1λ ≥ 1 yields the Non-Decreasing Returns-to-Scale (NDRS) model, while replacing it with 1λ ≤ 1 yields the Non-Increasing Returns-to-Scale (NIRS) model. The interpretation of these models is straightforward: DEA minimizes the relative efficiency index (θ) of each DMU_0, comparing all DMUs simultaneously, subject to constraints equivalent to (5) and (6). Given the data, one optimization must be carried out for each of the n DMUs. A DMU is said to be fully efficient when θ* = 1, in which case it is located on the efficiency frontier. At this point another question arises: DEA models, by construction, are very sensitive to extreme values and to outliers. Even though Davies and Gather (1993) reasoned that the word outlier has never been given a precise definition, Simar (2003) defined an outlier as an atypical observation, a data point lying outside the cloud of data points. The outlier identification problem is therefore of primary importance and has been investigated extensively in the literature. Moreover, outliers can be influential observations: as stated by Dusansky and Wilson (1995), influential observations are those whose removal from the data results in a dramatic change in parameter estimates. For some interesting discussions about outliers and influential observations, see also Wilson (1993, 1995), Pastor et al. (1999) and Forni (2002). Herein, the Wilson (1993) method is used to help detect potential outliers. This technique generalizes the outlier measure proposed by Andrews and Pregibon (1978) to the case of multiple outputs and incorporates a convexity assumption.
Nevertheless, as seen in Wilson (1995), it becomes computationally infeasible as the number of observations and the dimension of the input-output space increase. This discussion ends by noting that these results will be further exploited in the BSFA context.

2.4 The Statistical Model

Stochastic frontier models (also known in the literature as composed error models) were independently introduced by Meeusen and van den Broeck (1977), Aigner et al. (1977) and Battese and Corra (1977), and have been used in numerous empirical applications. Among the advantages of this approach are: a) identifying outliers in the sample; b) accounting for non-manageable factors in the efficiency measurement. Unfortunately, the method may be restrictive because it imposes a functional form for the technology. This article adopts a stochastic frontier model from a Bayesian point of view. This technique allows inference from data using probabilistic models for both observed and unobserved quantities. Another feature of the BSFA framework is that it enables the expert to include prior knowledge in the model. For these reasons, Bayesian models are more flexible but, in most cases, not analytically tractable, and simulation methods become necessary, the most used being Markov Chain Monte Carlo (MCMC) methods.

2.4.1 Bayesian Stochastic Cost Frontier

The econometric model with composed error for the estimation of the stochastic cost frontier can be expressed as:

y_j = h(x_j; β) exp(v_j + u_j)    (11)

Assuming that h(x_j; β) is linear in the logarithms, the following model is obtained after a log transformation of (11):

ln y_j = β_0 + Σ_{i=1}^{m} β_i ln x_ji + Σ_{i≤k} β_ik ln x_ji ln x_jk + v_j + u_j    (12)

Equation (12) is known in the literature as the Translog function.
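A direct evaluation of the translog specification (12) may help fix notation. The coefficients below are made-up numbers, not estimates; when every cross-term β_ik is set to zero, the expression collapses to a simple sum of log-linear terms.

```python
# Evaluating the translog frontier (12) for one DMU with two inputs
# (m = 2); all coefficients are illustrative, not estimated values.
import math

beta0 = 0.5
beta = [0.4, 0.3]                                        # first-order terms
beta_cross = {(0, 0): 0.02, (0, 1): -0.01, (1, 1): 0.03}  # beta_ik, i <= k

def ln_frontier(x):
    lx = [math.log(v) for v in x]
    out = beta0 + sum(b * l for b, l in zip(beta, lx))
    out += sum(b * lx[i] * lx[k] for (i, k), b in beta_cross.items())
    return out

ln_y = ln_frontier([2.0, 3.0])
# With all beta_ik = 0 the same inputs give the log-linear special case:
log_linear = beta0 + 0.4 * math.log(2.0) + 0.3 * math.log(3.0)
```

The gap between ln_y and log_linear is exactly the contribution of the cross-products, which is the quantity the Translog adds over simpler functional forms.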
When the cross products are null, the particular case known as the Cobb-Douglas function is obtained. With this information, the deterministic part of the frontier can be defined: ln y_j, the natural logarithm of the output of the j-th DMU (j = 1,...,n); ln x_ji, the natural logarithm of the i-th input of the j-th DMU (including the intercept); and β = [β_0 β_1 ... β_m]^T, a vector of unknown parameters to be estimated. In equation (12), the deviation between the observed production level and the deterministic part of the frontier is the combination of two components: u_j, an error that takes only nonnegative values and captures the effect of technical inefficiency, and v_j, a symmetric error that captures any non-manageable random shock. The hypothesis of symmetry for the distribution of v_j is supported by the fact that favorable and unfavorable environmental conditions are equally probable. It is usual to assume that v_j is independent and identically distributed (i.i.d., for short) with a symmetric distribution, usually Gaussian, and independent of u_j. The specification of the component u_j (u_j ≥ 0) is less evident and can be made in several ways: Meeusen and van den Broeck (1977) used the exponential distribution, Aigner et al. (1977) recommended the Half-Normal distribution, Stevenson (1980) proposed the Truncated Normal distribution, Greene (1990) suggested the Gamma distribution and, more recently, Medrano and Migon (2004) used the lognormal distribution. The uncertainty about the distribution of the random term u, as well as about the frontier function itself, suggests the use of Bayesian inference techniques, as presented in the pioneering works of van den Broeck et al. (1994) and Koop et al. (1995). To this end, the sampling distribution is formulated first.
For example, considering the random term v_j ~ iid N(0, σ²), i.e., Normal with mean 0 and variance σ², and u_j ~ iid Γ(1, λ⁻¹), i.e., u_j ~ Exp(λ⁻¹), the joint distribution of y_j and u_j, given x_j and the vector of parameters ψ = [β σ⁻² λ⁻¹]^T, is given by:

p(y_j, u_j | x_j, ψ) = N(y_j | h(x_j; β) + u_j, σ²) · Γ(u_j | 1, λ⁻¹)    (13)

Integrating (13) with respect to u_j, one arrives at the sampling distribution:

p(y_j | x_j, ψ) = λ⁻¹ exp{ −λ⁻¹ (m_j + σ²λ⁻¹/2) } Φ(m_j / σ)    (14)

where m_j = y_j − h(x_j; β) − σ²λ⁻¹ and Φ(·) is the cumulative distribution function of a standard normal random variable. To use the Bayesian approach, prior distributions are assigned to the parameters and, following the hierarchical modeling, posterior distributions are obtained. In principle, the prior distribution of ψ may be arbitrary. However, it is usually inadvisable to incorporate much subjective information, in which case suitably vague yet proper prior specifications are needed. Here, the following prior distributions are considered:

β ~ N_+(0, σ_β² I)    (15)

σ⁻² ~ Γ(n_0/2, c_0/2)    (16)

where Γ(·,·) denotes the Gamma distribution and N_+(·,·) the Truncated Normal distribution. According to Fernandez et al. (1997), it is essential that the prior distribution of σ⁻² be informative (n_0 > 0 and c_0 > 0) in order to ensure the existence of the posterior distribution in stochastic frontier models with cross-section samples. Furthermore, in some cases it is reasonable to identify similar characteristics among the companies evaluated and to include this information in the model. This can be done by specifying, for each DMU, a vector s_j consisting of k exogenous variables s_jl (l = 1,...,k).
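As a numerical sanity check of the marginalization leading to (14), the snippet below integrates the joint density (13) over u_j and compares the result with the closed form; the values of h, σ, λ and y are arbitrary illustrative choices.

```python
# Checking the closed-form marginal (14) against numerical integration
# of the joint density (13). All parameter values are illustrative.
import math
from scipy import integrate
from scipy.stats import norm

h = 2.0        # frontier value h(x_j; beta)
sigma = 0.3    # std. dev. of the symmetric error v_j
lam = 0.5      # mean of the exponential inefficiency term u_j
y = 2.4        # observed (log) cost

def joint(u):
    # N(y | h + u, sigma^2) * Exp(u | rate 1/lam), as in (13)
    return norm.pdf(y, loc=h + u, scale=sigma) * (1 / lam) * math.exp(-u / lam)

numeric, _ = integrate.quad(joint, 0, math.inf)

m = y - h - sigma ** 2 / lam
closed = (1 / lam) * math.exp(-(m + sigma ** 2 / (2 * lam)) / lam) * norm.cdf(m / sigma)
```

The two quantities agree to the quadrature tolerance, confirming that (14) is the density of the normal error convolved with the exponential inefficiency term.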
For the case where such exogenous variables are available, Osiewalski and Steel (1998) proposed the following parameterization for the mean efficiency:

λ_j = ∏_{l=1}^{k} φ_l^{−s_jl}    (17)

where the φ_l > 0 are unknown parameters and, by construction, s_j1 ≡ 1. If the s_jl are dummy variables and k > 1, the distributions of u_j may differ across j; Koop et al. (1997) call this specification the Varying Efficiency Distribution (VED, for short) model. If k = 1, then λ_j = φ_1⁻¹ and all inefficiency terms are independent samples from the same distribution; according to Osiewalski and Steel (1998), this special case is the Common Efficiency Distribution (CED, for short) model. Regarding the prior distribution of the k parameters of the efficiency distribution, Koop et al. (1997) suggested using φ_l ~ Γ(a_l, g_l) with a_l = g_l = 1 for l = 2,...,k, a_1 = 1 and g_1 = −ln r*, where r* ∈ (0,1) is a hyperparameter to be chosen. According to van den Broeck et al. (1994), in the CED model r* can be interpreted as the prior median efficiency. Proceeding this way ensures that the VED model is consistent with the CED model. It is now possible to present the posterior full conditional distributions of the parameters involved in the model:

p(σ⁻² | y, x, s, u, β, φ) = Γ( (n + n_0)/2 , [c_0 + Σ_j (y_j − h(x_j; β) − u_j)²]/2 )    (18)

p(β | y, x, s, u, σ⁻², φ) ∝ N_+(β | 0, σ_β² I) exp{ −(σ⁻²/2) Σ_j (y_j − h(x_j; β) − u_j)² }    (19)

The posterior full conditional distribution of φ_l (l = 1,...,k) has the following general form:

p(φ_l | y, x, s, u, β, σ⁻², φ_(−l)) ∝ exp( −φ_l Σ_j u_j D_jl ) · Γ( φ_l | 1 + Σ_j s_jl , g_l )    (20)

where:

D_jl = ∏_{i≠l} φ_i^{s_ji}    (21)

for l = 1,...,k (D_j1 = 1 for k = 1), and φ_(−l) denotes φ without its l-th element.
With regard to the inefficiencies, it can be shown that they are distributed as a Truncated Normal distribution:

p(u_j | y_j, x_j, s_j, β, σ⁻², φ) = Φ(m_j/σ)⁻¹ N(u_j | m_j, σ²) 1(u_j ≥ 0),  with m_j = y_j − h(x_j; β) − σ²λ_j⁻¹    (22)

As the posterior full conditional distribution of u is known, the Gibbs sampler can be used to generate observations from the joint posterior density, and these observations can be used to make inferences about the unknown quantities of interest. It is worth remembering that the technical efficiency of each DMU is obtained as r_j = exp(−u_j).

2.4.2 The Gibbs Sampler (MCMC) Algorithm

According to Gamerman (1997), the Gibbs sampler was originally designed in the context of image reconstruction and belongs to a large class of stochastic simulation schemes that use Markov chains. Although it is a special case of the Metropolis-Hastings algorithm, it has two distinguishing features: all generated points are accepted, and the full conditional distributions must be known. The full conditional distribution is the distribution of the i-th component of the parameter vector ψ, conditional on all the other components. Again following Gamerman (1997), the Gibbs sampler is essentially an iterative sampling scheme of a Markov chain whose transition kernel is formed by the full conditional distributions. To describe the algorithm, suppose that the distribution of interest is p(ψ), where ψ = (ψ_1, ..., ψ_d). Each component ψ_i can be a scalar, a vector or a matrix. It should be emphasized that p does not necessarily need to be a posterior distribution. The implementation of the algorithm follows these steps (Gamerman, 1997):

i. initialize the iteration counter of the chain at t = 1 and set initial values ψ^(0) = (ψ_1^(0), ..., ψ_d^(0));

ii. obtain a new value ψ^(t) = (ψ_1^(t), ..., ψ_d^(t)) from ψ^(t−1) through successive generation of values:

   ψ_1^(t) ~ p(ψ_1 | ψ_2^(t−1), ..., ψ_d^(t−1))
   ψ_2^(t) ~ p(ψ_2 | ψ_1^(t), ψ_3^(t−1), ..., ψ_d^(t−1))
   ⋮
   ψ_d^(t) ~ p(ψ_d | ψ_1^(t), ..., ψ_{d−1}^(t))

iii. change the counter t to t + 1 and return to step (ii) until convergence is reached.

Thus, each iteration is completed after d moves along the coordinate axes of the components of ψ. After convergence, the resulting values form a sample from p(ψ). Ehlers (2005) emphasizes that, even in problems of large dimension, univariate or block simulations are used, which is in general a computational advantage. This has contributed significantly to the adoption of this methodology, especially in applied econometrics with a Bayesian emphasis.

3. Experimental Results and Interpretation

To evaluate efficiency, the utilities have been characterized by the 4 indicators listed in Table 1. The products are the cost drivers of the operational costs: the amount of energy distributed (MWh) is a proxy for total production, the number of consumer units (NC) is a proxy for the quantity of services provided, and the network extension (KM) reflects the spread of consumers within the concession area, an important element of the operational costs. The first step is to identify the outliers among the utilities. To do so, the Wilson (1993) method was applied to the information on the 60 Brazilian utilities using FEAR 1.11 (a software library that can be linked to the general-purpose statistical package R)³. The following utilities were flagged as outliers: CEEE, CELPA, PIRATININGA, BANDEIRANTES, CEB, CELESC, CELG, CEMAT, CEMIG, COPEL, CPFL, ELETROPAULO, ENERSUL and LIGHT.
It can be observed that this technique flagged the utilities with the largest markets, with geographical concentration and a strong industrial share; for instance, BANDEIRANTES, CEMIG, COPEL, CPFL, ELETROPAULO, ENERSUL and LIGHT. The efficiency measurements were obtained with NIRS DEA models because the total elasticity (e) is less than 1 (see sections 2.2 and 2.3).

Table 1 – Input and output variables.

Type                              Variable  Description
Input (DEA) / dependent (BSFA)    OPEX      Operational expenditure (R$ 1,000)
Output (DEA) / independent (BSFA) MWh       Energy distributed
Output (DEA) / independent (BSFA) NC        Number of consumer units
Output (DEA) / independent (BSFA) KM        Distribution network length

The scores for each of the 60 DMUs are exhibited in Table 2 in the appendix. They were calculated using the DEA Excel Solver developed by Zhu (2003). By analyzing the scores obtained under R1 in Table 2, it can be observed that nine companies lie on the best-practice frontier. Note also that seven of them (PIRATININGA, BANDEIRANTES, CEMIG, COPEL, CPFL, ELETROPAULO and ENERSUL) were labeled as outliers. Given these results, these seven DMUs will be treated as influential observations. A good strategy to verify this is to run another NIRS model separately on the set of outlier DMUs (14 companies) and on the remaining DMUs. Focusing on the results listed in Table 2, it can be ascertained that the performance of the inefficient DMUs that had the outlier companies as benchmarks (see R1) changed considerably (refer to R2): the average relative efficiency variation for this set of DMUs is approximately 10%. This case can be expressed mathematically as:

ΔEff_o = (1/31) Σ_{j_o} (R2_{j_o} − R1_{j_o}) / R1_{j_o}    (23)

with j_o ∈ {2, 5, 6, 7, 8, 10, 13, 14, 17, 18, 20, 21, 24, 25, 28, 31, 33, 34, 35, 40, 41, 42, 43, 46, 52, 54, 55, 56, 57, 59, 60}. For the other inefficient DMUs, the variation is approximately 1%.
Mathematically:

ΔEff = (1/20) Σ_j (R2_j − R1_j) / R1_j    (24)

with j ∈ {3, 4, 9, 11, 16, 23, 32, 36, 37, 38, 39, 44, 45, 47, 48, 49, 50, 51, 53, 58}.

With respect to the econometric methodology, a specification must be chosen for the cost frontier. To this end, a Cobb-Douglas functional form was adopted, defined by:

ln OPEX_j = β_0 + β_1 ln MWh_j + β_2 ln NC_j + β_3 ln KM_j + v_j + u_j    (25)

As mentioned in section 2.4.1, it is useful for the expert to incorporate information about the companies into the model. Accordingly, by inspection of R2 in Table 2, two pieces of information can be extracted: (i) the set of efficient NIRS DEA utilities; and (ii) the prior median efficiency. The first suggests the use of a dummy variable, while the second provides r* = 0.620 (the approximate median of the scores in the column headed "Adjusted input-oriented NIRS efficiencies (R2)" of Table 2).

³ The FEAR package is available at: http://www.economics.clemson.edu/faculty/wilson/Software/FEAR

The Bayesian model was run in the free software WinBUGS (Bayesian inference Using Gibbs Sampling for Windows), which can be downloaded at www.mrc-bsu.cam.ac.uk/bugs/Welcome.htm. In this context, the chain was run with a burn-in of 20,000 iterations, 50,000 retained draws and thinning to every 7th draw. WinBUGS has a number of tools for reporting the posterior distribution; a simple summary (see Table 3 in the appendix) shows the posterior mean, median and standard deviation with a 95% posterior credible interval. The estimated coefficients are significant. Convergence of the parameters was verified using serial autocorrelation graphs.
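The serial-autocorrelation diagnostic mentioned above can be sketched with a short, self-contained example; the chain below is a synthetic AR(1) series standing in for retained MCMC draws (all values illustrative, not output of the WinBUGS run).

```python
# Autocorrelation diagnostic on a synthetic correlated chain that stands
# in for retained MCMC draws; phi and the chain length are illustrative.
import random

random.seed(0)
phi = 0.6
chain = [0.0]
for _ in range(49999):
    chain.append(phi * chain[-1] + random.gauss(0.0, 1.0))

def autocorr(x, lag):
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    cov = sum((x[t] - mu) * (x[t + lag] - mu) for t in range(n - lag)) / n
    return cov / var

acf = [autocorr(chain, k) for k in (1, 5, 10)]
# For an AR(1) chain the lag-k autocorrelation decays as phi**k, so a
# rapidly decaying ACF indicates good mixing of the retained draws.
```

A slowly decaying ACF would instead suggest heavier thinning or a longer burn-in, which is precisely what the graphs are inspected for.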
Finally, the Pearson correlation coefficients as well as the Spearman rank-order correlation coefficients computed between the model estimates (R1, R2 and R3) are statistically significant at the 5% level and range from 83 to 96 percent.

4. Conclusions

The efficiency measurements obtained by the DEA and Bayesian SFA models should translate into reductions in operational costs. In line with what has been exposed, the potential reduction of the operational costs of the j-th utility is OPEX_j × (1 − θ_j); equivalently, the operational cost recognized by the regulator is OPEX_j × θ_j. For the next tariff revision cycles, the Brazilian regulator ANEEL has signaled the possibility of using DEA and SFA models in the estimation of efficient operational costs, an important element in the determination of the X factor of the utilities. The two approaches rely on different assumptions: DEA is deterministic, and deviations from the efficient frontier are attributed solely to the utilities' inefficiency, whereas SFA has a stochastic nature and provides efficiency estimates free of the impacts of uncontrollable random factors that affect the DMUs. In this work, it can be ascertained that the joint analysis of DEA and Stochastic Frontier models in the Bayesian approach is fundamental; indeed, this is demonstrated through the easy incorporation of prior ideas and the formal treatment of parameter and model uncertainty.

Appendix

Table 2 – Efficiency scores (θ_j).
DMU | Utility | Input-oriented efficiency (R1) | Benchmarks (R1) | Adjusted input-oriented NIRS efficiency (R2) | Benchmarks (R2)
1 | AES-SUL | 1,000 | – | 1,000 | –
2 | CEAL | 0,603 | 19, 26 | 0,605 | 13, 19
3 | CEEE | 0,273 | 1, 19 | 0,295 | 12, 26
4 | CELPA | 0,362 | 1, 19 | 0,379 | 12, 26
5 | CELTINS | 0,377 | 19, 26 | 0,457 | 13, 40, 54
6 | CEPISA | 0,657 | 19, 26 | 0,678 | 13, 19
7 | CERON | 0,431 | 1, 19, 26 | 0,503 | 13, 40, 54
8 | COSERN | 0,832 | 1, 19, 26 | 0,835 | 13, 19
9 | ENERGIPE | 0,698 | 1, 19 | 0,698 | 1, 19
10 | ESCELSA | 0,680 | 1, 19, 26 | 0,682 | 1, 13, 40
11 | MANAUS | 0,381 | 1, 19 | 0,381 | 1, 19
12 | PIRATININGA | 1,000 | – | 1,000 | –
13 | RGE | 0,997 | 1, 19, 26 | 1,000 | –
14 | SAELPA | 0,881 | 19, 26 | 0,889 | 13, 19
15 | BANDEIRANTES | 1,000 | – | 1,000 | –
16 | CEB | 0,287 | 1, 19 | 0,314 | 12, 26
17 | CELESC | 0,576 | 1, 15, 22, 27 | 0,604 | 12, 15, 26, 30
18 | CELG | 0,532 | 1, 26, 30 | 0,533 | 15, 22, 26, 30
19 | CELPE | 1,000 | – | 1,000 | –
20 | CEMAR | 0,675 | 19, 26 | 0,688 | 13, 19
21 | CEMAT | 0,458 | 1, 26, 30 | 0,485 | 12, 26, 30
22 | CEMIG | 1,000 | – | 1,000 | –
23 | CERJ | 0,744 | 1, 19 | 0,744 | 1, 19
24 | COELBA | 0,757 | 1, 19, 26 | 1,000 | –
25 | COELCE | 0,795 | 19, 26 | 0,852 | 13, 19, 28
26 | COPEL | 1,000 | – | 1,000 | –
27 | CPFL | 1,000 | – | 1,000 | –
28 | ELEKTRO | 0,968 | 1, 22, 26, 27 | 1,000 | –
29 | ELETROPAULO | 1,000 | – | 1,000 | –
30 | ENERSUL | 1,000 | – | 1,000 | –
31 | LIGHT | 0,856 | 26, 27, 29 | 0,856 | 26, 27, 29
32 | BOA VISTA | 0,190 | 1, 19 | 0,190 | 1, 19
33 | BRAGANTINA | 0,433 | 1, 19, 26 | 0,433 | 1, 13, 19
34 | CAIUÁ | 0,449 | 1, 19, 26 | 0,449 | 1, 13, 19
35 | CAT-LEO | 0,611 | 1, 19, 30 | 0,841 | 13, 54
36 | CEA | 0,315 | 1, 19 | 0,315 | 1, 19
37 | CELB | 0,706 | 1, 19 | 0,706 | 1, 19
38 | CENF | 0,505 | 1, 19 | 0,505 | 1, 19
39 | CFLO | 0,521 | 1, 19 | 0,521 | 1, 19
40 | CHESP | 0,807 | 26, 30 | 1,000 | –
41 | COCEL | 0,508 | 1, 19, 26 | 0,509 | 1, 13, 19
42 | CPEE | 0,516 | 1, 19, 26 | 0,536 | 13, 40
43 | CSPE | 0,621 | 1, 19, 26 | 0,645 | 13, 40
44 | DEMEI | 0,621 | 1, 19 | 0,621 | 1, 19
45 | ELETROACRE | 0,570 | 19 | 0,570 | 19
46 | ELETROCAR | 0,479 | 1, 26, 30 | 0,526 | 13, 40
47 | JAGUARI | 0,594 | 1 | 0,594 | 1
48 | JOÃO CESA | 0,493 | 1 | 0,493 | 1
49 | MOCOCA | 0,501 | 1, 19 | 0,501 | 1, 19
50 | MUXFELDT | 0,760 | 1, 19 | 0,760 | 1, 19
51 | NACIONAL | 0,588 | 1, 19 | 0,588 | 1, 19
52 | NOVA PALMA | 0,721 | 1, 26, 30 | 0,830 | 13, 40
53 | PANAMBI | 0,375 | 1, 19 | 0,375 | 1, 19
54 | POÇOS DE CALDAS | 0,662 | 1, 26, 30 | 1,000 | –
55 | SANTA CRUZ | 0,483 | 1, 19, 26 | 0,511 | 13, 40
56 | SANTA MARIA | 0,573 | 1, 26, 30 | 0,719 | 13, 40, 54
57 | SULGIPE | 0,812 | 19, 26 | 0,915 | 13, 40
58 | URUSSANGA | 0,268 | 1 | 0,268 | 1
59 | V. PARANAPANEMA | 0,398 | 1, 19, 26 | 0,407 | 13, 19
60 | XANXERÊ | 0,315 | 1, 26, 30 | 0,325 | 1, 13, 40
(A dash indicates a DEA-efficient utility, which serves as its own benchmark.)

Table 3 – Results of efficiencies obtained by BSFA.

DMU | Utility | Bayesian efficiency (R3) | S.D. | 2,50% | Median | 97,50%
1 | AES-SUL | 0,977 | 0,030 | 0,888 | 0,988 | 1,000
2 | CEAL | 0,784 | 0,135 | 0,506 | 0,796 | 0,989
3 | CEEE | 0,516 | 0,157 | 0,286 | 0,486 | 0,911
4 | CELPA | 0,572 | 0,160 | 0,322 | 0,546 | 0,942
5 | CELTINS | 0,606 | 0,161 | 0,347 | 0,585 | 0,955
6 | CEPISA | 0,754 | 0,144 | 0,469 | 0,761 | 0,986
7 | CERON | 0,712 | 0,151 | 0,431 | 0,708 | 0,980
8 | COSERN | 0,890 | 0,089 | 0,670 | 0,911 | 0,997
9 | ENERGIPE | 0,873 | 0,099 | 0,637 | 0,885 | 0,996
10 | ESCELSA | 0,896 | 0,086 | 0,685 | 0,918 | 0,997
11 | MANAUS | 0,720 | 0,153 | 0,433 | 0,720 | 0,982
12 | PIRATININGA | 0,973 | 0,036 | 0,870 | 0,987 | 1,000
13 | RGE | 0,976 | 0,033 | 0,882 | 0,988 | 1,000
14 | SAELPA | 0,851 | 0,109 | 0,598 | 0,872 | 0,994
15 | BANDEIRANTES | 0,965 | 0,051 | 0,814 | 0,984 | 1,000
16 | CEB | 0,558 | 0,160 | 0,314 | 0,531 | 0,936
17 | CELESC | 0,795 | 0,132 | 0,516 | 0,810 | 0,990
18 | CELG | 0,707 | 0,153 | 0,424 | 0,704 | 0,980
19 | CELPE | 0,969 | 0,043 | 0,841 | 0,986 | 1,000
20 | CEMAR | 0,751 | 0,144 | 0,467 | 0,757 | 0,985
21 | CEMAT | 0,705 | 0,154 | 0,422 | 0,701 | 0,980
22 | CEMIG | 0,964 | 0,052 | 0,811 | 0,985 | 1,000
23 | CERJ | 0,841 | 0,115 | 0,580 | 0,862 | 0,994
24 | COELBA | 0,812 | 0,126 | 0,538 | 0,829 | 0,992
25 | COELCE | 0,963 | 0,054 | 0,802 | 0,984 | 1,000
26 | COPEL | 0,970 | 0,042 | 0,845 | 0,986 | 1,000
27 | CPFL | 0,970 | 0,042 | 0,850 | 0,986 | 1,000
28 | ELEKTRO | 0,972 | 0,038 | 0,863 | 0,986 | 1,000
29 | ELETROPAULO | 0,955 | 0,069 | 0,744 | 0,982 | 1,000
30 | ENERSUL | 0,965 | 0,050 | 0,818 | 0,984 | 1,000
31 | LIGHT | 0,828 | 0,121 | 0,556 | 0,848 | 0,993
32 | BOA VISTA | 0,430 | 0,147 | 0,233 | 0,398 | 0,836
33 | BRAGANTINA | 0,828 | 0,119 | 0,563 | 0,845 | 0,993
34 | CAIUÁ | 0,762 | 0,141 | 0,483 | 0,769 | 0,987
35 | CAT-LEO | 0,830 | 0,119 | 0,564 | 0,848 | 0,993
36 | CEA | 0,612 | 0,160 | 0,352 | 0,592 | 0,956
37 | CELB | 0,876 | 0,098 | 0,642 | 0,898 | 0,996
38 | CENF | 0,780 | 0,137 | 0,498 | 0,791 | 0,988
39 | CFLO | 0,846 | 0,112 | 0,591 | 0,867 | 0,994
40 | CHESP | 0,964 | 0,050 | 0,818 | 0,984 | 1,000
41 | COCEL | 0,874 | 0,095 | 0,637 | 0,896 | 0,996
42 | CPEE | 0,870 | 0,100 | 0,634 | 0,893 | 0,996
43 | CSPE | 0,896 | 0,086 | 0,684 | 0,917 | 0,997
44 | DEMEI | 0,843 | 0,115 | 0,577 | 0,865 | 0,994
45 | ELETROACRE | 0,788 | 0,134 | 0,507 | 0,801 | 0,989
46 | ELETROCAR | 0,844 | 0,113 | 0,586 | 0,864 | 0,994
47 | JAGUARI | 0,862 | 0,105 | 0,614 | 0,885 | 0,995
48 | JOÃO CESA | 0,866 | 0,106 | 0,609 | 0,890 | 0,995
49 | MOCOCA | 0,846 | 0,113 | 0,587 | 0,866 | 0,994
50 | MUXFELDT | 0,905 | 0,082 | 0,699 | 0,927 | 0,997
51 | NACIONAL | 0,842 | 0,114 | 0,581 | 0,862 | 0,994
52 | NOVA PALMA | 0,908 | 0,079 | 0,707 | 0,929 | 0,997
53 | PANAMBI | 0,766 | 0,142 | 0,478 | 0,776 | 0,988
54 | POÇOS DE CALDAS | 0,962 | 0,054 | 0,799 | 0,983 | 1,000
55 | SANTA CRUZ | 0,821 | 0,122 | 0,551 | 0,839 | 0,992
56 | SANTA MARIA | 0,841 | 0,114 | 0,581 | 0,862 | 0,994
57 | SULGIPE | 0,862 | 0,105 | 0,612 | 0,884 | 0,995
58 | URUSSANGA | 0,623 | 0,166 | 0,346 | 0,603 | 0,964
59 | V. PARANAPANEMA | 0,733 | 0,148 | 0,451 | 0,734 | 0,983
60 | XANXERÊ | 0,737 | 0,148 | 0,450 | 0,740 | 0,984

References

AIGNER, D.; LOVELL, K.; SCHMIDT, P. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, n. 6, p. 21-37, 1977.
ANDREWS, D. F.; PREGIBON, D. Finding the outliers that matter. Journal of the Royal Statistical Society, Series B, v. 40, p. 85-93, 1978.
ALLEN, R. et al. Weight restrictions and value judgements in data envelopment analysis: evolution, development and future directions. Annals of Operations Research, v. 73, p. 13-34, 1997.
ARCOVERDE, F. D.; TANNURI-PIANTO, M. E.; SOUSA, M. C. S. Mensuração das eficiências das distribuidoras do setor energético brasileiro usando fronteiras estocásticas. In: XXXIII ENCONTRO NACIONAL DE ECONOMIA, Natal, RN, 2005, Anais.
ANEEL. Resolução Normativa nº 55/2004, 5 de abril de 2004.
ANEEL. Nota Técnica nº 166/2006 – SRE/ANEEL, 19 de maio de 2006a.
ANEEL. Nota Técnica nº 262/2006 – SER/SFF/SRD/SFE/SRC/ANEEL, 19 de outubro de 2006b.
ANEEL. Nota Técnica nº 125/2007 – SRE/ANEEL, 11 de maio de 2007.
BANKER, R. D.; CHARNES, A.; COOPER, W. W. Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science, v. 30, n. 9, p. 1078-1092, 1984.
BATTESE, G. E.; CORRA, G. S.
Estimation of a production frontier model with application to the pastoral zone of eastern Australia. Australian Journal of Agricultural Economics, n. 21, p. 169-179, 1977.
CHARNES, A.; COOPER, W. W.; RHODES, E. Measuring the efficiency of decision-making units. European Journal of Operational Research, n. 2, p. 429-444, 1978.
COELLI, T.; RAO, D. S. P.; BATTESE, G. E. An Introduction to Efficiency and Productivity Analysis. Boston, MA: Kluwer Academic, 1998.
COOPER, W. W.; SEIFORD, L. M.; TONE, K. Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software. Kluwer Academic Publishers, 2000.
DAVIES, L.; GATHER, U. The identification of multiple outliers. Journal of the American Statistical Association, v. 88, n. 423, p. 782-792, 1993.
DUSANSKY, R.; WILSON, P. W. On the relative efficiency of alternative modes of producing a public sector output: the case of the developmentally disabled. European Journal of Operational Research, n. 80, p. 608-618, 1995.
EHLERS, R. S. Métodos computacionais intensivos no R. Disponível em: <http://leg.ufpr.br/~ehlers/ce718/praticas/praticas.html>. Acesso em: janeiro de 2005.
FÄRE, R.; GROSSKOPF, S. A nonparametric cost approach to scale efficiency. Scandinavian Journal of Economics, n. 87, p. 594-604, 1985.
FARRELL, M. J. The measurement of productive efficiency. Journal of the Royal Statistical Society, n. 120, p. 253-281, 1957.
FERNÁNDEZ, C.; OSIEWALSKI, J.; STEEL, M. F. J. On the use of panel data in stochastic frontier models with improper priors. Journal of Econometrics, n. 79, p. 169-193, 1997.
FORNI, A. L. C. On the detection of outliers in data envelopment analysis methodology. 2002. Dissertação de Mestrado em Engenharia Mecânica-Aeronáutica, Instituto Tecnológico de Aeronáutica, São José dos Campos.
GAMERMAN, D. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference. London: Chapman and Hall, 1997.
GREENE, W. H. A gamma-distributed stochastic frontier model.
Journal of Econometrics, n. 46, p. 141-164, 1990.
JAMASB, T.; POLLITT, M. Benchmarking and regulation: international electricity experience. Utilities Policy, v. 9, n. 3, p. 107-130, 2000.
KOOP, G.; OSIEWALSKI, J.; STEEL, M. F. J. Posterior analysis of stochastic frontier models using Gibbs sampling. Computational Statistics, n. 10, p. 353-373, 1995.
KOOP, G.; OSIEWALSKI, J.; STEEL, M. F. J. Bayesian efficiency analysis through individual effects: hospital cost frontiers. Journal of Econometrics, n. 76, p. 77-105, 1997.
KUMBHAKAR, S. C.; LOVELL, C. A. K. Stochastic Frontier Analysis. Cambridge University Press, 2000.
MEDRANO, L. A. T.; MIGON, H. S. Critérios baseados na "deviance" para a comparação de modelos Bayesianos de fronteira de produção estocástica. Rio de Janeiro: UFRJ, 2004. (Technical Report 176).
MEEUSEN, W.; VAN DEN BROECK, J. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, n. 18, p. 435-444, 1977.
OSIEWALSKI, J.; STEEL, M. F. J. Numerical tools for the Bayesian analysis of frontier models. Journal of Productivity Analysis, n. 10, p. 103-117, 1998.
PASTOR, J. T.; RUIZ, J. L.; SIRVENT, I. A statistical test for detecting influential observations in DEA. European Journal of Operational Research, v. 115, n. 3, p. 542-554, 1999.
PESSANHA, J. F. M.; SOUZA, R. C.; LAURENCEL, L. C. Usando DEA na avaliação da eficiência operacional das distribuidoras do setor elétrico brasileiro. In: XII CONGRESSO LATINO-IBEROAMERICANO DE INVESTIGACION DE OPERACIONES Y SISTEMAS, Ciudad de La Habana, CUBA, 2004, Anais.
SIMAR, L.; WILSON, P. W. Non-parametric tests of returns to scale. European Journal of Operational Research, n. 139, p. 115-132, 2002.
SIMAR, L. Detecting outliers in frontier models: a simple approach. Journal of Productivity Analysis, n. 20, p. 391-424, 2003.
STEVENSON, R. E.
Likelihood functions for generalized stochastic frontier estimation. Journal of Econometrics, n. 13, p. 57-66, 1980.
SOLLERO, M. V. K.; LINS, M. P. E. Avaliação da eficiência de distribuidoras de energia elétrica através da análise envoltória de dados com restrições aos pesos. In: XXXVI SIMPÓSIO BRASILEIRO DE PESQUISA OPERACIONAL, São João Del Rei, MG: SOBRAPO, 2004, Anais.
SOUZA, M. V. P. Uma abordagem Bayesiana para o cálculo dos custos operacionais eficientes das distribuidoras de energia elétrica. 2008. Tese de Doutorado em Engenharia Elétrica, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro.
VAN DEN BROECK, J. et al. Stochastic frontier models: a Bayesian perspective. Journal of Econometrics, n. 61, p. 273-303, 1994.
WILSON, P. W. Detecting outliers in deterministic non-parametric frontier models with multiple outputs. Journal of Business and Economic Statistics, n. 11, p. 319-323, 1993.
WILSON, P. W. Detecting influential observations in data envelopment analysis. The Journal of Productivity Analysis, n. 6, p. 27-45, 1995.
ZHU, J. Quantitative Models for Performance Evaluation and Benchmarking. Massachusetts: Springer, 2003.