# M-Estimators and Z-Estimators

Guy Lebanon


M-estimators and Z-estimators (also called estimating equations) are natural extensions of the MLE. They enjoy similar consistency and are asymptotically normal, although sometimes with higher asymptotic variance. There are several reasons for studying these estimators: (a) they may be more computationally efficient than the MLE, (b) they may be more robust (resistant to deviations from the assumptions) than the MLE, and (c) they can be analyzed using techniques that do not assume the true model is within the assumed parametric family. We follow lines similar to [2], where more details may be found.

We assume that we have samples $X_1,\ldots,X_n \stackrel{\text{iid}}{\sim} P_0 \in \mathcal{Q}$, and we consider a parametric family $\{P_\theta : \theta \in \Theta\} \subset \mathcal{Q}$ for the purpose of approximating $P_0$. Note that $P_0$ is not necessarily a member of $\{P_\theta\}$.

The M-estimator associated with a given function $m_\theta(x)$ is

$$\hat\theta_n = \arg\max_{\theta\in\Theta} M_n(\theta), \qquad M_n(\theta) = \frac{1}{n}\sum_{i=1}^n m_\theta(X_i).$$

The Z-estimator associated with a given vector-valued function $\psi_\theta = (\psi_{\theta,1},\ldots,\psi_{\theta,l}) : \mathcal{X} \to \mathbb{R}^l$ is the value of $\theta$ in $\Theta$ satisfying the Z-equations

$$\Psi_n(\theta) = 0, \qquad \Psi_n(\theta) = \frac{1}{n}\sum_{i=1}^n \psi_\theta(X_i).$$

The two estimators are equivalent if $m_\theta$ is concave and smooth in $\theta$ and $\psi_{\theta,i}(x) = \partial m_\theta(x)/\partial\theta_i$. The case $m_\theta(x) = \log p_\theta(x)$, or $\psi_{\theta,i}(x) = \partial \log p_\theta(x)/\partial\theta_i$, reduces M- or Z-estimators to the MLE. In some cases it is convenient to work with M-estimators and in other cases with Z-estimators.

## Consistency

Consistency, in this case, corresponds to the convergence of the M-estimator $\hat\theta_n$ to

$$\theta_0 \stackrel{\text{def}}{=} \arg\max_{\theta\in\Theta} M(\theta), \qquad M(\theta) \stackrel{\text{def}}{=} \mathbb{E}_{P_0}\, m_\theta(X).$$

Note that this does not mean convergence to the truth, since $P_0$ may lie outside $\{P_\theta\}$. Rather, we have convergence to $\theta_0$ — the "projection" of $P_0$ on $\{P_\theta\}$. Note that in the case of the MLE, the projection is in the KL-divergence sense: $\theta_0 = \arg\min_\theta \mathrm{KL}(P_0 \,\|\, P_\theta)$. The proposition below is in terms of M-estimators, but a similar one holds for Z-estimators.

**Proposition 1 ([2]).** Assume

$$\sup_{\theta\in\Theta} \big|M_n(\theta) - M(\theta)\big| \stackrel{P}{\to} 0$$

(law of large numbers convergence is uniform over $\Theta$), and for all $\epsilon > 0$,

$$\sup_{\theta :\, d(\theta,\theta_0) \ge \epsilon} M(\theta) < M(\theta_0).$$
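A minimal sketch of the M-/Z-estimator correspondence (not from the note; the Huber loss and the toy data are illustrative choices): Huber's location estimator maximizes $M_n$ for the concave loss $m_\theta(x) = -\rho_k(x-\theta)$, and its Z-equation uses $\psi_\theta(x) = \partial m_\theta(x)/\partial\theta = \mathrm{clip}(x-\theta, -k, k)$. Since $\Psi_n$ is monotone non-increasing in $\theta$, the Z-equation can be solved by bisection.

```python
# Huber location as a Z-estimator: psi_theta(x) = clip(x - theta, -k, k),
# the derivative of the (negative) Huber loss; k = 1.345 is the conventional
# tuning constant. Solve Psi_n(theta) = (1/n) sum_i psi_theta(X_i) = 0.

def psi(u, k=1.345):
    """Huber psi: identity near 0, clipped at +/- k (bounds each point's pull)."""
    return max(-k, min(k, u))

def z_estimate(xs, k=1.345, tol=1e-10):
    """Root of Psi_n by bisection; Psi_n is non-increasing in theta."""
    lo, hi = min(xs), max(xs)           # Psi_n(lo) >= 0 >= Psi_n(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(psi(x - mid, k) for x in xs) > 0:
            lo = mid                    # root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

data = [0.9, 1.1, 1.0, 0.8, 1.2, 50.0]  # five points near 1, one gross outlier
theta_hat = z_estimate(data)            # approx 1.269
naive_mean = sum(data) / len(data)      # approx 9.17, dragged by the outlier
```

Unlike the sample mean (the Z-estimator for $\psi_\theta(x) = x - \theta$), the clipped $\psi$ bounds each observation's contribution, so the single outlier barely moves the estimate — an instance of robustness reason (b) above.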
Then for any sequence of estimators $\hat\theta_n$ with

$$M_n(\hat\theta_n) \ge M_n(\theta_0) - o_P(1) \tag{1}$$

we have $\hat\theta_n \stackrel{P}{\to} \theta_0$.

The first condition is satisfied by the uniform strong law of large numbers, for example if $\Theta$ is compact, $m_\theta(x)$ is continuous in $\theta$ for all $x$, and $|m_\theta(x)| < K(x)$ for all $\theta, x$ for some function $K$ with $\mathbb{E}_{P_0} K(X) < \infty$ [1]. There are other, less restrictive conditions. The second condition corresponds to the maximizer $\theta_0$ being isolated from the rest of the function and may be easily verified by examining the shape of the function $M$; for example, it holds for $M$ concave and continuous over a compact set $\Theta$. The assertion (1) is trivially satisfied if $\hat\theta_n$ is an M-estimator (maximizes $M_n$).

*Proof.* The uniform convergence of $M_n$ to $M$ implies $M_n(\theta_0) \stackrel{P}{\to} M(\theta_0)$, and since (1) we have $M_n(\hat\theta_n) \ge M(\theta_0) - o_P(1)$ and

$$M(\theta_0) - M(\hat\theta_n) \le M_n(\hat\theta_n) - M(\hat\theta_n) + o_P(1) \le \sup_{\theta\in\Theta}\big|M_n(\theta) - M(\theta)\big| + o_P(1) = o_P(1).$$

By the second assumption, for every $\epsilon > 0$ there exists $\eta > 0$ with $M(\theta) < M(\theta_0) - \eta$ for every $\theta$ for which $d(\theta, \theta_0) \ge \epsilon$. Thus, the event $\{d(\hat\theta_n, \theta_0) \ge \epsilon\}$ is contained in $\{M(\hat\theta_n) < M(\theta_0) - \eta\}$, whose probability converges to 0. ∎

## Asymptotic Normality

We prove below that Z-estimators (the zero of the vector-valued $\Psi_n(\theta) = \frac{1}{n}\sum_{i=1}^n \psi_\theta(X_i)$) are asymptotically Gaussian with mean the zero of $\Psi(\theta) \stackrel{\text{def}}{=} \mathbb{E}_{P_0}\,\psi_\theta(X)$, which we denote by $\theta_0$. We denote the matrix of partial derivatives of $\psi_\theta$ with respect to $\theta$ by $\dot\psi_\theta$. The result below reduces to the standard MLE asymptotic normality if $\psi_\theta = \nabla_\theta \log p_\theta$ and $P_0 \in \{P_\theta\}$, but is more general since it applies to general Z-estimators and does not assume $P_0 \in \{P_\theta\}$. A similar result may be stated for M-estimators.

**Proposition 2.** We assume that $\Theta \subset \mathbb{R}^l$ is open and convex, $\Psi(\theta_0) = 0$ with $\mathbb{E}_{P_0}\|\psi_{\theta_0}(X)\|^2 < \infty$, $\mathbb{E}_{P_0}\dot\psi_{\theta_0}$ exists and is non-singular, and $|\partial^2 \psi_{\theta,k}(x)/\partial\theta_i\partial\theta_j| \le \ddot\psi(x)$ for all $i,j,k$ and $\theta$ in a neighborhood of $\theta_0$, for some integrable $\ddot\psi$. Then every consistent estimator sequence $\hat\theta_n$ for which $\Psi_n(\hat\theta_n) = 0$ satisfies

$$\sqrt{n}\,(\hat\theta_n - \theta_0) = -\big(\mathbb{E}_{P_0}\dot\psi_{\theta_0}\big)^{-1}\, \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi_{\theta_0}(X_i) + o_P(1) \tag{2}$$

and

$$\sqrt{n}\,(\hat\theta_n - \theta_0) \rightsquigarrow N\Big(0,\; \big(\mathbb{E}_{P_0}\dot\psi_{\theta_0}\big)^{-1}\, \mathbb{E}_{P_0}\big[\psi_{\theta_0}\psi_{\theta_0}^\top\big]\, \big(\mathbb{E}_{P_0}\dot\psi_{\theta_0}\big)^{-\top}\Big). \tag{3}$$

*Proof.* By Taylor's theorem there exists a random vector $\tilde\theta_n$ on the line segment between $\theta_0$ and $\hat\theta_n$ for which

$$0 = \Psi_n(\hat\theta_n) = \Psi_n(\theta_0) + \dot\Psi_n(\theta_0)(\hat\theta_n - \theta_0) + \tfrac{1}{2}(\hat\theta_n - \theta_0)^\top \ddot\Psi_n(\tilde\theta_n)(\hat\theta_n - \theta_0),$$

which we re-arrange as

$$\Big(\dot\Psi_n(\theta_0) + \tfrac{1}{2}(\hat\theta_n - \theta_0)^\top \ddot\Psi_n(\tilde\theta_n)\Big)\sqrt{n}\,(\hat\theta_n - \theta_0) = -\sqrt{n}\,\Psi_n(\theta_0). \tag{4}$$

Since $\dot\Psi_n(\theta_0)$ converges by the law of large numbers to $\mathbb{E}_{P_0}\dot\psi_{\theta_0}$, and $\ddot\Psi_n(\tilde\theta_n)$ converges to a matrix of bounded values (its entries are dominated by the integrable $\ddot\psi$ in a neighborhood of $\theta_0$, which $\tilde\theta_n$ enters with probability tending to one, for large $n$), Equation (4) becomes

$$\Big(\mathbb{E}_{P_0}\dot\psi_{\theta_0} + o_P(1) + o_P(1)O_P(1)\Big)\sqrt{n}\,(\hat\theta_n - \theta_0) = \Big(\mathbb{E}_{P_0}\dot\psi_{\theta_0} + o_P(1)\Big)\sqrt{n}\,(\hat\theta_n - \theta_0) = -\sqrt{n}\,\Psi_n(\theta_0),$$

since $\hat\theta_n - \theta_0 = o_P(1)$ and $o_P(1)O_P(1) = o_P(1)$ (the notation $O_P(1)$ denotes stochastically bounded, and it applies to $\ddot\Psi_n(\tilde\theta_n)$ as described above). The matrix $\mathbb{E}_{P_0}\dot\psi_{\theta_0} + o_P(1)$ converges to a non-singular matrix, and multiplying by its inverse proves (2).

Equation (3) follows from (2) by noticing that $\frac{1}{\sqrt{n}}\sum_{i=1}^n \psi_{\theta_0}(X_i)$ is $\sqrt{n}$ times an average of iid random vectors with expectation 0. Applying Slutsky's theorem followed by the central limit theorem to the right-hand side establishes normality, while a simple calculation establishes the variance in (3). ∎

If we neglect the remainder in (2), the (asymptotic) influence function is

$$\hat\theta(X_1,\ldots,X_{n-1},z) \approx \theta_0 - \big(\mathbb{E}_{P_0}\dot\psi_{\theta_0}\big)^{-1}\,\frac{1}{n}\bigg(\sum_{i=1}^{n-1}\psi_{\theta_0}(X_i) + \psi_{\theta_0}(z)\bigg), \tag{5}$$

so the first-order effect of a single observation $z$ on the estimator is $-\frac{1}{n}\big(\mathbb{E}_{P_0}\dot\psi_{\theta_0}\big)^{-1}\psi_{\theta_0}(z)$.

## References

[1] T. S. Ferguson. *A Course in Large Sample Theory*. Chapman & Hall, 1996.

[2] A. W. van der Vaart. *Asymptotic Statistics*. Cambridge University Press, 1998.
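The sandwich variance of Proposition 2 can be checked numerically. The following Monte Carlo sketch is not part of the original note: the Huber $\psi$, the contaminated-normal $P_0$ (which lies outside any Gaussian location family, so this is exactly the misspecified setting), and all constants are illustrative choices. It compares the empirical variance of $\sqrt{n}\,(\hat\theta_n - \theta_0)$ across replications with the plug-in sandwich $\big(\mathbb{E}\dot\psi_{\theta_0}\big)^{-1}\mathbb{E}\big[\psi_{\theta_0}^2\big]\big(\mathbb{E}\dot\psi_{\theta_0}\big)^{-1}$.

```python
import random

K = 1.345  # conventional Huber tuning constant

def psi(u, k=K):
    """Huber psi: clip(u, -k, k)."""
    return max(-k, min(k, u))

def z_estimate(xs, k=K, tol=1e-6):
    """Root of Psi_n(theta) = (1/n) sum_i psi(X_i - theta) by bisection."""
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(psi(x - mid, k) for x in xs) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample(n, rng):
    """P0: 90% N(0,1) + 10% N(0, 3^2); symmetric, so theta_0 = 0."""
    return [rng.gauss(0, 1 if rng.random() < 0.9 else 3) for _ in range(n)]

rng = random.Random(0)
n, reps = 200, 800

# Empirical variance of sqrt(n) * (theta_hat - theta_0) over replications.
ests = [z_estimate(sample(n, rng)) for _ in range(reps)]
mean_est = sum(ests) / reps
emp_var = n * sum((t - mean_est) ** 2 for t in ests) / reps

# Plug-in sandwich from one large sample: E[psi'] = -P(|X| < K) (derivative
# w.r.t. theta), so the sandwich variance is E[psi^2] / P(|X| < K)^2.
big = sample(100_000, rng)
p_inside = sum(1 for x in big if abs(x) < K) / len(big)
b_moment = sum(psi(x) ** 2 for x in big) / len(big)
sandwich = b_moment / p_inside ** 2

print(f"empirical nVar: {emp_var:.3f}   sandwich: {sandwich:.3f}")
```

The two numbers should agree up to Monte Carlo error, illustrating that (3) describes the estimator's spread even though the assumed family does not contain $P_0$.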


