lecture_11

# Mean Estimation and Stochastic Approximation


Mehryar Mohri - Foundations of Machine Learning

## Mean Estimation

**Theorem:** Let $X$ be a random variable taking values in $[0, 1]$ and let $x_0, \ldots, x_m$ be i.i.d. values of $X$. Define the sequence $(\mu_m)_{m \in \mathbb{N}}$ by

$$\mu_{m+1} = (1 - \alpha_m)\,\mu_m + \alpha_m x_m,$$

with $\mu_0 = x_0$, $\alpha_m \in [0, 1]$, $\sum_{m \ge 0} \alpha_m = +\infty$, and $\sum_{m \ge 0} \alpha_m^2 < +\infty$. Then,

$$\mu_m \xrightarrow{\ \text{a.s.}\ } \mathbb{E}[X].$$

**Proof:** By the independence assumption, for $m \ge 0$,

$$\operatorname{Var}[\mu_{m+1}] = (1 - \alpha_m)^2 \operatorname{Var}[\mu_m] + \alpha_m^2 \operatorname{Var}[x_m] \le (1 - \alpha_m)\operatorname{Var}[\mu_m] + \alpha_m^2.$$

We have $\alpha_m \to 0$ since $\sum_{m \ge 0} \alpha_m^2 < +\infty$.

- Let $\epsilon > 0$ and suppose there exists $N \in \mathbb{N}$ such that $\operatorname{Var}[\mu_m] \ge \epsilon$ for all $m \ge N$. Then, for $m \ge N$,

  $$\operatorname{Var}[\mu_{m+1}] \le \operatorname{Var}[\mu_m] - \alpha_m \epsilon + \alpha_m^2,$$

  which implies

  $$\operatorname{Var}[\mu_{m+N}] \le \operatorname{Var}[\mu_N] - \epsilon \sum_{n=N}^{m+N} \alpha_n + \sum_{n=N}^{m+N} \alpha_n^2 \;\longrightarrow\; -\infty \quad \text{as } m \to \infty,$$

  contradicting $\operatorname{Var}[\mu_{m+N}] \ge 0$.

- Thus, for all $N \in \mathbb{N}$ there exists $m_0 \ge N$ such that $\operatorname{Var}[\mu_{m_0}] < \epsilon$. Choose $N$ large enough so that $\alpha_m \le \epsilon$ for all $m \ge N$. Then,

  $$\operatorname{Var}[\mu_{m_0 + 1}] \le (1 - \alpha_{m_0})\,\epsilon + \epsilon\, \alpha_{m_0} = \epsilon.$$

- Therefore, $\operatorname{Var}[\mu_m] \le \epsilon$ for all $m \ge m_0$ ($L^2$ convergence).

## Notes

- Special case $\alpha_m = \frac{1}{m}$: this recovers the strong law of large numbers.
- Connection with stochastic approximation.

## Stochastic Approximation

**Problem:** find a solution of $x = H(x)$ with $x \in \mathbb{R}^N$ when $H(x)$ cannot be computed, e.g., $H$ is not accessible; an i.i.d. sample of noisy observations $H(x_i) + w_i$ is available, $i \in [1, m]$, with $\mathbb{E}[w] = 0$.

**Idea:** algorithm based on the iterative technique

$$x_{t+1} = (1 - \alpha_t)\,x_t + \alpha_t \big[H(x_t) + w_t\big] = x_t + \alpha_t \big[H(x_t) + w_t - x_t\big],$$

or, more generally,

$$x_{t+1} = x_t + \alpha_t D(x_t, w_t).$$
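The mean estimation update can be sketched in a few lines of Python. The uniform distribution, the sample size, and the step-size choice $\alpha_m = \frac{1}{m+1}$ (which makes the iterate exactly the sample mean, as in the strong-law special case) are illustrative assumptions, not part of the lecture:

```python
import random

def mean_estimate(samples):
    """Iterative mean estimation: mu_{m+1} = (1 - alpha_m) * mu_m + alpha_m * x_m.

    Uses alpha_m = 1/(m+1), which satisfies sum(alpha_m) = +inf and
    sum(alpha_m^2) < +inf, and reduces the iterate to the running sample mean.
    """
    mu = samples[0]  # mu_0 = x_0
    for m in range(1, len(samples)):
        alpha = 1.0 / (m + 1)
        mu = (1 - alpha) * mu + alpha * samples[m]
    return mu

random.seed(0)
xs = [random.random() for _ in range(100_000)]  # i.i.d. X uniform on [0, 1], E[X] = 0.5
print(mean_estimate(xs))                        # close to 0.5
```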
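The stochastic approximation iteration above can likewise be sketched under stated assumptions: a hypothetical target $H(x) = \cos(x)$ (chosen for illustration; its unique fixed point is $\approx 0.739$), uniform zero-mean noise, and steps $\alpha_t = \frac{1}{t+1}$:

```python
import math
import random

def stochastic_approximation(H, x0, steps, noise):
    """Iterate x_{t+1} = x_t + alpha_t * [H(x_t) + w_t - x_t] with alpha_t = 1/(t+1).

    H is only observed through noisy evaluations H(x_t) + w_t, E[w] = 0.
    """
    x = x0
    for t in range(steps):
        alpha = 1.0 / (t + 1)
        w = noise()                         # zero-mean observation noise
        x = x + alpha * (H(x) + w - x)
    return x

random.seed(1)
# Hypothetical example: solve x = cos(x); the fixed point is ~0.739085.
x_star = stochastic_approximation(
    H=math.cos,
    x0=0.0,
    steps=200_000,
    noise=lambda: random.uniform(-0.1, 0.1),
)
print(x_star)  # close to 0.739085
```

The decaying step sizes average out the noise, so the iterate tracks the noiseless fixed-point iteration in the long run.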