Studying machine learning and statistical pattern recognition these days, I’ve learned a nice fact about estimation. The proof is straightforward, but I’d like to remember this fact, so here it is.

**Note.** Maximum likelihood estimation is equivalent to least squares estimation in the presence of Gaussian noise.

Let $y_i = f(x_i; \theta) + \epsilon_i$ for $i = 1, \dots, n$, and let the noise $\epsilon_i$ follow a Gaussian distribution $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$.

In a least squares estimate one minimizes the sum of squared residuals

$$E(\theta) = \sum_{i=1}^{n} \bigl(y_i - f(x_i; \theta)\bigr)^2 \qquad (*)$$

In a maximum likelihood estimate one defines a likelihood

$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{\bigl(y_i - f(x_i; \theta)\bigr)^2}{2\sigma^2}\right)$$

and then minimizes the negative log-likelihood

$$-\log L(\theta) = \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(y_i - f(x_i; \theta)\bigr)^2 + \frac{n}{2}\log\bigl(2\pi\sigma^2\bigr),$$

which, since $\sigma$ is a fixed constant, is minimized by the same $\theta$ as (*). **QED**
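The equivalence is easy to check numerically: because the negative log-likelihood is an affine function of the sum of squared residuals, both objectives pick out the same parameter. Here is a minimal sketch (the linear model, the fixed intercept, and all the numbers are hypothetical choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 1 + 2x + Gaussian noise with sigma = 0.1
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, x.size)

sigma = 0.1
# Candidate slopes (intercept held fixed to keep the sketch one-dimensional)
thetas = np.linspace(1.5, 2.5, 201)

def sse(t):
    # Least squares objective (*): sum of squared residuals
    return np.sum((y - (1.0 + t * x)) ** 2)

def nll(t):
    # Negative log-likelihood under Gaussian noise:
    # sse / (2 sigma^2) plus a constant that does not depend on t
    r = y - (1.0 + t * x)
    return np.sum(r ** 2) / (2 * sigma ** 2) + x.size * np.log(sigma * np.sqrt(2 * np.pi))

i_ls = np.argmin([sse(t) for t in thetas])
i_ml = np.argmin([nll(t) for t in thetas])
assert i_ls == i_ml  # both objectives select the same slope
```

Both criteria rank every candidate identically, so the assertion holds for any data set and any fixed $\sigma$.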