Posts

Showing posts from August 15, 2018

Understanding the difference between Ridge and LASSO

I asked this question a few days ago on the statistics site of this network, but although it has received a fair number of views, I got no answer. If this kind of double posting is inappropriate, let me know and I'll delete one of them.

I'll start by presenting my current understanding of regularized loss minimization. Generally, we'd be trying to directly minimize an empirical loss, something of the form

$$\underset{h\in\mathcal{H}}{\arg\min}\sum_{i=1}^{m}L\left(y_i,h\left(x_i\right)\right)$$

where there are $m$ samples $x_1,\ldots,x_m$ labeled by $y_1,\ldots,y_m$ and $L$ is some loss function. In regularized loss minimization we are trying to minimize

$$\underset{h\in\mathcal{H}}{\arg\min}\sum_{i=1}^{m}L\left(y_i,h\left(x_i\right)\right)+\lambda\mathcal{R}\left(h\right)$$

where $\mathcal{R}:\mathcal{H}\to\mathbb{R}$ is some regularizer, which penalizes a hypothesis $h$ for its complexity, and $\lambda$ is a parameter controlling the severity o...
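
To make the regularized objective concrete, here is a minimal sketch that evaluates it for one fixed hypothesis. It assumes things the question does not state: a linear hypothesis $h(x)=\langle w,x\rangle$, squared loss for $L$, and toy data. The names `predict`, `regularized_objective`, `l1_penalty`, and `l2_penalty` are illustrative only. With the squared $\ell_2$ norm as the regularizer this is the ridge objective, and with the $\ell_1$ norm it is the LASSO objective.

```python
def squared_loss(y, y_hat):
    # L(y, h(x)) for a single sample; squared loss is an assumption here.
    return (y - y_hat) ** 2

def predict(w, x):
    # Assumed linear hypothesis h(x) = <w, x>.
    return sum(wj * xj for wj, xj in zip(w, x))

def l2_penalty(w):
    # Ridge regularizer: squared Euclidean norm of the weights.
    return sum(wj ** 2 for wj in w)

def l1_penalty(w):
    # LASSO regularizer: sum of absolute values of the weights.
    return sum(abs(wj) for wj in w)

def regularized_objective(w, xs, ys, lam, penalty):
    # Empirical loss over the m samples plus lambda times the regularizer.
    empirical = sum(squared_loss(y, predict(w, x)) for x, y in zip(xs, ys))
    return empirical + lam * penalty(w)

# Toy data (made up for illustration): w = (1, 2) fits it exactly,
# so the empirical loss is zero and only the penalty term remains.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [1.0, 2.0, 3.0]
w = (1.0, 2.0)

ridge_value = regularized_objective(w, xs, ys, lam=0.5, penalty=l2_penalty)
lasso_value = regularized_objective(w, xs, ys, lam=0.5, penalty=l1_penalty)
print(ridge_value, lasso_value)
```

This only evaluates the objective at a given $w$; it does not perform the minimization over $\mathcal{H}$. The two variants differ solely in the `penalty` argument, which mirrors how $\mathcal{R}$ is the only term that changes between them.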