regression-leastsquares

import%20marimo%0A%0A__generated_with%20%3D%20%220.15.2%22%0Aapp%20%3D%20marimo.App()%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22!%5BMOSEK%20ApS%5D(https%3A%2F%2Fwww.mosek.com%2Fstatic%2Fimages%2Fbranding%2Fwebgraphmoseklogocolor.png%20)%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%20Least-squares%20regression%20with%20MOSEK%0A%0A%20%20%20%20Regression%20belongs%20to%20the%20basic%20toolbox%20of%20statistics%2C%20finance%2C%20machine%20learning%2C%20biology%20and%20many%20other%20disciplines%20that%20involves%20constructing%20approximate%20models%20of%20data.%20In%20this%20notebook%20we%20show%20how%20to%20quickly%20construct%20some%20and%20solve%20some%20standard%20regression%20models%20using%20the%20*Fusion*%20interface%20for%20Python.%0A%0A%20%20%20%20Table%20of%20contents%3A%0A%0A%20%20%20%20*%20least-squares%20(LSE%2C%20mean)%20regression%0A%20%20%20%20*%20regularization%20(ridge%20and%20lasso)%0A%20%20%20%20*%20robust%20regression%20with%20Huber%20loss%20function%0A%0A%20%20%20%20Other%20related%20material%3A%0A%0A%20%20%20%20*%20%5BRegression%20techniques%20for%20portfolio%20optimization%20using%20MOSEK%5D(https%3A%2F%2Farxiv.org%2Fabs%2F1310.3397)%0A%20%20%20%20*%20Another%20MOSEK%20notebook%20on%20regression%20under%20LP-norms%0A%20%20%20%20*%20Future%20MOSEK%20notebooks%20will%20discuss%20LAD%20(least%20absolute%20deviation%2C%20%24L_1%24%20or%20median)%20regression%2C%20quantile%20regression%20and%20mixed-integer%20piecewise-linear%20regression%20models.%0A%0A%20%20%20%20A%20general%20linear%20regression%20model%20approximates%20the%20relation%20between%20a%20vector%20of%20%24d%24%20independent%20variables%20(features)%20%20%24x%3D(x_1%2C%5Cldots%2Cx_d)%24%20and%20a%20dependent%20variable%20%24y%24.%20(To%20allow%20for%20intercept%20we%20are%20often%20going%20to%20assume%20%24x_1%3D1%24).%20Suppose%20we%20have%20%24n%24%20input%20vectors%20%24x%24%20arranged%20in%20the%20rows%20of%20a%20matrix%20%24%5Cmathbf%7BX%7D%5Cin%5Cmathbb%7BR%7D%5E%7Bn%5Ctimes%20d%7D%24%20and%20%24n%24%20corresponding%20observations%20%24y%24%20arranged%20in%20a%20vector%20%24%5Cmathbf%7By%7D%5Cin%5Cmathbb%7BR%7D%5En%24.%20The%20goal%20is%20to%20find%20a%20vector%20of%20weights%20%24%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5E%7Bd%7D%24%20so%20that%20in%20some%20approximate%20sense%20we%20can%20write%0A%0A%20%20%20%20%24%24%5Cmathbf%7By%7D%5Capprox%20%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D.%24%24%0A%0A%20%20%20%20We%20call%20%24%5Cmathbf%7Br%7D%3D%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%24%20the%20*residual*%20and%20we%20aim%20to%20minimize%20some%20penalty%20function%20%24%5Cphi(%5Cmathbf%7Br%7D)%24.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_()%3A%0A%20%20%20%20from%20mosek.fusion%20import%20Model%2C%20Domain%2C%20Expr%2C%20ObjectiveSense%2C%20Matrix%2C%20Var%2C%20SolutionStatus%0A%20%20%20%20import%20mosek.fusion.pythonic%0A%20%20%20%20import%20numpy%2C%20sys%0A%20%20%20%20import%20matplotlib.pyplot%20as%20plt%0A%20%20%20%20import%20matplotlib.cm%20as%20cm%0A%20%20%20%20return%20Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense%2C%20numpy%2C%20plt%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%20Least-squares%20regression%0A%0A%20%20%20%20In%20the%20simplest%20*unconstrained%20least%20squares%20problem*%20(LSE%2C%20mean%20regression)%20the%20penalty%20function%20is%0A%0A%20%20%20%20%24%24%5Cphi(%5Cmathbf%7Br%7D)%3D%5C%7C%5Cmathbf%7Br%7D%5C%7C_2%3D%5Csqrt%7B%5Csum_%7Bi%3D1%7D%5En%20r_i%5E2%7D%24%24%0A%0A%20%20%20%20and%20solving%20the%20least-squares%20regression%20problem%20is%20equivalent%20to%20the%20minimization%20problem%0A%0A%20%20%20%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5C%7C%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%5C%7C_2.%24%24%0A%0A%20%20%20%20A%20conic%20formulation%20of%20this%20minimization%20problem%20is%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Cbegin%7Barray%7D%7Brl%7D%0A%20%20%20%20%5Cmathrm%7Bminimize%7D%20%26%20%20%20%20%20%20%20%20%20%20%20%20t%20%5C%5C%0A%20%20%20%20%5Cmathrm%7Bs.t.%7D%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%20(t%2C%20%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D)%20%5Cin%20%5Cmathcal%7BQ%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%20%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5E%7Bd%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%20t%5Cin%5Cmathbb%7BR%7D%20%0A%20%20%20%20%5Cend%7Barray%7D%0A%20%20%20%20%24%24%0A%0A%20%20%20%20This%20formulation%20can%20be%20translated%20directly%20into%20a%20*Fusion*%20model%3A%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense)%3A%0A%20%20%20%20def%20lse(X%2C%20y)%3A%0A%20%20%20%20%20%20%20%20n%2C%20d%20%3D%20(len(X)%2C%20len(X%5B0%5D))%0A%20%20%20%20%20%20%20%20M%20%3D%20Model('LSE')%0A%20%20%20%20%20%20%20%20w%20%3D%20M.variable('w'%2C%20d)%0A%20%20%20%20%20%20%20%20t%20%3D%20M.variable('t')%0A%20%20%20%20%20%20%20%20r%20%3D%20y%20-%20X%20%40%20w%0A%20%20%20%20%20%20%20%20M.constraint(Expr.vstack(t%2C%20r)%2C%20Domain.inQCone())%0A%20%20%20%20%20%20%20%20M.objective(ObjectiveSense.Minimize%2C%20t)%0A%20%20%20%20%20%20%20%20return%20M%0A%20%20%20%20return%20(lse%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Least%20squares%20example%0A%20%20%20%20As%20an%20example%20we%20use%20the%20basic%20model%20to%20estimate%20a%20simple%20polynomial%20regression%20problem%20with%20synthetic%20data%20points%20sampled%20from%20a%20degree%20%243%24%20planar%20curve%20with%20Gaussian%20error.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(lse%2C%20numpy%2C%20plt)%3A%0A%20%20%20%20def%20_()%3A%0A%20%20%20%20%20%20%20%20N%20%3D%20100%0A%20%20%20%20%20%20%20%20sigma%20%3D%200.03%0A%20%20%20%20%20%20%20%20x%20%3D%20numpy.sort(numpy.random.uniform(0.0%2C%201.0%2C%20N))%0A%0A%20%20%20%20%20%20%20%20def%20f(t)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20return%201.5%20*%20t%20**%203%20-%202%20*%20t%20**%202%20%2B%200.5%20*%20t%20-%201%0A%0A%20%20%20%20%20%20%20%20y%20%3D%20f(x)%20%2B%20numpy.random.normal(0.0%2C%20sigma%2C%20N)%0A%20%20%20%20%20%20%20%20X%20%3D%20%5B%5B1%2C%20t%2C%20t%20**%202%2C%20t%20**%203%5D%20for%20t%20in%20x%5D%0A%20%20%20%20%20%20%20%20M%20%3D%20lse(X%2C%20y)%0A%20%20%20%20%20%20%20%20M.solve()%0A%20%20%20%20%20%20%20%20w%20%3D%20M.getVariable('w').level()%0A%20%20%20%20%20%20%20%20plt.scatter(x%2C%20y)%0A%20%20%20%20%20%20%20%20_ticks%20%3D%20numpy.linspace(0%2C%201%2C%20100)%0A%20%20%20%20%20%20%20%20plt.plot(_ticks%2C%20f(_ticks)%2C%20color%3D'black')%0A%20%20%20%20%20%20%20%20plt.plot(_ticks%2C%20sum(%5Bw%5Bi%5D%20*%20_ticks%20**%20i%20for%20i%20in%20range(4)%5D)%2C%20color%3D'red')%0A%20%20%20%20%20%20%20%20return%20plt.show()%0A%20%20%20%20_()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22Note%20that%20the%20unconstrained%20model%20leaves%20open%20numerous%20possibilities.%20It%20wold%20be%20just%20as%20easy%20to%20phrase%20the%20problem%20without%20intercepts%2C%20add%20additional%20linear%20or%20conic%20constraints%20on%20%24%5Cmathbf%7Bw%7D%24%2C%20compute%20various%20statistics%20of%20the%20solution%20and%20so%20on.%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%20Regularisation%0A%0A%20%20%20%20Regularisation%20is%20a%20technique%20which%20helps%20stabilize%20computational%20results%2C%20which%20tend%20to%20be%20sensitive%20to%20small%20perturbations%20when%20the%20matrix%20%24%5Cmathbf%7BX%7D%24%20has%20nearly%20dependent%20columns.%20In%20machine%20learning%20regularisation%20is%20a%20major%20technique%20that%20%20helps%20avoid%20overfitting.%20A%20general%20regularized%20least%20squares%20regression%20problem%20is%20usually%20written%20in%20the%20form%0A%0A%20%20%20%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5C%7C%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%5C%7C_2%5E2%20%2B%20%5Clambda%5Cphi'(T%5Cmathbf%7Bw%7D)%24%24%0A%0A%20%20%20%20where%20%24T%24%20is%20a%20linear%20operator%2C%20for%20example%20%24T%5Cmathbf%7Bw%7D%3D%5Cmathbf%7Bw%7D%24%20or%20%24T%5Cmathbf%7Bw%7D%3D%5Cmathbf%7Bw%7D-%5Cmathbf%7Bw%7D_0%24%2C%20%24%5Cphi'%24%20is%20an%20additional%20penalty%20function%20and%20%24%5Clambda%24%20controls%20the%20tradeoff%20between%20regularisation%20and%20fit.%20The%20two%20main%20regularisation%20terms%20occurring%20in%20practice%20are%3A%0A%0A%20%20%20%20*%20quadratic%2C%20also%20known%20as%20*ridge%20regression*%2C%20leading%20to%20the%20problem%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5C%7C%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%5C%7C_2%5E2%20%2B%20%5Clambda%5C%7CT%5Cmathbf%7Bw%7D%5C%7C_2%5E2%24%24%0A%20%20%20%20*%20linear%2C%20also%20known%20as%20*lasso%20(least%20absolute%20shrinkage%20and%20selection%20operator)*%2C%20leading%20to%20the%20problem%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5C%7C%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%5C%7C_2%5E2%20%2B%20%5Clambda%5C%7CT%5Cmathbf%7Bw%7D%5C%7C_1.%24%24%20When%20%24T%3D%5Cmathrm%7Bid%7D%24%20lasso%20regression%20tends%20to%20give%20preference%20to%20sparser%20solutions.%0A%20%20%20%20*%20*Elastic%20net*%2C%20a%20combination%20of%20quadratic%20and%20linear%20regularisation%20terms%3A%0A%20%20%20%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5C%7C%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%5C%7C_2%5E2%20%2B%20%5Clambda_1%5C%7CT_1%5Cmathbf%7Bw%7D%5C%7C_1%20%2B%20%5Clambda_2%5C%7CT_2%5Cmathbf%7Bw%7D%5C%7C_2%5E2.%24%24%0A%0A%20%20%20%20Below%20is%20the%20most%20general%20version%20formulated%20as%20a%20conic%20model%2C%20for%20simplicity%20with%20both%20%24T_1%24%20and%20%24T_2%24%20being%20the%20identity.%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Cbegin%7Barray%7D%7Brl%7D%0A%20%20%20%20%5Ctext%7Bminimize%7D%20%20%20%20%20%20%26%20%20%20t%20%2B%20%5Clambda_1%20p_%7B%5Cmathrm%7Blasso%7D%7D%20%2B%20%5Clambda_2%20p_%7B%5Cmathrm%7Bridge%7D%7D%20%5C%5C%0A%20%20%20%20%5Ctext%7Bsubject%20to%3A%20residual%7D%20%26%20%20(0.5%2C%20t%2C%20%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D)%20%5Cin%20%5Cmathcal%7BQ%7D_%7B%5Cmathrm%7Brot%7D%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5E%7Bd%7D%20%5C%5C%0A%20%20%20%20%5Ctext%7Blasso%7D%20%20%20%20%26%20%20p_%7B%5Cmathrm%7Blasso%7D%7D%20%5Cgeq%20%5Cmathbf%7Bp%7D_1%2B%5Ccdots%2B%5Cmathbf%7Bp%7D_d%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20-%5Cmathbf%7Bp%7D%20%5Cleq%20%5Cmathbf%7Bw%7D%20%5Cleq%20%5Cmathbf%7Bp%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%5Cmathbf%7Bp%7D%20%5Cin%20%5Cmathbb%7BR%7D%5Ed%20%5C%5C%0A%20%20%20%20%5Ctext%7Bridge%7D%20%20%20%20%26%20%20(0.5%2C%20p_%7B%5Cmathrm%7Bridge%7D%7D%2C%20%5Cmathbf%7Bw%7D)%20%5Cin%20%5Cmathcal%7BQ%7D_%7B%5Cmathrm%7Brot%7D%7D%0A%20%20%20%20%5Cend%7Barray%7D%0A%20%20%20%20%24%24%0A%0A%20%20%20%20Again%2C%20we%20have%20a%20straightforward%20translation%20to%20a%20*Fusion*%20model%3A%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense)%3A%0A%20%20%20%20def%20lassoVar(M%2C%20w%2C%20d)%3A%0A%20%20%20%20%20%20%20%20p%20%3D%20M.variable('p'%2C%20d)%0A%20%20%20%20%20%20%20%20plasso%20%3D%20M.variable('plasso')%0A%20%20%20%20%20%20%20%20M.constraint(plasso%20%3E%3D%20Expr.sum(p))%0A%20%20%20%20%20%20%20%20M.constraint(p%20%3E%3D%20w)%0A%20%20%20%20%20%20%20%20M.constraint(p%20%3E%3D%20-w)%0A%20%20%20%20%20%20%20%20return%20plasso%0A%0A%20%20%20%20def%20ridgeVar(M%2C%20w)%3A%0A%20%20%20%20%20%20%20%20pridge%20%3D%20M.variable('pridge')%0A%20%20%20%20%20%20%20%20M.constraint(Expr.vstack(0.5%2C%20pridge%2C%20w)%2C%20Domain.inRotatedQCone())%0A%20%20%20%20%20%20%20%20return%20pridge%0A%0A%20%20%20%20def%20lseReg(X%2C%20y%2C%20lambda1%3D0%2C%20lambda2%3D0)%3A%0A%20%20%20%20%20%20%20%20n%2C%20d%20%3D%20(len(X)%2C%20len(X%5B0%5D))%0A%20%20%20%20%20%20%20%20M%20%3D%20Model('LSE-REG')%0A%20%20%20%20%20%20%20%20w%20%3D%20M.variable('w'%2C%20d)%0A%20%20%20%20%20%20%20%20t%20%3D%20M.variable('t')%0A%20%20%20%20%20%20%20%20r%20%3D%20y%20-%20X%20%40%20w%0A%20%20%20%20%20%20%20%20M.constraint(Expr.vstack(0.5%2C%20t%2C%20r)%2C%20Domain.inRotatedQCone())%0A%20%20%20%20%20%20%20%20objExpr%20%3D%20t.asExpr()%0A%20%20%20%20%20%20%20%20if%20lambda1%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20objExpr%20%3D%20objExpr%20%2B%20lambda1%20*%20lassoVar(M%2C%20w%2C%20d)%0A%20%20%20%20%20%20%20%20if%20lambda2%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20objExpr%20%3D%20objExpr%20%2B%20lambda2%20*%20ridgeVar(M%2C%20w)%0A%20%20%20%20%20%20%20%20M.objective(ObjectiveSense.Minimize%2C%20objExpr)%0A%20%20%20%20%20%20%20%20return%20M%0A%20%20%20%20return%20(lseReg%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Regularized%20least%20squares%20example%0A%20%20%20%20We%20consider%20the%20same%20dataset%20we%20had%20previously%2C%20but%20this%20time%20we%20try%20to%20fit%20a%20polynomial%20of%20very%20high%20degree.%20This%20will%20lead%20to%20overfitting%2C%20which%20can%20then%20be%20reduced%20by%20varying%20the%20regularization%20parameters%20%24%5Clambda%24.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(lseReg%2C%20numpy%2C%20plt)%3A%0A%20%20%20%20def%20_()%3A%0A%20%20%20%20%20%20%20%20N%20%3D%2025%0A%20%20%20%20%20%20%20%20degree%20%3D%2015%0A%20%20%20%20%20%20%20%20sigma%20%3D%200.03%0A%20%20%20%20%20%20%20%20x%20%3D%20numpy.sort(numpy.random.uniform(0.0%2C%201.0%2C%20N))%0A%0A%20%20%20%20%20%20%20%20def%20f(t)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20return%201.5%20*%20t%20**%203%20-%202%20*%20t%20**%202%20%2B%200.5%20*%20t%20-%201%0A%20%20%20%20%20%20%20%20y%20%3D%20f(x)%20%2B%20numpy.random.normal(0.0%2C%20sigma%2C%20N)%0A%20%20%20%20%20%20%20%20X%20%3D%20%5B%5Bt%20**%20i%20for%20i%20in%20range(degree)%5D%20for%20t%20in%20x%5D%0A%20%20%20%20%20%20%20%20print('Sparsity%20report%3A%5Cnlambda1%20%20nonzeros%20in%20w%5Cn')%0A%20%20%20%20%20%20%20%20plt.axis(%5B0.0%2C%201.0%2C%20-1.25%2C%20-0.8%5D)%0A%20%20%20%20%20%20%20%20plt.scatter(x%2C%20y)%0A%20%20%20%20%20%20%20%20_ticks%20%3D%20numpy.linspace(0.0%2C%201.0%2C%2010000)%0A%20%20%20%20%20%20%20%20lambdas%20%3D%20%5B10%20**%20(-i)%20for%20i%20in%20range(1%2C%206)%5D%0A%20%20%20%20%20%20%20%20for%20lambda1%20in%20lambdas%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20M%20%3D%20lseReg(X%2C%20y%2C%20lambda1%2C%200)%0A%20%20%20%20%20%20%20%20%20%20%20%20M.solve()%0A%20%20%20%20%20%20%20%20%20%20%20%20w%20%3D%20M.getVariable('w').level()%0A%20%20%20%20%20%20%20%20%20%20%20%20print('%7B0%3A%20%3C8%7D%20%7B1%3A%20%3C2%7D'.format(lambda1%2C%20sum(%5B0%20if%20abs(val)%20%3C%201e-05%20else%201%20for%20val%20in%20w%5D)))%0A%20%20%20%20%20%20%20%20%20%20%20%20plt.plot(_ticks%2C%20sum(%5Bw%5Bi%5D%20*%20_ticks%20**%20i%20for%20i%20in%20range(degree)%5D))%0A%20%20%20%20%20%20%20%20plt.legend(lambdas%2C%20loc%3D'upper%20right')%0A%20%20%20%20%20%20%20%20return%20plt.show()%0A%20%20%20%20_()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%20Robust%20regression%20with%20Huber%20loss%20function%0A%0A%20%20%20%20The%20penalty%20function%20%24%5Cphi(x)%3Dx%5E2%24%20grows%20quite%20rapidly%20for%20large%20values%20of%20%24x%24%2C%20making%20the%20least-squares%20approximation%20overly%20sensitive%20to%20outliers.%20One%20solution%20to%20this%20issue%20in%20the%20realm%20of%20convex%20optimization%20is%20to%20replace%20it%20with%20the%20penalty%20defined%20as%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Cphi_%7B%5Cmathrm%7BHuber%7D%7D(x)%20%3D%20%0A%20%20%20%20%5Cbegin%7Bcases%7D%0A%20%20%20%20x%5E2%20%20%20%20%20%20%20%26%20%20%7Cx%7C%5Cleq%20M%2C%20%5C%5C%0A%20%20%20%20M(2%7Cx%7C-M)%20%26%20%20%7Cx%7C%5Cgeq%20M%2C%0A%20%20%20%20%5Cend%7Bcases%7D%0A%20%20%20%20%24%24%0A%0A%20%20%20%20for%20some%20value%20of%20%24M%24.%20This%20*Huber%20loss%20function*%20agrees%20with%20the%20quadratic%20loss%20for%20small%20values%20of%20%24x%24%20(small%20residuals)%20and%20otherwise%20it%20is%20the%20slowest-growing%20function%20preserving%20that%20property%20while%20remaining%20convex.%20Thus%20it%20assignes%20much%20lower%20loss%20to%20distant%20outliers%20than%20pure%20quadratic%20regression%2C%20maing%20it%20more%20robust.%20The%20*robust%20regression%20with%20Huber%20loss%20function*%20is%20now%20the%20problem%0A%0A%20%20%20%20%24%24%5Cmathrm%7Bmin%7D_%7B%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%7D%20%5Cphi_%7B%5Cmathrm%7BHuber%7D%7D(%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D).%24%24%0A%0A%20%20%20%20It%20is%20left%20as%20an%20exercise%20to%20prove%20that%20an%20equivalent%20formulation%20of%20this%20optimization%20problem%20is%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Cbegin%7Barray%7D%7Brl%7D%0A%20%20%20%20%5Cmathrm%7Bminimize%7D%20%26%20%20%20%20%20%20%20%20%20%20%20%20%5C%7C%5Cmathbf%7Bu%7D%5C%7C_2%5E2%2B2M%5C%7C%5Cmathbf%7Bv%7D%5C%7C_1%20%5C%5C%0A%20%20%20%20%5Cmathrm%7Bs.t.%7D%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%20-(%5Cmathbf%7Bu%7D%2B%5Cmathbf%7Bv%7D)%5Cleq%20%5Cmathbf%7By%7D-%5Cmathbf%7BX%7D%5Cmathbf%7Bw%7D%20%5Cleq%20%5Cmathbf%7Bu%7D%2B%5Cmathbf%7Bv%7D%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%200%5Cleq%20%5Cmathbf%7Bu%7D%5Cleq%20M%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%200%5Cleq%20%5Cmathbf%7Bv%7D%20%5C%5C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%26%20%20%20%20%20%20%20%20%20%20%20%20%5Cmathbf%7Bu%7D%2C%20%5Cmathbf%7Bv%7D%5Cin%20%5Cmathbb%7BR%7D%5En%2C%20%5Cmathbf%7Bw%7D%5Cin%5Cmathbb%7BR%7D%5Ed%0A%20%20%20%20%5Cend%7Barray%7D%0A%20%20%20%20%24%24%0A%0A%20%20%20%20We%20proceed%20directly%20to%20a%20*Fusion*%20implementation%3A%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense)%3A%0A%20%20%20%20def%20huber(X%2C%20y%2C%20MConst)%3A%0A%20%20%20%20%20%20%20%20n%2C%20d%20%3D%20(len(X)%2C%20len(X%5B0%5D))%0A%20%20%20%20%20%20%20%20M%20%3D%20Model('LSE-HUBER')%0A%20%20%20%20%20%20%20%20w%20%3D%20M.variable('w'%2C%20d)%0A%20%20%20%20%20%20%20%20u%20%3D%20M.variable(n%2C%20Domain.inRange(0.0%2C%20MConst))%0A%20%20%20%20%20%20%20%20v%20%3D%20M.variable(n%2C%20Domain.greaterThan(0.0))%0A%20%20%20%20%20%20%20%20r%20%3D%20y%20-%20X%20%40%20w%0A%20%20%20%20%20%20%20%20ab%20%3D%20u%20%2B%20v%0A%20%20%20%20%20%20%20%20M.constraint(ab%20%3E%3D%20r)%0A%20%20%20%20%20%20%20%20M.constraint(ab%20%3E%3D%20-r)%0A%20%20%20%20%20%20%20%20t%20%3D%20M.variable()%0A%20%20%20%20%20%20%20%20M.constraint(Expr.vstack(0.5%2C%20t%2C%20u)%2C%20Domain.inRotatedQCone())%0A%20%20%20%20%20%20%20%20M.objective(ObjectiveSense.Minimize%2C%20t%20%2B%202%20*%20MConst%20*%20Expr.sum(v))%0A%20%20%20%20%20%20%20%20return%20M%0A%20%20%20%20return%20(huber%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Huber%20regression%20example%0A%0A%20%20%20%20We%20demonstrate%20the%20Huber%20regression%20compared%20to%20standard%20least-squares%20regression%20on%20an%20example%20with%20distant%20outliers.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(huber%2C%20lse%2C%20numpy%2C%20plt)%3A%0A%20%20%20%20N%20%3D%2030%0A%20%20%20%20sigma%20%3D%200.005%0A%20%20%20%20x%20%3D%20numpy.sort(numpy.random.uniform(0.0%2C%201.0%2C%20N))%0A%0A%20%20%20%20def%20f(t)%3A%0A%20%20%20%20%20%20%20%20return%201.5%20*%20t%20**%203%20-%202%20*%20t%20**%202%20%2B%200.5%20*%20t%20-%201%0A%20%20%20%20y%20%3D%20f(x)%20%2B%20numpy.random.normal(0.0%2C%20sigma%2C%20N)%0A%20%20%20%20x%20%3D%20numpy.append(x%2C%200.73)%0A%20%20%20%20y%20%3D%20numpy.append(y%2C%20-0.99)%0A%20%20%20%20X%20%3D%20%5B%5Bt%20**%20i%20for%20i%20in%20range(4)%5D%20for%20t%20in%20x%5D%0A%20%20%20%20LSE%2C%20HUB%20%3D%20(lse(X%2C%20y)%2C%20huber(X%2C%20y%2C%205%20*%20sigma))%0A%20%20%20%20LSE.solve()%0A%20%20%20%20HUB.solve()%0A%20%20%20%20wLSE%20%3D%20LSE.getVariable('w').level()%0A%20%20%20%20wHUB%20%3D%20HUB.getVariable('w').level()%0A%20%20%20%20plt.scatter(x%2C%20y)%0A%20%20%20%20_ticks%20%3D%20numpy.linspace(0%2C%201%2C%20100)%0A%20%20%20%20plt.plot(_ticks%2C%20sum(%5BwLSE%5Bi%5D%20*%20_ticks%20**%20i%20for%20i%20in%20range(4)%5D)%2C%20color%3D'blue')%0A%20%20%20%20plt.plot(_ticks%2C%20sum(%5BwHUB%5Bi%5D%20*%20_ticks%20**%20i%20for%20i%20in%20range(4)%5D)%2C%20color%3D'red')%0A%20%20%20%20plt.show()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22The%20red%20curve%20is%20clearly%20less%20affected%20by%20the%20outlier.%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%3Ca%20rel%3D%22license%22%20href%3D%22http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%2F4.0%2F%22%3E%3Cimg%20alt%3D%22Creative%20Commons%20License%22%20style%3D%22border-width%3A0%22%20src%3D%22https%3A%2F%2Fi.creativecommons.org%2Fl%2Fby%2F4.0%2F80x15.png%22%20%2F%3E%3C%2Fa%3E%3Cbr%20%2F%3EThis%20work%20is%20licensed%20under%20a%20%3Ca%20rel%3D%22license%22%20href%3D%22http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%2F4.0%2F%22%3ECreative%20Commons%20Attribution%204.0%20International%20License%3C%2Fa%3E.%20The%20**MOSEK**%20logo%20and%20name%20are%20trademarks%20of%20%3Ca%20href%3D%22http%3A%2F%2Fmosek.com%22%3EMosek%20ApS%3C%2Fa%3E.%20The%20code%20is%20provided%20as-is.%20Compatibility%20with%20future%20release%20of%20**MOSEK**%20or%20the%20%60Fusion%20API%60%20are%20not%20guaranteed.%20For%20more%20information%20contact%20our%20%5Bsupport%5D(mailto%3Asupport%40mosek.com).%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_()%3A%0A%20%20%20%20import%20marimo%20as%20mo%0A%20%20%20%20return%20(mo%2C)%0A%0A%0Aif%20__name__%20%3D%3D%20%22__main__%22%3A%0A%20%20%20%20app.run()%0A