pwl-convex-approximation

import%20marimo%0A%0A__generated_with%20%3D%20%220.15.2%22%0Aapp%20%3D%20marimo.App()%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22!%5BMOSEK%20ApS%5D(https%3A%2F%2Fwww.mosek.com%2Fstatic%2Fimages%2Fbranding%2Fwebgraphmoseklogocolor.png%20)%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%20Piece-wise%20linear%20approximation%20of%20a%20convex%20function%0A%0A%20%20%20%20It%20is%20clear%20that%20each%20convex%20function%20can%20be%20approximated%20to%20some%20extent%20by%20a%20convex%20piece-wise%20linear%20function.%20In%20this%20notebook%20we%20show%20how%20to%20model%20this%20as%20a%20least%20squares%20problem.%20This%20model%20was%20introduced%20and%20studied%20by%20%5BKuosmanen%5D(https%3A%2F%2Fonlinelibrary.wiley.com%2Fdoi%2Fabs%2F10.1111%2Fj.1368-423X.2008.00239.x)%20and%20%5BSeijo%2C%20Sen%5D(https%3A%2F%2Fprojecteuclid.org%2Fjournals%2Fannals-of-statistics%2Fvolume-39%2Fissue-3%2FNonparametric-least-squares-estimation-of-a-multivariate-convex-regression-function%2F10.1214%2F10-AOS852.full)%2C%20among%20others.%0A%0A%20%20%20%20We%20are%20given%20a%20set%20of%20points%20%24X_1%2C%5Cldots%2CX_n%5Cin%20%5Cmathbb%7BR%7D%5Ed%24%20and%20%24Y_1%2C%5Cldots%2CY_n%5Cin%5Cmathbb%7BR%7D%24%2C%20which%20we%20presume%20were%20generated%20as%20%24Y_i%20%3D%20f(X_i)%20%2B%20%5Cvarepsilon_i%24%20where%20%24f%24%20is%20a%20convex%20function%20and%20%24%5Cvarepsilon_i%24%20is%20noise.%20Our%20least%20squares%20problem%20looks%20as%20follows%20%0A%0A%20%20%20%20%24%24%5Cbegin%7Barray%7D%7Brll%7D%0A%20%20%20%20%5Ctext%7Bminimize%7D%20%26%20%5Csum_%7Bi%3D1%7D%5En%20(t_i%20-%20Y_i)%5E2%20%26%20%5C%5C%0A%20%20%20%20%5Ctext%7Bsubject%20to%7D%20%26%20t_i%20%5Cgeq%20t_j%20%2B%20s_j%5ET%20(X_i%20-%20X_j)%20%26%5Ctext%7Bfor%20all%7D%5C%201%5Cleq%20i%2Cj%5Cleq%20n%2C%20%5C%5C%0A%20%20%20%20%26%20t_1%2C%20...%2C%20t_n%20%5Cin%20%5Cmathbb%7BR%7D%2C%20%26%20%5C%5C%0A%20%20%20%20%26%20s_1%2C%20...%2C%20s_n%20%5Cin%20%5Cmathbb%7BR%7D%5Ed.%20%26%0A%20%20%20%20%5Cend%7Barray%7D%24%24%0A%0A%20%20%20%20Then%20the%20piecewise%20linear%20function%20approximating%20the%20data%20will%20be%20defined%20as%20the%20maximum%20%24%5Chat%7B%5CPhi%7D(x)%3D%5Ctext%7Bmax%7D%5C%7B%5CPhi_i(x)%2C%5C%20i%3D1%2C%5Cldots%2Cn%5C%7D%24%20of%20%24n%24%20linear%20functions%20%24%5CPhi_i%3A%5Cmathbb%7BR%7D%5Ed%5Cto%20%5Cmathbb%7BR%7D%24%20given%20by%0A%0A%20%20%20%20%24%24%5CPhi_i(x)%20%3D%20t_i%20%2B%20s_i%5ET(x-X_i)%24%24%0A%0A%20%20%20%20for%20%24i%3D1%2C%5Cldots%2Cn%24.%0A%0A%20%20%20%20Note%20that%20%24%5CPhi_i(X_i)%3Dt_i%24%20and%20%24%5Cnabla%20%5CPhi_i(X_i)%3Ds_i%24.%20Therefore%20the%20main%20constraint%20appearing%20in%20the%20model%20corresponds%20precisely%20the%20convexity%20requirement%3A%0A%0A%20%20%20%20%24%24%5Chat%7B%5CPhi%7D(X_i)%20-%20%5Chat%7B%5CPhi%7D(X_j)%20%5Cgeq%20%5Clangle%5Cnabla%20%5Chat%7B%5CPhi%7D(X_j)%2CX_i-X_j%5Crangle%24%24%0A%0A%20%20%20%20for%20the%20function%20%24%5CTheta%24.%0A%0A%20%20%20%20%23%23%23%20Preparing%20synthetic%20data%0A%0A%20%20%20%20In%20the%20example%20we%20will%20show%20an%20approximation%20of%20a%20simple%20quadratic%20function%20%24f(x)%20%3D%20x%5ETx.%24%0A%0A%20%20%20%20Let%20us%20first%20define%20our%20data.%20We%20generate%20%24n%24%20points%20of%20dimension%20%24d%24%20uniformly%20from%20%24%5B-1%2C%201%5D%5Ed%24.%20The%20corresponding%20respose%20is%20given%20as%20%24Y_i%20%3D%20f(X_i)%20%2B%20%5Cvarepsilon_i%24.%20Error%20%24%5Cvarepsilon%24%20follows%20normal%20distribution%20%24N(0%2C%20%5Csigma%5E2%20I_n)%24%2C%20where%20%24%5Csigma%5E2%20%3D%20Var(f(%5Cmathbf%7BX%7D))%20%2F%20SNR%24%20and%20%24SNR%20%3D%203%24.%20Lastly%2C%20we%20mean-center%20and%20standardize%20to%20unit%20%24l_2%24%20norm%20the%20responses%20%24Y%24%20and%20the%20data%20%24X%24.%20We%20do%20this%20to%20get%20a%20more%20predictive%20model.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_()%3A%0A%20%20%20%20import%20numpy%20as%20np%0A%20%20%20%20from%20mosek.fusion%20import%20Model%2C%20Domain%2C%20Expr%2C%20ObjectiveSense%2C%20Matrix%2C%20Var%2C%20SolutionStatus%0A%20%20%20%20import%20mosek.fusion.pythonic%0A%20%20%20%20import%20matplotlib.pyplot%20as%20plt%0A%20%20%20%20plt.rcParams%5B'figure.figsize'%5D%20%3D%20%5B10%2C%2010%5D%0A%20%20%20%20from%20matplotlib%20import%20cm%0A%20%20%20%20import%20sys%0A%20%20%20%20return%20Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense%2C%20Var%2C%20cm%2C%20np%2C%20plt%2C%20sys%0A%0A%0A%40app.cell%0Adef%20_(np)%3A%0A%20%20%20%20np.random.seed(0)%0A%0A%20%20%20%20%23%20specify%20the%20dimension%0A%20%20%20%20n%20%3D%20200%20%0A%20%20%20%20d%20%3D%202%0A%0A%20%20%20%20%23%20specify%20the%20function%0A%20%20%20%20fun%20%3D%20lambda%20x%3A%20np.sum(np.multiply(x%2C%20x)%2C%201)%20%20%20%23%20x%5ET%20x%0A%0A%20%20%20%20%23%20generate%20data%20and%20get%20corresponding%20responses%0A%20%20%20%20x%20%3D%20np.random.uniform(-1%2C%201%2C%20n*d).reshape(n%2C%20d)%0A%20%20%20%20response%20%3D%20fun(x)%0A%0A%20%20%20%20%23%20compute%20response%20vector%20with%20error%0A%20%20%20%20varResponse%20%3D%20np.var(response%2C%20ddof%3D1)%20%20%23%20unbiased%20sample%20variance%0A%20%20%20%20SNR%20%3D%203%0A%20%20%20%20stdError%20%3D%20np.sqrt(varResponse%20%2F%20SNR)%0A%20%20%20%20error%20%3D%20np.random.normal(0%2C%20stdError%2C%20n)%0A%20%20%20%20reponseWithError%20%3D%20response%20%2B%20error%0A%0A%20%20%20%20%23%20standardize%20and%20mean-center%20the%20response%0A%20%20%20%20meanY%20%3D%20np.sum(reponseWithError)%20%2F%20n%0A%20%20%20%20standY%20%3D%20reponseWithError%20-%20meanY%0A%20%20%20%20norm%20%3D%20np.sqrt(%20np.sum(%20np.square(%20standY%20)))%0A%20%20%20%20Y%20%3D%20standY%20%2F%20norm%0A%0A%20%20%20%20%23%20standardize%20the%20x%20(by%20the%20dimension)%20and%20mean-center%0A%20%20%20%20meanX%20%3D%20np.sum(x%2C%20axis%3D0)%20%2F%20n%0A%20%20%20%20standX%20%3D%20x%20-%20meanX%0A%20%20%20%20norm%20%3D%20np.sqrt(np.sum%20(np.square(%20standX%20)%2C%20axis%3D0))%0A%20%20%20%20X%20%3D%20np.divide(standX%2C%20norm)%0A%20%20%20%20return%20X%2C%20Y%2C%20d%2C%20n%0A%0A%0A%40app.cell%0Adef%20_(X%2C%20Y%2C%20plt)%3A%0A%20%20%20%20fig%20%3D%20plt.figure()%0A%20%20%20%20ax%20%3D%20fig.add_subplot(projection%3D'3d')%0A%20%20%20%20ax.set_title('Generated%20Data')%0A%20%20%20%20scatter%20%3D%20ax.scatter(X%5B%3A%2C%200%5D%2C%20X%5B%3A%2C%201%5D%2C%20Y)%0A%20%20%20%20plt.show()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Fusion%20Model%0A%20%20%20%20In%20the%20following%20block%2C%20usage%20of%20Mosek%20Fusion%20for%20Python%20for%20this%20case%20is%20presented.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Domain%2C%20Expr%2C%20Model%2C%20ObjectiveSense%2C%20Var%2C%20X%2C%20Y%2C%20d%2C%20n%2C%20np%2C%20sys)%3A%0A%20%20%20%20%23%20solve%20QP%0A%20%20%20%20with%20Model('ConvexApproxLeastSquares')%20as%20M%3A%0A%0A%20%20%20%20%20%20%20%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%20create%20variables%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%0A%0A%20%20%20%20%20%20%20%20m%20%3D%20%20M.variable(%201%2C%20Domain.unbounded()%20)%20%20%20%20%20%20%20%23%20maximum%20variable%0A%20%20%20%20%20%20%20%20t%20%3D%20%20M.variable(%20n%2C%20Domain.unbounded()%20)%20%20%20%20%20%20%0A%20%20%20%20%20%20%20%20ss%20%3D%20M.variable(%20%5Bn%2C%20d%5D%2C%20Domain.unbounded()%20)%20%20%23%20variables%20s%20stored%20in%20n%20X%20d%20matrix%0A%0A%20%20%20%20%20%20%20%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%20create%20constraints%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%0A%0A%20%20%20%20%20%20%20%20%23%20create%20Qr%20constraints%0A%0A%20%20%20%20%20%20%20%20%23%20auxDiff%20%3D%20t%20-%20ys%0A%20%20%20%20%20%20%20%20auxDiff%20%3D%20M.variable(%20n%2C%20Domain.unbounded()%20)%0A%20%20%20%20%20%20%20%20M.constraint(%20auxDiff%20%3D%3D%20t%20-%20Y%20)%0A%0A%20%20%20%20%20%20%20%20%23%20(0.5%2C%20m%2C%20t%20-%20ys)%20in%20Q_r%0A%20%20%20%20%20%20%20%20auxHalf%20%3D%20M.variable(%201%2C%20Domain.equalsTo(0.5)%20)%0A%20%20%20%20%20%20%20%20z1%20%3D%20Var.vstack(%20auxHalf%2C%20m%2C%20auxDiff%20)%0A%20%20%20%20%20%20%20%20M.constraint(%20z1%2C%20Domain.inRotatedQCone()%20)%0A%0A%20%20%20%20%20%20%20%20%23%20create%20constraints%20on%20the%20maximum%20of%20the%20linear-piecewise%20functions%0A%20%20%20%20%20%20%20%20%23%20i.e.%20%20%20%20%0A%20%20%20%20%20%20%20%20%23%20%20%20%20%20%20%200%20%3E%3D%20t_j%20-%20t_i%20%2B%20%3Cs_j%2Cx_i%20-%20x_j%3E%0A%0A%20%20%20%20%20%20%20%20%23%20compute%20t_j%20-%20t_i%0A%20%20%20%20%20%20%20%20t_is%20%3D%20Expr.flatten(%20Expr.repeat(t%2C%20n%2C%201)%20)%0A%20%20%20%20%20%20%20%20t_js%20%3D%20Expr.repeat(%20t%2C%20n%2C%200%20)%0A%0A%20%20%20%20%20%20%20%20Diff%20%3D%20t_js%20-%20t_is%0A%0A%20%20%20%20%20%20%20%20%23%20make%20a%20dot%20product%20%3Cs_j%2Cx_i%20-%20x_j%3E%0A%20%20%20%20%20%20%20%20x_is%20%3D%20np.repeat(%20X%2C%20repeats%3Dn%2C%20axis%3D0%20)%0A%20%20%20%20%20%20%20%20x_js%20%3D%20np.tile(%20X%2C%20(n%2C%201)%20)%0A%20%20%20%20%20%20%20%20ss_stack%20%3D%20Expr.repeat(%20ss%2C%20n%2C%200%20)%0A%0A%20%20%20%20%20%20%20%20dotProd%20%3D%20Expr.sum(%20Expr.mulElm(ss_stack%20%2Cx_is%20-%20x_js)%20%2C%201%20)%20%20%20%23%20dot%20product%20%20%0A%0A%20%20%20%20%20%20%20%20%23%20add%20constraint%0A%20%20%20%20%20%20%20%20M.constraint(%20Diff%20%2B%20dotProd%20%3C%3D%200%20)%0A%0A%20%20%20%20%20%20%20%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%20solve%20problem%20%23%23%23%23%23%23%23%23%23%23%23%23%23%23%0A%0A%20%20%20%20%20%20%20%20%23%20minimize%20the%20least%20squares%0A%20%20%20%20%20%20%20%20M.objective(%20%22obj%22%2C%20ObjectiveSense.Minimize%2C%20m%20)%20%20%20%20%0A%0A%20%20%20%20%20%20%20%20%23%20set%20the%20log%20handler%20to%20stdout%0A%20%20%20%20%20%20%20%20M.setLogHandler(sys.stdout)%0A%0A%20%20%20%20%20%20%20%20M.solve()%0A%0A%20%20%20%20%20%20%20%20%23%20get%20the%20optimal%20values%0A%20%20%20%20%20%20%20%20solt%20%3D%20t.level()%0A%20%20%20%20%20%20%20%20sols%20%3D%20ss.level().reshape((n%2C%20d))%0A%20%20%20%20return%20sols%2C%20solt%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Model%20evalutation%0A%0A%20%20%20%20Let%20us%20evaluate%20our%20fit%2C%20this%20can%20be%20done%20as%20follows%0A%0A%20%20%20%20%24%5Chat%20%5CPhi%20(%5Cmathbf%7Bx%7D)%20%3D%20%5Cmax%20%5Climits%20_%7B1%20%5Cleq%20j%20%5Cleq%20n%7D%20%5C%7B%20%5Chat%7Bt%7D_j%20%2B%20%5Chat%7Bs%7D_j%5ET%20(%5Cmathbf%7Bx%7D%20-%20X_j)%20%5C%7D%24%0A%0A%20%20%20%20We%20can%20now%20see%20obvious%20drawback%20of%20such%20method%3B%20we%20need%20to%20keep%20all%20points.%20Please%20note%20that%20code%20below%20is%20not%20vectorized%20for%20the%20educational%20purposes%20but%20should%20be%20if%20aiming%20for%20efficiency.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(X%2C%20np%2C%20sols%2C%20solt)%3A%0A%20%20%20%20n_1%20%3D%20100%0A%20%20%20%20xs%20%3D%20np.linspace(-0.1%2C%200.1%2C%20n_1)%0A%20%20%20%20ys%20%3D%20np.linspace(-0.1%2C%200.1%2C%20n_1)%0A%20%20%20%20_xx%2C%20_yy%20%3D%20np.meshgrid(xs%2C%20ys)%0A%20%20%20%20x_1%20%3D%20np.c_%5B_xx.reshape(_xx.size%2C%201)%2C%20_yy.reshape(_yy.size%2C%201)%5D%0A%20%20%20%20x_1%20%3D%20x_1.astype(np.double)%0A%20%20%20%20numberOfLinearFuns%20%3D%20solt.size%0A%20%20%20%20phi%20%3D%20np.zeros(n_1%20**%202)%0A%20%20%20%20for%20i%20in%20range(0%2C%20n_1%20**%202)%3A%0A%20%20%20%20%20%20%20%20possibleValues%20%3D%20np.zeros(numberOfLinearFuns)%0A%20%20%20%20%20%20%20%20for%20j%20in%20range(0%2C%20numberOfLinearFuns)%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20val%20%3D%20solt%5Bj%5D%20%2B%20np.dot(sols%5Bj%2C%20%3A%5D%2C%20x_1%5Bi%2C%20%3A%5D%20-%20X%5Bj%2C%20%3A%5D)%0A%20%20%20%20%20%20%20%20%20%20%20%20possibleValues%5Bj%5D%20%3D%20val%0A%20%20%20%20%20%20%20%20phi%5Bi%5D%20%3D%20np.max(possibleValues)%0A%20%20%20%20return%20n_1%2C%20phi%2C%20x_1%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(%0A%20%20%20%20%20%20%20%20r%22%22%22%0A%20%20%20%20%23%23%23%20Plot%20the%20results%0A%20%20%20%20Finally%2C%20we%20can%20plot%20our%20solution.%0A%20%20%20%20%22%22%22%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(X%2C%20Y%2C%20cm%2C%20n_1%2C%20phi%2C%20plt%2C%20x_1)%3A%0A%20%20%20%20_fig%20%3D%20plt.figure()%0A%20%20%20%20_ax%20%3D%20_fig.add_subplot(projection%3D'3d')%0A%20%20%20%20_ax.set_title('Linear%20piece-wise%20approximation')%0A%20%20%20%20_ax.scatter(X%5B%3A%2C%200%5D%2C%20X%5B%3A%2C%201%5D%2C%20Y)%0A%20%20%20%20_xx%20%3D%20x_1%5B%3A%2C%200%5D.reshape(n_1%2C%20n_1)%0A%20%20%20%20_yy%20%3D%20x_1%5B%3A%2C%201%5D.reshape(n_1%2C%20n_1)%0A%20%20%20%20zz%20%3D%20phi.reshape(n_1%2C%20n_1)%0A%20%20%20%20axs%20%3D%20_ax.contour3D(_xx%2C%20_yy%2C%20zz%2C%2050%2C%20cmap%3Dcm.cool)%0A%20%20%20%20leg%20%3D%20_ax.legend(%5B'Data'%5D)%0A%20%20%20%20plt.show()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%3Ca%20rel%3D%22license%22%20href%3D%22http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%2F4.0%2F%22%3E%3Cimg%20alt%3D%22Creative%20Commons%20License%22%20style%3D%22border-width%3A0%22%20src%3D%22https%3A%2F%2Fi.creativecommons.org%2Fl%2Fby%2F4.0%2F80x15.png%22%20%2F%3E%3C%2Fa%3E%3Cbr%20%2F%3EThis%20work%20is%20licensed%20under%20a%20%3Ca%20rel%3D%22license%22%20href%3D%22http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%2F4.0%2F%22%3ECreative%20Commons%20Attribution%204.0%20International%20License%3C%2Fa%3E.%20The%20**MOSEK**%20logo%20and%20name%20are%20trademarks%20of%20%3Ca%20href%3D%22http%3A%2F%2Fmosek.com%22%3EMosek%20ApS%3C%2Fa%3E.%20The%20code%20is%20provided%20as-is.%20Compatibility%20with%20future%20release%20of%20**MOSEK**%20or%20the%20%60Fusion%20API%60%20are%20not%20guaranteed.%20For%20more%20information%20contact%20our%20%5Bsupport%5D(mailto%3Asupport%40mosek.com).%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_()%3A%0A%20%20%20%20import%20marimo%20as%20mo%0A%20%20%20%20return%20(mo%2C)%0A%0A%0Aif%20__name__%20%3D%3D%20%22__main__%22%3A%0A%20%20%20%20app.run()%0A