



We study the problem of choosing the best subset of $p$ features in linear regression, given $n$ observations. This problem naturally involves two objectives: minimizing the amount of bias and minimizing the number of predictors. Existing approaches transform the problem into a single-objective optimization problem. We explain the main weaknesses of these approaches and, to overcome them, propose a bi-objective mixed integer linear programming approach. A computational study shows the efficacy of the proposed approach.
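To make the two-objective view concrete, here is a toy sketch in Python. It is not the mixed integer linear program proposed in the paper. It assumes a handful of hypothetical (number of predictors, error) pairs, one per candidate model, and compares the full set of nondominated trade-offs with the models a penalty-based single-objective formulation can return. The function names, the data, and the penalty grid are all illustrative assumptions. The limitation it shows (a weighted sum can skip nondominated points that lie above the lower convex hull of the trade-off curve) is a standard observation in multi-objective optimization and may or may not coincide with the weaknesses the authors discuss.

```python
from typing import List, Tuple

# (number of predictors, total error); both objectives are minimized.
Point = Tuple[int, float]


def nondominated(points: List[Point]) -> List[Point]:
    """Return the points that no other point matches or beats in both objectives."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


def weighted_sum_choices(points: List[Point], penalties: List[float]) -> List[Point]:
    """Collect the minimizers of error + penalty * size over a grid of penalties."""
    chosen = set()
    for lam in penalties:
        best = min(points, key=lambda p: p[1] + lam * p[0])
        chosen.add(best)
    return sorted(chosen)


# Hypothetical outcomes (assumed data, not results from the paper).
candidates = [(0, 10.0), (1, 9.0), (2, 2.0), (3, 1.9), (2, 5.0)]

front = nondominated(candidates)

# Penalty grid from 0.0 to 20.0 in steps of 0.05 (an arbitrary choice).
penalties = [i * 0.05 for i in range(401)]
scalarized = weighted_sum_choices(candidates, penalties)

# Nondominated models that no penalty value on the grid selects.
missed = [p for p in front if p not in scalarized]

print("nondominated:", front)
print("found by weighted sum:", scalarized)
print("missed:", missed)
```

In this toy data the one-predictor model is a legitimate trade-off (nothing is both smaller and more accurate), yet no penalty value selects it: staying at or below the empty model's cost needs a penalty of at most 1, while staying at or below the two-predictor model's cost needs a penalty of at least 7. A bi-objective method that enumerates the nondominated set directly would report it.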





The ANZIAM Journal
  • ISSN: 1446-1811
  • EISSN: 1446-8735
  • URL: /core/journals/anziam-journal