Determinants of subjective well-being in Colombia: a hybrid bagging and boosting approach with SHAP interpretability
Abstract
This study starts from the hypothesis that explainable machine learning models predict levels of subjective well-being (SWB) in Colombia more accurately than traditional statistical approaches, and that their use makes it easier to identify the main determinants that should guide public policy. To test it, data from the Encuesta Nacional de Calidad de Vida (2023) were used to fit multiclass classification models based on boosting and on bagging (the latter applied to a neural network with hybrid input), which were subsequently combined in a stacking model. The models were optimized through hyperparameter tuning with stratified cross-validation and Optuna. The methodology also included the SHAP and Deep SHAP interpretability techniques. The final model achieved competitive performance, which made it possible to analyze the results along demographic and geographic lines. The main determinants identified were experienced happiness, satisfaction with health, sense of meaning in life, and satisfaction with income, with differences across regions and population groups.
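As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch stacks a boosting model and a bagged neural network and scores the ensemble with stratified cross-validation. It is not the authors' implementation: the data are synthetic, the models come from scikit-learn rather than the libraries used in the study, the helper name `build_stack` and all hyperparameter values are hypothetical, and the Optuna search and SHAP analysis are left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_stack(seed=0):
    """Hypothetical helper: stack a boosting model and a bagged neural network."""
    # Boosting base learner (stand-in for the boosting models in the study).
    boosted = GradientBoostingClassifier(n_estimators=30, random_state=seed)

    # Bagging over a small neural network; inputs are scaled inside each member.
    network = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=seed),
    )
    bagged_net = BaggingClassifier(network, n_estimators=3, random_state=seed)

    # The meta-learner is trained on out-of-fold predictions from stratified folds.
    inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    return StackingClassifier(
        estimators=[("boosting", boosted), ("bagging_nn", bagged_net)],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=inner_cv,
    )


# Synthetic three-class data standing in for ordered well-being levels (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Outer stratified folds estimate out-of-sample macro-F1 for the stacked model.
outer_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
scores = cross_val_score(build_stack(), X, y, cv=outer_cv, scoring="f1_macro")
print(scores.round(3))
```

Stratified folds keep the class proportions roughly constant across splits, which matters when some well-being levels are much rarer than others. Macro-F1 is used here only as one reasonable multiclass metric; the abstract does not state which metric the study reports.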
Copyright 2025 Revista Facultad de Ciencias Económicas

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.







