ACTUARIAL APPLICATIONS OF WORD EMBEDDING MODELS

Gee Y Lee; Scott Manski; Tapabrata Maiti

doi:10.1017/asb.2019.28

Published online by Cambridge University Press: 22 October 2019

Gee Y Lee*: Department of Statistics and Probability, Department of Mathematics, Michigan State University, C337 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: leegee@msu.edu
Scott Manski: Department of Statistics and Probability, Michigan State University, C511 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: manskisc@stt.msu.edu
Tapabrata Maiti: Department of Statistics and Probability, Michigan State University, C424 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: maiti@stt.msu.edu
*E-mail: leegee@msu.edu

Abstract

In insurance analytics, textual descriptions of claims are often discarded because traditional empirical analyses require numeric descriptor variables. This paper demonstrates how textual data can be easily used in insurance analytics. Using the concept of word similarities, we illustrate how to extract variables from text and incorporate them into claims analyses using a standard generalized linear model or generalized additive model. This procedure is applied to the Wisconsin Local Government Property Insurance Fund (LGPIF) data to demonstrate how insurance claims management and risk mitigation procedures can be improved. We illustrate two applications. First, we show how the claims classification problem can be solved using textual information. Second, we analyze the relationship between risk metrics and the probability of large losses. We obtain good results for both applications, in which short textual descriptions of insurance claims are used to extract features.
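
As a rough illustration of the idea of extracting variables from text through word similarities, the following Python sketch scores short claim descriptions against a few keywords. It is not taken from the paper: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up, the function names are hypothetical, and taking the maximum cosine similarity over the words of a description is only one possible aggregation, assumed here for simplicity. In practice the vectors would come from a trained word2vec or GloVe model.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# Real vectors would come from a trained word2vec or GloVe model.
TOY_EMBEDDINGS = {
    "hail": [0.9, 0.1, 0.0],
    "storm": [0.8, 0.2, 0.1],
    "roof": [0.3, 0.7, 0.1],
    "vehicle": [0.0, 0.2, 0.9],
    "collision": [0.1, 0.1, 0.95],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_feature(description, keyword, embeddings):
    """Largest cosine similarity between the keyword and any known word in the description.

    Words missing from the embedding table are skipped; if none are known,
    the feature is 0.0. The max aggregation is an assumption of this sketch.
    """
    target = embeddings[keyword]
    scores = [
        cosine_similarity(embeddings[word], target)
        for word in description.lower().split()
        if word in embeddings
    ]
    return max(scores) if scores else 0.0


# Two made-up claim descriptions and two keywords give a 2 x 2 feature matrix.
claims = ["Hail damaged roof", "Vehicle collision at intersection"]
keywords = ("hail", "vehicle")
features = [
    [similarity_feature(claim, keyword, TOY_EMBEDDINGS) for keyword in keywords]
    for claim in claims
]
```

Each row of `features` is a numeric description of one claim, so it could be passed as covariates to a logistic regression or another generalized linear model alongside the usual numeric variables.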

Keywords

Risk mitigation; insurance claims adjustment; word embedding; text mining; word2vec; GloVe; word similarity; generalized additive models; actuarial modeling; classification; logistic regression; loss modeling

Type: Research Article
Information: ASTIN Bulletin: The Journal of the IAA, Volume 50, Issue 1, January 2020, pp. 1-24

DOI: https://doi.org/10.1017/asb.2019.28
Copyright: © Astin Bulletin 2019
