
Cheating on Unproctored Internet Test Applications: An Analysis of a Verification Test in a Real Personnel Selection Context

Published online by Cambridge University Press:  03 December 2018

David Aguado* (Universidad Autónoma de Madrid, Spain)
Alejandro Vidal (Universidad Autónoma de Madrid, Spain)
Julio Olea (Universidad Autónoma de Madrid, Spain)
Vicente Ponsoda (Universidad Autónoma de Madrid, Spain)
Juan Ramón Barrada (Universidad de Zaragoza, Spain)
Francisco José Abad (Universidad Autónoma de Madrid, Spain)

*Correspondence concerning this article should be addressed to David Aguado, Universidad Autónoma de Madrid, Departamento de Psicología Social y Metodología, 28049 Madrid (Spain). E-mail: david.aguado@uam.es

Abstract

This study analyses the extent to which cheating occurs in a real personnel selection setting. A two-stage test administration, first unproctored and then proctored, was considered. Score inconsistencies were detected by applying a verification test (the Guo and Drasgow Z-test). An initial simulation study showed that the Z-test has adequate Type I error and power rates in the specific selection settings explored. A second study applied the Z-test verification procedure to a sample of 954 employment candidates; additional external evidence based on response times to the verification items was also gathered. The results revealed good performance of the Z-test statistic and a relatively low, but non-negligible, number of suspected cheaters, who showed higher, distorted ability estimates. The study with real data provided further information on the presence of suspected cheating in unproctored applications and on the viability of using item response times as additional evidence of cheating: in the verification test, suspected cheaters spent 5.78 seconds per item more than expected given the item difficulty and the ability assumed from the unproctored stage. The percentage of suspected cheaters in the empirical study was estimated at 13.84%. In summary, the study provides evidence of the usefulness of the Z-test for detecting cheating in a specific setting, in which a computerized adaptive test of English grammar knowledge was used for personnel selection.
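As an illustration of the kind of verification statistic discussed here, the following is a minimal sketch in Python, assuming the commonly reported form of the Guo and Drasgow (2010) Z-test for dichotomous IRT items: the score observed on the proctored verification test is compared with the score expected if the ability estimated in the unproctored stage were genuine. All item parameters and responses below are hypothetical and are not taken from the study (which used the eCAT English grammar adaptive test).

```python
import numpy as np
from scipy.stats import norm

def irt_prob(theta, a, b, c):
    """3PL probability of a correct response (reduces to 2PL when c = 0)."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def guo_drasgow_z(theta_stage1, responses, a, b, c):
    """Z statistic comparing the observed verification-test score with the
    score expected under the unproctored (stage 1) ability estimate."""
    p = irt_prob(theta_stage1, a, b, c)
    expected = p.sum()                    # E[score | theta_stage1]
    variance = (p * (1.0 - p)).sum()      # Var[score | theta_stage1]
    z = (responses.sum() - expected) / np.sqrt(variance)
    return z, norm.cdf(z)                 # one-tailed p: low scores suggest cheating

# Hypothetical candidate: ability estimated at 1.2 in the unproctored stage,
# but only 3 of 8 verification items answered correctly (all values invented).
a = np.array([1.2, 0.9, 1.5, 1.1, 1.3, 0.8, 1.0, 1.4])   # discriminations
b = np.array([0.5, 1.0, 0.8, 1.2, 0.3, 0.9, 1.1, 0.7])   # difficulties
c = np.zeros(8)                                           # no guessing assumed
responses = np.array([1, 0, 1, 0, 0, 1, 0, 0])
z, p_value = guo_drasgow_z(1.2, responses, a, b, c)
print(f"Z = {z:.2f}, one-tailed p = {p_value:.3f}")       # flag if p < alpha
```

A markedly negative Z (small one-tailed p) indicates that the candidate performed worse on the proctored verification items than their unproctored estimate would predict, which is the pattern treated as evidence of possible cheating.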

Type
Research Article
Copyright
Copyright © Universidad Complutense de Madrid and Colegio Oficial de Psicólogos de Madrid 2018 


Footnotes

This research was supported by the Cátedra UAM–IIC "Modelos y Aplicaciones Psicométricos" and by the Ministerio de Economía y Competitividad (grants PSI2013–44300–P and PSI2015–65557–P).

Julio Olea contributed actively to this paper; he passed away while we were preparing the final version of the manuscript. We take this opportunity to recognize his exceptional professional achievements and personal qualities.

How to cite this article:

Aguado, D., Vidal, A., Olea, J., Ponsoda, V., Barrada, J. R., & Abad, F. J. (2018). Cheating on unproctored Internet test applications: An analysis of a verification test in a real personnel selection context. The Spanish Journal of Psychology, 21, e62. https://doi.org/10.1017/sjp.2018.50

References

Abad, F. J., Olea, J., Aguado, D., Ponsoda, V., & Barrada, J. R. (2010). Deterioro de parámetros de los ítems en tests adaptativos informatizados: Estudio con eCAT [Item parameter drift in computerized adaptive testing: Study with eCAT]. Psicothema, 22, 340–347.
Armstrong, R. D., & Shi, M. (2009). A parametric cumulative sum statistic for person fit. Applied Psychological Measurement, 33, 391–410. https://doi.org/10.1177/0146621609331961
Arthur, W., Glaze, R. M., Villado, A. J., & Taylor, J. E. (2009). Unproctored Internet-based tests of cognitive ability and personality: Magnitude of cheating and response distortion. Industrial and Organizational Psychology, 2, 39–45. https://doi.org/10.1111/j.1754-9434.2008.01105.x
Bartram, D. (2000). Internet recruitment and selection: Kissing frogs to find princes. International Journal of Selection and Assessment, 8, 261–274. https://doi.org/10.1111/1468-2389.00155
Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., … Green, P. (2017). Package 'lme4' [Fit linear and generalized linear mixed-effects models]. Retrieved from https://cran.r-project.org/web/packages/lme4/lme4.pdf
Guo, J., & Drasgow, F. (2010). Identifying cheating on unproctored Internet tests: The Z-test and the likelihood ratio test. International Journal of Selection and Assessment, 18, 351–364. https://doi.org/10.1111/j.1468-2389.2010.00518.x
Hox, J. J. (2002). Multilevel analysis: Techniques and applications. Mahwah, NJ: Lawrence Erlbaum Associates.
Kantrowitz, T. M., & Dainis, A. M. (2014). How secure are unproctored pre-employment tests? Analysis of inconsistent test scores. Journal of Business and Psychology, 29, 605–616. https://doi.org/10.1007/s10869-014-9365-6
Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16, 277–298. https://doi.org/10.1207/S15324818AME1604_2
Lievens, F., & Burke, E. (2011). Dealing with the threats inherent in unproctored Internet testing of cognitive ability: Results from a large-scale operational test program. Journal of Occupational and Organizational Psychology, 84, 817–824. https://doi.org/10.1348/096317910X522672
Lievens, F., & Chapman, D. S. (2009). Recruitment and selection. In Wilkinson, A., Redman, T., Snell, S., & Bacon, N. (Eds.), The SAGE handbook of human resource management (pp. 133–154). London, UK: Sage.
Lievens, F., & Harris, M. M. (2003). Research on Internet recruiting and testing: Current status and future directions. In Cooper, C. L., & Robertson, I. T. (Eds.), International review of industrial and organizational psychology (Vol. 16, pp. 131–165). Chichester, UK: John Wiley & Sons.
Magis, D., & Barrada, J. R. (2017). Computerized adaptive testing with R: Recent updates of the package catR. Journal of Statistical Software, Code Snippets, 76(1), 1–19. https://doi.org/10.18637/jss.v076.c01
Magis, D., & Gilles, R. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48(8), 1–31. https://doi.org/10.18637/jss.v048.i08
Makransky, G., & Glas, C. A. W. (2011). Unproctored Internet test verification using adaptive confirmation testing. Organizational Research Methods, 14, 608–630. https://doi.org/10.1177/1094428110370715
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107–135. https://doi.org/10.1177/01466210122031957
Naglieri, J. A., Drasgow, F., Schmit, M., Handler, L., Prifitera, A., Margolis, A., & Velasquez, R. (2004). Psychological testing on the Internet: New problems, old issues. American Psychologist, 59, 150–162. https://doi.org/10.1037/0003-066X.59.3.150
Nye, C. D., Do, B. R., Drasgow, F., & Fine, S. (2008). Two-step testing in employee selection: Is score inflation a problem? International Journal of Selection and Assessment, 16, 112–120. https://doi.org/10.1111/j.1468-2389.2008.00416.x
Olea, J., Abad, F. J., Ponsoda, V., & Ximénez, M. C. (2004). Un test adaptativo informatizado para evaluar el conocimiento de inglés escrito: Diseño y comprobaciones psicométricas [A computerized adaptive test for the assessment of written English: Design and psychometric properties]. Psicothema, 16, 519–525.
Pace, V. L., & Borman, W. C. (2006). The use of warnings to discourage faking on noncognitive inventories. In Griffith, R. (Ed.), A closer examination of faking behavior. Greenwich, CT: Information Age.
Ployhart, R. E. (2006). Staffing in the 21st century: New challenges and strategic opportunities. Journal of Management, 32, 868–897. https://doi.org/10.1177/0149206306293625
R Core Team (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/
Revuelta, J., & Ponsoda, V. (1998). A comparison of item exposure control methods in computerized adaptive testing. Journal of Educational Measurement, 35, 311–327. https://doi.org/10.1111/j.1745-3984.1998.tb00541.x
Ryan, A. M., Inceoglu, I., Bartram, D., Golubovich, J., Grand, J., Reeder, M., … Yao, X. (2015). Trends in testing: Highlights of a global survey. In Nikolaou, I., & Oostrom, J. K. (Eds.), Employee recruitment, selection, and assessment: Contemporary issues for theory and practice (pp. 136–153). Sussex, UK: Psychology Press.
Ryan, A. M., & Ployhart, R. E. (2014). A century of selection. Annual Review of Psychology, 65, 693–717. https://doi.org/10.1146/annurev-psych-010213-115134
Sackett, P. R., & Lievens, F. (2008). Personnel selection. Annual Review of Psychology, 59, 419–450. https://doi.org/10.1146/annurev.psych.59.103006.093716
Sanderson, K. R., Viswesvaran, C., & Pace, V. L. (2011). UIT practices: Fair and effective? Industrial and Organizational Psychology, 48, 29–38.
Segall, D. O. (2001, April). Detecting test compromise in high stakes computerized adaptive testing: A verification testing approach. Paper presented at the annual meeting of the National Council on Measurement in Education, Seattle, WA.
Swygert, K. A. (2003). The relationship of item-level response times with test-taker and item variables in an operational CAT environment (Vol. 98, No. 10). Newtown, PA: Law School Admission Council.
Tendeiro, J. N., & Meijer, R. R. (2012). A CUSUM to detect person misfit: A discussion and some alternatives for existing procedures. Applied Psychological Measurement, 36, 420–442. https://doi.org/10.1177/0146621612446305
Tendeiro, J. N., Meijer, R. R., Schakel, L., & Maij-de Meij, A. M. (2013). Using cumulative sum statistics to detect inconsistencies in unproctored Internet testing. Educational and Psychological Measurement, 73, 143–161. https://doi.org/10.1177/0013164412444787
The International Test Commission (2006). International guidelines on computer-based and Internet-delivered testing. International Journal of Testing, 6, 143–171. https://doi.org/10.1207/s15327574ijt0602_4
The International Test Commission (2016). International guidelines on the security of tests, examinations, and other assessments. International Journal of Testing, 16, 181–204. https://doi.org/10.1080/15305058.2015.1111221
Tippins, N. T. (2009). Internet alternatives to traditional proctored testing: Where are we now? Industrial and Organizational Psychology, 2, 2–10. https://doi.org/10.1111/j.1754-9434.2008.01097.x
Tippins, N. T. (2015). Technology and assessment in selection. Annual Review of Organizational Psychology and Organizational Behavior, 2, 551–582. https://doi.org/10.1146/annurev-orgpsych-031413-091317
Tippins, N. T., Beaty, J., Drasgow, F., Gibson, W. M., Pearlman, K., Segall, D. O., & Shepherd, W. (2006). Unproctored Internet testing in employment settings. Personnel Psychology, 59(1), 189–225. https://doi.org/10.1111/j.1744-6570.2006.00909.x
van der Linden, W. J., & Guo, F. (2008). Bayesian procedures for identifying aberrant response-time patterns in adaptive testing. Psychometrika, 73, 365–384. https://doi.org/10.1007/s11336-007-9046-8
Verbic, S., & Tomic, B. (2009). Test item response time and the response likelihood. arXiv preprint arXiv:0901.4356. Retrieved from http://arxiv.org/abs/0901.4356
Wright, N. A., Meade, A. W., & Gutierrez, S. L. (2014). Using invariance to examine cheating in unproctored ability tests. International Journal of Selection and Assessment, 22, 12–22. https://doi.org/10.1111/ijsa.12053