Content-based validity of a personality inventory

LLM-assisted psychometrics

Authors

DOI:

https://doi.org/10.15448/1980-8623.2025.1.47225

Keywords:

artificial intelligence, psychological assessment, psychometrics

Abstract

Large Language Models (LLMs) represent an advance in Natural Language Processing (NLP). This study investigates the use of LLMs to obtain content-based validity evidence for an assessment of the Big Five personality factors. The items of the new instrument were generated by ChatGPT and analyzed semantically by Gemini, together with the human-written items of the BFI-2. The analysis combined two procedures: classification of the items via prompt, with the model acting as an expert judge, and exploratory factor analysis of the item embeddings obtained through the API, a new psychometric approach. The results showed semantic convergence for neuroticism, agreeableness, openness, and conscientiousness, but greater dispersion among the extraversion items. Semantic convergence was also observed between the items generated by the LLM and those written by humans (convergent content validity). The authors conclude that LLMs show good potential to contribute to the process of obtaining content validity evidence.
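The idea of semantic convergence among item embeddings can be illustrated with a much simpler stand-in than the exploratory factor analysis reported in the study: comparing cosine similarity between items within a factor and across factors. This is a minimal sketch, assuming toy 3-dimensional vectors in place of real embeddings from the API. The vectors and the helper names (`cosine`, `mean_similarity`) are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(pairs):
    """Average cosine similarity over an iterable of vector pairs."""
    sims = [cosine(u, v) for u, v in pairs]
    return sum(sims) / len(sims)


# Toy vectors standing in for item embeddings (hypothetical values,
# two items per factor; real embeddings have hundreds of dimensions).
items = {
    "neuroticism": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "extraversion": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}

# Pairs of items that belong to the same factor.
within = mean_similarity(
    pair for vecs in items.values() for pair in combinations(vecs, 2)
)

# Pairs of items that belong to different factors.
between = mean_similarity(
    product(items["neuroticism"], items["extraversion"])
)

print(f"within-factor mean similarity:  {within:.3f}")
print(f"between-factor mean similarity: {between:.3f}")
```

If items written for the same factor converge semantically, the within-factor mean should exceed the between-factor mean. A factor whose items are more dispersed would show a lower within-factor mean than the others.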

Author biographies

José Maurício Haas Bueno, Federal University of Pernambuco (UFPE), Recife, Pernambuco, Brazil.

Doctor, affiliated with the Federal University of Pernambuco.

Ricardo Primi, São Francisco University (USF), Campinas, São Paulo, Brazil.

Doctor, affiliated with São Francisco University.

Emanuel Duarte de Almeida Cordeiro, State University of Southwest Bahia (UESB), Vitória da Conquista, Bahia, Brazil.

Doctor, affiliated with the State University of Southwest Bahia.

Ana Deyvis Santos Araújo Jesuíno, Federal University of Maranhão (UFMA), São Luís, Maranhão, Brazil.

Holds a doctorate and is affiliated with the Federal University of Maranhão.

Monalisa Muniz, Federal University of São Carlos (UFSCar), São Carlos, São Paulo, Brazil.

Holds a doctorate and is affiliated with the Federal University of São Carlos.

Ana Paula Porto Noronha, São Francisco University (USF), Campinas, São Paulo, Brazil.

Holds a doctorate and is affiliated with São Francisco University.

Published

2025-12-19

How to cite

Haas Bueno, J. M., Primi, R., Duarte de Almeida Cordeiro, E., Deyvis Santos Araújo Jesuíno, A., Muniz, M., & Porto Noronha, A. P. (2025). Content-based validity of a personality inventory: LLM-assisted psychometrics. Psico, 56(1), e47225. https://doi.org/10.15448/1980-8623.2025.1.47225
