Assessment programs and their components: a network approach

Received: 13 Mar. 2020. Approved: 02 May 2020. Published: 19 Jun. 2020.

ABSTRACT: Exams and other assessments in health science education are not random events; rather, they are part of a bigger assessment program that is constructively aligned with the intended learning outcomes at different stages of a health science curriculum. Depending on topical and temporal distance, assessments in the program are correlated with each other to a greater or lesser extent. Although correlation does not imply causation, once we come to understand the correlational structure of an assessment program, we can use that information to make predictions of future performance, to consider early intervention for students who are otherwise likely to drop out, and to inform revisions in either assessment or teaching. This article demonstrates how the correlational structure of an assessment program can be represented as a network, in which the assessments constitute the nodes and the degree of connectedness between any two nodes is represented as a thicker or thinner line connecting them, depending on whether the correlation between the two assessments at hand is stronger or weaker. Implications for educational practice and further research are discussed.


Introduction
Curriculum developers and teachers do not have it easy. Their daily work involves juggling a multitude of tasks, some of which pertain to teaching and assessment in one or several educational programs, as well as the evaluation and development of these programs. Although programs do evolve over time, there ought to be a constructive alignment between the intended learning outcomes at different stages of a curriculum, what is taught and in what ways, and what is assessed and with which methods. Two assessments that, through topical vicinity, have substantial overlap in the intended learning outcomes they intend to capture will likely yield somewhat correlated results, and more so if the amount of time between these assessments is relatively small (e.g., within the same academic year, or at the end of two consecutive academic years). That is, relatively better performance on one assessment tends to go together with relatively better performance on the other assessment, while relatively poor performance on one assessment tends to go together with relatively poor performance on the other assessment. Absence of such a correlation may reflect a lack of reliability in at least one of the assessments, a lack of actual overlap in intended learning outcomes, at least one of the assessments suffering from limited validity due to an unintended skill influencing the results, or some combination thereof. Statistical analysis can shed light on the reliability factor and may to some degree inform a content review that will be needed to investigate the other factors.
In light of the previously mentioned constructive alignment, assessments organized in the course of a curriculum can be conceived as nodes in a network that represents the assessment program for the curriculum at hand: the degree of connectedness of any pair of assessments can be represented as a line linking the two nodes representing these assessments, and the thickness of that line is a function of both topical and temporal vicinity [1]. That is, the greater the topical and/or temporal vicinity of two assessments, the stronger the correlation, and therefore the thicker the line between these two assessments. While we should not mistake correlation for causation, correlations between assessments can help us to visualize and understand the correlational structure of an assessment program. This correlational structure can be used for several purposes, including (1) to make predictions of students' future performance, (2) to consider early intervention for students who are otherwise likely to drop out, and (3) to inform revisions in either assessment or teaching. Therefore, this article demonstrates how an emerging statistical method called network analysis [1][2][3][4][5] can help us in this endeavor of visualizing, understanding, and using the correlational structure of an assessment program, through a simulated worked example that incorporates types of assessments and their correlations commonly encountered in educational practice. Next, this article presents a few guidelines for educational practice and future research.
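As a minimal sketch of this idea in Python (rather than in JASP, which is used later in this article), the snippet below draws a correlation matrix as a network in which edge width is proportional to correlation strength. The assessment names and correlation values are made up purely for illustration:

```python
# Minimal sketch: visualize a correlation matrix as a network.
# Assessment names and correlations are hypothetical, not from the article.
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

assessments = ["Anatomy Y1", "Anatomy Y2", "Physiology Y1", "Physiology Y2"]
corr = np.array([
    [1.00, 0.60, 0.30, 0.20],
    [0.60, 1.00, 0.25, 0.35],
    [0.30, 0.25, 1.00, 0.55],
    [0.20, 0.35, 0.55, 1.00],
])

G = nx.Graph()
G.add_nodes_from(assessments)
for i in range(len(assessments)):
    for j in range(i + 1, len(assessments)):
        G.add_edge(assessments[i], assessments[j], weight=corr[i, j])

pos = nx.spring_layout(G, seed=1)                        # node placement
widths = [6 * G[u][v]["weight"] for u, v in G.edges()]   # thicker line = stronger correlation
nx.draw(G, pos, with_labels=True, width=widths, node_color="lightblue")
plt.show()
```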

Different models
A common approach to modeling correlations between assessments has been to treat different assessments as manifest indicators (i.e., observed variables) of so-called latent variables, that is, variables that are not directly observed. In this approach, the knowledge available on the part of a student is not observed directly but is assumed to be indicated by the student's performance on one or more knowledge assessments that have been designed to measure that knowledge. The same holds for skills, attitudes, and other traits or states of interest. For example, throughout their program, medical students learn several skills that are important in clinical examination, including history taking, physical examination, problem solving, and the patient relationship, and students' performances on clinical assessments are treated as manifest indicators of these latent skills.
If three assessments measure the same type of knowledge or skill (e.g., grammar knowledge, or probability calculus skill), a core assumption in the latent variable approach is that these three assessments commonly respond to differences in the latent variable of interest. In practical terms, this means that students with higher degrees of that latent variable (i.e., more knowledge, or more skill) tend to score higher on these assessments than students with lower degrees of that latent variable (i.e., less knowledge, or less skill). This tendency induces a pattern of positive correlations between these assessments, with higher scores on one assessment tending to go hand in hand with higher scores on the other two assessments.
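A small simulation can make this assumption concrete. The sketch below (with hypothetical loadings and sample size, not taken from the article) generates one latent skill per student and three assessment scores that each depend on that skill plus independent noise; the resulting scores come out positively intercorrelated:

```python
# Sketch of the latent variable assumption: three assessments that all
# respond to the same latent skill end up positively correlated.
import numpy as np

rng = np.random.default_rng(42)
n_students = 1000

latent_skill = rng.normal(size=n_students)          # unobserved trait
loadings = np.array([0.8, 0.7, 0.6])                # how strongly each assessment reflects it
noise = rng.normal(size=(n_students, 3))            # assessment-specific error

scores = latent_skill[:, None] * loadings + noise   # observed scores, one column per assessment
print(np.corrcoef(scores, rowvar=False).round(2))   # all off-diagonal correlations are positive
```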

However, do we really need latent variables to explain this kind of pattern? If a group of animals (birds, cows, tigers, or others) decides to move as a group in a specific direction, that is because the animals communicate and respond to one another, rather than there being some unobserved common cause steering each animal individually. Analogously, in a network approach, correlated assessment results need not be explained by an underlying latent variable; they can also arise from direct relations between the pieces of knowledge and skill that the assessments capture.

Correlations visualized in a network
In a hypothetical Health Science Program X, which is a four-year undergraduate program, students face a total of thirteen assessments. Table 1 presents the correlations between these assessments, and Figure 1 visualizes these correlations in a network format (software used: JASP, version 0.11.1.0 [6], a zero-cost Open Source statistical software program that has very good facilities for network analysis). For K assessments, the number of correlations K_C that can be estimated equals: K_C = [K * (K - 1)] / 2. For 13 variables, this means: K_C = [13 * 12] / 2 = 78.
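In code, this count is a one-line check (a sketch; K = 13 matches the example):

```python
# Number of pairwise correlations among K assessments: K_C = K * (K - 1) / 2
K = 13
K_C = K * (K - 1) // 2
print(K_C)  # 78
```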
We see that the correlations are strongest (Table 1), and therefore the lines are thickest (Figure 1), between adjacent exams from the same theme.

A more parsimonious network
The correlations presented in Table 1 and visualized in Figure 1 include a considerable number of weak correlations, some of which may reflect sampling error rather than meaningful relations. A more parsimonious network can be obtained by combining the least absolute shrinkage and selection operator (LASSO), which shrinks small connections towards zero, with the extended Bayesian information criterion (EBIC) to select the degree of shrinkage applied to the correlations in Table 1 [10]. This combination of EBIC and LASSO has been called EBICglasso (i.e., the 'g' stands for 'graphical') [4][5] and is the method used to create the network in Figure 2.
In Figure 1, which uses the correlations from Table 1, all non-zero correlations are drawn as lines, whereas in Figure 2 many of the weaker connections have been shrunk to zero, resulting in a sparser network that is easier to interpret.
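Outside JASP, a closely related regularized network can be sketched in Python. Note that scikit-learn does not offer EBIC-based tuning, so the sketch below substitutes GraphicalLassoCV (cross-validated graphical LASSO) for the EBIC step; the data are simulated, and the conversion from the precision matrix to partial correlations is the standard one:

```python
# Sketch: a sparser, regularized network of partial correlations.
# GraphicalLassoCV (cross-validated tuning) is used as a stand-in for
# EBICglasso, since scikit-learn does not ship EBIC-based tuning.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 13))            # placeholder for 13 assessment scores
X[:, 1] += 0.6 * X[:, 0]                  # induce some correlational structure
X[:, 2] += 0.5 * X[:, 1]

model = GraphicalLassoCV().fit(X)
P = model.precision_                      # regularized inverse covariance matrix

# Partial correlations from the precision matrix: -P_ij / sqrt(P_ii * P_jj)
d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)
print(np.round(partial_corr, 2))          # many entries shrunk to (near) zero
```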

Different questions
The networks in Figure 1 and Figure 2 answer different questions: the former shows how strongly any two assessments are correlated, whereas the latter shows which connections remain once weak and potentially spurious ones have been shrunk to zero. Reliably estimating such networks requires a number of observations that is large relative to the number of assessments. Likewise, using sub-scores instead of (as in the example) overall exam scores will require larger numbers of observations, as more variables will be involved (i.e., one overall score is a combination of several sub-scores); as always, models involving more variables tend to put higher demands on sample size than models involving fewer variables. In smaller cohorts, this comes down to focusing on smaller numbers of assessments (e.g., two assessments with three or four sub-scores each instead).

Notes

Funding
This study did not receive financial support from external sources.

Conflicts of interest disclosure
The author declares no competing interests relevant to the content of this study.