Differential item functioning analysis on a measure of mindfulness (MAP)

Analysis of differential item functioning (DIF) is of great importance when developing or validating psychological instruments, since it makes it possible to identify whether an instrument is biased with respect to sample characteristics. Accordingly, in this study we tested for DIF in the items of a new instrument for the measurement of mindfulness (MAP) with respect to sex, age, meditation practice, and use of alternative medicine. A total of 788 Brazilian adults (mean age 26 years, SD = 9.59), most of them women (79%) and single, responded to the MAP. Overall, no DIF of meaningful magnitude was identified in the positively worded items, indicating that these items do not specifically favor any of the groups tested in the present study.


Introduction
Verifying the impact of demographic variables on test items is a relevant topic in psychometrics (Zwick, 1990). In this respect, the analysis of differential item functioning (DIF), according to Baer et al. (2010), measures the extent to which groups of respondents with the same level of the latent trait measured by a given instrument respond differently to its items. For these authors, when DIF is identified in an item, "any significant differences for such an item might not be related with the latent trait" itself. Likewise, according to Linacre (2010), DIF analysis "investigates the items in a test for signs of interactions with sample characteristics, such as sex and age", whilst Karami (2012) indicates that "any item flagged as showing DIF is biased if, and only if, the source of variance is irrelevant to the construct being measured by the test".
The presence of DIF in an item will not, by itself, jeopardize the whole measurement, especially when items with moderate to large DIF represent less than 25% of the total items (Penfield & Algina, 2006) and when DIF is balanced across the tested groups in terms of amount and content (Linacre, 2010), that is, when items favor or function against respondents from the different groups to the same extent. Teresi (2006) calls this process "cancellation of DIF". Linacre (2010) also states that DIF is only meaningful when it is statistically significant and displays a large contrast between the tested groups. It is also worth mentioning that verifying whether DIF is balanced within instruments under development or validation is a way to guarantee the fairness of future evaluations that will use them (AERA, APA, NCME, 1999).
Different instruments have been used to understand and validate the construct of mindfulness within different cultures, for example, the Five Facet Mindfulness Questionnaire (FFMQ) (Baer et al., 2006), the Freiburg Mindfulness Inventory (FMI) (Walach et al., 2006), and the Mindful Attention Awareness Scale (MAAS) (Brown & Ryan, 2003). However, even though the number of instruments to assess mindfulness has grown, "there remains a lack of clarity in the operationalization of this construct, and underlying mechanisms" (Vago & Silbersweig, 2012). Moreover, some issues concerning the psychometric properties of these instruments remain unclear, especially from the item response theory perspective, which obscures the assessment of this construct. A few studies have nevertheless addressed this question to gather different sources of validity evidence for instruments of mindfulness and, in some cases, have included DIF analyses. Van Dam, Earleywine and Borders (2009) tested whether the items of the FFMQ would function differently for meditators and nonmeditators (students) with the same level of mindfulness. The authors reported DIF in 18 of the 39 items of the instrument, involving all five facets, and "even under limited power conditions". Besides these 18 items, six other FFMQ items showed large DIF. Items 7 (I can easily put my beliefs, opinions, and expectations into words), 24 (When I have distressing thoughts or images, I feel calm soon after), 27 (Even when I'm feeling terribly upset, I can find a way to put it into words) and 37 (I can usually describe how I feel at the moment in considerable detail) showed bias against meditators, while items 8 (I don't pay attention to what I'm doing because I'm daydreaming, worrying, or otherwise distracted) and 38 (I find myself doing things without paying attention) favored meditators.
The authors concluded that the FFMQ functions differently for meditators and nonmeditators, to such an extent that it might be problematic to use this instrument to compare the two groups. Baer et al. (2010) replicated the study by Van Dam et al. (2009) and reported that only 4 items showed some evidence of significant DIF. Items 1 (When I'm walking, I deliberately notice the sensations of my body moving) and 11 (I notice how foods and drinks affect my thoughts, bodily sensations, and emotions) favored meditators, while items 18 (I find it difficult to stay focused on what's happening in the present) and 23 (It seems I am "running on automatic" without much awareness of what I'm doing) were biased against meditators. In this study, "meditators were more likely to endorse positively worded items whereas nonmeditators were more likely to deny negatively worded (reverse-scored) items". It is also important to note that two of the items with DIF reported by Baer et al. (2010) were not mentioned in the study led by Van Dam et al. (2009). These conflicting findings indicate that more psychometric studies of the FFMQ are needed, especially because the FFMQ is currently one of the most widely used instruments of mindfulness. Sauer et al. (2011) assessed the psychometric properties of the Freiburg Mindfulness Inventory (FMI) using a Rasch model approach. Among other analyses, the authors tested for DIF by comparing groups in terms of age, education, mindfulness practice, and spiritual practice. In this study, DIF was not present in the items for any of the tested groups. In contrast, in a further DIF study with this same instrument, Sauer et al. (2013) reported DIF by sex in item 13 (I am impatient with myself and with others [reverse scored]), although the authors did not specify whether this item favored men or women.
Also, when the sample was split at the median age, strong DIF was found for items 2 (I sense my body, whether eating, cooking, cleaning or talking), 3 (When I notice an absence of mind, I gently return to the experience of the here and now), 4 (I am able to appreciate myself), 8 (I accept unpleasant experiences), 9 (I am friendly to myself when things go wrong) and 13 (already mentioned). The authors suggest that the FMI functions differently for younger and older respondents, indicating that scores on the FMI may be biased by the age of the respondents.
In line with this, Inchaustia, Prieto and Delgado (2013) analyzed the psychometric properties of the Mindful Attention Awareness Scale (MAAS) using the Rasch model. Among the results, the authors reported that item 9 (I get so focused on the goal I want to achieve that I lose touch with what I am doing right now to get there) was the only one to present significant DIF when comparing participants assigned to the experimental and control groups. Additionally, in a more recent study involving the MAAS, Medvedev et al. (2016) reported a significant DIF effect for item 5 (I tend not to notice feelings of physical tension or discomfort until they really grab my attention) when comparing samples of university students and the general population. These results indicate that, overall, the MAAS is a fair instrument, although more studies with this instrument are needed to ensure its psychometric quality.
Instruments of mindfulness have been widely used to understand the construct and to evaluate interventions on this topic, so it is important to test whether they are sensitive to differences that are related not to the latent trait being measured but to sample characteristics. The purpose of the present study was therefore to verify the presence of DIF in the items of a new instrument to assess mindfulness (MAP) in adults. We hypothesized that the MAP would not present differential item functioning to any extent that would jeopardize measurement across the tested groups (meditators × nonmeditators; men × women; users × nonusers of alternative medicine [for example, herbalism, Bach flower remedies, and chromotherapy]; and age [based on the median]), and that it would not favor participants of any of these groups.

Method
Participants and Procedures
We invited Brazilian adults to participate in this study through two approaches. University students (N = 558) from different regions of the State of Santa Catarina (South of Brazil) were invited to participate. Of the total sample, 28 students were from a university located in the city of São Paulo. Additionally, another 230 participants, from different regions of Brazil, responded to the MAP through an online link hosted on the SurveyMonkey platform. Most participants declared themselves to be women (79%) and single.
Participants who indicated a minimum of one year of practice with any type of meditation were included in the group of meditators. This cutoff has been used in previous studies of mindfulness (Lau et al., 2006). The same cutoff was adopted for separating participants regarding the use of alternative medicine: those who declared that they currently use any type of alternative medicine (for instance, herbalism, Bach flower remedies, or chromotherapy) and have done so for at least one year were placed in the group of users. For splitting the groups in terms of age, we followed the study by Sauer et al. (2011), in which the authors used the median. Participants who reported age > 21 were assigned to group B, whereas the others were assigned to group A. Because splitting the sample by age may mask differences in the educational level of the sample, for this analysis we considered only participants who declared themselves to be undergraduate students without any previous university degree. Table 1 presents additional characteristics of the sample.
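The grouping rules described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the procedure actually run for the study: the function names and the argument `years_of_practice` are hypothetical, while the cutoffs (at least one year of practice; age above 21) come from the text.

```python
def meditation_group(years_of_practice: float) -> str:
    """At least one year of practice places a participant among the meditators."""
    return "meditator" if years_of_practice >= 1 else "nonmeditator"


def age_group(age: int, median_age: int = 21) -> str:
    """Participants older than the median go to group B, the others to group A."""
    return "B" if age > median_age else "A"


# Hypothetical participants, for illustration only.
print(meditation_group(0.5), age_group(21))  # nonmeditator A
print(meditation_group(3), age_group(30))    # meditator B
```

The same one-year rule would apply to the alternative medicine grouping, with years of use in place of years of practice.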

Measures
The Measure of Mindfulness (MAP) (Pires, 2016) is a 47-item self-report scale that assesses the following four domains of the construct of mindfulness: a) mindfulness (α = .88), which indicates how aware, open, curious and sensitive one is in relation to one's own experiences, activities and surroundings. This component is also related to the intentional monitoring of experiences, involving attitudes such as observing and describing, also in a non-elaborate manner, as when one finds oneself thinking of something; b) attention regulation (α = .84), which evaluates the negative pole of mindfulness and refers to the voluntary use of different attention skills (concentrating, alternating and dividing), whether for attaining higher awareness or for self-regulation; c) acceptance (α = .74), which also evaluates the negative pole of the construct and indicates how much a person accepts her own experiences and lets them be as they are, without wishing to avoid or alter them; and d) novelty seeking (α = .62), a factor that also evaluates the negative pole; it denotes the attitude of living in an automatic state of functioning and is related to the intentional promotion of awareness through the exploration and discovery of new elements in the environment and context. This attitude amplifies sensitivity to context and helps to prevent aimless wandering or being guided by automatic functioning. The MAP is intended to be a new instrument to assess mindfulness, whose components were drawn from earlier studies on mindfulness assessment.
To design the MAP components, we compared the factorial structures extracted from other current instruments of mindfulness (see Pires et al., 2015). Then, by qualitatively comparing the components extracted from these instruments, we selected five of the most frequent ones: a) conscientiousness and orientation to the present moment, which indicates the intentional monitoring of inner experiences (thoughts, feelings, body sensations) and may also occur in a non-elaborated manner, involving awareness and insight; b) attention regulation, which refers to the intentional use of the different abilities of attention (dividing, focusing and alternating) in order to promote its regulation, the opposite of living on autopilot; c) acceptance and nonreactivity, which encompasses allowing one's own experiences to follow their transitory course, avoiding attaching evaluative labels to them; d) observing experiences, which refers to the ability to intentionally perceive one's own inner experiences affecting other experiences and behavior; and e) describing the experiences, which indicates the capacity to use words to reproduce the experience of the mindful state.
Based on these five components collected from current instruments of mindfulness (Pires et al., 2015), and considering the mindful and mindless states (Langer, 2014), we wrote 275 positively and negatively worded items to assess the construct of mindfulness in adults, on a five-point scale. Subsequently, this preliminary pool of items was subjected to expert analysis (N = 4) and semantic analysis (N = 16) (Pires et al., in press). Of these, 145 items remained in the pool, with moderate overall agreement between pairs of experts (k = 0.5059; p < .05). In that study, we observed low agreement among the Brazilian experts who analyzed the pool for the components Observing, Conscientiousness and Attention, indicating that these dimensions might, to some extent, reflect overlapping components.
This hypothesis of overlapping components was tested in a follow-up study (Pires, 2016), in which an exploratory factor analysis of the MAP showed that all of the items originally written to assess the positive aspects of the construct (observing, describing, conscientiousness, awareness, insight, attention) merged into one single factor. This result corroborated previous studies that have suggested the unifactorial structure as the most promising for representing the construct of mindfulness (Pires et al., 2015), including the FFMQ (Baer et al., 2006), which considers facets of mindfulness, not factors. However, it is important to bear in mind that a factorial solution with different facets would be a more informative structure for the construct, which indicates that more studies on this topic are needed.
In addition, that study provides evidence that, at least within the Brazilian population, negatively worded items tend to split from the positively worded ones, a finding similar to that reported in a study that validated the FFMQ with a Brazilian sample (Barros et al., 2015). Moreover, we also tested the psychometric properties of the MAP using an item response theory model (Pires, 2016). We found good fit of the items to the Rasch model, with adequate levels of infit, outfit, and item-theta correlation, as well as coherence between the level of mindfulness of the sample and the different levels of difficulty of the items, based on the item/person map. The item/person map indicated that the mindfulness subscale covers different levels of its underlying latent variable, ranging from very low to medium-high levels. However, it should be highlighted that very high levels of mindfulness are not covered by the MAP.

Ethical Issues
This research was approved by the Institutional Review Board (IRB) of the Federal University of Santa Catarina (Number: 43086815.4.0000.0121). All individuals had to agree to and sign the informed consent form before participating in this study.

Analytic Approach
Firstly, the 47 items of the MAP were subjected to exploratory factor analysis using the statistical package Stata 14. This procedure was intended to verify the unifactoriality of the subscales within the different groups tested in the present study. To verify the presence of DIF, the items were analyzed using the Winsteps software (Linacre, 2010). We used the Mantel-Haenszel chi-square to identify DIF in the items (Zwick, 1990). To assess the effect size of the difference (contrast) (Linacre, 2010; Penfield, 2007), we used the criterion indicated by Linacre (2010), according to which contrasts < .43 are considered negligible, contrasts from .43 to .64 are slight to moderate, and contrasts > .64 are moderate to large. Moreover, it is worth highlighting that a positive DIF contrast indicates that the item is more difficult for the first group, favoring the second group.
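The effect-size criterion and the sign convention described above can be written as a small sketch. It is illustrative only and does not reproduce the Winsteps computation: the function names are hypothetical, and applying the cutoffs to the absolute value of the contrast is an assumption made here so that negative contrasts are classified by their size.

```python
from typing import Optional


def classify_dif_contrast(contrast: float) -> str:
    """Label the size of a DIF contrast using the cutoffs of .43 and .64."""
    size = abs(contrast)  # assumption: only the magnitude matters for the label
    if size < 0.43:
        return "negligible"
    if size <= 0.64:
        return "slight to moderate"
    return "moderate to large"


def harder_for(contrast: float, first_group: str, second_group: str) -> Optional[str]:
    """A positive contrast means the item is more difficult for the first group."""
    if contrast > 0:
        return first_group
    if contrast < 0:
        return second_group
    return None  # a contrast of zero favors neither group


# Hypothetical contrast values, for illustration only.
print(classify_dif_contrast(0.20))   # negligible
print(classify_dif_contrast(-0.50))  # slight to moderate
print(harder_for(0.70, "meditators", "nonmeditators"))  # meditators
```

A label from this sketch would still need to be read together with the significance test, since a contrast has to be both significant and sizable to be meaningful.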

Results
With respect to differences by experience with meditation, only three positively worded items (144, 60, 62) showed some evidence of DIF. Items 144 (When I am aware of my feelings [such as joy and sorrow] I try to observe them from an "outer perspective") and 60 (I tend to perceive details on nature) displayed bias against meditators, whereas item 62 (I catch myself thinking of how I am feeling) showed bias against nonmeditators. However, although DIF was present, the magnitude of the contrast for these three items is considered negligible. Furthermore, item 65 (I sometimes catch myself paying attention to my thoughts) displayed bias against meditators, but the contrast of the difference was not statistically significant, ranging from -.42 to .43. Regarding the negatively worded items, only item 275 (I tend not to recall details of the places where I usually go) displayed moderate bias against meditators.
In relation to differences in terms of sex, six positively worded items displayed significant DIF, most of them concerning emotion regulation. Items 123 (I am sometimes aware of how my thoughts guide my emotions), 247 (I think I could write about my unpleasant emotions), 75 (When drinking water, I perceive the sensations that it causes in my body), 180 (When drinking water, I perceive my attention drawing into my body) and 62 (already mentioned) showed bias against men, whilst item 63 (Sometimes I catch myself thinking of what I am doing) displayed bias against women. However, despite these significant differences, the magnitudes of the contrasts are all considered negligible. Likewise, four negatively worded items displayed significant DIF with a negligible magnitude of contrast.
Regarding the contrast by use of alternative medicine, we identified seven positively worded items with DIF. Items 1 (I seek to accept the emotions I feel), 53 (I find I know where my thoughts go when I am in the shower), 55 (When drinking water, I imagine its track in my body), 58 (Sometimes I perceive myself aware of my body sensations) and 60 (already mentioned) showed significant differences favoring participants who declared that they use alternative medicine. Additionally, items 144 and 123 displayed significant bias against this same group of participants. However, despite these significant differences, the magnitudes of the contrasts for these items were all negligible.
In relation to differences in terms of age, item 103 (from the subscale Attention and its regulation) and item 34 showed significant differences against participants who reported being less than 21 years old. Items 60 and 62 (already mentioned) and item 185 (I pay attention to some emotions [such as jealousness, courageousness, homesickness] when I get aware they arise) showed bias against those who declared an age below 21. Items 1 and 53 (already mentioned) and item 8 (I try to understand the emotions I have, no matter whether they are positive or not) displayed bias against participants who were above the cutoff. However, the magnitude of the contrasts for these six positively worded items was low overall, indicating that such differences are negligible. In addition, four negatively worded items, from all subscales, displayed negligible bias, favoring and functioning against all tested groups in the present study. Moreover, item 101 (Sometimes, while I am doing a task, I catch myself thinking of how the next weekend is going to be) showed significant moderate DIF against those who reported an age above 21, that is, favoring younger respondents. More details regarding the results are shown in Table 2.

Discussion
The objective of this study was to verify whether there is DIF in the items of a new instrument designed to assess the construct of mindfulness (MAP). We predicted that the items would not be biased, nor function specifically against any of the groups tested in the present study, at any level that would jeopardize the measurement.
Although we identified some positively worded items with significant DIF, the contrasts of the differences across the tested groups were all negligible, indicating that the positive subscale of the MAP does not favor participants with any of the characteristics tested in the present study. In contrast, five negatively worded items displayed moderate to large DIF, most of them from the subscale attention and its regulation, indicating bias in these items across participants by age, use of alternative medicine, and sex. In this respect, although these items may jeopardize future measurement, it is important to recall that the number of items with DIF does not reach the cutoff of 25% of the total scale proposed by Penfield and Algina (2006). Nevertheless, new studies with these items are needed to confirm the findings of the present study and to verify the real impact of these five items with DIF on the overall measurement.
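The comparison with the 25% cutoff can be made explicit with a short calculation. The sketch below is illustrative only; the function name is hypothetical, and taking all 47 items of the MAP as the denominator is an assumption, since the text does not state which item total the proportion refers to.

```python
def dif_share(n_dif_items: int, n_items: int) -> float:
    """Proportion of items flagged with moderate to large DIF."""
    return n_dif_items / n_items


CUTOFF = 0.25  # cutoff attributed in the text to Penfield and Algina (2006)

# Five items with moderate to large DIF; 47 items assumed as the total scale.
share = dif_share(5, 47)
print(f"{share:.1%}")   # 10.6%
print(share < CUTOFF)   # True
```

Under this assumption, the five flagged items amount to roughly one tenth of the scale, well below the cutoff.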
Additionally, this result supports the understanding that reverse-scored items are an issue that needs further discussion in the field of mindfulness (Baer et al., 2010; Van Dam et al., 2009). In line with this, if we consider that both the mindful and the mindless states are likely adequate representations of the construct of mindfulness, negatively worded items would be worth retaining in the instrument, as previously discussed by Baer et al. (2010). Therefore, more studies within the field of mindfulness should verify the real impact of using reverse-scored items on the overall measurement of this construct.
With respect to differences by experience with meditation, the findings of the present study are in accordance with previous studies on this topic (Baer et al., 2010; Inchaustia et al., 2013; Medvedev et al., 2016; Sauer et al., 2011), which have reported minimal presence of DIF in the FFMQ, FMI and MAAS, and that meditators tend to endorse positively worded items, whereas nonmeditators tend to deny the negatively worded ones. In the present study, only one negatively worded item (275) displayed significant bias against meditators, a result that is theoretically congruent, since engagement with the environment is not a core product of meditation practices but rather of creativity and personal openness to new experiences, among other possible factors. Additionally, considering that we found balanced bias (in favor and against) in all groups tested in the present study, we understand that these instances of DIF might be cancelling one another (Teresi, 2006). Further studies, however, should retest these differences with the MAP in different samples of respondents.
To reiterate the point made above, in previous studies that identified differential item functioning in instruments of mindfulness across samples of meditators and nonmeditators, DIF involved specific components of the construct, mainly describing, acting with awareness, and nonreacting (Baer et al., 2010; Van Dam et al., 2009). In the present study, the two items that displayed bias against meditators refer to the components self-regulation and engagement with the environment, whereas item 62, which showed bias against nonmeditators, refers to self-monitoring. These components of the MAP are associated, in terms of content, with the corresponding components of the FFMQ and the FMI. As stated, these findings are theoretically congruent, since self-monitoring practices are at the core of some types of meditation (Vago & Silbersweig, 2012), and self-regulation and engagement with the environment are abilities that may be learned from sources other than meditation, such as going to therapy or psychoanalysis, or attending religious services. Furthermore, the results support the assumption that the items of the MAP are free of bias and fair for the different types of respondents, considering the characteristics tested in the present study.
Another important result on this topic is that two of the positively worded items (65 and 144) displayed strong contrast against meditators, but with no statistical significance, whereas item 62 showed a significant but small contrast, favoring this same group. This result was possible because the first two items may have very high or very low scores, so their standard errors are high, whilst item 62 may have a central score (Linacre, M., personal communication, March 30, 2017). It is worth highlighting that DIF needs both size and significance to be meaningful.
Regarding sex differences, we found significant DIF for six positively worded items and one negatively worded item. However, Linacre (2010) notes that DIF may not be as problematic as one may think, especially when it occurs at a negligible level and is balanced in terms of quantity, and this condition was met in the present study. Only one negatively worded item (204) showed significant moderate DIF, favoring men. The items that displayed significant differences between men and women are mainly about emotion regulation. It is therefore likely that these differences reflect women's vulnerability to anxiety, since women, compared with men, tend to react more to negative than to positive stimuli (Gardener, Carr, MacGregor, & Felmingham, 2013). Additionally, within the scope of the construct of mindfulness, our findings are consistent with the study by Sauer et al. (2013), whose results indicate that the FMI functions similarly for men and women.
With regard to DIF by age of the participants, the findings of the present study show that, overall, the MAP neither favors nor is biased against older or younger respondents, even though some items displayed negligible contrast and one item showed moderate DIF favoring younger respondents. This finding of absence of DIF corroborates previous research on this topic (Medvedev et al., 2016; Sauer et al., 2011). On the other hand, our result differs from the findings described by Sauer et al. (2013), who reported strong DIF in seven items of the FMI when considering the age of the respondents.
Although we found significant differences in six items, three with DIF against older respondents and three favoring this same group, the magnitudes of the contrasts between the two groups that were split by age are considered negligible, and these items might also be cancelling one another (Teresi, 2006). This indicates that the items of the MAP appear to function similarly for older and younger respondents. It is worth mentioning that the items displaying bias in favor of the older respondents involve high abilities of emotion regulation/intelligence, which are expected to be present in late adulthood, since increasing age has been associated with factors such as greater clarity of emotions (Orgeta, 2009). Item 101, in turn, which seems to function against the younger respondents, involves an attitude of anxiety, frequently associated with this stage of human development.
Regarding differences in terms of use of alternative medicine, since previous research has not yet addressed this question, we have no parameters against which to compare the findings of the present study within the scope of the construct of mindfulness. However, Sauer et al. (2011) tested DIF considering participants' "spiritual practice", a variable that seems similar to the characteristic tested here. For this variable, the authors reported no DIF for the items of the FMI, which are positively worded. Similarly, in the present study, only two negatively worded items showed significant bias against respondents who indicated that they do not currently use alternative medicine. If we consider that there may be associations between an individual's level of openness to the practice of meditation and the use of alternative medicine, we could interpret that the source of variance is, in this case, relevant to the construct being measured by the test, so that, according to Karami (2012), these items are not necessarily biased. Also, it is important to highlight that testing DIF in instruments of mindfulness with respect to different characteristics of the sample is of great importance to ensure the quality of the assessment of this construct, as stated by Sauer et al. (2013).
Considering that we found only five items displaying significant bias against one or another population tested in the present study, and following the recommendations made by Linacre (2010), two possible courses of action can be taken to resolve these instances of DIF: a) ignoring these DIF effects, since their contrasts were small and seem to be cancelling one another, even though their effects are not perfectly balanced; or b) resolving DIF specifically for the factor in which they displayed the strongest effect, in this case, attention and its regulation. Nonetheless, even for this subscale, positive and negative contrasts tend to balance out.
Although some research on mindfulness has shown DIF in items of some of the most widely used instruments to assess this construct (Baer et al., 2010; Sauer et al., 2013; Van Dam et al., 2009), the findings of the present study indicate that, overall, the positively worded items of the MAP do not present significant DIF of any magnitude that would jeopardize the assessment of the construct. Accordingly, a conclusion that can be drawn from the present study is that the MAP is a fair instrument (AERA, APA, NCME, 1999), since it does not favor participants with any of the characteristics tested here. It is also important to note that guaranteeing that psychological instruments are not biased by characteristics of the sample is of great importance within the overall field of Psychology, since instruments are one of the most legitimate ways of collecting data in research and practice in this field.
Some limitations of the present study must be highlighted. The first relates to the sample, which is small and not properly representative of the entire population; this indicates that the present study is preliminary and that future studies of the MAP involving larger samples are needed. Also, the fact that we did not control for the type of meditation practiced by the meditators must be considered as one of the limitations of the present study. Other limitations regarding the sample refer to the fact that most of the participants were women, undergraduate students, and single. Finally, the different means obtained for the groups divided by the median age may be biasing, to some extent, the results reported on this topic in the present study.
Finally, considering the small number of items displaying significant DIF and the small effect sizes obtained for the group differences, together with the fact that only five items, all negatively worded, displayed moderate to large contrasts, the findings of the present study indicate that the MAP is a fair and unbiased instrument (Linacre, 2010; Karami, 2012; Penfield & Algina, 2006) with regard to the sample characteristics tested here.