
Measurement Invariance of the Differentiation of Family System Scale for Koreans and Americans
The Differentiation in the Family System Scale (DIFS; Anderson & Sabatelli, 1992) is a popular scale used to assess emotional connectedness and separateness within the context of family-of-origin experiences. The current study contributes to the research on family patterns of interaction by assessing the measurement invariance of the DIFS for Americans and Koreans. The results indicated that the factor structure and loadings of the DIFS were invariant across both samples; however, latent means should not be compared across the two groups because strong invariance was not met. This implies that there may be bias when participants from different cultures respond to particular items. The findings challenge the practice, often used by internationally based researchers, of using translated versions of scales developed for use with U.S.-based samples/populations. The non-invariance of some items is discussed in terms of the linguistic and cultural differences that would influence responses to translated measures for Koreans. This research emphasizes the need for culturally grounded measures that integrate cultural factors into the measurement of marriage and family constructs.
INTRODUCTION
This paper focuses on issues related to the cultural robustness and invariance of measures designed to assess family constructs. In recent years, there has been a noticeable growth in the international scholarship focusing on family issues. In many instances, international scholars have adapted measures created by U.S. scholars for use in their studies. The use of these adapted measures is grounded in the unstated premise that measures created by U.S. scholars are culturally robust, meaning that in spite of differences in cultural value orientations, customs, practices, and/or policies, the conceptualization and operationalization of a construct is invariant. The view expressed within the paper is that, at the very least, support for the cultural robustness of a measure requires an examination of whether the measure performs in a statistically invariant way across the two cultures.
For illustrative purposes, this issue is explored using a measure developed in the U.S. that has been increasingly adapted for use by Korean scholars. Specifically, within the past decade, Bowen's Intergenerational Theory of Family Functioning (Bowen, 1976) and the Differentiation in the Family System Scale (DIFS), developed by Anderson and Sabatelli (1992), have been widely used in studies focusing on family patterns of development and adjustment within Korean families. The DIFS has been used, however, without first exploring whether it is a culturally relevant, reliable, and valid way of assessing family processes indicative of family levels of functioning in a non-Western culture such as Korea. In other words, the Korean studies using this measure have done so without consideration of the possibility that the construct the scale is designed to assess may need to be re-conceptualized or re-operationalized in order to be more sensitive to Korean cultural values and norms. This raises the question of whether a measure based primarily on families in Western cultures can be reliably applied to different cultural settings, because families are embedded in a socio-cultural and historical context.
The Conceptualization and Operationalization of Family System Differentiation
Multigenerational perspectives on individual and family development are based on the assumption that the dynamics within a family of origin create a legacy that influences the trajectory of both individual and family development (Bowen, 1976). Since the development of Bowen family systems theory in the 1960s, multigenerational theorists and therapists have made extensive use of the concept of differentiation when referring to the manner in which family patterns affect the trajectory of individual health and development. In Bowen's model, a well differentiated system is characterized by patterns of interaction that enhance the abilities of individual family members to act with an age-appropriate degree of autonomy, take personal responsibility for age-appropriate tasks, and experience strong connections with significant others or close friends and family. Bowen proposes that poorly differentiated family systems are characterized by boundary processes, conflict-management customs, and emotional climates that increase the anxiety within individuals, resulting in them becoming less and less able to regulate their emotions and behaviors in constructive ways (Bowen, 1976; Napier, 1988; Schnarch, 1997; Skowron, 2004; Skowron, Holmes, & Sabatelli, 2003; Wamboldt & Reiss, 1989; Wilcoxon & Hovestadt, 1985; Williamson, 1982). Over the past 20 years there has been growth in the research using the concept of family system differentiation to investigate the health and functioning of families and their members, including studies within the fields of nursing, marriage and family therapy, family studies, and psychology (Knauth & Skowron, 2004; Murdock & Gore, 2004; Rosen, Bartle-Haring, & Stith, 2001; Skowron, 2000).
While the development of psychometrically sound measures of concepts central to Bowen theory has lagged behind his theoretical work, there have been limited attempts to develop measures of family system differentiation. For example, in 1992, Anderson and Sabatelli (1992) developed the DIFS, which employs 11 items to assess family patterns of distance regulation within the context of family of origin experiences. Within studies conducted in the U.S., the psychometric properties of the DIFS have been examined, including its internal consistency, test-retest reliability (Anderson & Sabatelli, 1992; Bartle-Haring & Sabatelli, 1989; Sabatelli & Bartle-Haring, 2003), construct validity, and discriminant validity (Anderson & Sabatelli, 1992; Bartle-Haring & Sabatelli, 1998). For example, validity for this measure was supported by the finding that DIFS scores were associated with marital intimacy and psychological maturity among married adults (Bartle-Haring & Sabatelli, 1998).
It is often the case that cross-cultural researchers use an instrument that has been found to show adequate psychometric properties in one cultural group, and translate and administer it to another cultural group (Milfont & Fischer, 2010). This practice has, generally, been the case within the studies focusing on the construct of differentiation within Korean samples as a relatively large number of studies have used a translated version of the DIFS (Chun & MacDermid, 1997; Nam, 2003; Nam & Han, 1999). For example, empirical studies in Korea have used the DIFS to understand how family members balance closeness and autonomy within their family systems across generations (Kim & Kim, 2004; Oh & Choi, 2006; Park & Kim, 2010). In all of these studies, the researchers assumed that the DIFS measure performs in a culturally invariant way. Thus, an important but unanswered question is whether the construct measured is identical across cultural groups. In particular, instruments derived from Western cultures may not be well-suited for measuring constructs within a non-Western context (van de Vijver & Tanzer, 2004) because cultural values influence how families manage boundary processes between generations. Thus, it is possible that measures will need to be adjusted to reflect these different cultural orientations.
Cultural Influences on Measures of Family Differentiation
The family is embedded in a socio-cultural context that, in turn, influences how family relationships are structured and experienced. For example, it is reasonable to expect that the structure and experience of family patterns of interaction will differ in countries with individualistic versus collectivistic cultural value orientations. Collectivistic value orientations in Korea are manifested in an emphasis on familism, which refers to a family-centered worldview (Rappa & Tan, 2003). Familism in Korea is uniquely influenced by Confucian philosophy and places a strong emphasis on filial piety, intergenerational loyalties and support for the elderly. As such, Korean familism is characterized by an emphasis on the interdependence of family members, which may not encourage children's psychological independence from their parents. Furthermore, filial piety, the core value of familism, refers to intergenerational support: adult children's care for older parents, as well as parents' support for their children even after the children are married (Lee & Bauer, 2013). Therefore, we can speculate that these value orientations will have a strong and long-lasting impact on family dynamics in Korea. It follows from this that measures used with Korean populations need to consider how specific cultural norms or social expectations filter into family processes. And, it follows that using a measure developed on U.S. populations and reflecting U.S. cultural and family norms cannot ensure the measure's cross-cultural relevance.
Families are as important within individualistic cultures as in collectivistic cultures. However, the emphasis on individual rights and responsibilities, as well as the value of autonomy and personal authority, all suggest that family patterns of interaction between and among generations will be structured differently within individualistically orientated family systems when compared to collectivistically orientated family systems. Further, as argued by a handful of researchers, it is reasonable to expect that the interrelationships between family patterns and individual development will vary across cultures. For example, Chun and MacDermid (1997) found that intergenerational fusion and individuation were negatively associated with self-esteem in Korean adolescents (opposite of what is typically found in studies of U.S. youth and families). Manzi and colleagues (2006) found that family enmeshment did not predict maladaptive behavior among Italian adolescents, taken, again, to support the conclusion that cultural value orientations influence how intergenerational patterns of separateness and connectedness are structured and experienced. Further evidence for a more culturally nuanced view of the adaptive and maladaptive levels of separateness and connectedness found within families comes from those studies examining these issues in Western and Asian cultures (Rothbaum, Rosen, Ujie, & Uchida, 2002; Segal, 1991; Yoshida & Busby, 2012).
It is clear that measures designed to capture effective and ineffective patterns of interactions between and among multi-generational family systems should be culturally grounded. Bowen's concept of differentiation is based on the experiences of middle class Caucasian American families in the U.S., where there is clearly a cultural emphasis on individualism and independence (Gushue & Constantine, 2003). In contrast, a family in a collectivistic culture may encourage its members to develop connectedness to a greater extent while also supporting the achievement of autonomy to a lesser extent. In other words, individualism emphasizes the self as being independent from others and focuses on the development of autonomy over time, whereas in cultures with collectivistic value orientations the self is viewed as interdependent, with an emphasis on emotional dependence and group solidarity.
Despite these cultural differences, previous researchers who evaluated the psychometric properties of the DIFS within a different cultural context examined only the reliability or the factor structure of the measure in order to ensure the equivalence of the scale. For example, a validation study of the translated version of the DIFS reported acceptable levels of reliability and validity as a "one size fits all" solution to validate the measure in Korean samples (Nam, 2003; Nam & Han, 1999). These studies, however, lacked evidence for measurement invariance, which would suggest whether the measure performs in the same way across cultures (Byrne & Campbell, 1999; Byrne & van de Vijver, 2014; Kline, 2011). That is, equivalence testing of data gained from respondents in many cultures is needed in order to support the conclusion, from a statistical point of view, that a measure is cross-culturally relevant.
Thus, the present study tested measurement invariance, which is the degree to which the DIFS's measurement model functions in the same way across two cultures: Korea and the U.S. Although measurement invariance is what allows a measure to be used with individuals from different cultures, past studies using the DIFS included no measurement invariance testing procedure. This type of measurement study is particularly important to cross-cultural research, such as cross-cultural validation of a measure. Determining whether the DIFS demonstrates measurement invariance would allow the DIFS to better serve its function of measuring family dynamics across cultures.
METHODS
Participants
The Korean participants for this study were randomly selected from a marketing database (Embrain, a marketing research company based in Asia). The final sample in Korea included 327 males and 331 females. The ages of the participants ranged from 21 to 45 years, with a mean age of 34.2 years (SD = 4.1). In terms of education, participants fell into two groups: those who were high school graduates or completed some college (12.4%) and those who were college students or earned a graduate degree (87.5%).
The U.S. participants were recruited in Northeastern Connecticut in three ways: solicitation in public places (e.g., mall, public buildings), solicitation in schools and day care settings, and solicitation via telephone. These procedures produced a total sample of 235 male and 333 female participants. The ages of the U.S. participants ranged from 22 to 63 years (M = 43, SD = 7.5). Also, the education level of these participants fell into two groups: those who were high school graduates, attended college for 1 year, or received specialized training (67%) and those who were college graduates or had received graduate-level training (33%). The participants in the sample were predominantly White, while less than 7% of the sample was Hispanic, Black, or Asian. Overall, the U.S. participants had lower levels of education on average compared to the Korean participants.
Instrument
The DIFS consists of 11 items to measure respondents' perceptions of the levels of differentiation characterizing their family of origin (Anderson & Sabatelli, 1992). The measure focuses on a respondent's accounts of the patterns of interaction between selected family members. This is accomplished by using a circular questioning format that assesses respondents' perceptions of the patterns of interaction within six different dyadic relationships: their mother's relationships with their father, their father's relationship with their mother, their mother's relationship with them, their relationship with their mother, their father's relationship with them, and their relationship with their father. In this study, data on four different dyadic relationships were assessed (e.g., mother-to-child and child-to-mother and father-to-child and child-to-father) using 11 items repeated for each of these different dyadic relationships. The scale used a 5-point Likert-type response with anchors ranging from 1 (never) to 5 (always). The possible range of scores for each subscale was 11 to 55 with higher scores indicating more differentiation.
This study used a translated version of the DIFS for Koreans. The English survey was translated into Korean by Nam and Han (1999) and then the Korean version was translated back into English by two other bilingual speakers with 5 years of U.S. residency and a graduate degree in Psychology. The internal consistency reliabilities (α) for these dyadic subscales within this sample ranged from .80 to .86 for Korean participants and from .82 to .93 for U.S. participants.
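The internal consistency coefficient reported above can be illustrated with a minimal sketch of Cronbach's alpha. This is not the authors' code: the function name `cronbach_alpha` and the small response matrix are hypothetical, and the sketch assumes complete data with every item scored in the same direction.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent (hypothetical layout)."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses on the 1-5 scale (4 respondents x 3 items)
demo = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 2]]
print(round(cronbach_alpha(demo), 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of cross-cultural equivalence.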
Sampling Comparably Across Cultures
In order to ascertain that subsequent analyses were not moderated by demographic differences between the samples, the U.S. and Korean samples were compared on three demographic variables: age, gender, and education. A simple pairwise t-test on age and a chi-square analysis of the gender and education levels revealed significant differences between the U.S. and Korean samples. Also, independent sample t-tests showed that Koreans were younger (M = 34, SD = 4.1) than Americans (M = 43, SD = 7). Additionally, in terms of education levels, Koreans were more educated than Americans (χ² = 384.73, df = 1, N = 1,215, p < .001). Specifically, most of the Korean participants were college graduates (87.5%), while the American participants fell into two groups: those that were high school graduates or had partial college/specialized training (65.8%) and those that were college graduates or had received graduate training (32.2%). Furthermore, gender composition showed that the disparity between females and males was larger in the U.S. sample, 58.6% female, than the Korean sample, which was evenly split.
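As an illustration of the chi-square comparison described above, the following sketch computes a Pearson chi-square statistic for the gender-by-country table, using the gender counts given in the Participants section. It is a hedged example and not the authors' analysis: the function name `chi_square_2x2` is hypothetical, no continuity correction is applied, and the article does not report the gender statistic itself.

```python
def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of country and gender
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: Korea, U.S.; columns: male, female (counts from the Participants section)
gender_counts = [[327, 331], [235, 333]]
chi2 = chi_square_2x2(gender_counts)

CRITICAL_DF1_05 = 3.841  # chi-square critical value for df = 1, alpha = .05
print(round(chi2, 2), chi2 > CRITICAL_DF1_05)
```

Under these assumptions the statistic exceeds the df = 1 critical value, in line with the reported conclusion that the gender composition of the two samples differed.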
As a result of finding these differences, we investigated whether the DIFS scores were affected by demographic differences using the procedure outlined by Riordan and Vandenberg (1994). First, the DIFS scores were regressed on age, country, and the age-by-country interaction. The results showed that the main effect of age was significant; however, the age-by-country interaction was not significant. Second, a pair of two-way ANOVAs was conducted on the DIFS scores exploring differences for gender and education. The first ANOVA tested the main effects for gender and country and the gender-by-country interaction; the second ANOVA tested the main effects for education and country as well as the education-by-country interaction. The analyses revealed that neither the main effects of gender and education nor their interactions with country were statistically significant on the DIFS scores. Thus, the DIFS scores were not affected by the demographic differences between the samples.
Analysis Plan
The invariance tests require a sequential process to examine the configural, weak, and strong invariance of a measure at the measurement and latent level. The tests compare a hypothesized model, in which parameters of interest are constrained to be equal across groups, and a less restrictive model, in which the same parameters are relaxed. In order to assess the effect of the equality constraint at each stage, the comparative fit index (CFI), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), and Tucker-Lewis index (TLI) are inspected (Kline, 2011).
The invariance testing begins with evaluating the fit of the basic measurement model for each group. In measurement invariance analysis, configural invariance tests are conducted to test whether the same factor structure holds across groups. If the specified model with the same factor structure across groups shows a good model fit (i.e., CFI/TLI > .90 and RMSEA/SRMR < .05), further restrictive constraints can be imposed for weak invariance. The importance of the configural invariance test is that it serves as the baseline against which all subsequent tests of measurement invariance are compared. The second level of invariance, weak invariance, implies that the unit of measurement for the underlying factors (i.e., factor loading) is the same across groups. Last, the intercepts of the items are constrained to be equal across groups in order to establish strong invariance. Each procedure compares a hypothesized model, in which parameters of interest are constrained to be equal across groups, and a less restrictive model in which the same parameters are relaxed. In order to assess the effect of the equality constraint at each stage, a change in the CFI of less than 0.01 was used as the criterion for invariance (Cheung & Rensvold, 2002). In addition, the RMSEA was used to assess model fit and changes in model fit in this study (Vandenberg & Lance, 2000). The data for this study were analyzed using Mplus Version 6.12 statistical software.
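The sequence of nested models described in this section can be written compactly. The notation below is a standard mean and covariance structure formulation and is an assumption of this sketch rather than notation taken from the article: for group g, x is the vector of item responses, τ the item intercepts, Λ the factor loadings, ξ the latent factor with mean κ, and δ the residuals.

```latex
% Measurement model in group g (g = Korea, U.S.)
x_{g} = \tau_{g} + \Lambda_{g}\,\xi_{g} + \delta_{g}

% Configural invariance: same pattern of loadings, all parameters free
% Weak invariance:   \Lambda_{\mathrm{Korea}} = \Lambda_{\mathrm{U.S.}}
% Strong invariance: \Lambda_{\mathrm{Korea}} = \Lambda_{\mathrm{U.S.}}
%                    \text{ and } \tau_{\mathrm{Korea}} = \tau_{\mathrm{U.S.}}

% Implied item means under strong invariance
E[x_{g}] = \tau + \Lambda\,\kappa_{g}

% Criterion applied at each step
\Delta\mathrm{CFI} = \mathrm{CFI}_{\text{less constrained}}
                   - \mathrm{CFI}_{\text{more constrained}} < .01
```

Only when both τ and Λ are equal across groups do differences in observed item means reflect differences in the latent mean κ alone, which is why latent means are compared only after strong invariance holds.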
RESULTS
Exploratory Factor Analysis
As a first step in exploring the cultural invariance of the DIFS, a principal component factor analysis was performed separately for each group with factors whose eigenvalues were above 1 (see Table 1). For both Koreans and Americans, the findings support the conclusion that all of the items, except item 6 (pertaining to communication patterns among family members), comprise a unidimensional construct. Item 6 ("My father/mother tells me that she/he doesn't mean what she/he is saying") showed a low factor loading in both groups. Furthermore, one item did not perform well among Korean respondents, specifically, item 10, "My father/mother shows understanding when I do not wish to share my feeling." Based on these findings, we decided to delete two items in the subsequent analysis (items 6 and 10). After these deletions, Cronbach's alpha for the 9-item scale was .93 for Koreans and .92 for Americans. Finally, the results of exploratory factor analysis supported a single factor model composed of the 9 remaining DIFS items. The hypothesized model of family differentiation tested across Koreans and Americans is presented in Figure 1.
Factorial Validity Analysis
Further examination of the confirmatory factor analysis for the single factor structure of the DIFS showed a somewhat poor fit (χ²(27) = 758.586; RMSEA = .203; CFI = .843; TLI = .790; SRMR = .062 for Koreans, χ²(27) = 248.021; RMSEA = .120; CFI = .935; TLI = .913; SRMR = .040 for Americans). Additionally, testing for validity of the DIFS based on the pooled sample indicated a poor model fit (χ² = 1006.607; RMSEA = .170; CFI = .881; TLI = .842; SRMR = .053). Inspection of the modification indices suggested that sixteen pairs of error terms should be allowed to correlate (e.g., e8-e11, e4-e9, e2-e7). Because these pairs of items dealt with the same issues, such as self-assertion or freedom of personal expression, we decided to allow them to correlate and chose the revised one-factor model as the baseline model for the following invariance test. After these modifications, the overall model fit was improved (see Table 2).
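The single-group RMSEA values above can be related to the reported chi-square statistics with a short sketch. It assumes the usual point-estimate formula with the (N - 1) convention and assumes that the full samples described in the Participants section (658 Koreans, 568 Americans) entered the analysis; software packages differ in whether N or N - 1 is used, and the function name `rmsea` is hypothetical.

```python
from math import sqrt

def rmsea(chi2, df, n):
    """Single-group RMSEA point estimate; uses the (N - 1) convention."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Chi-square values and df as reported; sample sizes are an assumption
korea = rmsea(758.586, 27, 658)
usa = rmsea(248.021, 27, 568)
print(f"{korea:.3f} {usa:.3f}")
```

Under these assumptions the formula returns values that round to the reported .203 and .120, which shows how strongly the chi-square exceeds its degrees of freedom in the unmodified one-factor model.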
Measurement Invariance Analysis
Based on the revised one-factor model that included all correlated error terms, this section examines the measurement invariance of the DIFS across two groups. Table 2 presents the test for invariance across two cultures. The result of configural invariance testing showed an adequate fit of the model to the data (χ²(24) = 83.179; CFI = .993; TLI = .978; RMSEA = .063; SRMR = .019). This indicated that the revised one-factor structure holds across two cultures. Since the baseline model (configural invariance) was supported, further restrictive constraints were imposed on the model. The equality constraint placed on the factor loadings of the DIFS did not result in a decline in model fit as compared to the configural model (ΔCFI = .005). However, moving from the weak invariance model to the strong invariance model resulted in a significant drop in model fit. In other words, the item intercepts were not equal across groups (ΔCFI = .077).
In sum, assumptions for weak measurement invariance were met for all items of the DIFS, indicating the factor loadings of the DIFS were invariant across both samples. However, we cannot conclude that latent means are comparable across the two groups because strong invariance did not hold.
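The stepwise decision summarized here can be sketched as a small function that applies the ΔCFI < .01 criterion (Cheung & Rensvold, 2002) to the two reported changes in CFI. The function name `highest_invariance_level` is hypothetical, and the sketch assumes that configural invariance has already been supported.

```python
def highest_invariance_level(delta_cfi, cutoff=0.01):
    """delta_cfi: ordered (level, drop in CFI vs. the previous model) pairs."""
    reached = "configural"  # assumed to hold before the loop starts
    for level, drop in delta_cfi:
        if drop >= cutoff:
            break  # constraint worsens fit too much; stop at the previous level
        reached = level
    return reached

# Changes in CFI as reported in the text
reported = [("weak", 0.005), ("strong", 0.077)]
print(highest_invariance_level(reported))  # weak
```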
Table 1. Factor Loadings of the DIFS in the Korea and U.S. Samples
Figure 1. Hypothesized Model of Family Differentiation
Differential Item Functioning
The results showed a lack of strong measurement invariance across the groups, which indicates differential item functioning (DIF). Differential item functioning refers to group differences in the probability of an item response after their ability scores are placed on a common scale (Lee, Little, & Preacher, 2011). Detecting differential item functioning suggests sources of non-invariance and may be useful for future research because it reveals bias in an item, which may result from things such as poor item translation or low appropriateness of the item content in different cultures. Upon failing to establish strong invariance, uniform DIF, which refers to group differences in intercepts, was examined by conducting the free-baseline mean and covariance structure (MACS) analysis with the fixed-factor scaling method and Bonferroni-corrected LR test (Lee et al., 2011).
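To make the Bonferroni-corrected likelihood ratio (LR) test concrete, the sketch below flags items whose intercept constraint produces a significant LR statistic. It is an illustration under stated assumptions, not the authors' procedure: each item test is assumed to have 1 degree of freedom (one intercept constrained at a time), the LR values and item labels are hypothetical, and the function names are invented for this example.

```python
from math import erfc, sqrt

def chi2_sf_df1(x):
    """Upper-tail p-value of a chi-square statistic with 1 df."""
    return erfc(sqrt(x / 2.0))

def flag_uniform_dif(lr_stats, alpha=0.05):
    """lr_stats: {item: LR chi-square for constraining that item's intercept}."""
    per_test_alpha = alpha / len(lr_stats)  # Bonferroni correction
    return sorted(item for item, lr in lr_stats.items()
                  if chi2_sf_df1(lr) < per_test_alpha)

# Hypothetical LR statistics, not values from the study
hypothetical = {"item_a": 0.8, "item_b": 14.2, "item_c": 5.1}
print(flag_uniform_dif(hypothetical))
```

Dividing the significance level by the number of items tested keeps the familywise error rate near the nominal level, so an item that would be significant at .05 on its own may not be flagged after correction.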
Table 2. Model Testing for Measurement Invariance
Table 3. Scale Items and Assumptions of Invariance Met
The results showed that five of the nine items (items 2, 3, 4, 7, and 11) were identified as having uniform DIF. In other words, although assumptions for weak measurement invariance were met for all items, five items did not meet strong invariance assumptions. Therefore, partial strong invariance was evaluated with only the loading-invariant items. Moving from the baseline model to the partial strong invariance model did not result in significant drops in model fit (see Table 3). Thus, the partial invariance tests indicate that behaviors that communicate a respect for the person and personal boundaries (items 1, 5, 8, and 9) are invariant in their intercepts. After locating items with differential item functioning, partial strong invariance was established.
In summary, the results indicated that the items demonstrate similar internal consistency reliability across two cultural groups because factor loadings on a modified model are equal across groups. It is important to note that the measurement model in the present study was the one-factor model including 9 items with 16 correlated pairs of error terms, used to improve the overall model fit in both groups, instead of the original 11-item scale. However, strong invariance was not met; therefore, latent means should not be compared across the two groups. This implies that there may be some bias when participants from the U.S. and Korea respond to particular items.
DISCUSSION
This study was designed to explore whether the operationalization of Bowen's (1976) construct of family differentiation in the DIFS has similar explanatory power in countries with different family cultures: the U.S., known as an individualistic society, and South Korea, which has been described as a collectivist society. Our first goal was to test the assertion that the DIFS, which was developed in the U.S., has the same factor structure in a Korean cultural context. The results showed that scale items referring to the degree of support, empathy, and connection experienced within the family loaded on a single latent factor in both countries. However, the operationalization of some items appeared to be evoking different interpretations across samples, which in turn decreased the internal consistency of the scale. Alternatively, it might be the case that there is item bias resulting from how certain items on the scale were worded. Specifically, it may be that Korean respondents had problems in interpreting the item that states, "My father/mother shows understanding when I do not wish to share my feelings." Given the fact that the item showed a negative factor loading for Koreans, this would indicate that the item may tap into the construct differently for Koreans due to the mistranslation of the item. Again, the reasons for this are unclear; however, it is important to be clear that the findings showed bias at the item level, suggesting the item has a different psychological meaning across cultures.
The measurement invariance testing results showed that the modified 9-item DIFS met the criteria for weak invariance, indicating the factor loadings of the measure were invariant across both samples. However, the lack of invariant intercepts indicated the presence of item bias (i.e., differential item functioning). In other words, participants from different groups may perform differently on some items because of the nuances of language in the items that may affect respondents who belong to a particular group. The subsequent DIF analysis supported the idea that there are differences in how some, not all, of the items function across groups. For example, the item that reflects the ability to express personal thoughts and perspective freely had the largest difference across the two samples. It is reasonable to conclude that a family in a collectivistic culture emphasizes emotional dependence and solidarity, which may not encourage family members to share their feelings openly with one another.
In addition to this, we speculated that there could be item bias resulting from how certain items on the scale were worded. As emphasized by van de Vijver and Hambleton (1996), an appropriate translation requires a balanced treatment of psychological, linguistic, and cultural considerations. In other words, each item should be translated in a culturally specific way, which means that items should be applicable both in meaning and choice of expression to different cultures during a translation procedure. However, most Asian researchers who use Western scales in their studies have used a "translation and back-translation" procedure without endeavoring to attain linguistic, semantic, and conceptual accuracy.
Hence, even though there is a growing awareness of the need for researchers to use a culturally grounded and systematic approach to the adaptations of Western scales for use with Asian samples (e.g., Cheung, van de Vijver, & Leong, 2011), the translation and back-translation procedure prevails as the only way most contemporary Asian researchers attempt to achieve cultural relevance. Furthermore, in considering the lack of reporting on translation procedure in the literature (e.g., Juang & Tucker, 1991; Myers, Madathil, & Tingle, 2005; Zhang & Tsang, 2012), it appears that many researchers consider the translation process a precursory step that can be expedited or ignored to accomplish "real" research, which uses Western scales to assess marital or family constructs in a different cultural context. Thus, we challenge international researchers to focus on measurement issues such as bias in cross-cultural assessment and the impact of bias in research as highlighted by van de Vijver and Tanzer (2004), which aligns with our belief that doing so will encourage a shift in the field toward assessing measurement tools through a cultural lens.
It is important to note, in addition, that even when items are translated in a culturally specific way, the conceptual meaning of the items could still be interpreted differently across groups (Vandenberg & Lance, 2000). This indicates the possibility that some items are not appropriate for certain cultural groups. For example, in our study, Americans endorsed certain items consistently higher than Koreans (i.e., "my mother/father responds to my feelings as if they have no value", "my mother/father demonstrates respect for my privacy"). The findings suggest that it is possible that respondents from a Confucian-based collectivistic culture interpret the items that explicitly refer to psychological independence from their parents dissimilarly from their U.S. individualist counterparts.
As such, if items addressing psychological independence are not culturally grounded when applied to Asian populations, then the conclusions advanced by the studies focusing on the effects of family differentiation on individual adjustment in countries other than the U.S. may be called into question, namely, that family differentiation has a weaker or opposite effect on individual adjustment for respondents living in collectivistic cultures when compared to U.S. respondents (Chung & Gale, 2006; Chun & MacDermid, 1997; Manzi, Vignoles, Regalia, & Scabini, 2006). It is possible that these findings are less about family differentiation and more the result of problems related to the measurement tools used in the studies. That is, these researchers all assumed that the measures they employed function in similar ways across cultures without testing for equivalence at the level of the construct or meaning in different cultural contexts. Our view is that any interpretation of cultural differences in family differentiation should be made only when there is some evidence that the measure is a reliable and valid tool for assessing family differentiation across cultures.
Clearly, our study highlights a need for research to support the development of culturally appropriate measures that include the continuum between human universals and culturally specific aspects of family relationships. In line with this, emerging literature provides a systematic approach to integrating a cross-cultural perspective into the development of measures (e.g., Cheung, van de Vijver, & Leong, 2011; van de Vijver & Tanzer, 2004). For example, Cheung, van de Vijver, and Leong (2011) suggested a systematic, mixed-methods approach to developing a personality assessment for the Chinese. The methods include conducting focus group interviews; examining multiple sources of research literature, Chinese novels, and proverbs; and comparing with measures that have been widely used in Western cultures to examine cultural universality and specificity in personality constructs. In a similar vein, future research is needed to expand our understanding of family differentiation beyond the existing concepts and models grounded in Western cultures.
This study specifically focused on the invariance of the DIFS (Anderson & Sabatelli, 1992) across two cultural groups, Korea and the U.S. A logical extension of this measurement invariance research would be to replicate these analyses with diverse samples to establish generalizability. That is, future studies should be conducted to provide richer implications of the DIFS across individuals from diverse backgrounds (e.g., different cultural/ethnic groups). International scholars in family research are encouraged to employ the DIFS in order to better understand and evaluate family dynamics in different sociocultural contexts. In doing so, researchers can locate the source of bias at the construct, instrument administration, and item levels across cultures. Such future studies would open new directions for understanding the role of contextual factors in the psychometric properties of family relationship measures.
The current study has some limitations that should be addressed. First, it should be noted that, due to practical concerns, we examined two countries as representative of individualist and collectivist cultures. The sample sizes were moderate; however, the samples were homogeneous. Therefore, we recommend that future research include people of different ages, education, and income groups to ensure that a variety of cultures are represented. Also, the discrepancy in demographic variables between the samples, such as age and education level, could have contributed to our failing to establish strong invariance. However, as reported earlier, the DIFS scores were not affected by demographic variables; therefore, we trust that the analyses have not been seriously jeopardized as a consequence of sample differences. Second, partial invariance testing is meaningful in that it suggests underlying reasons for non-invariance; however, the partial invariance model should be used with caution. For example, our findings revealed that the majority of items were non-invariant. The question, thus, becomes: What percentage of items needs to be invariant in order to compare mean differences across groups? In line with this, Steenkamp and Baumgartner (1998) recommended that at least two items with invariant loadings and intercepts are "equivalent enough" to compare mean differences across groups. Similarly, Byrne, Shavelson, and Muthén (1989) argued that at least one item with an invariant loading warrants further examination of invariance in both measurement and latent level parameters. In contrast, Vandenberg and Lance (2000) suggested that factor loadings should be freed for only a minority of items. Given that there is no clear guideline on the minimum number of invariant items, the choice should be made on conceptual and statistical grounds.
Overall, the results showed that the DIFS (Anderson & Sabatelli, 1992) met the criteria for weak invariance, which implies that the unit of measurement for the underlying factors (i.e., the factor loadings) is the same across groups. However, assumptions for strong invariance (i.e., equal intercepts) were not established, which led to the discovery of item bias. Of course, strong invariance is difficult to establish in cross-cultural research (van de Vijver & Tanzer, 2004). However, an observation of cross-cultural mean score differences would be a starting point for future research that is focused on finding systematic patterns in cross-cultural differences and looking for a valid explanation of the observed differences. The subsequent MACS analysis identified five of nine items as having differential item functioning, which means that assumptions for strong measurement invariance were violated for five items in the scale. As such, because a majority of the items in the scale did not meet assumptions of strong measurement invariance, it would not be appropriate to use this measure when the goal of a study is to compare a composite score of the observed variables across groups. Given that strong invariance suggests completely bias-free measurement (van de Vijver & Tanzer, 2004), the evidence of non-invariance indicates a threat to cross-cultural equivalence; therefore, cross-cultural comparison of scores on the construct of family differentiation is invalid across Korea and the U.S.
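The distinction between weak and strong invariance can be written out as a minimal sketch, assuming a standard one-factor mean and covariance structure (MACS) model. The symbols below are generic notation introduced for illustration and are not taken from the DIFS analyses: for item j in group g, τ is the intercept, λ the factor loading, and κ the latent mean.

```latex
% Expected score on item j in group g (g = 1, 2) under a one-factor model
E(x_{jg}) = \tau_{jg} + \lambda_{jg}\,\kappa_{g}

% Weak (metric) invariance: equal loadings across groups
\lambda_{j1} = \lambda_{j2} = \lambda_{j}

% Strong (scalar) invariance: equal loadings and equal intercepts
\lambda_{j1} = \lambda_{j2} = \lambda_{j}, \qquad \tau_{j1} = \tau_{j2}

% Under weak invariance only, the observed item mean difference is
E(x_{j1}) - E(x_{j2}) = (\tau_{j1} - \tau_{j2}) + \lambda_{j}\,(\kappa_{1} - \kappa_{2})
```

In this sketch, the term (τ_{j1} − τ_{j2}) is the part of an observed mean difference that does not reflect the latent construct. It vanishes only when strong invariance holds, which is why observed differences on items with unequal intercepts cannot be read as differences in family differentiation.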
In terms of partial measurement invariance, Steinmetz (2013) argued that partial measurement invariance is not sufficient for composite (observed mean) difference testing, although it is sufficient in latent variable models. In other words, the comparison of observed composite scores across groups is only warranted when the assumption of full invariance is met. However, given that strong invariance is rarely achieved in cross-cultural research (Byrne & Campbell, 1999; Byrne & Watkins, 2003), this study identified the sources of heterogeneity that threaten validity or, in other words, the extent to which items can be non-invariant across groups. Because the measure was developed in one culture and translated into another language, it became more critical to identify the sources of non-invariance. Accordingly, we provided evidence of non-invariance related to item bias; for example, item wording was not appropriate or, to some extent, the underlying assumption of an item was not interpreted in the same way, which was mainly caused by the prevailing practice among researchers of relying on translation and back-translation procedures. Taken together, the findings presented in this article demonstrate the importance of culturally informed measurement for moving the field forward.
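The point about composite scores can be illustrated with a small numerical sketch. All values below are hypothetical and chosen only for illustration; they are not estimates from the DIFS data. The sketch assumes a one-factor model with equal loadings in both groups (weak invariance), equal latent means, and lower intercepts on two of four items in the second group.

```python
def expected_composite_mean(intercepts, loadings, latent_mean):
    """Expected unit-weighted item average under a one-factor model.

    Each item's expected score is intercept + loading * latent_mean.
    """
    item_means = [t + l * latent_mean for t, l in zip(intercepts, loadings)]
    return sum(item_means) / len(item_means)


# Hypothetical parameters: loadings are equal across groups (weak invariance).
loadings = [0.8, 0.7, 0.6, 0.9]

# Hypothetical intercepts: items 3 and 4 are non-invariant in group B.
intercepts_a = [3.0, 3.0, 3.0, 3.0]
intercepts_b = [3.0, 3.0, 2.5, 2.5]

# Both groups are given the same latent mean, so any composite gap is bias.
latent_mean = 0.0

composite_gap = (
    expected_composite_mean(intercepts_a, loadings, latent_mean)
    - expected_composite_mean(intercepts_b, loadings, latent_mean)
)
print(composite_gap)
```

Under these assumed values the two groups do not differ on the latent variable, yet their expected composite scores differ because of the unequal intercepts alone. A latent variable model with the non-invariant intercepts freed would not attribute that gap to the construct, which is consistent with the distinction Steinmetz (2013) draws.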
Our findings contribute to a growing body of literature exploring the psychometric properties of U.S. based measures being used in Asian cultures. Researchers often assume that an instrument devised in one culture is universally applicable. However, the current study provides a clear example of the dangers of hasty adaptations of measures for use within different cultural groups. Clearly, developing instruments that are culturally sensitive while being conceptually grounded and psychometrically sound is important in order for research on families in countries outside the U.S., and in the instance of this study, in Asian countries, to go forward. Given the context of living in a multicultural environment that impacts what we research and how we practice, we need to take a cross-cultural perspective into consideration in the measurement process in order to explore the continuum between human universals and culturally specific aspects of family relationships. In line with this, the present study is an important step as it explored and compared cultural similarities and differences in the measurement of family differentiation. Moreover, the results of the study have the potential for explaining the cross-cultural relevance of family differentiation, which, in turn, contributes to the advancement of family research that provides better understandings of family dynamics in different sociocultural contexts.