Mixed methods data analysis procedures

Author manuscript; available in PMC 2011 Dec.
Published in final edited form as: J Mix Methods Res. 1558689810382916
PMCID: PMC3235529
NIHMSID: NIHMS248033

A methodology for conducting integrative mixed methods research and data analyses

Felipe González Castro,1,2 Joshua G. Boyd,1 and Albert Kopak1,3
1 Arizona State University, Tempe, AZ, USA
2 Arizona State University, Phoenix, AZ, USA
3 Western Carolina University, Cullowhee, NC 28723
Corresponding author: Felipe González Castro, PhD, MSW, Department of Psychology and Southwest Interdisciplinary Research Center, Arizona State University, Tempe, AZ 85287-1104, USA

Abstract

Mixed methods research has gained visibility within the last few years, although limitations persist regarding the scientific caliber of certain mixed methods research designs and methods. The need exists for rigorous mixed methods designs that integrate various data analytic procedures for a seamless transfer of evidence across qualitative and quantitative modalities. This article presents evidence generated from over a decade of pilot research in developing an integrative mixed methods methodology. It presents a conceptual framework and methodological and data analytic procedures for conducting mixed methods research studies, and it also presents illustrative examples from the authors' ongoing integrative mixed methods research.

Keywords: integrative mixed methods, grounded theory, methodological adaptation, multivariate data analysis, machismo

Overview on mixed methods approaches

Emergence of mixed methods approaches

Contrasting strengths of qualitative and quantitative methods

Within the social and behavioral sciences, a schism has existed for decades that separates the qualitative and quantitative research traditions (Tashakkori & Teddlie, 2003; Teddlie & Tashakkori, 2003).
Recently, mixed methods approaches have emerged that offer the promise of bridging across both traditions (Haverkamp, Morrow, & Ponterotto, 2005).

Mixed methods data analysis

Moreover, the qualitative approach affords an in-depth analysis of complex human, family systems, and cultural experiences in a manner that cannot be fully captured with measurement scales and multivariate models (Plano Clark, Huddleston-Casas, Churchill, Green, & Garrett, 2008). Furthermore, qualitative research methods often lack well-defined prescriptive procedures (Morse, 1994), thus limiting the capacity for drawing definitive conclusions (confirmatory results), an important aspect of scientific research.

Of sample size and approach

Qualitative studies are idiographic in approach, typically focusing on depth of analysis in small samples of participants. Unfortunately, saturation promotes the collection of smaller, “just enough” sized samples, for example, sample sizes of 8 to 20, which from a quantitative perspective is antithetical to attaining sufficiently large samples for conducting stable multivariate data analyses (Dreher, 1994) that can generate credible research results. In contrast, under an integrative mixed methods (IMM) study, the determination of an appropriate sample size requires a broader integrative perspective that balances (a) qualitative considerations favoring small manageable samples for conducting in-depth qualitative analyses (n = 20–40) against (b) quantitative considerations favoring larger sample sizes (n = 40–200) for conducting reliable multivariate statistical analyses (Gelo et al.

In qualitative data analytic methods

The field of qualitative research has been rich in strategies for “entering the field” and for engaging special or hidden populations (Denzin & Lincoln, 1994), although by contrast qualitative approaches have often been methodologically weak in procedures for “mixing” qualitative and quantitative methods and data and for processing their inductively derived information (verbal evidence; Dreher, 1994; Gelo et al.
Although such associations can be explored using visual case-ordered and predictor-outcome matrix methods that allow a cross-tabulation of categorical information (Miles & Huberman, 1994), these methods have lacked the capacity to reliably assess the strength of association among key categories or constructs, as can be accomplished with quantitative methods such as correlational

Among mixed methods studies, a common limitation has been the use of qualitative and quantitative approaches in a sequential temporal order, thus limiting the integration of both data forms under a unified process of data analysis (Bryman, 2007).

Unfortunately, few studies have effectively integrated qualitative and quantitative approaches under a unified and fully integrative research design and data analytic plan (Bryman, 2007; Dreher, 1994; Hanson, Creswell, Clark, Petska, & Creswell, 2005). Based on a decade of our pilot research, the IMM approach, as presented here, has been designed for a concurrent, integrative, and unified analysis of qualitative and quantitative data. It aims to incorporate the strengths of qualitative and quantitative approaches for conducting rigorous data analyses that meet scientific standards of reliable and valid measurement and

Methods design approaches

Sequential mixed methods designs

Creswell, Plano Clark, Gutmann, and Hanson (2003) classified mixed methods designs into two major categories: sequential and concurrent. In sequential designs, either the qualitative or quantitative data are collected in an initial stage, followed by the collection of the other data type during a second stage. In contrast, concurrent designs are characterized by the collection of both types of data during the same stage. Within each of these two categories, there can be three specific designs based on (a) the level of emphasis given to the qualitative and quantitative data (equal or unequal), (b) the process used to analyze and integrate the data, and (c) whether or not the theoretical basis underlying the study methodology is to bring about social change or advocacy (Creswell et al. In accord with this typology, the three types of sequential mixed methods designs are (a) sequential exploratory, (b) sequential explanatory, and (c) sequential

Concurrent mixed methods designs

The three concurrent mixed methods designs identified by Creswell et al.

In each of these designs, the quantitative and qualitative data are collected during the same stage, although priority may be given to one form of data over the other. The purpose of concurrent triangulation designs is to use both qualitative and quantitative data to more accurately define relationships among variables of interest. In concurrent nested designs, both qualitative and quantitative data are collected during the same stage, although one form of data is given more weight over the other (Creswell et al. Similar to sequential nested designs, concurrent transformative designs are theoretically driven to initiate social change or advocacy, and these designs may be used to provide support for various

Integrative mixed methods designs

Within the context of these design approaches, the need persists for a methodology that affords a rigorous and integrative analysis of qualitative textual evidence and quantitative numeric data (Schwandt, 1994). Given the noted strengths and weaknesses of the qualitative and quantitative approaches, it would be advantageous to have a truly integrative methodology for the concurrent use of both methods in a manner that offers the descriptive richness of text narratives and the precision in measurement and hypothesis testing afforded by quantitative approaches (Carey, 1993; Hanson et al. 2003) have indicated that “there is still limited guidance for how to conduct and analyze such transformations [the qualitative–quantitative exchange of data] in practice” (p. A core feature of this approach is parallelism in study design, where integration begins with a unified conceptualization of information as “research evidence,” which can take the form of verbal text narrative evidence (qualitative) or numeric data evidence (quantitative).

Figure 1. Paradigm for the integrative mixed methods research approach

Based on a specified theory or conceptual framework, a core category or construct, such as machismo, can be featured as a study's core construct. The basic IMM design proceeds in six stages: (a) parallelism in study development, (b) evidence gathering, (c) processing/conversion, (d) data analyses, (e) interpretation, and (f) integration. In principle, a well-crafted study with this design would allow “seamless” data conversions, for example, the conversion of qualitative thematic categories into numeric thematic variables (Castro & Coe, 2007). Generally, the greater the qualitative–quantitative parallelism that is designed a priori into a study, the easier it is to transform, transfer, and interpret textual and numeric data forms across modalities (Plano Clark et al. Under a full integrative perspective, the principal aim is to examine research evidence gathered using both data forms, to generate “deep structure” conclusions (Castro & Nieri, 2008) that offer enhanced explanatory power above and beyond the sole use of a qualitative or quantitative

ing integrative mixed methods research

A case for the integrative mixed methods approach

This IMM approach builds on fundamental concepts drawn from grounded theory, as described by Strauss and Corbin (1990), although these investigators did not speak of mixed methods research per se. One core feature under the IMM approach is the equal emphasis given to qualitative and quantitative data forms (qual + quant; Hanson et al. 2005) to facilitate rich, “deep structure,” data analyses (Resnicow, Soler, Braithwaite, Ahluwalia, & Butler, 2000) and

Constructing and deconstructing factorially complex constructs

The IMM approach offers procedures to study factorially complex constructs, such as the Latino gender-role construct of machismo (Torres, 1998).

The reliable encoding of complex emotions, such as ambivalence, could provide new insights into the influences of such emotions as motivational determinants of health-related

Temporal process analysis

Based on our prior research, the IMM approach can also be used to conduct a temporal analysis of events. Thus, temporal process analysis uses interview-assisted retrospective recall of relevant thoughts, feelings, and behaviors that have occurred at each of several specified “windows of time,” or milestones.

Of this methodological description

A major goal of the present IMM methodological description is to present issues and methods for the design and implementation of an IMM study (Castro & Nieri, 2008). A second goal is to describe methodological adaptations of our original IMM approach (Castro & Coe, 2007), which was originally developed using an earlier-generation text analysis software program, TextSmart 1. We have adapted this IMM approach for use with a later-generation qualitative text analysis program (Muhr, 2004). Using selected cases from our ongoing studies, we will illustrate specific aspects of this IMM approach for conducting scientifically rigorous and culturally sensitive data analyses that integrate qualitative and quantitative data.

Methodology for integrative mixed methods studies

Overview

The IMM approach, as we have developed it, is implemented in six steps: (a) creating focus questions and conducting focus question interviews, (b) extracting response codes, (c) creating thematic categories (a “family” within the text analysis program), (d) dimensionalizing the thematic category via scale coding, (e) qualitative–quantitative data analysis, and (f) creating story lines (Castro & Coe, 2007).

As indicated in Figure 2, the process of generating qualitative evidence (text data) involves the following: (a) eliciting verbal responses (Ri) to a specific focus question, (b) identifying response codes (Cj), (c) creating thematic categories (families; Fk), and (d) converting these categories into thematic variables (Vm; see Figure 2).

Figure 2. A flow chart of the process of thematic text analysis

Step 1: The focus question and eliciting responses

A first aim in the content analysis of open-ended text narratives is to identify relevant responses (and their response codes) that answer a specific focus question. This methodology, as we have developed it, is a variation of a content analysis approach: an open-ended “topic category” interview that was developed by Flannigan, McGrath, Meyer, and Garcia (1995). Here, the response, “… but that to me it is almost a stereotype,” is solely a comment, and this would not be coded as a relevant collected via independent in-depth audio-recorded interviews, each participant serves as a “case,” and the “case” (not the response codes) serves as the “unit of analysis.” Within the text analysis window, we also tag each response code at the beginning with the participant's case ID number to link each response code to other quantitative data gathered in the structural interview, such as demographic variables and also outcome measures, for example, a life satisfaction scale.

Matching thematic categories produced by the independent raters

As we have developed this methodology, in a concordance analysis, we examine both independent coder solutions to reconcile them into an “optimal solution,” as defined above. Under this concordance analysis, this reconciling process yielded six thematic categories that had sufficient interrater agreement to contribute common thematic categories to the optimal solution (see Table 1).
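The case ID tagging described above, where each response code carries the participant's case ID so that it can be linked to the quantitative data from the same interview, can be sketched in a few lines. This is only an illustrative sketch: the case IDs, category names, response texts, measured values, and the `link_by_case` helper are all hypothetical and do not come from the study data. It keeps the case, not the response code, as the unit of analysis.

```python
from collections import defaultdict

# Hypothetical response codes, each tagged with the participant's case ID
# and assigned to a thematic category ("family"). All values are invented.
response_codes = [
    ("017", "family_oriented", "provides for his children"),
    ("017", "control", "has the final word at home"),
    ("042", "family_oriented", "puts family first"),
]

# Hypothetical quantitative interview data keyed by the same case ID.
measured = {
    "017": {"age": 34, "life_satisfaction": 4.2},
    "042": {"age": 51, "life_satisfaction": 3.1},
    "063": {"age": 29, "life_satisfaction": 2.5},
}


def link_by_case(codes, measured_rows):
    """Return one record per case: measured variables plus grouped codes."""
    by_case = defaultdict(lambda: defaultdict(list))
    for case_id, family, text in codes:
        by_case[case_id][family].append(text)

    records = {}
    for case_id, row in measured_rows.items():
        # Cases with no response codes still get a record (empty "codes").
        families = by_case.get(case_id, {})
        grouped = {family: list(texts) for family, texts in families.items()}
        records[case_id] = {**row, "codes": grouped}
    return records


records = link_by_case(response_codes, measured)
for case_id, record in sorted(records.items()):
    print(case_id, record)
```

Keying both data forms on the case ID means a thematic category can later be scored once per case and analyzed alongside the measured variables.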

In summary, this concordance analysis used initial and revised solutions to generate an “optimal solution,” while also working to create “strong thematic categories.” From our prior research, “weak thematic categories” later produce “skewed thematic variables,” which are problematic for quantitative data analyses. Given that the thematic analysis of a single focus question typically generates 3 to 12 thematic categories, each member of a two- or three-person team of coders independently rates all response codes within each thematic category. However, given the convention that “the case is the unit of analysis,” each case should contribute only one scale code value to a given thematic category, so what to do? When dimensionalized, and if treating coded values as a Likert-type scale, a thematic variable can then be used as a conventional measured variable and incorporated into conventional correlation, regression, or other multivariate data analyses. 2008) but one that as a discovered variable can aid in describing new and important conditional and interactive

Step 5: Data analytic approaches

Overview of data analytic approaches

Descriptive and correlation analyses may now be conducted to examine associations among the qualitatively constructed thematic variables and the quantitatively based measured variables (Castro & Coe, 2007). Within a hierarchical regression analysis, the predictive effects of the inductively derived thematic variables can also be examined (a) as a unified block consisting of a set of thematic variable predictors along with a set of measured variable predictors or (b) as thematic variable predictors of an effect above and beyond (in sequentially introduced blocks) the effects of a previously entered block of measured variable predictors (Cohen, Cohen, West, & Aiken, 2003).

In this latter case, the inductively generated “discovered” information encoded by thematic variables can introduce additional explanatory variance that otherwise would have remained undetected if solely incorporating the measured variables into the regression

of data analyses

Preliminary data analyses can include descriptive frequency analyses to examine the distributional properties of the thematic variables.

Factor analyses

As examined in our prior studies, one can conduct an exploratory factor analysis with a set of thematic variables that measure a factorially complex construct such as machismo to examine its factor structure. Subsequently, one can use results from this factor analysis to compute factor scores that can then be used as predictor variables within a hierarchical regression analysis of an outcome variable of interest, for example, life satisfaction scale scores (Kellison, 2009). For example, we created factor scores for machismo self-identification, as generated from relevant thematic variables (see Table 1), which were entered into a principal components analysis with oblimin rotation (Kellison, 2009). A scree plot analysis revealed the viability of a two-factor solution, and as expected, these thematic variable factor loadings aptly identified two principal components: (a) negative machismo, which we labeled “control and dominance,” and (b) positive machismo, which we labeled “caballerismo and family oriented.” The results of this exploratory factor analysis provided initial confirmatory evidence in support of the content validity of the constructed machismo thematic variables, as these thematic variables aptly captured the expected two-factor structure for this construct of machismo self-identification.
Thus, in these integrative data analyses, both data forms were used as predictors of a dependent variable of interest, that is, life

Step 6: Coming full circle: Creating “story lines” and recontextualization

A recontextualization of the data

In qualitative data interpretation, contextualization is used to “give a meaning of the obtained results with reference to the specific and particular context of the study” (Gelo et al.
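The block-wise logic described above, in which a thematic variable is entered after a block of measured variables to see how much explained variance it adds, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the eight cases and their values are invented, `ols_r2` is a hypothetical helper that fits ordinary least squares with plain Python, and collapsing several scale codes per case by keeping the highest one is only one reading of a "highest code rule." It is not the authors' analysis, which would normally be run in a statistical package.

```python
def ols_r2(predictors, y):
    """R-squared of an OLS fit with intercept, solved via normal equations."""
    n = len(y)
    rows = [[1.0] + [float(v) for v in r] for r in predictors]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(rows[i][p] * rows[i][q] for i in range(n)) for q in range(k)]
        + [sum(rows[i][p] * y[i] for i in range(n))]
        for p in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [x - factor * z for x, z in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    predicted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data for eight cases; every value is invented.
age = [25, 31, 38, 44, 52, 29, 47, 35]  # measured variable
scale_codes = [[1], [2, 3], [2], [0], [3, 1], [1, 0], [2], [0]]  # codes per case
life_satisfaction = [3.0, 4.5, 3.8, 2.2, 4.6, 2.9, 4.0, 2.4]  # outcome

# One value per case: keep the highest scale code (assumed decision rule).
family_oriented = [max(codes) for codes in scale_codes]

block1 = [[a_] for a_ in age]  # measured block only
block2 = [[a_, f] for a_, f in zip(age, family_oriented)]  # plus thematic block

r2_block1 = ols_r2(block1, life_satisfaction)
r2_full = ols_r2(block2, life_satisfaction)
delta_r2 = r2_full - r2_block1  # variance added by the thematic variable
print(round(r2_block1, 3), round(r2_full, 3), round(delta_r2, 3))
```

The quantity of interest is `delta_r2`, the change in R-squared when the thematic block is added to the measured block; with real data it would be accompanied by a significance test of the change, which this sketch omits.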

Examining selected text narratives identified by the results of a regression model analysis allows the creation of story lines that can contribute to a deep-structure analysis that moves “beyond description to conceptualization” (Strauss & Corbin, 1990, p. The IMM story line analysis is similar to the grounded theory story line analysis, which is used to generate “a descriptive story about the central phenomenon of the study” (Strauss & Corbin, 1990, p.

Story lines by levels of life satisfaction

Table 3 presents the macho self-identification responses for a contrasting groups analysis. Narrative responses are presented in a stratified analysis for five cases having the highest life satisfaction scale scores as contrasted with the five lowest scoring cases (Kellison, 2009). In this particular contrasting groups analysis, story line 1 for members of the highest-scoring strata of cases on life satisfaction voices positive machismo self-identification themes that involve caballerismo (chivalry; Arciniega et al. These contrasting story lines reveal the presence of high life satisfaction among family-oriented responsible males, as contrasted with low life satisfaction among males who lack family involvement and who are

Table 3. Contrasting groups story line statements for the five highest and lowest cases on life satisfaction

Status and areas for refinement

Some challenges and limitations

Adequate data gathering

Despite the stated advantages offered by the IMM approach, several challenges exist. One challenge involves the need for effective interview data collection that requires adequate probing after an initial focus question response.

In TextSmart, thematic categories are generated via three methods: (a) frequency of response, (b) co-occurrence, or (c)

4: Iterative analysis toward an optimal solution.

Allows a printout of all response codes listed within each family, and we have tagged each of these with the case ID number to aid in integrating data analyses.

The research investigator may choose to establish a different convention or decision rule if a review of the response codes presents several responses where truncating these according to a “highest code rule” introduces distortions that compete with the principal aim of “allowing the data to speak for itself.”

Culturally-sensitive research: Emerging approaches in theory, measurement and methods for effective research on acculturation, ethnic identity and gender.

Major issues and controversies in the use of mixed methods in the social and behavioral sciences.