Early Childhood Research Quarterly 28 (2013) 936–946

Examining the factor structure of the Family Child Care Environment Rating Scale—Revised

Diana Schaack a,∗, Vi-Nhuan Le b, Claude Messan Setodji b

a Center for the Study of Child Care Employment (CSCCE), Institute for Research on Labor and Employment (ILRE), University of California at Berkeley, 2521 Channing Way #5555, Berkeley, CA 94720-5555, United States
b RAND Corporation, 1776 Main Street, Santa Monica, CA 90401, United States

Keywords: FCCERS-R; Factor analysis; Measurement properties; Family child care; Quality rating and improvement system

Abstract

The purpose of this study was to examine the factor structure of the Family Child Care Environment Rating Scale—Revised (FCCERS-R) in high-stakes contexts. The results of an exploratory factor analysis revealed three dimensions of quality on the FCCERS-R: (1) Activities/Materials, (2) Language/Interaction, and (3) Organization. This study also explored whether abridged versions of the FCCERS-R could serve as a proxy for the full instrument. In addition to subsets of FCCERS-R items created from the factor structure, purposively and randomly chosen item subsets were created. The purposively chosen subsets included 6-, 9-, and 12-item scales comprised of the items with the highest factor loading across the three factors, whereas the randomly chosen subsets consisted of 12 items. Results of a discriminant analysis showed that the factor subsets were poorer proxies for the total FCCERS-R score than were the other subsets, which demonstrated comparable internal consistencies and discriminant power as the full FCCERS-R when classifying homes into general quality categories. Implications for adopting shorter versions of the FCCERS-R are discussed.

© 2013 Elsevier Inc. All rights reserved.
∗ Corresponding author. Tel.: +1 510 643 8293. E-mail addresses: diana.schaack@berkeley.edu (D. Schaack), vinhuan@gmail.com (V.-N. Le). http://dx.doi.org/10.1016/j.ecresq.2013.01.002

As social policies have shifted over the past several decades in the United States, increasing numbers of young children are being cared for in licensed family child care homes. It is estimated that nearly a quarter of all children spend time in family child care by the time they reach kindergarten (Johnson, 2005). In addition, a number of studies have found associations between the quality of care that children experience in their family child care home and their social, cognitive, and language skills (Clarke-Stewart, Vandell, Burchinal, O’Brian, & McCartney, 2002; Kontos, Howes, Shinn, & Galinsky, 1995; Loeb, Fuller, Kagan, Carrol, & Carroll, 2004; NICHD ECCRN & Duncan, 2003). Yet many young children are in family child care settings that are considered minimal or even poor quality (Kontos et al., 1995; Layzer & Goodson, 2006; Whitebook & Sakai, 2004), and lower-income children are most frequently enrolled in the lowest-quality family child care homes (Kryzer, Kovan, Phillips, Donagall, & Gunnar, 2007; Layzer & Goodson, 2006; Raikes, Raikes, & Wilcox, 2005). These are also the children who are most likely to benefit from enriching and nurturing early care and education experiences (Peisner-Feinberg et al., 2001).

For these reasons, states and local communities are attempting to improve the quality of family child care (Paulsell, Porter, & Kirby, 2010). Many state and local improvement initiatives hinge on the assessment of a family child care home’s quality and administer the Family Child Care Environment Rating Scale—Revised (FCCERS-R; Harms, Cryer, & Clifford, 2007) as part of a battery of assessments to evaluate children’s experiences in these settings.
The results of the FCCERS-R, along with other indices of family child care quality, are then used to identify family child care homes in need of improvement. Currently 25 states include the FCCERS-R in their quality rating and improvement system (QRIS), where the FCCERS-R scores are used to inform quality improvement activities. Under these systems, the FCCERS-R scores may have financial implications for family child care homes, as these scores can be used to award different levels of funding to family child care homes, or be made public to encourage families to choose higher-scoring programs (Schaack, Tarrant, Boller, & Tout, 2012).

As the use of the FCCERS-R has evolved from the low-stakes administration in which it was validated (e.g., research purposes or as a quality improvement tool for providers and consultants) to one in which high-stakes decisions are increasingly based on it, there is greater need to establish that the FCCERS-R demonstrates sound psychometric properties (Tout, Zaslow, Halle, & Forry, 2009). The Standards for Educational and Psychological Testing also recommend that measures be administered under the settings and for the purposes for which they have been validated (American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME), 1999). Thus, research is needed to understand how the FCCERS-R functions in applied, high-stakes contexts. Research on the consequences of high-stakes testing from the elementary and secondary grades has shown that when high-stakes decisions are attached to test scores, teachers can respond in ways that undermine the validity of inferences of the test scores (Smith & Fey, 2000).
For example, when test scores are included as part of state accountability systems, teachers have been shown to focus only on those constructs or skills that can be easily improved (Koretz, McCaffrey, & Hamilton, 2001). It is possible that an analogous response may be seen when administering the FCCERS-R in high-stakes settings. For example, providers may focus on easier-to-improve items (e.g., modifying the arrangement of the home) at the expense of harder-to-improve items (e.g., improving instructional or caregiving quality). While such a response can increase a family child care home’s FCCERS-R scores, targeting the easier-to-improve items may also undermine the validity of the inferences drawn from the FCCERS-R because the increases in scores are limited to particular aspects of quality.

1. Previous studies of the factor structure of the Environmental Rating Scales

Although little is known about the measurement properties of the FCCERS-R, there have been several studies that have examined the measurement properties of its companion measures, the Infant–Toddler Environment Rating Scale—Revised (ITERS-R; Harms, Cryer, & Clifford, 2003) and the Early Childhood Environment Rating Scale—Revised (ECERS-R; Harms, Clifford, & Cryer, 1998). Previous factor analysis studies have reported fewer than the seven dimensions of quality put forth by the developers of these measures. For example, Hestenes, Cassidy, Hegde, and Hansen (2007) found four factors on the ITERS-R relating to Materials/Activities, Safety/Organization, Language/Interactions, and Parents/Staff. In contrast, Bisceglia, Perlman, Schaack, and Jenkins (2009) found only one factor. Similarly, other studies have failed to confirm the expected seven-factor structure of the ECERS-R.
Both Cassidy, Hestenes, Hegde, Hestenes, and Mims (2005) and Sakai, Whitebook, Wishard, and Howes (2003) reported two distinct factors on the ECERS-R; and while there were differences in specific factor loadings, both studies found roughly the same factor structure. Cassidy, Hestenes, Hegde, et al. (2005) identified their factors as Materials/Activities and Language/Interaction, whereas Sakai et al. (2003) identified their factors as Provisions for Learning and Teaching and Interactions. However, Perlman, Zellman, and Le (2004) concluded that the ECERS-R was essentially a one-dimensional scale.

The fact that there may be fewer than seven dimensions of quality on the Environmental Rating Scales can have implications for measurement efficiency and resource allocation. If a subset of items could be identified that had similar psychometric properties as the full scale (e.g., comparable coefficient alpha estimates and similar relationships to other measures of quality), then from a resource allocation standpoint, it may make sense to adopt an abridged version of the measure. If there were ways to administer a psychometrically sound but shorter version of the FCCERS-R that would take less time to administer, then the cost savings associated with administering a more efficient instrument could be directed elsewhere, such as toward quality improvement efforts in resource-constrained child care programs (Bisceglia et al., 2009). In addition, using shorter versions of the FCCERS-R could yield cost savings, as training raters on a 12-item scale may be more efficient than training them on a 42-item scale. Moreover, selecting easy and less time-consuming items to administer on the FCCERS-R, such as those that capture elements of the physical environment, may allow raters to simultaneously administer other process quality measures, yielding efficiencies and cost savings in data collection.
Indeed, many states are considering incorporating the Classroom Assessment Scoring System (Pianta, La Paro, & Hamre, 2008) into their QRISs (Tout et al., 2010), and administering a shorter FCCERS-R would allow states to capture multiple elements of quality efficiently.

Earlier efforts that have attempted to create abridged versions of the ITERS-R and ECERS-R have reported the subsets to be psychometrically sound. Bisceglia et al. (2009) found that internally consistent subsets of the ITERS-R could be created by randomly selecting 12 items from the full ITERS-R scale. Furthermore, the randomly chosen subsets and the full ITERS-R scale classified centers as “Poor,” “Average,” and “Good” quality in similar ways. Similar results were found for the ECERS-R (Perlman et al., 2004). In contrast, Cassidy, Hestenes, Hegde, et al. (2005) found that the randomly chosen 12-item subsets had substantially lower coefficient alpha estimates than both the full ECERS-R scale and the subset created from combining the two factors identified by their factor analytic results. The authors contend that this finding was partially attributable to the two-factor subset simulating the full ECERS-R scale better than the random subsets did.

2. Purpose of this study

In this study, we report results from an exploratory factor analysis of the FCCERS-R, administered under one state’s QRIS. We also explored shorter versions of the FCCERS-R scale, guided by the results of our factor analysis and by random selection of the scale items. We then examined the extent to which the abridged versions of the FCCERS-R demonstrated similar properties to the full version of the instrument. We did this in two ways. First, we assessed the validity of the subsets by comparing the extent to which the subsets assigned family child care homes to the same quality categories as the full FCCERS-R scale.
Second, we explored whether the full FCCERS-R scale and the subsets showed comparable relations to other regulatable quality indicators, including adult–child ratios and group sizes, as well as to provider formal education, specialized training, and experience providing paid care to children. We chose these specific indicators because previous research has consistently found that family child care providers offer better process quality, including more stimulating interactions and responsive caregiving, when adult–child ratios and group sizes in homes are lower (Clarke-Stewart et al., 2002; NICHD ECCRN, 2000). Research has also found that providers with more formal education, child-related training, and experience demonstrate more age-appropriate interactions with children, better environmental organization, and more responsive caregiving than do providers with less experience or training (Burchinal, Howes, & Kontos, 2002; Clarke-Stewart et al., 2002; Forry et al., 2012). Given the body of evidence attesting to the importance of these specific regulatable quality indicators to the quality of providers’ interactions with children, they are commonly included in states’ QRIS (Schaack et al., 2012). Thus, we examined the relationships between the FCCERS-R subsets and these regulatable quality indicators. If the subsets of the FCCERS-R show associations with the regulatable quality indicators similar to those of the full scale, this would provide support for the use of shorter versions of the FCCERS-R.

3. Method

3.1. Sample

The sample used for this study consisted of 99 licensed family child care homes that were voluntarily participating in a state’s QRIS. This sample reflected 100% of the family child care homes that received a QRIS quality rating from September of 2008, when the revised edition of the FCCERS-R was introduced in the state’s QRIS, to December of 2010.
Family child care homes were eligible to participate in the QRIS if they were located near a low-performing elementary school and accepted child care subsidies as a form of payment, or if they were located in the state’s largest city and served at least one 4-year-old child. Eighty-five percent of the child care homes in this sample met the first criterion, and the remaining 15% met the second criterion. Prior to their FCCERS-R assessment, all providers working in family child care homes in this study participated in a workshop that provided an overview of the FCCERS-R. During this workshop, each provider was given a checklist that outlined the criteria on the FCCERS-R that they were able to use as a self-assessment. For 60% (n = 59) of homes, the FCCERS-R observation used in this study represented a baseline or first observation as a participant in the state’s QRIS. The remaining 40% (n = 40) had received at least one prior FCCERS-R assessment and therefore had prior quality improvement support. All providers received the results of their FCCERS-R assessment approximately one month after it was administered. In addition, all family child care homes received incentives for their participation in the state’s QRIS, which included coaching and grants for materials and equipment after their initial assessment. Each home’s overall quality rating, including its FCCERS-R score, was publicly available and disseminated to families through the state’s Office of Child Care Resource and Referral Agencies.

3.2. Measures of quality

3.2.1. FCCERS-R

The FCCERS-R is a 38-item measure of global child care quality organized into seven subscales that pertain to: Space and Furnishings (6 items); Personal Care Routines (6 items); Listening and Talking (3 items); Activities (11 items); Interactions (4 items); Program Structure (4 items); and Parents and Providers (4 items).
Following prior research in child care using the Environment Rating Scales (see Farran & Son-Yarbrough, 2001; La Paro, Sexton, & Snyder, 1998; Whitebook, Howes, & Phillips, 1990), the agency administering the QRIS eliminated the items on the Parents and Providers subscale. Many of the indicators on this subscale relied on self-report and were considered subject to respondent bias (Achenbach, McConaughy, & Howell, 1987). In addition, many of the items were considered too distal to children’s school readiness outcomes, which was the goal of the state QRIS (Zellman, Perlman, Le, & Setodji, 2008). Each of the remaining 34 items on the FCCERS-R was scored on a 7-point Likert scale, where a score of 1 represents “inadequate,” a score of 3 represents “minimal,” a score of 5 represents “good,” and a score of 7 represents “excellent” quality. An overall FCCERS-R score was calculated by averaging the scores on all the administered items, and the subscale scores were calculated by averaging the scores across each of the items within a subscale.

3.2.2. Provider qualifications

To obtain information about the family child care provider’s qualifications, a survey was administered to each provider querying them about their highest level of education, their years of paid experience caring for and educating children ages birth to five (including experience in other child care settings), and the number of course credits completed in early childhood education (ECE) or child development. Formal education level and number of ECE credits completed were verified by trained data collectors via transcripts. Provider qualification data were collected from 100% of the homes in the study. In nearly 40% of the family child care homes, there was more than one provider caring for children. For these homes, the education level, years of experience, and number of ECE credits taken were averaged across providers. Analyses were also conducted using just the lead provider who identified himself or herself as the owner of the business. However, these analyses yielded results similar to those based on qualifications averaged across providers; therefore the average qualifications model is presented.

3.2.3. Ratios and group sizes

Data collectors also documented the number of providers and children present during eight different time periods over the course of two days. This method followed those set forth by Le, Perlman, Zellman, and Hamilton (2006), who found that when ratios and group sizes were sampled during six specific morning time points and two specific afternoon time points and were averaged, the ratios yielded a representative account over a two-week period. In this study, morning time points were sampled on one day during the FCCERS-R administration, and afternoon time points were sampled on a second day; adult–child ratios and group sizes were calculated by averaging across the eight different counts. Adult–child ratios and group sizes were collected from 100% of the sample.

3.3. Procedures

Each family child care home was assigned a one-month data collection window. During this time, the state’s QRIS administering agency dispatched trained data collectors to the family child care homes. Data collectors were required to have a minimum of an associate’s degree in early childhood education or a related field, and all had experience working in early childhood settings. Prior to collecting data, each of the eight data collectors participated in a one-week training on data collection procedures. In addition, each data collector participated in a two-day training session on the FCCERS-R conducted by an expert scorer, referred to as a state anchor, who had been trained by the developers of the FCCERS-R, and who had been deemed a reliable rater. Each data collector was required, during three consecutive administrations of the FCCERS-R, to score 85% of the items on the FCCERS-R within one scale point of the state anchor. The reliability of the data collectors was re-checked after every tenth FCCERS-R administration, and all data collectors passed re-reliability checks. All FCCERS-R observations were unannounced, although providers were aware of their one-month data collection window. All observations were conducted between approximately 8:30 a.m. and 1:30 p.m. During the FCCERS-R administration, data collectors documented adult–child ratios and group sizes during six time points. The data collector returned to the family child care home for a second visit in the afternoon to collect training and education data, and to document adult–child ratios and group sizes during two additional time points.

4. Results

4.1. Descriptive statistics of the quality indicators

Table 1
Descriptive statistics, coefficient alphas, and intercorrelations among the FCCERS-R total score and subsets.

| FCCERS-R scale | Mean | SD | Coefficient alpha | Range | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|---|---|---|---|
| (1) Space and furnishings | 4.26 | .88 | .44 | 2.33–6.50 | | | | | | |
| (2) Personal care routines | 2.89 | .73 | .35 | 1.33–5.17 | .41 | | | | | |
| (3) Listening and talking | 5.51 | 1.19 | .59 | 1.67–7.00 | .50 | .33 | | | | |
| (4) Activities | 4.15 | .84 | .70 | 2.36–6.55 | .62 | .38 | .61 | | | |
| (5) Interaction | 5.78 | 1.38 | .81 | 1.50–7.00 | .55 | .35 | .69 | .46 | | |
| (6) Program structure | 5.41 | 1.31 | .52 | 1.67–7.00 | .47 | .20 | .41 | .36 | .36 | |
| (7) Total FCCERS-R | 4.38 | .74 | .87 | 2.55–6.30 | .80 | .58 | .77 | .75 | .66 | .66 |

Note: Columns (1) to (6) give correlations. All correlations are significant at the .01 level.

Table 1 presents the descriptive statistics for the FCCERS-R total score and subscales. As shown in Table 1, the mean FCCERS-R score was 4.38. Forty percent of the family child care homes had taken part in the QRIS for at least a year, which included funding mechanisms to support programs to improve quality. Analyses were conducted comparing results for homes that had received
prior FCCERS-R assessments (and quality improvement support) and homes that had only received an initial FCCERS-R assessment (and had not yet received quality improvement support). Analyses yielded highly similar results; therefore the combined sample model is presented. While other research studies (Helburn, 1995) have found lower FCCERS-R scores than those found in this sample, the higher scores may reflect the fact that family child care providers in states with QRISs attend to the items on the FCCERS-R more than providers in states without QRISs do. In addition, all providers attended a training on the FCCERS-R prior to being assessed, which may have resulted in higher scores than generally found in the population of family child care homes. Therefore, inferences from this study should be restricted to states with QRISs or that use the FCCERS-R as an accountability or improvement tool across the state.

Mirroring the education levels of family child care providers nationally (Herzenberg, Price, & Bradley, 2005), 5% of the family child care providers in this sample had an associate’s degree, 15% had a bachelor’s degree, and 2% held a master’s degree or higher. Providers had, on average, eight years (SD = 8.02) of paid experience working with children, and the typical provider had completed almost 11 ECE credits (SD = 14.79). In addition, low adult–child ratios were found in this sample, with an average of 3.71 children for every provider (SD = 1.28). Group size was also small, with an average of almost 6 children per family child care home (SD = 3.21).

4.2. Factor analysis of the FCCERS-R

Table 1 presents the coefficient alpha estimates and intercorrelations among the subscales and the overall FCCERS-R.
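For readers who want to see how a coefficient alpha estimate like those in Table 1 is obtained from item-level ratings, the following is a minimal sketch using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed score). The function name `cronbach_alpha` and the example ratings are illustrative assumptions; they are not the authors' code or data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient (Cronbach's) alpha for a homes-by-items score table.

    rows: list of equal-length lists, one list of item scores per home.
    """
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two homes")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every home needs a score on every item")

    # Sample variance of each item (column) across homes.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of each home's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("summed scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-7 ratings for four homes on a three-item scale (not study data).
example_scores = [
    [5, 6, 5],
    [3, 4, 2],
    [6, 7, 6],
    [2, 3, 3],
]
print(round(cronbach_alpha(example_scores), 2))
```

The k/(k − 1) term and the ratio of variances both depend on the number of items, which is why alpha is usually read with scale length in mind.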
Although the internal consistency estimate for a given scale is partly a function of the number of items, such that higher alpha estimates are generally found on scales with more items (Streiner & Norman, 1989), in our sample, some subscales with fewer items (e.g., Interaction) showed higher internal consistency estimates than did other subscales with more items (e.g., Personal Care Routines). This suggested that several distinct aspects of quality were being captured within the same subscale. Table 1 also shows that the intercorrelations among the subscales range from 0.20 to 0.69, with a median of 0.50. Although some subscales were moderately correlated with each other, suggesting some redundancy among the items, other subscales were modestly correlated, suggesting that there is likely to be more than one dimension of quality identified in a factor analysis. The item intercorrelations supported this notion, as the median intercorrelation was relatively low at 0.19.

Table 2 shows the descriptive statistics for the individual FCCERS-R items. For the most part, there was no evidence of floor or ceiling effects. The exceptions were safety practices, which had a low mean of 1.19 points, and greeting/departing and group time, which had high means of 6.84 and 6.50 points, respectively. We conducted sensitivity analyses with and without these items, and with their log transformations, and results were robust to these alternative specifications. Thus, we retained the items in their original form in the analysis. Most items were completely observed, with the exception of two items relating to provisions for children with disabilities and use of TV, video, and computers. These two items had 85% and 31% missing responses, respectively.
We eliminated the item relating to provisions for children with disabilities from analysis because of the large percentage of missing data, which indicated that most of the family child care homes in our sample did not care for children with identified disabilities. We retained the item relating to the use of TV, video, and computers because extended TV watching is more likely to occur in lower-quality family child care homes (Layzer & Goodson, 2006). Following the procedures of Truxillo (2005), we imputed missing data using full-information maximum likelihood parameter estimation. This resulted in analysis on 33 FCCERS-R items. Because we expected the factors to be correlated, we conducted an exploratory factor analysis with promax rotation. Table 2 provides the results of the factor analysis. By the Kaiser criterion, we retained three factors with mean communalities among items calculated at 0.52. Following Tinsley and Tinsley (1987), we considered items with a factor loading of 0.30 or greater when interpreting the factors. Although items could have factor loadings of at least 0.30 on two different factors, for all items, the difference in factor loadings was at least 0.10, so the item was retained on the factor with the highest loading. The first factor, which we labeled Activities/Materials, had an eigenvalue of 6.85, and explained 40% of the variance. Items on this factor captured aspects of the family child care home primarily related to children’s play, including the types of materials available, the areas devoted to play, and the interactions that occurred while children were engaged in play with the materials. It was comprised of 12 items and had an internal consistency estimate of 0.85. The second factor, which we labeled Language/Interaction, had an eigenvalue of 2.16, and explained 12% of the variance. This scale assessed the extent to which providers promoted children’s language use and the emotional tone of providers’ interactions with children. 
It was comprised of six items and had an internal consistency estimate of 0.86. The final factor, which we labeled Organization, had an eigenvalue of 1.58, and explained 9% of the variance. The scale included items about scheduling, arrangement of furniture, and use of space. It was comprised of 10 items and had an internal consistency estimate of 0.64.

Table 2
Item loadings of the exploratory factor analysis.

| Factor scale / item | FCCERS-R scale | Mean | SD | Range |
|---|---|---|---|---|
| Factor 1: Activities/Materials | | | | |
| Fine motor | Activities | 4.02 | 2.13 | 1.00–7.00 |
| Using books | Listening and talking | 4.37 | 1.98 | 2.00–7.00 |
| Free play | Program structure | 5.53 | 1.86 | 1.00–7.00 |
| Provision for relaxation and comfort | Space and furnishings | 4.83 | 1.60 | 2.00–7.00 |
| Dramatic play | Activities | 4.73 | 1.44 | 1.00–7.00 |
| Music and movement | Activities | 4.76 | 1.47 | 2.00–7.00 |
| Math/number | Activities | 4.56 | 1.46 | 2.00–7.00 |
| Blocks | Activities | 3.49 | 1.19 | 1.00–7.00 |
| Nature/science | Activities | 5.06 | 1.51 | 2.00–7.00 |
| Promoting acceptance of diversity | Activities | 4.34 | 1.55 | 2.00–7.00 |
| Display for children | Space and furnishings | 4.78 | 1.60 | 1.00–7.00 |
| Space for privacy | Space and furnishings | 4.61 | 2.31 | 1.00–7.00 |
| Factor 2: Language/Interaction | | | | |
| Provider–child interaction | Interaction | 6.06 | 1.58 | 2.00–7.00 |
| Discipline | Interaction | 5.72 | 1.82 | 1.00–7.00 |
| Interactions among children | Interaction | 6.10 | 1.11 | 2.00–7.00 |
| Helping children use language | Listening and talking | 5.90 | 1.34 | 1.00–7.00 |
| Helping children understand language | Listening and talking | 6.04 | 1.42 | 2.00–7.00 |
| Supervision of play and learning | Interaction | 4.75 | 2.24 | 1.00–7.00 |
| Factor 3: Organization | | | | |
| Arrangement of indoor space for child care | Space and furnishings | 2.16 | 1.79 | 1.00–7.00 |
| Art | Activities | 2.74 | 1.96 | 1.00–7.00 |
| Furniture for routine care, play, and learning | Space and furnishings | 4.15 | 1.75 | 1.00–7.00 |
| Meals/snacks | Personal care routines | 1.78 | 1.53 | 1.00–7.00 |
| Group time | Program structure | 6.50 | 1.27 | 1.00–7.00 |
| Health practices | Personal care routines | 2.28 | 1.46 | 1.00–7.00 |
| Nap/rest | Personal care routines | 3.60 | 2.65 | 1.00–7.00 |
| Sand and water play | Activities | 5.43 | 2.04 | 1.00–7.00 |
| Use of TV, video, and/or computer | Activities | 4.72 | 2.44 | 1.00–7.00 |
| Schedule | Program structure | 4.48 | 2.16 | 1.00–7.00 |
| Items that did not load on any factor | | | | |
| Active physical play | Activities | 2.02 | .99 | 1.00–6.00 |
| Diapering/toileting | Personal care routines | 1.71 | 1.33 | 1.00–7.00 |
| Greeting/departing | Personal care routines | 6.84 | .87 | 2.00–7.00 |
| Indoor space used for child care | Space and furnishings | 5.03 | 1.17 | 2.00–6.00 |
| Safety practices | Personal care routines | 1.19 | .52 | 1.00–4.00 |
| Excluded item (a) | | | | |
| Provisions for children with disabilities | Program structure | 5.06 | 2.35 | 1.00–7.00 |

Factor 1, Factor 2, and Factor 3 loadings (values in source order, not aligned to items): .81 .69 .68 .64 .64 .64 .58 .53 .51 .44 .36 .49 .36 .34 .38 .39 .41 .37 .37 .34 .34 .85 .82 .78 .74 .62 .60 .37 .44 .36 .35 .48 .32 .58 .52 .50 .50 .38 .36 .36 .34 .34 .30 N/A N/A .31 N/A

Notes: Empty cells represent factor loadings less than 0.30.
(a) This item was excluded from the factor analysis because of high rates of missing responses.

4.3. Creating shortened versions of the FCCERS-R

The factor analysis results raise the possibility that subsets created from the three factors may be more efficient versions of the full FCCERS-R. We explored several ways of creating abridged versions of the full FCCERS-R. First, we created three subsets corresponding to each of our identified factors. The first subset consisted of the 12 items comprising the Activities/Materials factor, the second subset consisted of the 6 items comprising the Language/Interaction factor, and the third subset consisted of the 10 items comprising the Organization factor. Next, we explored several subsets that had roughly the same number of items as each of the factors, but instead of limiting the items to a particular factor, we included items from all three factors. For example, we created a subset that was comprised of the two items with the highest loadings on each factor.
This subset, which we labeled the 6-item scale, was comprised of the following items: fine motor, using books, provider–child interaction, discipline, arrangement of indoor space for child care, and art. We then created two additional subsets in an analogous manner; that is, we created a subset comprised of the three items with the highest loadings on each factor, then created another subset comprised of the four items with the highest loadings on each factor. We labeled these subsets 9-item and 12-item scales, respectively. The 9-item scale was comprised of the same items on the 6-item scale, along with additional items relating to free play, interactions among children, and furniture for routine care, play, and learning. The 12-item scale was comprised of the same items on the 9-item scale, along with additional items relating to provision for relaxation and comfort, helping children use language, and meals/snacks.

Finally, we created three random subsets of 12 items each, following the method used by Scarr, Eisenberg, and Deater-Deckard (1994). To create each random subset, we used a random number generator to assign each item a number, rank ordered the items by their assigned numbers, then chose the 12 highest-ranked items. We repeated this procedure three times, resulting in three random subsets of items. There was little item overlap between the three randomly chosen subsets and the 12-item scale, with the first two random subsets each sharing two items in common with the 12-item scale, and the other random subset sharing only one item in common with the 12-item scale. None of the random subsets included safety practices.

Table 3 presents the descriptive statistics, coefficient alphas, and intercorrelations among the subsets and the total FCCERS-R score. As expected, all the subsets were highly correlated with the total FCCERS-R score, with correlations ranging from 0.76 to 0.93.
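The random-subset procedure described above (assign each item a random number, rank the items, keep the 12 highest-ranked) can be sketched as follows. This is an illustration only: the item labels, the seed, the pool size of 33, and the helper names `draw_random_subset` and `shared_items` are assumptions, and the draws will not reproduce the study's subsets.

```python
import random


def draw_random_subset(items, size=12, rng=None):
    """Give every item a random number, rank by it, keep the top `size` items."""
    if size > len(items):
        raise ValueError("subset cannot be larger than the item pool")
    if rng is None:
        rng = random.Random()
    keyed = [(rng.random(), item) for item in items]
    ranked = sorted(keyed, key=lambda pair: pair[0], reverse=True)
    return [item for _, item in ranked[:size]]


def shared_items(subset_a, subset_b):
    """Items that two subsets have in common, in sorted order."""
    return sorted(set(subset_a) & set(subset_b))


# Hypothetical labels standing in for the analyzed FCCERS-R items.
item_pool = [f"item_{i:02d}" for i in range(1, 34)]

rng = random.Random(2013)  # arbitrary seed so the sketch is repeatable
random_subsets = [draw_random_subset(item_pool, 12, rng) for _ in range(3)]

for number, subset in enumerate(random_subsets, start=1):
    print(f"Random subset {number}: {sorted(subset)}")
```

Calling `shared_items` on a random subset and a purposively chosen scale gives the kind of overlap count reported in the text.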
Notably, despite having fewer items, the 6-item and 9-item scales had coefficient alpha estimates that were slightly higher than those of the 12-item random subsets.

Table 3
Descriptive statistics, coefficient alphas, and intercorrelations among the FCCERS-R total score and the shortened versions of the FCCERS-R.

| Scale | Mean | SD | Coefficient alpha | Range | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (1) Activities/Materials | 4.59 | 1.03 | .85 | 2.67–7.00 | | | | | | | | | |
| (2) Language/Interaction | 5.88 | 1.24 | .86 | 1.50–7.00 | .50 | | | | | | | | |
| (3) Organization | 3.74 | .92 | .64 | 1.50–6.00 | .42 | .47 | | | | | | | |
| (4) 6-Item scale | 4.21 | 1.18 | .70 | 1.50–7.00 | .73 | .78 | .61 | | | | | | |
| (5) 9-Item scale | 4.59 | 1.04 | .76 | 1.89–7.00 | .78 | .80 | .65 | .96 | | | | | |
| (6) 12-Item scale | 4.84 | .98 | .81 | 1.83–6.50 | .81 | .81 | .65 | .95 | .98 | | | | |
| (7) Random subset 1 | 4.15 | .73 | .60 | 2.58–6.42 | .84 | .56 | .66 | .80 | .83 | .83 | | | |
| (8) Random subset 2 | 4.43 | .78 | .65 | 2.75–6.42 | .76 | .64 | .72 | .87 | .85 | .85 | .85 | | |
| (9) Random subset 3 | 4.33 | .55 | .64 | 2.67–6.08 | .80 | .66 | .67 | .74 | .80 | .81 | .86 | .78 | |
| (10) Total FCCERS-R | 4.38 | .74 | .87 | 2.55–6.30 | .84 | .77 | .76 | .86 | .91 | .93 | .89 | .89 | .90 |

Note: All correlations are significant at .01 level.

4.4. Discriminant analysis

We next explored the extent to which the subsets assigned family child care homes to the same quality categories as the total FCCERS-R score. We assigned family child care homes to quality designations using two different methods. First, following the same methods used by previous studies that examined the factor structures of the ITERS-R and ECERS-R (see, e.g., Bisceglia et al., 2009; Perlman et al., 2004), we assigned homes to three quality levels based on their overall FCCERS-R scores. Family child care homes with a total FCCERS-R score between 1 and 3 were classified as “Poor,” family child care homes with a score greater than 3 but less than 5 were classified as “Average,” and family child care homes with a score equal to or greater than 5 were classified as “Good.” Overall, there were 5 family child care homes classified as Poor, 77 classified as Average, and 17 classified as Good.

Next, we classified full FCCERS-R scores based on the quality designations outlined in the state’s QRIS. The state’s QRIS uses a point system whereby a home is assigned 0 points if the FCCERS-R score falls below 3.50, 2 points if the FCCERS-R score falls between 3.50 and 3.99, 4 points if the FCCERS-R score falls between 4.00 and 4.69, 6 points if the FCCERS-R score falls between 4.70 and 5.49, 8 points if the FCCERS-R score falls between 5.50 and 5.99, and 10 points if the FCCERS-R score is at or above 6.00 (Tout et al., 2010). Under the state’s quality classification system, 11 homes received 0 points, 17 homes received 2 points, 35 homes received 4 points, 30 homes received 6 points, 5 homes received 8 points, and 1 home received 10 points.

We conducted discriminant analyses to explore the correspondence in quality designations between the subsets and the full FCCERS-R instrument. To assign the prior probabilities, we assumed that the distribution of family child care homes was proportionally weighted across the quality categories in the same manner as the sample distribution. This method is similar to that employed by Perlman et al. (2004).

Table 4 provides the correspondence between the three-category classifications (Poor, Average, and Good) based on the full FCCERS-R instrument and the classifications based on the subsets. The accuracy of classifications was consistent across the three factor subsets, with the factor subsets matching the full FCCERS-R designations approximately three quarters of the time. However, across all three factor subsets, there was a trend toward systematic under-prediction.
For example, whereas the full FCCERS-R scale classified 17 family child care homes as Good, the Language/Interaction subscale classified all of these homes as Average. Similarly, approximately half of the family child care homes that were designated by the full FCCERS-R scale as Good were classified as Average by the Activities/Materials and Organization subsets.

Classification accuracy with the three-category quality levels was better among the random subsets and the 6-, 9-, and 12-item scales. For these subsets, the accuracy rates ranged from 83% to 91%, with the highest accuracy rate found on the 12-item subset. Misclassifications were most likely to occur on the Good classification, with a trend toward under-prediction. Namely, family child care homes that were designated by the full FCCERS-R instrument as Good were classified by the subsets as Average.

Table 5 provides the correspondence between the state’s quality point classifications and the classifications based on the subsets. The accuracy of classifications varied greatly by subset. Misclassification rates for the factor-derived subsets on the state’s quality point system were notably higher than the misclassification rates using the three-category quality classifications. The factor subsets matched the full FCCERS-R designations only 44–55% of the time. All three factor subsets (Activities/Materials, Language/Interaction, and Organization) tended to over-predict at the lower end of the point scale and under-predict at the higher end of the point scale. While the full FCCERS-R identified homes as being at the lower end of the point scale, the factor subsets classified many of these same programs as being in the mid-range of the point system. Conversely, whereas the FCCERS-R scale classified programs at the high end of the continuum, the factor subsets classified many of these same programs within the mid-range of the quality point continuum.
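The two classification schemes described in Section 4.4 can be written out as a short sketch. The cut points are the ones stated in the text; the function names are introduced here for illustration, and treating each QRIS band as running up to (but not including) the next lower bound is an assumption, since the text reports the bands only to two decimals.

```python
from bisect import bisect_right

# Lower bounds of the state's QRIS bands as reported in the text.
# Assumption: a band runs from its lower bound up to the next lower bound.
QRIS_CUTS = [3.50, 4.00, 4.70, 5.50, 6.00]
QRIS_POINTS = [0, 2, 4, 6, 8, 10]


def three_level(score):
    """Poor: 1 to 3; Average: above 3 and below 5; Good: 5 or higher."""
    if score <= 3:
        return "Poor"
    if score < 5:
        return "Average"
    return "Good"


def qris_points(score):
    """Map a mean FCCERS-R score to the state's 0-10 point rating."""
    return QRIS_POINTS[bisect_right(QRIS_CUTS, score)]


# Illustrative scores only; 4.38 is the sample mean of the total score.
for score in (2.9, 3.6, 4.38, 4.95, 5.05, 6.1):
    print(score, three_level(score), qris_points(score))
```

The six-band point system has more cut points inside the range where most homes score, so a small difference between a subset mean and the full-scale mean is more likely to cross a boundary than it is under the three-level scheme.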
Classification accuracy with the state’s quality point system was also higher for the randomly selected subsets and the 6-, 9-, and 12-item subsets than for the factor subsets, ranging from 60% to 69%. Similar trends were noted in misclassifications as with the factor subsets. Namely, the 6-, 9-, and 12-item subsets and the randomly selected subsets tended to over-predict at the lower end of the quality-point scale, and under-predict at the higher end.

4.5. Relationships with other indicators of quality

Given the potential of the subsets to serve as proxies for the full FCCERS-R scale, we next examined whether the subsets showed the same relationships to other indicators of quality as the total FCCERS-R scale. Table 6 provides the correlations between the factor subsets, random subsets, the 6-, 9-, and 12-item scales, and the total FCCERS-R scale with adult–child ratios, group size, and providers’ years of experience, ECE credits, and degree status (defined as whether or not the provider obtained at least an associate’s degree). As shown in Table 6, the subsets generally showed the same magnitude of correlations with the other quality indicators as the full FCCERS-R scale. However, for both the subsets and the total FCCERS-R score, no significant relationships were observed with the other regulatable quality indicators.

5. Discussion

This exploratory study examined the measurement properties of the FCCERS-R, a widely used scale for the measurement of child care quality in family child care homes. Although the results of this study are preliminary, and replication in larger samples and in more diverse regulatory contexts is needed, our results suggest that the FCCERS-R captured several distinct aspects of quality. However, it did not appear to capture the seven dimensions of quality hypothesized by the developers.
Instead, the exploratory factor analysis identified three dimensions of quality: Activities/Materials, Language/Interaction, and Organization. All three factors were highly correlated with the full FCCERS-R scale, and the first two factors had internal consistency estimates that were comparable to the full FCCERS-R scale.

Table 4
Accuracy of classification predictions of the three-category quality levels using the FCCERS-R subsets. Columns give group membership based on total FCCERS-R score.

| Subset | Subset classification | Poor (n = 5) | Average (n = 77) | Good (n = 17) | Percent correct |
|---|---|---|---|---|---|
| Activities/Materials | Poor | 0 | 0 | 0 | 75.75 |
| | Average | 5 | 68 | 10 | |
| | Good | 0 | 9 | 7 | |
| Language/Interaction | Poor | 3 | 0 | 0 | 76.76 |
| | Average | 2 | 73 | 17 | |
| | Good | 0 | 4 | 0 | |
| Organization | Poor | 1 | 0 | 0 | 75.75 |
| | Average | 4 | 68 | 11 | |
| | Good | 0 | 9 | 6 | |
| 6-Item scale | Poor | 4 | 1 | 0 | 86.87 |
| | Average | 1 | 73 | 8 | |
| | Good | 0 | 3 | 9 | |
| 9-Item scale | Poor | 4 | 2 | 0 | 88.89 |
| | Average | 1 | 73 | 6 | |
| | Good | 0 | 2 | 11 | |
| 12-Item scale | Poor | 4 | 0 | 0 | 90.91 |
| | Average | 1 | 74 | 5 | |
| | Good | 0 | 3 | 12 | |
| Random subset 1 | Poor | 2 | 1 | 0 | 86.87 |
| | Average | 3 | 73 | 6 | |
| | Good | 0 | 3 | 11 | |
| Random subset 2 | Poor | 3 | 1 | 0 | 85.86 |
| | Average | 2 | 73 | 8 | |
| | Good | 0 | 3 | 9 | |
| Random subset 3 | Poor | 2 | 2 | 0 | 82.83 |
| | Average | 3 | 70 | 7 | |
| | Good | 0 | 5 | 10 | |

We also explored the possibility that we could create abridged versions of the FCCERS-R scale that would show similar measurement properties as the full FCCERS-R instrument. In addition to the subsets created from the factor structure, we also explored randomly chosen subsets and purposively chosen subsets (i.e., 6-, 9-, and 12-item scales comprised of the items with the highest factor loading across the three factors). We examined the extent to which these subsets accurately classified family child care homes based on three quality categories (Poor, Average, and Good) and based on the state’s QRIS point system, which utilizes six quality categories.
We found that classification accuracy for the factor subsets was particularly poor, with systematic under-prediction for the three quality categories, and high misclassification rates for the state’s QRIS point system. In contrast, the classification accuracy for the purposively chosen subsets and the randomly chosen subsets was higher, particularly on the three-category quality designations, where classification accuracy was over 90% for the 12-item subset. The classification rates also became less accurate when using the more granular quality categories found in the state’s QRIS point system. This suggests that within this state’s specific QRIS, the randomly chosen and purposively chosen subsets may not be accurate at assigning QRIS points. However, it is possible that these subsets may be more accurate at assigning QRIS points in states that utilize fewer FCCERS-R quality classifications, as is the case with the majority of QRIS operating throughout the country (Tout et al., 2010).

Despite the high internal consistency estimates and the strong correlations with the total FCCERS-R score, the factor subsets turned out to be poorer proxies for the full FCCERS-R scale than the other subsets examined. That is, the randomly chosen subsets and the 6-, 9-, and 12-item scales were relatively good proxies for the full FCCERS-R instrument, assigning family child care homes to the same quality categories as the full FCCERS-R scale in upwards of 83% of the cases when using the three quality classifications. In addition, these subsets showed the same relationships to other quality indicators as the total FCCERS-R score. Although the results should be interpreted cautiously because of the small sample size, the findings are consistent with our expectations, as child care quality is generally perceived to be multidimensional (Marshall, 2004).
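The percent-correct figures and the direction of misclassification discussed above can be recomputed from the cell counts in Table 4. The sketch below uses the 12-item scale and Language/Interaction counts; it assumes rows are the subset's classification and columns are the full-scale classification (Poor, Average, Good), a reading under which the column totals equal the reported group sizes of 5, 77, and 17. The function names are introduced here for illustration.

```python
# Rows: subset classification; columns: classification from the full FCCERS-R.
# Category order in both directions: Poor, Average, Good (counts from Table 4).
TWELVE_ITEM = [
    [4, 0, 0],
    [1, 74, 5],
    [0, 3, 12],
]
LANGUAGE_INTERACTION = [
    [3, 0, 0],
    [2, 73, 17],
    [0, 4, 0],
]


def percent_correct(matrix):
    """Share of homes on the diagonal, i.e. placed in the same category by both."""
    total = sum(sum(row) for row in matrix)
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return 100 * agree / total


def under_over(matrix):
    """Homes the subset places below / above their full-scale category."""
    n = len(matrix)
    under = sum(matrix[i][j] for i in range(n) for j in range(n) if i < j)
    over = sum(matrix[i][j] for i in range(n) for j in range(n) if i > j)
    return under, over


for name, m in (("12-item", TWELVE_ITEM), ("Language/Interaction", LANGUAGE_INTERACTION)):
    print(name, round(percent_correct(m), 2), under_over(m))
```

For the 12-item scale, 90 of 99 homes fall on the diagonal, which matches the reported 90.91%. For Language/Interaction, 76 of 99 agree, and all 17 homes rated Good by the full scale sit in the subset's Average row, which is the under-prediction pattern described in the text.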
Each individual factor subset assessed only one specific aspect of quality, either activities and materials, or language and interactions, or the organization of the family child care home. Because each factor subset was narrowly defined, using a single factor subset alone as a proxy for the full FCCERS-R instrument underrepresented the full range of quality constructs measured by the FCCERS-R. This result is consistent with the findings of Bisceglia et al. (2009) as well as with Cassidy, Hestenes, Hegde, et al. (2005), who reported that a combination of items that assess both process and structural features of quality is best for simulating the full instrument. This may explain why the random subsets and the 6-, 9-, and 12-item scales were superior proxies for the total FCCERS-R scale. The random subsets and 6-, 9-, and 12-item scales assessed a wider range of quality constructs than the factor subsets because they included items across all three factors. Thus, the random subsets and 6-, 9-, and 12-item scales were more similar in content to the FCCERS-R scale than the factor subsets.

Table 5
Accuracy of classification predictions of the state’s QRIS point system using the FCCERS-R subsets. Columns give the point rating based on total FCCERS-R score.

| Subset | Subset classification | 0 (n = 11) | 2 (n = 17) | 4 (n = 35) | 6 (n = 30) | 8 (n = 5) | 10 (n = 1) | Percent correct |
|---|---|---|---|---|---|---|---|---|
| Activities/Materials | 0 | 5 | 5 | 1 | 0 | 0 | 0 | 44.44 |
| | 2 | 4 | 5 | 6 | 0 | 0 | 0 | |
| | 4 | 2 | 7 | 14 | 12 | 0 | 0 | |
| | 6 | 0 | 0 | 14 | 18 | 4 | 1 | |
| | 8 | 0 | 0 | 0 | 0 | 1 | 0 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 1 | |
| Language/Interaction | 0 | 7 | 3 | 0 | 0 | 0 | 0 | 54.55 |
| | 2 | 3 | 6 | 3 | 0 | 0 | 0 | |
| | 4 | 1 | 7 | 22 | 11 | 1 | 0 | |
| | 6 | 0 | 1 | 10 | 19 | 4 | 1 | |
| | 8 | 0 | 0 | 0 | 0 | 0 | 0 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| Organization | 0 | 5 | 3 | 0 | 0 | 0 | 0 | 51.52 |
| | 2 | 0 | 0 | 2 | 1 | 1 | 0 | |
| | 4 | 6 | 13 | 26 | 9 | 9 | 0 | |
| | 6 | 0 | 1 | 7 | 19 | 19 | 3 | |
| | 8 | 0 | 0 | 0 | 1 | 1 | 2 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 6-Item scale | 0 | 9 | 3 | 0 | 0 | 0 | 0 | 65.66 |
| | 2 | 1 | 6 | 3 | 0 | 0 | 0 | |
| | 4 | 1 | 7 | 25 | 8 | 0 | 0 | |
| | 6 | 0 | 1 | 7 | 22 | 2 | 0 | |
| | 8 | 0 | 0 | 0 | 0 | 3 | 1 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 9-Item scale | 0 | 9 | 2 | 0 | 0 | 0 | 0 | 67.68 |
| | 2 | 2 | 7 | 2 | 0 | 0 | 0 | |
| | 4 | 0 | 7 | 29 | 9 | 0 | 0 | |
| | 6 | 0 | 1 | 4 | 20 | 3 | 0 | |
| | 8 | 0 | 0 | 0 | 1 | 2 | 1 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 12-Item scale | 0 | 9 | 1 | 0 | 0 | 0 | 0 | 68.69 |
| | 2 | 2 | 9 | 5 | 0 | 0 | 0 | |
| | 4 | 0 | 7 | 24 | 7 | 0 | 0 | |
| | 6 | 0 | 0 | 6 | 22 | 1 | 0 | |
| | 8 | 0 | 0 | 0 | 1 | 4 | 1 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| Random subset 1 | 0 | 8 | 2 | 0 | 0 | 0 | 0 | 59.60 |
| | 2 | 3 | 6 | 5 | 1 | 0 | 0 | |
| | 4 | 0 | 7 | 23 | 8 | 0 | 0 | |
| | 6 | 0 | 2 | 7 | 19 | 2 | 1 | |
| | 8 | 0 | 0 | 0 | 2 | 3 | 0 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 1 | |
| Random subset 2 | 0 | 6 | 1 | 0 | 0 | 0 | 0 | 62.22 |
| | 2 | 5 | 5 | 6 | 0 | 0 | 0 | |
| | 4 | 0 | 10 | 26 | 7 | 0 | 0 | |
| | 6 | 0 | 1 | 3 | 22 | 2 | 0 | |
| | 8 | 0 | 0 | 0 | 1 | 3 | 1 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |
| Random subset 3 | 0 | 10 | 3 | 0 | 0 | 0 | 0 | 68.69 |
| | 2 | 1 | 9 | 4 | 1 | 0 | 0 | |
| | 4 | 0 | 5 | 23 | 7 | 0 | 0 | |
| | 6 | 0 | 0 | 8 | 21 | 0 | 0 | |
| | 8 | 0 | 0 | 0 | 1 | 5 | 1 | |
| | 10 | 0 | 0 | 0 | 0 | 0 | 0 | |

Table 6
Correlations between selected FCCERS-R subsets and total FCCERS-R score with regulatable quality indicators.

| Scale | Ratios | Group size | Years of experience | ECE credits | A.A. degree or higher |
|---|---|---|---|---|---|
| Total FCCERS-R | −.01 | −.18 | .12 | .10 | −.04 |
| Activities/Materials | .01 | −.11 | .14 | .12 | −.08 |
| Language/Interaction | .01 | −.18 | .10 | .01 | −.06 |
| Organization | −.03 | −.11 | −.07 | .08 | .02 |
| 6-Item scale | −.05 | −.20 | .11 | .12 | −.07 |
| 9-Item scale | −.06 | −.20 | .12 | .10 | −.07 |
| 12-Item scale | −.04 | −.20 | .14 | .08 | −.07 |
| 12-Item random subset 1 | −.05 | −.02 | .14 | .11 | −.06 |
| 12-Item random subset 2 | −.05 | −.15 | .12 | .15 | −.04 |
| 12-Item random subset 3 | −.03 | −.08 | .18 | .07 | −.08 |

If confirmed with larger and more diverse samples, the results of this study have potential implications for cost savings in the administration of the FCCERS-R. Our study suggests that as few as six items can be administered, and the resulting subset can serve as a reasonable proxy for the full scale. The 6-item scale, for example, was comparable to, if not better than, the 12-item random subsets in terms of its classification accuracy with the three quality categories. The 6-item scale also showed a high internal reliability estimate and similar relationships to other quality indicators as the full FCCERS-R scale.
However, it is important to emphasize that the 6-item scale was purposively constructed, so that items with the highest factor loadings were selected for inclusion. Other subsets created from a different set of six items may not be good proxies for the total FCCERS-R score. For example, the Language/Interaction factor was also comprised of six items, and had an even higher internal consistency estimate than the 6-item scale, yet proved to be a poorer proxy for the total FCCERS-R score. The contrast between the Language/Interaction subset and the 6-item scale in terms of their psychometric properties highlights the tension associated with creating measures that attempt to capture both breadth and depth. All else being equal, measures that have higher internal consistency estimates will be more sensitive to detecting the effects of quality than less reliable measures (Light, Singer, & Willett, 1990). Yet, measures that have high internal consistency estimates may also be focusing on very specific or narrow aspects of quality. Thus, there may be trade-offs between creating measures that represent the full range of quality, while also maintaining high internal consistency estimates for those measures. To help evaluate these trade-offs, it will be important to understand the functioning of subsets of the FCCERS-R in relationship to outcomes of interest to states. In this study, we were only able to examine the full FCCERS-R and FCCERS-R subsets in relation to regulatable quality indices measured in most states’ QRIS. While the subsets showed similar associations as the full FCCERS-R, it is unclear whether analogous relationships would be observed if child outcomes had been examined instead. 
It may be the case that subsets of items that specifically measure care and instructional processes demonstrate stronger relationships to children’s learning and development than subsets that are primarily composed of items related to physical space and health practices, which are more distal to children’s learning. In recent research examining the functioning of the ITERS-R, items included in the Interaction scale demonstrated stronger relationships with children’s cognitive development than did items on other subscales (Setodji, Le, & Schaack, in press). If similar results are observed with the use of the FCCERS-R in family child care homes, then it may make sense for states to select the narrower Language/Interaction subset and consider the trade-offs with respect to classification accuracy, relative to the full FCCERS-R instrument. Within the context of this study, we did not have access to other measures of process quality, as they were not used in the state’s QRIS, and we were only able to relate FCCERS-R subsets to structural quality variables. In future studies, FCCERS-R factors should be examined in relation to measures of process quality, such as instructional support and emotional climate, which have been shown to be associated with children’s developmental outcomes (Pianta et al., 2008). If specific FCCERS-R factors or items are redundant with other process quality measures, states may opt to select shorter versions of the FCCERS-R that are not overlapping in constructs.

It is also important to consider potential implications of administering a subset of FCCERS-R items within accountability and quality improvement contexts. It is certainly possible that under these high-stakes conditions, providers may focus on changing practices in ways that comport only with the items selected. Consequently, it may be important for providers to be blind to the items assessed to help ensure that they focus on improving multiple aspects of quality.
Sampling fewer items also limits the breadth of information that can be used by providers and coaches to inform quality improvement efforts. One strategy to overcome this limitation is to train coaches on the FCCERS-R, and to have the coaches and providers conduct a self-assessment using the full range of FCCERS-R items. Generating quality improvement goals from facilitated self-assessments may be more aligned with sound adult learning principles, and may therefore be more effective at changing practices than generating them from top-down high-stakes assessments (Palsha & Wesley, 1998). Such a strategy may also serve to reduce the costs of assessment and monitoring efforts by reducing rater training time, and allow for more resources to be directed toward quality improvement efforts or administering other measures of quality more strongly associated with children’s well-being (Pianta et al., 2008). It is also possible that administering fewer items on the FCCERS-R may reduce the time needed to conduct observations and therefore reduce provider stress associated with the administration of the FCCERS-R (Pope, Denny, Homer, & Ricci, 2006). This, in turn, may prompt more family child care homes to participate in quality enhancement projects.

It is also important to consider the broader implications of using a measure such as the FCCERS-R in high-stakes contexts. The FCCERS-R, by design, is global in nature, and may not adequately assess the types of care and instructional practices most associated with children’s social development and academic learning (Snow & Van Hemel, 2008). Indeed, the majority of items relate to the physical home environment, including the furniture, quantity of learning materials, and the organization of the physical space, and far fewer items tap into care and instructional processes (Cassidy, Hestenes, Hansen, Hedge, & Shim, 2005).
By using the FCCERS-R as an accountability and monitoring tool, providers and those tasked with supporting family child care improvement also appear to be narrowing their focus to the “mechanics” of early childhood education measured on the FCCERS-R, to the exclusion of strengthening pedagogy and caregiver interactions in ways that promote child learning and development (Smith, Schneider, & Kreader, 2010). Consequently, the unintended consequences of using global measures should be carefully considered before such measures are administered in accountability contexts.

It is also interesting to note that although we found similar relationships between the full FCCERS-R and FCCERS-R subsets with regulatable quality indices, neither the full FCCERS-R scale nor the subsets were related to adult–child ratios, group sizes, or provider training and education variables. The low adult–child ratios and group sizes observed in this study may have contributed to these findings. It is unclear, however, why different dimensions of provider training and education were unrelated to the FCCERS-R or to the subsets. Providers in this study, like most nationally (Herzenberg et al., 2005), had low levels of formal education and had completed very few classes related to young children. It is possible that there is a minimum threshold of education that is needed before relationships between global quality and provider education can be observed. In this study, we also did not have information related to the content and quality of the coursework providers had completed, which may also be important factors in explaining the lack of relationships found (Whitebook & Ryan, 2011). However, it is also possible that within high-stakes initiatives such as QRIS, providers, regardless of their education, are attuned to the constructs measured in the FCCERS-R because of the consequences of their scores to the levels of funding received by their program.
Therefore, they may comport their programs to meet the FCCERS-R criteria, thus attenuating the relationships between regulatable quality variables and observed quality.

It is also important to note that the Parents and Staff subscale on the FCCERS-R was not administered in this study because the state’s QRIS was concerned with respondent bias associated with self-report measures, especially within the context of high-stakes assessments. Nonetheless, provider and family needs are an important element of quality and are often included in state QRISs, but the construct itself has been difficult to measure (Zellman & Perlman, 2006). However, there are efforts underway to develop better measures of family sensitive care (see Bromer et al., 2011). When such measures are available, it will be important to understand the relationships between FCCERS-R factors and these measures.

There are important limitations to this study that need to be addressed in future research before states consider adopting abridged versions of the FCCERS-R. Because of the small sample size, we were unable to conduct a confirmatory factor analysis to test the robustness of the three-factor solution. Nonetheless, our results indicated a high level of over-determination of factors (i.e., many items with strong loadings within each factor) and moderate to strong communalities among items, which suggests stability in our solution (Costello & Osborne, 2005; MacCallum, Widaman, Preacher, & Hong, 2001). It is also important to note that our sample only included family child care homes from one regulatory climate. Therefore, this study should be considered preliminary, and additional confirmatory studies should be conducted to determine whether our structure is replicated in other settings, including within family child care homes in states with differing regulatory environments governing family child care.
As policymakers increase the consequences attached to the FCCERS-R, more attention needs to be paid to its psychometric properties. This study found evidence that the FCCERS-R is a multidimensional measure of quality, assessing several distinct aspects of family child care quality. Our results also suggest that fewer than one fifth of the FCCERS-R items could be administered, and the resulting subsets would remain psychometrically sound. In addition to conducting confirmatory factor analysis on the factors found in this study, future studies should also explore the relationships of these factors as well as abridged versions of the FCCERS-R to children’s outcomes. Cost–benefit analyses are also needed to determine whether the potential savings associated with a reduced administration of the FCCERS-R warrant the potential increases in inaccuracies or misclassifications that may arise from using an abridged version.

References

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Bisceglia, R., Perlman, M., Schaack, D., & Jenkins, J. (2009). Examining the psychometric properties of the Infant Toddler Environment Rating Scale in high stakes contexts. Early Childhood Research Quarterly, 24(3), 121–132.
Bromer, J., Paulsell, D., Porter, T., Weber, R., Henly, J., Ramsburg, D., et al. (2011). Family-sensitive caregiving: A key component of quality in early care and education. In M. Zaslow, K. Tout, T. Halle, & I. Martinez-Beck (Eds.), Next steps in the measurement of quality in early childhood settings. Baltimore, MD: Brookes.
Burchinal, M.
R., Howes, C., & Kontos, S. (2002). Structural predictors of child care quality in child care homes. Early Childhood Research Quarterly, 17(1), 87–105.
Cassidy, D., Hestenes, L., Hansen, J., Hedge, A., & Shim, J. (2005). Revisiting the two faces of child care quality: Structure and process. Early Education and Development, 16(4), 505–520.
Cassidy, D. J., Hestenes, L. L., Hegde, A., Hestenes, S., & Mims, S. (2005). Measurement of quality in preschool childcare classroom: An exploratory and confirmatory factor analysis of the Early Childhood Environment Rating Scale—Revised. Early Childhood Research Quarterly, 20(3), 345–360.
Clarke-Stewart, A. K., Vandell, D. L., Burchinal, M., O’Brian, M., & McCartney, K. (2002). Do regulatable features of child-care homes affect children’s development? Early Childhood Research Quarterly, 17(1), 52–86.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment Research & Evaluation, 10(7), 1–9.
Farran, D. C., & Son-Yarbrough, W. (2001). Title I funded preschools as a developmental context for children’s play and verbal behaviors. Early Childhood Research Quarterly, 16(2), 245–262.
Forry, N. D., Iruka, I., Kainz, K., Tout, K., Torquati, J., Susman-Stillman, A., et al. (2012). Identifying profiles of quality in home-based child care [Issue Brief OPRE 2012-20]. Washington, DC: Office of Planning, Research and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.
Harms, T., Clifford, R. M., & Cryer, D. (1998). Early Childhood Environment Rating Scale Revised Edition. New York, NY: Teachers College Press.
Harms, T., Cryer, D., & Clifford, R. M. (2003). The Infant/Toddler Environment Rating Scale—Revised Edition. New York, NY: Teachers College Press.
Harms, T., Cryer, D., & Clifford, R. M. (2007). Family Child Care Environment Rating Scale Revised Edition.
New York, NY: Teachers College Press.
Helburn, S. W. (1995). Cost, quality, and child outcomes in child care centers. Technical report. Denver: Department of Economics, Center for Research in Economics and Social Policy, University of Colorado at Denver.
Herzenberg, S., Price, M., & Bradley, D. (2005). Losing ground in early childhood education: Declining workforce qualifications in an expanding industry, 1979–2004. Harrisburg, PA: Keystone Research Center.
Hestenes, L., Cassidy, D., Hegde, A., & Hansen, J. (2007). Quality in inclusive and non-inclusive infant and toddler classrooms. Journal of Research in Childhood Education, 22(1), 69–84.
Johnson, J. O. (2005). Who’s minding the kids? Child care arrangements: Winter 2002. Washington, DC: U.S. Census Bureau.
Kontos, S., Howes, C., Shinn, M., & Galinsky, E. (1995). Quality in family child care and relative care. New York, NY: Teachers College Press.
Koretz, D. M., McCaffrey, D. F., & Hamilton, L. S. (2001). Toward a framework for validating gains under high-stakes conditions [CSE technical report 551]. Los Angeles, CA: Center for Research on Evaluation, Standards, and Student Testing.
Kryzer, E., Kovan, N., Phillips, D., Donagall, L., & Gunnar, M. (2007). Toddlers’ and preschoolers’ experiences in family day care: Age differences and behavioral correlates. Early Childhood Research Quarterly, 22(4), 451–466.
La Paro, K. M., Sexton, D., & Snyder, P. (1998). Program quality characteristics in segregated and inclusive early childhood settings. Early Childhood Research Quarterly, 13(1), 151–168.
Layzer, J. I., & Goodson, B. D. (2006). Care in the home: A description of family child care and the experiences of families and children that use it: National study for low-income families, wave 1 report. Cambridge, MA: Abt Associates.
Le, V., Perlman, M., Zellman, G. L., & Hamilton, L. S. (2006). Measuring child–staff ratios in child care centers: Balancing effort and representativeness.
Early Childhood Research Quarterly, 21(3), 267–279.
Light, R. J., Singer, J. D., & Willett, J. B. (1990). By design: Planning research on higher education. Cambridge, MA: Harvard University Press.
Loeb, S., Fuller, B., Kagan, S. L., Carrol, B., & Carroll, J. (2004). Child care in poor communities: Early learning effects of type, quality, and stability. Child Development, 75(1), 47–65.
MacCallum, R. C., Widaman, K. F., Preacher, K. J., & Hong, S. (2001). Sample size in factor analysis: The role of model error. Multivariate Behavioral Research, 36, 611–637.
Marshall, N. (2004). The quality of early child care and children’s development. Current Directions in Psychological Sciences, 13(4), 165–168.
NICHD ECCRN. (2000). Characteristics and quality of child care for toddlers and preschoolers. Applied Developmental Science, 4(3), 116–141.
NICHD ECCRN, & Duncan, G. J. (2003). Does quality of child care affect child outcomes at age 4½? Developmental Psychology, 39(3), 451–469.
Palsha, S. A., & Wesley, P. W. (1998). Improving quality in early childhood environments through on-site consultation. Topics in Early Childhood Special Education, 18(4), 243–253.
Paulsell, D., Porter, T., & Kirby, G. (2010). Supporting quality in home-based child care. Final brief. Princeton, NJ: Mathematica Policy Research.
Peisner-Feinberg, E. S., Burchinal, M. R., Clifford, R. M., Culkin, M. L., Howes, C., Kagan, S. L., et al. (2001). The relation of preschool child-care quality to children’s cognitive and social developmental trajectories through second grade. Child Development, 72, 1534–1553.
Perlman, M., Zellman, G., & Le, V. (2004). Examining the psychometric properties of the Early Childhood Environment Rating Scale—Revised. Early Childhood Research Quarterly, 19(3), 398–412.
Pianta, R. C., La Paro, K., & Hamre, B. K. (2008). Classroom Assessment Scoring System. Baltimore: Paul H. Brookes.
Pope, B. G., Denny, J. H., Homer, K., & Ricci, K. (2006). What is working? What is not working?
Report on the qualitative study of the Tennessee Report Card and StarQuality Program and Support System. Knoxville, TN: The University of Tennessee College of Social Work, Office of Research and Public Service. Raikes, H. A., Raikes, H., & Wilcox, B. (2005). Regulation, subsidy receipt and provider characteristics: What predicts quality in child care homes? Early Childhood Research Quarterly, 20(2), 164–184. Sakai, L. M., Whitebook, M., Wishard, A., & Howes, C. (2003). Evaluating the Early Childhood Environment Rating Scale (ECERS): Assessing differences between the first and revised edition. Early Childhood Research Quarterly, 18(4), 427–445. Scarr, S., Eisenberg, M., & Deater-Deckard, K. (1994). Measurement of quality in child care centers. Early Childhood Research Quarterly, 9(2), 131–151. Schaack, D., Tarrant, K., Boller, K., & Tout, K. (2012). Quality rating and improvement systems: Frameworks for early care and education systems change. In S. L. Kagan, & K. Kaurez (Eds.), Early childhood systems: Transforming early learning (pp. 71–86). New York, NY: Teacher’s College Press. Setodji, C. M., Le, V., & Schaack, D. Using Generalized Additive Modeling to Empirically Identify Thresholds Within the ITERS in Relation to Toddlers’ Cognitive Development. Developmental Psychology, in press. Smith, M. L., & Fey, P. (2000). Validity and accountability in high-stakes testing. Journal of Teacher Education, 51(5), 334–344. Smith, S., Schneider, W., & Kreader, J. L. (2010). Features of professional development and on-site assistance in child care quality rating improvement systems: A survey of state-wide systems. New York, NY: National Center for Children in Poverty, Columbia University Mailman School of Public Health. Snow, C. E., & Van Hemel, S. B. (2008). Early childhood assessment: Why, what, and how. Washington, DC: National Academies Press. Streiner, D. L., & Norman, G. R. (1989).
Health Measurement Scales: A practical guide to their development and use. New York, NY: Oxford University Press. Tinsley, H. E. A., & Tinsley, D. J. (1987). Uses of factor analysis in counseling psychology research. Journal of Counseling Psychology, 34(4), 414–424. Tout, K., Starr, R., Soli, M., Moodie, S., Kirby, G., & Boller, K. (2010, April). The child care quality rating system (QRS) assessment: Compendium of quality rating systems and evaluations, OPRE report. Washington, DC: Office of Planning, Research and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services. Tout, K., Zaslow, M., Halle, T., & Forry, N. (2009). Issues for the next decade of quality rating and improvement systems. Washington, DC: Child Trends and the U.S. Department of Health and Human Services, Administration for Children and Families, Office of Planning, Research and Evaluation. Truxillo, C. (2005, April). Maximum likelihood parameter estimation with incomplete data. In Proceedings of the thirtieth annual SAS® Users Group International Conference, Philadelphia, PA. Whitebook, M., Howes, C., & Phillips, D. (1990). Who cares? Child care teachers and the quality of care in America. Final report: National Child Care Staffing Study. Oakland, CA: Child Care Employee Project. Whitebook, M., & Ryan, S. (2011). Degrees in context: Asking the right questions about preparing skilled and effective teachers of young children [Preschool Policy Brief, 22]. New Brunswick, NJ: National Institute for Early Education Research. Whitebook, M., & Sakai, L. (2004). By a thread: How child care centers hold on to teachers, how teachers build lasting careers. Kalamazoo, MI: The Upjohn Institute for Employment Research. Zellman, G. L., & Perlman, M. (2006). Parent involvement in child care settings: Conceptual and measurement issues. Early Child Development and Care, 176(5), 521–538. Zellman, G., Perlman, M., Le, V., & Setodji, C. M. (2008).
Assessing the validity of the Qualistar Early Learning Quality Rating and Improvement System as a tool for improving child care quality. Santa Monica, CA: RAND Corporation.