College Composition Weekly: Summaries of research for college writing professionals

Read, Comment On, and Share News of the Latest from the Rhetoric and Composition Journals



Pruchnic et al. Mixed Methods in Direct Assessment. Journal of Writing Assessment, 2018. Posted 12/01/2018.

Pruchnic, Jeff, Chris Susak, Jared Grogan, Sarah Primeau, Joe Torok, Thomas Trimble, Tanina Foster, and Ellen Barton. “Slouching Toward Sustainability: Mixed Methods in the Direct Assessment of Student Writing.” Journal of Writing Assessment 11.1 (2018). Web. 27 Nov. 2018.

[Page numbers refer to a PDF generated from the journal's print dialogue]

Jeff Pruchnic, Chris Susak, Jared Grogan, Sarah Primeau, Joe Torok, Thomas Trimble, Tanina Foster, and Ellen Barton report on an assessment of “reflection argument essay[s]” from the first-year-composition population of a large, urban, public research university (6). Their assessment used “mixed methods,” including a “thin-slice” approach (1). The authors suggest that this method can address difficulties faced by many writing programs in implementing effective assessments.

The authors note that many of the stakeholders to whom writing programs must report place a high value on large-scale quantitative assessments (1). They write that the validity of such assessments is often measured in terms of statistically determined interrater reliability (IRR) and of samples considered large enough to represent the population adequately (1).

Administrators and faculty of writing programs often find that implementing this model requires time and resources that may not be readily available, even for smaller programs. Critics of this model note that one of its requirements, high interrater reliability, can too easily come to stand in for validity (2); in the view of Peter Elbow, such assessments favor “scoring” over “discussion” of the results (3). Moreover, according to the authors, critics point to the “problematic decontextualization of program goals and student achievement” that large-scale assessments can foster (1).

In contrast, Pruchnic et al. report, writing programs have tended to value the “qualitative assessment of a smaller sample size” because such models more likely produce the information needed for “the kinds of curricular changes that will improve instruction” (1). Writing programs, the authors maintain, have turned to redefining a valid process as one that can provide this kind of information (3).

Pruchnic et al. write that this resistance to statistically sanctioned assessments has created a bind for writing programs. They cite scholars like Peggy O’Neill (2) and Richard Haswell (3) to posit that when writing programs refuse the measures of validity required by external stakeholders, they risk having their conclusions dismissed and may well find themselves subject to outside intervention (3). Haswell’s article “Fighting Number with Number” proposes producing quantitative data as a rhetorical defense against external criticism (3).

In the view of the authors, writing programs are still faced with “sustainability” concerns:

The more time one spends attempting to perform quantitative assessment at the size and scope that would satisfy statistical reliability and validity, the less time . . . one would have to spend determining and implementing the curricular practices that would support the learning that instructors truly value. (4)

Hoping to address this bind, Pruchnic et al. write of turning to a method developed in the social sciences to analyze “lengthy face-to-face social and institutional interactions” (5). In a “thin-slice” methodology, raters use a common rubric to score small segments of a longer event. The authors report that raters using this method were able to predict outcomes, such as the number of malpractice claims against surgeons or teacher-evaluation results, as accurately as raters who scored the entire data set (5).

To test this method, Pruchnic et al. created two teams, a “Regular” and a “Research” team. The study compared interrater reliability, “correlation of scores,” and the time involved in order to determine how closely the Research raters, scoring thin slices of the assessment data, matched the work of the Regular raters (5).

Pruchnic et al. provide a detailed description of their institution and writing program (6). The university’s assessment approach is based on Edward White’s “Phase 2 assessment model,” which involves portfolios with a final reflective essay, the prompt for which asks students to write an evidence-based argument about their achievements in relation to the course outcomes (8). The authors note that limited resources gradually reduced the amount of student writing that was actually read, as raters moved from full-fledged portfolio grading to reading only the final essay (7). The challenges of assessing even this limited amount of student work led to a sample that consisted of only 6-12% of the course enrollment.

The authors contend that this is not a representative sample; as a result, “we were making decisions about curricular and other matters that were not based upon a solid understanding of the writing of our entire student body” (7). The assessment, in the authors’ view, therefore did not meet necessary standards of reliability and validity.

The authors describe developing the rubric to be used by both the Research and Regular teams from the precise prompt for the essay (8). They used a “sampling calculator” to determine that, given the total of 1,174 essays submitted, 290 papers would constitute a representative sample; instructors were asked for specific, randomly selected papers to create a sample of 291 essays (7-8).

The Regular team worked in two-member pairs, both members of each pair reading the entire essay, with third readers called in as needed (8): “[E]ach essay was read and scored by only one two-member team” (9). The authors used “double coding,” in which one-fifth of the essays were read by a second team to establish IRR (9). In contrast, the 10-member Research team was divided into two groups, each of which scored half the essays. These readers were given material from “the beginning, middle, and end” of each essay: the first paragraph, the final paragraph, and a paragraph selected from the middle page or pages of the essay, depending on its length. Raters scored the slices individually; the five team members’ scores were then averaged to produce the final score for each paper (9).
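
As a rough illustration of the procedure described above, the sketch below extracts a beginning, middle, and end slice from an essay and averages five raters’ rubric scores into a final score. The function names and the rule for choosing the middle paragraph are hypothetical simplifications; the authors selected the middle slice from the middle page or pages of the essay depending on its length.

```python
from statistics import mean

def thin_slice(paragraphs: list[str]) -> list[str]:
    """Return first, middle, and last paragraphs as the essay's 'thin slice'
    (a simplified stand-in for the authors' page-based middle selection)."""
    if len(paragraphs) <= 3:
        return paragraphs
    return [paragraphs[0], paragraphs[len(paragraphs) // 2], paragraphs[-1]]

def final_score(rater_scores: list[float]) -> float:
    """Average the individual raters' scores (five raters per group in the study)."""
    return mean(rater_scores)

# Example: five hypothetical rubric scores for one essay's thin slice
print(final_score([3, 4, 3, 4, 4]))  # 3.6
```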

Pruchnic et al. discuss in detail their process for determining reliability and for correlating the scores given by the Regular and Research teams to determine whether the two groups were scoring similarly. Analysis of interrater reliability revealed that the Research team’s IRR was “one full classification higher” than that of the Regular readers (12). Scores correlated at the “low positive” level, but the correlation was statistically significant (13). Finally, the Research team as a whole spent “a little more than half the time” scoring that the Regular group spent, while the average scoring time for individual Research team members was less than half that of the Regular members (13).
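
The summary above does not name the specific statistics the authors used, so the following sketch simply illustrates two common choices for this kind of comparison: Cohen’s kappa for agreement on categorical classifications and Pearson’s r for the correlation between the two teams’ scores. All score values are invented for illustration.

```python
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Hypothetical rubric classifications (1 = Poor ... 4 = Good)

# Interrater reliability: two readings of the same double-coded essays
first_reading = [2, 3, 4, 1, 3, 2, 4, 3]
second_reading = [2, 3, 3, 1, 3, 2, 4, 3]
kappa = cohen_kappa_score(first_reading, second_reading)

# Score correlation: Regular-team vs. Research-team scores for the same essays
regular_scores = [2, 3, 4, 1, 3, 2, 4, 3]
research_scores = [2, 4, 3, 1, 3, 2, 4, 2]
r, p_value = pearsonr(regular_scores, research_scores)

print(f"kappa = {kappa:.2f}, r = {r:.2f}, p = {p_value:.3f}")
```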

Additionally, the assessment included holistic readings of 16 essays randomly selected to represent the four quantitative result classifications of Poor through Good (11). This assessment allowed the authors to determine the qualities characterizing essays ranked at different levels and to address the pedagogical implications within their program (15, 16).

The authors conclude that thin-slice scoring, while not always the best choice in every context (16), “can be added to the Writing Studies toolkit for large-scale direct assessment of evaluative reflective writing” (14). Future research, they propose, should address the use of this method to assess other writing outcomes (17). Paired with a qualitative assessment, they argue, a mixed-method approach that includes thin-slice analysis as an option can help satisfy the need for statistically grounded data in administrative and public settings (16) while enabling strong curricular development, ideally resulting in “the best of both worlds” (18).



Giordano and Hassel. Developmental Reform and the Two-Year College. TETYC, May 2016. Posted 07/25/2016.

Giordano, Joanne Baird, and Holly Hassel. “Unpredictable Journeys: Academically At-Risk Students, Developmental Education Reform, and the Two-Year College.” Teaching English in the Two-Year College 43.4 (2016): 371-90. Web. 11 July 2016.

Joanne Baird Giordano and Holly Hassel report on a study of thirty-eight underprepared students negotiating the curriculum at a “small midwestern campus” that is part of a “statewide two-year liberal arts institution” (372). The study assessed the placement process, the support systems in place, and the efforts to “accelerate” students from developmental coursework to credit-bearing courses (374). The institution, an open-access venue, accepted 100 percent of applicants in 2014 (372).

Giordano and Hassel position their study in an ongoing conversation about how best to speed up students’ progress through college and improve graduation rates—the “college completion agenda” (371). Expressing concern that some policy decisions involved in these efforts might result from what Martha E. Casazza and Sharon L. Silverman designate as “misunderstood studies of ‘remedial’ student programs” (371), Giordano and Hassel present their study as reinforcing the importance of a robust developmental curriculum within an open-access environment and the necessity for ongoing support outside of regular classwork. They also focus on the degree to which placement procedures, even those using multiple measures, often fail to predict long-term student trajectories (371, 377).

The researchers characterize their institution as offering a “rigorous general-education curriculum” designed to facilitate student transfer to the four-year institutions within the state (372). They note that the two-year institution’s focus on access and its comprehensive placement process, which allows faculty to consider a range of factors such as high school grades, writing samples, and high-school coursework (375), mean that its developmental writing program is more likely to serve underprepared students than is the case at colleges that rely on less varied placement measures such as standardized tests (374). The thirty-eight students in the study all had test scores that would have placed them in multiple developmental sections at many institutions (374).

The institution’s goal is to reduce the amount of time such students spend in developmental curricula while supporting the transition to credit-bearing coursework (373). The writing program offers only one developmental course; after completing this course, students move to a two-course credit-bearing sequence, the second component of which fulfills the core writing requirement for four-year institutions within the state (373-74). A curriculum that features “integrated reading and writing” and a small-group “variable-credit, nondegree studio writing course” that students can take multiple times support students’ progress (373).

Examination of the work students produced in the courses into which they were placed indicates that the placements were generally appropriate (375). Over the next two years, the researchers assessed how well the students’ written work met course outcomes and interviewed instructors about student readiness to move forward. Giordano and Hassel then collected data about the students’ progress in the program over a four-year period (375).

Noting that 74% of the students studied remained in good academic standing after their first year, Giordano and Hassel point out that test scores bore no visible relation to academic success (377). Eighteen of the students completed the second-semester writing course. Acknowledging that this completion rate was lower than it would be for students whose test scores did not direct them into developmental classes, the authors argue that this level of success illustrates the value of the developmental coursework these students undertook. Whereas policy makers often cite developmental work as an impediment to college completion, Giordano and Hassel argue that this coursework was essential in helping the underprepared students progress; they contend that what prevents many such students from moving more quickly and successfully through college is not the requirement to complete extra coursework but rather “the gradual layering of academic and nonacademic challenges” that confronts them (377).

The authors present a case study to argue that with ongoing support, a student whose scores predict failure can in fact succeed at college-level work (378-79). More problematic, however, are the outcomes for students who place into more than one developmental course, for example, both writing and math.

For example, only three of twenty-one students placing into more than one developmental section “completed a state system degree of any kind,” but some students in this category did earn credits during the four years of the study (380). The authors conclude from data such as these that the single developmental section of writing along with the studio course allowed the students to succeed where they would ordinarily have failed, but that much more support of different kinds is needed to help them progress into the core curriculum (381).

The authors examined the twenty students who did not complete the core requirement to understand how they “got stuck” in their progress (381). Some students repeatedly attempted the initial credit-bearing course; others avoided taking the core courses, and others could not manage the second, required writing course (382-83). The authors offer “speculat[ion]” that second-language issues may have intervened; they also note that the students did not take the accompanying studio option and their instructors chose a “high-stakes, single-grade essay submission” process rather than requiring a portfolio (382).

In addition, the authors contend, many students struggled with credit-bearing work in all their courses, not just writing and reading (383). Giordano and Hassel argue that more discipline-specific support is needed if students are to transition successfully to the analytical thinking, reading, and writing demanded by credit-bearing courses. They note that one successful strategy undertaken by some students involved “register[ing] in gradually increasing numbers of reading-intensive credits” (384), thus protecting their academic standing while building their skills.

Another case study, of a student who successfully negotiated developmental and lower-level credit-bearing work but struggled at higher levels, leads Giordano and Hassel to argue that, even though this student ultimately faced suspension, the chance to attend college and acquire credits exemplified the “tremendous growth as a reader, writer, and student” that open access permits (384).

The study, the authors maintain, supports several conclusions. First, the demand from policy-making bodies that the institutions and faculty who serve underprepared students be held accountable for the outcomes of their efforts neglects the fact that these institutions and educators have “the fewest resources and voices of influence in higher education and in the policy-making process” (384). Second, the authors report data showing that policies that discourage students from taking advantage of developmental work so they can move through coursework more quickly result in higher failure rates (387). Third, Giordano and Hassel argue that directed self-placement is not appropriate for populations like the one served by their institution (387). Finally, they reiterate that the value of attending college cannot be measured strictly by graduation rates; the personal growth such experiences offer should be an essential component of any evaluation (387-88).



Addison. Common Core in College Classrooms. Journal of Writing Assessment, Nov. 2015. Posted 12/03/2015.

Addison, Joanne. “Shifting the Locus of Control: Why the Common Core State Standards and Emerging Standardized Tests May Reshape College Writing Classrooms.” Journal of Writing Assessment 8.1 (2015): 1-11. Web. 20 Nov. 2015.

Joanne Addison offers a detailed account of moves by testing companies and philanthropists to extend the influence of the Common Core State Standards Initiative (CCSSI) to higher education. Addison reports that these entities are building “networks of influence” (1) that will shift agency from teachers and local institutions to corporate interests. She urges writing professionals to pay close attention to this movement and to work to retain and restore teacher control over writing instruction.

Addison writes that a number of organizations are attempting to align college writing instruction with the CCSS movement currently garnering attention in K-12 institutions. This alignment, she documents, is proceeding despite criticisms of the Common Core Standards for demanding skills that are “not developmentally appropriate,” for ignoring crucial issues like “the impact of poverty on educational opportunity,” and for the “massive increase” in investment in and reliance on standardized testing (1). But even if these challenges succeed in scaling back the standards, she contends, too many teachers, textbooks, and educational practices will have been influenced by the CCSSI for its effects to dissipate entirely (1). Control of professional development practices by corporations and specific philanthropies, in particular, will link college writing instruction to the Common Core initiative (2).

Addison connects the investment in the Common Core to the “accountability movement” (2), in which colleges are expected to demonstrate the “value added” by their offerings as students move through their curriculum (5). Of equal concern, in Addison’s view, is the increasing use of standardized test scores in college admissions and placement; she notes, for example, the “640 colleges and universities” in her home state of Colorado that have “committed to participate” in the Partnership for Assessment of Readiness for College and Career (PARCC) by using standardized tests created by the organization in admissions and placement; she points to an additional 200 institutions that have agreed to use a test generated by the Smarter Balanced Assessment Consortium (SBAC) (2).

In her view, such commitments are problematic not only because they use single-measure tools rather than more comprehensive, pedagogically sound decision-making protocols but also because they result from the efforts of groups like the English Language Arts Work Group for CCSSI, the membership of which is composed of executives from testing companies, supplemented with only one “retired English professor” and “[e]xactly zero practicing teachers” (3).

Addison argues that materials generated by organizations committed to promoting the CCSSI show signs of supplanting more pedagogically sound initiatives like NCTE’s Read-Write-Think program (4). To illustrate how she believes the CCSSI has challenged more legitimate models of professional development, she discusses the relationship between CCSSI-linked coalitions and the National Writing Project.

She writes that in 2011, funds for the National Writing Project were shifted to the president’s Race to the Top (3). Some funding was subsequently restored, but grants from the Bill and Melinda Gates Foundation specifically supported National Writing Project sites that worked with an entity called the Literacy Design Collaborative (LDC) to promote the use of the Common Core Standards in assignment design and to require the use of a “jurying rubric” intended to measure the fit with the Standards in evaluating student work (National Writing Project, 2014, qtd. in Addison 4). According to Addison, “even the briefest internet search reveals a long list of school districts, nonprofits, unions, and others that advocate the LDC approach to professional development” (4). Addison contends that teachers have had little voice in developing these course-design and assessment tools and are unable, under these protocols, to refine instruction and assessment to fit local needs (4).

Addison expresses further concern about the lack of teacher input in the design, administration, and weight assigned to the standardized testing used to measure “value added” and thus hold teachers and institutions accountable for student success. A number of organizations largely funded by the Bill and Melinda Gates Foundation promote the use of “performance-based” standardized tests given to entering college students and again to seniors (5-6). One such test, the Collegiate Learning Assessment (CLA), is now used by “700 higher education institutions” (5). Addison notes that nine English professors were among the 32 college professors who worked on the development and use of this test; however, all were drawn from “CLA Performance Test Academies” designed to promote the “use of performance-based assessments in the classroom,” and the professors’ specialties were not provided (5-6).

A study conducted using a similar test, the Common Core State Standards Validation Assessment (CCSSAV), indicated that the test did provide some predictive power but that high-school GPA was a better indicator of student success in higher education (6). In all, Addison reports on four different studies that similarly found high-school GPA to be the better predictor; GPA, she says, improves on the snapshot of a single moment supplied by a test by measuring a range of facets of student abilities and achievements across multiple contexts (6).

Addison attributes much of the movement toward CCSSI-based protocols to the rise of “advocacy philanthropy,” which shifts giving from capital improvements and research to large-scale reform movements (7). While scholars like Cassie Hall see some benefits in this shift, for example in the ability to spotlight “important problems” and “bring key actors together,” concerns, according to Addison’s reading of Hall, include

the lack of external accountability, stifling innovation (and I would add diversity) by offering large-scale, prescriptive grants, and an unprecedented level of influence over state and government policies. (7)

She further cites Hall’s concern that this shift will siphon money from “field-initiated academic research” and will engender “a growing lack of trust in higher education” that will lead to even more restrictions on teacher agency (7).

Addison’s recommendations for addressing the influx of CCSSI-based influences include aggressively questioning our own institutions’ commitments to facets of the initiative, using the “15% guideline” within which states can supplement the Standards, building competing coalitions to advocate for best practices, and engaging in public forums, even where such writing is not recognized in tenure-and-promotion decisions, to “place teachers’ professional judgment at the center of education and help establish them as leaders in assessment” (8). Such efforts, in her view, must serve the effort to identify assessment as a tool for learning rather than control (7-8).

Access this article at http://journalofwritingassessment.org/article.php?article=82