College Composition Weekly: Summaries of research for college writing professionals

Read, Comment On, and Share News of the Latest from the Rhetoric and Composition Journals



Shi, Matos, and Kuhn. Dialogue and Argument. JoWR, Spring 2019. Posted 06/15/2019.

Shi, Yuchen, Flora Matos, and Deanna Kuhn. “Dialog as a Bridge to Argumentative Writing.” Journal of Writing Research 11.1 (2019): 107-29. Web. 5 June 2019.

Yuchen Shi, Flora Matos, and Deanna Kuhn report on a study of a dialogic approach to argumentative writing conducted with sixth-graders at “an urban public middle school in an underserved neighborhood in a large Northeastern city in the United States” (113). The study replicates earlier research on the same curriculum, with added components to assess whether the intervention increased “meta-level understanding of the purpose and goals of evidence in argumentative writing” (112-13).

Noting that research has documented the degree to which students struggle with the cognitive demands of argumentative writing as opposed to narration (108), the authors report that while the value of discourse as a precursor to writing an argument has been recognized, much of the discourse studied has been at the “whole-classroom level” (108). In contrast, the authors’ intervention paired students so that they could talk “directly” with others who both shared and opposed their positions (108).

In the authors’ view, this process provided students with two elements that affect the success of written communication: “a clearly defined audience and a meaningful purpose” (108). They argue that this direct engagement with the topic and with an audience over a period of time improves on reading about a topic, which they feel students may do “disinterestedly” because they do not yet have a sense of what kind of evidence they may need (110). The authors’ dialogic intervention allows students to develop their own questions as they become aware of the arguments they will have to make (110).

Further, the authors maintain, the dialogic exchange linking individual students “removes the teacher” and makes the process student-centered (109).

Claiming that the ability to produce “evidence-based claims” is central to argument, the authors centered their study on the relation between claims and evidence in students’ discussions and in their subsequent writing (110). Their model, they write, allowed them to see a developmental sequence as students were first most likely to choose evidence that supported their own position, only later beginning to employ evidence that “weaken[s] the opposing claim” (111). Even more sophisticated approaches to evidence, which the authors label “weaken-own” and “support-other,” develop more slowly if at all (111-12).

Two classes were chosen to participate, one as the experimental group (22 students) and one as a comparison group (27 students). The curriculum was implemented in “twice-weekly 40-minute class sessions” that continued in “four cycles” throughout the school year (114). Each cycle began a new topic; the four topics were selected from a list because students seemed equally divided in their views on those issues (114).

The authors divided their process into Pregame, Game, and Endgame sections. In the Pregame, students in small groups generated reasons in support of their position. In the Game, student pairs sharing a position dialogued electronically with “a different opposing pair at each session” (115). During this section, students generated their own “evidence questions” which the researchers answered by the next session; the pairs were given other evidence in Q&A format. The Endgame consisted of a debate, which was then scored and a winning side designated (115). Throughout, students constructed reflection pieces; electronic transcripts preserved the interactions (115).

At the end of each cycle, students wrote individual papers. The comparison group also wrote an essay on the fourth topic, whether students should go directly to college from high school or work for a year. For this essay, students in both groups were provided with evidence only at the end of the cycle. This essay was used for the final assessment (116-17).

Other elements assessed included whether students could recall answers to 12 evidence questions, in order to determine if differences in the use of evidence in the two groups were a function of superior memory of the material (123). A second component was a fifth essay written by the experimental group on whether teens accused of serious crimes should be tried as adults or juveniles (118). The authors wanted to assess whether the understanding of claims and evidence cultivated during the curriculum informed writing on a topic that had not been addressed through the dialogic intervention (118).

For the assessment, the researchers considered “a claim together with any reason and/or evidence supporting it” as an “idea unit” (118). These units were subcategorized as “either evidence-based or non-evidence-based.” Analyzing only the claims that contained evidence, the researchers further distinguished between “functional” and “non-functional” evidence-based claims. Functional claims were those where there was a clear written link between the evidence and claim. Only the use of functional claims was assessed (118).

Results indicated that while the number of idea units and evidence-based claims did not vary significantly across the groups, the experimental group was significantly more successful in including functional evidence-based claims (120). Also, the intervention encouraged significantly more use of “weaken-other” claims, which the writers characterize as “a more demanding skill commonly neglected by novice writers” (120). Students did not show progress in using “weaken-own” or “support-other” evidence (121).

In examining the intervention’s effects on students’ meta-level awareness of evidence in argument, the researchers found that the groups did not differ in the kind of evidence they would most like to see, with both choosing “support-own.” However, the experimental group was much more likely to state that “weaken-other” evidence was the type “they would like to see second most” (122). The groups were similar in students’ ability to recall evidence, in the authors’ view indicating that superior recall in one group or the other did not explain the results (125).

Assessment of the essay on the unfamiliar topic was hampered by an even smaller sample size and the fact that the two groups wrote on different topics. The writers report that 54% of the experimental-group students made support-own or weaken-other claims, but that the number of such claims decreased to a frequency similar to that of the comparison group on the college/work topic (124).

The authors argue that increased use of more sophisticated weaken-other evidence points to higher meta-awareness of evidence as a component of argument, but that students could show more growth as measured by their ability to predict the kind of evidence they would need or use (125).

Noting the small sample size as a limitation, the authors suggest that both the dialogic exchange of their curriculum and the students’ “deep engagement” with topics contributed to the results they recorded. They suggest that “[a]rguing to learn” through dialogue and engagement can be an important pedagogical activity because of the discourse and cognitive skills these activities develop (126).



Litterio, Lisa M. Contract Grading: A Case Study. J of Writing Assessment, 2016. Posted 04/20/2017.

Litterio, Lisa M. “Contract Grading in a Technical Writing Classroom: A Case Study.” Journal of Writing Assessment 9.2 (2016). Web. 05 Apr. 2017.

In an online issue of the Journal of Writing Assessment, Lisa M. Litterio, who characterizes herself as “a new instructor of technical writing,” discusses her experience implementing a contract grading system in a technical writing class at a state university in the northeast. Her “exploratory study” was intended to examine student attitudes toward the contract-grading process, with a particular focus on how the method affected their understanding of “quality” in technical documents.

Litterio’s research into contract grading suggests that it can have the effect of supporting a process approach to writing as students consider the elements that contribute to an “excellent” response to an assignment. Moreover, Litterio contends, because it creates a more democratic classroom environment and empowers students to take charge of their writing, contract grading also supports critical pedagogy in the Freirean model. Litterio draws on research to support the additional claim that contract grading “mimic[s] professional practices” in that “negotiating and renegotiating a document” as students do in contracting for grades is a practice that “extends beyond the classroom into a workplace environment.”

Much of the research she reports dates to the 1970s and 1980s, often reflecting work in speech communication, but she cites as well models from Ira Shor, Jane Danielewicz and Peter Elbow, and Asao Inoue from the 2000s. In a common model, students can negotiate the quantity of work that must be done to earn a particular grade, but the instructor retains the right to assess quality and to assign the final grade. Litterio depicts her own implementation as a departure from some of these models in that she did make the final assessment, but applied criteria devised collaboratively by the students; moreover, her study differs from earlier reports of contract grading in that it focuses on the students’ attitudes toward the process.

Her Fall 2014 course, which she characterizes as a service course, enrolled twenty juniors and seniors representing seven majors. Neither Litterio nor any of the students were familiar with contract grading, and no students withdrew on learning from the syllabus and class announcements of Litterio’s grading intentions. At mid-semester and again at the end of the course, Litterio administered an anonymous open-ended survey to document student responses. Adopting the role of “teacher-researcher,” Litterio hoped to learn whether involvement in the generation of criteria led students to a deeper awareness of the rhetorical nature of their projects, as well as to “more involvement in the grading process and more of an understanding of principles discussed in technical writing, such as usability and document design.”

Litterio shares the contract options, which allowed students to agree to produce a stated number of assignments of either “excellent,” “great,” or “good” quality, an “entirely positive grading schema” that draws on Frances Zak’s claim that positive evaluations improved student “authority over their writing.”

The criteria for each assignment were developed in class discussion through an open voting process that resulted in general, if not absolute, agreement. Litterio provides the class-generated criteria for a résumé, which included length, format, and the expectation of “specific and strong verbs.” As the instructor, Litterio ultimately decided whether these criteria were met.

Mid-semester surveys indicated that students were evenly split in their preferences for traditional grading models versus the contract-grading model being applied. At the end of the semester, 15 of the 20 students expressed a preference for traditional grading.

Litterio coded the survey responses and discovered specific areas of resistance. First, some students cited the unfamiliarity of the contract model, which made it harder for them to “track [their] own grades,” in one student’s words. Second, the students noted that the instructor’s role in applying the criteria did not differ appreciably from instructors’ traditional role as it retained the “bias and subjectivity” the students associated with a single person’s definition of terms like “strong language.” Students wrote that “[i]t doesn’t really make a difference in the end grade anyway, so it doesn’t push people to work harder,” and “it appears more like traditional grading where [the teacher] decide[s], not us.”

In addition, students resisted seeing themselves and their peers as qualified to generate valid criteria and to offer feedback on developing drafts. Students wrote of the desire for “more input from you vs. the class,” their sense that student-generated criteria were merely “cosmetics,” and their discomfort with “autonomy.” Litterio attributes this resistance to taking on the role of expert both to students’ actual novice status and to the nature of the course, which required students to write for different discourse communities because of their differing majors. She suggests that contract grading may be more appropriate for writing courses within majors, in which students may be more familiar with the specific nature of writing in a particular discipline.

However, students did confirm that the process of generating criteria made them more aware of the elements involved in producing exemplary documents in the different genres. Incorporating student input into the assessment process, Litterio believes, allows instructors to be more reflective about the nature of assessment in general, including the risk of creating a “yes or no . . . dichotomy that did not allow for the discussions and subjectivity” involved in applying a criterion. Engaging students throughout the assessment process, she contends, provides them with more agency and more opportunity to understand how assessment works. Student comments reflect an appreciation of having a “voice.”

This study, Litterio contends, challenges the assumption that contract grading is necessarily “more egalitarian, positive, [and] student-centered.” The process can still strike students as biased and based entirely on the instructor’s perspective, she found. She argues that the reflection on the relationship between student and teacher roles enabled by contract grading can lead students to a deeper understanding of “collective norms and contexts of their actions as they enter into the professional world.”



Patchan and Schunn. Effects of Author and Reviewer Ability in Peer Feedback. JoWR 2016. Posted 11/25/2016.

Patchan, Melissa M., and Christian D. Schunn. “Understanding the Effects of Receiving Peer Feedback for Text Revision: Relations between Author and Reviewer Ability.” Journal of Writing Research 8.2 (2016): 227-65. Web. 18 Nov. 2016. doi: 10.17239/jowr-2016.08.02.03

Melissa M. Patchan and Christian D. Schunn describe a study of the relationship between the abilities of writers and peer reviewers in peer assessment. The study asks how the relative ability of writers and reviewers influences the effectiveness of peer review as a learning process.

The authors note that in many content courses, the time required to provide meaningful feedback encourages many instructors to turn to peer assessment (228). They cite studies suggesting that in such cases, peer response can be more effective than teacher response because, for example, students may actually receive more feedback, the feedback may be couched in more accessible terms, and students may benefit from seeing models and new strategies (228-29). Still, studies find, teachers and students both question the efficacy of peer assessment, with students stating that the quality of review depends largely on the abilities of the reviewer (229).

Patchan and Schunn distinguish between the kind of peer review characteristic of writing classrooms, which they describe as “pair or group-based face-to-face conversations” emphasizing “qualitative feedback” and the type more often practiced in large content classes, which they see as more like “professional journal reviewing” that is “asynchronous, and written-based” (228). Their study addresses the latter format and is part of a larger study examining peer feedback in a widely required psychology class at a “large, public research university in the southeast” (234).

A random selection of 189 students wrote initial drafts in response to an assignment assessing media handling of a psychological study using criteria from the course textbook (236, 238). Students then received four drafts to review and were given a week to revise their own drafts in response to feedback. Participants used the “web-based peer assessment functions of turnitin.com” (237).

The researchers rated participants’ writing ability using SAT scores and grades in their two first-year writing courses (236). Graduate rhetoric students also rated the first drafts. The protocol then included a “median split” to designate writers in binary fashion as either high- or low-ability. “High” authors were categorized as “high” reviewers. Patchan and Schunn note that there was a wide range in writer abilities but argue that, even though the “design decreases the power of this study,” such determinations were needed because of the large sample size, which in turn made the detection of “important patterns” likely (236-37). They feel that “a lower powered study was a reasonable tradeoff for higher external validity (i.e., how reviewer ability would typically be detected)” (237).
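For readers unfamiliar with the technique, a median split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the participant labels, and the scores are invented, and the summary does not say how the authors combined their measures or handled scores that fall exactly on the median.

```python
from statistics import median


def median_split(scores):
    """Label each participant 'high' or 'low' relative to the group median."""
    cutoff = median(scores.values())
    # Assumption: a score exactly at the median is labeled 'low';
    # the source does not report how ties were treated.
    return {
        name: ("high" if value > cutoff else "low")
        for name, value in scores.items()
    }


# Hypothetical composite ability scores, invented for illustration only.
ability = {"A": 52.0, "B": 61.5, "C": 70.0, "D": 88.0, "E": 45.5, "F": 79.0}
labels = median_split(ability)
high_count = sum(1 for label in labels.values() if label == "high")
```

In this invented data, B and C sit fairly close together but land in different groups, while C and D are much further apart yet share a label. That loss of detail is one common reason a binary split can reduce statistical power.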

The authors describe their coding process in detail. In addition to coding initial drafts for quality, coders examined each reviewer’s feedback for its attention to higher-order problems and lower-order corrections (239-40). Coders also tabulated which comments resulted in revision as well as the “quality of the revision” (241). This coding was intended to “determine how the amount and type of comments varied as a function of author ability and reviewer ability” (239). A goal of the study was to determine what kinds of feedback triggered the most effective responses in “low” authors (240).

The study was based on a cognitive model of writing derived from the updated work of Linda Flower and John R. Hayes, in which three aspects of writing/revision follow a writer’s review of a text: problem detection, problem diagnosis, and strategy selection for solving the diagnosed problems (230-31). In general, “high” authors were expected to produce drafts with fewer initial problems and to have stronger reading skills that allowed them to detect and diagnose more problems in others’ drafts, especially “high-level” problems having to do with global issues as opposed to issues of surface correctness (230). High ability authors/reviewers were also assumed to have a wider repertoire of solution strategies to suggest for peers and to apply to their own revisions (233). All participants received a rubric intended to guide their feedback toward higher-order issues (239).

Some of the researchers’ expectations were confirmed, but others were only partially supported or not supported (251). Writers whose test scores and grades categorized them as “high” authors did produce better initial drafts, but only by a slight margin. The researchers posit that factors other than ability may affect draft quality, such as interest or time constraints (243). “High” and “low” authors received the same number of comments despite differences in the quality of the drafts (245), but “high” authors made more higher-order comments even though they didn’t provide more solutions (246). “High” reviewers indicated more higher-order issues to “low” authors than to “high,” while “low” reviewers suggested the same number of higher-order changes to both “high” and “low” authors (246).

Patchan and Schunn considered the “implementation rate,” or number of comments on which students chose to act, and “revision quality” (246). They analyzed only comments that were specific enough to indicate action. In contrast to findings in previous studies, the expectation that better writers would make more and better revisions was not supported. Overall, writers acted on only 32% of the comments received and only a quarter of the comments resulted in improved drafts (248). Author ability did not factor into these results. Moreover, the ability of the reviewer had no effect on how many revisions were made or how effective they were (248).
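The two proportions discussed here can be illustrated with a small Python sketch. The `Comment` record, the `rates` function, and the counts are hypothetical, and the sketch assumes both proportions are taken over all actionable comments, which the summary does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    implemented: bool  # did the author act on this comment?
    improved: bool     # did the resulting revision improve the draft?


def rates(comments):
    """Return (implementation rate, improvement rate) over all comments."""
    total = len(comments)
    implemented = sum(c.implemented for c in comments)
    improved = sum(c.implemented and c.improved for c in comments)
    return implemented / total, improved / total


# Invented counts for illustration; these are not the study's data.
coded = (
    [Comment(True, True)] * 2      # acted on, draft improved
    + [Comment(True, False)] * 2   # acted on, no improvement
    + [Comment(False, False)] * 6  # not acted on
)
implementation_rate, improvement_rate = rates(coded)
```

Separating the two rates makes visible the gap the study reports between acting on a comment and actually producing a better draft.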

It was expected that low-ability authors would implement more suggestions from higher-ability reviewers, but in fact, “low authors implemented more high-level criticism comments . . . from low reviewers than from high reviewers” (249). The quality of the revisions also improved for low-ability writers when the comments came from low-ability reviewers. The researchers conclude that “low authors benefit the most from feedback provided by low reviewers” (249).

Students acted on 41% of the low-level criticisms, but these changes seldom resulted in better papers (249).

The authors posit that rates of commenting and implementation may both be affected by limits or “thresholds” on how much feedback a given reviewer is willing to provide and how many comments a writer is able or willing to act on (252, 253). They suggest that low-ability reviewers may explain problems in language that is more accessible to writers with less ability. Patchan and Schunn suggest that feedback may be most effective when it occurs within the student’s zone of proximal development, so that weaker writers may be helped most by peers just beyond them in ability rather than by peers with much more sophisticated skills (253).

In the authors’ view, that “neither author ability nor reviewer ability per se directly affected the amount and quality of revisions” (253) suggests that the focus in designing effective peer review processes should shift from how to group students to improving students’ ability to respond to comments (254). They recommend further research using more “direct” measures of writing and reviewing ability (254). A major conclusion from this study is that “[h]igher-ability students will likely revise their texts successfully regardless of who [they are] partnered with, but the lower-ability students may need feedback at their own level” (255).



West-Puckett, Stephanie. Digital Badging as Participatory Assessment. CE, Nov. 2016. Posted 11/17/2016.

Stephanie West-Puckett presents a case study of the use of “digital badges” to create a local, contextualized, and participatory assessment process that works toward social justice in the writing classroom.

She notes that digital badges are graphic versions of those earned by scouts or worn by members of military groups to signal “achievement, experience, or affiliation in particular communities” (130). Her project, begun in Fall 2014, grew out of Mozilla’s free Open Badging Initiative and the Humanities, Arts, Science, and Technology Alliance and Collaboratory (HASTAC) that funded grants to four universities as well as to museums, libraries, and community partnerships to develop badging as a way of recognizing learning (131).

West-Puckett employed badges as a way of encouraging and assessing student engagement in the outcomes and habits of mind included in such documents as the Framework for Success in Postsecondary Writing, the Outcomes Statements for First-Year Composition produced by the Council of Writing Program Administrators, and her own institution’s outcomes statement (137). Her primary goal is to foster a “participatory” process that foregrounds the agency of teachers and students and recognizes the ways in which assessment can influence classroom practice. She argues that such participation in designing and interpreting assessments can address the degree to which assessment can drive bias and limit access and agency for specific groups of learners (129).

She reviews composition scholarship characterizing most assessments as “top-down” (127-28). In these practices, West-Puckett argues, instruments such as rubrics become “fetishized,” with the result that they are forced upon contexts to which they are not relevant, thus constraining the kinds of assignments and outcomes teachers can promote (134). Moreover, assessments often fail to encourage students to explore a range of literacies and do not acknowledge learners’ achievements within those literacies (130). More valid, for West-Puckett, are “hyperlocal” assessments designed to help teachers understand how students are responding to specific learning opportunities (134). Allowing students to join in designing and implementing assessments makes the learning goals visible and shared while limiting the power of assessment tools to marginalize particular literacies and populations (128).

West-Puckett contends that the multimodal focus in writing instruction exacerbates the need for new modes of assessment. She argues that digital badges partake of “the primacy of visual modes of communication,” especially for populations “whose bodies were not invited into the inner sanctum of a numerical and linguistic academy” (132). Her use of badges contributes to a form of assessment that is designed not to deride writing that does not meet the “ideal text” of an authority but rather to enlist students’ interests and values in “a dialogic engagement about what matters in writing” (133).

West-Puckett argues for pairing digital badging with “critical validity inquiry,” in which the impact of an assessment process is examined through a range of theoretical frames, such as feminism, Marxism, or queer or disability theory (134). This inquiry reveals assessment’s role in sustaining or potentially disrupting entrenched views of what constitutes acceptable writing by examining how such views confer power on particular practices (134-35).

In West-Puckett’s classroom in a “mid-size, rural university in the south” with a high percentage of students of color and first-generation college students (135), small groups of students chose outcomes from the various outcomes statements, developed “visual symbols” for the badges, created a description of the components and value of the outcomes for writing, and detailed the “evidence” that applicants could present from a range of literacy practices to earn the badges (137). West-Puckett hoped that this process would decrease the “disconnect” between her understanding of the outcomes and that of students (136), as well as engage students in a process that takes into account the “lived consequences of assessment” (141): its disparate impact on specific groups.

The case study examines several examples of badges, such as one using a compass to represent “rhetorical knowledge” (138). The group generated multimodal presentations, and applicants could present evidence in a range of forms, including work done outside of the classroom (138-39). The students in the group decided whether or not to award the badge.

West-Puckett details the degree to which the process invited “lively discussion” by examining the “Editing MVP” badge (139). Students defined editing as proofreading and correcting one’s own paper but visually depicted two people working together. The group refused the badge to a student of color because of grammatical errors but awarded it to another student who argued for the value of using non-standard dialogue to show people “‘speaking real’ to each other” (qtd. in West-Puckett 140). West-Puckett recounts the classroom discussion of whether editing could be a collaborative effort and when and in what contexts correctness matters (140).

In Fall 2015, West-Puckett implemented “Digital Badging 2.0” in response to her concerns about “the limited construct of good writing some students clung to” as well as how to develop “badging economies that asserted [her] own expertise as a writing instructor while honoring the experiences, viewpoints, and subject positions of student writers” (142). She created two kinds of badging activities, one carried out by students as before, the other for her own assessment purposes. Students had to earn all the student-generated badges in order to pass, and a given number of West-Puckett’s “Project Badges” to earn particular grades (143). She states that she privileges “engagement as opposed to competency or mastery” (143). She maintains that this dual process, in which her decision-making process is shared with the students who are simultaneously grappling with the concepts, invites dialogue while allowing her to consider a wide range of rhetorical contexts and literacy practices over time (144).

West-Puckett reports that although she found evidence that the badging component did provide students an opportunity to take more control of their learning, as a whole the classes did not “enjoy” badging (145). They expressed concern about the extra work, the lack of traditional grades, and the responsibility involved in meeting the project’s demands (145). However, in disaggregated responses, students of color and lower-income students viewed the badge component favorably (145). According to West-Puckett, other scholars have similarly found that students in these groups value “alternative assessment models” (146).

West-Puckett lays out seven principles that she believes should guide participatory assessment, foregrounding the importance of making the processes “open and accessible to learners” in ways that “allow learners to accept or refuse particular identities that are constructed through the assessment” (147). In addition, “[a]ssessment artifacts,” in this case badges, should be “portable” so that students can use them beyond the classroom to demonstrate learning (148). She presents badges as an assessment tool that can embody these principles.



Moxley and Eubanks. Comparing Peer Review and Instructor Ratings. WPA, Spring 2016. Posted 08/13/2016.

Moxley, Joseph M., and David Eubanks. “On Keeping Score: Instructors’ vs. Students’ Rubric Ratings of 46,689 Essays.” Journal of the Council of Writing Program Administrators 39.2 (2016): 53-80. Print.

Joseph M. Moxley and David Eubanks report on a study of their peer-review process in their two-course first-year-writing sequence. The study, involving 16,312 instructor evaluations and 30,377 student reviews of “intermediate drafts,” compared instructor responses to student rankings on a “numeric version” of a “community rubric” using a software package, My Reviewers, that allowed for discursive comments but also, in the numeric version, required rubric traits to be assessed on a five-point scale (59-61).

Exploring the literature on peer review, Moxley and Eubanks note that most such studies are hindered by small sample sizes (54). They note a dearth of “quantitative, replicable, aggregated data-driven (RAD) research” (53), finding only five such studies that examine more than 200 students (56-57), with most empirical work on peer review occurring outside of the writing-studies community (55-56).

Questions investigated in this large-scale empirical study involved determining whether peer review was a “worthwhile” practice for writing instruction (53). More specific questions addressed whether or not student rankings correlated with those of instructors, whether these correlations improved over time, and whether the research would suggest productive changes to the process currently in place (55).

The study took place at a large research university where the composition faculty, consisting primarily of graduate students, practiced a range of options in their use of the My Reviewers program. For example, although all commented on intermediate drafts, some graded the peer reviews, some discussed peer reviews in class despite the anonymity of the online process, and some included training in the peer-review process in their curriculum, while others did not.

Similarly, the My Reviewers package offered options including comments, endnotes, and links to a bank of outside sources, exercises, and videos; some instructors and students used these resources while others did not (59). Although the writing program administration does not impose specific practices, the program provides multiple resources as well as a required practicum and annual orientation to assist instructors in designing their use of peer review (58-59).

The rubric studied covered five categories: Focus, Evidence, Organization, Style, and Format. Focus, Organization, and Style were broken down into the subcategories of Basics (“language conventions”) and Critical Thinking (“global rhetorical concerns”). The Evidence category also included the subcategory Critical Thinking, while Format encompassed Basics (59). For the first year and a half of the three-year study, instructors could opt for the “discuss” version of the rubric, though the numeric version tended to be preferred (61).

The authors note that students and instructors provided many comments and other “lexical” items, but that their study did not address these components. In addition, the study did not compare students based on demographic features, and, due to its “observational” nature, did not posit causal relationships (61).

A major finding was that, while there was some “low to modest” correlation between the two sets of scores (64), students generally scored the essays more positively than instructors; this difference was statistically significant when the researchers looked at individual traits (61, 67). Differences between the two sets of scores were especially evident on the first project in the first course; correlation did increase over time. The researchers propose that students learned “to better conform to rating norms” after their first peer-review experience (64).

The authors discovered that peer reviewers were easily able to distinguish between very high-scoring papers and very weak ones, but struggled to make distinctions between papers in the B/C range. Moxley and Eubanks suggest that the ability to distinguish levels of performance is a marker for “metacognitive skill” and note that struggles in making such distinctions for higher-quality papers may be commensurate with the students’ overall developmental levels (66).

These results lead the authors to consider whether “using the rubric as a teaching tool” and focusing on specific sections of the rubric might help students more closely conform to the ratings of instructors. They express concern that the inability of weaker students to distinguish between higher-scoring papers might “do more harm than good” when they attempt to assess more proficient work (66).

Analysis of scores for specific rubric traits indicated to the authors that students’ ratings differed more from those of instructors on complex traits (67). Closer examination of the large sample also revealed that students whose teachers gave their own work high scores produced scores that more closely correlated with the instructors’ scores. These students also demonstrated more variance than did weaker students in the scores they assigned (68).

Examination of the correlations led to the observation that all of the scores for both groups were positively correlated with each other: papers with higher scores on one trait, for example, had higher scores across all traits (69). Thus, the traits were not being assessed independently (69-70). The authors propose that reviewers “are influenced by a holistic or average sense of the quality of the work and assign the eight individual ratings informed by that impression” (70).

If so, the authors suggest, isolating individual traits may not necessarily provide more information than a single holistic score. They posit that holistic scoring might not only facilitate assessment of inter-rater reliability but also free raters to address a wider range of features than are usually included in a rubric (70).

Moxley and Eubanks conclude that the study produced “mixed results” on the efficacy of their peer-review process (71). Students’ improvement with practice and the correlation between instructor scores and those of stronger students suggested that the process had some benefit, especially for stronger students. Students’ difficulty with the B/C distinction and the low variance in weaker students’ scoring raised concerns (71). The authors argue, however, that there is no indication that weaker students do not benefit from the process (72).

The authors detail changes to their rubric resulting from their findings, such as creating separate rubrics for each project and allowing instructors to “customize” their instruments (73). They plan to examine the comments and other discursive components in their large sample, and urge that future research create a “richer picture of peer review processes” by considering not only comments but also the effects of demographics across many settings, including in fields other than English (73, 75). They acknowledge the degree to which assigning scores to student writing “reifies grading” and opens the door to many other criticisms, but contend that because “society keeps score,” the optimal response is to continue to improve peer-review so that it benefits the widest range of students (73-74).



Comer and White. MOOC Assessment. CCC, Feb. 2016. Posted 04/18/2016.

Comer, Denise K., and Edward M. White. “Adventuring into MOOC Writing Assessment: Challenges, Results, and Possibilities.” College Composition and Communication 67.3 (2016): 318-59. Print.

Denise K. Comer and Edward M. White explore assessment in the “first-ever first-year-writing MOOC,” English Composition I: Achieving Expertise, developed under the auspices of the Bill & Melinda Gates Foundation, Duke University, and Coursera (320). Working with “a team of more than twenty people” with expertise in many areas of literacy and online education, Comer taught the course (321), which enrolled more than 82,000 students, 1,289 of whom received a Statement of Accomplishment indicating a grade of 70% or higher. Nearly 80% of the students “lived outside the United States” and for a majority, English was not the first language, although 59% of these said they were “proficient or fluent in written English” (320). Sixty-six percent had bachelor’s or master’s degrees.

White designed and conducted the assessment, which addressed concerns about MOOCs as educational options. The authors recognize MOOCs as “antithetical” (319) to many accepted principles in writing theory and pedagogy, such as the importance of interpersonal instructor/student interaction (319), the imperative to meet the needs of a “local context” (Brian Huot, qtd. in Comer and White 325) and a foundation in disciplinary principles (325). Yet the authors contend that as “MOOCs are persisting,” refusing to address their implications will undermine the ability of writing studies specialists to influence practices such as Automated Essay Scoring, which has already been attempted in four MOOCs (319). Designing a valid assessment, the authors state, will allow composition scholars to determine how MOOCs affect pedagogy and learning (320) and from those findings to understand more fully what MOOCs can accomplish across diverse populations and settings (321).

Comer and White stress that assessment processes extant in traditional composition contexts can contribute to a “hybrid form” applicable to the characteristics of a MOOC (324) such as the “scale” of the project and the “wide heterogeneity of learners” (324). Models for assessment in traditional environments as well as online contexts had to be combined with new approaches that addressed the “lack of direct teacher feedback and evaluation and limited accountability for peer feedback” (324).

For Comer and White, this hybrid approach must accommodate the degree to which the course combined the features of an “xMOOC” governed by a traditional academic course design with those of a “cMOOC,” in which learning occurs across “network[s]” through “connections” largely of the learners’ creation (322-23).

Learning objectives and assignments mirrored those familiar to compositionists, such as the ability to “[a]rgue and support a position” and “[i]dentify and use the stages of the writing process” (323). Students completed four major projects, the first three incorporating drafting, feedback, and revision (324). Instructional videos and optional workshops in Google Hangouts supported assignments like discussion forum participation, informal contributions, self-reflection, and peer feedback (323).

The assessment itself, designed to shed light on how best to assess such contexts, consisted of “peer feedback and evaluation,” “Self-reflection,” three surveys, and “Intensive Portfolio Rating” (325-26).

The course supported both formative and evaluative peer feedback through “highly structured rubrics” and extensive modeling (326). Students who had submitted drafts each received responses from three other students, and those who submitted final drafts received evaluations from four peers on a 1-6 scale (327). The authors argue that despite the level of support peer review requires, it is preferable to more expert-driven or automated responses because they believe that

what student writers need and desire above all else is a respectful reader who will attend to their writing with care and respond to it with understanding of its aims. (327)

They found that the formative review, although taken seriously by many students, was “uneven,” and students varied in their appreciation of the process (327-29). Meanwhile, the authors interpret the evaluative peer review as indicating that “student writing overall was successful” (330). Peer grades closely matched those of the expert graders, and, while marginally higher, were not inappropriately high (330).

The MOOC provided many opportunities for self-reflection, which the authors describe as “one of the richest growth areas” (332). They provide examples of student responses to these opportunities as evidence of committed engagement with the course; a strong desire for improvement; an appreciation of the value of both receiving and giving feedback; and awareness of opportunities for growth (332-35). More than 1400 students turned in “final reflective essays” (335).

Self-efficacy measures revealed that students exhibited an unexpectedly high level of confidence in many areas, such as “their abilities to draft, revise, edit, read critically, and summarize” (337). Somewhat lower confidence levels in their ability to give and receive feedback persuade the authors that a MOOC emphasizing peer interaction served as an “occasion to hone these skills” (337). The greatest gain occurred in this domain.

Nine “professional writing instructors” (339) assessed portfolios for 247 students who had both completed the course and opted into the IRB component (340). This assessment confirmed that while students might not be able to “rely consistently” on formative peer review, peer evaluation could effectively supplement expert grading (344).

Comer and White stress the importance of further research in a range of areas, including how best to support effective peer response; how ESL writers interact with MOOCs; what kinds of people choose MOOCs and why; and how MOOCs might function in WAC/WID situations (344-45).

The authors stress the importance of avoiding “extreme concluding statements” about the effectiveness of MOOCs based on findings such as theirs (346). Their study suggests that different learners valued the experience differently; those who found it useful did so for varied reasons. Repeating that writing studies must take responsibility for assessment in such contexts, they emphasize that “MOOCs cannot and should not replace face-to-face instruction” (346; emphasis original). However, they contend that even enrollees who interacted briefly with the MOOC left with an exposure to writing practices they would not have gained otherwise and that the students who completed the MOOC satisfactorily amounted to more students than Comer would have reached in 53 years teaching her regular FY sessions (346).

In designing assessments, the authors urge, compositionists should resist the impulse to focus solely on the “Big Data” produced by assessments at such scales (347-48). Such a focus can obscure the importance of individual learners who, they note, “bring their own priorities, objectives, and interests to the writing MOOC” (348). They advocate making assessment an activity for the learners as much as possible through self-reflection and through peer interaction, which, when effectively supported, “is almost as useful to students as expert response and is crucial to student learning” (349). Ultimately, while the MOOC did not succeed universally, it offered many students valuable writing experiences (346).



Combs, Frost, and Eble. Collaborative Course Design in Scientific Writing. CS, Sept. 2015. Posted 11/12/2015.

Combs, D. Shane, Erin A. Frost, and Michelle F. Eble. “Collaborative Course Design in Scientific Writing: Experimentation and Productive Failure.” Composition Studies 43.2 (2015): 132-49. Web. 11 Nov. 2015.

Writing in the “Course Design” section of Composition Studies, D. Shane Combs, Erin A. Frost, and Michelle F. Eble describe a science-writing course taught at East Carolina University, “a doctoral/research institution with about 27,000 students, serv[ing] a largely rural population” (132). The course has been taught by the English department since 1967 as an upper-level option for students in the sciences, English, and business and technical communication. The course also acts as an option for students to fulfill the requirement to take two writing-intensive (WI) courses, one in the major; as a result, it serves students in areas like biology and chemistry. The two to three sections per semester offered by English are generally taught by “full-time teaching instructors” and sometimes by tenured/tenure-track faculty in technical and professional communication (132).

Combs et al. detail iterations of the course taught by Frost and Eble, who had not taught it before. English graduate student D. Shane Combs contributed as a peer mentor. Inclusion of the peer mentor as well as the incorporation of university-wide writing outcomes into the course-specific outcomes resulted from a Quality Enhancement Plan underway at the university as a component of its reaccreditation. This plan included a special focus on writing instruction, for example, a Writing Mentors program that funded peer-mentor support for WI instruction. Combs, who was sponsored by the English department, brought writing-center experience as well as learning from “a four-hour professional development session” to his role (133).

Drawing on work by Donna J. Haraway, Sandra Harding, and James C. Wilson, Frost and Eble designed their sections of the course collaboratively, intent “on moving students into a rhetorical space where they can explore the socially constructed nature of science, scientific rhetoric, and scientific traditions” (134). In their classes, the instructors announced that they would be teaching from “an ‘apparent feminist’ perspective,” in Frost’s case, and from “a critical gender studies approach” in Eble’s (134-35). The course required three major assignments: field research on scientific writing venues in an area of the student’s choice; “a complete scientific article” for one of the journals that had been investigated; and a conversion of the scientific article into a general-audience article appropriate for CNN.com (135). A particular goal of these assignments was to provoke cognitive dissonance in order to raise questions of how scientific information can be transmitted “in responsible ways” as students struggled with the selectivity needed for general audiences (135).

Other components of students’ grades were class discussion, a “scripted oral debate completed in small groups,” and a “personal process journal.” In addition, students participated in “cross-class peer review,” in which students from Frost’s class provided feedback on the lay articles from Eble’s class and vice versa (136).

In their Critical Reflection, Combs et al. consider three components of the class that provided particular insights: the collaboration in course design; the inclusion of the peer mentor; and the cross-class peer review (137). Collaboration not only allowed the instructors to build on each other’s strengths and experiences, it also helped them analyze other aspects of the class. Frost and Eble determined that differences in their own backgrounds and teaching styles impacted student responses to assignments. For example, Eble’s experience on an Institutional Review Board influenced her ability to help students think beyond the perception that writing for varied audiences required them to “dumb down” their scientific findings (137).

Much discussion centers on what the researchers learned from the cross-class peer review about students’ dissonance in producing the CNN.com lay article. Students in the two classes addressed this challenge quite differently. Frost’s students resisted the complexity that Eble’s students insisted on sustaining in their revisions of their scientific article, while students in Eble’s class criticized the submissions from Frost’s students as “too simple.” The authors write that “even though students were presented with the exact same assignment prompt, they received different messages about their intended audiences” (138).

The researchers credit Combs’s presence as a peer mentor in Frost’s class for the students’ ability to revise more successfully for non-specialized audiences. They argue that he provided a more immediate outside audience at the same time that he promoted a sense of community and identification that encouraged students to make difficult rhetorical decisions (138-39). His feedback to the instructors helped them recognize the value of the cross-class peer review despite the apparent challenges it presented. In his commentary, he discusses how receiving the feedback from the other class prompted one student to achieve a “successful break from a single-form draft writing and in-class peer review” (Combs, qtd. in Combs et al. 140). He quotes the student’s perception that everyone in her own class “had the same understanding of what the paper was supposed to be” and her sense that the disruption of seeing the other class’s very different understanding fueled a complete revision that made her “happier with [her] actual article” (140). The authors conclude that both the contributions of the peer mentor and the dissonance created by the very different understandings of audience led to increased critical reflection (140), in particular, in Combs’s words, the recognition that

there are often spaces in writing not filled by right-and-wrong choices, but by creating drafts, receiving feedback, and ultimately making the decision to go in a chosen direction. (140)

In future iterations, in addition to retaining the cross-class peer review and the peer-mentor presence, the instructors propose equalizing the amount of feedback the classes receive, especially since receiving more feedback rather than less pushes students to “prioritize” and hence develop important revision strategies (141). They also plan to simplify the scientific-article assignment, which Frost deemed “too much” (141). An additional course-design revision involves creating a lay article from a previously published scientific paper in order to prepare students for the “affective impact” (141) of making radical changes in work to which they are already deeply committed. A final change involves converting the personal journal to a social-media conversation to develop awareness of the exigencies of public discussion of science (141).