Monday, December 17, 2018

Writing exams: Paper vs Online


Up until a few years ago, our BC Social Studies 11 students wrote a provincial examination to wrap up their course.  This was a standardized test featuring 55 multiple choice questions and two essay questions.  The exam was the same for all students in the province and was sometimes used by the school, district, or province to provide data on how students were doing.  Teachers occasionally used the exam results as feedback on whether what they were teaching had the desired outcome; in many cases the exam was ignored.  It was not precisely a high-stakes exam (although it was worth 40% of the students' overall mark), but for some teachers it drove their course planning and was often seen as a barrier to more creative ways to teach the course.  In my opinion it was a reasonable assessment; it drew from all areas of the course curriculum -- 20th century Canadian History, Human Development and Environmental Issues, Canadian Politics and Government -- and included a balance of straightforward and higher-order questions.  The exam also ensured that students across BC more or less got an introduction to common topics: citizenship and Canadian identity, the state of the developed vs developing world, why it was worth voting (and who the parties were), and Canada's involvement in world affairs.  It was perhaps too focused on content and less on broader thinking concepts and subject-specific skills (other than interpreting population pyramids).  The essay questions could be hard for students, but they really showed whether students could synthesize learning from a big chunk of the course, and whether they could write at a level expected of a Grade 11 student.

In January 2013, a group of Social Studies teachers in Prince George conducted an informal experiment to compare the results of students who wrote their provincial exams using either the paper format or an online format in a computer lab. Who would do better on the written section?

BACKGROUND
At the first school, the teachers insisted on paper copies of the exam.  At the second school, they decided to have all of the students write online.  The written section of the exam -- an essay on a historical topic and a second essay on a topic related to human geography or the environment -- is marked by teachers.  Our schools have similar demographics, the exam is the same, and the teachers who taught the course have similar styles and roughly the same attention to content, division of curriculum, and review strategies.  We did notice that the school with the online writers did not seem to emphasize human geography to the same extent as the other school, and this showed up in the responses to the second essay.  There were two classes in each school writing the exam, so we had about 50 exams at each site (thus 100 essays) to provide data.  The students did not get to choose paper vs online, so this perhaps removed the element of preferred styles and comfort-based selection.  Students who required special adaptations (e.g. scribes) wrote the exam in a separate sitting and were not included in our experiment.  The exam session is 2 hours, although most students require an hour or a little more to complete it, so exam fatigue is rarely a concern.  The essays are all scored with the same 6-point grading rubric (see image above), and the teachers marked the exams together, with two teachers marking each paper and agreeing on a score.  The discussion of results and conclusions (this blog post) is the result of a conversation between one of the markers from each school -- myself and a colleague. We participated in the marking but did not actually teach any of the classes involved.

RESULTS
While we weren't completely impressed by their achievement, the paper group won this contest with ease. There was a higher overall average score, with fewer 0s and 1s and almost no NRs. These students provided more detail, used more complex sentences, and had fewer lapses in grammar and punctuation. Interestingly, they related more "stories" from class; that is, more anecdotes that sounded like direct quotes from the teacher (for better or worse) or lines of thinking that were the result of activities that likely involved writing or speaking in class (as opposed to something studied before the exam).  They also had more repetition -- cycling back through an idea to fill the space.

The online writers had a lower average, with considerably more NRs, 0s, and 1s, and no 6s. They had shorter sentences and paragraphs, and used more informal grammar and less punctuation. These students had a higher prevalence of poor diction (word choice or vocabulary), but their syntax (arrangement of words and phrases) was fine. We concluded that they knew most of the same facts and held opinions similar to the paper group's, but simply referenced them without expansion -- a kind of "I know this stuff -- just read my mind" approach.

DISCUSSION
With all of their writing contained in a textbox, the online writers had no annotation of their text, no circling, no evidence of revision (eraser marks), nor any other evidence of the "struggle" to capture their thoughts. We admitted that we felt a bias about this -- writing that came from (and had) a "personality" seemed more authentic than the digital text. We also found the digital essays easier to mark -- without the "personality," e.g. the peculiarities of handwriting (particularly neat vs messy), we spent less time second-guessing whether we were assessing a visual quality that was not necessarily tied to the student's level of understanding. With digital writers, it was simply a matter of: where does this piece of writing place on the rubric?

Our interpretation of these results was that students writing essays online fall into a pattern of digital communication that is informal, truncated, and full of insinuation rather than exposition. Their writing has a quality of expedience, and we imagined their essays were written much faster than their paper counterparts. The students writing essays on paper exhibited more care and attention to their work, but also included more material meant to fill up the page.  Perhaps the online writers had no expectation of how "big" their essay should look on the screen, whereas the paper writers looked at three lined pages for each essay and had a feeling that they should at least get to page two before wrapping it up.

We agreed that students are very comfortable writing in digital spaces, but this does not necessarily serve them well for formal tasks such as essay writing.  This conclusion goes against what many experts would suggest -- even based on our own experience, it would seem that the digital format would serve one better: it is easy to go back, fix errors, cut & paste from one section to another for better flow, change one's mind about various parts, and so on.  But this is our adult sensibility: we are teachers who have spent years writing, first on paper and later with the "magic" of computers and word processors.  For some of our students, using a word processor is like accessing a heritage skill.  They are more finely attuned to the gestural inputs of digital devices, and are slower to type and less likely to take advantage of editing tools than the generation of students who used computers but did not have smartphones.

We carried these observations forward to other schools in 2013 and revisited them after subsequent SS11 provincial exams; our conclusions were generally reinforced by what we heard from every school.  The school that used the online exam in our 2013 experiment started giving students a choice of online vs paper, and it also put a renewed emphasis on writing skills in its Social Studies courses.  These particular provincial exams are no longer with us, but I thought this would be an interesting set of thoughts to consider as we anticipate a new round of standardized tests in BC -- the upcoming Grade 10 numeracy assessment and the Grade 10 and 12 literacy assessments.