Marking Schemes for an Authentic Group Project, Trial by Statistics: A Case Study

This study analyses two marking schemes for an ‘authentic’ Group Project worth 50% of a first year undergraduate agribusiness course at The University of Queensland (UQ). Several marking schemes for the Group Project had been trialled over the previous ten years in an effort to find an equitable method of marking individual students. In 2019, a marking scheme that had been used successfully in earlier years was advertised prior to the commencement of semester. However, issues within some of the groups during the semester led students to request that a Peer Evaluation marking scheme be employed. Eventually, for a class of 105 students, both marking schemes were used to assess students’ work, and a Pearson correlation coefficient was calculated on the final project marks to determine how equivalent the two schemes were. A good correlation (0.75) between the two schemes was returned, which was also reflected in a good correlation for the final overall mark for the whole course (0.87). These statistical results suggest that there is a good argument for continuing to use the existing marking scheme rather than a peer evaluation, which can have associated behavioural issues that are difficult to resolve.


Introduction
When lecturers meet in the corridor or chat over a cup of coffee, they inevitably get around to discussing issues they are facing in their classes. Often, the discussion quickly moves to the assignments being undertaken by students, how students are doing them (or not), and how they are marked. Much can be learned from colleagues and peers during such informal discussions (Nonaka and Takeuchi 1996). In fact, people often find that 'their' problem or issue is not theirs alone, but has already presented itself to others who have managed to create a solution or workaround. The project presented in this paper came about because of such discussions.
This study resulted from three issues in teaching a first year undergraduate course about the use of 'E' technologies in the agrifood chain in the Bachelor of Agribusiness at The University of Queensland (UQ) in Semester 2, 2019. The issues were that (i) the main piece of assessment for the course was worth 50% and needed to be an 'authentic' assessment piece that engaged students in making use of content taught in the course which related strongly to industry practice; (ii) with 105 students in the course, this piece of assessment needed to be undertaken in small groups (no more than 4 students in a group); and (iii) the marking system used had to deliver a fair and equitable mark to individual students within each group. The third issue was more difficult than it initially sounds because, while the course was ostensibly a first year course, students from higher year levels were able to take it as an elective. This meant that, quite apart from potential issues of students not pulling their weight in the group (because of poor scholastic effort, for example), group members were quite diverse in their knowledge, maturity and abilities, which did affect the overall quality of the final projects delivered by groups. This in turn meant that a high performing individual in a low performing project could be penalised as a result of the make-up of the group (Chang and Brickman, 2018).
The project described here looks at the three issues in the context of contemporary educational literature and provides a statistical analysis of the validity of two criterion based marking (grading) schemes used in order to deliver equity to individual students in terms of their final marks for the Group Project in the case described.

Group Projects as Authentic Assessment
A group project can be defined as "a graded assignment requiring students to work collaboratively across multiple class periods and involving some time outside the normal class meeting" (Ettington & Camp, 2002, p. 357). Group projects are a widely used teaching tool in university course assessment (Wilson et al., 2018), and there is a well established pedagogical literature on the topic showing a number of benefits of group work for students, including: (i) learning teamwork skills (Davis and Miller 1996; Michaelsen et al. 2014), a skill often requested by employers (Graduate Outlook Survey 2010); (ii) improving critical thinking skills (Anderson et al. 2001; MacGuiness, 2005); (iii) improving communication and collaborative skills (Slavin, 2014); and (iv) gaining insight into a particular topic at a deeper level than individual research (Burke 2011). Furthermore, as large organizations have become increasingly dependent on small groups or teams to achieve their goals, it has become increasingly important for employees to be able to work together collaboratively in today's business world (Aggarwal and O'Brien 2008). As an 'authentic' piece of assessment (Frey et al. 2012), a group project, if designed properly, can often tick all the boxes. Authentic assessment is commonly defined as assessment with real world applicability, in which students employ what they have learnt during the course to perform real-world tasks (Mueller, 2018).
Despite being a good teaching and learning tool, group projects present challenges which, if not managed, can prevent effective learning and result in poor-quality outputs, unequal distribution of workload, and conflict among team members (Chang and Brickman 2018). Indeed, 'social loafing' or scholastic laziness (Aggarwal and O'Brien 2008; Pandeira and Aseng 2017) creates an imbalance of effort, such that 'free riders' are able to benefit from the contributions of others, which is received badly by other students. Lecturers also need to be aware that group projects introduce their own grading complexities, and it is the grading complexity of a group project that arose during the course of the semester that forms the focus of the study described here.

Marking/Grading of Group Projects
Marking (or grading) a group project is a complex task. Koshy (2009) and Brookhard (2018) give good overviews of the literature, and a multiplicity of university 'how to' webpages give advice. Essentially, however, what has to be determined is whether the lecturer assesses the product (the overall piece of assessment or project), the process (an individual's work within the project), or both. Once this has been decided, the actual marking scheme to be used needs to be developed. Barnes (1997) describes two main schemes: (i) criterion based reference frameworks, where assessment of an assignment is made on the basis of performance defined by pre-specified criteria; and (ii) norm referenced approaches, where assessment is made on the basis of performance relative to that of other members of the class or cohort. Criterion based frameworks have become more popular over time, and requirements for more transparent schemes with better learning outcomes for the student have ramped up in recent years (Koshy 2009). Thus rubrics that 'articulate expectations for student work by listing criteria for the work and performance level descriptions across a continuum of quality' (Brookhard, 2018) within criterion based schemes have become favoured over norm-based schemes.

The Case Study
This case study is based on the marking of an authentic piece of assessment for a course entitled "ETechnologies in the Food & Fibre Industry", in which students were asked to create an innovative "E" product within an electronically enabled Agribusiness and to 'sell' this product in as innovative a manner as possible, documenting how the new product could add value to the business. The project had to be undertaken in small groups of four students (Burke (2011) indicates that small groups generally realise better learning than larger groups), and was worth 50% of the overall course assessment marks. The groups were put together by the lecturer to prevent groups of friends or year levels 'ganging up'. The completed project was delivered via a new online platform at UQ called CIRRUS, which also had a facility to create a PDF document to be uploaded into the Course Blackboard site by each student. The Project was scaffolded with an instruction document giving clear guidelines on requirements and the Marking Criteria (Table 1), and was assessed on two main areas: the "Idea" (20 marks) and an 'E' Product/Agribusiness Business Plan (30 marks).

[Table 1. Marking Criteria (table body not recovered). Surviving total rows: TOTAL (x/105) = (x/30); TOTAL (Idea + Business Plan Report) (x/50).]
A marking scheme, the ECP Weighting system (ECPW), that has been used successfully since 2013 for evaluating an individual's mark for the group project was used and advertised prior to the commencement of semester (success is defined here in the sense that students have reported that the ECPW reduces the 'free riding' that can occur in group projects). Table 2 shows a worked example of how an individual's mark was calculated using the overall Project mark as the basis. A weighting for each student in a Group was assigned by averaging their marks over the previous assignments in the course, equated to a projected final grade. However, issues within some of the groups during the semester (social loafing and free riding, not logging onto the CIRRUS platform to deliver any physical input into the project, poor or no communication, and work plagiarised from websites) meant that several groups of students requested a Peer Evaluation marking scheme (Dyrud, 2001).
The class was consulted to verify that this was what was collectively wanted and that everyone was familiar with the process from previous classes. With an affirmative answer, a formal evaluation scheme with rubrics was developed (Table 3). Its use was mandatory for all students, who uploaded their evaluations, together with the CIRRUS linked PDF of their project, to the Project assignment link on the Course Blackboard site for marking.

Marking Process for the Group Projects and Results
Eventually, for a class of 105 students, both the ECP weighting scheme (ECPW) and the Peer Evaluation weighting scheme (PEW) were used in assessing students' work. A Pearson correlation coefficient was calculated on the results of each group's overall project mark obtained using each marking scheme in order to determine how equivalent the schemes were. Given the concerns that students had voiced, the marking of the group projects was made an exhaustive process to ensure equity.
The project was first assessed overall (to obtain a 'product' mark) and a group mark was assigned based on the Marking Criteria shown in Table 1. Individual student marks within each group were then calculated using their group's overall project mark and the ECPW shown in Table 2 (Individual Project Mark 1). An individual's project mark was also calculated using the overall project mark and the PEW scheme shown in Table 3 (Individual Project Mark 2).
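The two calculations described above can be sketched in a few lines of Python. This is an illustration only: it assumes that an individual's mark is the group's overall project mark multiplied by that student's weighting (a value between 0 and 1). That assumption is consistent with the weightings quoted in this paper, but the worked examples in Tables 2 and 3 are not reproduced here, so the actual calculation may differ. The function name and all numbers are hypothetical.

```python
def individual_mark(group_mark: float, weight: float, max_mark: float = 50.0) -> float:
    """Scale a group's project mark by one student's weighting.

    Assumption (not confirmed by the source tables):
    individual mark = group mark * weight, with weight in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if not 0.0 <= group_mark <= max_mark:
        raise ValueError("group_mark must be between 0 and max_mark")
    return group_mark * weight


# Hypothetical student in a group whose project scored 40 out of 50.
group_mark = 40.0
mark_ecpw = individual_mark(group_mark, weight=0.8)  # Individual Project Mark 1
mark_pew = individual_mark(group_mark, weight=0.5)   # Individual Project Mark 2
print(f"ECPW mark: {mark_ecpw:.1f}, PEW mark: {mark_pew:.1f}")
```

Under this assumption the same group mark can yield quite different individual marks, which is why the agreement between the two weightings matters.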
A regression analysis was run in Excel on each individual's project marks under the ECPW versus the PEW system, and also on their overall course marks calculated using the ECPW project mark versus the PEW project mark. R² values of 0.7469415 and 0.876877 respectively were obtained, meaning the two marking schemes were strongly correlated and that neither would penalise students in terms of their overall final grade for the course.
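For readers who wish to repeat this kind of comparison outside Excel, the sketch below computes a Pearson correlation coefficient r and the corresponding R² for two lists of marks. It is not the analysis used in this study: the marks are invented for illustration and the function name is hypothetical. For a simple linear regression with one predictor, R² equals the square of r, so it is worth being explicit about which of the two is being reported.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Invented project marks (out of 50) for five hypothetical students.
ecpw_marks = [32.0, 40.0, 28.5, 45.0, 36.0]
pew_marks = [30.0, 41.0, 25.0, 44.0, 38.5]

r = pearson_r(ecpw_marks, pew_marks)
r_squared = r ** 2  # equals R^2 of a one-predictor linear regression
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```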
There were some differences in actual marks for individuals between the two marking systems (Figure 1). Most were less than 1 or 2 marks, but eight student marks were substantially different (>5 marks). On review, these were due mainly to individuals either 'opting out' of being in a group (n=2, with one student being given a PEW of 0.1 but having a much higher ECPW of 0.8, and the other a PEW of 0.35 and an ECPW of 1.0), or simply not participating at all and not submitting anything (e.g. the PDF from CIRRUS or the PEW document) on the Blackboard site (n=2), or being in a group in which all students gave each other full marks on the PEW (n=4), such that the low ECPW resulting from poor early assignments was negated. My own observations and discussions with students indicated that the PEWs were fairly accurate in most cases, although it is acknowledged that scores were skewed in others. This reflects the literature, where a number of studies have shown PEW schemes to be both positive and negative: positive in that they give students a chance to participate in the marking of their projects, which gives ownership and interest and to some degree prevents 'free riding', but negative in that some students may occasionally not give honest evaluations, or evaluations that really reflect other students' efforts (Strong & Anderson, 1990; Dyrud, 2001). This, plus the fact that the PEW was introduced late in the semester and students were thus not 'prepped' for its use, made it seem fairest to calculate an individual's final project mark, and thus final course mark, using an average of the PEW calculated marks and the ECPW marks. As expected, the average had a strong correlation with both systems (PEW = 0.968871 and ECPW = 0.968589 respectively).
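The review step and the averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions: the student labels and marks are invented, the function names are hypothetical, and the 5 mark threshold is taken from the text.

```python
def final_project_mark(mark_ecpw: float, mark_pew: float) -> float:
    """Final project mark as the simple average of the two scheme marks."""
    return (mark_ecpw + mark_pew) / 2


def large_differences(marks: dict, threshold: float = 5.0) -> dict:
    """Return students whose two marks differ by more than `threshold`.

    `marks` maps a student label to an (ECPW mark, PEW mark) pair.
    """
    return {
        student: abs(ecpw - pew)
        for student, (ecpw, pew) in marks.items()
        if abs(ecpw - pew) > threshold
    }


# Invented marks (out of 50) for three hypothetical students.
marks = {
    "student_a": (32.0, 4.0),   # e.g. a student who opted out of the group
    "student_b": (40.0, 39.0),  # the two schemes agree closely
    "student_c": (28.0, 34.5),  # peers rated higher than early assignments
}

flagged = large_differences(marks)
finals = {s: final_project_mark(e, p) for s, (e, p) in marks.items()}
print("Flagged for review:", flagged)
print("Final project marks:", finals)
```

Flagging large differences before averaging keeps the manual review focused on the few cases where the two schemes disagree.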

Conclusion
The results of this small statistical analysis of two criterion based marking schemes suggest that there is a good argument for the established and advertised ECPW marking scheme to remain in place for the Group Project in this course, because it accurately reflects how students go on to work in their groups, as assessed using the PEW scheme. This result, despite the previous years of success in using the ECPW for marking the Group Project in this course, also allays a minor procedural concern that the ECPW (through its structural adjustment, which uses the past quality of assignments to predict a future mark in the Group Project) could be seen as not enabling a student to improve their final course mark over the semester. This potential issue, which aligns with setting deep learning goals, is discussed in Hermida (2015, Ch. 1) and will be tested during a further iteration of the comparison of the marking systems when the course is run again in 2020.