Should we be afraid of open book exams? Our experience

This study reports our first experience introducing open-book exams in the Anatomy subject at a medical school. Students were authorized to bring study material to the exam and to consult books, atlases and their own notes. Our objective was to check whether the test scores were significantly higher or lower than those of traditional rote exams. Statistical analysis showed no significant differences in the mean, median, range, minimum score, or maximum score. These results are consistent with the consulted bibliography. We conclude that there is no need to be afraid of implementing this type of exam in the future.


Introduction
At the beginning of this 2022-2023 academic year, a guideline was announced at our Cardenal Herrera University that many of us did not expect: the midterm continuous assessment exams could not be rote. This instruction did not affect final exams, but it required professors either to replace the traditional midterm exams with other types of assessable activities or to keep the midterm exam while allowing notes: what the literature has come to call open-book exams.
The controversy between open-book and rote exams is not new. The latest recommendations point to a mixed method of examination, since each type of exam assesses different skill sets in students (Johanns, Dinkens, and Moore 2017). Other authors recommend using open-book questions as a complement to closed-book questions and believe that the post-Covid world could be a good time to rethink opening up to these possibilities (Zagury-Orly and Durning 2020). Grades tend to be generally higher, and the format prepares the student for a proactive knowledge of the exam content and for applying the subject to solve problems (Dave et al. 2020). Some experiences with health sciences students have shown that consulting tools, including Internet searches, is not very useful in problem-solving questions or in those where evidence has to be analyzed to decide on a clinical course of action, so the student still needs to have studied the subject from memory (Mathieson, Sutthakorn, and Thomas 2020).
After reading some examples in the literature, we planned to hold an open-book midterm exam this academic year in the Anatomy III and Anatomy IV subjects of the Medicine degree and to analyze whether the results differed from those of traditional exams. We therefore allowed students to take the midterm exam using any analog material they wanted: books, notes, notebooks, sheets, atlases, cards, etc. The only material we did not authorize was digital material that could be used to search for the questions directly on the Internet. The exam proceeded normally. Finally, the final exam in January 2023 was taken from memory in the traditional style, without any materials, thus complying with the guidelines implemented this year.

Objectives
Our goal is to verify whether open-book testing has had an impact on student grades. Although research generally seeks to find differences between groups (the alternative hypothesis), in this specific case the good news would be that the implementation of open-book exams has not influenced student grades and, therefore, has not affected the performance of our students.

Material and methods
The first step was to retrieve the grades of the students in the subjects of Anatomy III and Anatomy IV, which are taken by the Medicine students of the Cardenal Herrera CEU University in Alfara del Patriarca (Spain). Anatomy III is taught in the first semester (September to December, with an exam in January) and Anatomy IV in the second semester (February to May, with an exam in June). Each subject has a midterm exam and a final exam, each graded out of 10 points, so every calendar year contains four exams in total. An advantage of this research is that both subjects are taught by the same professor, which guarantees homogeneity in both teaching and evaluation. The exams followed the same style over the years: multiple-choice questions without open questions, with each incorrect answer subtracting one third of the value of a correct one, and with 50 questions in the final exams and 40 in the midterm exams. Because the methodology remained constant, the different exams are easy to compare. To avoid biases in our results, we eliminated the exams that, because of the Covid-19 pandemic, could not be held in the classroom and had to be taken virtually or through proctoring. We therefore discarded the following exams: Midterm Anatomy IV 2020 (which in fact was not held), Final Anatomy IV 2020, Midterm Anatomy III 2020, Final Anatomy III 2021 and Midterm Anatomy IV 2021. From Final Anatomy IV 2021 onwards, exams were again held under normal conditions in the classroom, so these confounding factors disappear. We used Student's t-test to compare the means of the two groups of exams (open-book and rote) and check whether they differed.
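The scoring rule described above can be written out as a short calculation. The following Python sketch is only illustrative: the function name `exam_grade`, the treatment of unanswered questions as worth zero points, and the absence of a floor at zero are assumptions made for the example, not details stated in the text.

```python
def exam_grade(correct, incorrect, total_questions, max_points=10.0):
    """Grade on a 0-10 scale where each wrong answer costs 1/3 of a right one.

    Assumption: unanswered questions neither add nor subtract points.
    Assumption: the result is not clamped, so it can be negative.
    """
    if correct < 0 or incorrect < 0:
        raise ValueError("answer counts must be non-negative")
    if correct + incorrect > total_questions:
        raise ValueError("more answers than questions")
    # Net number of correct answers after the penalty for wrong ones.
    net_correct = correct - incorrect / 3
    return net_correct * max_points / total_questions


# Hypothetical midterm (40 questions): 30 right, 6 wrong, 4 left blank.
midterm = exam_grade(correct=30, incorrect=6, total_questions=40)
# Hypothetical final (50 questions): 35 right, 15 wrong, none left blank.
final = exam_grade(correct=35, incorrect=15, total_questions=50)
print(f"midterm: {midterm:.2f}, final: {final:.2f}")
```

Dividing by the number of questions puts the 40-question midterms and the 50-question finals on the same 10-point scale, which is what makes their grades comparable.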

Results
The main search results are shown in Table 1, from the most recent to the oldest. It contains the parameters of the 17 exams; the second one is the exam of interest because it was the one carried out as an open-book exam. At first glance, the small variations in the number of students who take a midterm exam and the corresponding final exam may be striking. These variations have several explanations: students who follow the normal course but finally leave the subject for the extraordinary session in July, students who enroll when the course is halfway through, students who miss the midterm exam for some justified reason, etc.
Regarding the exam average, it has oscillated between values close to 6 and 7, with excellent years such as 2018 and years with very low average grades such as 2022. If we compare the averages of the same group of students who took the open-book exam (6.09) and later the final exam (5.84), we see a slight decrease, although not as pronounced as the one suffered by the Anatomy III 2021-22 group (whose average dropped almost 2 points) or the Anatomy III 2019-20 group (whose average fell 0.64 points). It is not possible to establish a clear trend as to whether the final exam has a better mean than the midterm, since we find different cases every year. Regarding the range, practically every year there have been students with grades below 1 and brilliant students with a 10 or almost a 10. The open-book exam shows no differences in this regard.
It might be thought that the midterm exams and the final exams are not comparable, since the final exams have more questions and cover a considerably larger amount of knowledge, and that this could affect the results. For this reason, the results of the five previous midterm exams have been grouped in Table 1, giving an average of 6.69, somewhat higher than the average of 6.09 reached in the open-book exam. There are hardly any changes in the 50th percentile, which reaches 6.40 in the open-book exam and 6.62 in the average of the five previous midterms. There are no appreciable changes in the minimum marks (0.1 in the five previous midterms and 0.8 in the open-book exam) or the maximum marks (10 and 9.8 respectively). This comparison can be visualized more conveniently in Figure 2: the mean score for the rote exams was 6.77, while the mean score for the open-book exam was 6.09.
To check whether this difference is statistically significant, we performed Student's t-test on each parameter, using the categorical classification variable "open-book exam" or "rote exam". For all parameters, Levene's test indicated that equal variances could be assumed, and the two-sided p values were not significant. That is, for 15 degrees of freedom, the mean (t = -0.853, p = 0.407), the median (t = 0.587, p = 0.566), the range (t = -0.465, p = 0.648), the minimum (t = 0.572, p = 0.576) and the maximum (t = 0.480, p = 0.638) do not show a statistically significant difference between the two groups. These data answer the main objective of the study: taking an open-book exam did not significantly alter the parameters of the exam with respect to the exams taken from memory.
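The comparison above has an unusual shape: one open-book exam against the 16 rote exams that remain among the 17 in Table 1, which gives the 15 degrees of freedom reported. The following Python sketch shows how a pooled-variance Student's t statistic (equal variances assumed) can be computed for such a layout. The function name `pooled_t_statistic` and the list of rote-exam means are hypothetical placeholders, not the values in Table 1; only the open-book mean of 6.09 comes from the text. The p value is not computed, because it requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    if n_a < 1 or n_b < 1 or df < 1:
        raise ValueError("need at least three observations across both groups")
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Sums of squared deviations; a group with a single value contributes 0,
    # so the pooled variance then rests on the other group alone.
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    pooled_var = (ss_a + ss_b) / df
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; t is undefined")
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Mean grade of the single open-book exam (value taken from the text).
open_book_means = [6.09]
# HYPOTHETICAL mean grades for 16 rote exams, for illustration only.
rote_means = [5.8, 7.8, 6.2, 6.9, 7.1, 6.5, 7.4, 6.0,
              8.3, 6.6, 7.0, 6.4, 7.2, 5.9, 6.8, 7.3]

t_value, dof = pooled_t_statistic(open_book_means, rote_means)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

With a single observation in one group, the test effectively asks whether that exam falls within the normal year-to-year spread of the rote exams, which is why it has limited power to detect small differences.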

Discussion
Our findings are consistent with other studies that have analyzed the impact of open-book exams on scores. Spiegel and Nivette found no significant differences between the two cohorts (open-book and closed-book exams) in either academic performance or knowledge retention (Spiegel and Nivette 2021). Acceptance among students is also high: a study in London that asked students for feedback found that 65% wanted more open-book exams (Chadha, Maraj, and Kogelbauer 2020). Another study reports that 85% of the students preferred an open-book exam because of the lower stress and pressure, and valued positively that the effort was focused on comprehension rather than on memorization (Quille et al. 2021). However, these exams must be carefully prepared to ensure the validity, reliability and fairness of the examinations (Er et al. 2020). There are also risks and limitations, for example that the exam evaluates information retrieval instead of the knowledge acquired (Dave et al. 2020).
Before concluding, we want to highlight a limitation of our study: it is based on only one open-book exam (the first of them). However, we wanted to analyze these results early so that we could decide whether to continue using this format next year. Another limitation is that we have focused solely on the exam grade, without taking into account other outcomes of the learning process, other quantitative parameters such as the number of students retaking the subject, or the relationship with other course assessments such as the dissection practical exam. In future editions we will analyze these data and add some feedback questions to find out the students' preferences and the perceived difficulty.

Conclusion
The analysis of the results of this first open-book exam allows us to conclude that there have been no significant differences between taking the exam in this way and in the traditional rote way, and therefore there should be no fear of continuing to use this format in the future. However, we have only carried out one of these exams, so we will have to continue analyzing the results in the following courses.