A longitudinal study on language learning vocabulary in L2 Spanish

Some studies have found that learners perform better on listening tests that include visual input than on audio-only tests (Wagner, 2008), while others have found no difference in participants' performance between the two test formats (Batty, 2015). These mixed results make it necessary to examine the role of audio and video in listening comprehension (LC). This study examines the effect of input modality on the learning of new vocabulary by intermediate L2 learners. The study used four versions of the same text: a baseline in audio format, a baseline in video format, a redundancy-enhanced version in audio format, and a redundancy-enhanced version in video format. Three hundred sixty-two intermediate learners of Spanish participated over a period of three consecutive semesters. Results indicated that input modality (audio or video) did not seem to matter for responding correctly to the vocabulary items. However, the redundancy-enhanced version in both audio and video formats helped learners enrolled in face2face-blended courses respond correctly to vocabulary items more often than learners enrolled in online-hybrid courses.


Introduction
For years I have observed and evaluated language classes as part of my responsibilities for directing first and second Spanish courses at a medium-size Midwest research university. Teaching more interactive classes has given prominence to the use of videos, visual support, or games, especially when the target language is spoken constantly. That is, language instructors incorporate visual support in their lesson plans (Batty, 2015) not only for teaching listening, grammar, vocabulary, and culture (Pardo-Ballester, 2012), but also to prepare students before an assessment. Moreover, well-known publishers supplement their textbooks with online learning management systems to fit every learning style. Platforms such as WileyPlus, McGraw-Hill Connect, and others include online exercises and assessments that use visuals and videos.
However, most of the time when language instructors assess their students' LC skill, visual aids are not used. Becker and Sturm (2017) state that scholars agree on incorporating audiovisuals in L2 instruction, but researchers do not agree about using them for language testing. Furthermore, my experience observing and evaluating courses, part of my service responsibilities as a faculty member in the Department of World Languages and Cultures, reveals that for the listening section of a test the instructor still reads a scripted text aloud instead of using an audio-only file or an audiovisual file (i.e., still pictures and graphics, or a video with non-verbal communication such as body gestures and facial cues). The scripted text is normally written and revised by the instructor, the supervisor, or the publisher. According to Wagner (2014), important language tests such as the International English Language Testing System (IELTS) or the Pearson Test of English Academic (PTE) do not use videos for L2 LC. It makes sense that if students learn the foreign language using visual support, they should also be tested in the same way (Lee & Van Patten, 2003).
A number of researchers (Batty, 2015; Coniam, 2001; Pardo-Ballester, 2016; Suvorov, 2014; Wagner, 2010) have found mixed results when assessing L2 LC with audio-only input and/or visual support, resulting in a need to investigate this topic. The purpose of this project was to develop different types of L2 Spanish LC tasks in audio and video formats that were related to the type of L2 listening assessment students encountered in the classroom. That is, when students take a language test, the listening section includes an audio-only monologue. The intention was to learn about 1) the effect of a redundancy-enhanced version in video and audio formats on LC and vocabulary recognition; and 2) the effect of different instruction formats (online-hybrid vs. face2face-blended) on learners' ability to respond correctly to items with new vocabulary.
Jones and Plass's (2002) study, which investigated the effect of visual and verbal annotations on the LC of students of French, revealed that a visual component can aid in the recall of information and in vocabulary recognition. In their study, students exposed to both visual and verbal annotations performed better than students exposed to only visuals. In a follow-up study, Jones (2003) "suggests that the availability and the choice of visual and verbal annotations in listening comprehension activities enhances students' abilities to comprehend the material presented and to acquire vocabulary" (p. 41).

Literature review
Elaboration is the modification of input by adding redundancy such as repetition, paraphrasing of information, or synonyms of low-frequency lexical items (Long, 2007). L2 learners can better understand texts that have been modified through elaboration because learners "have more opportunities to process critical information" (Oh, 2001, p. 86). Elaboration has been an effective device for better comprehension of a written or an aural passage (Long, 2007; Oh, 2001).
Ginther (2002) specifically calls for more research on content and context visuals and their effect on learner performance in L2 listening. She differentiates between context and content visuals. Context visuals provide information about a situation, for example a picture of a couple of friends playing basketball in the park. Content visuals are related to the oral input. For example, if students hear a low-frequency word such as "mostrador" (counter), they see a visual aid that illustrates that word in order to facilitate comprehension. That is, words and pictures convey identical content, making the text redundant for the L2 learners, which again facilitates comprehension. Her study showed that visuals can help students' performance on listening tests. Ockey (2007) compared context visuals in the form of still images to context visuals in a video. His study revealed that context visuals in a video were distracting for students, but that they helped at the beginning of the listening test because they provided a situational context.
Regarding item preview, in Berne's (1995) study, one group had a chance to read the items before listening to the oral input while the other group reviewed vocabulary. Results showed that students who previewed items before the listening task performed better than their counterparts without access to the items.

The Spanish courses
The intermediate Spanish courses presented in this study were delivered in two different formats: online-hybrid and face2face-blended. The students enrolled in the online-hybrid course met twice a week in the classroom for fifty minutes and had one 25-minute synchronous online session with four or five classmates and the instructor. The face2face-blended course met three days in a regular classroom and one day in a language computer lab. Each class meeting lasted fifty minutes. Participants took the listening tests in the computer lab.

The listening assessments
The instruments consisted of eight listening tests with monologues in a Spanish target language use domain, each with a total of five multiple-choice items. Students had access to a preview of the questions before listening to the audio. Only two of these items focused on vocabulary, that is, on words or expressions that did not appear in the students' textbook, such as "chuparse los dedos" (a delicious meal). See Figure 1 for a sample of the video format test. The vocabulary items were the ones analyzed in this study.
Each test was delivered on a computer through the web. All texts were monologues and were administered as video and audio tests, each in a redundancy-enhanced version and a baseline version. Video and audio formats were embedded in the listening tests within the course platform. All listening tests included a play button, and the recording could be played only twice, without stopping the audio or going back to review specific details.
The visual input for the video format includes context and content visuals. The context visual is the title of the test, projected on the screen as a caption; it is the first visual participants see in the video. According to Ginther (2002), this helps to set the scene for the spoken input. Participants see the title of the video and the visual input related to the topic of the test. The content visual includes photos and videos and tends to be equivalent to the aural content. If participants hear "mostrador" (counter), they also see an image of that spoken word.

Research questions
The present research seeks to address the following questions:
RQ1: Do intermediate-level learners of Spanish perform better on vocabulary items in video listening tests than learners using audio listening tests?
RQ2: Is there a difference in performance between students enrolled in a hybrid course and students in a face2face-blended course when responding to the same vocabulary items?
RQ3: After 4-6 weeks of taking listening tests, did students learn the new vocabulary when compared with their first performance?

Method
A pseudo-crossover design was used across the eight tasks in this study. See Table 1 for a sample of how tests were distributed. Three hundred sixty-two students of intermediate Spanish participated in the study over three semesters. The data collected were the results of the eight tests. These data were analyzed in SAS with a generalized linear mixed effects model for the probability of getting a correct answer on the vocabulary items. The model includes fixed effects for question, method (face2face-blended vs. online-hybrid vs. control group), audio versus video, and an interaction between audio versus video and method.
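One way to write down a model of this kind is sketched below. The notation is an assumption introduced here for illustration and is not taken from the SAS specification: $y_{ij}$ is 1 when student $i$ answers vocabulary item $j$ correctly, $\alpha$ is the question effect, $\gamma$ the method effect, $\delta$ the audio-versus-video effect, and $(\gamma\delta)$ their interaction. The random-effects structure is not described in the text, so the per-student random intercept $u_i$ is a hypothetical choice.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0 + \alpha_{q(j)} + \gamma_{m(i)} + \delta_{v(i,j)}
  + (\gamma\delta)_{m(i),\,v(i,j)} + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Here $q(j)$ indexes the question, $m(i)$ the instruction format of student $i$, and $v(i,j)$ the modality in which student $i$ received item $j$.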
Data were also collected from a delayed post-test administered 4 weeks after the last test. The post-test investigated whether or not students could recall vocabulary items that appeared in four of the listening tests. The paper-and-pencil post-test was administered in the third semester to a total of 111 participants. This test included the same questions and possible answers that appeared in the listening tests. Students needed to identify the meaning of the vocabulary items and indicate whether they had known the eight vocabulary items before the research. For this, a Likert scale was used (1 = completely disagree, 4 = completely agree). For example, they were asked "Based on your Spanish knowledge, did you know the meaning of 'cargarse las pilas' before?" To investigate vocabulary retention, a probit regression model was used.

R.Q.1.
There was no significant difference on vocabulary items between the audio group and the video group (p = 0.6759), indicating that the choice of audio versus video did not seem to matter. Table 2 lists the least-squares means (LS-means) for the audio effect; the estimate (0.04485), on the log-odds scale of getting a correct answer, suggests that participants were slightly more likely to respond correctly to vocabulary items when using video, although this difference was not significant.

R.Q.2.
Looking at specific groups based on instructional formats in Table 3, we learn that the face2face-blended group working with videos with redundancy performed better on vocabulary items than the other groups. The estimate column displays the least-squares mean on the logit scale of getting the correct answer (values above 0 indicate that a correct answer is more likely than an incorrect one), and the mean column represents its mapping onto the probability scale. As shown in Table 3, the performance of the face2face-blended group was significantly different when they used video with redundancy (estimate = 1.17, M = 0.76, p < .0001), indicating that this group outperformed the others. The same group was also significantly different from the hybrid and control groups when using audio enhanced with redundancy (estimate = 0.90, M = 0.71, p < .0001). Group comparisons show that the hybrid and control groups performed similarly (p = 0.98), whereas the face2face-blended group differed significantly from both the control group (p = 0.0009) and the hybrid group (p = 0.00014). This tells us that, on average, students who were enrolled in face2face-blended courses and worked with listening tests enhanced with redundancy performed better than their counterparts in the control and hybrid groups.

R.Q.3.
Results from the delayed post-test indicated that none of the 111 students tested knew the vocabulary items before the study. A significant difference (N = 111, z = 2.62, p > |z| = 0.009) between results from the first time they took the tests and the second time, on the vocabulary recognition test, indicated that the probability of such a difference arising by chance is less than 1%.
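The figures reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the SAS analysis itself: it assumes the standard inverse-logit mapping from the logit scale to the probability scale and a standard normal reference distribution for the z statistic. The function names `inv_logit` and `two_sided_p` are hypothetical helpers introduced here.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Map a logit-scale (log-odds) estimate onto the probability scale."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Estimates and means as reported in the text for the face2face-blended group.
reported = {
    "video with redundancy": (1.17, 0.76),
    "audio with redundancy": (0.90, 0.71),
}

for condition, (estimate, reported_mean) in reported.items():
    probability = inv_logit(estimate)
    print(f"{condition}: logit {estimate:.2f} -> probability {probability:.2f} "
          f"(reported M = {reported_mean:.2f})")

# Delayed post-test statistic reported for R.Q.3.
print(f"z = 2.62 -> two-sided p = {two_sided_p(2.62):.3f}")
```

Under these assumptions, the two logit estimates round to the reported means, and the z statistic corresponds to a two-sided p-value of about 0.009.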

Implications and Conclusions
Implications for future research in pedagogy and testing emerge from the results of this study. Participants were comfortable listening to audio in Spanish and answering questions, although they were not instructed in any listening strategies. Current research has shown that strategy instruction in L2 LC helps learners improve their listening proficiency (Nogueroles Lopez & Blanco Canales, 2017). Further research should examine the relationship between instructed strategies for listening tests and the use of visuals, which could help sensitize students to the process of watching and listening. For example, if students are taught the cognitive strategy of visual elaboration, they could relate an image to the audio and therefore select the correct answer. This study showed that when listening tests are enhanced with redundancy, participants perform better when responding to vocabulary items, but only those students who were not in the online-hybrid or control groups. So there appears to be some variation among L2 listeners in the degree to which they can cope with unknown vocabulary items as they listen. Perhaps students spending more time in the classroom with their teacher is the key to accounting for this variation.

Figure 1. Sample of a video format listening test for 'chuparse los dedos'