ELIC team members gave presentations at LREC-COLING 2024 (May 20-25) and at Field Matters 2024 (August 16).
Shulin Zhang, John Hale, Margaret Renwick, Zvjezdana Vrzić, and Keith Langston. 2024. An Evaluation of Croatian ASR Models for Čakavian Transcription. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 1098–1104, Torino, Italia. ELRA and ICCL.
Austin Jones, Shulin Zhang, John Hale, Margaret Renwick, Zvjezdana Vrzić, and Keith Langston. 2024. Comparing Kaldi-Based Pipeline Elpis and Whisper for Čakavian Transcription. In Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024), pages 61–68, Bangkok, Thailand. Association for Computational Linguistics.
As part of the NSF-funded ELIC project, PI Keith Langston and Co-PI Zvjezdana Vrzić organized two workshops at the University of Rijeka and Juraj Dobrila University of Pula in July. Project team members worked to further refine the annotation system for the different varieties included in the corpus and were trained in the use of ELAN for annotation of the corpus data.
Langston, Keith, Silvana Vranić, and Zvjezdana Vrzić. “Language variation and contact between closely related varieties: Čakavian dialects in the Istria-Kvarner region of Croatia”
Langston, Keith, Zvjezdana Vrzić, Margaret Renwick, and John Hale. “The construction and annotation of a spoken corpus for language documentation and research: The Endangered Languages in Contact in Istria and Kvarner project (ELIC)”
As part of the NSF-funded project “Endangered languages in contact in Istria and Kvarner, Croatia (ELIC)”, Dr. Keith Langston (PI) and Dr. Zvjezdana Vrzić (NYU, Co-PI) led two workshops in Croatia at the University of Rijeka and University of Pula in June, assisted by Dr. Silvana Vranić (University of Rijeka) and Dr. Ivana Lalli-Paćelat (University of Pula), who are also collaborating on this project. The goal of this research is to create an online spoken corpus of four endangered language varieties found in this region of Croatia: Čakavian (Slavic) and Istriot, Istro-Romanian, and Istro-Venetian (Romance). The corpus will be used to document these varieties, to analyze linguistic features of interest, and to study language contact phenomena. It will also be available to local community members, for use in language maintenance and revitalization projects or for other purposes. Other UGA researchers working on this grant are Co-PIs Dr. John Hale and Dr. Margaret Renwick.
During the workshops, research team members discussed and refined the system of transcription and annotation used in the project and consulted with one another about the individual transcriptions that they were currently working on. Researchers and assistants are currently focused on processing Čakavian and Istro-Venetian interviews that have been recently collected. The transcription of Istro-Venetian poses special problems because of language variation and the lack of any established orthographic norms. Since the corpus will include audio, researchers are developing a systematic orthographic transcription that is designed to be broadly acceptable to members of the local community, taking into account the variation in informal written usage.
The first (hopefully annual) CresSSLing was a big success. We had a great group of young scholars, and I was really impressed by their passion and their ideas for their own research projects. Congratulations to the organizers, Dr. Silvana Vranić (University of Rijeka) and Dr. Zvjezdana Vrzić (New York University). I’m already looking forward to next year!
I’ll be teaching a course, “An introduction to Praat and ELAN for field linguists,” at CresSSLing 2022, July 4-8. The summer school will be held at the University of Rijeka’s educational center in the Moise Palace in the town of Cres. It will include courses taught by instructors from the University of Rijeka and the University of Jena, with students from various universities in Europe and the US.