CROP Initiative
THE CROATIAN OPEN DATABASES INITIATIVE
The CROP Initiative emerged from the need for available psycholinguistic data for Croatian to ensure sufficient experimental control in our research. The Initiative started in 2023 and includes existing open science material developed by the team, as well as plans for the future. The team’s key Open Science objectives are: developing publicly available psycholinguistic resources, including computational-linguistic tools, corpora, databases and other materials necessary for achieving that goal; disseminating research methodology and data in open-access journals; and sharing our knowledge through free-of-charge workshops and scientific meetings. Our activities target interdisciplinary and multidisciplinary researchers (linguists, psychologists, cognitive scientists), AI and NLP researchers, students, and other interested members of society.
CROP Tools
Classifier: https://github.com/laconlab/lacon-classy
CROP Materials
Materials for creating the corpora: https://github.com/laconlab/lacon-go-tiny-scrapy
Manually labeled dataset for training and testing the classifier: https://github.com/laconlab/lacon-engri-classifier
CROP Resources
Corpus of Croatian news portals ENGRI 2014-2018: https://www.clarin.si/repository/xmlui/handle/11356/1416
https://live.european-language-grid.eu/catalogue/corpus/8378
Corpus of Croatian news portals ENGRI 2014-2020: https://repository.pfri.uniri.hr/islandora/object/pfri%3A2156
Database of English words in Croatian: https://repository.pfri.uniri.hr/islandora/object/pfri%3A2495
Database of English words and their Croatian equivalents: https://repository.pfri.uniri.hr/islandora/object/pfri:2518
CROWD 5e Database: https://repository.pfri.uniri.hr/islandora/object/pfri:2614
https://figshare.com/articles/dataset/CROWD-5e_xlsx/19221678/3
ENGRI CROWD Database: https://figshare.com/articles/dataset/Normative_database_for_English_loanwords_in_Croatian/21687830
CROP Papers
Creating the corpora: https://doi.org/10.23919/MIPRO52101.2021.9597093
http://www.mipro.hr/LinkClick.aspx?fileticket=RAMnkK6T5UQ%3d&tabid=196&language=hr-HR
Preparing the materials for normative studies: https://doi.org/10.31724/rihjj.48.2.7
CROWD 5e research paper: https://link.springer.com/article/10.3758/s13428-022-02003-2
Priming paper: https://doi.org/10.36505/ExLing-2022/13/0036/000578
Accepted for publication in open-access journals:
Paper on the Database of English words in Croatian: https://hrcak.srce.hr/fluminensia
Paper on the Database of English words and their Croatian equivalents: https://czasopisma.filologia.uwb.edu.pl/index.php/c/index
CROP Workshops
Corpus linguistics (free of charge, publicly available materials): https://github.com/laconlab/lacon-workshop-2021
EEG Workshop: https://www.laconlab.com/news#h.wyih4bv817ae