Many of the word lists used in this study are in common currency and are made accessible here only to aid reproduction of the study. In particular, we provide the Longman Defining Vocabulary (2190 words), the Natural Semantic Metalanguage list (78 words), the Ogden list (850 words), the Up-Goer 5 list (1000 words), and the Swadesh list (207 words) on this basis; copyright remains with the original authors. The 4lang list (732 words) refers to Version 2.0; see GitHub for developments. The compressed BNC speech frequency list is computed on a downcased version of the corpus before the Stanza analysis.
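
For readers who want to see what counting on a downcased corpus amounts to, the following is a minimal sketch and not the pipeline used in the study. It assumes plain-text input, one utterance per line, and uses whitespace splitting as a stand-in for real tokenization; the function name `frequency_list` and the sample sentences are hypothetical.

```python
from collections import Counter


def frequency_list(lines):
    """Return (word, count) pairs, most frequent first.

    Assumption: whitespace splitting stands in for proper tokenization;
    the study's own processing may differ.
    """
    counts = Counter()
    for line in lines:
        # Downcase first, so that "The", "THE" and "the" are counted together.
        counts.update(line.lower().split())
    return counts.most_common()


# Hypothetical sample input, not taken from the BNC.
sample = ["The cat saw the dog", "THE dog ran"]
freq = frequency_list(sample)
```

Downcasing before counting merges sentence-initial and capitalized variants into one entry, at the cost of conflating proper nouns with homographic common nouns.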