The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
Submitted by Stefan Schweter (CORAL NLP Research)