Create README.md
    	
        README.md
    ADDED
    
    | @@ -0,0 +1,25 @@ | |
| 1 | 
            +
            ---
         | 
| 2 | 
            +
            language: ces
         | 
| 3 | 
            +
            license: cc-by-4.0
         | 
| 4 | 
            +
            tags:
         | 
| 5 | 
            +
            - word2vec
         | 
| 6 | 
            +
            datasets: Czech_CoNLL17_corpus
         | 
| 7 | 
            +
            ---
         | 
| 8 | 
            +
             | 
| 9 | 
            +
            ## Information
         | 
| 10 | 
            +
            A word2vec model trained by Andrey Kutuzov (andreku@ifi.uio.no) on the `Czech_CoNLL17_corpus` dataset (2113686735 tokens), with a vocabulary of 1767815 words.
         | 
| 11 | 
            +
            The model was trained without lemmatization or part-of-speech tagging, using the Word2Vec Continuous Skip-gram algorithm with a window size of 10 and a vector dimension of 100.
         | 
| 12 | 
            +
             | 
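To make the "window of 10" setting concrete, here is a minimal sketch of how Continuous Skip-gram forms (center, context) training pairs. It is an illustration only, not the training code used for this model: the function name `skipgram_pairs` and the example sentence are hypothetical, and it uses a fixed symmetric window, whereas word2vec implementations typically sample a smaller effective window at each position.

```python
def skipgram_pairs(tokens, window=10):
    """Yield (center, context) pairs using a fixed symmetric window.

    Simplified illustration: real word2vec implementations usually
    shrink the window randomly per position and subsample frequent words.
    """
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # leftmost context index
        hi = min(len(tokens), i + window + 1)   # one past the rightmost index
        for j in range(lo, hi):
            if j != i:                          # skip the center word itself
                yield center, tokens[j]


# Hypothetical example sentence (surface forms, no lemmatization).
sentence = ["praha", "je", "hlavní", "město", "české", "republiky"]
pairs = list(skipgram_pairs(sentence, window=2))
print(pairs[:4])
```

With `window=10`, as used for this model, each token is paired with up to 10 neighbors on either side; the model then learns 100-dimensional vectors that help predict those neighbors.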
| 13 | 
            +
            ## How to use?
         | 
| 14 | 
            +
            ```
         | 
| 15 | 
            +
            from gensim.models import KeyedVectors
         | 
| 16 | 
            +
            from huggingface_hub import hf_hub_download
         | 
| 17 | 
            +
            model = KeyedVectors.load_word2vec_format(hf_hub_download(repo_id="Word2vec/nlpl_37", filename="model.bin"), binary=True, unicode_errors="ignore")
         | 
| 18 | 
            +
            ```
         | 
| 19 | 
            +
             | 
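Once the vectors are loaded, a typical next step is a nearest-neighbor query. The helper below is a small sketch, assuming gensim 4.x, where `KeyedVectors` exposes `key_to_index` and `most_similar`. The name `nearest_neighbors` is hypothetical. It checks the vocabulary first because the model was trained on unlemmatized surface forms, so a given query word may simply be absent.

```python
def nearest_neighbors(vectors, word, topn=5):
    """Return up to `topn` (word, cosine similarity) pairs for `word`.

    `vectors` is expected to behave like gensim's KeyedVectors (4.x API).
    Returns an empty list when the word is out of vocabulary instead of
    raising KeyError.
    """
    if word not in vectors.key_to_index:
        return []
    return vectors.most_similar(word, topn=topn)
```

With the `model` object from the snippet above, `nearest_neighbors(model, "praha")` would return the closest vocabulary entries, or an empty list if that exact form is not in the vocabulary. The query word here is only an example.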
| 20 | 
            +
            ## Citation
         | 
| 21 | 
            +
            Fares, Murhaf; Kutuzov, Andrei; Oepen, Stephan & Velldal, Erik (2017). Word vectors, reuse, and replicability: Towards a community repository of large-text resources, In Jörg Tiedemann (ed.), Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017. Linköping University Electronic Press. ISBN 978-91-7685-601-7
         | 
| 22 | 
            +
             | 
| 23 | 
            +
            This archive is part of the NLPL Word Vectors Repository (http://vectors.nlpl.eu/repository/), version 2.0, published on Friday, December 27, 2019.
         | 
| 24 | 
            +
            Please see the file 'meta.json' in this archive and the overall repository metadata file http://vectors.nlpl.eu/repository/20.json for additional information.
         | 
| 25 | 
            +
            The life-time identifier for this model is: http://vectors.nlpl.eu/repository/20/37.zip
         | 
