Update README.md
README.md
CHANGED

@@ -317,6 +317,24 @@ In this example:
 
 ### BibTeX
 
+### JobBERT-v2 paper
+Please cite this paper when using JobBERT-v2:
+```bibtex
+@article{01K47W55SG7ZRKFG431ESRXC35,
+  abstract = {{Labor market analysis relies on extracting insights from job advertisements, which provide valuable yet unstructured information on job titles and corresponding skill requirements. While state-of-the-art methods for skill extraction achieve strong performance, they depend on large language models (LLMs), which are computationally expensive and slow. In this paper, we propose ConTeXT-match, a novel contrastive learning approach with token-level attention that is well-suited for the extreme multi-label classification task of skill classification. ConTeXT-match significantly improves skill extraction efficiency and performance, achieving state-of-the-art results with a lightweight bi-encoder model. To support robust evaluation, we introduce Skill-XL, a new benchmark with exhaustive, sentence-level skill annotations that explicitly address the redundancy in the large label space. Finally, we present JobBERT V2, an improved job title normalization model that leverages extracted skills to produce high-quality job title representations. Experiments demonstrate that our models are efficient, accurate, and scalable, making them ideal for large-scale, real-time labor market analysis.}},
+  author = {{Decorte, Jens-Joris and Van Hautte, Jeroen and Develder, Chris and Demeester, Thomas}},
+  issn = {{2169-3536}},
+  journal = {{IEEE ACCESS}},
+  keywords = {{Taxonomy, Contrastive learning, Training, Annotations, Benchmark testing, Training data, Large language models, Computational efficiency, Accuracy, Terminology, Labor market analysis, text encoders, skill extraction, job title normalization}},
+  language = {{eng}},
+  pages = {{133596--133608}},
+  title = {{Efficient text encoders for labor market analysis}},
+  url = {{http://doi.org/10.1109/ACCESS.2025.3589147}},
+  volume = {{13}},
+  year = {{2025}},
+}
+```
+
 #### Sentence Transformers
 ```bibtex
 @inproceedings{reimers-2019-sentence-bert,