---
tags:
- ontology-embedding
- hyperbolic-space
- hierarchical-reasoning
- biomedical-ontology
- generated_from_trainer
- dataset_size:150000
- loss:HierarchyTransformerLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: cellular response to stimulus
  sentences:
  - response to stimulus
  - medial transverse frontopolar gyrus
  - biological regulation
- source_sentence: >-
    regulation of cell differentiation involved in embryonic placenta
    development
  sentences:
  - thoracic wall
  - ectoderm-derived structure
  - regulation of cell differentiation
- source_sentence: regulation of hippocampal neuron apoptotic process
  sentences:
  - external genitalia morphogenesis
  - compact layer of ventricle
  - biological regulation
- source_sentence: transitional myocyte of internodal tract
  sentences:
  - secretory epithelial cell
  - internodal tract myocyte
  - insect haltere disc
- source_sentence: alveolar atrium
  sentences:
  - organ part
  - superior recess of lesser sac
  - foramen of skull
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# OnT: Language Models as Ontology Encoders
This is an OnT (Ontology Transformer) model trained on the GO dataset, based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). OnT is a language-model-based framework for ontology embedding: it represents concepts as points in hyperbolic space and models axioms as hierarchical relationships between those points.
## Model Details

### Model Description
- Model Type: Ontology Transformer (OnT)
- Base model: sentence-transformers/all-mpnet-base-v2
- Training Dataset: GO
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Embedding Space: Hyperbolic Space
- Key Features:
  - Hyperbolic embeddings for ontology concept encoding (a minimal distance sketch follows this list)
  - Modeling of hierarchical relationships between concepts
  - Support for role embeddings as rotations over hyperbolic spaces
  - Concept rotation, transition, and existential quantifier representation
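To make the geometry concrete, here is a minimal, self-contained sketch of the geodesic distance in a Poincaré ball, the standard distance used when scoring hierarchical relatedness in this family of models. The unit ball with curvature -1 is an assumption for illustration; the curvature actually used by the released model is fixed by its training configuration.

```python
import torch

def poincare_distance(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Geodesic distance in a unit Poincare ball (curvature -1 assumed).

    x and y must have Euclidean norm < 1. The curvature here is an
    illustrative choice, not necessarily the released model's setting.
    """
    sq_dist = (x - y).pow(2).sum(-1)
    denom = (1 - x.pow(2).sum(-1)) * (1 - y.pow(2).sum(-1))
    gamma = 1 + 2 * sq_dist / denom
    # clamp guards against gamma dipping below 1 due to rounding
    return torch.acosh(gamma.clamp_min(1.0 + 1e-7))

# Toy 2-D points: in hyperbolic hierarchy embeddings, more specific
# concepts tend to sit nearer the ball's boundary than general ones.
child = torch.tensor([0.70, 0.10])
parent = torch.tensor([0.35, 0.05])
print(poincare_distance(child, parent))
```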
### Model Sources

- Repository: OnT on GitHub
- Paper: [Language Models as Ontology Encoders](https://arxiv.org/abs/2507.14334)
### Available Versions
This model is available in 4 versions (Git branches) to suit different use cases:
| Branch | Training Type | Role Embedding | Use Case |
|---|---|---|---|
| `main` (default) | Prediction Dataset | ✅ With role embedding | Default version: trained on the prediction dataset, with role embedding |
| `role-free` | Prediction Dataset | ❌ Without role embedding | Trained on the prediction dataset, without role embedding |
| `inference-default` | Inference Dataset | ✅ With role embedding | Trained on the inference dataset, with role embedding |
| `inference-role-free` | Inference Dataset | ❌ Without role embedding | Trained on the inference dataset, without role embedding |
How to use different versions:
```python
from OnT import OntologyTransformer

# Default version (main branch - OnTr with role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go")

# Role-free version (without role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="role-free")

# Inference version with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-default")

# Inference version without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go", revision="inference-role-free")
```
### Full Model Architecture
```
OntologyTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
## Usage

### Installation

First, install the required dependencies:
```bash
pip install sentence-transformers==3.4.0.dev0
```
You also need to install HierarchyTransformers following the instructions in its repository; a typical install command is sketched below.
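Assuming the package is published on PyPI under the name used in the HierarchyTransformers repository (worth verifying against its README), this is typically:

```bash
pip install hierarchy_transformers
```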
### Direct Usage

Load the model and use it for ontology concept encoding:
```python
import torch
from OnT import OntologyTransformer

# Load the OnT model
path = "Hui97/OnT-MPNet-go"
ont = OntologyTransformer.from_pretrained(path)

# Entity names to be encoded
entity_names = [
    'alveolar atrium',
    'organ part',
    'superior recess of lesser sac',
]

# Get the entity embeddings in hyperbolic space
entity_embeddings = ont.encode_concept(entity_names)
print(entity_embeddings.shape)
# [3, 768]

# Role sentences to be encoded
role_sentences = [
    "application attribute",
    "attribute",
    "chemical modifier",
]

# Get the role embeddings (rotations and scalings)
role_rotations, role_scalings = ont.encode_roles(role_sentences)
```
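As an illustration of downstream use, the sketch below ranks candidate parents for a query concept by their Poincaré distance to it, reusing the arccosh formula from the earlier sketch. The curvature -1 unit ball is again an assumption, and OnT's actual subsumption scoring may combine distance with other geometric signals (e.g., embedding norms), so treat this purely as a heuristic illustration rather than the model's own prediction method.

```python
import torch
from OnT import OntologyTransformer

ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-go")

candidate_names = ["organ part", "superior recess of lesser sac", "foramen of skull"]

# encode_concept may return arrays; torch.as_tensor is a no-op on tensors
query = torch.as_tensor(ont.encode_concept(["alveolar atrium"]))   # [1, 768]
candidates = torch.as_tensor(ont.encode_concept(candidate_names))  # [3, 768]

# Same arccosh Poincare-distance formula as in the sketch above
# (unit ball, curvature -1 assumed for illustration).
gamma = 1 + 2 * (query - candidates).pow(2).sum(-1) / (
    (1 - query.pow(2).sum(-1)) * (1 - candidates.pow(2).sum(-1))
)
dists = torch.acosh(gamma.clamp_min(1.0 + 1e-7))  # [3]

# Smaller hyperbolic distance = more plausible parent under this heuristic
for name, d in zip(candidate_names, dists):
    print(f"{name}: {d.item():.4f}")
```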
## Citation

### BibTeX

If you use this model, please cite:
```bibtex
@article{yang2025language,
  title={Language Models as Ontology Encoders},
  author={Yang, Hui and Chen, Jiaoyan and He, Yuan and Gao, Yongsheng and Horrocks, Ian},
  journal={arXiv preprint arXiv:2507.14334},
  year={2025}
}
```