This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the touch-rugby-modernbert-pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
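The architecture above applies mean pooling over token embeddings (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. The following is a minimal pure-Python sketch of those two steps on toy 2-dimensional vectors. The helper names `mean_pool` and `l2_normalize` are illustrative and are not part of the library; the real model does this on 768-dimensional tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    # Guard against an all-padding input.
    return [t / max(count, 1) for t in total]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final step normalizes to unit length, the dot product of two embeddings from this model equals their cosine similarity.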
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Trelis/modernbert-embed-base-touch-rugby-ft")
# Run inference
sentences = [
'Is there a time-out for injuries during a standard Touch Rugby match?',
'4.9\tReferees and players may wear spectacles or sunglasses provided they are safe \nand securely attached.4.10\tReferees and players may wear sport monitoring equipment and medical \nsupports such as knee or ankle braces provided, at the sole discretion of \ncompetition’s controlling body, the items are not dangerous.5\u2002 Team Composition \n5.1\tA Team consists of a maximum of 14 players, no more than six (6) of whom are \nallowed on the field at any time.FIT Playing Rules - 5th Edition\n6\nCOPYRIGHT © Touch Football Australia 2020\nRuling = A Penalty awarded to the non-offending Team at the time the offence is identified \nseven (7) metres infield on the Halfway Line or the position of the ball, whichever is the \ngreater Advantage.5.2\tA Team must have a minimum of four (4) players on the field for a match to \ncommence or continue, except during a Drop-Off.',
'10.10\tIf a player in Possession intentionally makes a Touch on an Offside defender \nwho is making every effort to retire and remain out of play, the Touch counts.FIT Playing Rules - 5th Edition\nCOPYRIGHT © Touch Football Australia 2020\n9\n10.11\tIf a Touch is made on a player in Possession while the player is juggling the ball \nin an attempt to maintain control of it, the Touch counts if the attacking player \nfollowing the Touch retains Possession.10.12\tIf a player in Possession is Touched and subsequently makes contact with \neither the Sideline, a field marker or the ground outside the Field of Play, the \nTouch counts and play continues with a Rollball at the Mark where the Touch \noccurred.10.13\tWhen a player from the Defending Team enters its defensive Seven Metre Zone, \nthe Defending Team must move Forward at a reasonable pace until a Touch is \nImminent or made.Ruling = A Penalty to the Attacking Team at the point of the Infringement.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
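For retrieval, the first sentence above plays the role of a query and the other two are candidate rule chunks. The sketch below shows one way to rank chunks by similarity to a query, using hand-made unit vectors in place of real embeddings so that it runs without downloading the model. `rank_chunks` is a hypothetical helper, not a library function.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_index, score) pairs sorted from most to least similar."""
    scores = [dot(query_vec, c) for c in chunk_vecs]
    order = sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy unit vectors standing in for model.encode(...) output.
query = [1.0, 0.0]
chunks = [[0.6, 0.8], [0.8, 0.6], [0.0, 1.0]]

ranking = rank_chunks(query, chunks)
for index, score in ranking:
    print(index, round(score, 3))
```

With the real model, the same ranking could be obtained from `model.similarity(embeddings[:1], embeddings[1:])`, which scores the query against each chunk.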
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |

| question | related_chunk |
|---|---|
| What is defined as 'Forward' in the context of Touch Rugby? | Referee |
| What is the numerical difference between the teams on the field of play during a Drop-Off if a player has been sent to the sin bin? | 24.5 For the avoidance of doubt for clauses 24.3 and 24.4 the non-offending Team |
| What happens if neither team is leading after two minutes of play in a Drop-Off? | 24.1.5 Should neither Team be leading at the expiration of two (2) minutes, a |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
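MultipleNegativesRankingLoss treats each (question, related_chunk) pair as a positive and uses the other chunks in the same batch as negatives (Henderson et al., 2017). The sketch below is a pure-Python illustration of that idea under the parameters listed above: cosine similarities are multiplied by `scale = 20.0` and fed to a cross-entropy loss whose target is the matching chunk. The function names are illustrative, and this is not the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: two questions and two chunks as 2-d vectors.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned_chunks = [[1.0, 0.0], [0.0, 1.0]]   # each question matches its own chunk
swapped_chunks = [[0.0, 1.0], [1.0, 0.0]]   # each question matches the other chunk

loss_aligned = mnr_loss(questions, aligned_chunks)
loss_swapped = mnr_loss(questions, swapped_chunks)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when every question is most similar to its own chunk, and grows toward `scale` times the similarity gap when a different chunk in the batch scores higher.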
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |

| question | related_chunk |
|---|---|
| What is the penalty for an attacking player obstructing a defender in Touch Rugby? | Ruling = A Penalty to the Attacking Team at the point of the Infringement or on the seven (7) |
| When must a player perform a Rollball seven metres in-field? | 13.5 A player may only perform a Rollball at the Mark under the following |
| What is the ruling if a player uses excessive force when making a touch? | FIT Playing Rules - 5th Edition |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- learning_rate: 2e-05
- num_train_epochs: 1
- lr_scheduler_type: cosine
- warmup_ratio: 0.3

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.3
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.2222 | 2 | 2.7671 | nan |
| 0.4444 | 4 | 0.0 | nan |
| 0.6667 | 6 | 0.0 | nan |
| 0.8889 | 8 | 0.0 | nan |
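The hyperparameters combine `learning_rate: 2e-05`, `lr_scheduler_type: cosine`, and `warmup_ratio: 0.3`. The sketch below shows the general shape of such a schedule: a linear ramp up to the peak rate, then a half-cosine decay to zero. The total of 9 optimizer steps is an assumption inferred from the log table (step 2 corresponds to epoch 0.2222), and the rounding of the warmup length is also assumed; the trainer's own scheduler may differ in detail.

```python
import math

PEAK_LR = 2e-5       # learning_rate from the hyperparameters
WARMUP_RATIO = 0.3   # warmup_ratio from the hyperparameters
TOTAL_STEPS = 9      # assumption: inferred from step 2 == epoch 0.2222

# Assumption: warmup length is the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at(s) for s in range(TOTAL_STEPS + 1)]
for step, lr in enumerate(schedule):
    print(step, f"{lr:.2e}")
```

Under these assumptions the rate peaks at step 3 and then falls monotonically until the final step.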
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base