NeoAraBERT: A Modern Foundation Model for Arabic Embeddings with Diacritics-Aware Tokenization and POS-Targeted Masking
AI & ML interests
ML, NLP, CL for Arabic
Organization Card
Unit For Research Studies in Arabic and Social Digital Spaces
models 11
U4RASD/NeoAraBERT_DA
Feature Extraction • 0.3B • Updated • 114 • 4
U4RASD/NeoAraBERT_MSA
Feature Extraction • 0.3B • Updated • 124 • 4
U4RASD/NeoAraBERT
Feature Extraction • 0.3B • Updated • 231 • 5
U4RASD/AREEj
0.6B • Updated • 18 • 1
U4RASD/ar-ms-baseline
Text Generation • Updated • 2 • 1
U4RASD/dalla-model-training
Updated
U4RASD/dalla-gemma-it
9B • Updated • 336
U4RASD/dalla-llama-it
8B • Updated
U4RASD/ArATTC
Text Classification • 0.1B • Updated • 13
U4RASD/ArGTC
Text Classification • Updated • 12
datasets 9
U4RASD/Muradif
Viewer • Updated • 38.6k • 80 • 4
U4RASD/omar-al-saleh-manuscripts-full
Viewer • Updated • 22 • 13 • 2
U4RASD/omar-al-saleh-manuscripts-segments
Viewer • Updated • 20.7k • 12
U4RASD/Masrad
Viewer • Updated • 19.4k • 9 • 1
U4RASD/curriculum_books_sft
Viewer • Updated • 1.36k • 7
U4RASD/curriculum_books_cpt
Viewer • Updated • 887 • 44
U4RASD/ArSRED
Viewer • Updated • 500k • 14 • 1
U4RASD/ArTopicDS-Books
Viewer • Updated • 21.2k • 12 • 2
U4RASD/ArBNTopic
Viewer • Updated • 19.8k • 14 • 3