mena-open-data's Collections
Arabic NLP datasets
lightonai/nanobeir-multilingual • Viewer • 522k • 559 • 11
Omartificial-Intelligence-Space/Arabic-NLi-Pair-Class • Viewer • 981k • 423 • 2
malaysia-ai/Multilingual-TTS • Viewer • 34.2M • 1.61k • 13
opendatalab/WanJuanSiLu-Multimodal-5Languages • Preview • 87 • 3
LLaMAX/BenchMAX_Function_Completion • Viewer • 2.79k • 1.04k • 1
MLCommons/ml_spoken_words • 1.99k • 34
Twitter/HashtagPrediction • Viewer • 1.07M • 150 • 2
vg055/SemEval2025_Task11_TrackA • Viewer • 2k • 8
sarulab-speech/commonvoice22_sidon • Viewer • 15.1M • 1.35k • 13
ToxicityPrompts/PolyGuardMix • Viewer • 1.91M • 273 • 4
linagora/linto-dataset-audio-ar-tn • Viewer • 37.3k • 1.88k • 13
fr3on/election-questions-arabic • Viewer • 1.49k • 46
papluca/language-identification • Viewer • 90k • 3.14k • 61
vincentkoc/tiny_qa_benchmark_pp • Viewer • 662 • 454 • 2
s-nlp/EverGreen-Multilingual • Viewer • 4.76k • 53 • 1
camel-ai/ai_society_translated • Preview • 112 • 16
LLaMAX/BenchMAX_Problem_Solving • Viewer • 12.1k • 582 • 1
alexandrainst/multi-wiki-qa • Viewer • 1.22M • 2.16k • 21
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_nobots • Viewer • 4.68k • 43 • 3
Melaraby/EvArEST-dataset-for-Arabic-scene-text-recognition • Viewer • 296k • 92
mozilla-foundation/common_voice_17_0 • 2.5k • 2
suchirsalhan/Phonemized-UD • Viewer • 1.19M • 2.01k
LLMXperts/Arabic-NLi-Triplet • Viewer • 571k • 40
adithya7/xlel_wd_dictionary • Viewer • 230k • 1.31k • 3
SaiedAlshahrani/Detect-Egyptian-Wikipedia-Articles • Viewer • 756k • 560 • 1
Omartificial-Intelligence-Space/Arabic-NLi-Pair • Viewer • 328k • 85 • 4
aida-ugent/llm-ideology-analysis • Viewer • 315k • 469 • 4
tellarin-ai/ntx_llm_instructions • Viewer • 5.98k • 118
UBC-NLP/nilechat-arabizi-mor • Viewer • 1.45M • 25 • 2
CohereLabs/include-lite-44 • Viewer • 10.8k • 1.04k • 14
JQL-AI/JQL-Human-Edu-Annotations • Viewer • 20.4k • 479 • 5
CohereLabs/fusion-pairwise-evals-finetuned • Viewer • 5.25k • 21
faisaltareque/XL-HeadTags • Viewer • 415k • 60 • 3
CohereLabs/fusion-synth-data-ufb • Viewer • 94.7k • 33 • 1
QCRI/AraDICE-ArabicMMLU-egy • Viewer • 14.5k • 1.5k • 1
ClusterlabAi/101_billion_arabic_words_dataset • Viewer • 33.1M • 1.08k • 69
omar-emad/financesecondtrial • Viewer • 30 • 8
CohereLabs/deja-vu-pairwise-evals • 30 • 3
kaust-generative-ai/fineweb-edu-ar • Viewer • 363M • 327 • 13
UBC-NLP/nilechat-arabizi-egy • Viewer • 572k • 29
KFUPM-JRCAI/arabic-generated-abstracts • Viewer • 8.39k • 781
badrex/ALDi-predictions-MADIS5 • Viewer • 263 • 7
CohereLabs/include-base-44 • Viewer • 23k • 6.6k • 42
CohereLabs/m-ArenaHard-v2.0 • Viewer • 11.5k • 296 • 5
ToxicityPrompts/PolyGuardPrompts • Viewer • 29.3k • 142 • 2
SaiedAlshahrani/Egyptian_Arabic_Wikipedia_20230101 • Viewer • 728k • 57 • 4
QCRI/AraDICE-ArabicMMLU-lev • Viewer • 14.5k • 1.44k
CohereLabsCommunity/afri-aya • Viewer • 2.47k • 163 • 11
Omar-youssef/Egyptian-text-summarization • Viewer • 3.69k • 25
jonathanmutal/Medical-Questionnaire-Multilingual-Translation • Preview • 18
CohereLabs/Global-MMLU-Lite • Viewer • 10.9k • 5.73k • 28
MBZUAI/speecht5_tts_clartts_ar • Text-to-Speech • 1.6k • 24
LLaMAX/BenchMAX_General_Translation • Viewer • 228k • 677
abdullah-alamodi/aqeedah-rag-dataset • Viewer • 5.42k • 22 • 1
sboughorbel/arabic-web-edu-seed • Viewer • 236k • 82 • 3
amphora/Open-R1-Mulitlingual-SFT • Viewer • 128k • 93 • 3
SaiedAlshahrani/Moroccan_Arabic_Wikipedia_20230101_bots • Viewer • 5.4k • 31
brighter-dataset/BRIGHTER-emotion-intensities • Viewer • 41.2k • 569 • 3
LLaMAX/BenchMAX_Domain_Translation • Viewer • 47.3k • 665
LLaMAX/BenchMAX_Rule-based • Viewer • 7.29k • 652 • 2
ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks • Paper • 2511.10090
Omar-youssef/islamic-qa-egyptian-arabic • Viewer • 7.47k • 31
alconost/alconost-multilingual-speech-en-ja-ar-pl-v1 • Viewer • 280 • 41
LLaMAX/BenchMAX_Question_Answering • Viewer • 17 • 68
2A2I/Arabic-OpenHermes-2.5 • Viewer • 982k • 388 • 20
FreedomIntelligence/ApolloMoEDataset • Viewer • 293k • 163 • 5
SaiedAlshahrani/Arabic_Wikipedia_20230101_bots • Viewer • 1.09M • 68 • 1
UBC-NLP/palmx_2025_subtask1_culture • Viewer • 4.5k • 63 • 1
UBC-NLP/nilechat-fw-edu-egy • Viewer • 5.52M • 58 • 2
LLaMAX/BenchMAX_Model-based • Viewer • 8.5k • 234
Raniahossam33/Arabic_cultural_dataset • Viewer • 12.1k • 6 • 2
visheratin/laion-coco-nllb • Viewer • 894k • 1.42k • 44
obadx/recitation-segmentation-augmented • Viewer • 64.6k • 432
rabah2026/Quran-Ayah-Corpus • Viewer • 263k • 970 • 1
omar-emad/FinanceTripletSecond • Viewer • 30 • 14
UBC-NLP/palmx_2025_subtask2_islamic • Viewer • 1.9k • 14
rubricreward/m-reward-bench • Viewer • 66k • 24
Fujitsu-FRE/MAPS_Verified • Viewer • 3.05k • 3.78k • 2
LLaMAX/BenchMAX_Multiple_Functions • Viewer • 5.41k • 173
Fumika/Wikinews-multilingual • Viewer • 15.2k • 74 • 7
Omartificial-Intelligence-Space/awesome_chatgpt_prompts_ar • Viewer • 201 • 29 • 1
mrlbenchmarks/global-piqa-nonparallel • Viewer • 11.6k • 2.64k • 29
NAMAA-Space/QariOCR-v0.3-markdown-mixed-dataset • Viewer • 37k • 192 • 9
m0pper/Small-Multilingual-Corpora • Viewer • 7.61M • 136
haoranxu/X-ALMA-Preference • Viewer • 772k • 160 • 6
SaiedAlshahrani/Arabic_Wikipedia_20230101_nobots • Viewer • 847k • 160 • 2
vgaraujov/semeval-2025-task11-track-c • Viewer • 57.3k • 196
brighter-dataset/BRIGHTER-emotion-categories • Viewer • 140k • 1.47k • 13
lukasellinger/homonym-mcl-wic • Viewer • 1.61k • 21
HeshamHaroon/Arabic_Function_Calling • Viewer • 50.8k • 197 • 55