Hugging Face
Catherine Arnett
catherinearnett
114 followers · 39 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
liked a dataset 2 days ago: tylerxdurden/pindorama-corpus
published a dataset 5 days ago: catherinearnett/trilingual-tokenizers
published a dataset 5 days ago: catherinearnett/trilingual-tokenizer-data
catherinearnett's activity
liked a dataset 2 days ago
tylerxdurden/pindorama-corpus • Updated 18 days ago • 2.06k • 109 • 2
liked a dataset about 1 month ago
mrlbenchmarks/global-piqa-parallel • Updated 19 days ago • 13.5k • 1.76k • 9
liked a dataset 3 months ago
commoncrawl/CommonLID • Updated Feb 10 • 373k • 241 • 52
liked 4 datasets 4 months ago
aaparajit02/punjabi-asr • Updated Jul 23, 2023 • 39.2k • 531 • 3
aznlp/azerbaijani-blogs • Updated Apr 14, 2024 • 6.93k • 17 • 3
MWirelabs/assamese-monolingual-corpus • Updated Nov 13, 2025 • 1.61M • 15 • 1
Atnafu/Afri-MCQA • Updated Jan 15 • 15.3k • 527 • 18
liked a dataset 7 months ago
mrlbenchmarks/global-piqa-nonparallel • Updated 19 days ago • 13.5k • 9.06k • 35
liked a dataset 8 months ago
nlip/DIWALI • Updated Apr 20 • 8.82k • 111 • 6
liked 4 datasets 11 months ago
classla/ParlaSpeech-PL • Updated Jul 2, 2025 • 531k • 161 • 6
classla/ParlaSpeech-HR • Updated Jul 2, 2025 • 868k • 3.86k • 5
classla/ParlaSpeech-CZ • Updated Jul 2, 2025 • 711k • 4.72k • 5
classla/ParlaSpeech-RS • Updated Dec 1, 2025 • 278k • 1.16k • 4
liked a dataset 12 months ago
filbench/UD_Tagalog-NewsCrawl • Updated Jul 23, 2025 • 15.6k • 67 • 1
liked a dataset about 1 year ago
jumelet/multiblimp • Updated May 16, 2025 • 121k • 6.22k • 17
liked a dataset almost 2 years ago
ambean/lingOly • Updated Jun 11, 2024 • 90 • 6.12k • 9