A multilingual news corpus built from Common Crawl CC-News, indexed and queryable in milliseconds, cleaned and enriched with language and topic id
Ruggero Marino Lazzaroni
ruggsea
AI & ML interests
NLP in any form
Recent Activity
liked a model 2 days ago: Eriskii/LFM2.5-8B-A1B-Multichannel
updated a dataset 5 days ago: ruggsea/social-sim-bench-gens
published a dataset 26 days ago: ruggsea/social-sim-bench-gens