Datasets: tiiuae/falcon-refinedweb

Tasks: Text Generation
Languages: English
Size Categories: 100B<n<1T
ArXiv:
DOI:
License: odc-by

Delete data/train-00469-of-05534-091b605405757e80.parquet

#12 opened 9 days ago by CD8Effector

Delete data/train-00469-of-05534-091b605405757e80.parquet

#11 opened 13 days ago by ibnsultan

Hi there! How is this released RefinedWeb sampled from your full corpus?

1
#10 opened 3 months ago by ChangranHuuu

Was any attempt made to remove copyrighted material?

1
#9 opened 3 months ago by umm-maybe

Can you open-source your preprocessing code?

#8 opened 3 months ago by Forceless

Update README.md

#7 opened 3 months ago by harshit-gupta

Error when loading data after git lfs download

2
#6 opened 4 months ago by keremturgutlu