Decision Workspace
segtok vs text-splitter vs charabia
Side-by-side comparison of Rust crates
- segtok v0.1.5 (health score 46, growing): Sentence segmentation and word tokenization tools.
- text-splitter v0.30.1 (health score 62, growing): Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python.
- charabia v0.9.9 (health score 60, growing): A simple library to detect the language, tokenize the text and normalize the tokens.
Core Metrics
| Metric | segtok | text-splitter | charabia |
|---|---|---|---|
| Health Score | 46 | 62 | 60 |
| Total Downloads | 646.7K | 1.4M | 914.9K |
| 30d Downloads | 175.5K | 150.5K | 85.6K |
| Dependents | 10 | 782 | 145 |
| Releases | 6 | 62 | 31 |
| Last Updated | 461d ago | 38d ago | 182d ago |
| Age | 1y 4m | 3y | 4y |
Health Breakdown
| Category | segtok | text-splitter | charabia |
|---|---|---|---|
| Maintenance | 8 | 17 | 14 |
| Quality | 13 | 13 | 13 |
| Community | 8 | 13 | 13 |
| Popularity | 7 | 7 | 7 |
| Documentation | 10 | 12 | 13 |
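For each crate, the five category scores listed above add up to the health score reported under Core Metrics (for example, 8 + 13 + 8 + 7 + 10 = 46 for segtok). The sketch below checks that arithmetic. It assumes the overall score is a plain, unweighted sum of the categories, which the numbers are consistent with but the page does not state; the `Breakdown` struct and function names are hypothetical and exist only for this illustration.

```rust
/// Category scores copied from the breakdown, in the order:
/// maintenance, quality, community, popularity, documentation.
struct Breakdown {
    name: &'static str,
    categories: [u32; 5],
    reported_total: u32,
}

/// The three crates with the values shown on this page.
fn breakdowns() -> Vec<Breakdown> {
    vec![
        Breakdown { name: "segtok", categories: [8, 13, 8, 7, 10], reported_total: 46 },
        Breakdown { name: "text-splitter", categories: [17, 13, 13, 7, 12], reported_total: 62 },
        Breakdown { name: "charabia", categories: [14, 13, 13, 7, 13], reported_total: 60 },
    ]
}

/// Assumption: the health score is the unweighted sum of the five categories.
fn summed_score(b: &Breakdown) -> u32 {
    b.categories.iter().sum()
}

fn main() {
    for b in breakdowns() {
        let total = summed_score(&b);
        println!("{}: summed {} vs reported {}", b.name, total, b.reported_total);
    }
}
```

Under that assumption, the gap between segtok and the other two crates comes from the Maintenance, Community, and Documentation categories, since Quality and Popularity are identical across all three.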
Technical Details
| Detail | segtok | text-splitter | charabia |
|---|---|---|---|
| Version | 0.1.5 | 0.30.1 | 0.9.9 |
| Stable (≥1.0) | ✗ No | ✗ No | ✗ No |
| License | MIT | MIT | MIT |
| Dependencies | 8 | 21 | 18 |
| Crate Size | 36KB | 60KB | 1.1MB |
| Features | 0 | 4 | 20 |
| Yanked % | 0.0% | 1.6% | 3.2% |
| Edition | 2021 | 2021 | 2021 |
| MSRV | — | 1.86.0 | — |
| Owners | 1 | 1 | 2 |
Quick Verdict
- text-splitter leads with a health score of 62/100, but none of the three crates scores above 80.
- text-splitter has 782 dependent crates, far more than charabia (145) or segtok (10).
- ⚠ segtok was last updated 461 days ago, more than a year.
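The verdict points above can be reproduced from the Core Metrics table with a few simple rules. The sketch below is one possible reading, not the site's actual logic: the 80-point bar and the one-year (365-day) staleness cutoff are taken from the wording of the bullets, while the `CrateStats` struct and all function names are hypothetical.

```rust
/// Metrics copied from the Core Metrics table.
struct CrateStats {
    name: &'static str,
    health: u32,
    dependents: u32,
    days_since_update: u32,
}

fn listed_crates() -> Vec<CrateStats> {
    vec![
        CrateStats { name: "segtok", health: 46, dependents: 10, days_since_update: 461 },
        CrateStats { name: "text-splitter", health: 62, dependents: 782, days_since_update: 38 },
        CrateStats { name: "charabia", health: 60, dependents: 145, days_since_update: 182 },
    ]
}

/// Crate with the highest health score.
fn leader(stats: &[CrateStats]) -> Option<&CrateStats> {
    stats.iter().max_by_key(|c| c.health)
}

/// Crate with the most dependents.
fn most_depended_on(stats: &[CrateStats]) -> Option<&CrateStats> {
    stats.iter().max_by_key(|c| c.dependents)
}

/// True if any crate scores strictly above the threshold.
fn any_above(stats: &[CrateStats], threshold: u32) -> bool {
    stats.iter().any(|c| c.health > threshold)
}

/// Names of crates whose last update is older than `max_days`.
fn stale(stats: &[CrateStats], max_days: u32) -> Vec<&'static str> {
    stats
        .iter()
        .filter(|c| c.days_since_update > max_days)
        .map(|c| c.name)
        .collect()
}

fn main() {
    let all = listed_crates();
    if let Some(best) = leader(&all) {
        println!("highest health score: {} ({}/100)", best.name, best.health);
    }
    if let Some(top) = most_depended_on(&all) {
        println!("most dependents: {} ({})", top.name, top.dependents);
    }
    println!("any crate above 80: {}", any_above(&all, 80));
    println!("not updated in over a year: {:?}", stale(&all, 365));
}
```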