Decision Workspace
unobtanium-segmenter vs charabia vs language-tokenizer
Side-by-side comparison of Rust crates
unobtanium-segmenter
Health score: 51 · experimental · v0.5.2
A text segmentation toolbox for search applications inspired by charabia and tantivy.
charabia
Health score: 62 · growing · v0.9.9
A simple library to detect the language, tokenize the text and normalize the tokens
language-tokenizer
Health score: 45 · experimental · v0.1.0
Text tokenizer for linguistic purposes, such as text matching. Supports more than 40 languages, including English, French, Russian, Japanese, Thai etc.
Core Metrics
| Metric | unobtanium-segmenter | charabia | language-tokenizer |
|---|---|---|---|
| Health Score | 51 | 62 | 45 |
| Total Downloads | 2.3K | 761.2K | 447 |
| 30d Downloads | 64 | 81.1K | 134 |
| Dependents | 3 | 140 | 2 |
| Releases | 9 | 31 | 1 |
| Last Updated | 33d ago | 123d ago | 79d ago |
| Age | 9m | 3y 11m | 2m |
Health Breakdown
| Category | unobtanium-segmenter | charabia | language-tokenizer |
|---|---|---|---|
| Maintenance | 17 | 16 | 9 |
| Quality | 13 | 13 | 16 |
| Community | 7 | 13 | 4 |
| Popularity | 4 | 7 | 3 |
| Documentation | 10 | 13 | 13 |
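For each crate, the five sub-scores above add up to its overall health score (51, 62, and 45). The sketch below checks that arithmetic. It assumes the overall score is a plain, unweighted sum, which fits these three data points but is not stated on the page; the `HealthBreakdown` type and `breakdowns` function are hypothetical names used only for illustration.

```rust
/// Sub-scores for one crate, as listed in the health breakdown.
/// Hypothetical type for illustration only.
struct HealthBreakdown {
    maintenance: u32,
    quality: u32,
    community: u32,
    popularity: u32,
    documentation: u32,
}

impl HealthBreakdown {
    /// Assumption: the overall score is the unweighted sum of the sub-scores.
    fn total(&self) -> u32 {
        self.maintenance
            + self.quality
            + self.community
            + self.popularity
            + self.documentation
    }
}

/// The three breakdowns from the comparison, in the order they are listed.
fn breakdowns() -> [(&'static str, HealthBreakdown); 3] {
    [
        (
            "unobtanium-segmenter",
            HealthBreakdown { maintenance: 17, quality: 13, community: 7, popularity: 4, documentation: 10 },
        ),
        (
            "charabia",
            HealthBreakdown { maintenance: 16, quality: 13, community: 13, popularity: 7, documentation: 13 },
        ),
        (
            "language-tokenizer",
            HealthBreakdown { maintenance: 9, quality: 16, community: 4, popularity: 3, documentation: 13 },
        ),
    ]
}

fn main() {
    // Print each crate's summed score for comparison with the Core Metrics row.
    for (name, breakdown) in breakdowns().iter() {
        println!("{name}: {}", breakdown.total());
    }
}
```

Summing this way shows where the gap comes from: charabia's lead over unobtanium-segmenter is mostly in Community (13 vs 7), while language-tokenizer has the highest Quality sub-score despite the lowest total.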
Technical Details
| Detail | unobtanium-segmenter | charabia | language-tokenizer |
|---|---|---|---|
| Version | 0.5.2 | 0.9.9 | 0.1.0 |
| Stable (≥1.0) | ✗ No | ✗ No | ✗ No |
| License | LGPL-3.0-only | MIT | non-standard |
| Dependencies | 7 | 18 | 12 |
| Crate Size | 47KB | 1.1MB | 27KB |
| Features | 0 | 20 | 11 |
| Yanked % | 0.0% | 3.2% | 0.0% |
| Edition | 2024 | 2021 | 2021 |
| MSRV | — | — | 1.83.0 |
| Owners | 1 | 2 | 1 |
Quick Verdict
- charabia leads with a health score of 62/100, but none of the options score above 80.
- charabia has 140 dependent crates, far more than the other two (3 and 2), which is the strongest sign of ecosystem trust.