Health score: 45/100
language-tokenizer
v0.1.0 Experimental
Text tokenizer for linguistic purposes, such as text matching. Supports more than 40 languages, including English, French, Russian, Japanese, and Thai.
non-standard Edition 2021 MSRV 1.83.0
#text #language #tokenizer #segmenter #linguistic
Quick Verdict
- ✓ Actively maintained (updated 79d ago)
- ! Pre-1.0: API may have breaking changes
Downloads: 447
Dependents: 2
Releases: 1
Size: 27KB
Deep Insights
Download decline
141 downloads in the last 30 days, down 51% from the previous period. This may indicate migration to alternatives.
Compact crate
At 27KB, language-tokenizer is lightweight. Small crate size correlates with focused, well-scoped functionality.
Health Breakdown
Maintenance 9/25
Recency, release consistency, active ratio
Quality 16/25
Yanked ratio, deps, size, maturity, features
Community 4/20
Reverse deps, ownership, ecosystem
Popularity 3/15
Downloads, momentum, growth trend
Documentation 13/15
Docs, repo, license, metadata
Download Trend
Daily downloads · last 90 days
5/day avg, +251%
Top Dependents
Most downloaded crates that depend on language-tokenizer
Version Adoption
v0.1.0
100%
Release Timeline
1 release since 2026
Feature Flags
full, serde, snowball, chinese-icu, japanese-icu, korean-lindera, chinese-lindera, southeast-asian, japanese-ipadic-lindera, japanese-unidic-lindera, japanese-ipadic-neologd-lindera
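Feature flags like these are normally enabled from the dependent project's Cargo.toml. The fragment below is a minimal sketch, not taken from the crate's documentation: it assumes `serde` and `japanese-ipadic-lindera` are correct readings of two names in the flag list above, and the choice of those two flags is purely illustrative. What each flag actually enables is not stated on this page.

```toml
[dependencies]
# Crate name and version as listed on this page.
# The two feature names are assumed readings of the flag list above;
# check the crate's own manifest before relying on them.
language-tokenizer = { version = "0.1.0", features = ["serde", "japanese-ipadic-lindera"] }
```

Because the crate is pre-1.0, a later 0.x release may rename or remove flags.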
Dependencies
12 direct dependencies
Dependents
2 crates depend on language-tokenizer