Text Processing
regex
1.12.3 (Stable): An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
dissimilar
1.0.11 (Stable): Diff library with semantic cleanup, based on Google's diff-match-patch.
lindera-dictionary
2.3.4 (Stable): A morphological dictionary library.
lindera
2.3.4 (Stable): A morphological analysis library.
calamine
0.34.0 (Growing): An Excel/OpenDocument Spreadsheet reader and deserializer in pure Rust.
bstr
1.12.1 (Stable): A string type that is not required to be valid UTF-8.
lindera-ipadic
2.3.4 (Stable): A Japanese morphological dictionary for IPADIC.
lazy-regex
3.6.0 (Stable): Lazy static regular expressions checked at compile time.
comrak
0.51.0 (Growing): A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter.
grok
2.4.1 (Stable): A Rust implementation of the popular Java & Ruby grok library, which allows easy text and log file processing with composable patterns.
oak-highlight
0.0.10 (Experimental): A lightweight syntax highlighter for Rust with support for multiple programming languages and customizable themes.
uutils_term_grid
0.8.0 (Growing): Library for formatting strings into a grid layout. Fork of term_grid.
read-fonts
0.38.0 (Growing): Reading OpenType font files.
font-types
0.11.1 (Growing): Scalar types used in fonts.
oak-pretty-print
0.0.10 (Experimental): Syntax highlighter supporting multiple programming languages.
topiary-queries
0.7.3 (Growing): tree-sitter query files compatible with Topiary.
unicase
2.9.0 (Stable): A case-insensitive wrapper around strings.
kreuzberg-tesseract
4.6.3 (Experimental): Rust bindings for Tesseract OCR with cross-compilation, C++17, and caching improvements.
lingua-japanese-language-model
1.3.0 (Stable): The Japanese language model for Lingua, an accurate natural language detection library.
html-to-markdown-rs
2.30.0 (Experimental): High-performance HTML to Markdown converter using the astral-tl parser. Part of the Kreuzberg ecosystem.
skrifa
0.41.0 (Growing): Metadata reader and glyph scaler for OpenType fonts.
toml-test-data
2.5.0 (Stable): TOML test cases.
lindera-cli
2.3.4 (Stable): A morphological analysis CLI.
logos
0.16.1 (Growing): Create ridiculously fast lexers.
rphonetic
3.0.6 (Stable): Rust port of the phonetic algorithms from Apache commons-codec.
asimov-prompt
25.1.0 (Growing): ASIMOV Software Development Kit (SDK) for Rust.
indoc
2.0.7 (Growing): Indented document literals.
varcon-core
5.0.6 (Stable): Varcon-relevant data structures.
lindera-tantivy
2.0.0 (Stable): Lindera Tokenizer for Tantivy.
lingua-marathi-language-model
1.3.0 (Stable): The Marathi language model for Lingua, an accurate natural language detection library.
lingua-swahili-language-model
1.3.0 (Stable): The Swahili language model for Lingua, an accurate natural language detection library.
lingua-bengali-language-model
1.3.0 (Stable): The Bengali language model for Lingua, an accurate natural language detection library.
lingua-hindi-language-model
1.3.0 (Stable): The Hindi language model for Lingua, an accurate natural language detection library.
lingua-korean-language-model
1.3.0 (Stable): The Korean language model for Lingua, an accurate natural language detection library.
lingua-chinese-language-model
1.3.0 (Stable): The Chinese language model for Lingua, an accurate natural language detection library.
lingua-gujarati-language-model
1.3.0 (Stable): The Gujarati language model for Lingua, an accurate natural language detection library.
lingua-tamil-language-model
1.3.0 (Stable): The Tamil language model for Lingua, an accurate natural language detection library.
lingua-sotho-language-model
1.3.0 (Stable): The Sotho language model for Lingua, an accurate natural language detection library.
lingua-telugu-language-model
1.3.0 (Stable): The Telugu language model for Lingua, an accurate natural language detection library.
lingua-punjabi-language-model
1.3.0 (Stable): The Punjabi language model for Lingua, an accurate natural language detection library.
lingua-tsonga-language-model
1.3.0 (Stable): The Tsonga language model for Lingua, an accurate natural language detection library.
lingua-tswana-language-model
1.3.0 (Stable): The Tswana language model for Lingua, an accurate natural language detection library.
arborium-theme
2.16.0 (Experimental): Theme support for arborium syntax highlighting.
arborium-html
2.16.0 (Experimental): HTML grammar for arborium (tree-sitter bindings).
lindera-cc-cedict
2.3.4 (Stable): A Chinese morphological dictionary for CC-CEDICT.
lindera-ko-dic
2.3.4 (Stable): A Korean morphological dictionary for ko-dic.
lindera-unidic
2.3.4 (Stable): A Japanese morphological dictionary for UniDic.
lindera-ipadic-neologd
2.3.4 (Stable): A Japanese morphological dictionary for IPADIC NEologd.
asimov-patterns
25.1.0 (Growing): ASIMOV Software Development Kit (SDK) for Rust.
ammonia
4.1.2 (Growing): HTML sanitization.