STConvert is an analyzer that converts Chinese characters between Traditional and Simplified forms. It covers Simplified-to-Traditional conversion, Traditional-to-Simplified conversion, and query expansion across both forms.
You can download the pre-built package from the release page.
The plugin includes an analyzer, a tokenizer, a token-filter, and a char-filter, each named stconvert.
Supported config:
convert_type: default s2t. Optional values:
  s2t: convert characters from Simplified Chinese to Traditional Chinese
  t2s: convert characters from Traditional Chinese to Simplified Chinese
keep_both: default false
delimiter: default ,
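The options and defaults listed above can be captured in a small helper. The following is a minimal Python sketch, not part of the plugin: the name stconvert_component is hypothetical, and the function only builds the JSON fragment that would sit under a tokenizer or filter entry in the index settings. The defaults mirror the ones listed above.

```python
import json

# Allowed values for convert_type, as listed in the supported config.
CONVERT_TYPES = ("s2t", "t2s")


def stconvert_component(convert_type="s2t", keep_both=False, delimiter=","):
    """Build the settings fragment for one stconvert component.

    Hypothetical helper for illustration; defaults follow the list above.
    """
    if convert_type not in CONVERT_TYPES:
        raise ValueError(
            f"convert_type must be one of {CONVERT_TYPES}, got {convert_type!r}"
        )
    return {
        "type": "stconvert",
        "convert_type": convert_type,
        "keep_both": keep_both,
        "delimiter": delimiter,
    }


if __name__ == "__main__":
    # A t2s component with a custom delimiter.
    print(json.dumps(stconvert_component("t2s", delimiter="#"), indent=2))
```

Rejecting unknown convert_type values in the script surfaces a typo before the settings are sent to the cluster.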
Custom example:
PUT /stconvert/
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "tsconvert" : {
          "tokenizer" : "tsconvert"
        }
      },
      "tokenizer" : {
        "tsconvert" : {
          "type" : "stconvert",
          "delimiter" : "#",
          "keep_both" : false,
          "convert_type" : "t2s"
        }
      },
      "filter": {
        "tsconvert" : {
          "type" : "stconvert",
          "delimiter" : "#",
          "keep_both" : false,
          "convert_type" : "t2s"
        }
      },
      "char_filter" : {
        "tsconvert" : {
          "type" : "stconvert",
          "convert_type" : "t2s"
        }
      }
    }
  }
}
Analyze tests
GET stconvert/_analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["tsconvert"],
  "text" : "国际國際"
}

Output:
{
  "tokens": [
    {
      "token": "国际国际",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 0
    }
  ]
}
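The same analyze call can be issued from a script. This is a rough Python sketch using only the standard library. It assumes Elasticsearch is reachable at http://localhost:9200 and that the stconvert index with the tsconvert char filter has already been created. The function names are hypothetical, and the request is sent with POST, which the _analyze endpoint also accepts.

```python
import json
import urllib.request


def build_analyze_request(text, base_url="http://localhost:9200", index="stconvert"):
    """Prepare (but do not send) an _analyze request using the tsconvert char filter.

    base_url and index are assumptions; adjust them to your cluster.
    """
    body = {
        "tokenizer": "keyword",
        "filter": ["lowercase"],
        "char_filter": ["tsconvert"],
        "text": text,
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, **kwargs):
    """Send the request and return the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(text, **kwargs)) as resp:
        return [t["token"] for t in json.load(resp)["tokens"]]
```

Calling analyze("国际國際") against a running cluster should return the tokens from the response, which can then be compared with the output shown above.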
Normalizer usage
DELETE index

PUT index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "tsconvert": {
          "type": "stconvert",
          "convert_type": "t2s"
        }
      },
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [
            "tsconvert"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      }
    }
  }
}

PUT index/_doc/1
{
  "foo": "國際"
}

PUT index/_doc/2
{
  "foo": "国际"
}

GET index/_search
{
  "query": {
    "term": {
      "foo": "国际"
    }
  }
}

GET index/_search
{
  "query": {
    "term": {
      "foo": "國際"
    }
  }
}