Getting the keywords produced by tokenization in Lucene

Posted anonymously, 2021-5-28 12:09

Here is a cleaned-up version of the code:

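// Imports (top of the file)
import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

// Inside a method. The input stays in Chinese because IKAnalyzer is a Chinese tokenizer.
String keyWord = "java是一种可以撰写跨平台应用软件的面向对象的程序设计语言。";

IKAnalyzer analyzer = new IKAnalyzer();
System.out.println("Input: " + keyWord);
// try-with-resources closes the stream even if tokenization throws
try (TokenStream tokenStream = analyzer.tokenStream("content", new StringReader(keyWord))) {
    // Fetch the term attribute once; the same instance is updated for every token
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);

    // reset() must be called before the first incrementToken() (see the TokenStream API docs).
    // Without it, the following exception is thrown:
    /* java.lang.IllegalStateException:
       TokenStream contract violation: reset()/close() call missing,
       reset() called multiple times, or subclass does not call super.reset().
       Please see Javadocs of TokenStream class for more information
       about the correct consuming workflow.
    */
    tokenStream.reset();

    System.out.print("Tokens: ");
    while (tokenStream.incrementToken()) {
        System.out.print(charTermAttribute.toString() + " ");
    }
    System.out.println();

    // Signal end of stream; close() is handled by try-with-resources
    tokenStream.end();
} catch (Exception e) {
    e.printStackTrace();
}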

Versions used: Lucene 4.9.0, IKAnalyzer 2012FF_u1
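
To make the reset() requirement easier to see without Lucene on the classpath, here is a minimal sketch of the consuming order (reset, then incrementToken in a loop, then end, then close). Everything in it is an assumption made for illustration: SketchTokenStream, collect and failsWithoutReset are hypothetical names, the whitespace splitting is a placeholder for real analysis, and none of this is Lucene's or IKAnalyzer's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class TokenStreamContractSketch {

    // Hypothetical stand-in for a token stream; NOT Lucene's implementation.
    // It only models the order: reset() -> incrementToken()* -> end() -> close().
    static class SketchTokenStream implements AutoCloseable {
        private final List<String> tokens;
        private Iterator<String> cursor; // stays null until reset() is called
        private String current;

        SketchTokenStream(String text) {
            // Whitespace splitting is a placeholder for real analysis.
            this.tokens = Arrays.asList(text.trim().split("\\s+"));
        }

        void reset() {
            cursor = tokens.iterator();
        }

        boolean incrementToken() {
            if (cursor == null) {
                // Mirrors the spirit of the error quoted in the post.
                throw new IllegalStateException("contract violation: reset() call missing");
            }
            if (!cursor.hasNext()) {
                return false;
            }
            current = cursor.next();
            return true;
        }

        String term() {
            return current;
        }

        void end() {
            // Nothing to finalize in this sketch.
        }

        @Override
        public void close() {
            cursor = null;
        }
    }

    // Consumes the stream in the required order and returns the tokens.
    static List<String> collect(String text) {
        List<String> out = new ArrayList<>();
        try (SketchTokenStream ts = new SketchTokenStream(text)) {
            ts.reset();
            while (ts.incrementToken()) {
                out.add(ts.term());
            }
            ts.end();
        }
        return out;
    }

    // Returns true if skipping reset() triggers the IllegalStateException.
    static boolean failsWithoutReset(String text) {
        try (SketchTokenStream ts = new SketchTokenStream(text)) {
            ts.incrementToken();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(collect("java is a programming language"));
        System.out.println(failsWithoutReset("java"));
    }
}
```

Collecting the terms into a List instead of printing them is usually more convenient when the keywords are needed later, for example to build a query. The same loop shape should carry over to a real TokenStream.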

Reposted from: https://my.oschina.net/LinuxDaxingxing/blog/796991
