A simple full-text search plugin for Flask

from flask_msearch import Search
[...]
search = Search()
search.init_app(app)

# models.py
class Post(db.Model):
    __tablename__ = 'post'
    __searchable__ = ['title', 'content']

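# views.py
@app.route("/search")
def w_search():
    # Read the search term from the query string, e.g. /search?keyword=flask
    keyword = request.args.get('keyword')
    # Search only the title field and return at most 20 results
    results = search.whoosh_search(Post, query=keyword, fields=['title'], limit=20)
    return ''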

To build an index for data that already exists:

search.create_index()

Custom analyzer

from jieba.analyse import ChineseAnalyzer
search = Search(analyzer=ChineseAnalyzer())

Project: https://github.com/honmaple/flask-msearch

A demo is available: demo

(Many more whoosh features have not been added yet)

plugin

flask

whoosh

10 replies • 2017-04-17 11:22:07 +08:00

pathbox

2017-04-16 19:32:47 +08:00

Fine for simple projects. For anything else, go with ES.

clino

2017-04-16 20:06:42 +08:00 via Android

However, searching titles for "美国" ("United States") returns no results

awanabe

2017-04-16 20:14:11 +08:00

I'd suggest jieba for word segmentation.
ChineseAnalyzer is weak.

awanabe

2017-04-16 20:14:36 +08:00

@clino The wrong tokenizer was chosen

honmaple

2017-04-16 21:27:26 +08:00

@pathbox Right, ES needs a Java environment. For pure Python there is whoosh, which is simple and convenient.

honmaple

2017-04-16 21:30:49 +08:00

@clino @awanabe Yes, the demo uses the default StemmingAnalyzer as its tokenizer, not ChineseAnalyzer. I have since tested locally: once the index is built with ChineseAnalyzer, the search does return results.
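
A minimal sketch of why the choice of tokenizer matters here, using only the standard library. It assumes the default analyzer splits text on runs of word characters, so an unbroken run of Chinese characters is indexed as a single token. The bigram tokenizer is a crude, hypothetical stand-in for real segmentation such as jieba, and `build_index` and `search` are toy helpers, not part of flask-msearch or whoosh.

```python
import re
from collections import defaultdict

CJK_OR_ASCII = re.compile(r"[\u4e00-\u9fff]+|[a-z0-9]+")


def word_tokens(text):
    # \w+ keeps an unbroken run of Chinese characters as ONE token.
    return re.findall(r"\w+", text.lower())


def bigram_tokens(text):
    # Toy segmentation: split each Chinese run into overlapping 2-character
    # tokens; ASCII words are kept whole. Not what jieba actually does.
    tokens = []
    for run in CJK_OR_ASCII.findall(text.lower()):
        if "\u4e00" <= run[0] <= "\u9fff" and len(run) > 1:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run)
    return tokens


def build_index(docs, tokenize):
    # Inverted index: token -> set of document ids containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, keyword, tokenize):
    # A document matches when it contains every token of the keyword.
    token_sets = [index.get(t, set()) for t in tokenize(keyword)]
    return set.intersection(*token_sets) if token_sets else set()


titles = {1: "美国总统访问日本", 2: "Flask 全文搜索插件"}

plain = build_index(titles, word_tokens)
segmented = build_index(titles, bigram_tokens)

print(search(plain, "美国", word_tokens))        # no match: title is one big token
print(search(segmented, "美国", bigram_tokens))  # matches document 1
```

The index and the query have to be tokenized the same way, which is why switching analyzers only helps after the index has been rebuilt.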

honmaple

2017-04-16 21:36:57 +08:00

@awanabe ChineseAnalyzer is itself implemented with jieba. Using jieba does slow searching down somewhat, though.

clino

2017-04-17 10:29:58 +08:00

@honmaple What do you mean by jieba slowing down searching?
If you had said indexing gets slower, I could understand that.

honmaple

2017-04-17 11:11:34 +08:00

@clino Because jieba is loaded on every use rather than kept in memory, there is a noticeable pause when it is used

```
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 1.450 seconds.
Prefix dict has been built succesfully.
```

honmaple

2017-04-17 11:22:07 +08:00

Sorry, I just checked again. jieba only loads from the cache on first use and is kept in memory after that, so using jieba has no significant effect on search speed.
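
The load-once behaviour described above can be sketched with a lazily initialized object. This is a simulation using only the standard library, not jieba's actual implementation: `LazyDictionary`, its `load_seconds` delay, and the two-word dictionary are all hypothetical.

```python
import time


class LazyDictionary:
    """Hypothetical stand-in for a tokenizer that loads its data on first use."""

    def __init__(self, load_seconds=0.2):
        self._load_seconds = load_seconds
        self._words = None   # nothing loaded yet
        self.load_count = 0  # how many times the slow load actually ran

    def _ensure_loaded(self):
        # Pay the loading cost only once; later calls reuse the in-memory data.
        if self._words is None:
            time.sleep(self._load_seconds)  # simulate reading the cache file
            self._words = {"美国", "总统"}
            self.load_count += 1

    def contains(self, word):
        self._ensure_loaded()
        return word in self._words


def timed(func, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


dictionary = LazyDictionary()
_, first = timed(dictionary.contains, "美国")
_, second = timed(dictionary.contains, "总统")
print(f"first call: {first:.3f}s, second call: {second:.3f}s")
```

In a long-running web process only the first request after startup pays this cost. jieba's README also describes an optional `jieba.initialize()` call that loads the dictionary up front, which would move the pause to application startup.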