
MySQL full-text search: how can I match Chinese punctuation as well?

    Jiangoogle   2022-11-19 06:32:34 +08:00 via Android   3091 views
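    MySQL 5.7, MyISAM.

    I want to reproduce the behavior of like "%你好,我是%".

    Right now I am using match () against('"你好,我是"' in boolean mode).

    I have also set ft_stopword_file to '', restarted the server, and rebuilt the index with repair table xx quick.

    But as soon as the search string contains punctuation, nothing is found. How can I solve this?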
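
For reference, the two queries being compared can be written out in full as below. This is only a sketch: the column name content is an assumption (the post leaves the match() column list empty), and xx is the placeholder table name used in the post.

```sql
-- Baseline: substring match. A leading % wildcard cannot use an index,
-- so every row is scanned.
SELECT * FROM xx
WHERE content LIKE '%你好,我是%';

-- Full-text phrase search: the double quotes inside the single-quoted
-- string ask boolean mode to match the words as a phrase.
SELECT * FROM xx
WHERE MATCH (content) AGAINST ('"你好,我是"' IN BOOLEAN MODE);
```

Note that the single-quoted string must be closed before IN BOOLEAN MODE, otherwise the mode keyword becomes part of the search text.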
    8 replies    2022-11-20 17:43:24 +08:00
    ih8es9OIzne0959p
        1
       2022-11-19 08:08:41 +08:00 via Android
    My guess: escape it to its ASCII code.
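    ih8es9OIzne0959p
        2
       2022-11-19 08:11:13 +08:00 via Android
    From Baidu: 1. To query text that contains Chinese punctuation, use COLLATE Chinese_PRC_CS_AS_WS. Note that whatever you type between the % signs must be the Chinese (full-width) punctuation mark.

    select * from TASK where info COLLATE Chinese_PRC_CS_AS_WS like '%,%'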
    ih8es9OIzne0959p
        3
       2022-11-19 08:12:12 +08:00 via Android
    @ajaxgoldfish My mistake, I did not read the question properly. Everyone, please ignore my replies.
    brader
        4
       2022-11-19 10:13:17 +08:00
    I just tested this, and it does produce the effect you want. Try it again.

    CREATE TABLE posts (
        id INT PRIMARY KEY AUTO_INCREMENT,
        title VARCHAR(255),
        body TEXT,
        FULLTEXT (title, body) WITH PARSER NGRAM
    ) ENGINE=INNODB CHARACTER SET UTF8MB4;

    INSERT INTO posts (title, body)
    VALUES ('MySQL 全文搜索', 'MySQL 提供了具有许多好的功能的内置全文搜索'),
           ('MySQL 教程', '学习 MySQL 快速,简单和有趣');

    SELECT * FROM posts WHERE MATCH (title, body) AGAINST ("快速,简单" IN BOOLEAN MODE);
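
One way to see what the ngram parser actually indexed for the posts table above is to read the InnoDB full-text index cache. This is a minimal sketch, assuming the table lives in a schema named test (a hypothetical name; substitute your own) and that your account is allowed to set global variables. It applies to InnoDB tables only, not to the MyISAM table in the original question.

```sql
-- Point the full-text diagnostic tables at test.posts
-- (the schema name test is an assumption).
SET GLOBAL innodb_ft_aux_table = 'test/posts';

-- Recently inserted rows are tokenized into the index cache;
-- each row here is one token together with its document and position.
SELECT word, doc_id, position
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY doc_id, position;
```

If the punctuation mark shows up inside some of the listed tokens, the parser is indexing it as an ordinary character; if it never appears, it is being dropped at indexing time.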
    brader
        5
       2022-11-19 10:15:57 +08:00
    Be sure to run show variables like "%ft%" and show variables like "%ngram_token_size%" to check the token size you have configured.
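
Spelled out, the checks suggested above look roughly like this. The table name xx is the placeholder from the original question; the comments reflect the documented behavior of these settings as I understand it, so treat them as a starting point rather than a definitive checklist.

```sql
-- List full-text related settings (ft_min_word_len and ft_stopword_file
-- for MyISAM, the innodb_ft_* family for InnoDB).
SHOW VARIABLES LIKE '%ft%';

-- Token size used by the ngram parser; the default is 2.
SHOW VARIABLES LIKE 'ngram_token_size';

-- ngram_token_size is read-only at runtime: set it in the [mysqld] section
-- of the option file (for example ngram_token_size=2), restart the server,
-- and then rebuild the index. For a MyISAM table:
REPAIR TABLE xx QUICK;
```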
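    brader
        6
       2022-11-19 10:29:41 +08:00
    By the way, if the query above finds nothing and all you need is a simple "contains" check, you are better off using SELECT * FROM posts WHERE MATCH (title, body) AGAINST ("速," IN natural language MODE); instead. Choose the token size to suit your needs; you can set it to 2, or even 1.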
    Jiangoogle
        7
    OP
       2022-11-19 19:30:24 +08:00
    @brader It does not work for me. Could it be because you are using InnoDB?
    My environment: MySQL 5.7.20, MyISAM.
    My configuration: ft_stopword_file = '', ngram_token_size = 2

    CREATE TABLE `tmp` (
        `book_name` char(32) NOT NULL,
        FULLTEXT KEY `book_name` (`book_name`) WITH PARSER `ngram`
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

    insert into tmp values('你好,我的');

    mysql> select book_name from tmp where match(book_name) against('"你好,"' in boolean mode);
    +------------+
    | book_name  |
    +------------+
    | 你好,我的 |
    +------------+
    1 row in set (0.00 sec)

    mysql> select book_name from tmp where match(book_name) against('"你好,我的"' in boolean mode);
    Empty set (0.00 sec)

    mysql> select book_name from tmp where match(book_name) against('"你好"' in boolean mode);
    +------------+
    | book_name  |
    +------------+
    | 你好,我的 |
    +------------+
    1 row in set (0.00 sec)

    Those are the three cases. I do not know why the second one returns nothing while the other two match.
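
If the goal is simply to get the same rows as LIKE, one possible workaround is to let the full-text index narrow the candidates with a punctuation-free fragment and then confirm the exact substring with LIKE. This is a sketch, not something verified on 5.7.20 with MyISAM; the fragment 你好 is taken from the third query above, which does return the row.

```sql
SELECT book_name
FROM tmp
WHERE MATCH (book_name) AGAINST ('"你好"' IN BOOLEAN MODE)  -- full-text index narrows the candidates
  AND book_name LIKE '%你好,我的%';                          -- exact substring check, punctuation included
```

The design choice here is to keep punctuation out of the MATCH part entirely, so the result no longer depends on how the parser treats it; the cost is an extra LIKE comparison on each candidate row.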
    Jiangoogle
        8
    OP
       2022-11-20 17:43:24 +08:00 via Android
    Still looking for help.