2022/01/21

MySQL 5.1からMySQL 5.7へのレプリケーションが [ERROR] Slave I/O for channel '': error reconnecting to master 'repluser@xxxx:3306' - retry-time: 60 retries: n, Error_code: 0 で止まった

TL;DR

  • そもそも5.1から5.7へのレプリケーションはサポートされていないのは知っている

  • レプリカが character_set_server = utf8mb4 で起動した後、 SET GLOBAL character_set_server = utf8 はしていたけど、mysqldを再起動しないとI/Oスレッドはutf8mb4でコネクションをネゴシエートしようとして、5.1側に「そんなcharsetは知らん」と言われていたようだ。レプリカのmysqldを再起動して解決。


MySQL 5.1からMySQL 5.7にお引越ししようと、5.7のインスタンスを作って5.1をソースにした CHANGE MASTER TO してから START SLAVE した。


2022-01-20T17:47:46.873075+09:00 19926 [Note] 'CHANGE MASTER TO FOR CHANNEL '' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='xxxxx', master_port= 3306, master_log_file='binary_log.000005', master_log_pos= 237438337, master_bind=''.

2022-01-20T17:47:46.878668+09:00 19927 [Note] Slave I/O thread: Start semi-sync replication to master 'repluser@xxxxx:3306' in log 'binary_log.000005' at position 237438337

2022-01-20T17:47:46.878742+09:00 19927 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.

2022-01-20T17:47:46.880010+09:00 19928 [Note] Slave SQL thread for channel '' initialized, starting replication in log 'binary_log.000005' at position 237438337, relay log '/usr/local/mysql/data/relay_log.000001' position: 4

2022-01-20T17:47:46.887069+09:00 19927 [Note] Slave I/O thread for channel '': connected to master 'repluser@xxxxx:3306',replication started in log 'binary_log.000005' at position 237438337

2022-01-20T17:47:46.890918+09:00 19927 [Warning] Slave I/O for channel '': Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable 'binlog_checksum', Error_code: 1193

2022-01-20T17:47:46.891412+09:00 19927 [Warning] Slave I/O for channel '': Unknown system variable 'SERVER_UUID' on master. A probable cause is that the variable is not supported on the master (version: 5.1.54-log), even though it is on the slave (version: 5.7.27-log), Error_code: 1193

2022-01-20T17:47:46.901204+09:00 19927 [Warning] Master server does not support semi-sync, fallback to asynchronous replication

2022-01-20T17:47:53.468844+09:00 19927 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.

2022-01-20T17:47:53.470482+09:00 19927 [ERROR] Slave I/O for channel '': error reconnecting to master 'repluser@xxxxx:3306' - retry-time: 60  retries: 1, Error_code: 0

2022-01-20T17:48:53.472305+09:00 19927 [ERROR] Slave I/O for channel '': error reconnecting to master 'repluser@xxxxx:3306' - retry-time: 60  retries: 2, Error_code: 0

2022-01-20T17:49:53.474224+09:00 19927 [ERROR] Slave I/O for channel '': error reconnecting to master 'repluser@xxxxx:3306' - retry-time: 60  retries: 3, Error_code: 0

2022-01-20T17:50:53.476546+09:00 19927 [ERROR] Slave I/O for channel '': error reconnecting to master 'repluser@xxxxx:3306' - retry-time: 60  retries: 4, Error_code: 0

..

折角なので細かく見ていく。

[Note] ‘CHANGE MASTER TO FOR CHANNEL ‘’ executed’.

  • CHANGE MASTER TO が実行されると吐くログ。
  • Previous stateNew state がそれぞれ記録される。べんり。

[Note] Slave I/O thread: Start semi-sync replication to master

  • START SLAVE した瞬間にI/Oスレッドが動き始めたログ。
  • semi-syncプラグインを噛ませてる(けれど5.1にはsemi-syncはない)のはあるけど、取り敢えずこの段階ではスルーする

[Warning] Storing MySQL user name or password information in the master info repository is not secure

[Note] Slave SQL thread for channel ‘’ initialized

  • SQLスレッドが起動したログ

[Note] Slave I/O thread for channel ‘’: connected to master

  • 起動したI/Oスレッドがレプリケーションソースに接続できたログ

[Warning] Slave I/O for channel ‘’: Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable ‘binlog_checksum’, Error_code: 1193

[Warning] Slave I/O for channel ‘’: Unknown system variable ‘SERVER_UUID’ on master. A probable cause is that the variable is not supported on the master

[Warning] Master server does not support semi-sync, fallback to asynchronous replication

  • semisyncはMySQL 5.5からなので、ソースが5.1なら当然こうなる。
  • いつもの(?) asyncレプリケーションに自動でフォールバックするので問題ない

[Warning] Storing MySQL user name or password information in the master info repository is not secure

  • I/Oスレッドがasyncモードに切り替えて再接続しようとしたのでもう一発同じのが出た

[ERROR] Slave I/O for channel ‘’: error reconnecting to master

  • フツーは Error_code のところに失敗理由(ソース由来ならば多くは1000番台か3000番台のエラー、疎通ができなかったとかなら2000万台のエラー)が出るはずなんだけれど、出ない( ゚д゚)
  • ううむ

ちなみにソース側のエラーログは何も吐いてなかった。


構築してたやつは↓のような形になっていて、緑色の矢印(今回問題になったところ)の他に、赤い矢印(5.7同士で準同期レプリケーション)もある。

確か準同期と非同期って混ぜると上手く動かなくなかったっけな? と思ったので、5.7のNew SourceとNew ReplicaでそれぞれsemisyncをOFFにしてSTOP SLAVE/START SLAVE。

状況は、変わらず。

MySQLプロトコル上の問題かとも思って、 mysql コマンドラインクライアントでも接続してみる。

$ mysql -h xxxxx -u replicator -pyyyy
mysql> SHOW GRANTS;
+---------------------------------------------------------------------------------------------------------------------+
| Grants for replicator@%                                                                                               |
+---------------------------------------------------------------------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%' IDENTIFIED BY PASSWORD '*xxx' |
+---------------------------------------------------------------------------------------------------------------------+

mysql> SHOW BINLOG EVENTS;
----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Log_name          | Pos       | Event_type  | Server_id | End_log_pos | Info                                                                                                                                                                                                                                   |
+-------------------+-----------+-------------+-----------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| binary_log.000004 |         4 | Format_desc |      1502 |         106 | Server ver: 5.1.54-log, Binlog ver: 4                                                                                                                                                                                    |
| binary_log.000004 |       106 | Query       |      1502 |         208 | UPDATE ...                                                                                                                                                                                                    |

接続もできたし権限も足りてるしI/Oスレッド相当の SHOW BINLOG EVENTS も動く。

もう訳がわからないので大人しくgdbに頼る。
エラーログを吐いてる関数は error_log_print .

# gdb -p $(pidof mysqld)
(gdb) b error_log_print

(gdb) c

Breakpoint 1, error_log_print (level=ERROR_LEVEL, format=0x1606eb8 "Slave %s%s: %s%s Error_code: %d", args=0x7f86538ab160) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/log.cc:2119
2119    in /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/log.cc
(gdb) bt
#0  error_log_print (level=ERROR_LEVEL, format=0x1606eb8 "Slave %s%s: %s%s Error_code: %d", args=0x7f86538ab160) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/log.cc:2119
#1  0x0000000000c68727 in sql_print_error (format=<optimized out>) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/log.cc:2145
#2  0x0000000000f24e7e in Slave_reporting_capability::va_report (this=<optimized out>, level=<optimized out>, err_code=0, prefix_msg=0x0, msg=0x1603a90 "error %s to master '%s@%s:%d' - retry-time: %d  retries: %lu", 
    args=0x7f86538ab6f0) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_reporting.cc:170
#3  0x0000000000f24fdf in Slave_reporting_capability::report (this=<optimized out>, level=<optimized out>, err_code=<optimized out>, msg=<optimized out>)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_reporting.cc:113
#4  0x0000000000f153c3 in connect_to_master (thd=0x7f852cdea000, mysql=0x7f852ce6d420, mi=0x7f852f3b5000, reconnect=true, suppress_warnings=false)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:8932
#5  0x0000000000f1c68e in safe_reconnect (suppress_warnings=true, mi=0x7f852f3b5000, mysql=0x7f852ce6d420, thd=0x7f852cdea000) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:8986
#6  try_to_reconnect (thd=0x7f852cdea000, mysql=0x7f852ce6d420, mi=0x7f852f3b5000, retry_count=<optimized out>, suppress_warnings=true, messages=0x1de0e80 <reconnect_messages+96>)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:5515
#7  0x0000000000f1dbc7 in handle_slave_io (arg=0x7f852f3b5000) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:5815
#8  0x0000000000f71ea4 in pfs_spawn_thread (arg=0x7f852e085920) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/storage/perfschema/pfs.cc:2190
#9  0x00007f865540dea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f8653ec596d in clone () from /lib64/libc.so.6

reportっぽい関数は呼び出し元から err_code を受け取っているだけっぽかったので、 err_code を生成してそうな connect_to_master までさかのぼって読む。

  while (!(slave_was_killed = io_slave_killed(thd,mi))
         && (reconnect ? mysql_reconnect(mysql) != 0 :
             mysql_real_connect(mysql, mi->host, user,
                                password, 0, mi->port, 0, client_flag) == 0))
  {
    /*
       SHOW SLAVE STATUS will display the number of retries which
       would be real retry counts instead of mi->retry_count for
       each connection attempt by 'Last_IO_Error' entry.
    */
    last_errno=mysql_errno(mysql);
    suppress_warnings= 0;
    mi->report(ERROR_LEVEL, last_errno,
               "error %s to master '%s@%s:%d'"
               " - retry-time: %d  retries: %lu",
               (reconnect ? "reconnecting" : "connecting"),
               mi->get_user(), mi->host, mi->port,
               mi->connect_retry, err_count + 1);

と、こうなので mysql_real_connect ( mysql_reconnect も最終的に mysql_real_connect に来る) が返すエラー番号 ( mysql_errno が返すやつ)が何故か0になっていると。

というわけで mysql_real_connect にブレークポイントを張って

(gdb) b mysql_real_connect
(gdb) c
(gdb) bt
#0  mysql_real_connect (mysql=0x7f86538ab290, host=0x7f852cd812c0 "xxxx", user=0x7f852cc150c0 "replicator", passwd=0x7f852cc150f0 "yyyy", db=0x0, port=3306, unix_socket=0x0, client_flag=2176360965)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql-common/client.c:3970
#1  0x0000000000ded80b in mysql_reconnect (mysql=0x7f852ce6d420) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql-common/client.c:4622
#2  0x0000000000f15334 in connect_to_master (thd=0x7f852cdea000, mysql=0x7f852ce6d420, mi=0x7f852f3b5000, reconnect=true, suppress_warnings=<optimized out>)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:8915
#3  0x0000000000f1c68e in safe_reconnect (suppress_warnings=true, mi=0x7f852f3b5000, mysql=0x7f852ce6d420, thd=0x7f852cdea000) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:8986
#4  try_to_reconnect (thd=0x7f852cdea000, mysql=0x7f852ce6d420, mi=0x7f852f3b5000, retry_count=<optimized out>, suppress_warnings=true, messages=0x1de0e80 <reconnect_messages+96>)
    at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:5515
#5  0x0000000000f1dbc7 in handle_slave_io (arg=0x7f852f3b5000) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/sql/rpl_slave.cc:5815
#6  0x0000000000f71ea4 in pfs_spawn_thread (arg=0x7f852e085920) at /export/home/pb2/build/sb_0-34537258-1560179931.8/mysql-5.7.27/storage/perfschema/pfs.cc:2190
#7  0x00007f865540dea5 in start_thread () from /lib64/libpthread.so.0
#8  0x00007f8653ec596d in clone () from /lib64/libc.so.6

mysql 構造体の中身を覗いたところで、電撃が走った。

(gdb) p mysql
$3 = (MYSQL *) 0x7f852ce6d420

(gdb) p *mysql
$4 = {net = {vio = 0x0, buff = 0x0, buff_end = 0x7f852cc0a020 "", write_pos = 0x7f852cc06020 "\001", read_pos = 0x7f852cc06020 "\001", fd = 24, remain_in_buf = 0, length = 0, buf_length = 0, where_b = 0, max_packet = 16384, 
    max_packet_size = 1073745872, pkt_nr = 12, compress_pkt_nr = 12, write_timeout = 60, read_timeout = 4, retry_count = 10, fcntl = 0, return_status = 0x0, reading_or_writing = 0 '\000', save_char = 0 '\000', unused1 = 0 '\000', 
    unused2 = 0 '\000', compress = 0 '\000', unused3 = 0 '\000', unused = 0x0, last_errno = 0, error = 2 '\002', unused4 = 0 '\000', unused5 = 0 '\000', 
    last_error = "\000here is an attribute with the same name already", '\000' <repeats 463 times>, sqlstate = "00000", extension = 0x0}, connector_fd = 0x0, host = 0x7f852cd812c0 "xxxxx", 
  user = 0x7f852cc150c0 "replicator", passwd = 0x7f852cc150f0 "yyyy", unix_socket = 0x0, server_version = 0x7f852cd812d8 "5.1.54-log", host_info = 0x7f852cd812a0 "xxxxx via TCP/IP", info = 0x0, db = 0x0, 
  charset = 0x1e63760 <my_charset_utf8mb4_general_ci>, fields = 0x0, field_alloc = {free = 0x0, used = 0x0, pre_alloc = 0x0, min_malloc = 32, block_size = 8160, block_num = 4, first_block_usage = 0, max_capacity = 0, 
    allocated_size = 0, error_for_capacity_exceeded = 0 '\000', error_handler = 0x0, m_psi_key = 0}, affected_rows = 18446744073709551615, insert_id = 0, extra_info = 0, thread_id = 43147884, packet_length = 7, port = 3306, 
  client_flag = 2176360965, server_capabilities = 63487, protocol_version = 10, field_count = 0, server_status = 2, server_language = 33, warning_count = 0, options = {connect_timeout = 4, read_timeout = 4, write_timeout = 0, 
    port = 0, protocol = 0, client_flag = 0, host = 0x0, user = 0x0, password = 0x0, unix_socket = 0x0, db = 0x0, init_commands = 0x0, my_cnf_file = 0x0, my_cnf_group = 0x0, 
    charset_dir = 0x7f852cdaf080 "/usr/local/mysql/share/charsets/", charset_name = 0x7f852cc15090 "utf8mb4", ssl_key = 0x0, ssl_cert = 0x0, ssl_ca = 0x0, ssl_capath = 0x0, ssl_cipher = 0x0, shared_memory_base_name = 0x0, 
    max_allowed_packet = 1073745872, use_ssl = 0 '\000', compress = 0 '\000', named_pipe = 0 '\000', unused1 = 0 '\000', unused2 = 0 '\000', unused3 = 0 '\000', unused4 = 0 '\000', methods_to_use = MYSQL_OPT_GUESS_CONNECTION, 
    ci = {client_ip = 0x0, bind_address = 0x0}, unused5 = 0 '\000', report_data_truncation = 1 '\001', local_infile_init = 0x0, local_infile_read = 0x0, local_infile_end = 0x0, local_infile_error = 0x0, 
    local_infile_userdata = 0x0, extension = 0x7f852cda9120}, status = MYSQL_STATUS_READY, free_me = 1 '\001', reconnect = 1 '\001', scramble = "1.1Jc4vwt`bL7k:yvfyP", unused1 = 0 '\000', unused2 = 0x0, unused3 = 0x0, 
  unused4 = 0x0, unused5 = 0x0, stmts = 0x0, methods = 0x1de0360 <client_methods>, thd = 0x0, unbuffered_fetch_owner = 0x0, info_buffer = 0x0, extension = 0x7f852cdc1040}

Σ(゚д゚lll) utf8mb4!!

確かに character_set_server = utf8mb4 で起動しちゃったので、my.cnfを直した上で SET GLOBAL character_set_server = utf8 で再設定していたのだ。

しかしレプリケーションスレッドには効かなかった…ということだろうか。

mysqld再起動したらつながるようになりました。

あー楽しかった!

0 件のコメント :

コメントを投稿