TL;DR
- みんなだいすきCLONEプラグインを使って本番のマスター(稼働中)から本番のスレーブ(諸事情により再構築が必要だった)を作った。
- 何度かリトライはしたけど、ちゃんと動いているような気がする
CLONE INSTANCE FROM ..
を実行するターミナルはtmux
などで切断から保護しましょう…
CLONEプラグインって何的な人は先にこっちを読んでいただくと良いかも知れない。
取り敢えず
何も考えずにクローン用のアカウントをマスターに作成する。
mysql> CREATE USER cloner@xxx.xxx.xxx.xxx IDENTIFIED BY 'xxxx';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT BACKUP_ADMIN ON *.* TO cloner@xxx.xxx.xxx.xxx;
Query OK, 0 rows affected (0.00 sec)
スレーブ用地から
CLONE INSTANCE FROM ..
でクローンを開始しようとする。mysql> SET GLOBAL SEESSION clone_valid_donor_list = 'yyy.yyy.yyy.yyy:3306';
Query OK, 0 rows affected (0.00 sec)
mysql> CLONE INSTANCE FROM cloner@yyy.yyy.yyy.yyy:3306 IDENTIFIED BY 'xxxx';
ERROR 3862 (HY000): Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory).
(つд⊂)ゴシゴシ
ERROR 3862は「ドナーからこんなエラーを受け取った」ためのエラーコードなので、ERROR 1016 Can’t open file .. がドナーから受け取った生のエラー。
クローンに関する進捗とかその他は全てドナーノード側のエラーログに出力される。
Note
レベルなのでドナー側が log_error_verbosity= 3
でないと出力されないと思う。
こっちがドナー側のエラーログ。
2019-11-26T10:39:22.974397+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Acquired backup lock.'
2019-11-26T10:39:22.974459+09:00 34186 [Note] [MY-013457] [InnoDB] Clone Begin Master Task by cloner@172.21.160.132
2019-11-26T10:39:22.979792+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_INIT: Storage Initialize.'
2019-11-26T10:39:22.979825+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T10:39:51.981458+09:00 34187 [Note] [MY-013458] [InnoDB] Clone set state change ACK: 1
2019-11-26T10:39:51.981516+09:00 34187 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_ACK: Storage Ack.'
2019-11-26T10:39:51.981544+09:00 34187 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T10:39:52.081398+09:00 34186 [Note] [MY-013458] [InnoDB] Clone Master received state change ACK
2019-11-26T10:39:52.081463+09:00 34186 [Note] [MY-013458] [InnoDB] Clone State Change : Number of tasks = 1
2019-11-26T10:39:52.081483+09:00 34186 [Note] [MY-013458] [InnoDB] Clone State BEGIN FILE COPY
2019-11-26T10:39:52.087133+09:00 34186 [Note] [MY-011845] [InnoDB] Clone Start PAGE ARCH : start LSN : 1927678359668, checkpoint LSN : 1927629671670
2019-11-26T10:39:52.094897+09:00 34186 [Note] [MY-013458] [InnoDB] Clone Set Error code: 1815 Saved Error code: 0
2019-11-26T10:39:52.094936+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_EXECUTE: Storage Execute: error: 1815: Can't open file: './undo001' (errno: 2 - No such file or directory).'
2019-11-26T10:39:52.094954+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Before sending COM_RES_ERROR: : error: 1815: Can't open file: './undo001' (errno: 2 - No such file or directory).'
2019-11-26T10:39:52.095022+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: After sending COM_RES_ERROR.'
2019-11-26T10:39:52.095295+09:00 34187 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_EXIT: Storage End.'
2019-11-26T10:39:52.095347+09:00 34187 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T10:39:52.095375+09:00 34187 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Exiting clone protocol.'
2019-11-26T10:39:52.096028+09:00 34186 [Note] [MY-011846] [InnoDB] Clone Stop PAGE ARCH : end LSN : 1927629671670, log sys LSN : 1927678363098
2019-11-26T10:39:52.096207+09:00 34186 [Note] [MY-013457] [InnoDB] Clone End Master Task ID: 0 Passed, code: 0:
2019-11-26T10:39:52.096234+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_EXIT: Storage End.'
2019-11-26T10:39:52.096268+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T10:39:52.096302+09:00 34186 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Exiting clone protocol.'
こっちがレシピエント側のエラーログ。
2019-11-26T10:39:22.963763+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Task Connect.'
2019-11-26T10:39:22.974258+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Master ACK Connect.'
2019-11-26T10:39:22.974323+09:00 139 [Note] [MY-013457] [InnoDB] Clone Apply Begin Master Version Check
2019-11-26T10:39:22.975340+09:00 139 [Note] [MY-013457] [InnoDB] Clone Apply Version End Master Task ID: 0 Passed, code: 0:
2019-11-26T10:39:22.975382+09:00 139 [Note] [MY-013457] [InnoDB] Clone Apply Begin Master Task
2019-11-26T10:39:22.976280+09:00 139 [Warning] [MY-013460] [InnoDB] Clone removing all user data for provisioning: Started
2019-11-26T10:39:22.976316+09:00 139 [Note] [MY-011977] [InnoDB] Clone Drop all user data
2019-11-26T10:39:24.122046+09:00 139 [Note] [MY-011977] [InnoDB] Clone: Fix Object count: 2314 task: 0
2019-11-26T10:39:51.598531+09:00 139 [Note] [MY-011977] [InnoDB] Clone Drop User schemas
2019-11-26T10:39:51.598901+09:00 139 [Note] [MY-011977] [InnoDB] Clone: Fix Object count: 14 task: 0
2019-11-26T10:39:51.623511+09:00 139 [Note] [MY-011977] [InnoDB] Clone Drop User tablespaces
2019-11-26T10:39:51.624095+09:00 139 [Note] [MY-011977] [InnoDB] Clone: Fix Object count: 6 task: 0
2019-11-26T10:39:51.625091+09:00 132 [Note] [MY-010596] [Repl] Error reading relay log event for channel '': slave SQL thread was killed
2019-11-26T10:39:51.625136+09:00 132 [Note] [MY-010587] [Repl] Slave SQL thread for channel '' exiting, replication stopped in log 'FIRST' at position 0
2019-11-26T10:39:51.742987+09:00 139 [Note] [MY-011977] [InnoDB] Clone Drop: finished successfully
2019-11-26T10:39:51.743073+09:00 139 [Warning] [MY-013460] [InnoDB] Clone removing all user data for provisioning: Finished
2019-11-26T10:39:51.981055+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Command COM_INIT.'
2019-11-26T10:39:52.095229+09:00 139 [Note] [MY-013458] [InnoDB] Clone Set Error code: 3862 Saved Error code: 0
2019-11-26T10:39:52.095271+09:00 139 [Note] [MY-013458] [InnoDB] Clone Apply set error code: 3862: Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory).
2019-11-26T10:39:52.095290+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Wait for remote after local issue: error: 3862: Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory)..'
2019-11-26T10:39:52.095305+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Command COM_EXECUTE: error: 3862: Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory)..'
2019-11-26T10:39:52.095552+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Master ACK COM_EXIT.'
2019-11-26T10:39:52.095941+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Master ACK Disconnect : abort: false.'
2019-11-26T10:39:52.096479+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Task COM_EXIT.'
2019-11-26T10:39:52.096798+09:00 139 [Note] [MY-013272] [Clone] Plugin Clone reported: 'Client: Task Disconnect : abort: false.'
2019-11-26T10:39:52.096858+09:00 139 [Note] [MY-013458] [InnoDB] Clone Set Error code: 3862 Saved Error code: 3862
2019-11-26T10:39:52.096890+09:00 139 [Note] [MY-013458] [InnoDB] Clone Apply set error code: 3862: Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory).
2019-11-26T10:39:52.096944+09:00 139 [Note] [MY-013458] [InnoDB] Clone Set Error code: 3862 Saved Error code: 3862
2019-11-26T10:39:52.097189+09:00 139 [Note] [MY-013457] [InnoDB] Clone Apply End Master Task ID: 0 Failed, code: 3862: Clone Donor Error: 1016 : Can't open file: './undo001' (errno: 2 - No such file or directory).
目をこらして良く見ていただくとわかるのだが、
[Warning] [MY-013460] [InnoDB] Clone removing all user data for provisioning: Finished
なので、クローンが始まった時点でレシピエントのdatadirからは諸々が消える。
ちなみにドナー側の
undo001
、ちゃんとあるんですけどねってすったもんだした後、レシピエントを再起動したりドナーの innodb_undo_log_truncate = ON
にしてみたり色々したけれど、結局ドナー(= 稼働中のマスター)を再起動するまでクローンは同じエラーで転け続けた。- ひょっとしたら、ドナーを5.7から8.0にバージョンアップした時に、「自動で更新かかるから良いっしょー」ってアップグレード後に再起動してなかったのがいけないんじゃないかなと思っている(
mysql_upgrade
の時は自分で再起動してたけど、すっかり忘れていたんじゃないかなと) - 【2019/11/26 18:45】
- innodb_undo_tablespaces= 2とかで5.7をインストールして、その後8.0にアップグレードするとこうなることまでは分かった。やっぱり単に再起動すればいいだけの気がする。
- MySQL Bugs: #97784: clone plugin misunderstanding UNDO table-space file name
mysql> CLONE INSTANCE FROM cloner@yyy.yyy.yyy.yyy:3306 IDENTIFIED BY 'xxxx';
Query OK, 0 rows affected (37 min 23.83 sec)
再起動後はすんなり通った。
レシピエントの方のエラーログには
レシピエントの方のエラーログには
2019-11-26T11:08:19.464385+09:00 7 [Note] [MY-011978] [InnoDB] Clone estimated size: 188.15 GiB Available space: 359.75 GiB
と書いてあったので、「何も設定しない(= ほぼ無制限にCLONEがリソースを欲しがる)状態」で85MB/sくらい。
m4.xlarge + 900IOPSなgp2のEBSなので「マスターでトラフィックを受けながらにしてはまあまあ?」みたいな感じはする。
m4.xlarge + 900IOPSなgp2のEBSなので「マスターでトラフィックを受けながらにしてはまあまあ?」みたいな感じはする。
ただしまあ
dstat
を見てたら結構iowait出てるしストレージもかなり読むし出ていくパケットも多いし…で、2台目やる時はちょっと制限しようかなとか思った(↓ はドナー側)$ dstat -amrl
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ------memory-usage----- --io/total- ---load-avg---
usr sys idl wai hiq siq| read writ| recv send| in out | int csw | used buff cach free| read writ| 1m 5m 15m
53 1 12 33 0 0| 90M 318k| 337k 90M| 0 0 | 20k 5285 |12.0G 0 1602M 1952M| 751 105 |5.57 6.21 5.83
62 1 7 30 0 0| 90M 378k| 220k 90M| 0 0 | 16k 5335 |12.0G 0 1602M 1953M| 752 118 |5.57 6.21 5.83
53 1 17 29 0 1| 89M 321k| 291k 89M| 0 0 | 18k 5428 |12.0G 0 1602M 1950M| 751 103 |5.57 6.21 5.83
2019-11-26T11:08:46.108105+09:00 112 [Note] [MY-013458] [InnoDB] Stage progress: 1% completed.
2019-11-26T11:10:31.546724+09:00 112 [Note] [MY-013458] [InnoDB] Stage progress: 11% completed.
制限するにはこの辺のパラメーターをいじるんだけれど、
複数個所いっぺんに変えるとわからなくなる(というか、なったので一度中断してやり直している)ので今回は clone_enable_compression = ONだけ変更して実行(↓もドナー側)
mysql> SET GLOBAL SEESSION clone_valid_donor_list = 'yyy.yyy.yyy.yyy:3306';
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL clone_enable_compression = ON;
Query OK, 0 rows affected (0.00 sec)
mysql> CLONE INSTANCE FROM cloner@yyy.yyy.yyy.yyy:3306 IDENTIFIED BY 'xxxx';
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ------memory-usage----- --io/total-
usr sys idl wai hiq siq| read writ| recv send| in out | int csw | used buff cach free| read writ
50 0 46 4 0 0| 14M 2675k| 47k 4043k| 0 0 |7780 6601 |12.0G 0 2152M 1425M| 143 277
49 1 48 3 0 0| 16M 692k| 29k 4434k| 0 0 |5988 4010 |12.0G 0 2152M 1425M| 171 92.0
50 1 47 3 0 0| 12M 843k| 35k 3330k| 0 0 |7091 5590 |12.0G 0 2152M 1425M| 127 141
2019-11-26T12:17:54.871875+09:00 474 [Note] [MY-013458] [InnoDB] Stage progress: 1% completed.
2019-11-26T12:27:37.153130+09:00 474 [Note] [MY-013458] [InnoDB] Stage progress: 11% completed.
( ゚д゚)
いやー、IOがぐっと下がったのは良いけど、遅い遅い。
読み出し速度が14MB/secしか出てなくて単純計算6倍近くかかるってことは3時間かー…ごはんでも食べてこようそうしよう。
読み出し速度が14MB/secしか出てなくて単純計算6倍近くかかるってことは3時間かー…ごはんでも食べてこようそうしよう。
_人人人人人_
> 時は流れた <
 ̄Y^Y^Y^Y^Y^ ̄
さーてクローンどうなったかな。
mysql> CLONE INSTANCE FROM cloner@yyy.yyy.yyy.yyy:3306 IDENTIFIED BY 'xxxx';
Timeout, server 192.168.211.58 not responding.
Σ(゚д゚lll) VPN切れてる!?
2019-11-26T15:31:25.203117+09:00 474 [Note] [MY-013458] [InnoDB] Stage progress: 96% completed.
2019-11-26T15:33:30.443342+09:00 475 [Note] [MY-013458] [InnoDB] Clone set error ACK: 1317
2019-11-26T15:33:30.443418+09:00 475 [Note] [MY-013458] [InnoDB] Clone Set Error code: 1317 Saved Error code: 0
2019-11-26T15:33:30.443440+09:00 475 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_ACK: Storage Ack.'
2019-11-26T15:33:30.443488+09:00 475 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T15:33:30.443902+09:00 475 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_EXIT: Storage End.'
2019-11-26T15:33:30.443943+09:00 475 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_RES_COMPLETE.'
2019-11-26T15:33:30.443959+09:00 475 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Exiting clone protocol.'
2019-11-26T15:33:30.464396+09:00 474 [Note] [MY-013458] [InnoDB] Clone Set Error code: 1160 Saved Error code: 1317
2019-11-26T15:33:30.464443+09:00 474 [Note] [MY-013458] [InnoDB] Clone Set Error code: 1160 Saved Error code: 1317
2019-11-26T15:33:30.464465+09:00 474 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: COM_EXECUTE: Storage Execute: error: 1160: Got an error writing communication packets.'
2019-11-26T15:33:30.464491+09:00 474 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Before sending COM_RES_ERROR: network : error: 1160: Got an error writing communication packets.'
2019-11-26T15:33:30.464566+09:00 474 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: After sending COM_RES_ERROR: error: 1160: Got an error writing communication packets.'
2019-11-26T15:33:30.464646+09:00 474 [Note] [MY-013458] [InnoDB] Clone Set Error code: 1317 Saved Error code: 1317
2019-11-26T15:33:30.464687+09:00 474 [Note] [MY-011846] [InnoDB] Clone Stop PAGE ARCH : end LSN : 1928777840909, log sys LSN : 1928826565652
2019-11-26T15:33:30.464922+09:00 474 [Note] [MY-013457] [InnoDB] Clone End Master Task ID: 0 Failed, code: 1317: Got an error writing communication packets
2019-11-26T15:33:30.464955+09:00 474 [Note] [MY-013273] [Clone] Plugin Clone reported: 'Server: Exiting clone protocol: error: 1160: Got an error writing communication packets.'
Σ(゚д゚lll)
というわけで今、圧縮しない設定でCLONE取り直してます。ちゃんとtmux使って。
機会があれば他のパラメーターも試す。