MySQL DBA如何利用strace/pstack/gdb来定位问题

 2022-10-27    335  

strace简介

strace是Linux环境下的一款程序调试工具,用来监察一个应用程序所使用的系统调用。

Strace是一个简单的跟踪系统调用执行的工具。在其最简单的形式中,它可以从开始到结束跟踪二进制的执行,并在进程的生命周期中输出一行具有系统调用名称,每个系统调用的参数和返回值的文本行。

常用选项

  • -T:strace输出显示时间
  • -t:可以在每行的输出之前添加时间戳,-tt可以精确到微妙级别,-ttt也可以精确到微妙级,但是它并不是打印当前时间,而是显示自从epoch以来的所经过的秒数
  • -s:指定输出的字符串的最大长度,默认为32,如果输出到文件的话会全部输出
  • -o:指定将strace输出到文件
  • -ff:如果提供-o filename,则所有进程的跟踪结果输出到相应的filename.pid中,pid是各进程的进程号.
  • -e:指定跟踪某个行为,例如-e trace=open指定只跟踪open行为
  • -y:将文件句柄用文件路径代替显示

正确姿势

简单使用

strace-T-tt-o/tmp/strace.logCMD
strace-T-ttCMD2>&1|tee/tmp/strace.log
strace-T-tt-s100-o/tmp/strace.logCMD
strace-T-tt-s100-ff-o/tmp/strace.logCMD
strace-T-tt-s100-etrace=XXXX-o/tmp/strace.logCMD

使用案例

利用strace观察客户端client执行SQL

#通过sys.processlist表中pid可以知道客户端连接pid
MySQL[sys]>selectthd_id,conn_id,user,pid,program_name,command,current_statementfromprocesslistwhereconn_id>0andpid>0;
+--------+---------+------------------+------+--------------+---------+-------------------+
|thd_id|conn_id|user|pid|program_name|command|current_statement|
+--------+---------+------------------+------+--------------+---------+-------------------+
|78|22|xucl@172.17.0.11|8656|mysql|Sleep|select*fromt1|
+--------+---------+------------------+------+--------------+---------+-------------------+
1rowinset(0.05sec)
#session1
[root@VM_0_11_centos~]#strace-T-tt-s100-o/tmp/strace.log-p5841
strace:Process5841attached
^Cstrace:Process5841detached
#session进行数据查询
MySQL[(none)]>usexucl
Readingtableinformationforcompletionoftableandcolumnnames
Youcanturnoffthisfeaturetogetaquickerstartupwith-A
Databasechanged
MySQL[xucl]>select*fromt1;
+----+----+
|id|c1|
+----+----+
|1||
+----+----+
1rowinset(0.00sec)

需要提醒一下的是从客户端连接到MySQL的时候需要注意客户端的版本,例如我用mariadb的客户端连接官方版本8.0就无法获取到pid

查看strace文件

....
20:54:16.901980poll([{fd=3,events=POLLIN|POLLPRI}],1,0)=0(Timeout)<0.000006>
20:54:16.902001write(3,"\21\0\0\0\3select*fromt1",21)=21<0.000011>
20:54:16.902026read(3,"\1\0\0\1\2\"\0\0\2\3def\4xucl\2t1\2t1\2id\2id\f?\0\v\0\0\0\3\3P\0\0\0\"\0\0\3\3def\4xucl\2t1\2t1\2c1\2c1\f!\0\36\0\0\0\375\t@\0\0\0\5\0\0\4\376\0\0\2\0\3\0\0\5\0011\0\5\0\0"...,16384)=106<0.000355>
20:54:16.902405times({tms_utime=0,tms_stime=0,tms_cutime=0,tms_cstime=0})=429630696<0.000005>
20:54:16.902441write(1,"+----+----+\n",12)=12<0.000026>
20:54:16.902482write(1,"|id|c1|\n",12)=12<0.000018>
20:54:16.902514write(1,"+----+----+\n",12)=12<0.000018>
20:54:16.902549write(1,"|1||\n",12)=12<0.000019>
20:54:16.902582write(1,"+----+----+\n",12)=12<0.000019>
20:54:16.902631write(1,"\33(B\33[0;1m1rowinset(0.00sec)\n",33)=33<0.000019>
20:54:16.902666write(1,"\33(B\33[m\33(B\33[0;1m\n",16)=16<0.000019>
20:54:16.902702stat("/etc/localtime",{st_mode=S_IFREG|0644,st_size=528,...})=0<0.000006>
20:54:16.902752rt_sigprocmask(SIG_BLOCK,[INT],[],8)=0<0.000004>
20:54:16.902771ioctl(0,TIOCGWINSZ,{ws_row=42,ws_col=163,ws_xpixel=0,ws_ypixel=0})=0<0.000005>
20:54:16.902789ioctl(0,TIOCSWINSZ,{ws_row=42,ws_col=163,ws_xpixel=0,ws_ypixel=0})=0<0.000004>
20:54:16.902806ioctl(0,TCGETS,{B38400opostisigicanonecho...})=0<0.000005>
20:54:16.902824ioctl(0,SNDCTL_TMR_STOPorTCSETSW,{B38400opostisig-icanon-echo...})=0<0.000006>
20:54:16.902841write(1,"\33(B\33[m",6)=6<0.000022>
20:54:16.902877rt_sigprocmask(SIG_SETMASK,[],NULL,8)=0<0.000005>
20:54:16.902895rt_sigprocmask(SIG_BLOCK,[INTQUITALRMTSTPTTINTTOU],[],8)=0<0.000005>
20:54:16.902912rt_sigaction(SIGINT,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{0x55e19fee21d0,[INT],SA_RESTORER|SA_RESTART,0x7fc3e63585f0},8)=0<0.000005>
20:54:16.902931rt_sigaction(SIGTERM,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{SIG_DFL,[],SA_RESTORER,0x7fc3e63585f0},8)=0<0.000005>
20:54:16.902948rt_sigaction(SIGQUIT,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{0x55e19fee1fa0,[QUIT],SA_RESTORER|SA_RESTART,0x7fc3e63585f0},8)=0<0.000005>
20:54:16.902966rt_sigaction(SIGALRM,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{SIG_DFL,[],SA_RESTORER,0x7fc3e63585f0},8)=0<0.000005>
20:54:16.902983rt_sigaction(SIGTSTP,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{SIG_DFL,[],SA_RESTORER,0x7fc3e63585f0},8)=0<0.000005>
20:54:16.903001rt_sigaction(SIGTTOU,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{SIG_DFL,[],SA_RESTORER,0x7fc3e63585f0},8)=0<0.000004>
20:54:16.903018rt_sigaction(SIGTTIN,{0x55e19ff48a00,[],SA_RESTORER,0x7fc3e63585f0},{SIG_DFL,[],SA_RESTORER,0x7fc3e63585f0},8)=0<0.000006>
20:54:16.903037rt_sigprocmask(SIG_SETMASK,[],NULL,8)=0<0.000005>
20:54:16.903055rt_sigaction(SIGWINCH,{0x55e19ff481a0,[],SA_RESTORER|SA_RESTART,0x7fc3e63585f0},{0x55e19fee0760,[WINCH],SA_RESTORER|SA_RESTART,0x7fc3e63585f0},8)=0<0.000006>
20:54:16.903088write(1,"MySQL[xucl]>",14)=14<0.000032>
20:54:16.903139read(0,<detached...>

利用strace观察server端执行

#命令
strace-o/tmp/strace.log-T-tt-f-p`pidofmysqld`
#假设你想看跟IO相关的行为,可以用-etrace来指定
#mysql相关的IO操作:
#mysql:read,write,open
#innodb:pread64,pwrite64
strace-o/tmp/strace.log-T-tt-f-etrace=read,open,write,pwrite64,pread64-p`pidofmysqld`

示例:

strace-o/tmp/strace.log-T-tt-ff-p`pidofmysqld`
#然后在本地通过socket登录执行命令
MySQL[performance_schema]>usexucl;
Readingtableinformationforcompletionoftableandcolumnnames
Youcanturnoffthisfeaturetogetaquickerstartupwith-A
Databasechanged
MySQL[xucl]>select*fromt1;
+------+
|id|
+------+
|1|
|2|
|4|
|3|
|5|
|6|
+------+
6rowsinset(0.01sec)
#查看本线程对应的LWP
MySQL[xucl]>showprocesslist;
+----+------+-----------+------+---------+------+----------+------------------+
|Id|User|Host|db|Command|Time|State|Info|
+----+------+-----------+------+---------+------+----------+------------------+
|28|root|localhost|xucl|Query|0|starting|showprocesslist|
+----+------+-----------+------+---------+------+----------+------------------+
1rowinset(0.00sec)
MySQL[xucl]>select*fromperformance_schema.threadswhereprocesslist_id=28\G
***************************1.row***************************
THREAD_ID:62
NAME:thread/sql/one_connection
TYPE:FOREGROUND
PROCESSLIST_ID:28
PROCESSLIST_USER:root
PROCESSLIST_HOST:localhost
PROCESSLIST_DB:xucl
PROCESSLIST_COMMAND:Query
PROCESSLIST_TIME:0
PROCESSLIST_STATE:Sendingdata
PROCESSLIST_INFO:select*fromperformance_schema.threadswhereprocesslist_id=28
PARENT_THREAD_ID:NULL
ROLE:NULL
INSTRUMENTED:YES
HISTORY:YES
CONNECTION_TYPE:Socket
THREAD_OS_ID:8208
1rowinset(0.00sec)

strace会在/tmp文件夹下生成strace.log.xxx,xxx对应的就是THREAD_OS_ID,我们直接看strace.log.8208文件

t@VM_0_9_centostmp]#catstrace.log.8208
16:44:14.447709restart_syscall(<...resuminginterruptedrestart_syscall...>)=1<3.268331>
16:44:17.716076recvfrom(134<socket:[12555437]>,"\22\0\0\0",4,MSG_DONTWAIT,NULL,NULL)=4<0.000015>
16:44:17.716179recvfrom(134<socket:[12555437]>,"\3SELECTDATABASE()",18,MSG_DONTWAIT,NULL,NULL)=18<0.000017>
16:44:17.716318sendto(134<socket:[12555437]>,"\1\0\0\1\1\0\0\2\3def\0\0\0\nDATABASE()\0\f!\0f"...,68,MSG_DONTWAIT,NULL,0)=68<0.000033>
16:44:17.716420recvfrom(134<socket:[12555437]>,"\5\0\0\0",4,MSG_DONTWAIT,NULL,NULL)=4<0.000015>
16:44:17.716476recvfrom(134<socket:[12555437]>,"\2xucl",5,MSG_DONTWAIT,NULL,NULL)=5<0.000017>
16:44:17.716538access("./xucl",F_OK)=0<0.000016>
16:44:17.716604sendto(134<socket:[12555437]>,"\7\0\0\1\0\0\0\2\0\0\0",11,MSG_DONTWAIT,NULL,0)=11<0.000090>
16:44:17.716738recvfrom(134<socket:[12555437]>,0x7f06ec009fe0,4,MSG_DONTWAIT,NULL,NULL)=-1EAGAIN(Resourcetemporarilyunavailable)<0.000014>
16:44:17.716793poll([{fd=134<socket:[12555437]>,events=POLLIN|POLLPRI}],1,600000)=1([{fd=134,revents=POLLIN}])<1.326139>
16:44:19.042985recvfrom(134<socket:[12555437]>,"\21\0\0\0",4,MSG_DONTWAIT,NULL,NULL)=4<0.000014>
16:44:19.043048recvfrom(134<socket:[12555437]>,"\3select*fromt1",17,MSG_DONTWAIT,NULL,NULL)=17<0.000017>
16:44:19.043256sendto(134<socket:[12555437]>,"\1\0\0\1\1\"\0\0\2\3def\4xucl\2t1\2t1\2id\2id\f?"...,97,MSG_DONTWAIT,NULL,0)=97<0.000137>
16:44:19.043448recvfrom(134<socket:[12555437]>,0x7f06ec009fe0,4,MSG_DONTWAIT,NULL,NULL)=-1EAGAIN(Resourcetemporarilyunavailable)<0.000014>
16:44:19.043501poll([{fd=134<socket:[12555437]>,events=POLLIN|POLLPRI}],1,600000)=1([{fd=134,revents=POLLIN}])<11.279983>
16:44:30.323556recvfrom(134<socket:[12555437]>,"\21\0\0\0",4,MSG_DONTWAIT,NULL,NULL)=4<0.000050>
16:44:30.323682recvfrom(134<socket:[12555437]>,"\3showprocesslist",17,MSG_DONTWAIT,NULL,NULL)=17<0.000017>
16:44:30.323788sendto(134<socket:[12555437]>,"\1\0\0\1\10\30\0\0\2\3def\0\0\0\2Id\0\f?\0\25\0\0\0\10\201\0\0\0"...,324,MSG_DONTWAIT,NULL,0)=324<0.000221>
16:44:30.324066recvfrom(134<socket:[12555437]>,0x7f06ec009fe0,4,MSG_DONTWAIT,NULL,NULL)=-1EAGAIN(Resourcetemporarilyunavailable)<0.000014>
16:44:30.324120poll([{fd=134<socket:[12555437]>,events=POLLIN|POLLPRI}],1,600000)=1([{fd=134,revents=POLLIN}])<2.943989>
16:44:33.268173recvfrom(134<socket:[12555437]>,"A\0\0\0",4,MSG_DONTWAIT,NULL,NULL)=4<0.000055>
16:44:33.268309recvfrom(134<socket:[12555437]>,"\3select*fromperformance_schem"...,65,MSG_DONTWAIT,NULL,NULL)=65<0.000017>
16:44:33.268592sendto(134<socket:[12555437]>,"\1\0\0\1\21H\0\0\2\3def\22performance_schema"...,1606,MSG_DONTWAIT,NULL,0)=1606<0.000142>
16:44:33.268790recvfrom(134<socket:[12555437]>,0x7f06ec009fe0,4,MSG_DONTWAIT,NULL,NULL)=-1EAGAIN(Resourcetemporarilyunavailable)<0.000018>
16:44:33.268853poll([{fd=134<socket:[12555437]>,events=POLLIN|POLLPRI}],1,600000<detached...>

可以清楚地看到有对应的SQL输出

pstack简介

pstack用来跟踪进程栈,这个命令在排查进程问题时非常有用,比如我们发现一个服务一直处于work状态(如假死状态,好似死循环),使用这个命令就能轻松定位问题所在;

可以在一段时间内,多执行几次pstack,若发现代码栈总是停在同一个位置,那个位置就需要重点关注,很可能就是出问题的地方;

正确使用姿势

pstack使用起来非常简单,直接pstack -pid即可

[root@VM_0_9_centosmysql3306]#netstat-lntp|grep3306
tcp600:::33060:::*LISTEN13499/mysqld
tcp600:::3306:::*LISTEN13499/mysqld
[root@VM_0_9_centosmysql3306]#ps-Lf13499
UIDPIDPPIDLWPCNLWPSTIMETTYSTATTIMECMD
mysql13499113161349905320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161350605320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161350705320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161350805320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161350905320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161351005320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
mysql13499113161351105320:31pts/0Sl0:00/usr/local/mysql/bin/mysqld--defaults-file=/data/mysql/mysql3306/my3306.cnf--user=mysql
....

mysqld的pid为13499,fork了几十个线程,有的是用户连接,有的是MySQL自己的线程,例如read_thread、write_thread等等,这里的LWP对应的就是performance_schema.threads表的THREAD_OS_ID。

全部抓取

假如发生了mysqld hang住的情况,可以直接pstack mysqldpid来抓取所有的线程堆栈,这也是通常使用的方式,例如

pstack13499>>/tmp/pstack.log

这里简单随便看下抓取到的pstack

Thread27(Thread0x7ff1bc29f700(LWP13535)):
#00x00007ff1f6368da2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
#10x00000000020ace25inos_event::timed_wait(timespecconst*)()at../../../mysql-8.0.19/storage/innobase/include/sync0types.h:540
#20x00000000020ad9e1inos_event::wait_time_low(this=0x7ff1dc022458,time_in_usec=<optimizedout>,reset_sig_count=1)at../../../mysql-8.0.19/storage/innobase/os/os0event.cc:495
#30x000000000215fe0cinsrv_error_monitor_thread()()at../../../mysql-8.0.19/storage/innobase/srv/srv0srv.cc:1835
#40x0000000002076da5in__invoke_impl<void,void(*&)()>(__f=<syntheticpointer>)at/opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:89
#5__invoke<void(*&)()>(__fn=<syntheticpointer>)at/opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:95
#6__call<void>(__args=...,this=<syntheticpointer>)at/opt/rh/devtoolset-8/root/usr/include/c++/8/functional:400
#7operator()<>(this=<syntheticpointer>)at/opt/rh/devtoolset-8/root/usr/include/c++/8/functional:484
#8operator()<void(*)()>(f=@0x7ff1dc4aec18:0x215fd20<srv_error_monitor_thread()>,this=0x7ff1dc4aec20)at../../../mysql-8.0.19/storage/innobase/include/os0thread-create.h:101
#9__invoke_impl<void,Runnable,void(*)()>(__f=...)at/opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:60
#10__invoke<Runnable,void(*)()>(__fn=...)at/opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:95
#11_M_invoke<0,1>(this=0x7ff1dc4aec18)at/opt/rh/devtoolset-8/root/usr/include/c++/8/thread:244
#12operator()(this=0x7ff1dc4aec18)at/opt/rh/devtoolset-8/root/usr/include/c++/8/thread:253
#13std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable,void(*)()>>>::_M_run(this=0x7ff1dc4aec10)at/opt/rh/devtoolset-8/root/usr/include/c++/8/thread:196
#140x000000000265d64finexecute_native_thread_routine()
#150x00007ff1f6364e65instart_thread()from/lib64/libpthread.so.0
#160x00007ff1f44a188dinclone()from/lib64/libc.so.6

从堆栈可以看出这其实就是MySQL的error monitor线程

单个抓取

假设你运气好,你现在可以通过performance_schema.threads可以查到对应的线程的thread_os_id,你可以指定单个线程的lwp进行抓取

MySQL[performance_schema]>showprocesslist;
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
|Id|User|Host|db|Command|Time|State|Info|
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
|4|event_scheduler|localhost|NULL|Daemon|4319|Waitingonemptyqueue|NULL|
|23|root|localhost|performance_schema|Query|0|starting|showprocesslist|
|24|root|localhost|xucl|Query|22|Usersleep|selectsleep(600)fromt1|
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
3rowsinset(0.00sec)
MySQL[performance_schema]>selectTHREAD_OS_IDfromthreadswhereprocesslist_id=24\G
***************************1.row***************************
THREAD_OS_ID:15020
1rowinset(0.00sec)

你可以这样操作

pstack15020>/tmp/single_pstack.log

查看pstack结果,只抓取了15020线程的堆栈

Thread1(process15020):
#00x00007ff1f6368da2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
#10x0000000001189b53innative_cond_timedwait(abstime=0x7ff1b810d7f0,mutex=0x3b96760<LOCK_item_func_sleep>,cond=0x7ff1b810d8e0)at../../mysql-8.0.19/include/thr_cond.h:149
#2my_cond_timedwait(abstime=0x7ff1b810d7f0,mp=0x3b96760<LOCK_item_func_sleep>,cond=0x7ff1b810d8e0)at../../mysql-8.0.19/include/thr_cond.h:149
#3inline_mysql_cond_timedwait(src_file=0x2de5538"../../mysql-8.0.19/sql/item_func.cc",src_line=4786,abstime=0x7ff1b810d7f0,mutex=0x3b96760<LOCK_item_func_sleep>,that=0x7ff1b810d8e0)at../../mysql-8.0.19/include/mysql/psi/mysql_cond.h:217
#4Interruptible_wait::wait(thisthis=this@entry=0x7ff1b810d890,condcond=cond@entry=0x7ff1b810d8e0,mutexmutex=mutex@entry=0x3b96760<LOCK_item_func_sleep>)at../../mysql-8.0.19/sql/item_func.cc:4786
#50x0000000001189e26inItem_func_sleep::val_int(this=<optimizedout>)at../../mysql-8.0.19/sql/item_func.cc:5435
#60x00000000010ff567inItem::send(this=0x7ff1d45d9380,protocol=0x7ff1d4392240,buffer=<optimizedout>)at../../mysql-8.0.19/sql/item.cc:6638
#70x0000000000e3d36binTHD::send_result_set_row(List<Item>*)()at../../mysql-8.0.19/sql/sql_class.cc:2530
#80x000000000127e8f1inQuery_result_send::send_data(this=<optimizedout>,thd=0x7ff1d457b7b0,items=...)at../../mysql-8.0.19/sql/query_result.cc:93
#90x0000000000f5d29ainSELECT_LEX_UNIT::ExecuteIteratorQuery(THD*)()at../../mysql-8.0.19/sql/sql_union.cc:1530
#100x0000000000f5f3f9inSELECT_LEX_UNIT::execute(THD*)()at../../mysql-8.0.19/sql/sql_union.cc:1588
#110x0000000000eedf8binSql_cmd_dml::execute_inner(THD*)()at../../mysql-8.0.19/sql/sql_select.cc:910
#120x0000000000ef7418inSql_cmd_dml::execute(this=0x7ff1d45d9a38,thd=0x7ff1d457b7b0)at../../mysql-8.0.19/sql/sql_select.cc:715
#130x0000000000eab3a9inmysql_execute_command(THD*,bool)()at../../mysql-8.0.19/sql/sql_parse.cc:4478
#140x0000000000ead0ccinmysql_parse(thdthd=thd@entry=0x7ff1d457b7b0,parser_stateparser_state=parser_state@entry=0x7ff1b810f610)at../../mysql-8.0.19/sql/sql_parse.cc:5288
#150x0000000000eaeb6bindispatch_command(THD*,COM_DATAconst*,enum_server_command)()at../../mysql-8.0.19/sql/sql_parse.cc:1777
#160x0000000000eb0104indo_command(thdthd=thd@entry=0x7ff1d457b7b0)at../../mysql-8.0.19/sql/sql_parse.cc:1275
#170x0000000000fc1a08inhandle_connection(argarg=arg@entry=0x43cae90)at../../mysql-8.0.19/sql/conn_handler/connection_handler_per_thread.cc:302
#180x00000000023ffdecinpfs_spawn_thread(arg=0x44c2a30)at../../../mysql-8.0.19/storage/perfschema/pfs.cc:2854
#190x00007ff1f6364e65instart_thread()from/lib64/libpthread.so.0
#200x00007ff1f44a188dinclone()from/lib64/libc.so.6

gdb简介

GDB 是 linux 环境下的一般功能强大的调试器,用来调试 C 或 C++ 写的程序。它可以做这些事情

  • Start your program, specifying anything that might affect its behavior.
  • Make your program stop on specified conditions.
  • Examine what has happened, when your program has stopped.
  • Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.

常用命令

  • info threads:查看全部线程
  • thread n:指定某个线程
  • b:在某处打断点
  • c:继续往下走
  • s:执行一行代码,如果代码函数调用,则进入函数
  • n:执行一行代码,函数调用不进入
  • p:打印某个变量值
  • list:打印代码的文本信息
  • bt:查看某个线程的栈帧
  • info b:查看当前所有断点信息

实际案例

进入gdb界面

[root@VM_0_9_centos~]#gdb/usr/local/mysql5.7/bin/mysqld

进入调试

run--defaults-file=/data/mysql/mysql3308/my3308.cnf--user=mysql--gdb

通过info thread查看所有线程,这个时候我从另外一个会话连接到了MySQL,我从pfs.threads查寻到这个会话的LWP值为513,那么我可以通过thread 41切换到这个线程

(gdb)infothread
IdTargetIdFrame
41Thread0x7fff819cf700(LWP513)"mysqld"0x00007ffff61abbedinpoll()from/lib64/libc.so.6
40Thread0x7fff5affd700(LWP416)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
39Thread0x7fff81a51700(LWP415)"mysqld"0x00007ffff7bce381insigwait()from/lib64/libpthread.so.0
38Thread0x7fff5b7fe700(LWP414)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
37Thread0x7fff5bfff700(LWP413)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
36Thread0x7fff6cff9700(LWP412)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
35Thread0x7fff6d7fa700(LWP411)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
34Thread0x7fff6dffb700(LWP410)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
33Thread0x7fff6e7fc700(LWP409)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
32Thread0x7fff6effd700(LWP408)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
31Thread0x7fff6f7fe700(LWP407)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
30Thread0x7fff6ffff700(LWP406)"mysqld"0x00007ffff7bcde5dinnanosleep()from/lib64/libpthread.so.0
29Thread0x7fff8084e700(LWP405)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
28Thread0x7fff8104f700(LWP404)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
27Thread0x7fff81850700(LWP403)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
24Thread0x7fff74c24700(LWP398)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
23Thread0x7fff75425700(LWP397)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
22Thread0x7fff75c26700(LWP396)"mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
21Thread0x7fff76427700(LWP395)"mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2.3.2()from/lib64/libpthread.so.0
20Thread0x7fff76c28700(LWP394)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
19Thread0x7fff77429700(LWP393)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
18Thread0x7fff77c2a700(LWP392)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
17Thread0x7fff7842b700(LWP391)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
16Thread0x7fff78c2c700(LWP390)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
15Thread0x7fff7942d700(LWP389)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
14Thread0x7fff79c2e700(LWP388)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
13Thread0x7fff7a42f700(LWP387)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
12Thread0x7fff7ac30700(LWP386)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
11Thread0x7fff7b431700(LWP385)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
10Thread0x7fff7bc32700(LWP384)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
9Thread0x7fff7c433700(LWP383)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
8Thread0x7fff7cc34700(LWP382)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
7Thread0x7fff7d435700(LWP381)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
6Thread0x7fff7dc36700(LWP380)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
5Thread0x7fff7e437700(LWP379)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
4Thread0x7fff7ec38700(LWP378)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
3Thread0x7fff7f439700(LWP377)"mysqld"0x00007ffff79bd644in__io_getevents_0_4()from/lib64/libaio.so.1
2Thread0x7fffec8c4700(LWP376)"mysqld"0x00007ffff60ef53ainsigwaitinfo()from/lib64/libc.so.6
*1Thread0x7ffff7fe4880(LWP371)"mysqld"0x00007ffff61abbedinpoll()from/lib64/libc.so.6
(gdb)thread41
[Switchingtothread41(Thread0x7fff819cf700(LWP513))]
#00x00007ffff61abbedinpoll()from/lib64/libc.so.6

演示一下打断点:

(gdb)bdo_command
Breakpoint1at0xc8d060:file/usr/local/src/mysql-5.7.30/sql/sql_parse.cc,line903.
(gdb)c
Continuing.
(gdb)bdo_command
Breakpoint1at0xc8d060:file/usr/local/src/mysql-5.7.30/sql/sql_parse.cc,line903.
(gdb)n
Singlesteppinguntilexitfromfunctionpoll,
whichhasnolinenumberinformation.
vio_io_wait(viovio=vio@entry=0x54fd330,eventevent=event@entry=VIO_IO_EVENT_READ,timeout=<optimizedout>)at/usr/local/src/mysql-5.7.30/vio/viosocket.c:815
815MYSQL_END_SOCKET_WAIT(locker,0);
(gdb)n
817}
(gdb)n
vio_socket_io_wait(viovio=vio@entry=0x54fd330,eventevent=event@entry=VIO_IO_EVENT_READ)at/usr/local/src/mysql-5.7.30/vio/viosocket.c:92
92ret=-1;
(gdb)n
105}
(gdb)n
vio_read(vio=0x54fd330,buf=0x54e3260"\001",size=4)at/usr/local/src/mysql-5.7.30/vio/viosocket.c:134
134while((ret=mysql_socket_recv(vio->mysql_socket,(SOCKBUF_T*)buf,size,flags))==-1)
(gdb)n
148}
(gdb)n
net_read_raw_loop(netnet=net@entry=0x54841e8,countcount=count@entry=4)at/usr/local/src/mysql-5.7.30/sql/net_serv.cc:697
697if(recvcnt==VIO_SOCKET_ERROR)
(gdb)n
706elseif(!recvcnt)
(gdb)n
715thd_increment_bytes_received(recvcnt);
(gdb)n
712count-=recvcnt;
(gdb)n
713buf+=recvcnt;
(gdb)n
715thd_increment_bytes_received(recvcnt);
(gdb)n
692while(count)
(gdb)n
737}
(gdb)n
736returnMY_TEST(count);
(gdb)n
737}
(gdb)n
net_read_packet_header(net=0x54841e8)at/usr/local/src/mysql-5.7.30/sql/net_serv.cc:779
779server_extension->m_after_header(net,user_data,count,rc);
(gdb)n
778rc=net_read_raw_loop(net,count);
(gdb)n
779server_extension->m_after_header(net,user_data,count,rc);
(gdb)n
787if(rc)
(gdb)n
792pkt_nr=net->buff[net->where_b+3];
(gdb)n
798if(pkt_nr!=(uchar)net->pkt_nr)
(gdb)n
818net->pkt_nr++;
(gdb)n
net_read_packet(net=0x54841e8,complen=0x7fff819cec00)at/usr/local/src/mysql-5.7.30/sql/net_serv.cc:850
850if(net->compress)
(gdb)n
844if(net_read_packet_header(net))
(gdb)n
847net->compress_pkt_nr=net->pkt_nr;
(gdb)n
850if(net->compress)
(gdb)n
868pkt_len=uint3korr(net->buff+net->where_b);
(gdb)n
871if(!pkt_len)
(gdb)n
868pkt_len=uint3korr(net->buff+net->where_b);
(gdb)n
871if(!pkt_len)
(gdb)ppkt_len
$1=22
(gdb)n
874pkt_data_len=max(pkt_len,*complen)+net->where_b;
(gdb)n
877if((pkt_data_len>=net->max_packet)&&net_realloc(net,pkt_data_len))
(gdb)n
881if(net_read_raw_loop(net,pkt_len))
(gdb)n
868pkt_len=uint3korr(net->buff+net->where_b);
(gdb)n
886returnpkt_len;
(gdb)n
891}
(gdb)n
my_net_read(net=0x54841e8)at/usr/local/src/mysql-5.7.30/sql/net_serv.cc:922
922if(len==MAX_PACKET_LENGTH)
(gdb)n
925ulongsave_pos=net->where_b;
(gdb)n
922if(len==MAX_PACKET_LENGTH)
(gdb)n
937net->read_pos=net->buff+net->where_b;
(gdb)n
938if(len!=packet_error)
(gdb)n
937net->read_pos=net->buff+net->where_b;
(gdb)n
938if(len!=packet_error)
(gdb)n
939net->read_pos[len]=0;/*Safeguardformysql_use_result*/
(gdb)n
1056MYSQL_NET_READ_DONE(0,len);
(gdb)n
1058}
(gdb)n
Protocol_classic::read_packet(this=0x5483a88)at/usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:820
820raw_packet=m_thd->net.read_pos;
(gdb)n
819bad_packet=false;
(gdb)n
820raw_packet=m_thd->net.read_pos;
(gdb)n
829}
(gdb)n
820raw_packet=m_thd->net.read_pos;
(gdb)n
829}
(gdb)n
Protocol_classic::get_command(this=0x5483a88,com_data=0x7fff819cecc0,cmd=0x7fff819cecb0)at/usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:983
983if(packet_length==0)/*safety*/
(gdb)n
990raw_packet[packet_length]='\0';/*safety*/
(gdb)n
1002returnparse_packet(com_data,*cmd);
(gdb)n
990raw_packet[packet_length]='\0';/*safety*/
(gdb)n
992*cmd=(enumenum_server_command)raw_packet[0];
(gdb)n
994if(*cmd>=COM_END)
(gdb)n
992*cmd=(enumenum_server_command)raw_packet[0];
(gdb)n
1002returnparse_packet(com_data,*cmd);
(gdb)n
999packet_length--;
(gdb)n
1000raw_packet++;
(gdb)n
1002returnparse_packet(com_data,*cmd);
(gdb)n
1003}
(gdb)n
do_command(thdthd=thd@entry=0x5482a30)at/usr/local/src/mysql-5.7.30/sql/sql_parse.cc:974
974if(rc)
(gdb)s
971rc=thd->get_protocol()->get_command(&com_data,&command);
(gdb)bt
#0do_command(thdthd=thd@entry=0x5482a30)at/usr/local/src/mysql-5.7.30/sql/sql_parse.cc:971
#10x0000000000d4abf8inhandle_connection(argarg=arg@entry=0x5418ff0)at/usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313
#20x000000000120a451inpfs_spawn_thread(arg=0x5452410)at/usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197
#30x00007ffff7bc6e65instart_thread()from/lib64/libpthread.so.0
#40x00007ffff61b688dinclone()from/lib64/libc.so.6
(gdb)s
972thd->m_server_idle=false;
(gdb)s
974if(rc)
(gdb)s
1027thd->variables.net_buffer_length);
(gdb)s
Protocol_classic::get_packet(this=0x5483a88)at/usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:807
807{
(gdb)bt
#0Protocol_classic::get_packet(this=0x5483a88)at/usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:807
#10x0000000000c8d1b3indo_command(thdthd=thd@entry=0x5482a30)at/usr/local/src/mysql-5.7.30/sql/sql_parse.cc:1027
#20x0000000000d4abf8inhandle_connection(argarg=arg@entry=0x5418ff0)at/usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313
#30x000000000120a451inpfs_spawn_thread(arg=0x5452410)at/usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197
#40x00007ffff7bc6e65instart_thread()from/lib64/libpthread.so.0
#50x00007ffff61b688dinclone()from/lib64/libc.so.6

我们可以在调试过程中随时打印某个变量的值,例如:

(gdb)ppkt_len
$1=22

甚至gdb可以在紧急情况下救你一命,例如,当MySQL数据库连接打满又没有后台线程可以连接到MySQL的时候,你可以通过gdb来修改MySQL的连接数,例如

gdb-p$(pidofmysqld)-ex"setmax_connections=1500"-batch

总结一下:

  • 本文内容比较多,给大家介绍了几个MySQL DBA(可能其他岗位同样适用)应该掌握的必备工具
  • strace可以用来跟踪某个线程的调用情况,例如可以适用strace跟踪客户端SQL执行情况,如果开发说有很多慢SQL,而MySQL却没有任何记录,那么就可以用strace来跟踪把锅甩给开发(案例来自有赞王航威)
  • pstack是平时用的比较多的工具,尤其是诊断MySQL hang住的情况,例如主从延迟特别高等
  • gdb可能用的不是太多,但是我认为是MySQL DBA进阶必会的技能之一,尤其是抽丝剥茧某些疑难case的时候非常有用

  •  标签:  
  • MySQL
  •  

原文链接:https://77isp.com/post/9977.html

=========================================

https://77isp.com/ 为 “云服务器技术网” 唯一官方服务平台,请勿相信其他任何渠道。