Saturday, September 10, 2016

Fun with Bugs #45 - On Some Bugs Fixed in MySQL 5.7.15

Oracle released MySQL 5.7.15 recently, earlier than expected. The reason for this "unexpected" release is not clear to me, but it could happen because of a couple of security related internal bug reports that got fixed:

  • "It was possible to write log files ending with .ini or .cnf that later could be parsed as option files. The general query log and slow query log can no longer be written to a file ending with .ini or .cnf. (Bug #24388753)
  • Privilege escalation was possible by exploiting the way REPAIR TABLE used temporary files. (Bug #24388746)"
Let me concentrate on the most important fixes to bugs and problems reported by Community users. First of all, in MySQL 5.7.15 one can just turn off InnoDB deadlock detection using the new  innodb_deadlock_detect dynamic server variable. Domas had explained the positive effect of this more than 6 years ago in his post. Some improvements to the way deadlock detection worked in MySQL happened in frames of fix for the Bug #49047 long time ago, but this time Oracle just implemented a way to disable check and rely on InnoDB lock wait timeout instead.

Other InnoDB-related fixes to problems reported in public bugs database include:
  • Bug #82073 - "Crash with InnoDB Encryption, 5.7.13, FusionIO & innodb_flush_method=O_DIRECT". It was reported by my colleague from MariaDB, Chris Calender, and verified by other my colleague from MariaDB, Jan Lindström. Probably Bugs Verification Team in Oracle just had no access to proper hardware to verify this.
  • Bug #79378 - "buf_block_align() makes incorrect assumptions about chunk size". This bug was reported by Alexey Kopytov, who had provided a patch.
There were several fixes to replication-related bugs:
  • Bug #81675 - "mysqlbinlog does not free the existing connection before opening new remote one". It was reported by Laurynas Biveinis from Percona, who had also provided a patch, and verified by Umesh.
  • Bug #80881 - "MTR: binlog test suite failed to cleanup (contribution)". This fix to the binlog test suit was contributed by Daniel Black and verified by Umesh.
  • Bug #79867 - "unnecessary using temporary for update". This bug was reported by Zhang Yingqiangwho had also contributed a patch (that was not used after all, according to the comment from Oracle developer). It was verified by Umesh.
 Some more bugs from other categories were also fixed:
  • Bug #82125 - "@@basedir sysvar value not normalized if set through the command line/INI file". It was reported by Georgi Kodinov from Oracle. It's funny that there is a typo in the release notes when this fix is described (pay attention to slashes):
    "If the basedir system variable was set at server startup from the command line or option file, the value was not normalized (on Windows, / was not replaced with /)"
  • Bug #82097 is private. I can not say anything about it in addition to this:
    "kevent statement timer subsystem deinitialization was revised to avoid a mysqld hang during shutdown on OS X 10.12."
    I can repeat, though, my usual statement that in most cases making bugs private is a wrong thing to do. I feel myself personally insulted every time when I see that fixed bug report remains private.
  • Bug #81666 - "The MYSQL_SERVER define not defined du to spelling error in plugin.cmake". It was reported by Magnus Blåudd who had provided a patch also.
  • Bug #81587 - "Combining ALTER operations triggers table rebuild". This bug was reported by Daniël van Eeden and verified by Umesh.
  • Bug #68972 - "Can't find temporary table". This bug (that could happen in a stored procedure or when prepared statements are used) was reported by Cyril Scetbon and verified by Miguel Solorzano.
  • Bug #82019 - "Is client library supposed to retry EINTR indefinitely or not". It was reported by Laurynas Biveinis from Percona, who had also contributed patches later. This bug was verified formally by Sinisa Milivojevic.
To summarize, you should consider upgrade to MySQL 5.7.15 for sure if you use FusionIO or want to be able to disable InnoDB deadlock detection entirely, or if you consider security-related fixes in this release really important (I don't). Otherwise just check other fixes that could impact you positively, or just wait for 5.7.16...

Saturday, July 30, 2016

Fun with Bugs #44 - Community Bugs Fixed in MySQL 5.7.14

MySQL 5.7.14 was officially released yesterday. So, it's time to check what bugs reported by MySQL Community in public were fixed in this release. Some of these bugs are presented below.

As usual, let me start with InnoDB. The following bugs were fixed there:
  • Bug #80296 - "FTS query exceeds result cache limit". It was reported (for 5.6, but I do not see new 5.6 release notes yet) by Monty Solomon and verified by Umesh.
  • Bug #80304 - "generated columns don't work with foreign key actions". It was reported by Guilhem Bichot based on test case by Peter Gulutzan presented here.As most community bug reports during last 2-3 years, it was verified by Umesh.
  • Bug #80298 - "Full-Text queries with additional secondary index gives NULL or Zero rows", was reported by Ray Lambe and verified by Umesh.
  • Bug #76728 - "reduce lock_sys->mutex contention for transaction that only contains SELECT". This old bug report by Zhai Weixiang (who had provided a patch) was verified by Sinisa Milivojevic.
  • Bug #80083 - "Setting innodb_monitor_enable to ALL does not enable all monitors". It was reported by Davi Arnaut and verified by Miguel Solorzano.
  • Bug #79772 - "Foreign key not allowed when a virtual index exists". It was reported and verified by Jesper wisborg Krogh from Oracle.
There are many more bugs fixed in InnoDB, but all of them were reported in internal Oracle's bugs database by Oracle employees. I do not like this trend.

Now, let's check replication bugs that were fixed:
  • Bug #79324 - "Slave is ~10x slower to execute set of statements compared to master RBR", was reported by Serge Grachov and verified by Umesh.
  • Bug #62008 - "read-only option does not allow inserts/updates on temporary tables". This bug was reported long time ago by Ivan Stanojevic and verified by me when I worked in Oracle.It's really good to see it fixed now!
Some bugs were fixed in Performance_Schema (who could imagine it has bugs...), but they were either reported internally or remain private, like Bug #81464. Just take into account that SELECT from some P_S tables could crash server before 5.7.14, based on release notes...

This time I see several build-related bugs fixed, like these:
  • Bug #81274 - "Add support for Solaris Studio 12.5 aka 5.14". It was reported (and probably fixed) by Tor Didriksen.
  • Bug #81593 - "adapt to gcc 5.3 on solaris". It was also reported and fixed by Tor Didriksen. personally I am happy to see that Oracle still cares about Solaris and related software. Historical sentiments...
  • Bug #80996 - "correct make_pair for c++11 (contribution)". This fix was contributed by Daniel Black
  • Bug #80371 - "MySQL fails to build with new default mode in GCC6". It was reported by Terje Røsten.
The last but not the least, I also have to mention this bug in audit (and, thus, query rewrite) plugins, Bug #81298 - "query rewrite plugin suffers scalability issues". It was reported by Vadim Tkachenko and verified by Sinisa Milivojevic. This is a great improvement.

To summarize, I see reasons to upgrade for those who rely a lot on FTS indexes in InnoDB, replication, audit plugins and Performance_schema. I had not even tried to build 5.7.14 from source yet, so I do not have any personal experience to share.

Sunday, June 5, 2016

Fun with Bugs #43 - Bugs Fixed in MySQL 5.7.13

For some reason I do not see numerous public announcements about recent MySQL 5.7.13 release, even though it happened a couple of days ago formally. Maybe that's because we do not have any really "big" new features in this release (new ->> operator for introduced though for those who, unlike me, care about JSON). Still there are many fixes for bugs previously reported by Community (or Oracle engineers, but still in public bugs database), and I'd like to discuss some of them.

Let me start with the bug that was NOT fixed though, Bug #81093, "mysqld_multi not included in mysql-community-server package", reported by Georgi Iovchev and verified by Umesh. For this bug we had not got any fix, but explanation of this kind:
"On platforms for which systemd support is installed, systemd has the capability of managing multiple MySQL instances. For details, see Configuring Multiple MySQL Instances Using systemd. Consequently, mysqld_multi and mysqld_multi.server are not installed because they are unnecessary."
I still think that removing familiar scripts like mysqld_safe or mysqld_multi from systems where they are not used by default to start services upon OS startup is a mistake that has to be corrected.

I'd like to highlight Bug #80299, "DEFAULT is not deterministic and should not be allowed in generated columns", also. It was reported by Roy Lyseng and verified by Umesh.The fix is not those bug's synopsis suggested. Instead, ALTER TABLE operations that change the definition of a column now cause a table rebuild if any generated column expression uses DEFAULT().

Now let's concentrate on InnoDB bugs fixed:
  • Bug #80898 - "Replication stops after transaction is rolled back asynchronously in master". It was reported by Debarun Banerjee (who probably works on group replication feature in Oracle). I wonder how this fix may influence Galera-based cluster, especially PXC 5.7 from Percona that may become available as GA some day. Something to check later.
  • Bug #80772 - "Excessive memory used in memory/innodb/os0file starting 5.7.8". This regression bug was reported by Morgan Tocker and verified by Umesh.
  • Bug #80761 - "Server crashed in mutex_create due to invalid option". It was reported by Allen Lai. It's not clear who could verify it, but the problem is fixed now.
  • Bug #80708 - "Problem importing encrypted tablespace". It was reported by my former colleague Sergei Glushchenko from Percona and verified by Umesh.
  • Bug #80669 - "Failing assert: fil_space_get(table->space) != __null in row0quiesce.cc line 724", was also reported by Percona employee and great, hard working QA engineer, Ramesh Sivaraman. As most of the bugs from Community, it was immediately (in 44 minutes!) verified by Umesh.
  • Bug #80667 - "innobase_get_computed_value(dtuple_t const*, dict_v_col_t const*, dict_index_t c". It was reported based on failing test case runs by Viswanatham Gudipati, who works for Oracle.
  • Bug #80327 - "InnoDB: Failing assertion: !(rec_get_info_bits(rec2, comp) & 0x10UL)", was reported by Erlend Dahl from Oracle based on numerous failures of innodb_gis.rtree_purge test. I am really happy to see Oracle engineers reporting test failures in public, even when they happen to 5.8 (a three that we, Community, may not have access to at the moment). This is exactly what I was fighting for since I started this blog in 2012.
  • Bug #80261 - "Invalid InnoDB FTS Doc ID during INSERT". This bug was reported by Monty Solomon and promptly verified by Umesh. The bug is fixed, but release notes do not mention it explicitly. There are bugs even in release notes, we should live with that.
  • Bug #80182 - "Unsupported extension error for new features while 'create table like'". This bug was reported by yet another great QA engineer and my former colleague in Percona, Shahriyar Rzayev. As usual, it was promptly verified by Umesh.
  • Bug #79330 - "DROP TABLESPACE fails for missing general tablespace *.ibd file", was reported by Daniel Price who works for Oracle and verified by Marko Mäkelä.
  • Bug #78761 - "Unexpected deadlock with innodb_autoinc_lock_mode=0". It was reported by Yiftach Kaplan and verified by Miguel Solorzano.
  • Bug #77011 - Hanging "System Lock" when executing "flush table ... for export", was reported by Vincent Meng and verified by Umesh. It is also fixed in MySQL 5.6.31.
Now let's proceed with replication bugs that were fixed:
  • Bug #78999 - "Injected query log events do not set explicit_defaults_ts, read it on applying", was reported by great Laurynas Biveinis from Percona and verified by Umesh. It was a regression introduced by the fix for Bug #72794.
  • Bug #78995 - "Binlog_sender::read_event may access freed event buffer". Yet another bug reported by Laurynas Biveinis and immediately verified by Umesh. Percona engineers did a great QA job while working on Percona Server 5.7 GA release, and we see that Oracle fixes their bugs reported back then...
  • Bug #78467 - "mysqldump --dump-slave does not work with multi-source". It was reported by Sven Sandberg. I think that bugs of this kind should not even appear with proper design for new features and, in any case, they have to be fixed before GA release, so that it is at least feature complete. But what do I know...
  • Bug #77496 - "Replication position lost after crash on MTS configured slave". It was reported by Simon Mudd almost a year ago for MySQL 5.6, but is fixed only recently (also in 5.6.31). The bug was verified by Umesh.
Honestly, I am scared with numerous problems we still see in new replication features that appeared in MySQL 5.6, and thinking about Oracle's group replication development on top of that makes me scared even more. It will be us, Support engineers, and poor production DBAs who are forced to deal with numerous features not working well together or breaking usual work habits here and there...
There we not so many bugs reported in public that were fixed in other areas. I'd like to highlight the following:
  • Bug #78512 - "/var/log/mysqld.log has incorrect permissions after installing server from repo", Bug #81390 - "mariadb galera not replaced with community packages during upgrade on fc24", and Bug #79377 - "service start failed to create database if it doesn't exists(ubuntu1510)". They were all reported by Ramana Yeruva who seem to work for Oracle.
  • Bug #81139 - "Missing break statement in mysql_read_default_options()". It was reported by Georgi Kodinov from Oracle.
  • Bug #81014 - "DELETE from joined tables with WHERE using derived table fails with error 1093". This regression bug in optimizer was reported by Jim Parks, who seem to work for Oracle.
  • Bug #80935 - "Dev library header files not on default include path". It was reported by Lars Tangvald who works for Oracle.
  • Bug #80748 - "mysql5.6 does not build on solaris12". This bug was reported by Tor Didriksen from Oracle. Fixed in MySQL 5.6.31 also. I have to check if this fix really help my builds on OpenIndiana VM one day...
  • Bug #80526 - "LEFT OUTER JOIN returns incorrect results on the outer side". It was reported by Attila Rózsár and verified by Miguel Solorzano
  • Bug #80461 - "Accessing uninitialized memory inside strings/decimal.c ull2dec line ~1071". This bug was reported by Sergey Sprogis and verified by Umesh.
  • Bug #80451 - "mysqld --initialize does not support the keyring_file_data option". It was reported and verified by Jesper Krogh from Oracle.
  • Bug #80333 - "Valgrind: Conditional jump or move in Rewriter::do_refresh", was reported by Roel Van de Paar from Percona and verified by Umesh.
  • Bug #80231 - "Column name in having, which is contained in select list '*', produces error", was reported by Su Dylan and formally verified by Sinisa Milivojevic.
  • Bug #80089 - "Memory leak from open files limit", was reported by Manuel Ung and verified by Umesh. This bug is also fixed in MySQL 5.6.31.
  • Bug #79591 - "select distinct not returning distinct rows in 5.7...". This regression bug was reported by Shane Bester and probably verified (and fixed) by Øystein Grøvlen.
  • Bug #79396 - "null pointer dereference", was reported by Павел Гусев and verified by Sinisa Milivojevic. It is fixed in MySQL 5.6.31 also. It would be nice to see Oracle using AppChecker static analyzer for MySQL code...
  • Bug #78254 - "After running mysql_upgrade proxies_priv user columns are not updated to 32". It was reported by Robert Gołębiowski and verified by Umesh well before 5.7 became GA, but fixed only recently. I really wonder from prevented fixing this bug in 5.7.9...
  • Bug #74977 - "Cryptic error when failing to unload a dynamic library", was reported by  Martin Hansson back in 2014 and fixed only recently (also in MySQL 5.6.31). 
  • Bug #74636 - "mysqld_multi misleading when my_print_defaults is not found", was reported by Andrii Nikitin from Oracle back in 2014 and fixed also in MySQL 5.6.31 recently.
  • Bug #72230 - "Undesirable MySQL yum repo packaging dependencies". Yet another bug reported by Simon Mudd more than 2 years ago and verified by Umesh. it is fixed only in 5.7.13.
  • The last but not the least, Bug #71783, "mysqldump silently quits when it encounters an error.", was reported by Shane Bester in 2014, and was recently fixed in MySQL 5.7.13 and 5.6.31.            
To summarize, there are many bug fixes of all kinds in MySQL 5.7.13. Tomorrow I'll surely try to build it from sources on all VMs and platforms I have.

What I consider suspicious is the fact that aside from Oracle engineers, my former Percona colleagues (who had released their own versions of Percona Server and Xtrabackup based on MySQL 5.7) and few well known community members like Simon Mudd, there are very few bug reports from Community users for MySQL 5.7 that are fixed. Not sure what it means...

Saturday, April 30, 2016

Fun with Bugs #42 - Bugs Fixed in MySQL 5.7.12

MySQL 5.7.12 was released more than 2 weeks ago. New features introduced there in a form of "rapid plugins" are widely discussed, but I am more interested in bugs reported by MySQL Community users that are fixed there. Unfortunately I do not see MySQL Community Release Notes by Morgan (like this) for quite a some time, so I have to continue describing key bug fixes and name people who reported and verified bugs in my "Fun with Bugs" series.

As usual, let's start with InnoDB bugs fixed:
  • Bug #80070 - "allocated_size and file_size differ if create general tablespace outside datadir". It was reported by my former colleague from Percona Shahriyar Rzayev and verified by Bogdan Kecman. Nice to see more people from Oracle involved in processing community bug reports!
  • Bug #79185 - "Innodb freeze running REPLACE statements". This bug (that affected many users, also on versions 5.5.x and 5.6.x, and was a kind of a regression) was reported by Will Bryant and verified (probably) and fixed by Shaohua Wang. The fix is also included into versions 5.5.49 and 5.6.30.
  • Bug #73816 - ''MySQL instance stalling “doing SYNC index”". It was reported by Denis Jedig and a lot of additional evidence was provided by my former colleague Aurimas Mikalauskas. This bug was fixed (and probably verified) by Shaohua Wang.
  • Bug #79200 - "InnoDB: "data directory" option of create table fails with pwrite() OS error 22", is a widely noted regression (I've seen customer issue with a potentially related MariaDB problem this week). This bug was reported by Frank Ullrich and verified by Bogdan Kecman. It is also fixed in MySQL 5.6.30.
  • Bug #79725 - "Check algorithm=innodb on crc32 checksum mismatch before crc32(big endian)". This bug was created to track the patch contributed by Daniel Black at GitHub. It was verified by Umesh.
Next, let's review replication bugs fixed in 5.7.12:
  • Bug #79504 - "STOP SLAVE IO THREAD prints wrong LOST CONNECTION message in error log file". It was reported by Venkatesh Duggirala.
  • Bug #78722 - "Relay log info currently_executing_gtid is not properly initialized or protected". This bug was reported by Pedro Gomes. It contains a nice simple test case and fix suggested.
  • Bug #78445 is private. So, I can only quote the release notes:
    "RESET SLAVE ALL could delete a channel even when master_pos_wait and wait_until_sql_thread_after_gtid were still waiting for binlog to be applied. This could cause a MySQL server exit when the functions tried to access the channel that was deleted. Now, a channel reference counter was added that is increased if the channel should not be deleted when the functions are running. RESET SLAVE ALL will wait for no reference, and then it will delete the channel."
    I am not sure this crash is a "security" bug of any kind, but what do I know...
  • Bug #78352 - "Slow startup of 5.7.x slave with relay_log_recovery = ON and many relay logs". I reported it based on regression comparing to 5.6.x reported by a customer of Percona, and verified by Umesh. Nice to see it fixed, as it was really annoying for almost anyone who upgraded production replication setup to 5.7.
  • Bug #78133 - "Slave_worker::write_info() incorrect DBUG_ENTER (contribution)". This bug was created to track the patch contributed by Stewart Smith at GitHub. It was verified by Umesh.
  • Bug #77740 - "silent failure to start if mysql.gtids_executed gets HA_ERR_LOCK_WAIT_TIMEOUT ". It was reported and verified by Shane Bester.
  • Bug #77237 - "Multi-threaded slave log spamming on failure". This bug was reported by Davi Arnaut and verified by Umesh. Fix is also included in MySQL 5.6.30.
  • Bug #78963 - "super_read_only aborts STOP SLAVE if relay_log_info_repository=TABLE, dbg crash". It was reported by my former colleague in Percona Laurynas Biveinis and verified by Umesh. Check also related Bug #79328 - "super_read_only broken as a server option".
  • Bug #77684 - "DROP TABLE IF EXISTS may brake replication if slave has replication filters". This bug was reported by my former colleague in Percona Fernando Laudares Camargos for MySQL 5.6.x and verified by Umesh. MySQL 5.6.30 also got this fixed.
We all remember that Performance Schema is perfect and the next greatest thing after sliced bread, but sometimes bugs are noted even there. Check Bug #79934 - "i_perfschema.table_leak random result failure" reported and verified by Magnus Blåudd. Another example is Bug #79784 - "update setup_instruments do not affect the global mutex/rwlock/cond" reported by Zhang Yingqiang and verified by Umesh. The later, IMHO, is related to or a super set of my good old report, Bug #68097 - "Manual does not explain that some P_S instruments must be enabled at startup" that remains open as a feature request (after some changes in the manual) for more than 3 years already. I truly hope 5.7.12 fixed this for a general case - it's truly important to be able to enable instruments dynamically if we expect Performance Schema to be used as a main tool for troubleshooting.

I'd also want to highlight a couple of fixes related to optimizer:
  • Bug #77209 - "Update may use index merge without any reason (increasing chances for deadlock)". It was reported and verified by my former colleagues from Oracle, Andrii Nikitin. MySQL 5.6.30 also includes the fix.
  • Bug #72858 - "EXPLAIN .. SELECT .. FOR UPDATE takes locks". This bug was reported by my former colleague in Percona (and, I hope, my colleague again soon) Justin Swanhart, who has a birthday today. Happy Birthday to you, Justin! The bug was verified by Umesh and is also fixed in MySQL 5.6.30. Justin had reported another bug fixed in 5.7.12, Bug #69375 - "LOAD DATA INFILE claims to be holding 'System Lock' in processlist".
Several more bugs reported by community were also fixed, but they were in the areas (or for platforms) I am not particularly interested in.

To summarize, MySQL 5.7.12 contains important bug fixes in replication and InnoDB and it makes sense to consider upgrade even if you do not care about any "rapid plugins", X protocol, encryption of data at rest, MySQL Keyring and other "cool" new shiny features.

Wednesday, April 27, 2016

Building MaxScale 1.4.2 from GitHub on Fedora 23

MariaDB MaxScale is mentioned in many blog posts recently. It's Application of the Year 2016 after all! I'd like to test it, follow posts like this etc, all that on my favorite and readily available testing platforms that are now Ubuntu of all kinds and, surely, Fedora 23 (on my wife's workstation, the most powerful hardware at hand).

My old habits force me to build open source software I test from source, and I do not want to even discuss the topic of "MaxScale binaries availability" that was quite "popular" some time ago. So, after building MaxScale 1.4.1 on CentOS 6.7 back on March 31, 2016 (mostly just following MariaDB KB article on the topic) using libmysqld.a from MariaDB 10.0.23, this morning I decided to check new branch, 1.4.2, and build it on Fedora 23, following that same KB article (that unfortunately does not even mention Fedora after the fix to MXS-248). Thing is, Fedora is not officially supported as a platform for MaxScale 1.4.x, but why should we, those who can build things from source for testing purposes, care about this?

I started with cloning MaxScale:

git clone https://github.com/mariadb-corporation/MaxScale.git
cd MaxScale
and then:

[openxs@fc23 MaxScale]$ git branch -r
...
  origin/HEAD -> origin/develop
...
  origin/release-1.4.2
...
I remember spending enough time fighting with develop branch while building on CentOS 6.7, mostly with sqlite-related things it contained, so this time I proceed immediately to the branch I want to build:

[openxs@fc23 MaxScale]$ git checkout release-1.4.2
Branch release-1.4.2 set up to track remote branch release-1.4.2 from origin.
Switched to a new branch 'release-1.4.2'
[openxs@fc23 MaxScale]$ git branch
  develop
* release-1.4.2

[openxs@fc23 MaxScale]$ mkdir build
[openxs@fc23 MaxScale]$ cd build

Last two steps originate from the KB article. We are almost ready for building, but what about the prerequisites? I've collected all the packages required for CentOS in that article and tried to install them all:
[openxs@fc23 build]$ sudo yum install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build[sudo] password for openxs:
Yum command has been deprecated, redirecting to '/usr/bin/dnf install mariadb-devel mariadb-embedded-devel libedit-devel gcc gcc-c++ ncurses-devel bison flex glibc-devel cmake libgcc perl make libtool openssl-devel libaio libaio-devel librabbitmq-devel libcurl-devel pcre-devel rpm-build'.
See 'man dnf' and 'man yum2dnf' for more information.
To transfer transaction metadata from yum to DNF, run:
'dnf install python-dnf-plugins-extras-migrate && dnf-2 migrate'

Last metadata expiration check: 0:26:04 ago on Wed Apr 27 10:43:24 2016.
Package gcc-5.3.1-6.fc23.x86_64 is already installed, skipping.
...
Package pcre-devel-8.38-7.fc23.x86_64 is already installed, skipping.
Dependencies resolved.
================================================================================
 Package                  Arch     Version                      Repository
                                                                           Size
================================================================================
Installing:
 autoconf                 noarch   2.69-21.fc23                 fedora    709 k
 automake                 noarch   1.15-4.fc23                  fedora    695 k
 dwz                      x86_64   0.12-1.fc23                  fedora    106 k
 flex                     x86_64   2.5.39-2.fc23                fedora    328 k
 ghc-srpm-macros          noarch   1.4.2-2.fc23                 fedora    8.2 k
 gnat-srpm-macros         noarch   2-1.fc23                     fedora    8.4 k
 go-srpm-macros           noarch   2-3.fc23                     fedora    8.0 k
 libcurl-devel            x86_64   7.43.0-6.fc23                updates   590 k
 libedit-devel            x86_64   3.1-13.20150325cvs.fc23      fedora     34 k
 librabbitmq              x86_64   0.8.0-1.fc23                 updates    43 k
 librabbitmq-devel        x86_64   0.8.0-1.fc23                 updates    52 k
 libtool                  x86_64   2.4.6-8.fc23                 updates   707 k
 mariadb-common           x86_64   1:10.0.23-1.fc23             updates    74 k
 mariadb-config           x86_64   1:10.0.23-1.fc23             updates    25 k
 mariadb-devel            x86_64   1:10.0.23-1.fc23             updates   869 k
 mariadb-embedded         x86_64   1:10.0.23-1.fc23             updates   4.0 M
 mariadb-embedded-devel   x86_64   1:10.0.23-1.fc23             updates   8.3 M
 mariadb-errmsg           x86_64   1:10.0.23-1.fc23             updates   199 k
 mariadb-libs             x86_64   1:10.0.23-1.fc23             updates   637 k
 ocaml-srpm-macros        noarch   2-3.fc23                     fedora    8.1 k
 patch                    x86_64   2.7.5-2.fc23                 fedora    123 k
 perl-Thread-Queue        noarch   3.07-1.fc23                  updates    22 k
 perl-generators          noarch   1.06-1.fc23                  updates    15 k
 perl-srpm-macros         noarch   1-17.fc23                    fedora    9.7 k
 python-srpm-macros       noarch   3-7.fc23                     updates   8.1 k
 redhat-rpm-config        noarch   36-1.fc23.1                  updates    59 k
 rpm-build                x86_64   4.13.0-0.rc1.13.fc23         updates   137 k

Transaction Summary
================================================================================
Install  27 Packages

Total download size: 18 M
Installed size: 64 M
Is this ok [y/N]: Y

...

Complete!
Now, let's try simple approach:

[openxs@fc23 build]$ cmake ..
...
-- MySQL version: 10.0.23
-- MySQL provider: MariaDB
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
MYSQL_EMBEDDED_LIBRARIES_STATIC
    linked by target "cmTC_2494a" in directory /home/openxs/git/MaxScale/build/CMakeFiles/CMakeTmp

CMake Error: Internal CMake error, TryCompile configure of cmake failed
-- Looking for pcre_stack_guard in MYSQL_EMBEDDED_LIBRARIES_STATIC-NOTFOUND - not found
-- PCRE libs: /usr/lib64/libpcre.so
-- PCRE include directory: /usr/include
-- Embedded mysqld does not have pcre_stack_guard, linking with system pcre.
CMake Error at cmake/FindMySQL.cmake:115 (message):
  Library not found: libmysqld.  If your install of MySQL is in a non-default
  location, please provide the location with -DMYSQL_EMBEDDED_LIBRARIES=<path
  to library>
Call Stack (most recent call first):
  CMakeLists.txt:37 (find_package)


-- Configuring incomplete, errors occurred!
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeOutput.log".
See also "/home/openxs/git/MaxScale/build/CMakeFiles/CMakeError.log".

Failure, cmake can not find libmysqld.a it seems. Let me try to find it:

[openxs@fc23 build]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
That's all, even though I installed all packages that looked as required based on the article! I have the library in many places (in my own builds and even in sandbox with MariaDB 10.1.12), but it's not installed where expected. Some more desperate tries (installing MariaDB server with sudo yum install mariadb-server, searches for package that provides libmysqld.a etc), chat with engineers of MariaDB and I've ended up with the fact that my packages are from Fedora (not MariaDB) and they just do not include the static library. Looks like a bug in Fedora packaging, if you ask me.

I was not ready to add MariaDB's repository at the moment (to get MariaDB-devel etc, something KB article also suggests for supported platforms), so I decided that it would be fair just to build current MariaDB 10.1.13 from source and use everything needed from there. Last time I built 10.2 branch, so I had to check out 10.1 first:
[openxs@fc23 server]$ git checkout 10.1
Switched to branch '10.1'
Your branch is behind 'origin/10.1' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
[openxs@fc23 server]$ git pull
Updating 1cf852d..071ae30
Fast-forward
 client/mysqlbinlog.cc                    | 523 ++++++++++++++++++++++---------
 mysql-test/r/mysqlbinlog_raw_mode.result | 274 ++++++++++++++++
 mysql-test/t/mysqlbinlog_raw_mode.test   | 387 +++++++++++++++++++++++
 sql/sql_priv.h                           |   3 +-
 storage/innobase/dict/dict0boot.cc       |  20 +-
 storage/xtradb/dict/dict0boot.cc         |  20 +-
 6 files changed, 1062 insertions(+), 165 deletions(-)
 create mode 100644 mysql-test/r/mysqlbinlog_raw_mode.result
 create mode 100644 mysql-test/t/mysqlbinlog_raw_mode.test
 Then I've executed the following while in server directory:

make clean
rm CMakeCache.txt
cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DWITH_EMBEDDED_SERVER=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1

make

make install && make clean
Note that I've explicitly asked to build embedded server. I checked that the library is in the location I need:

[openxs@fc23 server]$ sudo find / -name libmysqld.a 2>/dev/null
/home/openxs/git/percona-xtrabackup/libmysqld/libmysqld.a
/home/openxs/dbs/maria10.1/lib/libmysqld.a
/home/openxs/dbs/5.7/lib/libmysqld.a
/home/openxs/dbs/p5.6/lib/libmysqld.a
/home/openxs/dbs/fb56/lib/libmysqld.a
/home/openxs/10.1.12/lib/libmysqld.a
Then I moved back to MaxScale/build directory and explicitly pointed out the location of headers, library and messages that I want to use with MaxScale:

[openxs@fc23 build]$ cmake .. -DMYSQL_EMBEDDED_INCLUDE_DIR=/home/openxs/dbs/maria10.1/include/mysql -DMYSQL_EMBEDDED_LIBRARIES=/home/openxs/dbs/maria10.1/lib/libmysqld.a -DERRMSG=/home/openxs/dbs/maria10.1/share/english/errmsg.sys -DCMAKE_INSTALL_PREFIX=/home/openxs/maxscale -DWITH_MAXSCALE_CNF=N
...
-- Build files have been written to: /home/openxs/git/MaxScale/build

[openxs@fc23 build]$ make
...
[ 95%] [BISON][ruleparser] Building parser with bison 3.0.4
ruleparser.y:34.1-13: warning: deprecated directive, use Б-?%name-prefixБ-? [-Wdeprecated]
 %name-prefix="dbfw_yy"
 ^^^^^^^^^^^^^
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/ruleparser.c.o
[ 96%] Building C object server/modules/filter/dbfwfilter/CMakeFiles/dbfwfilter.dir/token.c.o
[ 97%] Linking C shared library libdbfwfilter.so
[ 97%] Built target dbfwfilter
Scanning dependencies of target maxadmin
[ 98%] Building C object client/CMakeFiles/maxadmin.dir/maxadmin.c.o
[ 98%] Linking C executable maxadmin
[100%] Built target maxadmin

It seems build completed without problems this time. We can try to test it (some tests do fail):







[openxs@fc23 build]$ make testcore
...
 1/22 Test  #1: Internal-TestQueryClassifier .....***Exception: Other  0.35 sec
      Start  2: Internal-CanonicalQuery
 2/22 Test  #2: Internal-CanonicalQuery ..........***Failed    0.25 sec
      Start  3: Internal-CanonicalQuerySelect
 3/22 Test  #3: Internal-CanonicalQuerySelect ....***Failed    0.04 sec
      Start  4: Internal-CanonicalQueryAlter
 4/22 Test  #4: Internal-CanonicalQueryAlter .....***Failed    0.04 sec
      Start  5: Internal-CanonicalQueryComment
 5/22 Test  #5: Internal-CanonicalQueryComment ...***Failed    0.04 sec
      Start  6: Internal-TestAdminUsers
 6/22 Test  #6: Internal-TestAdminUsers ..........   Passed    0.44 sec
      Start  7: Internal-TestBuffer
 7/22 Test  #7: Internal-TestBuffer ..............   Passed    0.01 sec
      Start  8: Internal-TestDCB
 8/22 Test  #8: Internal-TestDCB .................   Passed    0.01 sec
      Start  9: Internal-TestFilter
 9/22 Test  #9: Internal-TestFilter ..............   Passed    0.03 sec
...
(As a side note, make install in my case had NOT installed anything to /home/openxs/maxscale, something to deal with later, as on CentOS 6.7 it worked...)


In any case, I now have binaries to work with, of version 1.4.2:

[openxs@fc23 build]$ ls bin/
maxadmin  maxbinlogcheck  maxkeys  maxpasswd  maxscale
[openxs@fc23 build]$ bin/maxscale --version


MariaDB Corporation MaxScale 1.4.2      Wed Apr 27 13:24:01 2016
------------------------------------------------------
MaxScale 1.4.2

[openxs@fc23 build]$ bin/maxadmin --version
bin/maxadmin Version 1.4.2
To be continued one day... Stay tuned!




Sunday, April 24, 2016

Building MariaDB 10.1.x and Galera from Source for Multiple Node Cluster Testing Setup

My Facebook followers probably noted that I quit from Percona some time ago and work for MariaDB since March 1, 2016. I changed the company, but neither the job role (I am still a Support Engineer), nor the approach to do my job. I still prefer to test everything I suggest to customers and I usually use software I build from source myself for these tests.

While I try to avoid all kinds of clusters as much as possible for 15 years or so already (it does not matter if it's Oracle RAC, MySQL Cluster or Percona XtraDB Cluster, all of them), it's really hard to avoid Galera clusters while working for MariaDB. One of the reasons for this is that Galera, starting from MariaDB 10.1, can be easily "enabled"/used with any MariaDB 10.1.x instance, any time (at least when we speak about official binaries or those properly built - they are all "Galera ready"). Most of MariaDB customers do use Galera or can try to use it any time, so I have to be ready to test something Galera-specific any moment.

For simple cases I decided to use a setup with several (2 to begin with) cluster nodes on one box. This approach is described in the manual for Percona XtraDB Cluster and was also used by my former colleague Fernando Laudares for his blog post and many real life related tests.

So, I decided to proceed with the mix of ideas from the sources above and MariaDB's KB article on building Galera from source. As I decided to do this on my wife's Fedora 23 workstation, I checked this KB article for some details also. It lists prerequisites (boost-devel check-devel glibc-devel openssl-devel scons) and some of these packages (like scons in one of my cases) could be missing even on a system previosly used for builds for all kinds of MySQL related software. You can find something missing and fix the problem at later stage, but reading and following the manual or KB articles may help to save some time otherwise spent on trial and error.

I've started with making directories in my home directory (/home/openxs) for this Galera related testing setup, like these:
[openxs@fc23 ~]$ mkdir galera
[openxs@fc23 ~]$ cd galera
[openxs@fc23 galera]$ mkdir node1[openxs@fc23 galera]$ mkdir node2
[openxs@fc23 galera]$ mkdir node3
[openxs@fc23 galera]$ ls
node1  node2  node3
I plan to use 3 nodes one day, but for this blog post I'll set up only 2, to have the smallest possible and simplest cluster as a proof of concept.

Then I proceeded with cloning Galera from Codership's GitHub (this is supposed to be the latest and greatest). I changed current directory to my usual git repository and executed git clone https://github.com/codership/galera.git. When this command completed I've got a subdirectory named galera.

In that directory, assuming that all prerequisites are installed, to build current Galera library version it's enough to execute simple script while in galera directory, ./scripts/build.sh. I ended up with the following:
[openxs@fc23 galera]$ ls -l libgalera_smm.so
-rwxrwxr-x. 1 openxs openxs 40204824 Mar 31 12:21 libgalera_smm.so
[openxs@fc23 galera]$ file libgalera_smm.so
libgalera_smm.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=11457fa9fd69dabe617708c0dd288b218255a886, not stripped

[openxs@fc23 galera]$ pwd
/home/openxs/git/galera
[openxs@fc23 galera]$ cp libgalera_smm.so ~/galera/
and copied the library to the target directory for my testing setup (that should NOT conflict with whatever software I may have installed later from packages).

Now, time to build MariaDB properly to let it use Galera if needed. I already had recent (at the moment) 10.1.13 in the server subdirectory of my git repository. I've executed the following commands then:

[openxs@fc23 server]$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SSL=system -DWITH_ZLIB=bundled -DMYSQL_MAINTAINER_MODE=0 -DENABLED_LOCAL_INFILE=1 -DWITH_JEMALLOC=system -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DCMAKE_INSTALL_PREFIX=/home/openxs/dbs/maria10.1
-- Running cmake version 3.4.1
-- MariaDB 10.1.13
...

[openxs@fc23 server]$ time make -j 4...
real    9m28.164s
user    32m43.960s
sys     2m45.637s
This was my usual command line to build MariaDB 10.x with only 2 extra options added: -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON.After make completed, I've executed make install && make clean and was ready to use my shiny new Galera-ready MariaDB 10.1.13.

To take into account the directories I am going to use for my cluster nodes and make sure they can start and communicate as separate mysqld instances, I have to create configuration files for them. I've changed working directory to /home/openxs/dbs/mariadb10.1 and started with this configuration file for the first node:

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode1.cnf                    
[mysqld]
datadir=/home/openxs/galera/node1
port=3306
socket=/tmp/mysql-node1.sock
pid-file=/tmp/mysql-node1.pid
log-error=/tmp/mysql-node1.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!
wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node1
# wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no
It's one of the shortest possible. I had to specify unique datadir, error log location, pid file, port and socket for the instance, set binlog format and point out Galera library location, set cluster name and node name. With proper planning I was able to specify wsrep_cluster_address referring to all other nodes properly, but for initial setup of the first node I can have it "empty" as commented out in the above, so that we start as a new cluster node. There is one essential setting for MariaDB 10.1.x that is not needed for "cluster-specific" instances like Percona XtraDB Cluster or older 10.0.x Galera packages from MariaDB (where it's ON by default). This is wsrep_on=ON. Without it MariaDB works as normal, non-cluster instance and ignores anything cluster-related. You can save a lot of time in case of upgrade to 10.1.x if you put it in your configuration file explicitly right now, no matter what the version is used.

Then I copied and modified configuration file for the second node:
[openxs@fc23 maria10.1]$ cp /home/openxs/galera/mynode1.cnf /home/openxs/galera/mynode2.cnf
[openxs@fc23 maria10.1]$ vi /home/openxs/galera/mynode2.cnf                     

[openxs@fc23 maria10.1]$ cat /home/openxs/galera/mynode2.cnf                   
[mysqld]
datadir=/home/openxs/galera/node2
port=3307
socket=/tmp/mysql-node2.sock
pid-file=/tmp/mysql-node2.pid
log-error=/tmp/mysql-node2.err
binlog_format=ROW
innodb_autoinc_lock_mode=2

wsrep_on=ON # this is important for 10.1!wsrep_provider=/home/openxs/galera/libgalera_smm.so
wsrep_cluster_name = singlebox
wsrep_node_name = node2
wsrep_cluster_address=gcomm://127.0.0.1:4567,127.0.0.1:5020

?pc.wait_prim=nowsrep_provider_options = "base_port=5020;"
Note that while Galera node uses 4 ports, I specified only 2 unique ones explicitly, port for MySQL clients and base port for all Galera-related communication like IST and SST, with base_port setting. Note also how I referred to all cluster nodes with wsrep_cluster_address - this same value can be used for the configuration file of the first node actually. We can just start it as the first node of a new cluster (see below).

Now we have configuration files for 2 nodes ready (we can always add node3 later in the same way). But before starting new cluster we have to install system databases. For node1 it was performed in the following way:
[openxs@fc23 maria10.1]$ scripts/mysql_install_db --defaults-file=/home/openxs/galera/mynode1.cnf
Installing MariaDB/MySQL system tables in '/home/openxs/galera/node1' ...
2016-03-31 12:51:34 139766046820480 [Note] ./bin/mysqld (mysqld 10.1.13-MariaDB) starting as process 28297 ...
...

[openxs@fc23 maria10.1]$ ls -l /home/openxs/galera/node1
-rw-rw----. 1 openxs openxs    16384 Mar 31 12:51 aria_log.00000001
-rw-rw----. 1 openxs openxs       52 Mar 31 12:51 aria_log_control
-rw-rw----. 1 openxs openxs 12582912 Mar 31 12:51 ibdata1
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile0
-rw-rw----. 1 openxs openxs 50331648 Mar 31 12:51 ib_logfile1
drwx------. 2 openxs openxs     4096 Mar 31 12:51 mysql
drwx------. 2 openxs openxs     4096 Mar 31 12:51 performance_schema
drwx------. 2 openxs openxs     4096 Mar 31 12:51 test
Then I started node1 as a new cluster:
[openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode1.cnf --wsrep-new-cluster &
and created a table, t1, with some data in it. After that I repeated installation of system tables etc for node2, just referencing proper configuration file, and started node2 that was supposed to join the cluster:
openxs@fc23 maria10.1]$ bin/mysqld_safe --defaults-file=/home/openxs/galera/mynode2.cnf &
Let's check if we do have both instances running and communicating in Galera cluster:
[openxs@fc23 maria10.1]$ tail /tmp/mysql-node2.err                             
2016-03-31 13:40:29 139627414767744 [Note] WSREP: Signalling provider to continue.
2016-03-31 13:40:29 139627414767744 [Note] WSREP: SST received: c91d17b6-f72b-11e5-95de-96e95167f593:0
2016-03-31 13:40:29 139627117668096 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Member 1.0 (node2) synced with group.
2016-03-31 13:40:29 139627117668096 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:29 139627414452992 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:29 139627414452992 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:29 139627414767744 [Note] /home/openxs/dbs/maria10.1/bin/mysqld: ready for connections.
Version: '10.1.13-MariaDB'  socket: '/tmp/mysql-node2.sock'  port: 3307  Source distribution

[openxs@fc23 maria10.1]$ tail /tmp/mysql-node1.err
2016-03-31 13:40:27 140071390934784 [Note] WSREP: Provider resumed.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: 0.0 (node1): State transfer to 1.0 (node2) complete.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 0)
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Member 0.0 (node1) synced with group.
2016-03-31 13:40:27 140072133322496 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2016-03-31 13:40:27 140072429247232 [Note] WSREP: Synchronized with group, ready for connections
2016-03-31 13:40:27 140072429247232 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-31 13:40:27 140072141715200 [Note] WSREP: (c91c99ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-31 13:40:29 140072133322496 [Note] WSREP: 1.0 (node2): State transfer from 0.0 (node1) complete.
2016-03-31 13:40:29 140072133322496 [Note] WSREP: Member 1.0 (node2) synced with group.
Familiar messages (unfortunately...) that prove we had a second node joined and performed state transfer from the first one. Now it's time to connect and test how cluster works. This is what I had after node1 started and table with some data created there, but before node2 started:
[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node1.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show variables like 'wsrep_cluster%';
+-----------------------+-------------------------------------------------------+
| Variable_name         | Value                                                 |
+-----------------------+-------------------------------------------------------+
| wsrep_cluster_address | gcomm://127.0.0.1:4567,127.0.0.1:5020?pc.wait_prim=no |
| wsrep_cluster_name    | singlebox                                             |
+-----------------------+-------------------------------------------------------+
2 rows in set (0.00 sec)

MariaDB [(none)]> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 1                                    |
| wsrep_cluster_size       | 1                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.00 sec)

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
Then, when node2 joined the cluster, I checked that the data we've added on node1 are there:

[openxs@fc23 maria10.1]$ bin/mysql -uroot --socket=/tmp/mysql-node2.sock
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.13-MariaDB Source distribution

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [test]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)

MariaDB [test]> show status like 'wsrep_cluster%'; +--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id    | 2                                    |
| wsrep_cluster_size       | 2                                    |
| wsrep_cluster_state_uuid | c91d17b6-f72b-11e5-95de-96e95167f593 |
| wsrep_cluster_status     | Primary                              |
+--------------------------+--------------------------------------+
4 rows in set (0.01 sec)
So, the first basic test with the Galera cluster of 2 nodes (both running on the same box) built from current source of Galera and MariaDB 10.1.x on Fedora 23 is completed successfully. I plan to play with it more in the future, use current xtrabackup built from source for SST and so on, and create blog posts about these steps and any interesting tests in this setup. Stay tuned.

From the dates above you can conclude that it took me 3 weeks to publish this post. That's because I was busy with the company meeting in Berlin and some usual Support work, and was not sure is it really a good idea for me to write any post with "Galera" or "MariaDB" words used in it even once...



Monday, February 29, 2016

Exploring Metadata Locks with gdb - Studying Simple Case on Percona Server 5.7

I had forgotten the topic of studying metadata locks with gdb for quite a some time. Now that I've upgraded to Percona Server 5.7 and enjoy the remaining of my last free day as an independent MySQL Support Engineer doing nothing but some blogging, I think it's time to get back to gdb interactive sessions and try to answer the question I've got from customer back in Percona but had no chance to answer.

In a bit rewritten way, it sounded as follows:
"Do ALTER TABLE ... STATS_AUTO_RECALC=1" and the reverse ALTER
TABLE...STATS_AUTO_RECALC=default block select, update, delete, and insert on the table?
"
There are two ways to answer this question. First, we can just try to set up a simple test with 3 sessions and simple table, like this:
mysql> create table ti(id int primary key, c1 int, key(c1)) engine=InnoDB stats_auto_recalc=default;Query OK, 0 rows affected (0.16 sec)

mysql> insert into ti values (1,1), (2,2);
Query OK, 2 rows affected (0.09 sec)
Records: 2  Duplicates: 0  Warnings: 0
Now, in one session we can SELECT from the table in transaction that is not completed:
mysql> start transaction;
Query OK, 0 rows affected (0.01 sec)

mysql> select * from ti;
+----+------+
| id | c1   |
+----+------+
|  1 |    1 |
|  2 |    2 |
+----+------+
2 rows in set (0.00 sec)
and then in another session run the ALTER TABLE in question:
mysql> alter table ti stats_auto_recalc=1;
to find out in the first session that it actually hangs waiting on some metadata lock:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
|  3 | root | localhost | test | Query   |   15 | Waiting for table metadata lock | alter table ti stats_auto_recalc=1 |         0 |             0 ||  4 | root | localhost | test | Query   |    0 | starting                        | show processlist                   |         0 |             0 |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
2 rows in set (0.00 sec)
So, one of metadata locks set by ALTER TABLE even in this very simple case (when we try to modify the way of statistics recalculation for InnoDB table) even in 5.7 is too strict, so that it conflicts with one of metadata locks set by simple SELECT for the duration of the entire transaction. What's more interesting, now in third session we can try to run whatever statements we want to check (those that were "compatible" with SELECT at metadata locks level) and see them also blocked, like this:
mysql> show processlist;
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
| Id | User | Host      | db   | Command | Time | State                           | Info                               | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
|  3 | root | localhost | test | Query   |   52 | Waiting for table metadata lock | alter table ti stats_auto_recalc=1 |         0 |             0 |
|  4 | root | localhost | test | Query   |    0 | starting                        | show processlist                   |         0 |             0 |
|  5 | root | localhost | test | Query   |    4 | Waiting for table metadata lock | select * from ti                   |         0 |             0 |+----+------+-----------+------+---------+------+---------------------------------+------------------------------------+-----------+---------------+
3 rows in set (0.00 sec)
This way we can already conclude that ALTER TABLE of this kind does block even SELECT from the table! Good that it runs fast (there is no need to make a table copy), so will not block for a long time, but if ALTER itself is blocked by some transaction (like in my case) - we are still in trouble!

That simple test would give the answer to customer that he could be satisfied with. But I know that ALTER TABLE sets many metadata locks at different stages and while for this trivial one it may not matter much, I still wanted to find some time to attach gdb and see what locks are requested and when/in what order.

I had not done this with 5.7, so tried the same approach as in the initial post in this series (please, check it carefully for the details from the source code, like types of the locks etc), setting breakpoint on MDL_request::init, but it seems this method does NOT exist in MySQL 5.7. So, I ended up with the following breakpoints set:
(gdb) b MDL_context::acquire_lockBreakpoint 1 at 0xbf9083: file /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc, line 3506.
(gdb) b lock_rec_lock
Breakpoint 2 at 0xf24ffd: file /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/storage/innobase/lock/lock0lock.cc, line 2343.
(gdb) c
Continuing.
With these breakpoints in place, I've executed alter table ti stats_auto_recalc=1; and ended up with the following in gdb:
Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1b2df95b0, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);
(gdb) p mdl_request
$1 = (MDL_request *) 0x7fc1b2df95b0
(gdb) p *mdl_request
$2 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_STATEMENT,
  next_in_list = 0x7fc1e27f7f70, prev_in_list = 0x7fc1b2df9778, ticket = 0x0,
  key = {m_length = 3, m_db_name_length = 0,
    m_ptr = "\000\000\000\000L\274q\324\301...
We see something not that much useful immediately, some metadata lock request at the weakest possible level for the duration of the statement. Just to see how we can see all of requests in the same "batch", let's do the following in gdb:
(gdb) p *(mdl_request->next_in_list)
$5 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0x7fc1e27f7d70, prev_in_list = 0x7fc1b2df95b8, ticket = 0x0,
  key = {m_length = 7, m_db_name_length = 4,
    m_ptr = "\002test", '\000' <repeats 175 times>"\377, \377\377\037\000\000\000\000\001", '\000' <repeats 15 times>, "\001", '\000' <repeats 31 times>, "\a", '\000' <repeats 15 times>, "\004\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\001\001\000\001", '\000' <repeats 44 times>, "\001", '\000' <repeats 69 times>, static m_namespace_to_wait_state_name = {{m_key = 111,
...
So, the next one is also the weakest, but is set for the duration of transaction at the database (named "test") level. We can see the next after it as well:
(gdb) p *(mdl_request->next_in_list->next_in_list)
$6 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.
Here we can see a stronger metadata lock request for the duration of transaction on the test.ti table. The next_in_list pointer is NULL, so we are done with this "batch" of requests. If we continue, we'll see the same locks requested one by one while hitting breakpoint:
Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1e27f7f70, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);

(gdb) p *mdl_request
$7 = {type = MDL_INTENTION_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0x7fc1e27f7d70, prev_in_list = 0x7fc1b2df95b8, ticket = 0x0,
  key = {m_length = 7, m_db_name_length = 4,
    m_ptr = "\002test", '\000' <repeats 175 times>"\377, \377\377\037\000\000\000\000\001", '\000' <repeats 15 times>, "\001", '\000' <repeats 31 times>, "\a",
...

(gdb) p *(mdl_request->next_in_list)
$8 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.

Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1e27f7d70, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);

(gdb) p *mdl_request
$9 = {type = MDL_SHARED_UPGRADABLE, duration = MDL_TRANSACTION,
  next_in_list = 0x0, prev_in_list = 0x7fc1e27f7f78, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti", '\000' <repeats 378 times>,
    static m_namespace_to_wait_state_name = {{m_key = 111,
...

(gdb) c
Continuing.
So, we have two ways to study the locks and new one in this post is to check the list of the lock requests in a single mdl_request list of pointers.

Eventually we also get the following breakpoint hit:
(gdb) c
Continuing.

Breakpoint 1, MDL_context::acquire_lock (this=0x7fc1e27ec0c8,
    mdl_request=0x7fc1b2df98a0, lock_wait_timeout=31536000)
    at /usr/src/debug/percona-server-5.7.10-3/percona-server-5.7.10-3/sql/mdl.cc:3506
3506      set_timespec(&abs_timeout, lock_wait_timeout);
(gdb) p *mdl_request
$15 = {type = MDL_EXCLUSIVE, duration = MDL_TRANSACTION,
  next_in_list = 0xd044f7, prev_in_list = 0x7fc1b2dfa914, ticket = 0x0, key = {
    m_length = 9, m_db_name_length = 4,
    m_ptr = "\003test\000ti\000_3\000\001\000\000\000\003\000\000\000<\240V\001\
...
Now, this is a real problem for any concurrent activity with the table, it's the strongest MDL_EXCLUSIVE lock requested on test.ti table that is hold (when obtained) till the end of ALTER statement...

To summarize this post:
  1. We see that we still can use gdb to study metadata locks in MySQL (or Percona Server) 5.7.x, but the breakpoint should be set on MDL_context::acquire_lock (at least until I'll find time to study changes in the source code for any alternatives).
  2. When breakpoint is hit you can study the content of structure pointed by the mdl_request parameter to find out what kind of lock is requested.
  3. At table level ALTER TABLE ... STATS_AUTO_RECALC=1 in MySQL 5.7 starts with weak enough MDL_SHARED_UPGRADABLE lock, but then eventually we end up with the request for the strongest MDL_EXCLUSIVE lock that is either blocked by any concurrent access to the table until transaction there is completed, and/or blocks everything (including SELECT from the table) until ALTER finally completes.
Next time in this series I plan to check what happens with metadata locks for partitioned tables and how much maintenance of individual partitions may affect other partitions. Other idea would be to check what metadata locks are set when statistics for the InnoDB table is recalculated automatically when 10% of data changes. Stay tuned!