<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>FewBar.com - Make it good &#187; MySQL</title>
	<atom:link href="http://fewbar.com/category/tech/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>http://fewbar.com</link>
	<description>Technology, life, and mischief, not in that order</description>
	<lastBuildDate>Fri, 23 Dec 2011 01:41:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>But will it scale? &#8211; Taking Limesurvey horizontal with juju&#8230;</title>
		<link>http://fewbar.com/2011/12/but-will-it-scale-juju/</link>
		<comments>http://fewbar.com/2011/12/but-will-it-scale-juju/#comments</comments>
		<pubDate>Fri, 23 Dec 2011 01:41:49 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Juju]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[charms]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[juju]]></category>
		<category><![CDATA[lamp]]></category>
		<category><![CDATA[limesurvey]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=486</guid>
		<description><![CDATA[One of the really cool things about using the cloud, and especially juju, is that it instantly enables things that often times take a lot of thought to even try out in traditional environments. While I was developing some little PHP apps &#8220;back in the day&#8221;, I knew eventually they&#8217;d need to go to more [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-medium wp-image-487" style="border-style: initial; border-color: initial;" title="Will it Blend?" src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/12/will-it-blend-300x168.jpg" alt="" width="300" height="168" /></p>
<p>One of the really cool things about using the cloud, and especially <a href="https://juju.ubuntu.com/">juju</a>, is that it instantly enables things that often take a lot of thought to even try out in traditional environments. While I was developing some little PHP apps &#8220;back in the day&#8221;, I knew eventually they&#8217;d need to go to more than one server, but testing them for that meant, well, finding and configuring multiple servers. Even with VMs, I had to go allocate one and configure it. Oops, I&#8217;m out of time, throw it on one server, pray, move to the next task.</p>
<p>This left a very serious question in my mind: &#8220;When the time comes, will my app actually scale?&#8221;<br />
<span id="more-486"></span></p>
<p>Have I forgotten some huge piece to make sure it is stateless, or will it scale horizontally the way I intended it to? Things have changed though, and now we have the ability to start virtual machines via an API on several providers, and actually *test* whether our app will scale.</p>
<p>This brings us to our story. Recently, <a href="https://bugs.launchpad.net/charm/+bug/899849">Nick Barcet created a juju</a> <a href="https://juju.ubuntu.com/Charms">charm</a> for <a href="http://www.limesurvey.org/">Limesurvey</a>. This is a really cool little app that lets users create rich, multifaceted surveys and invite the internet to vote on things, answer questions, etc. This is your standard &#8220;LAMP&#8221; application, and it seems to be written in a way that will allow it to scale out.</p>
<p>However, when Nick submitted the charm for the official juju charms collection, I wanted to see if it actually would scale the way I knew LAMP apps should. So, I fired up juju on ec2, threw in some haproxy, and related it to my limesurvey service, and then started adding units. This is incredibly simple with juju:</p>
<pre>juju deploy --repository charms local:mysql
juju deploy --repository charms local:limesurvey
juju deploy --repository charms local:haproxy
juju add-relation mysql limesurvey
juju add-relation limesurvey haproxy
juju add-unit limesurvey
juju expose haproxy</pre>
<p>Lo and behold, it didn&#8217;t scale. <a href="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/12/633325462873135493.jpg"><img class="alignright size-medium wp-image-492" title="633325462873135493" src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/12/633325462873135493-300x225.jpg" alt="" width="300" height="225" /></a>There were a few issues with the default recommendations of limesurvey that Nick had encoded into the charm. These were simple things, like assuming that the local hostname would be the hostname people use to access the site.</p>
<p>Once that was solved, some other scaling problems were immediately revealed. First on the ticket was that Limesurvey, by default, uses MyISAM for its storage engine in MySQL. This is a huge mistake, and I can&#8217;t imagine why *anybody* would use MyISAM in a modern application. MyISAM uses table-level locking: readers can share a table, but a write generally takes an exclusive lock on the whole table, so whenever anything writes to any part of the table, all other reads and writes must wait for that to finish. InnoDB, shipped as standard with MySQL since 4.0 and the default storage engine in MySQL 5.5 and later, doesn&#8217;t suffer from this problem, as it implements an MVCC model and row-level locks to allow concurrent reads and writes.</p>
<p>The MyISAM locks caused request timeouts when I pointed siege at the load balancer, because too many requests were stacking up waiting for updates to complete before they could even read from the table. This is especially critical on something like the session storage that limesurvey does in the database, as it effectively meant that only one user could do anything at a time with the database.</p>
<p>Scalability testing in 10 minutes or less, with a server investment of about $1 US. Who knew it could be this easy? Granted, I stopped at three app server nodes, and we didn&#8217;t even get to scaling out the database (something limesurvey doesn&#8217;t really have native support for). But these are things that are already solved, and that have been encoded in charms already. Now we just have to suggest small app changes to allow users to take advantage of all those well-known best practices sitting in charms.</p>
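<p>Converting a table is typically a single <code>ALTER TABLE ... ENGINE=InnoDB</code> statement. As a minimal sketch (this is not what went into the charm), a few lines of Python can generate those statements for every table still on MyISAM. The function name and the table names below are made up for illustration; a real list would come from <code>information_schema.tables</code>.</p>
<pre>
def innodb_conversion_sql(tables):
    """Return one ALTER TABLE statement per (name, engine) pair still on MyISAM."""
    statements = []
    for name, engine in tables:
        if engine.lower() == "myisam":
            statements.append("ALTER TABLE `%s` ENGINE=InnoDB;" % name)
    return statements


# Hypothetical table list, for illustration only.
example_tables = [
    ("lime_sessions", "MyISAM"),
    ("lime_surveys", "MyISAM"),
    ("lime_users", "InnoDB"),
]

for statement in innodb_conversion_sql(example_tables):
    print(statement)
</pre>
<p>Tables that are already InnoDB are skipped, so the generated script is safe to run more than once.</p>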
<p><img class="aligncenter size-medium wp-image-495" style="border-style: initial; border-color: initial;" title="winning" src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/12/winning-262x300.png" alt="" width="262" height="300" /></p>
<p>(Check <a href="https://bugs.launchpad.net/charm/+bug/899849">the bug comments</a> for the results; I&#8217;d be interested if somebody wants to repeat the test.)</p>
<p>So, in a situation where one needs to deploy now, and scale later, I think juju will prove quite useful. It should be on anybody&#8217;s radar who wants to get off the ground quickly.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2011/12/but-will-it-scale-juju/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The 2011 O&#8217;Reilly Open Mysql Drizzle Maria Monty Percona Xtra Galera Xeround Tungsten Cloud Database Conference and Expo</title>
		<link>http://fewbar.com/2011/04/the-2011-oreilly-open-mysql-drizzle-maria-monty-percona-xtra-galera-xeround-tungsten-cloud-database-conference-and-expo/</link>
		<comments>http://fewbar.com/2011/04/the-2011-oreilly-open-mysql-drizzle-maria-monty-percona-xtra-galera-xeround-tungsten-cloud-database-conference-and-expo/#comments</comments>
		<pubDate>Wed, 27 Apr 2011 17:49:00 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Drizzle]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[drizzle]]></category>
		<category><![CDATA[mysqlconf]]></category>
		<category><![CDATA[percona]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=382</guid>
		<description><![CDATA[Or, for short, the &#8220;2011 O&#8217;Reilly MySQL Users Conference &#38; Expo&#8221;. Yes, that&#8217;s the short name of the conference that, thus far, has brought me nothing but good info, good times, and insight into one of the most interesting open source communities around. MySQL has been at the core of a real revolution in the [...]]]></description>
			<content:encoded><![CDATA[<p>Or, for short, the &#8220;2011 O&#8217;Reilly MySQL Users Conference &amp; Expo&#8221;. Yes, that&#8217;s the short name of the conference that, thus far, has brought me nothing but good info, good times, and insight into one of the most interesting open source communities around.</p>
<p>MySQL has been at the core of a real revolution in the way data-driven applications have exploded on the internet. It&#8217;s so easy to just install it, fire up php&#8217;s mysql driver, and boom, you&#8217;re saving and retrieving data. The *use* of MySQL has always been incredibly simple.</p>
<p>The politics has, at times, been confusing.<span id="more-382"></span> Dual licensing was sort of an odd concept when MySQL AB was doing it &#8220;back in the day&#8221;. Nobody really understood how it worked or how they could sell something that was also &#8220;free&#8221;. But it worked out great for them. InnoDB got bought by Oracle and a lot of people thought &#8220;oh noes MySQL will have no transactional storage, Oracle will kill it.&#8221; Well, we can see that&#8217;s about 180 degrees from what actually happened (R.I.P. Falcon).</p>
<p>So this year, with the oddness of Oracle not being the top sponsor at an event that had driven a lot of the innovation and collaboration in the MySQL world (ironically, choosing instead to spend their time and effort on a conference called &#8220;Collaborate&#8221;), I thought &#8220;wonderful, more politics&#8221;.</p>
<p>But as <a href="http://blog.krow.net/post/4753904254/mysql-state-of-the-ecosystem-2011">Brian Aker says in his &#8220;State of the ecosystem&#8221; post</a>, it was quite the opposite. The absence of the commercial entity responsible for MySQL took a lot of the purely business-focused discussion down to almost a whisper, while big ideas and big thinking seemed to be extremely prominent.</p>
<p><a href="http://en.oreilly.com/mysql2011/public/schedule/topic/563">Drizzle</a> had quite a few sessions, including my own about <a href="http://en.oreilly.com/mysql2011/public/schedule/detail/17640">what we&#8217;ve done with Drizzle in Ubuntu</a>. This is particularly interesting to me because Drizzle is mostly driven by a community effort, though most of the heavy lifting up until now has been sponsored by Sun, then Rackspace. It&#8217;s purely an idea of how a MySQL-like database should be written, and while it may be seeing limited production use now, the discussions were on how it can be used and what it does now, not where it&#8217;s going or who is going to pay for its development. It&#8217;s such a good idea, I&#8217;m pretty convinced users will drive it in much the same way Apache was driven by users wanting to do interesting things with HTTP.</p>
<p>I saw a lot of interesting ideas around replication put forth as well. <a href="http://en.oreilly.com/mysql2011/public/schedule/detail/17386">Galera</a>, Tungsten, and <a href="http://en.oreilly.com/mysql2011/public/schedule/detail/17747">Xeround</a> all seem to be trying to build on MySQL&#8217;s success with replication and NDB (a.k.a. MySQL Cluster). I really like that there are multiple takes on how to make a multi-master highly available / scalable system work. Getting all the people using and developing these things into one conference center is always pretty interesting to me.</p>
<p>The keynotes were especially interesting, as they were delivered by people who are sitting at the intersection of the old MySQL world and the new MySQL &#8220;ecosystem&#8221;. I missed Monty Widenius&#8217;s keynote, but it strikes me that he is still leading the charge for a simple, scalable, powerful database system, proving that the core of MySQL is mostly unchanged. M&#229;rten Mickos delivered a really interesting take on how MySQL was part of the last revolution in computing (LAMP) and how it may very well be a big part of the next revolution (IaaS, aka &#8220;the cloud&#8221;). Brian Aker reinforced that MySQL as a concept, and specifically Drizzle, are just part of your Infrastructure (the I in IaaS).</p>
<p>Then on Thursday, <a href="http://en.oreilly.com/mysql2011/public/schedule/detail/17808">Baron Schwartz blew the whole place up</a>. Go watch the video if you weren&#8217;t there, or haven&#8217;t seen it. Baron has always been insightful in his evaluation of the MySQL ecosystem. Maatkit came around when the community needed it, and on joining Percona I think he brought his clear thinking to Peter&#8217;s bold decision making at just the right time to help fuel their rise as one of the most respected consulting firms in the &#8220;WebScale&#8221; world. So when Baron got up and said that the database is still going to scale up, that MySQL isn&#8217;t going to lose to NoSQL or SomeSQL, but rather, that the infrastructure would adapt to the data requirements, it caught my attention, and got me nodding. And when he plainly called Oracle out for not supporting the conference, there was a hush over the crowd followed by a big sigh. It&#8217;s likely that those in attendance were the ones who understand that, and those who weren&#8217;t there were probably the ones who need to hear it. I&#8217;d guess by now they&#8217;ve seen the video or at least heard the call. Either way, thanks Baron for your insight and powerful thoughts.</p>
<p>This was my second MySQL Conference, and I hope it won&#8217;t be my last. The mix of users, developers, and business professionals has always struck me as quite unique, as MySQL sits at the intersection of a number of very powerful avenues. Let&#8217;s hope that O&#8217;Reilly decides to do it again, *and* let&#8217;s hope that Oracle gets on board as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2011/04/the-2011-oreilly-open-mysql-drizzle-maria-monty-percona-xtra-galera-xeround-tungsten-cloud-database-conference-and-expo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ubuntu and Drizzle &#8212; Run Drizzle on your Narwhal: OReilly MySQL Conference &amp; Expo 2011 &#8211; OReilly Conferences, April 11 &#8211; 14, 2011, Santa Clara, CA</title>
		<link>http://fewbar.com/2011/04/ubuntu-and-drizzle-run-drizzle-on-your-narwhal-oreilly-mysql-conference-expo-2011-oreilly-conferences-april-11-14-2011-santa-clara-ca/</link>
		<comments>http://fewbar.com/2011/04/ubuntu-and-drizzle-run-drizzle-on-your-narwhal-oreilly-mysql-conference-expo-2011-oreilly-conferences-april-11-14-2011-santa-clara-ca/#comments</comments>
		<pubDate>Fri, 15 Apr 2011 21:36:24 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Drizzle]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[drizzle]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=387</guid>
		<description><![CDATA[Ubuntu and Drizzle &#8212; Run Drizzle on your Narwhal: OReilly MySQL Conference &#38; Expo 2011 &#8211; OReilly Conferences, April 11 &#8211; 14, 2011, Santa Clara, CA. I gave a talk this week in Santa Clara at the MySQL Users Conference. I think it went pretty well and I got a lot of feedback from Ubuntu [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.oreilly.com/mysql2011/public/schedule/detail/17640">Ubuntu and Drizzle &#8212; Run Drizzle on your Narwhal: OReilly MySQL Conference &amp; Expo 2011 &#8211; OReilly Conferences, April 11 &#8211; 14, 2011, Santa Clara, CA</a>.</p>
<p>I gave a talk this week in Santa Clara at the MySQL Users Conference. I think it went pretty well and I got a lot of feedback from Ubuntu users about the positives of having Drizzle available in Universe. The slides are available at the link above.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2011/04/ubuntu-and-drizzle-run-drizzle-on-your-narwhal-oreilly-mysql-conference-expo-2011-oreilly-conferences-april-11-14-2011-santa-clara-ca/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Handlersocket &#8212; NoSQL for MySQL, now on your favorite Linux..</title>
		<link>http://fewbar.com/2011/02/handlersocket-nosql-for-mysql-now-on-your-favorite-linux/</link>
		<comments>http://fewbar.com/2011/02/handlersocket-nosql-for-mysql-now-on-your-favorite-linux/#comments</comments>
		<pubDate>Wed, 09 Feb 2011 07:42:19 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[handlersocket]]></category>
		<category><![CDATA[natty]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=347</guid>
		<description><![CDATA[Handlersocket is what all the cool kids are using these days.. I think. Basically you get a couple of new ports on your mysql server that allow SQL-free reading and writing for doing many thousands of tiny transactions per second without the overhead of parsing SQL. Thanks to my venerable Ubuntu sponsor, Chuck Short, handlersocket [...]]]></description>
			<content:encoded><![CDATA[<p><a href="https://github.com/ahiguti/HandlerSocket-Plugin-for-MySQL">Handlersocket</a> is what all the cool kids are using these days.. I think. Basically you get a couple of new ports on your mysql server that allow SQL-free reading and writing for doing many thousands of tiny transactions per second without the overhead of parsing SQL.</p>
<p>Thanks to my venerable Ubuntu sponsor, Chuck Short, <a href="https://launchpad.net/ubuntu/+source/handlersocket">handlersocket is now available in Ubuntu Natty</a> in the universe repository. Run <code>apt-get install handlersocket-mysql-5.1 handlersocket-doc</code>, then follow the instructions in <code>/usr/share/doc/handlersocket-doc/docs-en</code> to enable it, and you have yourself a bona fide NoSQL solution for your MySQL server.</p>
<p>There are also client libraries for perl (libnet-handlersocket-perl) and C/C++ (libhsclient-dev .. static only as the API is in flux). Other languages are still not packaged, but the protocol is simple, and links to early implementations are listed in the README file, which should be at /usr/share/doc/handlersocket-mysql-5.1/README.</p>
<p>It should be on Debian unstable as well soon&#8230;<br />
<em>Update April 3 2011, Handlersocket is now in Debian Unstable as well</em></p>
<p>Happy hacking!</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2011/02/handlersocket-nosql-for-mysql-now-on-your-favorite-linux/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Gearman K.O.&#8217;s mysql to solr replication</title>
		<link>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/</link>
		<comments>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/#comments</comments>
		<pubDate>Wed, 24 Mar 2010 05:47:36 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[gearman]]></category>
		<category><![CDATA[opensource]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=154</guid>
		<description><![CDATA[Ding ding ding.. in this corner, wearing black shorts and a giant schema, we have over 11 million records in MySQL with a complex set of rules governing which must be searchable and which must not be. And in that corner, we have the contender, a kid from the back streets, outweighed and out reached [...]]]></description>
			<content:encoded><![CDATA[<p>Ding ding ding.. in this corner, wearing black shorts and a giant schema, we have over 11 million records in MySQL with a complex set of rules governing which must be searchable and which must not be. And in that corner, we have the contender, a kid from the back streets, outweighed and outreached by all his opponents, but still victorious in the queue shootout, with just open source, and 12 patch releases.. written in C, it&#8217;s <b><a href="http://gearman.org">gearman</a></b>!</p>
<p><a href="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2010/03/ko-mike-tyson.png"><img src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2010/03/ko-mike-tyson.png" alt="" title="ko-mike-tyson" width="500" height="437" class="alignnone size-full wp-image-155" /></a><br />
<span id="more-154"></span></p>
<p>I&#8217;m pretty excited today, as I&#8217;m preparing to go live with the first real, high-load application of Gearman that I&#8217;ve written. What is it, you say? Well, it is a simple trigger-based replicator from MySQL to <a href="http://lucene.apache.org/solr/">Solr</a>.</p>
<p>I should say (because I know some of my colleagues read this blog) that I don&#8217;t actually believe in this design. Replication using triggers seems fraught with danger. It totally makes sense if you have a giant application and can&#8217;t track down everywhere that a table is changed. However, if your app is simple and properly abstracted, hopefully you know the 1 or 2 places that write to the table.</p>
<p>I should also say that I really can&#8217;t reveal all of the details. The general idea is pretty simple. Basically we have a trigger that dumps a primary key into gearman via the <a href="https://launchpad.net/gearman-mysql-udf">gearman MySQL UDFs</a>. The idea is just to tell a gearman worker &#8220;look at this record in that table&#8221;.</p>
<p>Once the worker picks it up, it applies some logic to the record.. &#8220;should this be searchable or not&#8221;. If the answer is yes it should be searchable, the worker pushes the record into SOLR. If not, the worker will make sure it is not in solr.</p>
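<p>Since I can&#8217;t show the real code, here is a rough Python sketch of just that decision step. Everything in it is a stand-in: a plain dict plays the part of the Solr index, the <code>is_searchable</code> rule is invented, and the record is passed in directly, whereas the real worker only receives a primary key and has to load the row from MySQL first.</p>
<pre>
def handle_job(record, index, is_searchable):
    """Apply the searchable rule to one record and sync the index.

    index is a dict standing in for Solr, keyed by primary key.
    """
    if is_searchable(record):
        index[record["id"]] = record      # add or refresh the document
        return "published"
    index.pop(record["id"], None)         # make sure it is not in the index
    return "removed"


def example_rule(record):
    """Invented rule for illustration: only active records are searchable."""
    return record.get("status") == "active"


index = {}
print(handle_job({"id": 1, "status": "active"}, index, example_rule))
print(handle_job({"id": 1, "status": "deleted"}, index, example_rule))
</pre>
<p>Because the worker always ends in one of two known states for a record, running the same job twice is harmless, which is what makes it safe to push every record into the queue for a full rebuild.</p>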
<p>This at least is pretty simple. The end result is a system where we can rebuild the search index in parallel using multiple CPUs (thank you to solr/lucene for being able to update indexes concurrently and efficiently btw). This is done by pushing all of the records in the table into the queue at once.</p>
<p>Anyway, gearmand is performing like a champ, and libgearman and the gearman pecl module are doing great. I&#8217;m just really happy to see gearman rolled out in production, as I really do think it has that nice mix of simplicity and performance. I love the command-line client, which makes it easy to write scripts to inject things into queues, or query workers. This allows me to access a worker like this:</p>
<pre>$ gearman -h gearmanbox -f all_workers -s
Known Workers: 11

boxname_RealTimeUpdate_Queue_TriggerWorker_1 jobs=627366,restarts=0,memory_MB=4.27,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_Subject_13311 jobs=304134,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700
boxname_RealTimeUpdate_Queue_Subject_13306 jobs=606126,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_Subject_13314 jobs=576714,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_Subject_13342 jobs=294846,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_Subject_13347 jobs=376998,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_Subject_13359 jobs=470508,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700
boxname_RealTimeUpdate_Queue_Subject_13364 jobs=403182,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700
boxname_RealTimeUpdate_Property_SolrPublish_ jobs=219630,restarts=0,memory_MB=6.19,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Queue_TriggerWorker_2 jobs=393642,restarts=0,memory_MB=4.27,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700
boxname_RealTimeUpdate_Property_SolrBatchPub jobs=6,restarts=0,memory_MB=6.23,lastcheckin=Tue, 23 Mar 2010 22:37:28 -0700</pre>
<p>Brilliant.. no need for html or HTTP.. just a nice simple commandline interface.</p>
<p>I think gearman still has a ways to go. I&#8217;d really like to see some more administration added to it. Deleting empty queues and quickly flushing all queues without restarting gearmand would be nice-to-haves. We&#8217;ll see what happens going forward, but for now, thanks so much to the gearman team (especially Eric Day who showed me gearman, and Brian Aker for pushing hard to release v0.12).</p>
<p>w00t!</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Parallel mysql replication?</title>
		<link>http://fewbar.com/2009/06/parallel-mysql-replication/</link>
		<comments>http://fewbar.com/2009/06/parallel-mysql-replication/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 19:08:48 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[drizzle]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=80</guid>
		<description><![CDATA[It&#8217;s always been a dream of mine. I&#8217;ve posted about parallel replication on Drizzle&#8217;s mailing list before. I think when faced with the problem of a big, highly concurrent master, and scaling out reads simply with lower cost slaves, this is going to be the only way to go. So today I was really glad [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s always been a dream of mine. I&#8217;ve <a href="https://lists.launchpad.net/drizzle-discuss/msg03988.html">posted about parallel replication</a> on Drizzle&#8217;s mailing list before. I think when faced with the problem of a big, highly concurrent master, and scaling out reads simply with lower cost slaves, this is going to be the only way to go.</p>
<p>So today I was really glad to see that somebody is trying out the idea. Seppo Jaakola from <a href="http://www.codership.com/">&#8220;Codership&#8221;</a>, who I&#8217;ve never heard of before today, <a href="https://lists.launchpad.net/drizzle-discuss/msg04214.html">posted a link</a> to an article on his blog about his <a href="http://www.codership.com/content/parallel-applying">experimentation with parallel replication slaves</a>. The findings are pretty interesting.<br />
<span id="more-80"></span><br />
I hope that he&#8217;ll be able to repeat his tests with a real-world setup. The software they&#8217;ve written seems to have the right idea. The biggest issue I have with the tests is that they were run on tiny hardware. Hyperthreading? Single disks? That&#8217;s not really the point of having parallel replication slaves.</p>
<p>The idea is that you have maybe a gigantic real time write server for OLTP. This beast may have lots of medium-power CPU cores, an obscene amount of RAM, and a lot of battery-backed write cache.</p>
<p>Now you know that there are tons of reads that shouldn&#8217;t ever be done against this server. You drop a few replication slaves in, and you realize that you need a box with as much disk storage as your central server, and probably just as much write cache. Pretty soon scaling out those reads is just not very cost effective.</p>
<p>However, if you could have lots of CPU cores, and lots of cheap disks, you could dispatch these writes to be done in parallel, and you wouldn&#8217;t need expensive disk systems or lots of RAM for each slave.</p>
<p>So, the idea is not to make slaves faster in a 1:1 size comparison. It&#8217;s to make it easier for a cheap slave to keep up with a very busy, very expensive master.</p>
<p>I do see where another huge limiting factor is making sure things synchronize in commit order. I think that&#8217;s an area where a lot of time needs to be spent on optimization. The order should already be known, so that the committer thread is just waiting for the next one in line, and if the next 100 are already done it can just rip through them quickly rather than signalling each one that it can go. Something like this seems right:</p>
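<p><code><br />
/* Pseudocode for the committer thread. wait_for_commit(id) is assumed to block<br />
   until transaction id has been applied, and to return false at shutdown. */<br />
id = first_commit_id();<br />
while (wait_for_commit(id)) {<br />
  commit(id); /* commit strictly in the master's order */<br />
  id++;<br />
}<br />
</code></p>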
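<p>To make the &#8220;rip through them quickly&#8221; part concrete, here is a small single-threaded Python sketch of the same idea. It is only an illustration under my own assumptions, not Codership&#8217;s implementation: <code>finished</code> is a set of transaction ids the applier threads have completed, and <code>commit</code> is whatever callable makes one transaction durable.</p>
<pre>
def drain_in_order(next_id, finished, commit):
    """Commit every consecutively finished transaction, starting at next_id.

    Returns the first id that is still pending, which is where the
    committer would go back to waiting.
    """
    while next_id in finished:
        commit(next_id)
        finished.remove(next_id)
        next_id += 1
    return next_id


# Transactions 5, 6, 7 and 9 were applied out of order by parallel workers.
finished = {5, 6, 7, 9}
committed = []
pending = drain_in_order(5, finished, committed.append)
print(committed, pending)
</pre>
<p>Here 5, 6 and 7 are committed in one pass with no per-transaction signalling, and 9 stays parked until 8 shows up.</p>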
<p>I applaud the efforts of Codership, and I hope they&#8217;ll continue the project and maybe ship something that will rock all our worlds.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/06/parallel-mysql-replication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I should write an innodb backup tool</title>
		<link>http://fewbar.com/2008/11/i-should-write-an-innodb-backup-tool/</link>
		<comments>http://fewbar.com/2008/11/i-should-write-an-innodb-backup-tool/#comments</comments>
		<pubDate>Tue, 11 Nov 2008 18:04:15 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[backups]]></category>
		<category><![CDATA[innodb]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=23</guid>
		<description><![CDATA[One of my favorite bloggers, Peter over at Percona/MySQL Performance Blog, has thrown down the gauntlet. Basically saying that it would be fairly trivial to write an incremental InnoDB backup tool. If you see me, and I haven&#8217;t run up to you and told you that I am writing/have written an amazing InnoDB incremental backup [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite bloggers, Peter over at Percona/MySQL Performance Blog, <a href="http://www.mysqlperformanceblog.com/2008/11/10/thoughs-on-innodb-incremental-backups/">has thrown down the gauntlet</a>, basically saying that it would be fairly trivial to write an incremental InnoDB backup tool.</p>
<p>If you see me, and I haven&#8217;t run up to you and told you that I am writing/have written an amazing InnoDB incremental backup tool, I give you permission to make fun of me. This sounds like a fun, interesting project that will challenge me and sort of scratches an itch I have, which is, faster MySQL backups.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/11/i-should-write-an-innodb-backup-tool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deciding whether to send reads to slave or master</title>
		<link>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/</link>
		<comments>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/#comments</comments>
		<pubDate>Sat, 04 Oct 2008 17:43:42 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[application]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=18</guid>
		<description><![CDATA[There are quite a few articles out there that talk about how to give your application some context and send reads to one server, and writes to another. There are even some mentions of marking your connection &#8220;dirty&#8221; and then sending all reads to the write server. As a first try at scaling things, I [...]]]></description>
			<content:encoded><![CDATA[<p>There are quite a few articles out there that talk about how to give your application some context and send reads to one server, and writes to another. There are even some mentions of marking your connection &#8220;dirty&#8221; and then sending all reads to the write server.</p>
<p>As a first try at scaling things, I recently made a change to our web application&#8217;s data access layer where reads went to a group of readonly slaves. However, if a write was made to a database, a value was put into the user&#8217;s session, saying that the database was dirty, and causing all subsequent reads to go to the master server.<br />
<span id="more-18"></span><br />
This was good as users would use the readonly slaves as long as they hadn&#8217;t changed anything in the database. The real problem though, was that as soon as the user logged in, their account was updated to say that they had logged in, marking that database dirty.</p>
<p>Rather than try to cleverly work around this one problem, we changed the &#8220;dirty&#8221; value from a boolean to a timestamp. Whenever the user writes to the database, the data access layer records the current time in their session. A global timeout is then applied to that timestamp, and until it expires that user&#8217;s reads keep going to the master. This gives the replication slaves time to catch up and receive the record that was just changed, so the user has a consistent view of their data.</p>
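<p>A minimal Python sketch of that timestamp check, for illustration only. The session is assumed to be a plain dict, and the names <code>mark_dirty</code>, <code>choose_server</code>, <code>db_dirty_at</code>, and the 30 second timeout are all hypothetical, not taken from our actual data access layer.</p>

```python
import time

# Hypothetical global timeout; pick a value larger than your usual slave lag.
DIRTY_TIMEOUT_SECONDS = 30.0


def mark_dirty(session, now=None):
    """Record the time of the user's latest write in their session."""
    session["db_dirty_at"] = time.time() if now is None else now


def choose_server(session, now=None, timeout=DIRTY_TIMEOUT_SECONDS):
    """Return "master" while the last write is recent, otherwise "slave"."""
    now = time.time() if now is None else now
    dirty_at = session.get("db_dirty_at")
    if dirty_at is None:
        # This session has never written, so the slaves are safe to read.
        return "slave"
    elapsed = now - dirty_at
    # Once the timeout has passed, assume the slaves have caught up.
    return "slave" if elapsed >= timeout else "master"
```

<p>The data access layer would call <code>mark_dirty</code> after every write and <code>choose_server</code> before every read. A login that updates the account only pins the user to the master for the length of the timeout, not for the whole session.</p>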
<p>This is great, but I think a further step is to have something publish the actual maximum lag of the slaves into a memcache key, and simply double that value as the timeout. This would allow maximum usage of the readonly slaves and keep the master server busy doing mostly writes.</p>
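<p>Here is a rough Python sketch of that idea, under stated assumptions. A plain dict stands in for memcache, the key name <code>max_slave_lag_seconds</code> and both function names are made up for the example, and the 30 second fallback is an arbitrary choice for when nothing has been published yet.</p>

```python
# Hypothetical cache key; a real deployment would use its own naming scheme.
LAG_KEY = "max_slave_lag_seconds"


def publish_max_lag(cache, slave_lags, key=LAG_KEY):
    """Store the worst replication lag (in seconds) seen across all slaves."""
    cache[key] = max(slave_lags)


def dirty_timeout(cache, key=LAG_KEY, fallback=30.0):
    """Return double the published maximum lag, or the fallback if unset."""
    lag = cache.get(key)
    if lag is None:
        # Nothing published yet, so fall back to a conservative fixed timeout.
        return fallback
    return 2.0 * float(lag)
```

<p>Something like a monitoring job would call <code>publish_max_lag</code> with the lag it measures on each slave, and the application would call <code>dirty_timeout</code> in place of the fixed global timeout.</p>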
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Can more queries equal a healthier MySQL server?</title>
		<link>http://fewbar.com/2008/08/innodb-concurrency-problems-on-multi-core-boxes-possibly-a-thing-of-the-past/</link>
		<comments>http://fewbar.com/2008/08/innodb-concurrency-problems-on-multi-core-boxes-possibly-a-thing-of-the-past/#comments</comments>
		<pubDate>Sat, 30 Aug 2008 06:21:34 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[innodb]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=14</guid>
		<description><![CDATA[This week was an ugly one for my monster database servers. It should have been triumphant, but oddly enough, I think it shows how prone to mistuning InnoDB on MySQL 5.0 is with multiple cores. This server is a multi-core, high concurrency server. The application has been designed a little bit naively in that it [...]]]></description>
			<content:encoded><![CDATA[<p>This week was an ugly one for my monster database servers. It should have been triumphant, but oddly enough, I think it shows how prone to mistuning InnoDB on MySQL 5.0 is on multi-core machines.</p>
<p>This server is a multi-core, high concurrency server. The application has been designed a little bit naively in that it just throws almost all queries at the main db server. Several bits have been designed to scale by not doing that, but unfortunately, huge amounts of functionality were built around those apps in ways that prevent them from scaling.</p>
<p>As a result, we&#8217;ve had to scale up the central database server and its redundant systems significantly. We started with the Proliant DL380 G4 with two Xeon 3.4GHz CPUs and 12GB of RAM, and plenty of disks in an external RAID. As more traffic was added, we moved up to the DL580 servers with 4 Xeon 3.4GHz CPUs and 64GB of RAM. This worked well, but still more traffic, and more data, was coming and the app wasn&#8217;t ready to change significantly. We finally landed on the latest DL580 server, with 1GB of total battery backed write cache, 14 SAS disks, 128GB of RAM, and two quad core Xeon CPUs.<br />
<span id="more-14"></span><br />
Some things got better. Writes were now incredibly fast. The server was churning out 1000 queries per second easily. Sometimes during peak times, query response time would suffer, but ultimately, the box was keeping up and performing well. <a href="http://fewbar.com/2008/07/mysql-query-cache-scales-like-a-286-with-turbo-off/">Especially after we turned off query caching</a>. After this week though, I wonder how much of the problem was query caching&#8230; more later.</p>
<p>Anyway, whenever the server would need to have maintenance, some high traffic applications would suffer needlessly because they depend on rarely changing data (memcached was out of the question for the complexity and &#8220;realtime&#8221; nature of this data). So we set up a selective replication fanout onto multiple boxes and pointed these apps at that cluster for these queries.</p>
<p>Well the next day, without all of these tiny queries pounding on it, the database server had horrible problems. 400 threads stacked up inside InnoDB &#8220;Waiting for InnoDB queue&#8221;. System resources were fine, but it was clear InnoDB was having trouble. Queries that normally take 0.75 seconds were taking 300+ seconds, or just never completing. I knew there was real trouble when killing a thread would result in it just changing state to &#8220;Killed&#8221;, but never dying. Based on what I&#8217;d read in High Performance MySQL, and <a href="http://www.mysqlperformanceblog.com/2006/06/05/innodb-thread-concurrency/">articles like this one</a>, I tried twiddling with innodb_thread_concurrency, innodb_concurrency_tickets, and innodb_thread_sleep_delay. None of them seemed to help, though innodb_thread_concurrency set to a value of about half the CPU cores seemed to delay the problems.</p>
<p>I noticed that we were running MySQL v5.0.51a still. We had planned an upgrade to 5.0.67, which was just recently released, but hadn&#8217;t gotten there yet. I went ahead and upgraded one of the boxes to it, and failed over to it. Instantly things were more healthy, and the health seemed to stay for hours, without any more InnoDB freakouts.</p>
<p>After some research, it would seem that between 5.0.51a and 5.0.67, a lot of really big fixes were made to InnoDB to help it scale up on multi-core machines. The box has been healthy for a couple of days, though there&#8217;s still a lot of work to do removing query load from the server.</p>
<p>But why would a _reduction_ in queries cause concurrency problems? I have a theory, but no real ideas on how to test it.</p>
<p>Before, we were doing 1000 queries per second. Things were healthy. We removed about 400 queries per second from that. These 400 queries were basically instantaneous, often times returning no results at all and reading from tables and indexes completely stored in the innodb_buffer_pool. But, with query cache turned off, they were still being processed fully by InnoDB. When we removed these tiny queries from the queue imposed by innodb_thread_concurrency, I think we removed the equivalent of spin waits from the queue. These tiny, easy queries were just hard enough to process to prevent a lot of bigger queries from hitting the queue at the same time. That&#8217;s why reducing innodb_thread_concurrency to 4 helped a bit: with only 4 threads vying for mutexes and CPU resources constantly, InnoDB was able to (sort of) keep up.</p>
<p>My final bit of evidence for this is that we actually, I think, had this problem before, as described in the <a href="http://fewbar.com/2008/07/mysql-query-cache-scales-like-a-286-with-turbo-off/">aforementioned article</a>. Turning off the query cache moved these tiny queries out of the query cache, and into the InnoDB queue, providing the needed pseudo-spin-waits to prevent it from locking in on itself.</p>
<p>I have to wonder if raising innodb_sync_spin_loops to something ridiculously high, like 50000, would have the same effect. Unfortunately, it&#8217;s very hard to test this without dedicating a lot of time to it.</p>
<p>So, in this case, it would seem that more work can, in fact, make the server healthier.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/08/innodb-concurrency-problems-on-multi-core-boxes-possibly-a-thing-of-the-past/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Query Cache defeats Serverzilla</title>
		<link>http://fewbar.com/2008/07/mysql-query-cache-scales-like-a-286-with-turbo-off/</link>
		<comments>http://fewbar.com/2008/07/mysql-query-cache-scales-like-a-286-with-turbo-off/#comments</comments>
		<pubDate>Tue, 15 Jul 2008 20:47:55 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=11</guid>
		<description><![CDATA[So a few days ago, my big mean MySQL server started having problems that were very hard to explain. It was slowing down, taking a minute to run queries that usually take a few seconds, and Linux load averages were in the teens, despite having quiet disks (less than 0.1% cpu IO wait time) and [...]]]></description>
			<content:encoded><![CDATA[<p>So a few days ago, my big mean MySQL server started having problems that were very hard to explain. It was slowing down, taking a minute to run queries that usually take a few seconds, and Linux load averages were in the teens, despite having quiet disks (less than 0.1% cpu IO wait time) and plenty of RAM (128G for about 200G of data total&#8230;).</p>
<p>The developers were stumped. The other systems guys were stumped. So was I. But it still seemed ok. We found all sorts of things to point fingers at, but nothing made sense.<br />
<span id="more-11"></span><br />
Then this Monday, everything came to a screeching halt. 3 second queries were taking 15 minutes. 30 second queries were never completing. The CPUs were only a little busy. What gives?! This box has 8 CPU cores and 128G of RAM.. nothing can take it down, right?!</p>
<p>We threw our hands in the air and failed over to the active standby (the other side of our master&lt;-&gt;master replication pair). Suddenly all was well. But something smelled wrong. We blamed some kind of bug in MySQL.</p>
<p>I spent all day trying to make Memcached more efficient, and trying to explain why suddenly this beast was felled by such tiny arrows as instantaneous queries that should have been cached anyway.</p>
<p>Oh wait, did somebody say cached? As in the MySQL query cache? I mentioned this in the #mysql channel on <a href="http://freenode.net">Freenode</a>, and Mr. Eric Bergen (ebergen) from <a href="http://www.provenscaling.com/">Proven Scaling</a> immediately said something like &#8220;well duh, turn off the cache, moron&#8221;. I was dumbfounded. Shouldn&#8217;t it be helping us with all those tiny queries?</p>
<p>Well apparently not. <a href="http://lists.mysql.com/internals/35777">This recent thread on the MySQL internals list</a> talks about mutex contention in the query cache while it is *searched*, not just while it is updated. This is disastrous for an environment where thousands and thousands of tiny queries are being run constantly. Even with query_cache_type set to 2, or &#8220;cache on demand&#8221; mode, every query in the system must run through this mutex.</p>
<p>So, this morning when the standby box again cried for mercy, hitting max_connections and spinning all queries around in circles, I ran &#8216;SET GLOBAL query_cache_type=2&#8217;. Instantly the server became more healthy. I half expected to trade one problem for another.. with the server being consumed by tiny queries. But instead, these tiny queries did as expected, and took very little time to complete. And large queries against tables that change every second or two didn&#8217;t have to contend for the query cache, they just ran through like nothing.</p>
<p>So, it would appear that for any sort of multi-core installations of MySQL, the query cache is not only a waste, but a hazard!</p>
<p>Thanks again to Mr. Bergen. I would not have thought about that until he said it.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/07/mysql-query-cache-scales-like-a-286-with-turbo-off/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>


