<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>FewBar.com - Make it good &#187; replication</title>
	<atom:link href="http://fewbar.com/tag/replication/feed/" rel="self" type="application/rss+xml" />
	<link>http://fewbar.com</link>
	<description>Technology, life, and mischief, not in that order</description>
	<lastBuildDate>Fri, 23 Dec 2011 01:41:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>Parallel mysql replication?</title>
		<link>http://fewbar.com/2009/06/parallel-mysql-replication/</link>
		<comments>http://fewbar.com/2009/06/parallel-mysql-replication/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 19:08:48 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[drizzle]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[replication]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=80</guid>
		<description><![CDATA[Its always been a dream of mine. I&#8217;ve posted about parallel replication on Drizzle&#8217;s mailing list before. I think when faced with the problem of a big, highly concurrent master, and scaling out reads simply with lower cost slaves, this is going to be the only way to go. So today I was really glad [...]]]></description>
			<content:encoded><![CDATA[<p>Its always been a dream of mine. I&#8217;ve <a href="https://lists.launchpad.net/drizzle-discuss/msg03988.html">posted about parallel replication</a> on Drizzle&#8217;s mailing list before. I think when faced with the problem of a big, highly concurrent master, and scaling out reads simply with lower cost slaves, this is going to be the only way to go.</p>
<p>So today I was really glad to see that somebody is trying out the idea. Seppo Jaakola from <a href="http://www.codership.com/">&#8220;Codership&#8221;</a>, who I&#8217;ve never heard of before today, <a href="https://lists.launchpad.net/drizzle-discuss/msg04214.html">posted a link</a> to an article on his blog about his <a href="http://www.codership.com/content/parallel-applying">experimentation with parallel replication slaves</a>. The findings are pretty interesting.<br />
<span id="more-80"></span><br />
I hope that he&#8217;ll be able to repeat his tests with a real world setup. The software they&#8217;ve written seems to have the right idea. The biggest issue I have with the tests is that  the tests were run on tiny hardware. Hyperthreading? Single disks? Thats not really the point of having parallel replication slaves.</p>
<p>The idea is that you have maybe a gigantic real time write server for OLTP. This beast may have lots of medium-power CPU cores, and an obscene amount of RAM, and a lot of battery backed write cache for writes.</p>
<p>Now you know that there are tons of reads that shouldn&#8217;t ever be done against this server. You drop a few replication slaves in, and you realize that you need a box with as much disk storage as your central server, and probably just as much write cache. Pretty soon scaling out those reads is just not very cost effective.</p>
<p>However, if you could have lots of CPU cores, and lots of cheap disks, you could dispatch these writes to be done in parallel, and you wouldn&#8217;t need expensive disk systems or lots of RAM for each slave.</p>
<p>So, the idea is not to make slaves faster in a 1:1 size comparison. Its to make it easier for a cheap slave to keep up with a very busy, very expensive master.</p>
<p>I do see where another huge limiting factor is making sure things synchronize in commit order. I think thats an area where a lot of time needs to be spent on optimization. The order should already be known so that the commiter thread is just waiting for the next one in line, and if the next 100 are already done it can just rip through them quickly, not signal them that they can go. Something like this seems right:</p>
<p><code><br />
id=first_commit_id();<br />
while(wait_for_commit(id)) {<br />
  commit(id);<br />
  id++;<br />
}<br />
</code></p>
<p>I applaud the efforts of Codeship, and I hope they&#8217;ll continue the project and maybe ship something that will rock all our worlds.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/06/parallel-mysql-replication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deciding whether to send reads to slave or master</title>
		<link>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/</link>
		<comments>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/#comments</comments>
		<pubDate>Sat, 04 Oct 2008 17:43:42 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[application]]></category>
		<category><![CDATA[replication]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=18</guid>
		<description><![CDATA[There are quite a few articles out there that talk about how to give your application some context and send reads to one server, and writes to another. There are even some mentions of marking your connection &#8220;dirty&#8221; and then sending all reads to the write server. As a first try at scaling things, I [...]]]></description>
			<content:encoded><![CDATA[<p>There are quite a few articles out there that talk about how to give your application some context and send reads to one server, and writes to another. There are even some mentions of marking your connection &#8220;dirty&#8221; and then sending all reads to the write server.</p>
<p>As a first try at scaling things, I recently made a change to our web application&#8217;s data access layer where reads went to a group of readonly slaves. However, if a write was made to a database, a value was put into the user&#8217;s session, saying that the database was dirty, and causing all subsequent reads to go to the master server.<br />
<span id="more-18"></span><br />
This was good as users would use the readonly slaves as long as they hadn&#8217;t changed anything in the database. The real problem though, was that as soon as the user logged in, their account was updated to say that they had logged in, marking that database dirty.</p>
<p>Rather than try to cleverly change this one problem, we changed the &#8220;dirty&#8221; value from a boolean to a timestamp. Whenever the user writes to the database, it records the current time in their session. Then a global timeout is applied to that. This gives the replication slaves time to catch up and get the record that was just changed, then the user will have a consistent view fo their data.</p>
<p>This is great, but I think a further step is to have something publish the actual maximum lag of the slaves into a memcache key, and simply double that value as the timeout. This would allow maximum usage of the readonly slaves and keep the master server busy doing mostly writes.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/10/maximizing-usage-of-mysql-replication-slaves/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

<!-- Dynamic Page Served (once) in 0.138 seconds -->

