<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>FewBar.com - Make it good &#187; PHP</title>
	<atom:link href="http://fewbar.com/tag/php/feed/" rel="self" type="application/rss+xml" />
	<link>http://fewbar.com</link>
	<description>Technology, life, and mischief, not in that order</description>
	<lastBuildDate>Fri, 23 Dec 2011 01:41:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>Time for some ghetto monitoring</title>
		<link>http://fewbar.com/2011/05/time-for-some-ghetto-monitoring/</link>
		<comments>http://fewbar.com/2011/05/time-for-some-ghetto-monitoring/#comments</comments>
		<pubDate>Mon, 02 May 2011 16:54:51 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Cloud]]></category>
		<category><![CDATA[drizzle]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[ubuntu]]></category>
		<category><![CDATA[upstart]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=391</guid>
		<description><![CDATA[If you came here between April 28 and about an hour ago, you got a &#8220;couldn&#8217;t connect to database&#8221; error. Oops! Seems my limited memory EC2 instance got a little overwhelmed by php processes and decided the db server, drizzled, should die to make more room for PHP. Ooops! Time to drop pm.max_children. I don&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/05/4584789991_7045d10c2c.jpg"><img class="alignleft size-medium wp-image-401" title="4584789991_7045d10c2c" src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2011/05/4584789991_7045d10c2c-300x225.jpg" alt="" width="300" height="225" /></a></p>
<p>If you came here between April 28 and about an hour ago, you got a &#8220;couldn&#8217;t connect to database&#8221; error. Oops! Seems my limited memory EC2 instance got a little overwhelmed by php processes and decided the db server, drizzled, should die to make more room for PHP. Ooops! Time to drop pm.max_children.</p>
<p>I don&#8217;t have any monitoring set up for the site, so I only just figured it out. Until I get proper monitoring, I&#8217;ve installed this fancy bit of duct-tape upstart magic:<br />
<code><br />
start on stopping<br />
task<br />
script<br />
  env | mail -s "$JOB is stopping!" me@myemail.com<br />
end script<br />
</code></p>
<p>What does this do? Well, it emails me whenever upstart gives up respawning something, or I manually stop a service.</p>
<p>It&#8217;s not monitoring. I need monitoring. But this is a nice little hack to prevent a regression while I figure that out.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2011/05/time-for-some-ghetto-monitoring/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Gearman K.O.&#8217;s mysql to solr replication</title>
		<link>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/</link>
		<comments>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/#comments</comments>
		<pubDate>Wed, 24 Mar 2010 05:47:36 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[gearman]]></category>
		<category><![CDATA[opensource]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=154</guid>
		<description><![CDATA[Ding ding ding.. in this corner, wearing black shorts and a giant schema, we have over 11 million records in MySQL with a complex set of rules governing which must be searchable and which must not be. And in that corner, we have the contender, a kid from the back streets, outweighed and out reached [...]]]></description>
			<content:encoded><![CDATA[<p>Ding ding ding.. in this corner, wearing black shorts and a giant schema, we have over 11 million records in MySQL with a complex set of rules governing which must be searchable and which must not be. And in that corner, we have the contender, a kid from the back streets, outweighed and outreached by all his opponents, but still victorious in the queue shootout, with just open source, and 12 patch releases.. written in C, it&#8217;s <b><a href="http://gearman.org">gearman</a></b>!</p>
<p><a href="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2010/03/ko-mike-tyson.png"><img src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2010/03/ko-mike-tyson.png" alt="" title="ko-mike-tyson" width="500" height="437" class="alignnone size-full wp-image-155" /></a><br />
<span id="more-154"></span></p>
<p>I&#8217;m pretty excited today, as I&#8217;m preparing to go live with the first real, high-load application of Gearman that I&#8217;ve written. What is it, you say? Well, it is a simple trigger-based replicator from MySQL to <a href="http://lucene.apache.org/solr/">SOLR</a>.</p>
<p>I should say (because I know some of my colleagues read this blog) that I don&#8217;t actually believe in this design. Replication using triggers seems fraught with danger. It totally makes sense if you have a giant application and can&#8217;t track down everywhere that a table is changed. However, if your app is simple and properly abstracted, hopefully you know the 1 or 2 places that write to the table.</p>
<p>I should also say that I really can&#8217;t reveal all of the details. The general idea is pretty simple. Basically we have a trigger that dumps a primary key into gearman via the <a href="https://launchpad.net/gearman-mysql-udf">gearman MySQL UDFs</a>. The idea is just to tell a gearman worker &#8220;look at this record in that table&#8221;.</p>
<p>Once the worker picks it up, it applies some logic to the record.. &#8220;should this be searchable or not&#8221;. If the answer is yes it should be searchable, the worker pushes the record into SOLR. If not, the worker will make sure it is not in solr.</p>
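<p>As a rough illustration of that decision step (not the production worker, whose rules I can&#8217;t share), here is a minimal Python sketch. The <code>is_searchable</code> rule, the <code>status</code> field, and the two dicts standing in for the MySQL table and the SOLR index are all assumptions made up for the example.</p>

```python
# Hypothetical stand-ins: one dict plays the MySQL table, another plays the search index.
def is_searchable(record):
    # Placeholder rule; the real rules are not described in this post.
    return record.get("status") == "active"


def handle_job(record_id, table, index):
    """Apply the searchable-or-not decision to one record, as a worker would."""
    record = table.get(record_id)
    if record is not None and is_searchable(record):
        index[record_id] = record      # push (or refresh) the record in the index
        return "indexed"
    index.pop(record_id, None)         # make sure it is not in the index
    return "removed"


table = {1: {"status": "active"}, 2: {"status": "hidden"}}
index = {2: {"status": "active"}}      # stale entry left over from an earlier state

results = [handle_job(pk, table, index) for pk in (1, 2, 3)]
print(results, sorted(index))
```

<p>Note that the job carries only the primary key. The worker re-reads the record and decides from its current state, so running the same job twice does no harm.</p>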
<p>This at least is pretty simple. The end result is a system where we can rebuild the search index in parallel using multiple CPU&#8217;s (thank you to solr/lucene for being able to update indexes concurrently and efficiently btw). This is done by pushing all of the records in the table into the queue at once.</p>
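<p>The rebuild idea can be sketched the same way: put every primary key on one queue and let several workers drain it. This is only a sketch under assumptions. Python threads stand in for separate gearman worker processes, and <code>rebuild</code>, <code>worker_count</code>, and <code>process</code> are hypothetical names, not anything from gearman.</p>

```python
import queue
import threading


def rebuild(record_ids, worker_count, process):
    """Push every id into one queue, then let worker_count threads drain it."""
    jobs = queue.Queue()
    for record_id in record_ids:
        jobs.put(record_id)

    done = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                record_id = jobs.get_nowait()
            except queue.Empty:
                return                  # queue drained, this worker is finished
            result = process(record_id)
            with lock:                  # protect the shared result list
                done.append(result)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


processed = rebuild(range(1000), worker_count=4, process=lambda pk: pk * 2)
print(len(processed))
```

<p>Results arrive in whatever order the workers finish, which is fine here because each record is handled independently.</p>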
<p>Anyway, gearmand is performing like a champ, libgearman and the gearman pecl module are doing great. I&#8217;m just really happy to see gearman rolled out in production, as I really do think it has that nice mix of simplicity and performance. I love the commandline client which makes it easy to write scripts to inject things into queues, or query workers.  This allows me to access a worker like this:</p>
<p><code>$ gearman -h gearmanbox -f all_workers -s<br />
Known Workers: 11<br />
<br />
boxname_RealTimeUpdate_Queue_TriggerWorker_1 jobs=627366,restarts=0,memory_MB=4.27,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13311 jobs=304134,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13306 jobs=606126,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13314 jobs=576714,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13342 jobs=294846,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13347 jobs=376998,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13359 jobs=470508,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700<br />
boxname_RealTimeUpdate_Queue_Subject_13364 jobs=403182,restarts=0,memory_MB=7.03,lastcheckin=Tue, 23 Mar 2010 22:37:58 -0700<br />
boxname_RealTimeUpdate_Property_SolrPublish_ jobs=219630,restarts=0,memory_MB=6.19,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Queue_TriggerWorker_2 jobs=393642,restarts=0,memory_MB=4.27,lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700<br />
boxname_RealTimeUpdate_Property_SolrBatchPub jobs=6,restarts=0,memory_MB=6.23,lastcheckin=Tue, 23 Mar 2010 22:37:28 -0700</code></p>
<p>Brilliant.. no need for html or HTTP.. just a nice simple commandline interface.</p>
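<p>Because the report is plain text, scripting against it is easy too. Here is a minimal Python sketch that turns one report line into a dict. It assumes the line format shown above (worker name, a space, then comma-separated <code>key=value</code> pairs); <code>parse_worker_line</code> is a made-up helper, not part of gearman.</p>

```python
import re

# Each report line is "WORKER_NAME key=value,key=value,...".
# lastcheckin holds a date with its own comma, so split only where a comma
# is immediately followed by "name=".
FIELD_SPLIT = re.compile(r",(?=\w+=)")


def parse_worker_line(line):
    """Parse one worker report line into a dict with typed values."""
    name, _, rest = line.strip().partition(" ")
    fields = dict(part.split("=", 1) for part in FIELD_SPLIT.split(rest))
    return {
        "name": name,
        "jobs": int(fields["jobs"]),
        "restarts": int(fields["restarts"]),
        "memory_mb": float(fields["memory_MB"]),
        "lastcheckin": fields["lastcheckin"],
    }


sample = (
    "boxname_RealTimeUpdate_Queue_TriggerWorker_1 "
    "jobs=627366,restarts=0,memory_MB=4.27,"
    "lastcheckin=Tue, 23 Mar 2010 22:37:59 -0700"
)
info = parse_worker_line(sample)
print(info["jobs"], info["lastcheckin"])
```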
<p>I think gearman still has a ways to go. I&#8217;d really like to see some more administration added to it. Deleting empty queues and quickly flushing all queues without restarting gearmand would be nice-to-haves. We&#8217;ll see what happens going forward, but for now, thanks so much to the gearman team (especially Eric Day who showed me gearman, and Brian Aker for pushing hard to release v0.12).</p>
<p>w00t!</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2010/03/gearman-replicate-mysql-to-solr/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How do you do, that voodoo, that Queues Do?</title>
		<link>http://fewbar.com/2010/01/queue-voodoo-that-queues-do/</link>
		<comments>http://fewbar.com/2010/01/queue-voodoo-that-queues-do/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 08:32:46 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Scalability]]></category>
		<category><![CDATA[amqp]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[gearman]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[qpid]]></category>
		<category><![CDATA[queuing]]></category>
		<category><![CDATA[stomp]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=136</guid>
		<description><![CDATA[Queues seem to be all over the place right now. Maybe it&#8217;s like when I wanted a VW GTi VR6 a few years back. I kept seeing them pass me on the freeway and thought &#8220;crap, everybody is getting this hot new thing and I&#8217;m missing out!&#8221;. I think everybody at one point looked at [...]]]></description>
			<content:encoded><![CDATA[<p>Queues seem to be all over the place right now. Maybe it&#8217;s like when I wanted a VW GTi VR6 a few years back. I kept seeing them pass me on the freeway and thought &#8220;crap, everybody is getting this hot new thing and I&#8217;m missing out!&#8221;.</p>
<p>I think everybody at one point looked at MySQL and thought.. &#8220;that would work fine as a queue system&#8221;. For low volume stuff, it *is* fine. But then somebody grabs your little transactional, relational, reliable queue system and plugs 5 million messages per hour through it, and somewhere, a man named Heikki cries.</p>
<p>So then you start to look around.. and for those of us who have meager budgets and tend to use open source, there aren&#8217;t a lot of choices.<span id="more-136"></span> <a href="http://wiki.secondlife.com/wiki/Message_Queue_Evaluation_Notes">The guys at Second Life did some research for all of us&#8230;</a>. Once you get through that though, you realize that the needs of Second Life, an MMORPG, are quite a bit different from your average web app.</p>
<p>So, without further ado, my &#8220;queue&#8221; system round up.</p>
<ul>
<li><a href="http://activemq.apache.org/">ActiveMQ</a> &#8211; This shining star of the queueing world seems to come up quickly in conversation. At Adicio, we actually gave it a good try. The main problem was, we&#8217;re a PHP shop. The PHP accessibility comes not through the normal Java Message Service connector, but <a href="http://stomp.codehaus.org/Protocol">&#8220;STOMP&#8221;</a>.
<p />
Honestly, I&#8217;m not a big fan of these giant Apache sponsored java projects. <a href="http://lucene.apache.org/solr/">SOLR</a> has changed my mind a bit, as it seems to work well and doesn&#8217;t really crash. Then again, I&#8217;m not carrying a pager anymore, so maybe it does suck and I&#8217;m just not seeing it.</p>
<p />
Anyway, at first, ActiveMQ was winning me over. It was pretty quick.. had a pretty simple setup curve (just start up the latest version, and you have a working persistent queue system), and despite having mountains of documentation that reads like the text spammers shove into their emails randomly to pass bayesian filters, it made sense.</p>
<p />
However, its fall was pretty quick, as the first problem we hit was its Producer Throttling. This probably works fine when you&#8217;re using the JMS connector. However, with Stomp, when ActiveMQ decides your queue is too full, and it needs you to stop, it just stops acking your packets. Your stomp client blocks (or spins, in non-blocking mode) and you wait. This is made worse by the fairly naive PHP stomp driver, which doesn&#8217;t really check to see why its write failed, or even try to see if it <b>can</b>.</p>
<p />
Things got better when that was disabled, but the stomp driver was still haphazard. After figuring out that the Master/Slave protocol requires one to shut down the slave whenever failing back to a downed master, I had had enough. Sayonara, ActiveMQ.
</li>
<li><a href="http://www.rabbitmq.com/">RabbitMQ</a> &#8211; This one seems to be a favorite of many. My experience is limited, and I really haven&#8217;t tried it that much. It&#8217;s written in Erlang, which I guess automatically makes something &#8220;telco reliable&#8221;. Cool.</li>
<li><a href="http://qpid.apache.org/">QPID</a> &#8211; Wow, this one is supposedly INCREDIBLE. <a href="http://www.redhat.com/mrg/messaging/features/#aio">&#8220;500,000 messages per second per LUN.&#8221; </a>. WOW. It also has RedHat&#8217;s backing, which is a big win for me.<br />
In fact, as I write this, I&#8217;m doing my best to build and install the latest qpid on CentOS 5.4. </p>
<pre>
 gcc -DHAVE_CONFIG_H -I. -I. -I./src/config -I./include/ -I/usr/src/redhat/BUILD/xerces-c-src_2_8_0/src -I./src/lexer/ -D_GNU_SOURCE -D_REENTRANT -O2 -g -m64 -mtune=generic -MT mapm_add.lo -MD -MP -MF .deps/mapm_add.Tpo -c src/mapm/mapm_add.c  -fPIC -DPIC -o .libs/mapm_add.o
...
</pre>
<p>In case you&#8217;re familiar, I&#8217;m there. Oops, that&#8217;s not qpid. That&#8217;s xerces-c. Which I have to build.. and I also have to build xqilla after that. Luckily, 40 other packages required to build qpid were available in the standard CentOS yum repository.</p>
<p />
Another unfortunate reality is that there is no qpid connectivity available for PHP. Unless the <a href="http://code.google.com/p/php-amqp/">php-amqp module</a> works. It&#8217;s really not clear yet.<br />
Anyway, this looks like a promising messaging technology. However, this much software leaves a lot of room for things to break.. so, while I will probably complete the build, as I want to find out how it stacks up to the others in terms of simplicity and performance, I think this one is dead.</p>
<p />
</li>
<li><a href="http://gearman.org">Gearman</a> &#8211; Ok I&#8217;m going to say it up front. I like this one. It&#8217;s really not a &#8220;queue&#8221; system per se. The name is an anagram of &#8216;manager&#8217; (say that 5 times fast!). It&#8217;s one of those great things that came out of the Danga group, the same people who created MogileFS and Memcached.
<p />
Call me stupid, but I like to be able to read things. QPID is in C++, and is so big, I don&#8217;t even know where to start. Java gives me the shivers, and I don&#8217;t even know what Erlang looks like. But damn, who doesn&#8217;t like poring over well written C? That&#8217;s pretty much what the new C port of gearmand is.</p>
<p />
I&#8217;m especially fond of the ease with which one can write a persistence layer. I recently submitted code to make the tokyocabinet queue store better. It&#8217;s a simple B+Tree store that everybody&#8217;s going crazy about these days. It&#8217;s also written in really nice C.</p>
<p />
The built in ability for gearman clients/workers (producers/consumers) to have a 2 way conversation is especially appealing. It&#8217;s not like they can just freely pass messages back and forth. But clients can choose to wait for the job they submitted to complete. They can also check on the status of the job fairly easily. Workers can send back two integers (numerator and denominator), which is particularly useful for sending back a count of things done over the count of things to do.</p>
<p />
Combine all this cool stuff with the dead simple &#8216;gearman&#8217; command line client, and you have a happy Clint. I wrote a little PHP worker that just sits around collecting data sent to it by the other workers running. When it receives a &#8220;show_all_workers&#8221; message (function in gearman-ese), it just spits back a text report of what it knows. This can be triggered by just saying:</p>
<pre>
$ gearman -s -f show_all_workers

Known Workers: 5

dev3.adicio.com_Adicio_App_Reverse_Worker_29336 jobs=26508,restarts=0,memory_MB=1.47,lastcheckin=Thu, 21 Jan 2010 15:33:32 -0800
dev3.adicio.com_Adicio_App_Reverse_Worker_29333 jobs=19194,restarts=0,memory_MB=1.47,lastcheckin=Thu, 21 Jan 2010 15:33:32 -0800
dev3.adicio.com_Adicio_App_Reverse_Worker_29356 jobs=29208,restarts=0,memory_MB=1.47,lastcheckin=Thu, 21 Jan 2010 15:33:32 -0800
dev3.adicio.com_Adicio_App_Reverse_Worker_29370 jobs=27638,restarts=0,memory_MB=1.47,lastcheckin=Thu, 21 Jan 2010 15:33:32 -0800
dev3.adicio.com_Adicio_App_Reverse_Worker_29332 jobs=10636,restarts=0,memory_MB=1.47,lastcheckin=Thu, 21 Jan 2010 15:33:32 -0800

$
</pre>
<p>This is pretty damn cool. Now double the fun with <a href="https://launchpad.net/gearman-mysql-udf">MySQL UDF&#8217;s</a>, and you have a workable solution for queueing via MySQL trigger.<br />
<a href="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2010/01/paris-hilton-thats-hot.jpg"><img src="http://fewbar.com/wp-content/uploads/2010/01/paris-hilton-thats-hot.jpg" alt="" title="paris hilton thats hot" width="256" height="256" class="alignnone size-full wp-image-138" /></a></p>
<p />
So, I can&#8217;t help but give this one the nod for simplicity of design. There are no massive books written to explain what gearman does. Just a nice easy C library, and perhaps one of the most important things, a really useful PHP extension.
</li>
</ul>
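<p>To make the numerator/denominator status from the Gearman entry above concrete, here is a minimal Python sketch. It does not talk to gearmand at all. <code>JobStatus</code> and <code>reverse_worker</code> are hypothetical stand-ins that only show the shape of the idea: the worker reports things done over things to do, and whoever is watching turns that into a percentage.</p>

```python
class JobStatus:
    """Hypothetical in-memory stand-in for the status a job server tracks per job."""

    def __init__(self):
        self.numerator = 0
        self.denominator = 0

    def update(self, numerator, denominator):
        # The worker reports "things done" over "things to do".
        self.numerator = numerator
        self.denominator = denominator

    def percent(self):
        # Guard against a job that has not reported a total yet.
        if self.denominator == 0:
            return 0.0
        return 100.0 * self.numerator / self.denominator


def reverse_worker(items, status):
    """Reverse each string, reporting progress after every item."""
    results = []
    for done, item in enumerate(items, start=1):
        results.append(item[::-1])
        status.update(done, len(items))
    return results


status = JobStatus()
out = reverse_worker(["abc", "queue", "gearman"], status)
print(out, status.numerator, status.denominator, status.percent())
```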
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2010/01/queue-voodoo-that-queues-do/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Bromine and Selenium &#8211; second and third most useful elements behind Oxygen</title>
		<link>http://fewbar.com/2009/11/bromine-and-selenium-tests-for-the-rests-of-u/</link>
		<comments>http://fewbar.com/2009/11/bromine-and-selenium-tests-for-the-rests-of-u/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 01:48:10 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Engineers]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[opensource]]></category>
		<category><![CDATA[selenium]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=125</guid>
		<description><![CDATA[If you&#8217;re an engineer, you hate testing. Seriously, who likes doing what those mere mortal &#8220;users&#8221; do? We&#8217;re POWER users and we don&#8217;t need to use all those silly features on all those sites. Just look at Craigslist, clearly an engineer&#8217;s dream tool. For web apps, testing actually isn&#8217;t *that* hard. The client program (the [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re an engineer, you hate testing. Seriously, who likes doing what those mere mortal &#8220;users&#8221; do? We&#8217;re POWER users and we don&#8217;t need to use all those silly features on all those sites. Just look at Craigslist, clearly an engineer&#8217;s dream tool.</p>
<p>For web apps, testing actually isn&#8217;t *that* hard. The client program (the browser) is readily available on every platform known to man, and web apps generally don&#8217;t do much more than store and retrieve data in clever ways. So, it&#8217;s not like we have to fire up a Large Hadron Collider to observe the effects of our web app.<span id="more-125"></span><a href="http://fewbar.com/wp-content/uploads/2009/11/periodictable.jpg"><img class="size-full wp-image-127 alignleft" title="periodictable" src="http://fewbar.com/wp-content/uploads/2009/11/periodictable.jpg" alt="periodictable" width="234" height="142" /></a></p>
<p>Therein lies the problem though, as clicking around on web forms and entering the same email address, password, address, phone number, etc. etc., 100 times, is BORING.</p>
<p>Enter <a href="http://seleniumhq.org/">Selenium</a>. This amazing little tool has been on the scene for a little while now, but it&#8217;s just now getting some momentum. Click through to the website and watch &#8220;the magic&#8221; as they put it, but basically here&#8217;s how it goes:</p>
<ul>
<li>open their Firefox plugin and click &#8216;record&#8217;</li>
<li>do something</li>
<li>click &#8216;record&#8217; again.</li>
</ul>
<p>Then just save this little test case to a file, and the next time you change anything that might relate to the series of clicks and data entries you just made, run this test again. There are all kinds of assertions you can make while you&#8217;re doing something. Like &#8216;Make sure the title is X&#8217; or &#8216;make sure a link to Y exists&#8217;.</p>
<p>But wait, I could have done that with something like Test::More,  PHPUnit, or lime. Where&#8217;s the real benefit?</p>
<p>Well, because Selenium remotely controls your browser, all those gotchas regarding JavaScript and CSS incompatibilities come into play here. Selenium can control Internet Explorer, Firefox, *and* Safari. In fact it can also control Opera and, according to their website, any browser that fully supports JavaScript.</p>
<p>This is really a nice evolutionary step for web shops, as tools like this generally are OS specific and cost a lot of money. Once again open source software appears where a need becomes somewhat ubiquitous.</p>
<p>You can even take it a step further. The next thing that generally happens in a web dev shop when they get bigger than 20 or 30 people is they hire people who actually <strong>like</strong> testing. Well not really, but they dislike it *less* than software engineers. These are QA engineers. And they <strong>DO</strong> like things to be orderly and efficient.</p>
<p><a href="http://seleniumhq.org/projects/bromine/">Bromine</a> is the answer for that. It&#8217;s still pretty rough around the edges, but it gets the job done.</p>
<p>Again check out their website and watch the screencast, but basically it goes like this:</p>
<ul>
<li>Write selenium tests as specified above</li>
<li>Upload tests to Bromine server</li>
<li>Attach tests to requirements</li>
<li>Run selenium remote control on all required OS/browser version combinations (can you say virtualbox?)</li>
<li>Run tests</li>
</ul>
<p>Another nice thing about using bromine is now you are running your tests in a server side language, not just the Selenium IDE, which is limited to the IDE&#8217;s generated &#8220;Selenese&#8221; XML commands for tests. The IDE exports your basic test into PHP or Java, and then on the bromine server you can do interesting things, like check an IMAP box for an email, run a backend process, or send an SMS.</p>
<p>At first it may not seem like much, but eventually you end up with a multitude of useful tests for your web app that can be run all the time against development branches before release, and catch many problems. Quality means happier users, which hopefully means loyal users that keep coming back.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/11/bromine-and-selenium-tests-for-the-rests-of-u/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TokyoOops</title>
		<link>http://fewbar.com/2009/10/tokyo-tyrant-ignores-memcache-protocol-flags/</link>
		<comments>http://fewbar.com/2009/10/tokyo-tyrant-ignores-memcache-protocol-flags/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 04:28:52 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Memcache]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[berkeleydb]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[memcachedb]]></category>
		<category><![CDATA[process]]></category>
		<category><![CDATA[RTFM]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[tokyotyrant]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=117</guid>
		<description><![CDATA[We had a fun time this week with TokyoTyrant. Recently it has become apparent that MemcacheDB has been all but abandoned. As fantastic as the early work was by Steve Chu, the project is in disrepair. That, coupled with the less than obvious failover for its replication combined to make us seek alternatives. Brian Aker [...]]]></description>
			<content:encoded><![CDATA[<p>We had a fun time this week with <a href="http://1978th.net/tokyotyrant/">TokyoTyrant</a>. Recently it has become apparent that <a href="http://www.memcachedb.org/">MemcacheDB</a> has been all but abandoned. As fantastic as the early work was by Steve Chu, the project is in disrepair. That, coupled with the <a href="http://fewbar.com/2009/03/memcachedb-fault-tolerance-procedures/">less than obvious failover for its replication</a> combined to make us seek alternatives.</p>
<p><a href="http://fewbar.com/wp-content/uploads/2009/10/virtual_stupidity.jpg"><img class="alignnone size-full wp-image-121" title="virtual_stupidity" src="http://fewbar.com.s3.amazonaws.com/wp-content/uploads/2009/10/virtual_stupidity.jpg" alt="virtual_stupidity" width="280" height="280" /></a></p>
<p><span id="more-117"></span><br />
<a href="http://krow.net">Brian Aker</a> had mentioned to me at one time that TokyoTyrant was way better than memcachedb and we should run it instead. I took notice and it turns out he&#8217;s right! It does basically the same thing, applying the memcache protocol to an on disk key/value store. However, the code is incredibly clean, well maintained, and runs extremely fast. There&#8217;s also a lot more flexibility, with the ability to choose between in-memory or on disk storage, hash tables, B+Tree&#8217;s, etc.</p>
<p>The availability of log based asynchronous master/master replication (somewhat similar to MySQL&#8217;s replication in concept) was probably one of the biggest wins, allowing much simpler failover (just move the IP, or DNS, or whatever) when compared to MemcacheDB&#8217;s adherence to BerkeleyDB&#8217;s replication setup, which is a single-master system implementing an election algorithm.</p>
<p>Somewhere during migration, we missed one tiny detail though. Sometimes, the devil is in the details. This is really the only evidence in <a href="http://1978th.net/tokyotyrant/spex.html#protocol">the documentation that tokyo tyrant has support for the memcache protocol</a>. It is very clear:</p>
<blockquote><p>Memcached Compatible Protocol</p>
<p>As for the memcached (ASCII) compatible protocol, the server implements the following commands; &#8220;set&#8221;, &#8220;add&#8221;, &#8220;replace&#8221;, &#8220;get&#8221;, &#8220;delete&#8221;, &#8220;incr&#8221;, &#8220;decr&#8221;, &#8220;stats&#8221;, &#8220;flush_all&#8221;, &#8220;version&#8221;, and &#8220;quit&#8221;. &#8220;noreply&#8221; options of update commands are also supported. However, &#8220;flags&#8221;, &#8220;exptime&#8221;, and &#8220;cas unique&#8221; parameters are ignored.</p></blockquote>
<p>Now, as I said, there&#8217;s nothing ambiguous about this. That would have helped, if anyone on my team had ever read it. We installed TokyoTyrant, pointed our basic test code at it, and it worked. This is really a process problem, not so much a technical one. The process must be to assume it won&#8217;t work, and test all the different use cases to make sure it works.</p>
<p>Now, why is that bit of the manual important? Well we use PHP. Specifically, we use the PECL &#8220;Memcache&#8221; module to access memcache protocol storage. Now, the Memcache module is mostly oriented toward caching in the memory based original memcached. It works great for memcachedb too, which simply ignores the exptime parameter. However, memcacheDB *does not* ignore &#8220;flags&#8221;.</p>
<p>And therein lies the problem. Users of the <a href="http://pecl.php.net/package/memcache">PECL Memcache module</a> may not know this, but the flags are *important*. There are two bits in that flags field that the Memcache module may set. Bit 0 is used to indicate whether or not the content has been serialized, and, therefore, on read, must be unserialized. Bit 1 is used to indicate whether or not the content has been gzipped.</p>
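<p>To see why a server that drops those bits breaks things, here is a minimal Python sketch of a flag-aware client. It is an illustration under assumptions, not the PECL module&#8217;s code: <code>json</code> stands in for PHP&#8217;s <code>serialize()</code>, <code>zlib</code> stands in for the module&#8217;s compression, and the <code>encode</code>/<code>decode</code> helpers are made up. Only the bit positions (bit 0 serialized, bit 1 compressed) come from the description above.</p>

```python
import json
import zlib

FLAG_SERIALIZED = 1  # bit 0: value was serialized before storing
FLAG_COMPRESSED = 2  # bit 1: value was compressed before storing


def encode(value, compress=False):
    """Return (payload, flags) roughly the way a flag-aware client would."""
    flags = 0
    if isinstance(value, bytes):
        payload = value                               # plain data is stored as-is
    else:
        payload = json.dumps(value).encode("utf-8")   # json stands in for serialize()
        flags |= FLAG_SERIALIZED
    if compress:
        payload = zlib.compress(payload)
        flags |= FLAG_COMPRESSED
    return payload, flags


def decode(payload, flags):
    """Undo encode(); only correct if the server hands back the stored flags."""
    if flags & FLAG_COMPRESSED:
        payload = zlib.decompress(payload)
    if flags & FLAG_SERIALIZED:
        return json.loads(payload.decode("utf-8"))
    return payload


payload, flags = encode({"user": 42}, compress=True)
print(flags)                           # 3: both bits set
print(decode(payload, flags))          # {'user': 42}
print(decode(payload, 0) == payload)   # True: with the flags lost, raw bytes come back
```

<p>The last line is the failure mode: when the flags come back as 0, the client has no way to know the payload needs decompressing or unserializing, so it hands the raw bytes to the application.</p>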
<p>So, while all of the strings that were stored in MemcacheDB and subsequently copied to TokyoTyrant worked great, the serialized objects, arrays, and gzipped values, were completely inoperative, as they were coming back to the code as strings and binary compressed data. The gzipped data was easy (turn off automatic gzip compression). The serialized data took some quick tap dancing to remedy, with code something like this:</p>
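<p><code lang="php"><br />
// Workaround: the server drops the flags, so guess at serialization on every read.<br />
class Memcache_BrokenFlags extends Memcache<br />
{<br />
public function get($key, &amp;$flags = null)<br />
{<br />
$v = parent::get($key, $flags);<br />
// @ silences the notice unserialize() raises for values that were never serialized<br />
$uv = @unserialize($v);<br />
// unserialize() returns false on failure, but also for a stored false ("b:0;")<br />
return ($uv === false &amp;&amp; $v !== 'b:0;') ? $v : $uv;<br />
}<br />
}<br />
</code></p>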
<p>Luckily our code all uses one Factory method to spawn all &#8220;MemcacheDB&#8221; connections, so it was easy to substitute this in.</p>
<p>Eventually we can just change the code by segregating into things that always serialize, and things that don&#8217;t, and just do the serialization ourselves. This should eventually allow us to use the new <a href="http://pecl.php.net/package/tokyo_tyrant">tokyo_tyrant module in PECL</a>, which only reliably stores scalars (I noticed recent versions have added a call to the internal PHP function convert_to_string().. this is, I think, a mistake, but one that still leaves it up to the programmer to explicitly serialize when serialization is desired).</p>
<p>This was a pretty big gotcha, and one that illustrates that even though sometimes us cowboy coders and sysadmins get annoyed when those pesky business people ask us for plans, schedules, expected impact, etc., and we keep assuring them we know what&#8217;s up, it&#8217;s still important to actually know what&#8217;s up, and make sure to RTFMC .. C as in, CAREFULLY.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/10/tokyo-tyrant-ignores-memcache-protocol-flags/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MemcacheDB fault tolerance procedures</title>
		<link>http://fewbar.com/2009/03/memcachedb-fault-tolerance-procedures/</link>
		<comments>http://fewbar.com/2009/03/memcachedb-fault-tolerance-procedures/#comments</comments>
		<pubDate>Wed, 25 Mar 2009 18:07:24 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[fault tolerance]]></category>
		<category><![CDATA[heartbeat]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[memcachedb]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[Scalability]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=46</guid>
		<description><![CDATA[It seemed so simple, just set up two memcachedb instances and point them at each other. Instant fault tolerance, right? If only it were so simple! It&#8217;s not entirely clear from the documentation how to set up memcachedb for fault tolerance. Here are the procedures I&#8217;ve found useful. Set up replication right. With all due respect to Steve Chu, [...]]]></description>
			<content:encoded><![CDATA[<p>It seemed so simple, just set up two memcachedb instances and point them at each other. Instant fault tolerance, right? If only it were so simple!</p>
<p>It&#8217;s not entirely clear from the documentation how to set up memcachedb for fault tolerance. Here are the procedures I&#8217;ve found useful.<br />
<span id="more-46"></span></p>
<ul>
<li><strong>Set up replication right</strong>. With all due respect to Steve Chu, the docs aren&#8217;t really clear on how to set up replication. It&#8217;s much simpler than it looks. Just run MemcacheDB as you would if it were standalone, but then add a combination of these 3 options:
<ul>
<li>You must have a -R line if you want to participate in replication. This is your hostname and port that listens for connections from other machines for replication. It is the same value that should be listed in every other machine&#8217;s -O.</li>
<li>A -O for *every* other machine that may want to replicate to/from this machine. I am sure there are situations where you won&#8217;t need these, but they make re-syncing and elections more predictable. You won&#8217;t be able to re-sync &#8220;live&#8221; after a failure without -O options.</li>
<li>-M/-S are not required. If you start n machines without -M or -S, but with appropriate -R and -O lines, they will arbitrarily elect a master. If you run them with -M and -S, then the -M box will just be pushy and always elect itself the master, and the -S boxes will, likewise, always try to defer to slave status.</li>
<li>Let&#8217;s say we want to listen for the memcache protocol on port 45000 on host &#8216;node1&#8217; and replicate to &#8216;node2&#8217;:</li>
<li>Standalone: <code>memcachedb -p 45000 -H /home/memdb/data -u memdb -N</code></li>
<li>Replication w/ elected master: <code>memcachedb -p 45000 -H /home/memdb/data -u memdb -N -R node1:46000 -O node2:46000</code></li>
<li>Replication Master: <code>memcachedb -p 45000 -H /home/memdb/data -u memdb -N -R node1:46000 -O node2:46000 -M</code></li>
</ul>
</li>
<li><strong>Only the current master can accept writes</strong>. You can see which machine is the master with the &#8216;stats rep&#8217; command. In v1.2.1 it&#8217;s shown as an environment id. Below, st_env_id and st_master are the same, so this is the master:<br />
<code><br />
stats rep<br />
STAT st_bulk_fills 0<br />
STAT st_bulk_overflows 0<br />
STAT st_bulk_records 11<br />
STAT st_bulk_transfers 3<br />
STAT st_client_rerequests 0<br />
STAT st_client_svc_miss 0<br />
STAT st_client_svc_req 0<br />
STAT st_dupmasters 0<br />
STAT st_egen 3<br />
STAT st_election_cur_winner 2147483647<br />
STAT st_election_gen 0<br />
STAT st_election_lsn 1/28<br />
STAT st_election_nsites 0<br />
STAT st_election_nvotes 1<br />
STAT st_election_priority 100<br />
STAT st_election_sec 5<br />
STAT st_election_status 0<br />
STAT st_election_tiebreaker 3676766282<br />
STAT st_election_usec 69747<br />
STAT st_election_votes 0<br />
STAT st_elections 1<br />
STAT st_elections_won 1<br />
STAT st_env_id 2147483647<br />
STAT st_env_priority 100<br />
STAT st_gen 2<br />
STAT st_log_duplicated 0<br />
STAT st_log_queued 0<br />
STAT st_log_queued_max 0<br />
STAT st_log_queued_total 0<br />
STAT st_log_records 0<br />
STAT st_log_requested 0<br />
STAT st_master 2147483647<br />
STAT st_master_changes 0<br />
STAT st_max_lease_sec 0<br />
STAT st_max_lease_usec 0<br />
STAT st_max_perm_lsn 0/0<br />
STAT st_msgs_badgen 0<br />
STAT st_msgs_processed 5<br />
STAT st_msgs_recover 0<br />
STAT st_msgs_send_failures 2<br />
STAT st_msgs_sent 10<br />
STAT st_newsites 0<br />
STAT st_next_lsn 1/8916<br />
STAT st_next_pg 0<br />
STAT st_nsites 2<br />
STAT st_nthrottles 0<br />
STAT st_outdated 0<br />
STAT st_pg_duplicated 0<br />
STAT st_pg_records 0<br />
STAT st_pg_requested 0<br />
STAT st_startsync_delayed 0<br />
STAT st_startup_complete 0<br />
STAT st_status 2<br />
STAT st_txns_applied 0<br />
STAT st_waiting_lsn 0/0<br />
STAT st_waiting_pg 0<br />
END<br />
</code><br />
However, it&#8217;s much simpler, I think, to just try to store a value on an instance. If you get &#8220;STORED&#8221; back, then this is the master. If you get &#8220;NOT_STORED&#8221; back, this is a slave. If it blocks (timeouts are hard in simple scripts, I know; see perldoc -f alarm), you are in a &#8220;DOWN&#8221; state. The danger here is split brain, where both nodes think they&#8217;re the master. But if they&#8217;re not talking, you have bigger problems!</li>
<li><strong>Out of sync slaves can&#8217;t READ either!</strong> This one bit us just the other day. Something occurred that left our slave unable to retrieve the latest log entries from the master. Because of this, it was reporting errors in replication. During this time, *all* commands blocked. We were relying on basic round-robin DNS for failover, thinking that memcachedb was simple enough that it was either &#8220;up&#8221; or &#8220;down&#8221;. Unfortunately, it was stalled on one box, so everything that hit that box blocked and timed out until we firewalled the port so connections wouldn&#8217;t succeed. We eventually had to stop the instance, copy a db_hotbackup from the master, then start it again. The slave still had to catch up from the point at which the db_hotbackup copy&#8217;s logs were checkpointed, which was (because we&#8217;re on v1.0.3) many hours before. While it was catching up, all commands (even stats commands, which is disappointing) blocked.
</li>
<li><strong>Use a load balancer, not round robin</strong>. With that said, a load balancer is a far better solution than round robin. In this case, because the box was &#8220;up&#8221; but failing to respond, we were at the mercy of the pecl memcache module&#8217;s definition of what was &#8220;up&#8221; or &#8220;down&#8221; for reads. A load balancer separates this logic out into monitors so the code can just connect to a virtual IP, or use some list of servers it is given.</li>
<li><strong>Even better.. just use a floating IP</strong>. MemcacheDB seems to scale to ridiculous levels with reads. Like, 400:1 read:write performance. Do you really need lots of slaves? Just having an IP that follows the master will give you fault tolerance. It&#8217;s easy to determine if a box is the master. You can even do a &#8216;rep_set_priority 500&#8217; to make sure a box stays the master as long as it has the IP. If you&#8217;re running on Linux, good old <a href="http://www.linux-ha.org/">Heartbeat</a> is perfect for this. If you need to scale past the write capabilities of one box, then partitioning by using a stable hash algorithm on the keys is a far better solution than master/slave replication, and is already built in to pretty much every memcache client.</li>
<li><strong>Be careful with db_archive/db_checkpoint</strong>. This is mostly regarding v1.0.3, as I don&#8217;t know the impact of these commands on v1.1 or 1.2. However, it would seem that even with a replication policy of &#8220;ACK_ONE&#8221;, it&#8217;s still possible to purge logs that the slave needs. This may or may not be true (something else could have gone wrong), but running db_checkpoint/db_archive too aggressively seems to have broken our replication. There&#8217;s no reason to purge logs too often, so be wary when doing so.</li>
</ul>
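<p>The two ways of finding the master described above can be sketched roughly as follows. This is a minimal Python sketch, not a tested monitor: the function names and the probe key are hypothetical, the probe really does write a one-byte value to the instance, and a refused or timed-out connection is simply treated as &#8220;down&#8221;.</p>

```python
import socket


def parse_stats(text):
    """Parse 'STAT name value' lines from a stats reply into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def is_master(stats):
    """The master is the node whose st_env_id equals st_master."""
    return "st_master" in stats and stats.get("st_env_id") == stats["st_master"]


def role_from_set_reply(reply):
    """Map the first reply line of a 'set' (None means it timed out) to a role."""
    if reply is None:
        return "down"
    reply = reply.strip()
    if reply == "STORED":
        return "master"
    if reply == "NOT_STORED":
        return "slave"
    return "unknown"


def probe(host, port, timeout=2.0):
    """Try to store a throwaway key and classify the node by the answer."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # memcache text protocol: set <key> <flags> <exptime> <bytes>
            sock.sendall(b"set __role_probe 0 0 1\r\n1\r\n")
            reply = sock.recv(64).decode("ascii", "replace")
    except OSError:  # covers timeouts and refused connections
        return "down"
    return role_from_set_reply(reply)
```

<p>Something like <code>probe("node1", 45000)</code> could then feed a load balancer monitor or a Heartbeat resource script.</p>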
<p>Hopefully this will help other users who are starting to set up MemcacheDB and need fault tolerance.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/03/memcachedb-fault-tolerance-procedures/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Facebook&#8217;s scribe makes a meal out of me, and comes back for more</title>
		<link>http://fewbar.com/2009/02/facebooks-scribe-makes-a-meal-out-of-me-and-comes-back-for-more/</link>
		<comments>http://fewbar.com/2009/02/facebooks-scribe-makes-a-meal-out-of-me-and-comes-back-for-more/#comments</comments>
		<pubDate>Mon, 09 Feb 2009 18:51:03 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[build]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[scribe]]></category>
		<category><![CDATA[thrift]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=40</guid>
		<description><![CDATA[So, I was working on getting Facebook&#8217;s seemingly amazing Scribe logging architecture set up to check it out. One of the requirements it has is &#8216;fb303&#8217;, which is included with Thrift in the contrib directory. I ran into this: [root@wolverine fb303]# ./configure --with-thriftpath=/usr/local --with-boost=/usr/local checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is [...]]]></description>
			<content:encoded><![CDATA[<p>So, I was working on getting Facebook&#8217;s seemingly amazing <a href="http://developers.facebook.com/scribe/">Scribe</a> logging architecture set up to check it out. One of the requirements it has is &#8216;fb303&#8217;, which is included with <a href="http://incubator.apache.org/thrift/">Thrift</a> in the contrib directory. I ran into this:<br />
<span id="more-40"></span><br />
<code><br />
[root@wolverine fb303]# ./configure --with-thriftpath=/usr/local --with-boost=/usr/local<br />
checking for a BSD-compatible install... /usr/bin/install -c<br />
checking whether build environment is sane... yes<br />
checking for gawk... gawk<br />
checking whether make sets $(MAKE)... yes<br />
checking for style of include used by make... GNU<br />
checking for gcc... gcc<br />
checking for C compiler default output file name... a.out<br />
checking whether the C compiler works... yes<br />
checking whether we are cross compiling... no<br />
checking for suffix of executables...<br />
checking for suffix of object files... o<br />
checking whether we are using the GNU C compiler... yes<br />
checking whether gcc accepts -g... yes<br />
checking for gcc option to accept ANSI C... none needed<br />
checking dependency style of gcc... gcc3<br />
checking for g++... g++<br />
checking whether we are using the GNU C++ compiler... yes<br />
checking whether g++ accepts -g... yes<br />
checking dependency style of g++... gcc3<br />
checking for ranlib... ranlib<br />
checking for bash... /bin/sh<br />
checking for perl... /usr/bin/perl<br />
checking for python... /usr/bin/python<br />
checking for ar... /usr/bin/ar<br />
checking for ant... no<br />
checking Checking EXTERNAL_PATH set to... /usr/local/src/scribetest/thrift/contrib/fb303<br />
checking whether to enable optimized build... yes<br />
checking whether to enable static mode... yes<br />
checking Checking thrift_home set to... /usr/local<br />
checking for boostlib >= 1.33.1... yes<br />
configure: creating ./config.status<br />
config.status: creating Makefile<br />
config.status: creating cpp/Makefile<br />
config.status: creating py/Makefile<br />
config.status: executing depfiles commands<br />
EXTERNAL_PATH /usr/local/src/scribetest/thrift/contrib/fb303<br />
[root@wolverine fb303]# make<br />
make  all-recursive<br />
make[1]: Entering directory `/usr/local/src/scribetest/thrift/contrib/fb303'<br />
Making all in .<br />
make[2]: Entering directory `/usr/local/src/scribetest/thrift/contrib/fb303'<br />
make[2]: Nothing to be done for `all-am'.<br />
make[2]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303'<br />
Making all in cpp<br />
make[2]: Entering directory `/usr/local/src/scribetest/thrift/contrib/fb303/cpp'<br />
make  all-am<br />
make[3]: Entering directory `/usr/local/src/scribetest/thrift/contrib/fb303/cpp'<br />
if g++ -DPACKAGE_NAME=\"libfb303\" -DPACKAGE_TARNAME=\"libfb303\" -DPACKAGE_VERSION=\"20080209\" -DPACKAGE_STRING=\"libfb303\ 20080209\" -DPACKAGE_BUGREPORT=\"\" -DHAVE_BOOST=  -I. -I.  -I.. -Igen-cpp -I/usr/local/include/thrift -I/usr/local/include/boost-1_37     -Wall -O3 -MT FacebookService.o -MD -MP -MF ".deps/FacebookService.Tpo" -c -o FacebookService.o `test -f 'gen-cpp/FacebookService.cpp' || echo './'`gen-cpp/FacebookService.cpp; \<br />
then mv -f ".deps/FacebookService.Tpo" ".deps/FacebookService.Po"; else rm -f ".deps/FacebookService.Tpo"; exit 1; fi<br />
In file included from gen-cpp/FacebookService.cpp:6:<br />
gen-cpp/FacebookService.h:28: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:28: error: `Service' has not been declared<br />
gen-cpp/FacebookService.h:28: error: ISO C++ forbids declaration of `_return' with no type<br />
gen-cpp/FacebookService.h:72: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:72: error: `Service' has not been declared<br />
gen-cpp/FacebookService.h:72: error: ISO C++ forbids declaration of `parameter' with no type<br />
gen-cpp/FacebookService.h:1077: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:1077: error: ISO C++ forbids declaration of `Service' with no type<br />
gen-cpp/FacebookService.h:1077: error: expected `;' before "success"<br />
gen-cpp/FacebookService.h: In member function `bool facebook::fb303::FacebookService_getLimitedReflection_result::operator==(const facebook::fb303::FacebookService_getLimitedReflection_result&#038;) const':<br />
gen-cpp/FacebookService.h:1086: error: `success' was not declared in this scope<br />
gen-cpp/FacebookService.h:1086: error: 'const class facebook::fb303::FacebookService_getLimitedReflection_result' has no member named 'success'<br />
gen-cpp/FacebookService.h:1086: warning: unused variable 'success'<br />
gen-cpp/FacebookService.h: At global scope:<br />
gen-cpp/FacebookService.h:1107: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:1107: error: ISO C++ forbids declaration of `Service' with no type<br />
gen-cpp/FacebookService.h:1107: error: expected `;' before '*' token<br />
gen-cpp/FacebookService.h:1241: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:1241: error: `Service' has not been declared<br />
gen-cpp/FacebookService.h:1241: error: ISO C++ forbids declaration of `_return' with no type<br />
gen-cpp/FacebookService.h:1243: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:1243: error: `Service' has not been declared<br />
gen-cpp/FacebookService.h:1243: error: ISO C++ forbids declaration of `_return' with no type<br />
gen-cpp/FacebookService.h:1434: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.h:1434: error: `Service' has not been declared<br />
gen-cpp/FacebookService.h:1434: error: ISO C++ forbids declaration of `_return' with no type<br />
gen-cpp/FacebookService.cpp: In member function `uint32_t facebook::fb303::FacebookService_getLimitedReflection_result::read(apache::thrift::protocol::TProtocol*)':<br />
gen-cpp/FacebookService.cpp:1795: error: 'class facebook::fb303::FacebookService_getLimitedReflection_result' has no member named 'success'<br />
gen-cpp/FacebookService.cpp: In member function `uint32_t facebook::fb303::FacebookService_getLimitedReflection_result::write(apache::thrift::protocol::TProtocol*) const':<br />
gen-cpp/FacebookService.cpp:1821: error: 'const class facebook::fb303::FacebookService_getLimitedReflection_result' has no member named 'success'<br />
gen-cpp/FacebookService.cpp: In member function `uint32_t facebook::fb303::FacebookService_getLimitedReflection_presult::read(apache::thrift::protocol::TProtocol*)':<br />
gen-cpp/FacebookService.cpp:1851: error: 'class facebook::fb303::FacebookService_getLimitedReflection_presult' has no member named 'success'<br />
gen-cpp/FacebookService.cpp: At global scope:<br />
gen-cpp/FacebookService.cpp:2614: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.cpp:2614: error: variable or field `getLimitedReflection' declared void<br />
gen-cpp/FacebookService.cpp:2614: error: `int facebook::fb303::FacebookServiceClient::getLimitedReflection' is not a static member of `class facebook::fb303::FacebookServiceClient'<br />
gen-cpp/FacebookService.cpp:2614: error: `Service' was not declared in this scope<br />
gen-cpp/FacebookService.cpp:2614: error: `_return' was not declared in this scope<br />
gen-cpp/FacebookService.cpp:2615: error: expected `,' or `;' before '{' token<br />
gen-cpp/FacebookService.cpp:2633: error: `facebook::thrift' has not been declared<br />
gen-cpp/FacebookService.cpp:2633: error: variable or field `recv_getLimitedReflection' declared void<br />
gen-cpp/FacebookService.cpp:2633: error: `int facebook::fb303::FacebookServiceClient::recv_getLimitedReflection' is not a static member of `class facebook::fb303::FacebookServiceClient'<br />
gen-cpp/FacebookService.cpp:2633: error: `Service' was not declared in this scope<br />
gen-cpp/FacebookService.cpp:2633: error: `_return' was not declared in this scope<br />
gen-cpp/FacebookService.cpp:2634: error: expected `,' or `;' before '{' token<br />
gen-cpp/FacebookService.cpp: In member function `void facebook::fb303::FacebookServiceProcessor::process_getLimitedReflection(int32_t, apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*)':<br />
gen-cpp/FacebookService.cpp:3070: error: 'class facebook::fb303::FacebookService_getLimitedReflection_result' has no member named 'success'<br />
make[3]: *** [FacebookService.o] Error 1<br />
make[3]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303/cpp'<br />
make[2]: *** [all] Error 2<br />
make[2]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303/cpp'<br />
make[1]: *** [all-recursive] Error 1<br />
make[1]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303'<br />
make: *** [all] Error 2<br />
</code></p>
<p>I googled around but found nothing. I tried logging into #thrift on Freenode, and &#8220;SteveC_&#8221; informed me that he had the same problem, and pointed me at this <a href="https://issues.apache.org/jira/browse/THRIFT-292">thrift</a> bug report that explains it a little better. It would seem that Thrift now lives in the &#8216;apache&#8217; namespace rather than &#8216;facebook&#8217;. This makes perfect sense. However, fb303 still expects it to be &#8216;facebook&#8217;. So, on Steve&#8217;s advice to replace the word &#8216;facebook&#8217; with &#8216;apache&#8217; in the fb303 tree, I did this in fb303:</p>
<p><code>find . -type f | xargs perl -p -i -e 's/facebook/apache/g'</code></p>
<p>And the make continued a little further until I hit this:</p>
<p><code><br />
make[2]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303/cpp'<br />
Making all in py<br />
make[2]: Entering directory `/usr/local/src/scribetest/thrift/contrib/fb303/py'<br />
/usr/bin/python setup.py build<br />
running build<br />
running build_py<br />
error: package directory 'fb303' does not exist<br />
make[2]: *** [all-local] Error 1<br />
make[2]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303/py'<br />
make[1]: *** [all-recursive] Error 1<br />
make[1]: Leaving directory `/usr/local/src/scribetest/thrift/contrib/fb303'<br />
make: *** [all] Error 2<br />
</code></p>
<p>However, this error seemed to be irrelevant, as &#8216;make install&#8217; put fb303 where it was supposed to go.</p>
<p>Building scribe, the same &#8216;find&#8217; command from above was required. I ran it after running into the first &#8216;no such thing as facebook&#8217; error, and the make continued, and scribe seemed to work just fine.</p>
<p>Hopefully this will find its way into Google&#8217;s indexes and help somebody.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2009/02/facebooks-scribe-makes-a-meal-out-of-me-and-comes-back-for-more/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Memcached and Mogile Form MemcacheMegaZord!</title>
		<link>http://fewbar.com/2008/12/memcached-and-mogile-form-memcachemegazord/</link>
		<comments>http://fewbar.com/2008/12/memcached-and-mogile-form-memcachemegazord/#comments</comments>
		<pubDate>Sun, 14 Dec 2008 17:21:50 +0000</pubDate>
		<dc:creator>clint</dc:creator>
				<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[memcachedb]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[sessions]]></category>

		<guid isPermaLink="false">http://fewbar.com/?p=27</guid>
		<description><![CDATA[So I was starting to play with Memcached for session storage, and I found a fairly big problem with just using memcached in its normal caching mode as a session store. It really just boils down to caching and storing of deterministic data being very different things that only look similar on the surface. So normally, [...]]]></description>
			<content:encoded><![CDATA[<p>So I was starting to play with Memcached for session storage, and I found a fairly big problem with just using memcached in its normal caching mode as a session store. It really just boils down to caching and storing of deterministic data being very different things that only look similar on the surface.<br />
<span id="more-27"></span><br />
So normally, memcached is used in a very clever way by adding a list of servers, and then using a hashing algorithm to pick a server to actually contact based on the key of a get/set request. This allows a ton of scaling out, with minimal moving parts. There&#8217;s no periodic monitor or broadcast protocol to add and remove cluster members to and from pools, so you can just run memcached on a bunch of servers, and use a consistent list across all of your machines to achieve a huge degree of scale out. When a server dies, the code just sees that, and moves on to the next one in the hash algorithm, and all is well.</p>
<p>For caching, this &#8220;failover&#8221; methodology works fine. If I go to set a value in memcached, and the client fails over to the second server, that&#8217;s OK. The next get to the primary will miss, the value will be set there properly, and the old entry on the secondary will *eventually* get pushed out of the cache.</p>
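<p>To make the server selection and failover concrete, here is a minimal Python sketch. It assumes plain modulo hashing over the server list, which is simpler than what real memcache clients do, and the function names and the <code>is_up</code> callback are hypothetical.</p>

```python
import hashlib


def server_order(key, servers):
    """Return the servers in the order a client would try them for this key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)
    return servers[start:] + servers[:start]


def pick_server(key, servers, is_up):
    """Use the first server in the key's order that responds; None if all are down."""
    for server in server_order(key, servers):
        if is_up(server):
            return server
    return None
```

<p>Every machine with the same server list computes the same order, which is why no monitor or broadcast protocol is needed. It is also why a flaky primary sends some requests for a key to the secondary.</p>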
<p>However, for storing data reliably, this becomes a problem. Let&#8217;s say there is a scenario where a network cable is bad on one of the memcached servers. 1 in 100 requests fails. With caching, failover will go a little nuts, but it&#8217;s entirely possible nobody will even notice, as results will be cached, data won&#8217;t get stale.. no big deal.</p>
<p>With storage though, this could happen..</p>
<p>- session is created on memcache1</p>
<p>- session tries to read from memcache1, and fails.. so a new session is created on memcache2</p>
<p>- session is then read from memcache1</p>
<p>- session is updated on memcache1 with new information</p>
<p>- session fails to read from memcache1, and old session data is read from memcache2, then the set succeeds on memcache1, and the new data is overwritten with the old.</p>
<p>The point isn&#8217;t really this scenario&#8217;s details, but that this hashing scheme is vulnerable to losing data that was written to it. It is even designed to. That is the caching paradigm.</p>
<p>As I discussed this with some colleagues, my mind immediately jumped to <a href="http://www.memcachedb.org">MemcacheDB</a>. Maybe that would work for session storage. It has replication, so we could use the traditional active/passive paradigm for it. However, this limits our scale to whatever a single instance of MemcacheDB can handle. Honestly that&#8217;s probably fine for most sites, as MemcacheDB can probably handle tens of thousands of small writes per second.</p>
<p>However, there are multiple problems. The biggest problem with MemcacheDB is there&#8217;s no easy way (yet, they&#8217;re working on it) to pull keys out of it to do garbage collection. Likewise, session data really doesn&#8217;t need to live for a long time. We just need to be reasonably certain that the data we&#8217;re getting is reasonably new.</p>
<p>If we store the data in *all* of the servers, and if we store a highly accurate (meaning if it takes you milliseconds to complete a request, this timestamp needs to be down to microseconds) timestamp of when the data was given to us (meaning we use the same timestamp for each server) alongside it, we can then just read it from all of the servers, and pick the newest one. Ew, that means we are still limited to the scale of one instance of memcached.</p>
<p>Then I had a flashback to the way <a href="http://danga.com/mogilefs/">MogileFS</a> works. It stores data on a number of replica servers. Of course, it also keeps track of where it stored them. But I figured, for sessions, that&#8217;s a lot of overhead. There&#8217;s an easier way. We can use the <a href="http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/">consistent hashing algorithm</a> that the PHP Memcache module uses to pick servers, and just read and write the data from nReplicas servers. If a server fails, we&#8217;ll move on to the next one, and there&#8217;s a reasonable degree of certainty that it will remain the same. If we write stale data to a server and then fail back to it later, we&#8217;re protected by the timestamp rules. The higher nReplicas, the higher the reliability that a server failure won&#8217;t cause issues. I even found <a href="http://paul.annesley.cc/articles/2008/04/30/flexihash-consistent-hashing-php">a PHP implementation of consistent hashing called FlexiHash</a>.</p>
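<p>Here is a minimal Python sketch of the nReplicas idea, under some loud assumptions: plain dicts stand in for memcached servers, <code>None</code> marks a server that is down, and a simple per-key hash ranking stands in for the ring-based consistent hashing that the PHP module or FlexiHash would provide. All names are hypothetical.</p>

```python
import hashlib
import time


def ranked_servers(key, names):
    """Order server names by a per-key hash so every client agrees on the order."""
    return sorted(
        names,
        key=lambda n: hashlib.md5((n + "/" + key).encode("utf-8")).hexdigest(),
    )


def live_replicas(stores, key, n_replicas):
    """First n_replicas servers in the key's order that are up (None means down)."""
    live = [n for n in ranked_servers(key, sorted(stores)) if stores[n] is not None]
    return live[:n_replicas]


def write_session(stores, key, data, n_replicas, now=None):
    """Write the same (timestamp, data) pair to every chosen replica."""
    stamp = time.time() if now is None else now
    for name in live_replicas(stores, key, n_replicas):
        stores[name][key] = (stamp, data)
    return stamp


def read_session(stores, key, n_replicas):
    """Read from every chosen replica and keep the copy with the newest timestamp."""
    copies = [stores[n].get(key) for n in live_replicas(stores, key, n_replicas)]
    copies = [c for c in copies if c is not None]
    if not copies:
        return None
    return max(copies, key=lambda c: c[0])[1]
```

<p>If a server misses a write while it is down and later comes back holding an older copy, the read still returns the newer copy from another replica, because the stale one loses the timestamp comparison.</p>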
<p>There&#8217;s one last issue that bugs me about using memcached for sessioning, and it is one that the timestamp helps us solve. We recently found that there was a problem where a request would take, say, 45 seconds to complete. At 20 seconds, the user would hit the back button out of frustration. This would put other stuff in the session, then the 45 second request would complete, and write the version of the session it thinks is right to the session store, losing the user&#8217;s new activity.</p>
<p>There are two ways to solve this. One is to introduce locking. This actually isn&#8217;t hard to do with Memcached; it is <a href="http://www.socialtext.net/memcached/index.cgi?faq#emulating_locking_with_the_add_command">described in the memcached faq</a>. However, this introduces something to block or fail on in the read. I think it&#8217;s simpler than that. You simply read the record again before you write it, and if it has changed since you read it the first time, you don&#8217;t write it. You just throw the session write away. Obviously the user has moved on, so there&#8217;s no reason to make your update.  If you used locking, the user would still be waiting on the old thread to finish.</p>
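<p>The read-before-write rule can be sketched like this in Python. A dict stands in for the session store, sessions are stored as (timestamp, data) pairs, and the function names are hypothetical. Note that the re-read and the write are two separate steps here, so a small race window remains; this is a sketch of the idea rather than an airtight implementation.</p>

```python
def load_session(store, key):
    """Return (stamp, data); stamp is None when no session exists yet."""
    return store.get(key, (None, {}))


def save_if_unchanged(store, key, seen_stamp, new_stamp, data):
    """Write only if the stored stamp still matches the one seen at request start."""
    current_stamp, _ = load_session(store, key)
    if current_stamp != seen_stamp:
        return False  # someone else wrote in the meantime: throw this write away
    store[key] = (new_stamp, data)
    return True
```

<p>In the 45 second example, the slow request remembers the stamp it loaded at the start. By the time it finishes, the back-button request has stored a newer stamp, so the slow request&#8217;s write is dropped.</p>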
<p>Of course, this all hinges on you caring that your session data is accurate, and that you care that users don&#8217;t lose their sessions when one server goes down. If neither of those apply to you, then you can just use sessions like cache.</p>
]]></content:encoded>
			<wfw:commentRss>http://fewbar.com/2008/12/memcached-and-mogile-form-memcachemegazord/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>


