I have been spending a lot of time lately going over every sysadmin detail I have glossed over in the past (not really something I entirely enjoy). For an upcoming project I'm setting up a fairly (I hope) scalable topology with 2 nginx servers, 1 Elastic Load Balancer, 1 php-fpm server, 1 or 2 RDS nodes, and 1 or 2 ElastiCache memcached nodes, running a basic WordPress site that I'm going to import a lot of posts and data into. I'll be using W3 Total Cache as well. All of this is inside a VPC. I'm going to load test it and try to get it to the point where 3000+ reqs/sec are no biggie. I also want it to be a real-world test, not just a "hello world" plain-text page that has no problem handling 10k r/s.
Are there any sysadmin heroes here?
Here are my misc questions, if you feel so inclined:
In your experience, what is the best way to sync static files between the nginx machines? I considered GlusterFS, but I have heard a lot of horror stories about it, and it seems to generate a high IO load. The obvious answer would be Capistrano or vanilla git, but for this project that's less than ideal for a few specific reasons. I also like rsync, but it is unidirectional, so as of right now I'm going to give Unison a shot. It seems to have stood the test of time.
How far do you scale your instances up (vertically) before scaling out (horizontally)? Is anything much bigger than a large or XL a waste, so that you should just add another server instead? What is a good way to tell when you should add another server rather than just increasing the hardware on the ones you have?
Right now, when you use the WordPress uploader, I'm pretty sure it puts the files on the php-fpm machine, at which point they need to sync back to the nginx servers. For 2 minutes or so there could be broken links until the sync runs. Would it be possible to have some kind of process 'watch' a folder and only run Unison when it changes? Or maybe even wait until 10 seconds after the last bit of activity? Would that take a large hit on reliability? Are there any other distributed filesystems that are very reliable?
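To make the 'watch a folder' idea concrete, here is a rough sketch of what I have in mind, not something I have run in production. It polls the folder with plain Python instead of using inotify, and only fires the sync once the folder has changed and then stayed quiet for 10 seconds. The upload path, the Unison profile name "uploads", and the helper names (snapshot, Debouncer, watch) are all made up for illustration.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every file under root to its (mtime in ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between walk and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


class Debouncer:
    """Fire once the tree has changed and then stayed quiet for `quiet` seconds."""

    def __init__(self, quiet):
        self.quiet = quiet
        self.last_state = None
        self.last_change = None  # time of the latest change not yet synced

    def update(self, state, now):
        """Feed the latest snapshot; return True when a sync should run."""
        if self.last_state is None:
            self.last_state = state  # first poll only records a baseline
            return False
        if state != self.last_state:
            self.last_state = state
            self.last_change = now  # activity seen, restart the quiet timer
            return False
        if self.last_change is not None and now - self.last_change >= self.quiet:
            self.last_change = None
            return True
        return False


def watch(root, command, quiet=10.0, poll=1.0):
    """Poll `root` forever and run `command` after each burst of changes."""
    debouncer = Debouncer(quiet)
    while True:
        if debouncer.update(snapshot(root), time.monotonic()):
            subprocess.call(command)
        time.sleep(poll)


# Hypothetical usage (path and Unison profile name are assumptions):
# watch("/var/www/wp-content/uploads", ["unison", "uploads", "-batch"])
```

Since Unison is bidirectional, a sync that pulls files into the watched folder would itself count as a change and trigger one more (no-op) run after the quiet period. Walking a big uploads tree every second is also not free, so an inotify-based watcher would probably be the better fit if this approach turns out to be sane at all.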
Do you have any recommendations for benchmarking? Perhaps even an EC2 AMI or something that I can load onto 5 machines and point at a target? I have used ApacheBench (ab) in the past but don't remember much about it. I've also used up my credits at http://blitz.io and friends long ago, heh.