Archive for the ‘Open Source Software’ Category

AWS Instance – EBS Volume Delete on Termination

Monday, June 6th, 2011

When you create an Amazon Machine Image (AMI) from an instance with additional volumes attached, those volumes are typically not set to delete on termination. This means that after you spin through 5-10 instances based on this AMI, a large number of volumes will be left over that were not deleted when the instances were terminated. This may result in higher usage costs and makes images harder to manage, as there will be a large number of volumes to sort and filter through.

The following procedure describes how to resolve this issue by setting the deleteOnTermination flag on the AWS instance prior to creating an AMI so that all volumes will be cleaned up properly.

Note that the operations here should be performed on a running instance: either the instance from which the AMI will first be cut, or an instance spun up from the AMI in question. This instance will then be used to create the AMI that will be used for future operations.

Pre-requisites:

- Set up a command line environment that works with EC2/AWS without password or other authentication prompts. This requires installing the ec2-api-tools along with the related keys and environment variables.

Process/Procedure:

1. List volumes associated with instance:

ec2-describe-instance-attribute -v -b instanceid

BLOCKDEVICE     /dev/sda1       vol-XXXXXXXX    2011-06-06T14:59:20.000Z
BLOCKDEVICE     xvdg    vol-XXXXXXXX    2011-06-06T16:58:13.000Z
BLOCKDEVICE     xvdm    vol-XXXXXXXX    2011-06-06T16:59:07.000Z
BLOCKDEVICE     xvdl    vol-XXXXXXXX    2011-06-06T16:59:24.000Z
BLOCKDEVICE     xvdf    vol-XXXXXXXX    2011-06-06T16:59:39.000Z
REQUEST ID      XXXXXXXX-a12a-4df5-b944-XXXXXXXX

2. For each device above, modify the device attributes to set delete on termination:

ec2-modify-instance-attribute -b xvdg=vol-XXXXXXXX:true instanceid
ec2-modify-instance-attribute -b xvdm=vol-XXXXXXXX:true instanceid
ec2-modify-instance-attribute -b xvdl=vol-XXXXXXXX:true instanceid
ec2-modify-instance-attribute -b xvdf=vol-XXXXXXXX:true instanceid

Each time I ran this command I received the error below, even though the change itself was applied successfully. That is the reason for the verification step that follows.

Unexpected error:
java.lang.ClassCastException: com.amazon.aes.webservices.client.InstanceBlockDeviceMappingDescription cannot be cast to com.amazon.aes.webservices.client.InstanceBlockDeviceMappingResponseDescription
        at com.amazon.aes.webservices.client.cmd.Outputter.outputInstanceAttribute(Outputter.java:664)
        at com.amazon.aes.webservices.client.cmd.ModifyInstanceAttribute.invokeOnline(ModifyInstanceAttribute.java:149)
        at com.amazon.aes.webservices.client.cmd.BaseCmd.invoke(BaseCmd.java:795)
        at com.amazon.aes.webservices.client.cmd.ModifyInstanceAttribute.main(ModifyInstanceAttribute.java:269)

3. Verify that all volumes are set up correctly:

ec2-describe-instance-attribute -v -b instanceid  | egrep deleteOnTermination
            true
            true
            true
            true
            true
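
Typing one modify command per volume gets tedious on instances with many volumes. As a rough sketch (not part of the ec2-api-tools), the Python below turns the BLOCKDEVICE lines printed in step 1 into the matching ec2-modify-instance-attribute commands. The function name, the default skip list (the root device, which the commands above also leave alone), and the i-XXXXXXXX instance id are assumptions made for illustration. It only builds command strings and never calls AWS.

```python
def build_modify_commands(describe_output, instance_id, skip_devices=("/dev/sda1",)):
    """Build one ec2-modify-instance-attribute command per BLOCKDEVICE line.

    describe_output is the text printed by ec2-describe-instance-attribute -b.
    Devices listed in skip_devices (the root device by default, an assumption)
    are left untouched.
    """
    commands = []
    for line in describe_output.splitlines():
        fields = line.split()
        # Expected layout: BLOCKDEVICE <device> <volume-id> <attach-time>
        if len(fields) < 3 or fields[0] != "BLOCKDEVICE":
            continue
        device, volume = fields[1], fields[2]
        if device in skip_devices:
            continue
        commands.append(
            "ec2-modify-instance-attribute -b {0}={1}:true {2}".format(
                device, volume, instance_id
            )
        )
    return commands


if __name__ == "__main__":
    sample = """\
BLOCKDEVICE     /dev/sda1       vol-XXXXXXXX    2011-06-06T14:59:20.000Z
BLOCKDEVICE     xvdg    vol-XXXXXXXX    2011-06-06T16:58:13.000Z
REQUEST ID      XXXXXXXX-a12a-4df5-b944-XXXXXXXX
"""
    for command in build_modify_commands(sample, "i-XXXXXXXX"):
        print(command)
```

The printed commands can be reviewed and then pasted into the shell, which keeps a human in the loop until the output format has been checked against your version of the tools.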

Why would somebody be worried about this issue:

When working with continuous integration, the intent is to automate everything. Part of that automation in the work I have been doing lately is to create AMIs for each machine/server/instance function, which are then programmatically deployed and configured to remove any human intervention from the process.

Ideally this entire process would be based on Amazon official AMIs but that is further down the road. At this point I am creating and maintaining custom AMIs for each machine function.

Xen support in the Linux kernel?

Friday, June 3rd, 2011

Much of the virtualization work around Linux over the last couple of years has gone to KVM, which has made it appear as if Xen is dead or dying.

This is not true! The largest virtual infrastructure in the world uses Xen as a hypervisor – Amazon Web Services.

Now Xen is in the mainline code of the Linux kernel.

This is great and will bring more visibility to Xen.

Fedora 15 Review

Thursday, June 2nd, 2011

I upgraded to Fedora 15 the day it came out and have a few thoughts on the matter.

Preupgrade
The official Fedora documentation recommends using the “preupgrade” package to upgrade from Fedora 14. Don’t do it. It blew up and I had to spend 2 hours manually resolving the issues that it caused.

Gnome Shell
Gnome Shell has a lot of interesting features, but so far I have found it unfamiliar and hard to get used to, as I am now missing IMs and multiple workspaces, as well as Compiz-fusion. I don’t like the lack of customization that it seems to present to me thus far.

I do enjoy the Windows key – quick type to start things, although if you already have a shell open and try to open a new shell (or any app), it will not allow you to do so without typing extra characters. Don’t assume that the user is an idiot; let me open two windows without extra work.

Also, fallback mode is crippling. These Gnome Shell folks need to provide a better option for lesser hardware.

More Ranting
Overall, I am more disappointed with each release of Fedora. The changes being made appear to be moving away from the important features that made Linux distributions fun to work with in the beginning: lots of user choice, command line requirements, text based installs, etc. The future appears to be less user choice and less responsibility for the user, which is terrible.

Terrible features:
1. PackageKit
2. ConsoleKit
3. PolicyKit
4. The gap between rpm and yum – yum needs to embrace rpm and work together. Why aren’t there options to rollback and preserve packages? tsflags=repackage, repackage_all_erasures=true? The removal of these features really killed the Enterprise ready nature of the packaging system.

Change Encrypted Volume Password – Fedora / Linux

Tuesday, May 24th, 2011

I have been in the habit of encrypting the primary volume on every Fedora install that I’ve done in the past couple of years but have never changed the password until now (outside of rebuilds). I figured it was time to learn how to do this so that I could maintain a consistent password across systems.

The basic procedure is this:

1. Determine which volume is encrypted.

On my system, I installed with the default partition layout, which has a single hard disk with two partitions. The first partition is /dev/sda1 and is the /boot volume which is not encrypted. The second partition is /dev/sda2 and is the partition that is encrypted with dm-crypt.

> fdisk -l /dev/sda

Disk /dev/sda: 80.0 GB, 80000000000 bytes
255 heads, 63 sectors/track, 9726 cylinders, total 156250000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xb99fb99f

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048   156248063    77611008   83  Linux

2. Use cryptsetup to add a passphrase.

cryptsetup luksAddKey /dev/sda2

You will be prompted for a current pass-phrase (any existing one will do) and then twice for the new pass-phrase. There is a limit of 8 key slots that hold pass-phrases, but you can delete unused or old pass-phrases, for example with cryptsetup luksRemoveKey /dev/sda2.

Jenkins Slave on Windows – git clone failure

Monday, May 23rd, 2011

I recently set up a Jenkins job to run on a Windows slave, triggered by a successful build on a Linux master, and it failed with the following error:

Started by user anonymous
Building remotely on 10.13.0.11
Checkout:Deploy App Code / c:\jenkins-slave\workspace\Deploy App Code - hudson.remoting.Channel@1e5cd7a:10.13.0.11
Using strategy: Default
Checkout:Deploy App Code / c:\jenkins-slave\workspace\Deploy App Code - hudson.remoting.LocalChannel@132c800
Cloning the remote Git repository
Cloning repository origin
ERROR: Error cloning remote repo 'origin' : Error performing command: git --version
Cannot run program "git" (in directory "c:\jenkins-slave\workspace\Deploy App Code"): CreateProcess error=2, The system cannot find the file specified
ERROR: Cause: Cannot run program "git" (in directory "c:\jenkins-slave\workspace\Deploy App Code"): CreateProcess error=2, The system cannot find the file specified
Trying next repository
ERROR: Could not clone repository
FATAL: Could not clone
hudson.plugins.git.GitException: Could not clone
	at hudson.plugins.git.GitSCM$2.invoke(GitSCM.java:977)
	at hudson.plugins.git.GitSCM$2.invoke(GitSCM.java:908)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:1979)
	at hudson.remoting.UserRequest.perform(UserRequest.java:118)
	at hudson.remoting.UserRequest.perform(UserRequest.java:48)
	at hudson.remoting.Request$2.run(Request.java:270)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at hudson.remoting.Engine$1$1.run(Engine.java:60)
	at java.lang.Thread.run(Unknown Source)

The problem is that the msysgit installer adds a PATH entry pointing to the directory that holds the git.cmd batch file rather than to the bin directory of the msysgit package, where git.exe lives.

To resolve this, I updated the PATH environment variable to point to the bin directory containing git.exe rather than the directory containing the git.cmd script.

I’ve seen a lot of folks change the Jenkins config to point to git.cmd instead of git but this will not work in an environment where you have both Linux and Windows hosts.
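
Before changing anything, it helps to see which PATH entries on the slave actually hold git.exe and which hold only git.cmd. The following is a minimal Python sketch for that check. The function name is made up for this example, and it assumes Python is available on the Windows slave; it only inspects the file system and does not touch Jenkins or the PATH itself.

```python
import os


def find_git_candidates(path_value, sep=os.pathsep):
    """Return (directory, filename) pairs for git.exe / git.cmd found on a PATH string.

    Directories are reported in PATH order, so the first git.exe entry is the
    one a bare "git" lookup for an executable would be expected to hit.
    """
    found = []
    for directory in path_value.split(sep):
        if not directory:
            continue  # skip empty PATH entries
        for name in ("git.exe", "git.cmd"):
            if os.path.isfile(os.path.join(directory, name)):
                found.append((directory, name))
    return found


if __name__ == "__main__":
    for directory, name in find_git_candidates(os.environ.get("PATH", "")):
        print("{0}  ->  {1}".format(directory, name))
```

If the output lists only git.cmd entries, the PATH needs the msysgit bin directory added as described above.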

git log – good summary

Tuesday, April 19th, 2011

I have recently started using git and have struggled to find a good short summary of recent changes. This command does the trick:

git log --name-status HEAD^..HEAD
commit 1x3ga6g08f1c4324141f966d9766i86c6a790921
Author: Josh Miller
Date:   Tue Apr 19 07:19:15 2011 -0700

    removing this file to test post-receive delete operation, take 1

D       var/www/application/css/test07.css

Using --name-status with the HEAD^..HEAD range (from the parent of HEAD up to HEAD) shows the status of every file changed in the most recent commit.

PHP Fatal error: Uncaught exception ‘RequestCore_Exception’ with message ‘The stream size for the streaming upload cannot be determined.’

Thursday, April 14th, 2011

While attempting to upload some large files (>2GB) to S3 yesterday, I ran across this error (on both sdk 1.2.3 and 1.3.2):

php s3-upload.php
Uploading file:  myfile.bak
PHP Fatal error:  Uncaught exception 'RequestCore_Exception' with message 'The stream size for the streaming upload cannot be determined.' in /home/josh/aws/sdk-1.2.3/lib/requestcore/requestcore.class.php:771
Stack trace:
#0 /home/josh/aws/sdk-1.2.3/services/s3.class.php(722): RequestCore->prep_request()
#1 /home/josh/aws/sdk-1.2.3/services/s3.class.php(1342): AmazonS3->authenticate('db-backup-resto...', Array)
#2 /home/josh/aws/bin/s3-upload.php(73): AmazonS3->create_object('db-backup-resto...', 'myfile.bak', Array)
#3 {main}
  thrown in /home/josh/aws/sdk-1.2.3/lib/requestcore/requestcore.class.php on line 771

I started to troubleshoot the error, but as I had already successfully uploaded another smaller file, I suspected it was a problem with the size of the file I was uploading, so I searched the AWS forums first. I found that there is a problem with uploading files greater than 2GB from a 32 bit machine because fstat() and filesize() return the file size as a 32-bit signed integer.

The fix is to perform the upload from a 64 bit machine.
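
To see why 2GB is the breaking point, here is a small Python sketch of what happens when a byte count is squeezed into a signed 32-bit integer. It illustrates the arithmetic only; as_int32 is a helper made up for this example and is not part of PHP or the AWS SDK.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value


def as_int32(value):
    """Interpret the low 32 bits of value as a signed (two's complement) integer."""
    value &= 0xFFFFFFFF
    if value >= 0x80000000:
        return value - 0x100000000
    return value


if __name__ == "__main__":
    three_gb = 3 * 1024**3
    print(as_int32(INT32_MAX))      # 2147483647, still representable
    print(as_int32(INT32_MAX + 1))  # -2147483648, one byte past the limit wraps
    print(as_int32(three_gb))       # -1073741824, a 3GB file looks negative
```

A size that comes back negative (or otherwise nonsensical) cannot be used as the length of a streaming upload, which fits the "stream size ... cannot be determined" message above.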

Configuring Apache to Proxy Requests to Tomcat

Wednesday, April 13th, 2011

A very common task when administering apache and/or tomcat is to set up apache to proxy requests to tomcat. The primary driver for using this configuration is to have apache handle all of the front-end requests and some caching, with tomcat serving up dynamic content (which is proxied through apache). This also allows apache to handle much of the security, as it gets much more exposure to the internet at large than tomcat and has a great track record in this regard. You can also stand up multiple tomcat instances behind two or more apache instances, which lets you scale more effectively where it is needed.

Before we get started with configuration, first install apache and tomcat. This is typically done using the package manager of your distribution. Using yum, it would be as follows:

yum install httpd tomcat6

Next, set both daemons to persist (start on boot) using chkconfig or your distribution's method of choice and start each daemon:

chkconfig tomcat6 on
chkconfig httpd on
/etc/init.d/tomcat6 start
/etc/init.d/httpd start

Next, configure apache to load the appropriate modules needed to proxy requests to tomcat by modifying /etc/httpd/conf/httpd.conf (or appropriate configuration file for your distribution):

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

Next, configure the virtual host that you’ll be using to proxy requests to tomcat (be sure to replace the port and IP with entries suitable to your environment):

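<Proxy balancer://localhost>
  # 8009 is tomcat's default AJP connector port (8080 is its HTTP connector)
  BalancerMember ajp://127.0.0.1:8009 min=10 max=100 loadfactor=1
</Proxy>
# Send requests through the balancer defined above
ProxyPass / balancer://localhost/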

Once that configuration is complete, restart or reload apache for the change to take effect:

apachectl graceful

Note that this configuration relies upon tomcat and apache being on the same server. You can easily configure apache to proxy requests to tomcat on another server or VIP by replacing the localhost/127.0.0.1 occurrences above with the VIP, IP, or hostname of the tomcat instance(s).
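
When there are several tomcat instances, the balancer block can be generated instead of edited by hand. The Python below is a minimal sketch under a few assumptions: the balancer name and the backend addresses are hypothetical, port 8009 assumes tomcat's default AJP connector has not been changed, and the min/max/loadfactor values simply repeat the ones used above.

```python
def render_balancer(name, hosts, port=8009, minimum=10, maximum=100, loadfactor=1):
    """Render an Apache <Proxy balancer://...> block with one BalancerMember per host."""
    lines = ["<Proxy balancer://{0}>".format(name)]
    for host in hosts:
        lines.append(
            "  BalancerMember ajp://{0}:{1} min={2} max={3} loadfactor={4}".format(
                host, port, minimum, maximum, loadfactor
            )
        )
    lines.append("</Proxy>")
    # Route all requests through the balancer rather than a single backend.
    lines.append("ProxyPass / balancer://{0}/".format(name))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical backend addresses; replace with your own tomcat hosts.
    print(render_balancer("tomcatcluster", ["10.0.0.11", "10.0.0.12"]))
```

The output is plain text meant to be pasted into the virtual host configuration and reviewed before apache is reloaded.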

HTTP Caching in yum…

Friday, April 1st, 2011

yum is great to work with when it works and a pain in the ass when it does not. I recently had a problem where I would get the dreaded metadata does not match checksum warning while trying to update a CentOS 5.3 system I was working on.

filelists.xml.gz: [Errno -1] Metadata file does not match checksum

The problem here is that either repomd.xml, which lists each metadata file name and the SHA1 checksum that yum should calculate for that file, or the metadata file itself is not being updated properly due to some level of HTTP caching between the yum client and the repository server.
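
To check by hand whether a downloaded metadata file really matches the checksum listed for it, something like the following Python sketch can be used. It assumes you have already copied the expected SHA1 value out of repomd.xml yourself; the function names are made up for this example and nothing here parses repomd.xml or talks to yum.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_sha1):
    """Compare a file's SHA1 digest with an expected hex string (case-insensitive)."""
    return sha1_of_file(path) == expected_sha1.strip().lower()
```

Running matches_checksum against the same file fetched from two hosts is a quick way to tell whether a cache in between is serving a stale copy.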

Usually you can resolve this error through a metadata clean:

yum clean metadata

…or at the very least, clean all:

yum clean all

In this particular case, nothing seemed to work. I even removed the cache manually:

rm -rf /var/cache/yum/$OFFENDINGREPOSITORY

I also tried to copy the cache from another host which did not have the same problem ($PROBLEMHOST below stands for the affected machine):

$ another host> rsync -av /var/cache/yum/$OFFENDINGREPOSITORY/ $PROBLEMHOST:/var/cache/yum/$OFFENDINGREPOSITORY/

Something else was causing the problem — some caching server between me and the repository.

To resolve this issue, I relied on a tip from another site and disabled HTTP caching at the yum.conf level:

#/etc/yum.conf
...
http_caching=none

This immediately resolved the problem and yum worked again.

I do not recommend continuing with this flag set in the config as caching is highly useful and makes things work faster. At minimum, it would be best to cache packages with the following option:

http_caching=packages

After going through this experience, I found another option that might have worked better, ‘yum clean expire-cache’. Next time this happens, I’d like to try this option out to see if it solves the problem.

Using PHP to check SSHD availability

Thursday, March 17th, 2011

I’m working on a project where I need to spin up some EC2 instances and deploy code to them over SSH which requires that I know when SSH is available. To do this, I’ve written a little PHP snippet that will connect to the server via TCP port 22 and loop until the service is available:

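        $address   = "10.3.1.2" ;
        $port      = 22 ;
        $sleeptime = 1 ;

        // Use a fresh socket for every attempt: after a failed connect the
        // state of the socket is unspecified, so it should not be reused.
        $resource = socket_create( AF_INET, SOCK_STREAM, SOL_TCP ) ;
        while ( ! @socket_connect( $resource, $address, $port ) ) {
                print "not ready\n" ;
                socket_close( $resource ) ;
                sleep( $sleeptime )  ;
                $resource = socket_create( AF_INET, SOCK_STREAM, SOL_TCP ) ;
        }
        socket_close( $resource ) ;

        print "ready\n" ;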

Note the @ sign in front of the socket_connect function call to suppress PHP Warnings when it fails. This is because I expect it to fail 99% of the time.
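
The same wait loop can also be sketched in Python with two additions that are useful when an instance never comes up: a cap on the number of attempts and a connect timeout. This is an illustrative variant, not a translation of the snippet above; the function name and the default values are assumptions.

```python
import socket
import time


def wait_for_port(address, port, attempts=60, sleeptime=1.0, timeout=5.0):
    """Return True once a TCP connection to address:port succeeds, else False.

    Tries at most `attempts` times, sleeping `sleeptime` seconds after each
    failed attempt, so the caller is never blocked forever.
    """
    for _ in range(attempts):
        try:
            # create_connection opens (and the with-block closes) a new socket
            # on every attempt.
            with socket.create_connection((address, port), timeout=timeout):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait and try again.
            time.sleep(sleeptime)
    return False
```

A call such as wait_for_port("10.3.1.2", 22) returns True as soon as the port accepts a connection and False after the attempts run out, which lets the deployment script fail cleanly instead of looping indefinitely.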