Print only uncommented lines from a text file

May 9th, 2012

A common task that I perform is printing the lines of a text file (or script) that are neither comments nor blank.

A few good examples of where this would be useful would be the apache httpd.conf file (which has verbose comments!) and a hosts file where many entries are in use that might not be active.

egrep -v '^$|^\#' /etc/hosts

The above command uses egrep (grep with extended regular expressions) with the -v flag to exclude matching lines. The regular expression matches any line without content ‘^$’ and any line that starts with a # (pound sign or hash symbol) ‘^\#’.
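The pattern above only catches comments that start in the first column and lines that are completely empty. A slightly broader variant, sketched here on the assumption that indented comments and whitespace-only lines should be hidden too, wraps the same idea in a small function (the name uncommented is only illustrative):

```shell
# Print lines that are neither blank nor comments.
# Unlike '^$|^\#', this also skips indented comments and whitespace-only lines.
uncommented() {
    grep -Ev '^[[:space:]]*(#|$)' "$@"
}
```

It would be called the same way as the egrep command, for example: uncommented /etc/hosts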

Configure nginx to log the virtual host

March 28th, 2012

Nginx does not have the concept of a virtual host like Apache does; what Apache calls a virtual host, nginx calls a server.

To log the server_name with each request, which makes troubleshooting when multiple sites are hosted on the same instance much easier, use the following format:

log_format  main  '$remote_addr $server_name $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$request_body"' ;

Note the addition of $server_name and the removal of the ‘-’ in the first line.

The next step is to tell nginx to use this for the access_log (or any log) with something like the following:

access_log  logs/access.log  main;

Once that is complete, send a HUP signal to nginx and it will begin logging the virtual host/server name to the access log(s).
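With the server name in the log, per-site statistics become a one-liner. The following is a minimal sketch, assuming the log format shown above (so that $server_name is the second space-separated field); the function name requests_per_server is hypothetical:

```shell
# Count requests per server_name, assuming the log_format above, where
# $server_name is the second space-separated field of each line.
# Output: one "count name" pair per line, busiest server first.
requests_per_server() {
    awk '{ count[$2]++ } END { for (name in count) print count[name], name }' "$1" | sort -rn
}
```

For example: requests_per_server logs/access.log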

MySQL Load Data Infile – Epoch time to Timestamp

March 19th, 2012

I was recently working on a data load where I wanted to convert one column from an epoch time format into a timestamp column on import. I couldn’t figure out why the timestamp was being set to the current time on every test, no matter what I set the variables/values to.

I initially tried with this script:

for file in $(find ${FILESRC} -maxdepth 1 -type f -name "*csv")
do

mysql -u user --password=password << EOF
LOAD DATA INFILE "${file}"
  INTO TABLE mydb.mytable
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
  (field1,field2,@epochtime)
  set timestampfield = FROM_UNIXTIME( @epochtime )
;
EOF

#mv ${file} ${FILEDST}/

done

Note that this is loading into a table with at least 3 columns, field1, field2, and timestampfield.

The problem was that when I created the table, the default value of the timestamp field was set to CURRENT_TIMESTAMP, and this was overriding the SET operation.

To fix the issue, I changed the data type for the timestamp field to datetime which has a default of NULL and the import then worked.
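When testing an import like this, it helps to know what a given epoch value should look like as a timestamp, so that a few loaded rows can be spot-checked. A minimal sketch, assuming GNU date (for the -d @SECONDS form) and a hypothetical helper name epoch_to_timestamp. Note that this prints UTC, whereas FROM_UNIXTIME uses the MySQL session time zone:

```shell
# Convert an epoch value (seconds since 1970-01-01 UTC) to a
# 'YYYY-MM-DD HH:MM:SS' string in UTC. Assumes GNU date.
epoch_to_timestamp() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}
```

For example, epoch_to_timestamp 1331596800 prints 2012-03-13 00:00:00.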

Proxy HTTP Requests through Nginx to Jetty6 with X-Forwarded-For

March 14th, 2012

One important part of any proxy configuration is logging the correct originating IP address on the final application log to ensure proper analytics and problem determination. Note that at times, it’s very useful to log the proxy or load balancer IP at the application server to determine where an issue may be occurring but for the most part, the original IP address is desired in the application log.

This example uses:

  • Amazon Linux (as of 2012-03)
  • nginx-0.8.54-1.4.amzn1.x86_64
  • jetty6-6.1.14-1.jpp5.noarch from jpackage.org

Perform the following steps:

  1. Install and configure nginx to proxy all requests to localhost port 8080.

            location / {
                    proxy_pass   http://127.0.0.1:8080;
            }

  2. Install and configure jetty6, using all default options.
  3. Configure nginx to set the proxy header values for X-Forwarded-For.

            location / {
                    proxy_set_header X-Real-IP       $remote_addr;
                    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                    proxy_set_header Host            $host;
                    proxy_pass   http://127.0.0.1:8080;
            }

  4. Configure Jetty to log the X-Forwarded-For IP in /etc/jetty6/jetty.xml under the RequestLog section.

            ...
            <Set name="LogTimeZone">GMT</Set>
            <Set name="PreferProxiedForAddress">true</Set>
            </New>
            ...

  5. Once that is complete, restart both nginx and jetty to test.

            sudo /etc/init.d/nginx restart
            sudo /etc/init.d/jetty6 restart
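To make it clearer what ends up in the X-Forwarded-For header, here is a small sketch that emulates how nginx builds $proxy_add_x_forwarded_for: the client's address is appended to any X-Forwarded-For value already present on the incoming request, or used alone if there is none. The function name proxy_add_xff is hypothetical and this is only an illustration of the rule, not nginx code:

```shell
# Emulate how nginx builds $proxy_add_x_forwarded_for:
#   $1 = X-Forwarded-For header received from the client (may be empty)
#   $2 = $remote_addr, the address of the directly connected peer
proxy_add_xff() {
    if [ -n "$1" ]; then
        printf '%s, %s\n' "$1" "$2"
    else
        printf '%s\n' "$2"
    fi
}
```

So a request arriving directly from a client yields just that client's address, while a request that already passed through another proxy or load balancer yields a comma-separated chain.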
    

Rewrite HTTP requests to HTTPS using Nginx

March 8th, 2012

A common task is to rewrite HTTP requests to HTTPS to secure communication. Using nginx, this is easily done with:

        # use only https
        if ($scheme = http) {
                rewrite ^ https://$host$uri permanent;
        }

This should be placed in a server block.

Compiling USB to Serial Kernel Modules on the D2Plug

March 6th, 2012

I recently ordered a D2Plug from GlobalScale (makers of the SheevaPlug) and needed to be able to hook up a serial device via USB. The only problem was that the installed kernel did not have the proper modules available, namely pl2303 and ftdi_sio.

Although this unit was shipped running Ubuntu 10.04, performing software updates via aptitude did not work properly: the new kernel was not installed successfully and the kernel modules were not available. Rather than attempting to fix the package management problem, I worked around the issue by building the modules manually.

Pre-Requisites
* root access
* Functional network/internet access
* development tools installed
* curl installed (personal preference)
* no manual updating performed beyond the specific tools required to perform these tasks (i.e., a freshly reflashed plug)
* using kernel 2.6.32.9-dove-5.4.2 (not the new 3/2012 kernel)
* DO NOT worry about the toolchain/cross compiler; that is for compiling D2Plug binaries on your desktop or other machine.

To get these modules compiled and installed, I had to download the kernel source, extract it to /home/ubuntu/build/lsp (per the user guide), and then compile and install the modules.

Step 1: Download the kernel source

$ mkdir -p /home/ubuntu/build/lsp/
$ cd !$
$ curl -C - -L -O 'http://www.plugcomputer.org/405/us/d2plug/kernel/d2plug_lsp_src_v0p4.tar.bz2'

Step 2: Extract the kernel source

$ tar xjvf d2plug_lsp_src_v0p4.tar.bz2
$ cd d2plug-linux-2.6.32.y

Step 3: Seed the .config to conform to values used to compile the stock D2Plug kernel

$ make dove_d2plug_defconfig

Step 4: Modify the .config to enable usbserial support and FTDI/PL2303, or apply the patch located here.
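For the manual route, the options involved are most likely CONFIG_USB_SERIAL, CONFIG_USB_SERIAL_FTDI_SIO and CONFIG_USB_SERIAL_PL2303, each built as a module (=m). The sketch below is one way to set them with GNU sed. The helper name enable_module is hypothetical, and it is an assumption that these three symbols are all the stock config is missing:

```shell
# Set kernel config symbol $1 to "m" (build as module) in config file $2.
# Leaves the symbol alone if it is already enabled (=y or =m).
enable_module() {
    sym=$1
    cfg=$2
    if grep -q "^${sym}=[ym]" "$cfg"; then
        return 0
    elif grep -q "^# ${sym} is not set" "$cfg"; then
        sed -i "s/^# ${sym} is not set/${sym}=m/" "$cfg"
    else
        echo "${sym}=m" >> "$cfg"
    fi
}

# Intended use, from the top of the kernel source tree:
#   for sym in CONFIG_USB_SERIAL CONFIG_USB_SERIAL_FTDI_SIO CONFIG_USB_SERIAL_PL2303; do
#       enable_module "$sym" .config
#   done
```

After editing the .config by hand like this, running make oldconfig is the usual way to let the kernel build system resolve any dependent options before compiling.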

Step 5: Compile the kernel modules

$ make modules

Step 6: Install the kernel modules

$ sudo make modules_install

Step 7: Rebuild module dependency map file

$ sudo depmod

Step 8: Load kernel modules

$ sudo modprobe usbserial
$ sudo modprobe ftdi_sio
$ sudo modprobe pl2303

Once that process is complete, USB to Serial devices should work using either the FTDI or PL2303 drivers.

You can verify that the proper modules are loaded and in use by issuing the lsmod command (or cat /proc/modules):

ubuntu@D2Plug:~$ lsmod
Module                  Size  Used by
ftdi_sio               29308  0
pl2303                 13771  0
usbserial              24840  2 ftdi_sio,pl2303
hdmitx                119185  0
binfmt_misc             5897  1
bmm_drv                 5379  0
sg                     15417  0
galcore                64024  0
ubuntu@D2Plug:~$

Note that the usbserial module is loaded and is used by 2 modules, the pl2303 and ftdi_sio modules.
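The same check can be scripted. This is a minimal sketch with a hypothetical helper, module_loaded, which looks for a module name in the first column of /proc/modules (or of another file given as the second argument):

```shell
# Succeeds if module $1 appears in the module list file $2
# (defaults to /proc/modules, whose first column is the module name).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Intended use:
#   for m in usbserial ftdi_sio pl2303; do
#       module_loaded "$m" || echo "$m is not loaded"
#   done
```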

Have a question or comment about this post or topic? Send me an email; linux (AT) itsecureadmin (DOT) com.

Splitting Backups into 5GB Chunks

March 6th, 2012

A common problem with cloud storage of files is that many providers restrict the file size to make things more manageable. A good way to solve this problem is to use split to create multiple smaller files from a large archive.

There are two primary use cases. Note that these commands split into 5GB chunks (5000 MiB, to be precise) and number them starting with 00:

1) split on the fly, as the backup is being taken (this example takes a directory and creates an archive based on that directory):

 tar -cpzf - ${directory} | split -d -b 5000m - ${directory}.tgz.

2) split after the fact, reducing an existing archive

split -d -b 5000m source-file.gz destination-file-split.gz.

Once the archive has been split, it can be safely uploaded to your cloud storage of choice.

When the archive is retrieved and ready to be restored, concatenate the files back together and uncompress.

For a tar.gz archive:

cat *source-file*.tgz.* | tar xzvf -

For a gz archive:

cat *source-files*.gz.* | gzip -d > destination
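Before deleting the original archive, it is worth confirming that the chunks really do concatenate back into the same file. A minimal sketch of that round trip follows. The function name split_and_verify is hypothetical, and it assumes that nothing else in the directory matches the chunk prefix:

```shell
# Split file $1 into numbered chunks of size $2 under prefix $3, then confirm
# that concatenating the chunks reproduces the original byte for byte.
# Assumes no unrelated files match "$3"*.
split_and_verify() {
    split -d -b "$2" "$1" "$3" || return 1
    cat "$3"* | cmp -s - "$1"
}
```

For example: split_and_verify backup.tgz 5000m backup.tgz. && echo "chunks OK"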

Proxy Requests to Splunk with Apache 2.2

February 15th, 2012

When installing Splunk on a server with existing applications and Apache already set up and running, it’s easy to add support for Splunk via mod_proxy.

Although I believe it’s best to use virtual hosts to split out applications and set up proper DNS, in this example, I will be using the default virtual host (or none at all).

Step 1: Install Splunk

I won’t go into boring details of installing Splunk, but this post assumes the defaults on an RPM based distribution.

Step 2: Configure Splunk

In order to proxy, without a separate virtual host for Splunk, it’s best to set the root_endpoint in the Splunk web.conf to something Splunk-specific. Here is what I suggest:

root_endpoint = /splunk

Next, tell Splunk that it will be proxied, by setting the tools.proxy.on directive to True:

tools.proxy.on = True

Don’t forget to restart Splunk!

Step 3: Configure Apache to Proxy Requests

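Ensure that mod_proxy and mod_proxy_http (which handles http:// backends) are loaded in your Apache config:

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so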

Next, add this bit of config to proxy the requests:

ProxyPass /splunk http://localhost:8000/splunk
ProxyPassReverse /splunk http://localhost:8000/splunk

Don’t forget to restart Apache!

Once that is finished, you will be able to access Splunk via your server URL/splunk.

AWS Adds Object Expiration to S3!

January 13th, 2012

This is great news!

One of the headaches of managing any file/object store is pruning old data. Although that is something we’ve all dealt with for years on standard filesystems and storage devices, this feature makes working in the cloud easier. It’s applied by policy to a bucket (without versioning enabled).

Check it out in the S3 Developer Guide.

Enabling the binary log on a MySQL Replication Master

January 12th, 2012

A common task when working with MySQL is to enable binary logging, which will allow you to add read-only slaves (often a good idea even if you aren’t adding the replication slaves now).

According to the official MySQL documentation, there are only 3 steps required to enable binary logging:

  1. assign a unique server-id to the server
  2. assign a value to log-bin in the my.cnf file
  3. restart the MySQL daemon

Taking care of these first two steps is as simple as adding the following lines to the my.cnf under the [mysqld] section:

server_id = 10
log_bin = mysql-bin

It’s also a good idea to set up a limit on the binary log file size and on the number of days’ worth of logs to retain, to prevent disk space issues. Common values might be:

expire_logs_days = 2
max_binlog_size = 100M

Note that changes to the my.cnf take effect only after the MySQL daemon is restarted, although some settings, like expire_logs_days, can also be changed online at runtime. Any such runtime change must also be made in the my.cnf to persist across restarts of MySQL.

It is my policy that no changes be made to the my.cnf file unless a restart is possible at the same time; otherwise you may end up with invalid changes (typos, etc.) in your my.cnf, and MySQL may not come up the next time you need it.
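A quick sanity check before the restart can catch the most obvious mistakes, such as a setting placed under the wrong section. The sketch below uses a hypothetical helper, mysqld_option, which prints the value of one option from the [mysqld] section of a config file. It is only a rough reader of the file, not a replacement for letting MySQL parse it:

```shell
# Print the value of option $1 from the [mysqld] section of file $2.
# Treats '-' and '_' in option names as equivalent, as MySQL does.
mysqld_option() {
    awk -v want="$1" '
        BEGIN { gsub(/-/, "_", want) }
        # A section header: remember whether it is exactly [mysqld].
        /^[ \t]*\[/ { in_section = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        # A key = value line inside [mysqld] that is not a comment.
        in_section && /=/ && !/^[ \t]*[#;]/ {
            key = $0; sub(/=.*/, "", key); gsub(/[ \t]/, "", key); gsub(/-/, "_", key)
            val = $0; sub(/^[^=]*=[ \t]*/, "", val); sub(/[ \t]+$/, "", val)
            if (key == want) print val
        }' "$2"
}
```

For example, mysqld_option server_id /etc/my.cnf and mysqld_option log_bin /etc/my.cnf should each print the value that was just added. An empty result suggests that the line is missing or sits under a different section.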