Proxy SQL Server Reporting Services with HAProxy

A common need with SQL Server Reporting Services is to proxy the server so it is not exposed on the internet. This is difficult to do with nginx, Apache, and others due to NTLM authentication (although nginx offers a paid version that supports NTLM). One easy fix is to use HAProxy in TCP mode.

A simple configuration like the following works well. Note that this configuration requires an SSL certificate and key (concatenated into one file) and terminates SSL at the HAProxy service.

global
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    stats socket /var/lib/haproxy/stats

    # utilize system-wide crypto-policies
    ssl-default-bind-ciphers PROFILE=SYSTEM
    ssl-default-server-ciphers PROFILE=SYSTEM

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend main
    bind *:80
    bind *:443 ssl crt /etc/haproxy/$path_to_cert_and_key_in_one_file
    option tcplog
    mode tcp
    default_backend             ssrs

backend ssrs
    mode tcp
    server  $ssrs_hostname $ssrs_ip_address:80 check
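One tip: before (re)starting the service, HAProxy can validate the configuration without applying it. Assuming the config lives at the usual default path:

# haproxy -c -f /etc/haproxy/haproxy.cfg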

SSH in a for loop is a solution…

I just read an article by Jay Valentine on LinkedIn where he talks about Puppet and how they were not profitable; he also noted that Chef is not, and has never been, profitable. That got me thinking: why are IT professionals investing in these technologies (time, knowledge, effort…)?

As an IT pro, it’s tempting to become a “fan boy”: someone who learns something difficult to use and then, because so much has been invested (time, effort, knowledge), benefits from evangelizing the tool or software to make it more relevant (and thus make the IT pro’s skills more valuable and relevant).

This happens to me all the time: Linux, cfengine, Puppet, Ruby, etc., with little regard for objective analysis of what would work best. I switched from cfengine to Puppet when I heard Red Hat had adopted Puppet. That was long ago, and they have since switched to Ansible, so it’s time to focus more on containers and, when necessary, Ansible. (Although I will continue to support my clients in whatever technology they desire, like any good consultant.)

While this is not a complete waste and is, most of the time, a very good thing, since it enables quick execution on projects with known skills and tools, it is not ideal in the long run. The reason is that all of these projects and tools become very complicated over time. Take Puppet or Chef: they require a significant amount of knowledge to deploy effectively. Even worse, they change rapidly. A system deployed one year could require a major rewrite (of the manifest/recipe) the following year if it were upgraded. Many deployments of these configuration management tools go for years without major updates because the effort involved in upgrading large numbers of services, servers, and configurations is incredible.

This is a huge amount of technical debt. I’d now venture to say that the more time you must spend deploying a configuration management solution, the more technical debt you will incur, unless you have a very focused plan to upgrade frequently and maintain a dedicated “puppet/chef/xxxx” IT pro.

I recall reading and/or hearing the famous Luke Kanies (of Puppet Labs) quote, “ssh in a for loop is not a solution”… This has always bothered me, and I couldn’t quantify the reason very well, but it’s similar to the basic text-processing argument in old-school Linux circles: text output is universal. Any app, tool, or utility can process text. Once you move to binary or other output, you lose the ability to universally process it. It may be more efficient to process in other ways, but it’s no longer universal.

“SSH in a for loop” is universal.
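For illustration, here’s the pattern at its simplest, a minimal sketch assuming a hosts.txt with one hostname per line and key-based SSH authentication already in place:

for host in $(cat hosts.txt); do
  echo "=== ${host} ==="
  # swap uptime for whatever ad-hoc command the task calls for
  ssh -o BatchMode=yes "${host}" 'uptime'
done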

Standalone puppet with hiera 5 error…

With puppet moving further and further away from supporting a standalone model, it’s somewhat difficult to keep puppet standalone working. I recently got bit by a hiera update that caused my standalone deployments to stop interacting with hiera the way I had deployed them.

Affected versions:

  • puppet 4.10.10
  • hiera 3.4.3

The error that I was receiving was similar to the following — note that this example cites an error with the ec2tagfacts module, which I have modified to work with puppet 4.*:

Error: Evaluation Error: Error while evaluating a Function Call, Lookup of key 'ec2tagfacts::aws_access_key_id' failed: DataBinding 'hiera': v5 hiera.yaml is only to be used inside an environment or a module and cannot be given to the global hiera at $path_to/puppet/manifests/site.pp:12:3 on node $this_node

The new way of managing hiera (via puppet server) is to contain hiera within each environment and module. This does not work with [the way I use] puppet standalone because of the way you have to reference the hiera configuration. I need to try putting puppet in the default locations at some point.
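For reference, my standalone runs point at a global hiera configuration explicitly, roughly like the following (paths are illustrative):

> puppet apply --hiera_config=/etc/puppet/hiera.yaml manifests/site.pp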

I was able to resolve the issue by downgrading hiera to version 3.1.1. I am testing with other versions. Updates to follow.
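Assuming hiera was installed as a gem, the downgrade itself is a quick sketch along these lines:

> gem uninstall hiera
> gem install hiera -v 3.1.1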

Puppet deprecation in stdlib module…

As part of the long upgrade to become fully compatible with puppet 4 and drop puppet 3 support, version 4.13+ of the stdlib module introduced some breaking changes for other modules that I use. I recently upgraded some individual modules using the ‘puppet module upgrade’ method.

Upon upgrading, I received the following message:

Error: Evaluation Error: Error while evaluating a Function Call, undefined method `function_deprecation' ...

The solution, for now, until the modules that I use are upgraded to work with the newer version of stdlib, is to downgrade to version 4.12 of puppetlabs-stdlib.

Before downgrading, check whether there are other modules that require a version greater than 4.12, e.g.:

> for file in $(find modules/ -type f -name metadata.json); do echo -n ${file}; echo -n ":  "; egrep -i stdlib ${file} || echo ""; done
modules//apt/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 4.5.0 < 5.0.0"}
modules//archive/metadata.json:        "name": "puppetlabs/stdlib",
modules//concat/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 4.2.0 < 5.0.0"}
modules//ec2tagfacts/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 3.2.0 < 5.0.0"},
modules//epel/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 3.0.0"}
modules//gnupg/metadata.json:  
modules//inifile/metadata.json:  
modules//java/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 2.4.0 < 5.0.0"}
modules//jenkins/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 4.6.0 < 5.0.0"},
modules//lvm/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">=4.1.0 < 5.0.0"}
modules//mysql/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 3.2.0 < 5.0.0"},
modules//python/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">= 4.6.0 < 6.0.0"},
modules//rvm/metadata.json:      {"name":"puppetlabs/stdlib","version_requirement":">=4.2.0"},
modules//staging/metadata.json:  
modules//vcsrepo/metadata.json:  
modules//zypprepo/metadata.json:  

The following modules require a version of puppetlabs-stdlib greater than 4.12:

  • apt – 4.5+
  • concat – 4.2.0+
  • jenkins – 4.6.0+
  • python – 4.6.0+
  • rvm – 4.2.0+

Starting with the apt module, whose latest release is version 4.1.0: according to the puppetlabs-apt change log page, they recommend staying on version 2.3.0 unless you’re ready for the latest puppet 4 changes (they actually say any version 2 release, but we’ll let that go after figuring it out…), so we need to downgrade that module first:

> puppet module upgrade puppetlabs-apt --version 2.3.0 --modulepath=modules/
Notice: Preparing to upgrade 'puppetlabs-apt' ...
Notice: Found 'puppetlabs-apt' (v2.4.0) in .../puppet/modules ...
Notice: Downloading from https://forgeapi.puppetlabs.com ...
Error: Could not upgrade module 'puppetlabs-apt' (v2.4.0 -> v2.3.0)
  Downgrading is not allowed.

OK, so you can’t use the upgrade command to downgrade modules; I need to be a little more heavy-handed. I removed the apt module manually:

> rm -rf modules/apt/

Next, install version 2.3.0 of puppetlabs-apt:

> puppet module install puppetlabs-apt --version 2.3.0 --modulepath=modules/
Notice: Preparing to install into .../puppet/modules ...
Notice: Downloading from https://forgeapi.puppetlabs.com ...
Notice: Installing -- do not interrupt ...
.../puppet/modules
└─┬ puppetlabs-apt (v2.3.0)
  └── puppetlabs-stdlib (v4.19.0)

Now that I’m deep into this problem, I see great value in using something like r10k or Code Manager (which uses r10k) to manage these modules and dependencies.

Another potentially useful tip – I removed the puppetlabs-stdlib module and ran a module list command, and it then told me which modules were dependent upon the missing module, and which versions:

> puppet module list --modulepath=modules/
Warning: Missing dependency 'puppetlabs-stdlib':
  'puppetlabs-apt' (v2.3.0) requires 'puppetlabs-stdlib' (>= 4.5.0 < 5.0.0)
  'puppet-archive' (v2.0.0) requires 'puppetlabs-stdlib' (>= 4.13.0 < 5.0.0)
  'puppetlabs-concat' (v2.2.0) requires 'puppetlabs-stdlib' (>= 4.2.0 < 5.0.0)
  'bryana-ec2tagfacts' (v0.2.0) requires 'puppetlabs-stdlib' (>= 3.2.0 < 5.0.0)
  'stahnma-epel' (v1.2.2) requires 'puppetlabs-stdlib' (>= 3.0.0)
  'puppetlabs-java' (v1.5.0) requires 'puppetlabs-stdlib' (>= 2.4.0 < 5.0.0)
  'rtyler-jenkins' (v1.7.0) requires 'puppetlabs-stdlib' (>= 4.6.0 < 5.0.0)
  'puppetlabs-lvm' (v0.7.0) requires 'puppetlabs-stdlib' (>=4.1.0 < 5.0.0)
  'puppetlabs-mysql' (v3.10.0) requires 'puppetlabs-stdlib' (>= 3.2.0 < 5.0.0)
  'stankevich-python' (v1.14.2) requires 'puppetlabs-stdlib' (>= 4.6.0 < 6.0.0)
  'maestrodev-rvm' (v1.13.1) requires 'puppetlabs-stdlib' (>=4.2.0)

Moving on to using r10k:

gem install r10k
...

I then created a Puppetfile using my modules and am using r10k to manage them.
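For anyone following along, a minimal Puppetfile sketch that pins a few of the modules from the dependency list above (versions shown are illustrative):

forge 'https://forgeapi.puppetlabs.com'

mod 'puppetlabs/stdlib', '4.12.0'
mod 'puppetlabs/apt',    '2.3.0'
mod 'puppetlabs/concat', '2.2.0'

Running ‘r10k puppetfile install --moduledir modules/’ then pulls everything into place.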

Note: converting to r10k took around 20 minutes. If you’re not using r10k (or Code Manager), it’s time to start.

Adding git branch and aws profile to your bash prompt…

As a consultant who works in AWS for numerous clients, one of the most important things to keep track of is which AWS CLI profile I’m currently using. To remove all doubt, I’ve recently added the AWS profile to my bash prompt. In addition, I’ve added the git branch that I’m currently on. I don’t know if I’ll keep both of these around, as they add some latency to the prompt, but so far they’ve been useful.

To do this, add the following to your ~/.bashrc:


    # print the current git branch, or "(none)" when outside a repository
    git_branch() {
      git branch > /dev/null 2>&1
      if [[ $? -gt 0 ]]
      then
        echo "(none)"
      else
        # the current branch is the line git marks with an asterisk
        git_branch=$(git branch | awk '/\*/ {print "("$2")"}')
        echo "${git_branch}"
      fi
    }

    # print the active AWS CLI profile, or "(none)" when no profile is set
    aws_profile() {
      aws_profile=$(aws configure list | egrep profile | awk '{print "("$2")"}')
      # with no profile configured, "aws configure list" reports "<not set>"
      if [[ "${aws_profile}" == "(<not)" ]]
      then
        echo "(none)"
      else
        echo "${aws_profile}"
      fi
    }
    
    export PS1='\n-$?- \u@\h \w >\ngit branch:   $(git_branch)\naws profile:  $(aws_profile)\n\n> '

Note that this will overwrite any PS1 that you might already have set up, so use caution. This results in a multi-line prompt that looks like the following:

-0- username@hostname ~/consulting >
git branch:   (master)
aws profile:  (test-dev-account)

> 

To get the new prompt, either log out and log back in, or source the .bashrc file, i.e.:

. ~/.bashrc

This prompt is very useful as it provides the return code of the last command (-0-), the user@hostname, the current location on disk, and then two lines with the git branch of the current repository, and then the aws profile that is currently active. The prompt then follows up with two newlines and an angle bracket for the cursor.

Latest Amazon EC2 AMI Supports Puppet 3.7.4

Good news! After quite some time without a supported puppet and ruby combination from the EC2 yum repositories, the latest AMI has support for puppet 3.7.4.

This will make deploying puppet environments easier, with no need to install the gem and the development packages required to compile it.
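On a fresh instance built from the latest AMI, installation should be a one-liner from the stock repositories:

> sudo yum install -y puppet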

Tuning EC2 Network Stack

I recently had an issue with web requests taking 1.2-1.5 seconds from a service hosted in AWS. I had a small SSD-backed EC2 instance and a small SSD-backed RDS instance running a WordPress site, and this kind of performance was not acceptable. After a bit of troubleshooting, I discovered that the network was suffering from major congestion.

To determine whether or not performance is suffering due to network congestion, I’d recommend first ruling out swapping, CPU usage, and disk IO, as those can all contribute to congestion-like symptoms. Once those items have been ruled out, issue the following command to review the current network state:

# netstat -s | egrep 'backlog|queue|retrans|fail'
    75 input ICMP message failed.
    0 ICMP messages failed
    145 failed connection attempts
    1986800 segments retransmited
    1773 packets pruned from receive queue because of socket buffer overrun
    77017125 packets directly queued to recvmsg prequeue.
    15266317022 packets directly received from backlog
    114432650212 packets directly received from prequeue
    155427 packets dropped from prequeue
    104055915 packets header predicted and directly queued to user
    9660 fast retransmits
    885 forward retransmits
    547 retransmits in slow start
    126 sack retransmits failed
    130424 packets collapsed in receive queue due to low socket buffer

The netstat -s command will return summary statistics [since last reboot] for each protocol. The above command searches for specific items related to the backlog, queue, retransmissions, and failures, which will give us a good summary of how healthy the stack is under the current load.

With the above output, we can see that there are a number of retransmitted segments, packets pruned from the receive queue due to socket buffer overrun, packets collapsed in receive queue due to low socket buffer, and others.

To fix this issue, I took a look at the current default settings for 3 parameters, the network device backlog and tcp read and write buffer sizes:

# sysctl -a | egrep 'max_backlog|tcp_rmem|tcp_wmem'
net.core.netdev_max_backlog = 1000
net.ipv4.tcp_rmem = 4096        87380   6291456
net.ipv4.tcp_wmem = 4096        20480   4194304

The default backlog size is pretty small at 1000 units. The read and write memory settings are also pretty small for a production server that handles large amounts of web traffic. The first value is the minimum memory allocated to each network socket, the second is the default amount of memory allocated to each socket, and the third is the maximum. The reason for these three numbers is to allow the OS to manage the buffers under various load conditions.

To adjust these values, I typically make an entry or modification to /etc/sysctl.conf and then issue the ‘sysctl -p’ command to read them into memory.

I made a few adjustments to these values to get to the values seen below. After each modification, I would test service performance to ensure the system was operating smoothly and without bottlenecks on CPU, memory, disk, or network. Keep in mind that larger values result in larger memory consumption, especially with more network connections, which requires a careful analysis of memory and swap usage after tuning and performance testing.

net.core.netdev_max_backlog = 10000
net.ipv4.tcp_rmem = 20480 174760 25165824
net.ipv4.tcp_wmem = 20480 174760 25165824
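With those entries appended to /etc/sysctl.conf, applying them is a single command, which echoes each value as it is loaded:

# sysctl -p
net.core.netdev_max_backlog = 10000
net.ipv4.tcp_rmem = 20480 174760 25165824
net.ipv4.tcp_wmem = 20480 174760 25165824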

These values produced a much quicker response from the web service, reduced page load times by over half a second per page, and cleaned up the failure, queue, and backlog errors in the network statistics.

I think that the default EC2 values are far too conservative and this might be due to the marketing around the larger instances being more IO performant. I’d highly recommend tuning the TCP stack to get more performance from smaller instances.

puppet search function deprecation

With the release of puppet 3.7, the search function is now deprecated and will be removed in 4.0. This is a feature I had used, on the recommendation of a puppet cookbook, when creating virtual resources and managing users; I have now removed it.

Using the search function basically added the namespace of an existing class to another class, allowing the second class access to the existing class’s resources (virtual resources in this scenario). E.g., using the search function:

# /etc/puppet/modules/user/manifests/virtual.pp
class user::virtual {
  @user { "cacti":
    ensure => present,
    uid    => 999,
  }
}

# /etc/puppet/modules/user/manifests/system.pp
class user::system {
  search user::virtual

  realize( User["cacti"])
}

# /etc/puppet/manifests/site.pp

class base {
  include user::system
}

In the above example I created a virtual user that I could then include anywhere and any number of times, and then realize that user where appropriate.

To fix the problem and stop using the search function, I simply included the user::virtual class everywhere that I included the user::system class, e.g.:

# /etc/puppet/modules/user/manifests/virtual.pp
class user::virtual {
  @user { "cacti":
    ensure => present,
    uid    => 999,
  }
}

# /etc/puppet/modules/user/manifests/system.pp
class user::system {
  realize( User["cacti"])
}

# /etc/puppet/manifests/site.pp

class base {
  include user::virtual
  include user::system
}

This resolved my issue. Let me know in the comments if you have other uses of search and how this change might impact you.

– josh

Redmine Issues with HTML Formatting

I recently had an issue with a client where we had deployed Redmine with an add-on plugin (CKEditor) that displayed all updates to issues as HTML. This resulted in all new issues and content being created with HTML tags, but existing/previous content was not, and it looked like a big glob on the page. To resolve this, I created a simple script that connects to the database and updates the notes column of the journals table, adding basic paragraph and line-break tags to format the output cleanly.

I’ll note that I did try the rake task that comes with CKEditor to convert all notes to Textile formatting, but it also reformatted all of the new content that was already correct, so I had to roll back (via database restore).

The script is a simple Ruby loop over the results of a SQL query like the following, which looks for notes without formatting that are longer than 10 characters:

@raw_journal_results = @raw_journal_client.query("
  SELECT
    id
    ,journalized_id
    ,notes
  FROM #{@db_schema}.journals
  WHERE
      notes NOT LIKE '<%' 
    AND 
      LENGTH(notes) > 10
  ").each do |row|
...

The body of the loop then updates the notes, replacing all newlines with HTML breaks and adding paragraph markup to the start and end of the notes.
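That update step looks roughly like the following, a sketch assuming the same mysql2-style client used in the query above (the exact markup is illustrative):

# wrap the note in a paragraph and convert newlines to HTML breaks
formatted = "<p>" + row['notes'].strip.gsub(/\r?\n/, "<br />") + "</p>"

@raw_journal_client.query("
  UPDATE #{@db_schema}.journals
  SET notes = '#{@raw_journal_client.escape(formatted)}'
  WHERE id = #{row['id']}
")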

I then set this to run every few minutes, as we also use the email fetch feature to pull in email updates, which are not formatted either. Once the initial backlog has been processed, each run takes under a minute, so it’s easy on the system and keeps new content formatted properly.

Puppet node inheritance deprecation

Puppet 4.0 will drop node inheritance, which is currently a common way to organize resources. I have been using node inheritance to group common configurations into a basic role and then inherit that in a node declaration like the following:

# site.pp
...
node webserver {
  # add all generic web server configuration here
}

node web01 inherits webserver {
  # any custom things that apply to only web01
}
...

This has worked pretty well for me but with a recent update to Puppet 3.7 on a few nodes, I’m getting the following deprecation warning:

Warning: Deprecation notice: Node inheritance is not supported in Puppet >= 4.0.0.
See http://links.puppetlabs.com/puppet-node-inheritance-deprecation

This quickly becomes annoying, as puppet runs frequently on the nodes I manage; I like to be able to deploy changes quickly without a kick or other manual intervention, and to recover quickly if an error occurs.

To resolve this issue for now, I have modified all generic nodes into classes and included those classes in the specific node declarations, similar to the following, which is the modified/updated version of what I included above:

# site.pp
...
class webserver {
  # add all generic web server configuration here
}

node web01 {

  # include the generic web server class
  include webserver

  # any custom things that apply to only web01

}
...

Now that works without issuing a deprecation warning, which has cleared my inbox quite a bit. The right answer moving forward will be to adhere to the official puppet recommendation: create modules from these classes and potentially make them parameterized to allow very specific configuration calls from the nodes.
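As a rough sketch of that direction (class and parameter names are illustrative), a parameterized class lets each node tune the generic configuration at declaration time:

# modules/webserver/manifests/init.pp
class webserver ($listen_port = 80) {
  # generic web server configuration, driven by parameters
}

# site.pp
node web01 {
  class { 'webserver':
    listen_port => 8080,
  }
}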