Apache Airflow 1.10.2– Active Directory Authentication (via LDAP[s])

This basic guide assumes a functional airflow deployment, albeit without authentication, or perhaps, with LDAP authentication under the legacy UI scheme. This guide also assumes apache airflow 1.10.2, installed via pip using MySQL and Redis. The guide also assumes Amazon Linux on an EC2 instance.

Pre-requisites:

    An Active Directory service account to use as the bind account.

First, modify airflow.cfg to remove the existing LDAP configuration, if it exists. This can be done by simply removing the values to the right of the equal sign under [ldap] in the airflow.cfg configuration file. Alternately, the [ldap] section can be removed.

Next, modify airflow.cfg to remove ‘authentication = True’, under the [webserver] section. Also, remove the authentication backend line, if it exists.

And finally, create a webserver_config.py file in the AIRFLOW_HOME directory (this is where airflow.cfg is also located). The contents should reflect the following:

import os
from airflow import configuration as conf
from flask_appbuilder.security.manager import AUTH_LDAP
basedir = os.path.abspath(os.path.dirname(__file__))

SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')

CSRF_ENABLED = True

AUTH_TYPE = AUTH_LDAP

AUTH_ROLE_ADMIN = 'Admin'
AUTH_USER_REGISTRATION = True

AUTH_USER_REGISTRATION_ROLE = "Admin"
# AUTH_USER_REGISTRATION_ROLE = "Viewer"

AUTH_LDAP_SERVER = 'ldaps://$ldap:636/
AUTH_LDAP_SEARCH = "DC=domain,DC=organization,DC=com"
AUTH_LDAP_BIND_USER = 'CN=bind-user,OU=serviceAccounts,DC=domain,DC=organization,DC=com'
AUTH_LDAP_BIND_PASSWORD = '**************'
AUTH_LDAP_UID_FIELD = 'sAMAccountName'
AUTH_LDAP_USE_TLS = False
AUTH_LDAP_ALLOW_SELF_SIGNED = False
AUTH_LDAP_TLS_CACERTFILE = '/etc/pki/ca-trust/source/anchors/$root_CA.crt'

Note that this requires a valid CA certificate in the location specified to verify the SSL certificate given by Active Directory so the $ldap variable must be a resolvable name which has a valid SSL certificate signed by $root_CA.crt. Also note that any user who logs in with this configuration in place will be an Admin (more to come on this).

Once this configuration is in place, it will likely be desirable to remove all existing users, using the following set of commands from the mysql CLI, logged into the airflow DB instance:

SET FOREIGN_KEY_CHECKS=0;
truncate table ab_user;
truncate table ab_user_role;
SET FOREIGN_KEY_CHECKS=1;

Next, restart the webserver process:

initctl stop airflow-webserver;sleep 300;initctl start airflow-webserver

Once the webserver comes up, login as the user intended to be the Admin. This will allow this user to manage other users later on.

After logging in as the Admin, modify the webserver_config.py to reflect the following change(s):

# AUTH_USER_REGISTRATION_ROLE = "Admin"
AUTH_USER_REGISTRATION_ROLE = "Viewer"

Now restart the webserver process once more:

initctl stop airflow-webserver;sleep 300;initctl start airflow-webserver

Once that is done, all new users will register as ‘Viewers’. This will give them limited permissions. The Admin user(s) can then assign proper permissions, based on company policies. Note that this does not allow random people to register — only users in AD can register.

I also like to modify the ‘Public’ role to add ‘can_index’ so that anonymous users can see the UI, although they do not see DAGs or other items.

Note that Apache airflow introduced RBAC with version 1.10 and dropped support for the legacy UI after version 1.10.2.

References:
Airflow
Updating Airflow
Flask AppBuilder LDAP Authentication
Flask AppBuilder Configuration

Rebooting: quick tip

Note to self: whenever rebooting a server, login via SSH and restart the OpenSSH daemon first to validate that it will come back up.

I just updated an AWS instance and rebooted it without doing this. Some new update in OpenSSH required that the AuthorizedKeysCommandUser be defined if AuthorizedKeysCommand is defined and the OpenSSH daemon will not start.

Luckily I can tell puppet to fix this and will be able to login in 30 minutes but that’s 30 minutes I’d prefer not to wait.

– josh

SSH Public Key Authentication via OpenLDAP on RHEL/CentOS 6.x

With the release of RHEL/CentOS 6.x there are some changes to the way clients authenticate using public keys over SSH with keys stored in OpenLDAP. I was able to get this working with the following modifications.

Pre-requisites:
* RHEL / CentOS 6.x
* openssh-ldap

Setup the sshd_config by setting up the AuthorizedKeysCommand. This will execute the ssh-ldap-wrapper and output the users public key:

AuthorizedKeysCommand /usr/libexec/openssh/ssh-ldap-wrapper

Next, ensure a proper ldap.conf in /etc/ssh — be sure to setup the appropriate level of TLS security to suite your environment:

ldap_version 3
bind_policy soft

binddn cn=readonly,ou=people,dc=example,dc=com
bindpw secret

ssl no
ssl start_tls
tls_reqcert never
tls_cacertdir /etc/openldap/cacerts

host 10.x.x.x
port 389
base dc=example,dc=com

If the LDAP server is setup with the proper schema and contains public keys, this configuration should work.

For more information on how to setup the schema and insert public keys, review the documents here but be sure to note that things have changed with client configuration.