This basic guide assumes a functional airflow deployment, albeit without authentication, or perhaps, with LDAP authentication under the legacy UI scheme. This guide also assumes apache airflow 1.10.2, installed via pip using MySQL and Redis. The guide also assumes Amazon Linux on an EC2 instance.
Pre-requisites:
- An Active Directory service account to use as the bind account.
First, modify airflow.cfg to remove the existing LDAP configuration, if it exists. This can be done by simply removing the values to the right of the equal sign under [ldap] in the airflow.cfg configuration file. Alternately, the [ldap] section can be removed.
Next, modify airflow.cfg to remove ‘authentication = True’, under the [webserver] section. Also, remove the authentication backend line, if it exists.
And finally, create a webserver_config.py file in the AIRFLOW_HOME directory (this is where airflow.cfg is also located). The contents should reflect the following:
import os
from airflow import configuration as conf
from flask_appbuilder.security.manager import AUTH_LDAP
basedir = os.path.abspath(os.path.dirname(__file__))
SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
CSRF_ENABLED = True
AUTH_TYPE = AUTH_LDAP
AUTH_ROLE_ADMIN = 'Admin'
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Admin"
# AUTH_USER_REGISTRATION_ROLE = "Viewer"
AUTH_LDAP_SERVER = 'ldaps://$ldap:636/
AUTH_LDAP_SEARCH = "DC=domain,DC=organization,DC=com"
AUTH_LDAP_BIND_USER = 'CN=bind-user,OU=serviceAccounts,DC=domain,DC=organization,DC=com'
AUTH_LDAP_BIND_PASSWORD = '**************'
AUTH_LDAP_UID_FIELD = 'sAMAccountName'
AUTH_LDAP_USE_TLS = False
AUTH_LDAP_ALLOW_SELF_SIGNED = False
AUTH_LDAP_TLS_CACERTFILE = '/etc/pki/ca-trust/source/anchors/$root_CA.crt'
Note that this requires a valid CA certificate in the location specified to verify the SSL certificate given by Active Directory so the $ldap variable must be a resolvable name which has a valid SSL certificate signed by $root_CA.crt. Also note that any user who logs in with this configuration in place will be an Admin (more to come on this).
Once this configuration is in place, it will likely be desirable to remove all existing users, using the following set of commands from the mysql CLI, logged into the airflow DB instance:
SET FOREIGN_KEY_CHECKS=0;
truncate table ab_user;
truncate table ab_user_role;
SET FOREIGN_KEY_CHECKS=1;
Next, restart the webserver process:
initctl stop airflow-webserver;sleep 300;initctl start airflow-webserver
Once the webserver comes up, login as the user intended to be the Admin. This will allow this user to manage other users later on.
After logging in as the Admin, modify the webserver_config.py to reflect the following change(s):
# AUTH_USER_REGISTRATION_ROLE = "Admin"
AUTH_USER_REGISTRATION_ROLE = "Viewer"
Now restart the webserver process once more:
initctl stop airflow-webserver;sleep 300;initctl start airflow-webserver
Once that is done, all new users will register as ‘Viewers’. This will give them limited permissions. The Admin user(s) can then assign proper permissions, based on company policies. Note that this does not allow random people to register — only users in AD can register.
I also like to modify the ‘Public’ role to add ‘can_index’ so that anonymous users can see the UI, although they do not see DAGs or other items.
Note that Apache airflow introduced RBAC with version 1.10 and dropped support for the legacy UI after version 1.10.2.
References:
– Airflow
– Updating Airflow
– Flask AppBuilder LDAP Authentication
– Flask AppBuilder Configuration
Leave a Reply