The parameter LSF_AUTH in lsf.conf is set to eauth, which enables external authentication.
A default eauth executable is installed in the directory that is specified by the parameter LSF_SERVERDIR in lsf.conf.
The default executable provides an example of how the eauth protocol works. You should write your own eauth executable to meet the security requirements of your cluster.
Configuration file | Parameter and syntax | Default behavior |
---|---|---|
lsf.conf | LSF_AUTH=eauth | |
lsf.conf | LSF_AUTH_DAEMONS=1 | |
There are three independent features you can configure with Kerberos:
TGT forwarding
User eauth using krb5
Inter-daemon authentication using krb5
TGT forwarding is the most commonly used. All of these features need to dynamically load the krb5 libraries, whose location can be set with the LSB_KRB_LIB_PATH parameter. This parameter is optional. It tells LSF where krb5 is installed and defaults to /usr/local/lib if not set.
To enable TGT forwarding:
Register the user principal in the KDC server (if not already done).
Set LSB_KRB_TGT_FWD=Y|y in lsf.conf. This is mandatory. This parameter serves as an overall switch that turns TGT forwarding on or off.
Set LSB_KRB_CHECK_INTERVAL in lsf.conf. This is optional. This parameter controls the time interval for TGT checking. If not set, the default value of 15 minutes is used.
Set LSB_KRB_RENEW_MARGIN in lsf.conf. This is optional. This parameter controls how long before expiration the TGT is renewed. If not set, the default value of 1 hour is used.
Set LSB_KRB_TGT_DIR in lsf.conf. This is optional. It specifies where to store TGT on the execution host. If not set, it defaults to /tmp on the execution host.
Restart LSF.
Run kinit -r [sometime] -f to obtain a user TGT for forwarding.
Submit jobs as normal.
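The TGT forwarding settings above could be collected into an lsf.conf fragment like the following. This is a minimal sketch: the values simply restate the documented defaults, and it assumes that LSB_KRB_CHECK_INTERVAL and LSB_KRB_RENEW_MARGIN are expressed in minutes, which the text above does not state explicitly.

```shell
# lsf.conf fragment (illustrative; values mirror the defaults described above)
LSB_KRB_TGT_FWD=Y                 # mandatory: overall switch for TGT forwarding
LSB_KRB_LIB_PATH=/usr/local/lib   # optional: where the krb5 libraries are installed
LSB_KRB_CHECK_INTERVAL=15         # optional: TGT check interval (assumed unit: minutes)
LSB_KRB_RENEW_MARGIN=60           # optional: renewal margin (assumed unit: minutes)
LSB_KRB_TGT_DIR=/tmp              # optional: TGT location on the execution host
```

After a restart, a user would obtain a renewable, forwardable ticket before submitting, for example with `kinit -r 7d -f`, where the 7-day renewable lifetime is only an illustrative value.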
To enable user eauth using krb5:
Replace the eauth binary in $LSF_SERVERDIR with eauth.krb5, which resides in the same directory.
Set LSF_AUTH=eauth in lsf.conf (this is the default setting).
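The binary replacement step could be scripted roughly as follows. This is a sketch, not a supplied LSF tool: the function name `install_krb5_eauth` and the backup file name `eauth.orig` are hypothetical, and it assumes the caller has write access to the directory. Ownership and permission bits on the resulting eauth file should still be checked against the original afterwards.

```shell
# Hypothetical helper: replace the default eauth with the Kerberos-enabled one.
# $1 is the directory that LSF_SERVERDIR points to.
install_krb5_eauth() {
    serverdir=$1
    if [ ! -f "$serverdir/eauth.krb5" ]; then
        echo "eauth.krb5 not found in $serverdir" >&2
        return 1
    fi
    # Keep one copy of the original so the change can be reverted
    # (the backup name eauth.orig is an assumption, not an LSF convention).
    if [ -f "$serverdir/eauth" ] && [ ! -f "$serverdir/eauth.orig" ]; then
        cp -p "$serverdir/eauth" "$serverdir/eauth.orig" || return 1
    fi
    # Copy rather than move, so eauth.krb5 stays available in the same directory.
    cp -p "$serverdir/eauth.krb5" "$serverdir/eauth"
}
```

A typical call would be `install_krb5_eauth "$LSF_SERVERDIR"`, run once per installation.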
To enable inter-daemon authentication using krb5:
Replace the eauth binary in $LSF_SERVERDIR with eauth.krb5, which resides in the same directory.
Set LSF_AUTH=eauth in lsf.conf (this is the default setting).
Set LSF_AUTH_DAEMONS=1 in lsf.conf.
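Taken together, the lsf.conf side of inter-daemon authentication amounts to two lines. The fragment below is only a restatement of the steps above in file form.

```shell
# lsf.conf fragment for inter-daemon authentication with the krb5 eauth
LSF_AUTH=eauth        # external authentication (the default setting)
LSF_AUTH_DAEMONS=1    # also authenticate communication between daemons
```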
The first step is to configure the Kerberos server. Follow the procedure below to set up a Kerberos principal and key table entry items used by LSF mbatchd to communicate with user commands and other daemons:
Run kadmin: addprinc lsf/cluster1
Enter a password for the principal lsf/cluster1@COMPANY.COM:<enter password here>
Re-enter the password for the principal lsf/cluster1@COMPANY.COM:<re-type password>
The principal lsf/cluster1@COMPANY.COM is created.
Run the ktadd subcommand of kadmin on all master hosts to add a key for mbatchd to the local host keytab file:
kadmin: ktadd -k /etc/krb5.keytab lsf/cluster_name
Run kadmin: addprinc lsf/hostA.company.com
Enter a password for the principal lsf/hostA.company.com@COMPANY.COM:<enter password here>
Re-enter the password for the principal lsf/hostA.company.com@COMPANY.COM:<re-type password>
Run kadmin and use ktadd to add this key to the local keytab on each host. You must run kadmin as root. In this example, you create a local key table entry for hostA:
kadmin: ktadd -k /etc/krb5.keytab lsf/hostA.company.com
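For clusters with many hosts, it can help to generate the list of kadmin invocations up front. The sketch below only prints the commands and does not run them. It assumes an MIT Kerberos kadmin that accepts a query through `-q`, and the function name `print_lsf_kadmin_commands` is hypothetical. Each ktadd line still has to be run as root on the host whose keytab it updates, and kadmin will still prompt for passwords.

```shell
# Hypothetical helper: print (not execute) the kadmin commands for one cluster.
# $1 = cluster name, remaining arguments = fully qualified host names.
print_lsf_kadmin_commands() {
    cluster=$1
    shift
    keytab=/etc/krb5.keytab
    # Cluster principal used by mbatchd; the ktadd line is run on every master host.
    printf 'kadmin -q "addprinc lsf/%s"\n' "$cluster"
    printf 'kadmin -q "ktadd -k %s lsf/%s"\n' "$keytab" "$cluster"
    # One principal per host; the ktadd line is run on that host.
    for host in "$@"; do
        printf 'kadmin -q "addprinc lsf/%s"\n' "$host"
        printf 'kadmin -q "ktadd -k %s lsf/%s"\n' "$keytab" "$host"
    done
}
```

For example, `print_lsf_kadmin_commands cluster1 hostA.company.com` prints the four commands that correspond to the procedure above.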
To configure LSF to work in an AFS or NFSv4 environment (for example, to give LSF and the user's job access to an AFS filesystem):
Set LSB_KRB_TGT_FWD=Y in lsf.conf.
Set LSB_AFS_JOB_SUPPORT=Y in lsf.conf.
Optional: Set LSB_AFS_BIN_DIR to the directory that contains the aklog command. If not set, the system searches in /bin, /usr/bin, /usr/local/bin, /usr/afs/bin.
Rename $LSF_SERVERDIR/erenew.krb5 to $LSF_SERVERDIR/erenew or write an executable named erenew in $LSF_SERVERDIR with at least the following content:
#!/bin/sh
/path/to/aklog/command/aklog
Submit the job. For example, a user may submit a parallel job to run on two hosts:
bsub -m "host1 host2" -n 2 -R "span[ptile=1]" blaunch <user job commands...>
The end user should be able to use the system normally as long as they have a Kerberos credential before they submit a job.
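The lsf.conf part of the AFS procedure above could look like the fragment below. It is a sketch only: /usr/afs/bin is used for LSB_AFS_BIN_DIR because it is one of the directories in the documented search list, not because it is where aklog necessarily lives on a given system.

```shell
# lsf.conf fragment for AFS job support (illustrative)
LSB_KRB_TGT_FWD=Y              # TGT forwarding must be on
LSB_AFS_JOB_SUPPORT=Y          # treat jobs as running in an AFS environment
LSB_AFS_BIN_DIR=/usr/afs/bin   # optional: directory containing aklog (assumed location)
```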
Generally, the erenew interface functions as follows: If LSB_KRB_TGT_FWD=Y in lsf.conf and there is an executable named erenew in $LSF_SERVERDIR, then LSF will run this executable:
Once per host per job on dispatch
Once per host per job immediately after the Kerberos TGT is renewed
If the system is configured for AFS, the user's tasks will run in the same Process Authentication Group (PAG) in which this executable is run on each host. Users should ensure that their renew script does not create a new PAG, because every task process is automatically put into an individual PAG. A PAG is the group with which AFS associates security tokens.
When the parameter LSB_AFS_JOB_SUPPORT in lsf.conf is set to Y|y:
LSF assumes the user’s job is running in an AFS environment, and calls aklog -setpag to create a new PAG for the user’s job if it is a sequential job, or to create a separate PAG for each task res if the job is a blaunch job.
LSF runs the erenew script after the TGT is renewed. This script is primarily used to run aklog.
LSF assumes that JOB_SPOOL_DIR resides in the AFS volume. It kerberizes the child sbatchd to get the AFS token so the child sbatchd can access JOB_SPOOL_DIR.
A typical use case for an end user is to set LSB_AFS_JOB_SUPPORT=Y in lsf.conf and only call aklog in the erenew script. The user should not initiate a new PAG in the erenew script (such as calling aklog -setpag) in this case. If this parameter is changed, you must restart root res to make the change take effect.
If LSB_AFS_JOB_SUPPORT=Y, LSF needs the AFS aklog command to create a new PAG. Use the LSB_AFS_BIN_DIR parameter in lsf.conf to tell LSF the directory where aklog resides.
If LSB_AFS_BIN_DIR is not defined, LSF will search in the following order: /bin, /usr/bin, /usr/local/bin, /usr/afs/bin. The search stops as soon as an executable aklog is found.
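The lookup order described above can be mimicked with a small shell function, which is useful for checking which aklog a host would pick up. This is a sketch of the documented behavior, not LSF's actual implementation. The function name `find_aklog` is hypothetical, and it assumes that only the configured directory is consulted when LSB_AFS_BIN_DIR is set. Passing directories as arguments overrides the default list, which is there for testing.

```shell
# Hypothetical helper: print the first executable aklog found, following
# the search order documented above.
find_aklog() {
    if [ -n "${LSB_AFS_BIN_DIR:-}" ]; then
        # Assumption: a configured directory replaces the default search list.
        set -- "$LSB_AFS_BIN_DIR"
    elif [ "$#" -eq 0 ]; then
        # Documented default order.
        set -- /bin /usr/bin /usr/local/bin /usr/afs/bin
    fi
    for dir in "$@"; do
        # Stop at the first regular, executable file named aklog.
        if [ -f "$dir/aklog" ] && [ -x "$dir/aklog" ]; then
            printf '%s\n' "$dir/aklog"
            return 0
        fi
    done
    return 1
}
```

Calling `find_aklog` with no arguments prints the path that would be used, or returns a non-zero status if no aklog is found.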
To turn off the TGT renewal process in which the TGT file is distributed to each execution host, and instead keep the TGT on a shared file system where each process can read it, set LSB_KRB_TGT_DIR in lsf.conf to a directory on that shared file system.