Symptom | Probable cause | Solution |
---|---|---|
User receives an email notification that LSF has placed a job in the USUSP state. | The job cannot run because the Windows password for the job is not registered with LSF. | The user should |
LSF displays one of the following error messages: | The LSF daemon does not recognize host as part of the cluster. These messages can occur if you add host to the configuration files without reconfiguring all LSF daemons. | Run the following commands after adding a host to the cluster: If the problem still occurs, the host might have multiple addresses. Match all of the host addresses to the host name by either: |
 | RES assumes that a user has the same UNIX user name and user ID on all LSF hosts. These messages occur if this assumption is violated. | If the user is allowed to use LSF for interactive remote execution, make sure the user’s account has the same user ID and user name on all LSF hosts. |
 | The root user tried to execute or submit a job but LSF_ROOT_REX is not defined in lsf.conf. | To allow the root user to run jobs on a remote host, define LSF_ROOT_REX in lsf.conf. |
 | The user with user ID uid is not allowed to make RES control requests. By default, only the LSF administrator can make RES control requests. | To allow the root user to make RES control requests, define LSF_ROOT_REX in lsf.conf. |
 | mbatchd received a request from sbatchd on host host_name, but that host is not known to mbatchd. Either | To reconfigure mbatchd, run the command badmin reconfig. To shut down sbatchd on host_name, run the command badmin hshutdown host_name. |
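
The RES entry above depends on every user having the same user name and user ID on all LSF hosts. As a rough sketch of how that check could be automated, the following Python function takes account records and reports any mismatches. It is not part of LSF: the function name `find_account_mismatches`, the `(host, user_name, uid)` record layout, and the sample hosts are all hypothetical, and it assumes the records were gathered beforehand (for example, from each host's account database).

```python
from collections import defaultdict


def find_account_mismatches(records):
    """Find accounts that violate the same-name, same-UID assumption.

    records: iterable of (host, user_name, uid) tuples (hypothetical layout).
    Returns a pair of dicts:
      - user names that map to more than one UID, as {name: {uid: [hosts]}}
      - UIDs that map to more than one user name, as {uid: {name: [hosts]}}
    """
    hosts_by_name_uid = defaultdict(lambda: defaultdict(list))
    hosts_by_uid_name = defaultdict(lambda: defaultdict(list))

    for host, user_name, uid in records:
        hosts_by_name_uid[user_name][uid].append(host)
        hosts_by_uid_name[uid][user_name].append(host)

    # Keep only the entries where more than one distinct value was seen.
    name_conflicts = {
        name: dict(by_uid)
        for name, by_uid in hosts_by_name_uid.items()
        if len(by_uid) > 1
    }
    uid_conflicts = {
        uid: dict(by_name)
        for uid, by_name in hosts_by_uid_name.items()
        if len(by_name) > 1
    }
    return name_conflicts, uid_conflicts


# Sample data (made up for illustration): bob has a different UID on hostB.
records = [
    ("hostA", "alice", 1001),
    ("hostB", "alice", 1001),
    ("hostA", "bob", 1002),
    ("hostB", "bob", 2002),
]
name_conflicts, uid_conflicts = find_account_mismatches(records)
print(name_conflicts)  # {'bob': {1002: ['hostA'], 2002: ['hostB']}}
print(uid_conflicts)   # {}
```

Any user listed in the first result would need the account corrected on the reported hosts before using LSF for interactive remote execution.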