For LSF IBM Parallel Environment (IBM PE) integration. Specifies the network resource requirements to enable network-aware scheduling for IBM PE jobs.
If any network resource requirement is specified in the job, queue, or application profile, the job is treated as an IBM PE job. IBM PE jobs can run only on hosts where the IBM PE pnsd daemon is running.
The network resource requirement string network_res_req has the same syntax as the NETWORK_REQ parameter defined in lsb.applications or lsb.queues.
network_res_req has the following syntax:
[type=sn_all | sn_single] [:protocol=protocol_name[(protocol_number)][,protocol_name[(protocol_number)]]] [:mode=US | IP] [:usage=shared | dedicated] [:instance=positive_integer]
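As an illustration of the syntax above, the following is a minimal Python sketch that splits a network_res_req string into its keyword=value parts. The function name parse_network_req is hypothetical, and the tolerance for spaces around colons and the rejection of duplicate keywords are assumptions made for this sketch; LSF's own parser may behave differently.

```python
ALLOWED_KEYWORDS = {"type", "protocol", "mode", "usage", "instance"}


def parse_network_req(req: str) -> dict:
    """Split a network_res_req string into a {keyword: value} dict.

    Hypothetical helper: it only checks the keyword names listed in the
    syntax line, not the values themselves.
    """
    parsed = {}
    for part in req.split(":"):
        part = part.strip()  # assumption: surrounding spaces are ignored
        if not part:
            continue
        key, sep, value = part.partition("=")
        key = key.strip()
        if not sep or key not in ALLOWED_KEYWORDS:
            raise ValueError(f"unknown or malformed keyword: {part!r}")
        if key in parsed:  # assumption: a keyword may appear only once
            raise ValueError(f"duplicate keyword: {key}")
        parsed[key] = value.strip()
    return parsed


print(parse_network_req("type=sn_all:protocol=mpi(2),lapi:mode=US:usage=shared:instance=2"))
```

The protocol value is kept as a single comma-separated string here; splitting it into individual protocols is left to the caller.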
LSF_PE_NETWORK_NUM must be set to a non-zero value in lsf.conf for LSF to recognize the -network option. If LSF_PE_NETWORK_NUM is not defined or is set to 0, the job submission is rejected with a warning message.
The -network option overrides the value of NETWORK_REQ defined in lsb.applications or lsb.queues.
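The two rules above (the LSF_PE_NETWORK_NUM check and the job-level override) can be sketched in Python as follows. This is only a model of the documented behavior, not LSF code: the function name effective_network_req is hypothetical, an undefined LSF_PE_NETWORK_NUM is represented as 0, and the configured NETWORK_REQ is passed in as a single value because the relative precedence of lsb.applications and lsb.queues is not stated here.

```python
from typing import Optional


def effective_network_req(job_req: Optional[str],
                          configured_req: Optional[str],
                          lsf_pe_network_num: int) -> Optional[str]:
    """Return the network requirement that applies to a job.

    job_req            -- value given with -network, or None if not used
    configured_req     -- NETWORK_REQ from lsb.applications or lsb.queues, or None
    lsf_pe_network_num -- LSF_PE_NETWORK_NUM from lsf.conf (0 models "not defined")
    """
    if job_req is not None:
        if lsf_pe_network_num <= 0:
            # Documented behavior: the submission is rejected.
            raise ValueError("-network rejected: LSF_PE_NETWORK_NUM is undefined or 0")
        return job_req  # the job-level option overrides NETWORK_REQ
    return configured_req


print(effective_network_req("protocol=mpi", "protocol=lapi", 2))
```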
If mode is IP and type is specified as sn_all or sn_single, the job runs only on IB adapters (IPoIB). If mode is IP and type is not specified, the job runs only on Ethernet adapters (IPoEth). For IPoEth jobs, LSF ensures that the job runs on hosts where pnsd is installed and running. For IPoIB jobs, LSF ensures that the job runs on hosts where pnsd is installed and running, and that InfiniBand networks are up. Because IP jobs do not consume network windows, LSF does not check whether all network windows are used up or whether the network is already occupied by a dedicated IBM PE job.
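The IP-mode rules in this paragraph can be summarized as a small decision function. The Python sketch below is illustrative only; the function name ip_job_placement and the wording of the check strings are made up for this example.

```python
from typing import List, Optional, Tuple


def ip_job_placement(net_type: Optional[str] = None) -> Tuple[str, List[str]]:
    """For a mode=IP job, return (adapter class, host checks LSF performs)."""
    pnsd_check = "pnsd installed and running"
    if net_type is None:
        # No type given: Ethernet adapters only.
        return "IPoEth", [pnsd_check]
    if net_type in ("sn_all", "sn_single"):
        # type given: IB adapters only, and the IB networks must be up.
        return "IPoIB", [pnsd_check, "InfiniBand networks up"]
    raise ValueError(f"unsupported type: {net_type!r}")


print(ip_job_placement())
print(ip_job_placement("sn_all"))
```

Neither branch includes a network-window check, which mirrors the statement that IP jobs do not consume network windows.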
The type keyword is equivalent to the IBM PE MP_EUIDEVICE environment variable and the -euidevice IBM PE flag. See the Parallel Environment Runtime Edition for AIX: Operation and Use guide (SC23-6781-05) for more information. LSF supports only sn_all and sn_single; the other types supported by IBM PE are not supported for LSF jobs.
The default protocol is mpi.
LSF also supports an optional protocol_number (for example, mpi(2)), which specifies the number of contexts (endpoints) per parallel API instance. The number must be a power of 2 no greater than 128 (1, 2, 4, 8, 16, 32, 64, or 128). LSF passes the communication protocols to IBM PE unchanged and reserves network windows for each protocol.
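The constraint on protocol_number is easy to check before submission. The following Python sketch parses one protocol entry and validates the optional number; the helper name parse_protocol and the regular expression are assumptions for illustration, and the sketch does not check whether the protocol name itself is one that IBM PE supports.

```python
import re

# name, optionally followed by "(number)"
_PROTOCOL_RE = re.compile(r"^([A-Za-z_]\w*)(?:\((\d+)\))?$")


def parse_protocol(entry: str):
    """Return (protocol_name, protocol_number or None) for one protocol entry."""
    match = _PROTOCOL_RE.match(entry.strip())
    if match is None:
        raise ValueError(f"malformed protocol entry: {entry!r}")
    name, number = match.group(1), match.group(2)
    if number is None:
        return name, None  # protocol_number is optional
    n = int(number)
    # A power of 2 has exactly one bit set, so n & (n - 1) is 0.
    if n < 1 or n > 128 or n & (n - 1) != 0:
        raise ValueError(f"protocol_number must be a power of 2 up to 128, got {n}")
    return name, n


print(parse_protocol("mpi(2)"))
print(parse_protocol("lapi"))
```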
When you specify multiple parallel API protocols, you cannot make calls to both LAPI and PAMI (lapi, pami) or LAPI and OpenSHMEM (lapi, shmem) in the same application. Protocols can be specified in any order.
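A submission script could flag the two disallowed combinations before calling bsub. The Python sketch below is a hypothetical pre-submission check, not a description of what LSF or IBM PE does with such a request; the function name check_protocol_mix is made up for this example.

```python
# Combinations that cannot be used together in one application.
FORBIDDEN_PAIRS = [{"lapi", "pami"}, {"lapi", "shmem"}]


def check_protocol_mix(protocols):
    """Raise ValueError if the protocol list contains a disallowed pair.

    Entries may carry a protocol_number, for example "mpi(2)"; only the
    name before the parenthesis is compared. Order does not matter.
    """
    names = {entry.split("(")[0].strip() for entry in protocols}
    for pair in FORBIDDEN_PAIRS:
        if pair <= names:  # both members of the pair were requested
            raise ValueError("cannot combine " + " and ".join(sorted(pair)))
    return sorted(names)


print(check_protocol_mix(["mpi", "lapi(2)"]))
```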
See the MP_MSG_API and MP_ENDPOINTS environment variables and the -msg_api and -endpoints IBM PE flags in the Parallel Environment Runtime Edition for AIX: Operation and Use guide (SC23-6781-05) for more information about the communication protocols that are supported by IBM PE.
Each instance in US mode requested by a task running on switch adapters requires an adapter window. For example, if a task requests both the MPI and LAPI protocols such that both protocol instances require US mode, two adapter windows are used.
The default value of instance is 1.
If the specified value is greater than MAX_PROTOCOL_INSTANCES in lsb.params or lsb.queues, LSF rejects the job.
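The limit check described above amounts to a simple comparison. In the Python sketch below, the function name check_instance is hypothetical, and the value of MAX_PROTOCOL_INSTANCES is passed in by the caller rather than assumed, because its configured value depends on the cluster.

```python
def check_instance(instance: int, max_protocol_instances: int) -> int:
    """Validate an instance= value against MAX_PROTOCOL_INSTANCES."""
    if instance < 1:
        raise ValueError("instance must be a positive integer")
    if instance > max_protocol_instances:
        # Documented behavior: LSF rejects the job in this case.
        raise ValueError(
            f"instance={instance} exceeds MAX_PROTOCOL_INSTANCES={max_protocol_instances}"
        )
    return instance


print(check_instance(2, max_protocol_instances=4))
```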
See Administering IBM Platform LSF for more information about network-aware scheduling and running and managing workload through IBM Parallel Environment.
bsub -n2 -R "span[ptile=1]" -network "protocol=mpi,lapi: type=sn_all: instance=2: usage=shared" poe /home/user1/mpi_prog
For this job running on hostA and hostB, each task will reserve 8 windows (2*2*2), for 2 protocols, 2 instances and 2 networks. If enough network windows are available, other network jobs with usage=shared can run on hostA and hostB because networks used by this job are shared.
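The window count in this example is the product of the three factors. As a quick arithmetic check, here is a small Python sketch; the function name windows_per_task is hypothetical, and the value of 2 networks is simply what the example assumes for these hosts.

```python
def windows_per_task(protocols: int, instances: int, networks: int) -> int:
    """Windows reserved per task: one per protocol, per instance, per network."""
    return protocols * instances * networks


# mpi and lapi (2 protocols), instance=2, sn_all across 2 networks
print(windows_per_task(protocols=2, instances=2, networks=2))  # prints 8
```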