Troubleshooting FAQ
General
The MATLAB module is not visible and cannot be loaded
Check your module environment by listing the currently loaded modules. MATLAB is provided via Spack as part of the LRZ software stack, so a Spack module must be loaded. If none is loaded, check the available Spack modules and load the newest one.
To see the list of supported MATLAB releases, you have to load a Spack module first, for example spack/release/21.1.1.
> module list                      # list loaded modules
> module avail spack               # show all available Spack modules
> module avail spack/release       # show available Spack modules (releases only)
> module switch spack/release/20.1 # load Spack (example)
> module avail matlab              # check for MATLAB
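The check for a loaded Spack module can also be scripted. The following is a minimal sketch, assuming that the module system exports the colon-separated list of loaded modules in the environment variable LOADEDMODULES (as Environment Modules and Lmod usually do). The function name spack_module_loaded is made up for this illustration.

```shell
# Sketch: detect a loaded Spack module without parsing "module list" output.
# Assumption: the module system exports LOADEDMODULES (colon-separated).

# Return 0 if any entry of LOADEDMODULES starts with "spack/".
spack_module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *:spack/*) return 0 ;;
        *)         return 1 ;;
    esac
}

if spack_module_loaded; then
    echo "Spack module loaded: 'module avail matlab' should list the MATLAB releases."
else
    echo "No Spack module loaded: run 'module avail spack/release' and load one first."
fi
```

If the sketch reports that no Spack module is loaded although "module list" shows one, the assumption about LOADEDMODULES does not hold on that system.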
Fall back to old MATLAB installation
You may also load an old MATLAB installation via an old Spack module. This might be helpful for particular use cases, but it is only a fallback solution: old software stacks are no longer supported. Some MATLAB features might not work, for example because the setup of the old MATLAB installation does not match the updated system configuration.
Starting MATLAB may fail with various errors. Common issues are licensing errors, such as:
License checkout failed.
License Manager Error -83
This version is newer than the version of the license file and/or network license manager on the server machine. Make sure the license manager has been updated to the latest release.

Troubleshoot this issue by visiting:
https://www.mathworks.com/support/lme/R2022a/83

Diagnostic Information:
Feature: MATLAB
- Please check MATLAB - Getting Started for relevant information on the releases.
- If no information on the problem is available, open an incident ticket at the LRZ Servicedesk and follow the path via category "Software problem".
Depending on your organization, the number of licenses for MATLAB and its toolboxes is limited. At certain times, all licenses for a toolbox may be in use. In that case MATLAB exits with an error, and there is no way to avoid this behaviour. You may use the tool "check-matlab-license" to check the availability of licenses.
How to use the tool?
After starting, the tool continuously checks whether the given license is available. It returns as soon as a free license is available. Until then, it blocks the processing of further commands of the job. Please note: the Slurm job is not paused while the tool waits. To avoid wasting valuable compute time, you may add a reasonable timeout, after which the tool aborts the check.
####################################################################################
# Check availability of Matlab (Toolbox) licenses
#
# Input arguments:
# (1) name of Matlab toolbox
# (2) timeout in seconds: after that timeframe the check will be aborted
#
# Return value from this script (exit codes):
# 0: Check ok -> license available! The tool returns immediately. The job may continue with Matlab.
# 1: License for requested toolbox generally not available!
# 2: License for requested toolbox generally available. But no free license available!
# 3: Not enough input arguments!
####################################################################################
#!/bin/bash
#SBATCH -o ./out/%x.%j.%N.out
#SBATCH -e ./out/%x.%j.%N.err
#SBATCH -D ./
#SBATCH -J matlab_threading_batch_job
#SBATCH --get-user-env
#SBATCH --export=NONE
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=14
#SBATCH --time=00:30:00

module load slurm_setup
module load <MATLAB MODULE> # EDIT HERE (see supported releases)

# Load license checker
module load matlab-tools/check-matlab-license

# Run license checker
# The tool will check, whether a free license for the Parallel Computing Toolbox is available.
# The tool will block the script execution for max. 900 seconds.
check-matlab-license Distrib_Computing_Toolbox 900

# Run MATLAB
matlab -nodisplay -r "my_script;"
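The job script above starts MATLAB regardless of the result of the license check. One possible refinement is to branch on the documented exit codes (0 to 3). The sketch below is an illustration only: the helper name describe_license_status is made up here and is not part of the LRZ tool, and the cluster-only commands are shown as comments.

```shell
# Sketch: map the documented exit codes of check-matlab-license to a message.
# Returns 0 only if the job may continue with MATLAB.
# Note: describe_license_status is a hypothetical helper, not part of the tool.
describe_license_status() {
    case "$1" in
        0) echo "license available"; return 0 ;;
        1) echo "license for requested toolbox generally not available"; return 1 ;;
        2) echo "no free license available"; return 1 ;;
        3) echo "not enough input arguments"; return 1 ;;
        *) echo "unexpected exit code: $1"; return 1 ;;
    esac
}

# Possible use inside the job script (only works on the cluster):
#   check-matlab-license Distrib_Computing_Toolbox 900
#   describe_license_status $? || exit 1
#   matlab -nodisplay -r "my_script;"

# Stand-alone demonstration with a simulated exit code:
describe_license_status 2 || echo "MATLAB would not be started."
```

Stopping the job early in this way releases the allocated node instead of letting MATLAB fail with a license error.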
MATLAB PCT jobs often have trouble using all CPU cores of a node, e.g. 28 cores on CoolMUC-2. A typical reason is a lack of available memory. Assuming 56 GB of memory on CoolMUC-2 and 28 tasks defined in the job script, each worker has at most 2 GB of memory (minus the memory overhead of the worker itself).
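The memory arithmetic can be turned around to estimate how many workers fit on a node. The sketch below assumes that you know roughly how much memory a single worker needs. It uses integer GB values, ignores the per-worker overhead mentioned above, and the function name max_workers is made up for this illustration.

```shell
# Sketch: largest number of workers such that each one gets the required memory,
# capped at the number of cores. Arguments: total GB, GB per worker, cores.
max_workers() {
    total_gb=$1
    per_worker_gb=$2
    cores=$3
    by_mem=$(( total_gb / per_worker_gb ))
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

max_workers 56 2 28   # prints 28: all cores can be used at 2 GB per worker
max_workers 56 4 28   # prints 14: memory limits the job to half of the cores
```

The result is only a starting point for choosing --tasks-per-node, since the real memory footprint of a worker depends on your application.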
As a workaround, you may try to reduce the number of workers with the following adjustments (20 tasks is only an example; please try meaningful values for your use case):
# This is not a full working example. Only relevant modifications are shown.
...
#SBATCH --nodes=1
#SBATCH --tasks-per-node=20
#SBATCH --cpus-per-task=1
...
matlab -nodisplay -singleCompThread -r "myMatlabFunction"
...
% This is not a full working example. Only relevant modifications are shown.
% Your code
...
% get number of workers from Slurm environment
nw = str2num(getenv('SLURM_NTASKS_PER_NODE'));

% disable multithreading
% (Probably, next line is deprecated. Please also use MATLAB's commandline argument "-singleCompThread".)
maxNumCompThreads(1);

% parallel pool initialization
% shut down existing parallel pool
if ~isempty(gcp('nocreate'))
    poolobj = gcp('nocreate');
    delete(poolobj);
end

% create a local cluster object
pc = parcluster('local');
% set number of workers
pc.NumWorkers = nw;
% set the JobStorageLocation to SCRATCH (default: HOME -> not recommended)
pc.JobStorageLocation = getenv('SCRATCH');
% start the parallel pool
poolobj = parpool(pc, nw);
...
% Your code
It might be necessary to reduce the number of workers significantly, for example to 50 percent of the available cores (14 workers on CoolMUC-2). However, this leaves a significant share of the resources unused and makes the computation inefficient. A hybrid solution might improve the job performance: use both workers and threads in your job, e.g. 14 workers with 2 threads per worker. Many MATLAB-intrinsic functions and libraries support multithreading, so your job might benefit from that. A few modifications of the basic workaround are necessary:
# This is not a full working example. Only relevant modifications are shown.
...
#SBATCH --nodes=1
#SBATCH --tasks-per-node=14
#SBATCH --cpus-per-task=2
...
matlab -nodisplay -r "myMatlabFunction"
...
% This is not a full working example. Only relevant modifications are shown.
% Your code
...
% get number of workers and threads from Slurm environment
nw = str2num(getenv('SLURM_NTASKS_PER_NODE'));
nthreads = str2num(getenv('SLURM_CPUS_PER_TASK'));

% set multithreading
% (Probably, next line is deprecated.)
maxNumCompThreads(nthreads);

% parallel pool initialization
% shut down existing parallel pool
if ~isempty(gcp('nocreate'))
    poolobj = gcp('nocreate');
    delete(poolobj);
end

% create a local cluster object
pc = parcluster('local');
% set number of workers
pc.NumWorkers = nw;
% set number of threads per worker
pc.NumThreads = nthreads;
% set the JobStorageLocation to SCRATCH (default: HOME -> not recommended)
pc.JobStorageLocation = getenv('SCRATCH');
% start the parallel pool
poolobj = parpool(pc, nw);
...
% Your code
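For the hybrid setup, the number of threads per worker follows from the core count and the chosen number of workers. The following sketch derives a matching pair of Slurm options. The function name hybrid_layout is made up for this illustration, and the integer division silently leaves cores unused if the worker count does not divide the core count.

```shell
# Sketch: derive Slurm options for a hybrid worker/thread layout on one node.
# Arguments: number of cores, number of workers.
hybrid_layout() {
    cores=$1
    workers=$2
    if [ "$workers" -le 0 ] || [ "$workers" -gt "$cores" ]; then
        echo "invalid worker count: $workers" >&2
        return 1
    fi
    threads=$(( cores / workers ))
    echo "--tasks-per-node=$workers --cpus-per-task=$threads"
}

hybrid_layout 28 14   # prints: --tasks-per-node=14 --cpus-per-task=2
```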
Problem applicability
Your job uses Intel-MPI modules and runs both MATLAB and OpenMP program(s).
Problem description
The performance of OpenMP programs may be affected by the environment variable KMP_AFFINITY. To obtain reasonable performance on the Linux Cluster, the Intel-MPI module sets the following default value:
setenv KMP_AFFINITY granularity=thread,compact,1,0
However, due to unknown issues between MATLAB and Intel MPI, this setting may decrease the performance of MATLAB PCT. Therefore, the MATLAB module overwrites KMP_AFFINITY:
setenv KMP_AFFINITY granularity=thread,none
Workaround
In complex workflows, we recommend resetting the environment variable in your Slurm job script as needed:
# This is not a full working example. Only relevant modifications are shown.
...
export KMP_AFFINITY=granularity=thread,compact,1,0
# now run my OpenMP program
...
export KMP_AFFINITY=granularity=thread,none
# now run MATLAB
...
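As an alternative to repeated export statements, the variable can be set for a single command only, so that no value leaks into later steps of the job. The following is a minimal sketch using the standard env utility. The wrapper name with_kmp_affinity is made up for this illustration, and the sh -c calls merely stand in for the real OpenMP program and MATLAB.

```shell
# Sketch: run one command with its own KMP_AFFINITY value.
# The calling shell's environment is left unchanged.
with_kmp_affinity() {
    affinity=$1
    shift
    env KMP_AFFINITY="$affinity" "$@"
}

# Stand-ins for the real programs, printing the value they would see:
with_kmp_affinity "granularity=thread,compact,1,0" sh -c 'echo "OpenMP step sees: $KMP_AFFINITY"'
with_kmp_affinity "granularity=thread,none" sh -c 'echo "MATLAB step sees: $KMP_AFFINITY"'
```

In a real job script, the sh -c call would be replaced by the actual command line, e.g. the matlab call from the examples above.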
SOSTOOLS is not part of the MATLAB release, and users do not have permission to modify software installations provided by LRZ. You may therefore install it in user space, e.g. in your HOME directory. This manual is based on the following sources:
- SOSTOOLS @ Mathworks File Exchange
- SOSTOOLS manual
- SOSTOOLS source @ GitHub
- Source of SeDuMi solver @ GitHub
Recommended instructions for installation on the Linux Cluster:
Step | Instruction | Comment |
---|---|---|
1 | Nothing to do | MATLAB requirements |
2 | Terminal (bash): cm2login1:~> cd $HOME cm2login1:~> mkdir software | Log in to the Linux Cluster and prepare the installation of third-party software. For third-party software, we create a directory called "software" in our HOME directory. |
3 | Terminal (bash): cm2login1:~> cd software cm2login1:~/software> git clone https://github.com/sqlp/sedumi.git cm2login1:~/software> cd sedumi cm2login1:~/software/sedumi> module load matlab/R2022a-generic cm2login1:~/software/sedumi> matlab -singleCompThread -nodisplay -r "install_sedumi; quit" | Download and install an SDP solver into the software directory. As described in the SOSTOOLS documentation, you need to install an SDP solver; the MATLAB installation does not provide one. The SOSTOOLS documentation provides a list of SDP packages. As an example, we use SeDuMi, but you may also try the other packages. We recommend testing different packages, because the choice might have a significant impact on the job performance. You need to load a MATLAB module and run MATLAB in order to install SeDuMi. The module matlab/R2022a-generic is an exemplary choice; you may choose another MATLAB release. |
4 | Terminal (bash): cm2login1:~/software> git clone https://github.com/oxfordcontrol/SOSTOOLS.git | Download SOSTOOLS into the software directory. |
5 | MATLAB script: addpath(genpath(fullfile(getenv('HOME'),'software/sedumi/'))); addpath(genpath(fullfile(getenv('HOME'),'software/SOSTOOLS/'))); | Add the SDP solver and SOSTOOLS to MATLAB's search path. This should be done before your application scripts run any function or script from the third-party software directory. addpath(genpath(...)) adds all subdirectories. Alternative: put the commands into a dedicated MATLAB script "startup.m" located on MATLAB's search path, which includes the directory in which MATLAB is started. Example: put "startup.m" into your HOME directory, change to your HOME directory and run MATLAB there. MATLAB will automatically run "startup.m". |
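Before starting MATLAB in a job, it can be worth verifying that both third-party directories from steps 3 and 4 exist. The following sketch does that for the directory layout used in the table. The function name check_install_dirs is made up for this illustration.

```shell
# Sketch: verify that the directories created in steps 3 and 4 exist.
# Argument: the base directory for third-party software (e.g. $HOME/software).
check_install_dirs() {
    base=$1
    missing=0
    for d in sedumi SOSTOOLS; do
        if [ ! -d "$base/$d" ]; then
            echo "missing: $base/$d" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_install_dirs "$HOME/software" || echo "Installation steps 3 and 4 are incomplete."
```

The check only tests for the directories, not for a successful run of install_sedumi.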
MATLAB Parallel Server
The following bash command performs an MPS license check and can be used in two cases:
- You may check whether you are allowed to use the MPS.
- The output might be helpful when troubleshooting licensing issues. Please attach it to incident tickets submitted to the support.
If MATLAB prints the list of installed toolboxes then everything is fine. Otherwise, you will receive an error message.
> matlab -dmlworker -nodisplay -r "ver,quit"
For troubleshooting purposes you may also run a license check using the license command (as shown below) within a MATLAB session. If it returns "1" then the MPS license exists.
>> license('checkout','distrib_computing_toolbox')
You tried to submit an MPS job. Do you get an error which looks like the following example?
Error using parallel.Cluster/batch (line 158)
Too many workers requested. The job requires 200, but the cluster with profile 'coolmuc local R2021a' supports a maximum of 448.
The job size is too big. Please reduce the number of workers.
The number of MPS licenses is limited. Slurm automatically keeps MPS jobs queued until licenses are available (please note that our Slurm regulations apply and there is no guarantee that jobs start immediately). You may check the availability of MPS licenses with the following bash command on one of the login nodes:
> scontrol show licenses --clusters=cm2 | grep -A1 "MATLAB_Distrib_Comp_Engine"
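If only the number of free licenses is of interest, the output of the command can be filtered further. The sketch below assumes that scontrol prints a line "LicenseName=..." followed by a line containing "Free=<number>", which may differ between Slurm versions. The function name free_licenses and the numbers in the sample text are made up for this illustration.

```shell
# Sketch: print the Free= count of one license from "scontrol show licenses"
# output read on stdin. Prints nothing if the license is not listed.
free_licenses() {
    awk -v name="$1" '
        index($0, "LicenseName=" name) { found = 1 }
        found && match($0, /Free=[0-9]+/) {
            print substr($0, RSTART + 5, RLENGTH - 5)
            exit
        }
    '
}

# Invented sample output, used here instead of a real scontrol call:
sample='LicenseName=MATLAB_Distrib_Comp_Engine
    Total=100 Used=36 Free=64 Reserved=0 Remote=yes'
printf '%s\n' "$sample" | free_licenses MATLAB_Distrib_Comp_Engine   # prints 64

# On a login node, the real call would be:
#   scontrol show licenses --clusters=cm2 | free_licenses MATLAB_Distrib_Comp_Engine
```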
Do you get an error which looks like the following example?
============================= JOB SUBMISSION INFO =============================
Job will be submitted to Slurm via following command:
sbatch ...
===============================================================================
/dss/dsshome1/lrz/sys/spack/release/21.1.1/opt/x86_64/matlab/R2021a-gcc-cbij4ux/bin/glnxa64/matlab_helper: symbol lookup error: /lrz/sys/tools/intel-mpi-wrappers/lib/sles15wa.so: undefined symbol: dlsym
Error using parallel.Cluster/batch (line 158)
Job submission failed because the plugin function 'communicatingSubmitFcn.m' errored.

Error in job_config (line 63)
job = ch.batch(fhandle, 4, {}, 'Pool', num_worker);

Caused by:
Error using communicatingSubmitFcn (line 113)
Submit failed with the following message:
The library libdl.so cannot be found. This error occurs in combination with particular Intel modules. Workaround: please quit your MATLAB session and restart it with a modified command:
> LD_PRELOAD=/usr/lib64/libdl.so matlab <matlab-arguments>
With the default Intel-MPI module (version 2019), parallel MATLAB jobs might crash unexpectedly. Please switch to Intel MPI 2018. Example:
> module remove intel-mpi
> module load intel-mpi/2018-intel
> module load matlab-mcr/R2021a-generic
MATLAB Compiler Runtime
Does your (SuperMUC-NG) job show the following error?
libXt.so.6: cannot open shared object file: No such file or directory
As a workaround you have to modify the LD_LIBRARY_PATH environment variable in your job script in order to link to the missing library.
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lrz/sys/spack/release/21.1.1/opt/skylake_avx512/libxt/1.1.5-gcc-twemnx6/lib
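If LD_LIBRARY_PATH is empty or unset when the line above runs, the resulting value starts with a colon, and the dynamic linker treats the empty entry as the current working directory. The following sketch avoids that with a conditional parameter expansion. The function name append_ld_library_path is made up for this illustration, and the directory is the one from the workaround above.

```shell
# Sketch: append a directory to LD_LIBRARY_PATH without creating an empty entry.
# The ":" separator is only inserted if LD_LIBRARY_PATH already has a value.
append_ld_library_path() {
    dir=$1
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}"
    export LD_LIBRARY_PATH
}

append_ld_library_path /lrz/sys/spack/release/21.1.1/opt/skylake_avx512/libxt/1.1.5-gcc-twemnx6/lib
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```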
Do you get the following error?
Error: Could not find version X of the MATLAB Runtime.
Attempting to load libmwmclmcrrt.so.X.
Please install the correct version of the MATLAB Runtime.
Contact your vendor if you do not have an installer for the MATLAB Runtime.
The MATLAB release version used for compilation has to match the MCR version at runtime! The update level may be different. Example:
- MATLAB version on development system: R2020b (Update 5)
- MCR version on target system: R2020b (Update 3)
That is, when you switch to a new MCR release on the target system, you have to recompile your application on the development system using the matching MATLAB release.
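The matching rule (same release, update level ignored) can be expressed as a small check. The following sketch compares two version strings written as in the example above. The function names release_of and same_release are made up for this illustration, and the pattern assumes release names of the form R<year>a or R<year>b.

```shell
# Sketch: compare the MATLAB release used for compilation with the MCR release,
# ignoring the update level.

# Extract the release token (e.g. R2020b) from a version string.
release_of() {
    printf '%s\n' "$1" | sed -n 's/.*\(R[0-9][0-9][0-9][0-9][ab]\).*/\1/p'
}

# Return 0 if both strings contain the same release token.
same_release() {
    [ -n "$(release_of "$1")" ] && [ "$(release_of "$1")" = "$(release_of "$2")" ]
}

if same_release "R2020b (Update 5)" "R2020b (Update 3)"; then
    echo "Releases match: no recompilation needed."
fi
if ! same_release "R2021a" "R2020b (Update 3)"; then
    echo "Releases differ: recompile with the matching MATLAB release."
fi
```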
Please also refer to the example given at MATLAB Compiler Runtime (MCR) and Job Farming.