Changes between Version 11 and Version 12 of cypress/Matlab


Timestamp:
08/24/16 11:45:23 (8 years ago)
Author:
fuji
Comment:

  • cypress/Matlab

    v11 v12  
    89 89
    90 90 See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
    91 
    92 === Running MATLAB in Parallel with Multithreads ===
    93 MATLAB supports multithreaded computation for a number of functions and expressions that are combinations of element-wise functions.
    94 These functions automatically execute on multiple threads if data size is large enough.
    95 Note that on Cypress, MATLAB runs with a single thread by default, so you have to set the number of threads explicitly in your code.
    96 For example,
    97 {{{#!matlab
    98 % Matlab Test Code "FuncTest.m"
    99 %
    100 LASTN = maxNumCompThreads(str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
    101 nth = maxNumCompThreads;
    102 fprintf('Number of Threads = %d.\n',nth);
    103 
    104 N=2^(14);
    105 A = randn(N);
    106 st = cputime;
    107 tic;
    108 B = sin(A);
    109 realT = toc;
    110 cpuT = cputime -st;
    111 fprintf('Real Time = %f(sec)\n',realT);
    112 fprintf('CPU Time = %f(sec)\n',cpuT);
    113 fprintf('Ratio = %f\n',cpuT / realT);
    114 }}}
    115 
    116 In the above code, the line
    117 {{{#!matlab
    118 LASTN = maxNumCompThreads(str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
    119 }}}
    120 defines the number of threads.
    121 The environment variable '''SLURM_JOB_CPUS_PER_NODE''' holds the value set in the SLURM script, for example,
    122 {{{#!bash
    123 #!/bin/bash
    124 #SBATCH --qos=normal            # Quality of Service
    125 #SBATCH --job-name=matlabMT     # Job Name
    126 #SBATCH --time=1:00:00          # WallTime
    127 #SBATCH --nodes=1               # Number of Nodes
    128 #SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
    129 #SBATCH --cpus-per-task=10      # Number of threads per task (OMP threads)
    130 
    131 module load matlab
    132 matlab -nodesktop -nodisplay -nosplash -r "FuncTest; exit;"
    133 }}}
    134 The number of cores per process (task) is set by '''--cpus-per-task=10'''.
    135 Because the job runs one task on one node, this value is passed to '''SLURM_JOB_CPUS_PER_NODE''', and you can use it to set the number of threads in the code.
    136 
    137 '''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
    138 
    139 See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
    140 
    141 ==== Explicit parallelism ====
    142 The ''parallel computing toolbox'' is available on Cypress.
    143 You can use up to 12 workers for shared parallel operations on a single node in the current MATLAB version.
    144 Our license does not include MATLAB Distributed Computing Server. Therefore, multi-node parallel operations are not supported.
    145 
    146 Workers run as independent processes. To use 4 workers, request at least 4 cores on a node, for example with '''--cpus-per-task=4''' as in the script below.
    147 
    148 [[Image(MatlabWorkers.jpeg)]]
    149 
    150 {{{#!bash
    151 #!/bin/bash
    152 #SBATCH --qos=normal            # Quality of Service
    153 #SBATCH --job-name=matlabPool     # Job Name
    154 #SBATCH --time=1:00:00          # WallTime
    155 #SBATCH --nodes=1               # Number of Nodes
    156 #SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
    157 #SBATCH --cpus-per-task=4       # Number of threads per task (OMP threads)
    158 
    159 module load matlab
    160 matlab  -nodesktop -nodisplay -nosplash -r "CreateWorker; ParforTest; exit;"
    161 }}}
    162 
    163 ''!CreateWorker.m'' is a MATLAB script that creates the workers.
    164 {{{#!matlab
    165 % Parallel Computing Toolbox test "CreateWorker.m"
    166 %
    167 if isempty(getenv('SLURM_JOB_CPUS_PER_NODE'))
    168    nWorker = 1;
    169 else
    170    nWorker = min(12,str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
    171 end
    172 % Create Workers
    173 parpool(nWorker);
    174 %
    175 }}}
    176 
    177 ''!ParforTest.m'' is a sample 'parfor' test code:
    178 {{{#!matlab
    179 % parfor "ParforTest.m"
    180 %
    181 iter = 10000;
    182 sz = 50;
    183 a = zeros(1,iter);
    184 %
    185 fprintf('Computing...\n');
    186 tic;
    187 parfor i = 1:iter
    188     a(i) = max(svd(randn(sz)));
    189 end
    190 toc;
    191 %
    192 poolobj = gcp('nocreate'); % Returns the current pool if one exists. If no pool, do not create new one.
    193 if isempty(poolobj)
    194     poolobj = gcp;
    195 end
    196 fprintf('Number of Workers = %d.\n',poolobj.NumWorkers);
    197 %
    198 }}}
    199  
    200 '''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
    201 
    202 See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
    203 
    204 === Running MATLAB with Automatic Offload ===
    205 Internally, MATLAB uses the Intel MKL Basic Linear Algebra Subroutines (BLAS) and Linear Algebra Package (LAPACK) routines to perform the underlying computations when running on Intel processors.
    206
    207 Intel MKL includes an Automatic Offload (AO) feature that lets computationally intensive Intel MKL functions offload part of their workload to attached '''Intel Xeon Phi''' coprocessors automatically and transparently.
    208 
    209 As a result, MATLAB performance can benefit from Intel Xeon Phi coprocessors via the Intel MKL AO feature when problem sizes are large enough to amortize the cost of transferring data to the coprocessors.
    210 
    211 In the SLURM script, make sure that the option '''--gres=mic:1''' is set and that the ''intel-psxe'' module is loaded in addition to the MATLAB module.
    212 
    213 {{{#!bash
    214 #!/bin/bash
    215 #SBATCH --qos=normal            # Quality of Service
    216 #SBATCH --job-name=matlabAO     # Job Name
    217 #SBATCH --time=1:00:00          # WallTime
    218 #SBATCH --nodes=1               # Number of Nodes
    219 #SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
    220 #SBATCH --cpus-per-task=1       # Number of threads per task (OMP threads)
    221 #SBATCH --gres=mic:1            # Number of Co-Processors
    222 
    223 module load matlab
    224 module load intel-psxe
    225 
    226 export MKL_MIC_ENABLE=1
    227 matlab  -nodesktop -nodisplay -nosplash -r "MatTest; exit;"
    228 }}}
    229 
    230 Note that
    231 {{{#!bash
    232 export MKL_MIC_ENABLE=1
    233 }}}
    234 enables Intel MKL Automatic Offload (AO).
    235 
    236 The sample code is below:
    237 {{{#!matlab
    238 %
    239 % Matrix test "MatTest.m"
    240 %
    241 A = rand(10000, 10000);
    242 B = rand(10000, 10000);
    243 tic;
    244 C = A * B;
    245 realT = toc;
    246 fprintf('Real Time = %f(sec)\n',realT);
    247 }}}
    248 
    249 '''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
    250 
    251 See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
    252 
    253 
    254 ----
    255 
    256 91 == Compiled Matlab ==
    257 92 === Compiling Matlab Scripts using mcc ===
     
    295 130 ./my_script
    296 131 }}}
     132
     133
     134
     135=== Running MATLAB in Parallel with Multithreads ===
     136MATLAB supports multithreaded computation for a number of functions and expressions that are combinations of element-wise functions.
     137These functions automatically execute on multiple threads if data size is large enough.
     138Note that on Cypress, MATLAB runs with a single thread by default, so you have to set the number of threads explicitly in your code.
     139For example,
     140{{{#!matlab
     141% Matlab Test Code "FuncTest.m"
     142%
     143LASTN = maxNumCompThreads(str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
     144nth = maxNumCompThreads;
     145fprintf('Number of Threads = %d.\n',nth);
     146
     147N=2^(14);
     148A = randn(N);
     149st = cputime;
     150tic;
     151B = sin(A);
     152realT = toc;
     153cpuT = cputime -st;
     154fprintf('Real Time = %f(sec)\n',realT);
     155fprintf('CPU Time = %f(sec)\n',cpuT);
     156fprintf('Ratio = %f\n',cpuT / realT);
     157}}}
     158
     159In the above code, the line
     160{{{#!matlab
     161LASTN = maxNumCompThreads(str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
     162}}}
     163defines the number of threads.
     164The environment variable '''SLURM_JOB_CPUS_PER_NODE''' holds the value set in the SLURM script, for example,
     165{{{#!bash
     166#!/bin/bash
     167#SBATCH --qos=normal            # Quality of Service
     168#SBATCH --job-name=matlabMT     # Job Name
     169#SBATCH --time=1:00:00          # WallTime
     170#SBATCH --nodes=1               # Number of Nodes
     171#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
     172#SBATCH --cpus-per-task=10      # Number of threads per task (OMP threads)
     173
     174module load matlab
     175matlab -nodesktop -nodisplay -nosplash -r "FuncTest; exit;"
     176}}}
     177The number of cores per process (task) is set by '''--cpus-per-task=10'''.
     178Because the job runs one task on one node, this value is passed to '''SLURM_JOB_CPUS_PER_NODE''', and you can use it to set the number of threads in the code.
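
A minimal sketch of how the thread count could be derived in the job script itself before MATLAB starts. It relies on the documented SLURM behavior that '''SLURM_JOB_CPUS_PER_NODE''' can carry a repetition suffix such as `10(x2)` or a comma-separated list on multi-node jobs. The helper name `parse_cpus` and the variable `NTHREADS` are hypothetical and are not part of the Cypress setup.

```shell
#!/bin/bash
# Hypothetical helper: reduce a SLURM_JOB_CPUS_PER_NODE value to one integer.
# Accepts forms such as "10", "10(x2)" or "8(x2),4"; an empty value yields 1.
parse_cpus() {
    raw="${1:-1}"        # fall back to 1 when the value is empty or missing
    raw="${raw%%,*}"     # keep only the first node group
    raw="${raw%%\(*}"    # drop a "(xN)" repetition suffix
    echo "$raw"
}

# Outside a SLURM job the variable is unset, so this falls back to 1.
NTHREADS="$(parse_cpus "${SLURM_JOB_CPUS_PER_NODE:-}")"
export NTHREADS
echo "threads=$NTHREADS"
```

With this in place, the MATLAB code could read `getenv('NTHREADS')` instead of parsing the SLURM variable directly. On a single-node job such as the one above, both approaches give the same number.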
     179
     180'''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
     181
     182See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
     183
     184==== Explicit parallelism ====
     185The ''parallel computing toolbox'' is available on Cypress.
     186You can use up to 12 workers for shared parallel operations on a single node in the current MATLAB version.
     187Our license does not include MATLAB Distributed Computing Server. Therefore, multi-node parallel operations are not supported.
     188
     189Workers run as independent processes. To use 4 workers, request at least 4 cores on a node, for example with '''--cpus-per-task=4''' as in the script below.
     190
     191[[Image(MatlabWorkers.jpeg)]]
     192
     193{{{#!bash
     194#!/bin/bash
     195#SBATCH --qos=normal            # Quality of Service
     196#SBATCH --job-name=matlabPool     # Job Name
     197#SBATCH --time=1:00:00          # WallTime
     198#SBATCH --nodes=1               # Number of Nodes
     199#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
     200#SBATCH --cpus-per-task=4       # Number of threads per task (OMP threads)
     201
     202module load matlab
     203matlab  -nodesktop -nodisplay -nosplash -r "CreateWorker; ParforTest; exit;"
     204}}}
     205
     206''!CreateWorker.m'' is a MATLAB script that creates the workers.
     207{{{#!matlab
     208% Parallel Computing Toolbox test "CreateWorker.m"
     209%
     210if isempty(getenv('SLURM_JOB_CPUS_PER_NODE'))
     211   nWorker = 1;
     212else
     213   nWorker = min(12,str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
     214end
     215% Create Workers
     216parpool(nWorker);
     217%
     218}}}
     219
     220''!ParforTest.m'' is a sample 'parfor' test code:
     221{{{#!matlab
     222% parfor "ParforTest.m"
     223%
     224iter = 10000;
     225sz = 50;
     226a = zeros(1,iter);
     227%
     228fprintf('Computing...\n');
     229tic;
     230parfor i = 1:iter
     231    a(i) = max(svd(randn(sz)));
     232end
     233toc;
     234%
     235poolobj = gcp('nocreate'); % Returns the current pool if one exists. If no pool, do not create new one.
     236if isempty(poolobj)
     237    poolobj = gcp;
     238end
     239fprintf('Number of Workers = %d.\n',poolobj.NumWorkers);
     240%
     241}}}
     242 
     243'''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
     244
     245See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
     246
     247=== Running MATLAB with Automatic Offload ===
     248Internally, MATLAB uses the Intel MKL Basic Linear Algebra Subroutines (BLAS) and Linear Algebra Package (LAPACK) routines to perform the underlying computations when running on Intel processors.
     249
     250Intel MKL includes an Automatic Offload (AO) feature that lets computationally intensive Intel MKL functions offload part of their workload to attached '''Intel Xeon Phi''' coprocessors automatically and transparently.
     251
     252As a result, MATLAB performance can benefit from Intel Xeon Phi coprocessors via the Intel MKL AO feature when problem sizes are large enough to amortize the cost of transferring data to the coprocessors.
     253
     254In the SLURM script, make sure that the option '''--gres=mic:1''' is set and that the ''intel-psxe'' module is loaded in addition to the MATLAB module.
     255
     256{{{#!bash
     257#!/bin/bash
     258#SBATCH --qos=normal            # Quality of Service
     259#SBATCH --job-name=matlabAO     # Job Name
     260#SBATCH --time=1:00:00          # WallTime
     261#SBATCH --nodes=1               # Number of Nodes
     262#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
     263#SBATCH --cpus-per-task=1       # Number of threads per task (OMP threads)
     264#SBATCH --gres=mic:1            # Number of Co-Processors
     265
     266module load matlab
     267module load intel-psxe
     268
     269export MKL_MIC_ENABLE=1
     270matlab  -nodesktop -nodisplay -nosplash -r "MatTest; exit;"
     271}}}
     272
     273Note that
     274{{{#!bash
     275export MKL_MIC_ENABLE=1
     276}}}
     277enables Intel MKL Automatic Offload (AO).
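
A minimal sketch of the same environment setup with an optional diagnostic added. '''OFFLOAD_REPORT''' is an Intel offload runtime variable that the page above does not mention. The assumption here is that the runtime provided by ''intel-psxe'' on Cypress honors it, which should be checked against the installed version.

```shell
#!/bin/bash
# Enable Intel MKL Automatic Offload, as in the job script above.
export MKL_MIC_ENABLE=1

# Assumption: the Intel offload runtime prints a per-call offload report
# when OFFLOAD_REPORT is set. Keep any value already chosen by the user.
export OFFLOAD_REPORT="${OFFLOAD_REPORT:-2}"

echo "MKL_MIC_ENABLE=$MKL_MIC_ENABLE OFFLOAD_REPORT=$OFFLOAD_REPORT"
```

If the report shows no offloaded calls for a large matrix product, the coprocessor was probably not allocated or the problem was too small to be offloaded.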
     278
     279The sample code is below:
     280{{{#!matlab
     281%
     282% Matrix test "MatTest.m"
     283%
     284A = rand(10000, 10000);
     285B = rand(10000, 10000);
     286tic;
     287C = A * B;
     288realT = toc;
     289fprintf('Real Time = %f(sec)\n',realT);
     290}}}
     291
     292'''Note: Since the number of licenses is limited, it is recommended to compile MATLAB code into an executable.'''
     293
     294See [https://wiki.hpc.tulane.edu/trac/wiki/cypress/Matlab#CompiledMatlab]
     295