
Workers are like independent processes. To use 4 workers, you must request at least 4 CPU cores (e.g. via '''--cpus-per-task=4''') within a node.

[[Image(MatlabWorkers.jpg)]]

{{{#!bash
#!/bin/bash
#SBATCH --qos=normal            # Quality of Service
#SBATCH --job-name=matlabPool   # Job Name
#SBATCH --time=1:00:00          # WallTime
#SBATCH --nodes=1               # Number of Nodes
#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
#SBATCH --cpus-per-task=4       # Number of threads per task (OMP threads)

module load matlab
matlab -nodesktop -nodisplay -nosplash -r "CreateWorker; ParforTest; exit;"
}}}

''!CreateWorker.m'' is a MATLAB script that creates the workers:
{{{#!matlab
% Parallel Toolbox test "CreateWorker.m"
%
if isempty(getenv('SLURM_JOB_CPUS_PER_NODE'))
    nWorker = 1;
else
    nWorker = min(12, str2num(getenv('SLURM_JOB_CPUS_PER_NODE')));
end
% Create the workers
parpool(nWorker);
%
}}}
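
Since the batch script above allocates cores with ''--cpus-per-task'', an alternative is to size the pool from the ''SLURM_CPUS_PER_TASK'' environment variable, which holds exactly that value. This is only a sketch of a possible variant, not part of the original example:

{{{#!matlab
% Hypothetical variant: size the pool from SLURM_CPUS_PER_TASK
nCpus = getenv('SLURM_CPUS_PER_TASK');
if isempty(nCpus)
    nWorker = 1;                       % fall back to a single worker outside SLURM
else
    nWorker = min(12, str2double(nCpus));
end
parpool(nWorker);
}}}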

''!ParforTest.m'' is a sample ''parfor'' test code:
{{{#!matlab
% parfor test "ParforTest.m"
%
iter = 10000;
sz = 50;
a = zeros(1,iter);
%
fprintf('Computing...\n');
tic;
parfor i = 1:iter
    a(i) = max(svd(randn(sz)));
end
toc;
%
poolobj = gcp('nocreate'); % Return the current pool if one exists; if there is no pool, do not create one.
if isempty(poolobj)
    poolobj = gcp;
end
fprintf('Number of Workers = %d.\n', poolobj.NumWorkers);
%
}}}
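
When the test is finished, the pool can be released explicitly before MATLAB exits. A minimal sketch using the standard ''gcp''/''delete'' API:

{{{#!matlab
% Shut down the current pool, if any, so the workers are released
poolobj = gcp('nocreate');    % do not create a pool just to delete it
if ~isempty(poolobj)
    delete(poolobj);
end
}}}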
=== Running MATLAB with Automatic Offload ===
Internally, MATLAB uses Intel MKL Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package (LAPACK) routines to perform the underlying computations when running on Intel processors.

Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive Intel MKL functions to offload part of their workload to attached '''Intel Xeon Phi''' coprocessors automatically and transparently.

As a result, MATLAB performance can benefit from Intel Xeon Phi coprocessors via the Intel MKL AO feature when problem sizes are large enough to amortize the cost of transferring data to the coprocessors.

In the SLURM script, make sure that the option '''--gres=mic:1''' is set and that the ''intel-psxe'' module as well as the MATLAB module have been loaded.

{{{#!bash
#!/bin/bash
#SBATCH --qos=normal            # Quality of Service
#SBATCH --job-name=matlabAO     # Job Name
#SBATCH --time=1:00:00          # WallTime
#SBATCH --nodes=1               # Number of Nodes
#SBATCH --ntasks-per-node=1     # Number of tasks (MPI processes)
#SBATCH --cpus-per-task=1       # Number of threads per task (OMP threads)
#SBATCH --gres=mic:1            # Number of Co-Processors

module load matlab
module load intel-psxe

export MKL_MIC_ENABLE=1
matlab -nodesktop -nodisplay -nosplash -r "MatTest; exit;"
}}}

Note that
{{{#!bash
export MKL_MIC_ENABLE=1
}}}
enables Intel MKL Automatic Offload (AO).

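Depending on the MKL version installed, further environment variables can be used to tune AO behavior. The following is a sketch; check the exact variables and values against the Intel MKL documentation on your system:

{{{#!bash
export MKL_MIC_ENABLE=1          # turn Automatic Offload on
export MKL_MIC_WORKDIVISION=0.5  # offload roughly half of the work to the coprocessor
export OFFLOAD_REPORT=2          # print a per-call offload report (timing, data transfer)
}}}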
The sample code ''MatTest.m'' is below:
{{{#!matlab
%
% Matrix test "MatTest.m"
%
A = rand(10000, 10000);
B = rand(10000, 10000);
tic;
C = A * B;
realT = toc;
fprintf('Real Time = %f (sec)\n', realT);
}}}
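
As a rough sanity check, the matrix multiply above performs about 2·n^3^ floating-point operations (here n = 10000), so the achieved rate can be estimated from the measured time. A small sketch that could be appended to ''MatTest.m'':

{{{#!matlab
% Estimate the achieved floating-point rate for C = A * B
n = 10000;
gflops = 2 * n^3 / realT / 1e9;   % ~2*n^3 flops for an n-by-n matrix multiply
fprintf('Approx. rate = %.1f GFlop/s\n', gflops);
}}}

Comparing this rate with and without ''MKL_MIC_ENABLE=1'' shows whether the offload is actually paying off for a given problem size.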