Speeding up constrained shuffling in MATLAB: GPU (Tesla K40m) and CPU parallel computing

Anonymous user   2021-6-2 20:36   2404   0

My code:

close all
clear all
clc

% Open a parallel pool only if one is not already running
if isempty(gcp('nocreate'))
    parpool(24);
end

% Sample input. It is fake, just for the demo.
% The numbers of "lamps" and "blinks" are similar to the real data.
NLamps = 10^2;
NBlinks = 2*10^2;
Events = cumsum(randg(9,NLamps,NBlinks),2); % each row is a different "lamp"
DurationOfExperiment = Events(:,end).*1.01;

%% MAIN
% Define parameters
nLags = 2;            % keep autocorrelations at lags 1-2
alpha = [0.01,0.1];   % range of allowed relative deviation from the observed
                      % parameters; should be > 0 to avoid regenerating the
                      % original sequence
nPermutations = 10^2; % 10^5 in the original code

% Processing of experimental data
DurationOfExperiment = num2cell(DurationOfExperiment);
Events = num2cell(Events,2);
Intervals = cellfun(@(x) diff(x),Events,'UniformOutput',false);
observedParams = cellfun(@(x) fGetParameters(x,nLags),Intervals,'UniformOutput',false);
observedParams = cell2mat(observedParams);

% Constrained shuffling. EXPENSIVE PART!!!
while true % no exit condition: batches repeat until the script is stopped
    parfor iPermutation = 1:nPermutations
        % Shuffle intervals
        shuffledIntervals = cellfun(@(x,y) fPermute(x,y),Intervals,DurationOfExperiment,'UniformOutput',false);
        % Get parameters of the shuffled intervals
        shuffledParameters = cellfun(@(x) fGetParameters(x,nLags),shuffledIntervals,'UniformOutput',false);
        shuffledParameters = cell2mat(shuffledParameters);
        % Get relative deviation
        delta = abs((shuffledParameters-observedParams)./observedParams);
        % Find shuffled lamps whose deviations are all inside the alpha range
        MaximumDeviation = max(delta,[],2);
        MinimumDeviation = min(delta,[],2);
        LampID = find(and(MaximumDeviation<alpha(2),MinimumDeviation>alpha(1)));
        % If shuffling of ANY lamp was successful, save these intervals
        if ~isempty(LampID)
            shuffledIntervals = shuffledIntervals(LampID);
            shuffledParameters = shuffledParameters(LampID,:);
            parsave(LampID,shuffledIntervals,shuffledParameters);
            disp('DONE')
        end
    end
end

%% FUNCTIONS
function [ params ] = fGetParameters( intervals,nLags )
% Calculate [mean, std, autocorrelations with lags from 1 to nLags]
R = nan(1,nLags);
for lag = 1:nLags
    R(lag) = corr(intervals(1:end-lag)',intervals((1+lag):end)','type','Spearman');
end
params = [mean(intervals),std(intervals),R];
end
%--------------------------------------------------------------------------
function [ Intervals ] = fPermute( Intervals,Duration )
% Create a long shuffled time series (datasample draws with replacement
% by default; 3x the original number of intervals)
Time = cumsum([0,datasample(Intervals,numel(Intervals)*3)]);
% Keep the same duration
Time(Time>Duration) = [];
% Calculate intervals
Intervals = diff(Time);
end
%--------------------------------------------------------------------------
function parsave( LampID,Intervals,params)
% Wrapper so that save can be called from inside parfor;
% the file name is a random integer
save([num2str(randi(10^9)),'.mat'],'LampID','Intervals','params')
end

Server specs:

>>gpuDevice()

CUDADevice with properties:

Name: 'Tesla K40m'

Index: 1

ComputeCapability: '3.5'

SupportsDouble: 1

DriverVersion: 8

ToolkitVersion: 8

MaxThreadsPerBlock: 1024

MaxShmemPerBlock: 49152

MaxThreadBlockSize: [1024 1024 64]

MaxGridSize: [2.1475e+09 65535 65535]

SIMDWidth: 32

TotalMemory: 1.1979e+10

AvailableMemory: 1.1846e+10

MultiprocessorCount: 15

ClockRateKHz: 745000

ComputeMode: 'Default'

GPUOverlapsTransfers: 1

KernelExecutionTimeout: 0

CanMapHostMemory: 1

DeviceSupported: 1

DeviceSelected: 1

>> feature('numcores')

MATLAB detected: 12 physical cores.

MATLAB detected: 24 logical cores.

MATLAB was assigned: 24 logical cores by the OS.

MATLAB is using: 12 logical cores.

MATLAB is not using all logical cores because hyper-threading is enabled.

>> system('for /f "tokens=2 delims==" %A in (''wmic cpu get name /value'') do @(echo %A)')

Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz

Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz

>> memory

Maximum possible array: 496890 MB (5.210e+11 bytes) *

Memory available for all arrays: 496890 MB (5.210e+11 bytes) *

Memory used by MATLAB: 18534 MB (1.943e+10 bytes)

Physical Memory (RAM): 262109 MB (2.748e+11 bytes)

* Limited by System Memory (physical + swap file) available.

Question:

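Is it possible to speed up my calculation? I am considering combined CPU + GPU computing, but I don't understand how to do it (I have no experience with gpuArray). Moreover, I'm not sure it is a good idea. Sometimes algorithmic optimization gives a bigger gain than parallel computing.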

P.S.

The saving step is not the bottleneck: in the best case it happens once every 10-30 minutes.
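On the algorithmic side raised in the question, one candidate is the Spearman autocorrelation inside fGetParameters, which is called for every lamp in every permutation. Spearman's rho is the Pearson correlation of the tied ranks, so it can be computed directly with tiedrank instead of going through corr. The sketch below is only an illustration: fLagSpearman is a hypothetical helper name that does not appear in the original code, it assumes a row vector of intervals and the Statistics and Machine Learning Toolbox (already needed for corr, randg and datasample), and whether it actually saves time should be measured on the real data.

```matlab
% Compare the hand-rolled version with corr(...,'type','Spearman')
rng(1);                          % fixed seed so the check is repeatable
intervals = randg(9,1,200);      % fake row vector, like the demo input
nLags = 2;

Rfast = fLagSpearman(intervals,nLags);
Rref = nan(1,nLags);
for lag = 1:nLags
    Rref(lag) = corr(intervals(1:end-lag)',intervals((1+lag):end)','type','Spearman');
end
disp(max(abs(Rfast-Rref)))       % difference between the two versions

function R = fLagSpearman(intervals,nLags)
% Hypothetical helper (not part of the original post).
% Spearman autocorrelation at lags 1..nLags, computed as the Pearson
% correlation of the tied ranks of each lagged pair of sub-vectors.
R = nan(1,nLags);
for lag = 1:nLags
    a = tiedrank(intervals(1:end-lag));
    b = tiedrank(intervals((1+lag):end));
    a = a(:) - mean(a);          % centered ranks as column vectors
    b = b(:) - mean(b);
    R(lag) = (a'*b)/sqrt((a'*a)*(b'*b));
end
end
```

If the two versions agree on the real data, fLagSpearman could replace the corr call in fGetParameters. Timing both with timeit on a single lamp would show whether the change is worthwhile before touching the parfor loop.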
