Next: HPFacts
Up: Data Distribution
 Previous: Other Mappings
 
 
Writing HPF is always a trade-off between parallelism and communication:
-  more processors generally means more communication,
 -  try to balance the load, assuming the owner-computes rule,
 -  try to ensure data locality,
 -  use array syntax or array intrinsics,
 -  avoid storage and sequence association (and assumed-size arrays).
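
The load-balancing point can be illustrated with a small sketch. The array u, its size n and the triangular workload are hypothetical, chosen only for illustration. Under the owner-computes rule the processor that owns row i does the work for row i, so a (BLOCK, *) distribution would give the processor holding the first rows most of the work, whereas CYCLIC deals the rows out in turn.

```fortran
PROGRAM triangle
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 64        ! hypothetical problem size
  REAL :: u(n, n)
  INTEGER :: i
  ! Rows are dealt out round-robin; each row lives wholly on one processor.
  ! No PROCESSORS directive is given, so the arrangement is left to the compiler.
!HPF$ DISTRIBUTE u(CYCLIC, *)

  u = 0.0

  ! Row i updates n-i+1 elements, so early rows cost more than late ones.
  ! Each iteration writes only its own row, so the iterations are independent.
!HPF$ INDEPENDENT
  DO i = 1, n
    u(i, i:n) = 1.0
  END DO

  PRINT *, SUM(u)
END PROGRAM triangle
```

Because the !HPF$ lines are comments to an ordinary Fortran 90 compiler, the same source can also be built and checked serially.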
 
It is fairly easy to write a correct HPF program but it is an awful lot
harder to write an efficient one. The technique is more or less:
-  write a correct serial version of the intended HPF program, test
and debug,
 -  add distribution directives so as to generate the minimum number
of communications in pursuit of the solution.
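
A minimal sketch of this two-step technique follows, assuming a three-point averaging kernel; the names a, b and p and the choice of four processors are illustrative and not taken from the course. The executable statements are the debugged serial program, and the three directives are what the second step adds.

```fortran
PROGRAM smooth
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000      ! hypothetical problem size
  REAL :: a(n), b(n)
  INTEGER :: i
  ! Step 2: mapping directives added to the working serial code.
  ! The processor count of 4 is an assumption made for this sketch.
!HPF$ PROCESSORS p(4)
!HPF$ DISTRIBUTE a(BLOCK) ONTO p
!HPF$ ALIGN b(:) WITH a(:)

  ! Step 1: plain Fortran 90 that runs unchanged on a serial compiler.
  a = (/ (REAL(i), i = 1, n) /)
  b = a

  ! Three-point average in array syntax. With BLOCK, only the elements at
  ! the edges of each block have a neighbour on another processor.
  b(2:n-1) = (a(1:n-2) + a(2:n-1) + a(3:n)) / 3.0

  PRINT *, b(1), b(2), b(n)
END PROGRAM smooth
```

Had a been distributed CYCLIC instead, both neighbours of every element would sit on other processors, so the same statement would be expected to generate far more communication.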
 
Data locality is the key: well-chosen alignment and distribution
strategies will enhance performance significantly.
Programmers should add as many directives as possible: add
INDEPENDENT directives wherever they are correct, align objects explicitly,
and so on.
The most important factor is a good parallel algorithm; many
serial algorithms are wholly inappropriate for HPF programs.
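
As a hedged illustration of adding directives "where correct", the sketch below uses hypothetical arrays x, y and perm. The compiler cannot generally prove that perm contains no repeated values, so without the assertion it may have to run the loop serially; the programmer, who knows perm is a permutation, can say so.

```fortran
PROGRAM scatter_update
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 8         ! hypothetical problem size
  REAL :: x(n), y(n)
  INTEGER :: perm(n), i
  ! Everything is aligned explicitly rather than left to compiler defaults.
!HPF$ DISTRIBUTE x(BLOCK)
!HPF$ ALIGN y(:) WITH x(:)
!HPF$ ALIGN perm(:) WITH x(:)

  x = (/ (REAL(i), i = 1, n) /)
  y = 0.0
  perm = (/ (n + 1 - i, i = 1, n) /)  ! reversal: every index appears once

  ! INDEPENDENT asserts that no iteration writes a location that another
  ! iteration reads or writes. That is true here only because perm has
  ! no repeated values; with duplicates the directive would be incorrect.
!HPF$ INDEPENDENT
  DO i = 1, n
    y(perm(i)) = 2.0 * x(i)
  END DO

  PRINT *, y
END PROGRAM scatter_update
```

Note that INDEPENDENT is a statement about correctness, not locality: the indirect store may still require communication between the owner of x(i) and the owner of y(perm(i)).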
 
Badly written programs will result in poor execution speeds:
-  if there are complicated subscript expressions then the compiler
may not be able to efficiently work out which elements are needed from
other processors,
 -  vector subscripting is a very powerful technique which is not
handled efficiently by the current compilers. Use with caution.
 -  sequential DO loops may well be executed sequentially; only a very
sophisticated compiler will be able to detect parallelism in such
structures. This may mean
that processor 1 performs its iterations of a loop and, when it has finished,
informs processor 2, which then performs its own iterations and in turn
informs processor 3, and so
on. If there are hundreds of processors then this kind of loop will
absolutely cripple performance.
 -  remapping objects can be performed across procedure boundaries, but
unnecessary remappings are very time-consuming. Again,
be careful.
 -  if the system is unevenly balanced then some processors will be
constantly waiting for other processors to finish. This will also ruin
performance.
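
The sequential DO loop case can be sketched as follows, with hypothetical arrays a and b. The first loop is a reduction that happens to be written serially and can be replaced by an intrinsic; the second is a genuine recurrence, where each iteration needs the result of the previous one, which is the pattern that may force processors to take turns.

```fortran
PROGRAM loops
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000      ! hypothetical problem size
  REAL :: a(n), b(n)
  REAL :: total
  INTEGER :: i
!HPF$ DISTRIBUTE a(BLOCK)
!HPF$ ALIGN b(:) WITH a(:)

  b = 1.0

  ! Accumulating into a scalar: every iteration updates total, so a
  ! compiler that does not recognise the reduction may run this serially.
  total = 0.0
  DO i = 1, n
    total = total + b(i)
  END DO

  ! The same result through an array intrinsic, which states the
  ! reduction directly instead of relying on loop analysis.
  total = SUM(b)

  ! A true recurrence: a(i) needs a(i-1), so the owner of each block
  ! must wait for the previous block to finish before it can start.
  a(1) = b(1)
  DO i = 2, n
    a(i) = a(i-1) + b(i)
  END DO

  PRINT *, total, a(n)
END PROGRAM loops
```

A loop of the second kind usually calls for a different algorithm rather than a different directive.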
 
 
 
 
  
Adam Marshall ©University of Liverpool, 1996
Fri Dec  6 15:03:35 GMT 1996
Not for commercial use.