- Orphan1 (source):
   main program: the "time loop" is not parallel
   the loops in smooth_x/smooth_y can be parallelised
   large overhead: the threads are forked and joined in every call
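
   The overhead can be seen in a minimal sketch like the one below. Only the name smooth_x comes from the notes above; the array names, the sizes and the three-point smoothing formula are assumptions made for illustration, not the course's actual source.

```fortran
! Sketch only: arrays, sizes and the smoothing formula are assumptions.
module smooth_mod
  implicit none
contains
  subroutine smooth_x(a, b, n)
    integer, intent(in) :: n
    real, intent(in)    :: a(n, n)
    real, intent(out)   :: b(n, n)
    integer :: i, j
    b = a                         ! keep the boundary values
    ! A complete parallel region: the threads are forked here ...
    !$OMP PARALLEL DO PRIVATE(i)
    do j = 1, n
      do i = 2, n - 1
        b(i, j) = (a(i-1, j) + a(i, j) + a(i+1, j)) / 3.0
      end do
    end do
    !$OMP END PARALLEL DO
    ! ... and joined again here, once per call.
  end subroutine smooth_x
end module smooth_mod

program orphan1_sketch
  use smooth_mod
  implicit none
  integer, parameter :: n = 100, nsteps = 10
  real :: a(n, n), b(n, n)
  integer :: t
  a = 1.0
  do t = 1, nsteps                ! serial time loop
    call smooth_x(a, b, n)        ! one fork/join per time step
    a = b
  end do
  print *, 'a(n/2, n/2) =', a(n/2, n/2)
end program orphan1_sketch
```

   With nsteps time steps, the parallel region is entered and left nsteps times, so the fork/join cost is paid on every call.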
 - Orphan2 (source):
   create the threads only once, with a PARALLEL block around the time loop
     - all threads execute this loop
     - the index variable t is private by default

   DO directive in the subroutines distributes the loop iterations
     - it is not lexically inside a PARALLEL directive ("orphaned")
     - the already created threads are used
     - synchronisation of the threads (implicit barrier) at END DO
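
   A minimal sketch of this layout follows. Only the names smooth_x and t are taken from the notes; the arrays, the sizes, the smoothing formula and the trick of swapping a and b between calls are assumptions chosen to keep the example self-contained.

```fortran
! Sketch only: arrays, sizes and the smoothing formula are assumptions.
module smooth_mod
  implicit none
contains
  subroutine smooth_x(a, b, n)
    integer, intent(in) :: n
    real, intent(in)    :: a(n, n)
    real, intent(out)   :: b(n, n)
    integer :: i, j               ! local variables: one copy per thread
    ! Orphaned DO: no PARALLEL in this routine. The directive binds to
    ! the parallel region of the caller and reuses its threads.
    !$OMP DO
    do j = 1, n
      b(1, j) = a(1, j)           ! keep the boundary values
      b(n, j) = a(n, j)
      do i = 2, n - 1
        b(i, j) = (a(i-1, j) + a(i, j) + a(i+1, j)) / 3.0
      end do
    end do
    !$OMP END DO                  ! implicit barrier: b is complete here
  end subroutine smooth_x
end module smooth_mod

program orphan2_sketch
  use smooth_mod
  implicit none
  integer, parameter :: n = 100, nsteps = 10
  real :: a(n, n), b(n, n)
  integer :: t
  a = 1.0
  !$OMP PARALLEL
  do t = 1, nsteps                ! every thread runs the time loop; t is private
    call smooth_x(a, b, n)        ! iterations shared among the existing threads
    call smooth_x(b, a, n)        ! safe: the barrier at END DO separates the calls
  end do
  !$OMP END PARALLEL
  print *, 'a(n/2, n/2) =', a(n/2, n/2)
end program orphan2_sketch
```

   The threads are created once at PARALLEL and reused in every call. If the program is compiled without OpenMP, the directives are plain comments and the code runs serially.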
	
 
 
 - Orphan3 (source):
   END DO NOWAIT: no barrier at the end of the DO construct
   can be used to reduce thread waiting times
   Warning: this is incorrect in our example (even if used for only one loop)!
   explicit synchronisation: BARRIER
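
   The sketch below illustrates where NOWAIT is safe and where a BARRIER is needed. It does not reproduce the course example; the arrays x, y, z and the loop bodies are hypothetical and chosen only so that the data dependence is easy to see.

```fortran
! Sketch only: x, y, z and the loop bodies are hypothetical.
program nowait_sketch
  implicit none
  integer, parameter :: n = 1000
  real :: x(n), y(n), z(n)
  integer :: i
  !$OMP PARALLEL
  !$OMP DO
  do i = 1, n
    x(i) = real(i)
  end do
  !$OMP END DO NOWAIT             ! safe: the next loop does not touch x
  !$OMP DO
  do i = 1, n
    y(i) = 2.0 * real(i)
  end do
  !$OMP END DO NOWAIT
  ! The last loop reads elements of x that other threads have written,
  ! so all threads must have finished both loops before it starts.
  !$OMP BARRIER
  !$OMP DO
  do i = 1, n
    z(i) = x(n + 1 - i) + y(i)
  end do
  !$OMP END DO
  !$OMP END PARALLEL
  print *, 'z(1) =', z(1)
end program nowait_sketch
```

   Removing the BARRIER would let a fast thread read x(n + 1 - i) before another thread has written it. A dependence of this kind between consecutive loops is what makes NOWAIT erroneous.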
 - Orphan4 (source):
   commands executed by only one thread:
     - all threads wait at the end
   simultaneous printing by several threads is allowed, but the output may be scrambled
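
   The directive name is not given in the notes above. The sketch below assumes the SINGLE construct, because END SINGLE has an implicit barrier and therefore matches "all threads wait at the end". The variable nthreads and the printed texts are illustrative.

```fortran
! Sketch only: assumes the SINGLE construct; names are illustrative.
program single_sketch
  use omp_lib                     ! requires OpenMP, e.g. gfortran -fopenmp
  implicit none
  integer :: nthreads
  nthreads = 0
  !$OMP PARALLEL
  !$OMP SINGLE
  ! Executed by exactly one thread of the team.
  nthreads = omp_get_num_threads()
  print *, 'team size:', nthreads
  !$OMP END SINGLE                ! implicit barrier: all threads wait here
  ! Executed by every thread; the order of the lines is not defined
  ! and the output of different threads may be mixed.
  print *, 'hello from thread', omp_get_thread_num()
  !$OMP END PARALLEL
end program single_sketch
```

   Because of the barrier at END SINGLE, the team-size line is printed before any thread reaches the second print statement.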