reshape efficiency

pizorn

2010-05-19 07:53:35 UTC

Does anybody know whether 'reshape' always makes a copy, or is
this compiler-specific?
Let's consider the following example:

program PROG
  real, dimension(:,:,:,:), allocatable :: A
  integer :: D
  D=...
  allocate( A(D,D,D,D))
  A(:,:,:,:) = ...
  call PROCA(reshape( A, (/D**2, D**2/)))
contains
  subroutine PROCA(A)
    real, dimension(:,:) :: A ! dimension(D**2, D**2)
    ! ...
  end subroutine PROCA
end program

so the question is: will reshape allocate an array and copy the
contents of A(:,:,:,:) to Acopy(:,:) and then call PROCA or simply
call PROCA with the same A, just pretending it has a different shape?

I read that this is compiler specific (sometimes the compiler doesn't
really allocate a new array but indeed uses the old one). What is the
general rule?

Richard Maine

2010-05-19 08:45:50 UTC

Post by pizorn
Does anybody know whether 'reshape' is always a copy function or is
this compiler-specific.

...

Post by pizorn
I read that this is compiler specific (sometimes the compiler doesn't
really allocate a new array but indeed uses the old one). What is the
general rule?

The general rule is that the standard doesn't talk about such things as
whether a copy is made. That is considered an implementation strategy.
As long as the compiler makes things work as the standard specifies,
then it is free to copy or not. Thus yes, it is compiler specific.

That being said, the simplest implementation strategy is to make a copy.
That will always work. The standard's description is consistent with
such implementation. Note that RESHAPE is not special in this regard. It
is just a function; the standard doesn't have some special description
of it as being different from other functions. On the other hand,
because RESHAPE is an intrinsic function, the compiler can plausibly
know enough about it to optimize away the copy in some cases. It is an
optimization, though.

Also, note that there are cases where just using the old array will give
incorrect results because of the resulting aliasing. An optimizer that
reuses the old array has to be smart enough to recognize those cases.
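
A minimal sketch of one such aliasing case, assuming a hypothetical internal procedure named update and a tiny 2x2 array (neither is from the original post). The first actual argument is the value of the RESHAPE expression, the second is the array itself. Because the first dummy is associated with an expression result rather than with a, the procedure is allowed to overwrite a and must still see the old values through flat, so a compiler that skipped the copy here would produce wrong results.

```fortran
program alias_demo
  implicit none
  real :: a(2,2)

  a = reshape([1.0, 2.0, 3.0, 4.0], [2, 2])
  ! First argument: value of the reshape expression.
  ! Second argument: the original array, which the callee modifies.
  call update(reshape(a, [4]), a)

contains

  subroutine update(flat, m)       ! hypothetical name, for illustration only
    real, intent(in)    :: flat(:)
    real, intent(inout) :: m(:,:)

    m = 0.0                        ! overwrite the original array
    ! flat holds the value of the expression, so it must be unchanged.
    if (any(flat /= [1.0, 2.0, 3.0, 4.0])) stop 1
    print *, 'flat still holds the original values'
  end subroutine update

end program alias_demo
```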

If you are concerned about performance, the safest thing to do is to
assume that a copy will be made. That might not be so in all situations
for all compilers, but it is "safe" in the sense of helping you avoid
potential performance hits.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

pizorn

2010-05-19 09:59:25 UTC

Thank you for your quick reply, this is exactly what I needed to know.

glen herrmannsfeldt

2010-05-19 10:38:25 UTC

Post by pizorn
Does anybody know whether 'reshape' is always a copy function or is
this compiler-specific.
program PROG
real, dimension(:,:,:,:),allocatable :: A
integer ::D
D=...
allocate( A(D,D,D,D))
A(:,:,:,:) = ...
call PROCA(reshape( A, (/D**2, D**2/)))
contains
subroutine PROCA(A)
real, dimension(:,:) :: A ! dimension(D**2, D**2)
! ...
end subroutine PROCA
end program
so the question is: will reshape allocate an array and copy the
contents of A(:,:,:,:) to Acopy(:,:) and then call PROCA or simply
call PROCA with the same A, just pretending it has a different shape?

Unless the compiler can determine that PROCA doesn't change the
array, I would think it would have to make a copy. As an internal
subroutine, it might be able to determine that, or if the argument
was INTENT(IN).

Post by pizorn
I read that this is compiler specific (sometimes the compiler doesn't
really allocate a new array but indeed uses the old one). What is the
general rule?

If used as an argument to an intrinsic function that is known not
to modify its arguments, users are going to expect the performance
of no copy. As an argument to SUM, for example.

-- glen

Ron Shepard

2010-05-19 15:10:36 UTC

The RESHAPE() and the TRANSPOSE() intrinsics are similar in this
respect. In some cases, such as the above, the compiler might
determine that the storage sequence of the individual elements for
the two declarations are the same and avoid the copy operations
(going in and/or returning from the subprogram). However, this
depends on such things as optimization level, so you cannot depend
on the behavior. As others have pointed out, the behavior might
also depend on the INTENT of the dummy argument.

For example, if A(:,:,:,:) were a dummy argument rather than a local
array, then the compiler could not know that the first D**2 elements
all have the same spacing, so a copy would almost certainly have to
be made by the reshape. For programmers who were adept at using
storage sequence association in their f77 codes, this was (and is)
an important restriction in using array syntax and deferred shape
declarations in the newer versions of the language.
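
As a rough illustration of the f77-style storage sequence association mentioned above, the sketch below passes the rank-4 array directly to an explicit-shape rank-2 dummy, with no RESHAPE at all. The size d = 3 and the extra argument n are assumptions made for the example. With a contiguous actual argument such as a whole allocatable array, compilers typically have no reason to copy here, although the standard does not promise that either.

```fortran
program seq_assoc
  implicit none
  real, allocatable :: a(:,:,:,:)
  integer :: d

  d = 3                            ! hypothetical size for the example
  allocate(a(d, d, d, d))
  a = 1.0

  ! The rank-4 actual is sequence associated with the rank-2 dummy.
  call proca(a, d**2)

contains

  subroutine proca(m, n)
    integer, intent(in) :: n
    real, intent(in)    :: m(n, n) ! explicit shape, not assumed shape (:,:)

    print *, shape(m), sum(m)
  end subroutine proca

end program seq_assoc
```

The price is that the dummy must be explicit-shape (or assumed-size), so the extents have to be passed along by hand.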

Post by pizorn
I read that this is compiler specific (sometimes the compiler doesn't
really allocate a new array but indeed uses the old one). What is the
general rule?

You can tell what is happening in any specific case by using the
nonstandard, but common, LOC() function. Or c_loc() with compilers
that support the c interop stuff in f2003. Check the address of one
of the array elements of the actual and the dummy arguments that are
associated through the call. If they are the same, then your
compiler managed to avoid a copy (that time). If the addresses are
different, then a copy was made.
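
A small sketch of that address check using the standard c_loc and c_associated from iso_c_binding, assuming d = 3 and a hypothetical extra argument that carries the address of the actual array into the procedure. It passes a single array element to c_loc rather than the whole array. The result depends on the compiler and optimization level, so no particular outcome is claimed.

```fortran
program check_copy
  use, intrinsic :: iso_c_binding, only: c_ptr, c_loc, c_associated
  implicit none
  real, allocatable, target :: a(:,:,:,:)
  integer :: d

  d = 3                            ! hypothetical size for the example
  allocate(a(d, d, d, d))
  a = 1.0

  ! Hand the address of the first element of a to the callee for comparison.
  call proca(reshape(a, [d**2, d**2]), c_loc(a(1,1,1,1)))

contains

  subroutine proca(m, addr_actual)
    real, intent(in), target :: m(:,:)
    type(c_ptr), intent(in)  :: addr_actual

    ! Compare element addresses, not whole arrays.
    if (c_associated(addr_actual, c_loc(m(1,1)))) then
      print *, 'same address: no copy was made this time'
    else
      print *, 'different address: a copy was made'
    end if
  end subroutine proca

end program check_copy
```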

One of the new features of f2008 (which I don't use yet, so I may
get some details wrong) is that you can assign pointers of higher
rank to 1-D arrays. The 1-D array will always have a fixed spacing
between the elements, allowing storage association magic to work
again. This would allow you, for example, to allocate the array as
A(D**4), and assign A2(:,:) or A4(:,:,:,:) pointers to that array.
Those pointers could then be used in the PROCA() call in the usual
way, avoiding the array copies in many cases. The downside, of
course, is that using the pointer arrays and the target attributes
suppresses many of the optimizations that are allowed otherwise, so
the programmer must be aware of the tradeoffs if efficiency is
important. In the worst case, your fortran program could end up
being as slow as a C program, and we wouldn't want that, right? :-)

$.02 -Ron Shepard

Richard Maine

2010-05-19 16:02:07 UTC

Post by Ron Shepard
You can tell what is happening in any specific case by using the
nonstandard, but common, LOC() function.

But be aware of a caveat of such use. I recall one case here where
someone reported what seemed to me to be surprising behavior in regards
to when their compiler(s) made a copy of an argument (not related to
reshape in that case, but rather to assumed shape dummy arguments). They
were using a LOC function to determine whether or not a copy had been
made.

It turned out that they drew a completely incorrect conclusion, pretty
much exactly the opposite of what was actually happening. They were
misled because the call to LOC involved a COPY in some cases. They
hadn't realized that was where the copy was happening. I forget all the
details involved; it might be that they were using some non-intrinsic
version of LOC that the compiler didn't know to treat specially.

But I do recall them having been seriously misled. I recall that
because their reported results were extremely surprising to me (and it
was a "regular" here who is pretty reliable).

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

Ron Shepard

2010-05-19 16:56:17 UTC

Post by Richard Maine
It turned out that they drew a completely incorrect conclusion, pretty
much exactly the opposite of what was actually happening. They were
misled because the call to LOC involved a COPY in some cases.

Yes, I think I remember that too. I think the workaround to get the
correct result was to pass an individual array element to the LOC()
function, not the entire array.

$.02 -Ron Shepard

James Van Buskirk

2010-05-19 17:00:44 UTC

Post by Ron Shepard
For example, if A(:,:,:,:) were a dummy argument rather than a local
array, then the compiler could not know that the first D**2 elements
all have the same spacing, so a copy would almost certainly have to
be made by the reshape. For programmers who were adept at using
storage sequence association in their f77 codes, this was (and is)
an important restriction in using array syntax and deferred shape
declarations in the newer versions of the language.

There is also the issue about what happens when the ORDER optional
argument is present.
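
To make the ORDER point concrete, here is a short sketch with a made-up six-element source. Without ORDER the result has the same element sequence as the source, so reusing the storage is at least conceivable. With order=[2,1] the result is filled row by row, its storage sequence differs from the source, and the elements have to be rearranged somewhere.

```fortran
program order_demo
  implicit none
  integer :: src(6), plain(2,3), swapped(2,3)

  src = [1, 2, 3, 4, 5, 6]

  plain   = reshape(src, [2, 3])                 ! column-major fill
  swapped = reshape(src, [2, 3], order=[2, 1])   ! row-major fill

  ! Flattening shows the storage sequence of each result.
  print *, 'plain:  ', reshape(plain,   [6])     ! same sequence as src
  print *, 'swapped:', reshape(swapped, [6])     ! 1 4 2 5 3 6

  if (any(reshape(plain,   [6]) /= src)) stop 1
  if (any(reshape(swapped, [6]) /= [1, 4, 2, 5, 3, 6])) stop 2
end program order_demo
```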

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

Jim Xia

2010-05-20 03:45:17 UTC

Post by Ron Shepard
One of the new features of f2008 (which I don't use yet, so I may
get some details wrong) is that you can assign pointers of higher
rank to 1-D arrays. The 1-D array will always have a fixed spacing
between the elements, allowing storage association magic to work
again. This would allow you, for example, to allocate the array as
A(D**4), and assign A2(:,:) or A4(:,:,:,:) pointers to that array.
Those pointers could then be used in the PROCA() call in the usual
way, avoiding the array copies in many cases.

I assume you mean code like this?

program PROG
  real, dimension(:,:,:,:), allocatable :: A
  real, dimension(:), pointer :: B
  real, dimension(:,:), pointer :: C
  integer :: D
  D=...
  allocate( b(D**4))
  b(:) = ...
  c (1:d**2, 1:d**2) => b
  call PROCA(c)
contains
  subroutine PROCA(A)
    real, dimension(:,:) :: A ! dimension(D**2, D**2)
    ! ...
  end subroutine PROCA
end program

This is an F2003 feature, not an F08 feature. If D is large and reshape
becomes expensive, then this pointer approach could save both memory
and time. XLF has supported this feature since V11.1.
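
For anyone who wants to try it, a filled-in runnable variant of the same idea follows, assuming D = 2 and made-up data. It only demonstrates that the rank-2 pointer is a view of the rank-1 storage: a write through c shows up in b, so the remapping itself involves no copy.

```fortran
program remap_demo
  implicit none
  real, dimension(:),   pointer :: b
  real, dimension(:,:), pointer :: c
  integer :: d, i

  d = 2                            ! hypothetical size for the example
  allocate(b(d**4))
  b = [(real(i), i = 1, d**4)]

  ! Rank-remapping pointer assignment: c views b as a d**2 by d**2 matrix.
  c(1:d**2, 1:d**2) => b

  c(2, 1) = -1.0                   ! element (2,1) is the 2nd element of b
  if (b(2) /= -1.0) stop 1

  print *, 'c and b share storage; shape(c) =', shape(c)
  deallocate(b)
end program remap_demo
```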

Post by Ron Shepard
The downside, of
course, is that using the pointer arrays and the target attributes
suppresses many of the optimizations that are allowed otherwise, so
the programmer must be aware of the tradeoffs if efficiency is
important. In the worst case, your fortran program could end up
being as slow as a C program, and we wouldn't want that, right? :-)

Not generally the case. For compilers, Fortran pointers are C-like
structures that maintain the internal bookkeeping of the memory. The
use of pointers itself shouldn't prevent optimizations since Fortran
has rigid rules against "invalid aliasing" and good compilers should
have sufficient information on the usages of all the pointers. What's
bad about pointers is the manual memory management by users, and
memory leaks are a major source of concern in large-scale
applications. Allocatables are a far more user-friendly language
feature. However, to the compiler, optimizing allocatables is just as
hard as optimizing pointers most of the time. Under the hood they're
very much similar, except compilers have to actively manage memory for
allocatables. The only advantage of allocatables is that they are known
to be contiguous. F08's CONTIGUOUS attribute may give pointers a
performance boost.
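
A tiny sketch of what that looks like, assuming a compiler that supports the F08 CONTIGUOUS attribute and the is_contiguous intrinsic. The array name and sizes are invented for the example. The attribute is a promise to the compiler that the pointer is only ever associated with contiguous targets.

```fortran
program contiguous_demo
  implicit none
  real, target              :: storage(16)
  real, pointer, contiguous :: p(:,:)

  storage = 0.0

  ! A rank-1 target is contiguous, so this association honors the promise.
  p(1:4, 1:4) => storage

  if (.not. is_contiguous(p)) stop 1
  print *, 'p is contiguous, shape =', shape(p)
end program contiguous_demo
```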

Cheers,

Jim

Richard Maine

2010-05-20 05:49:42 UTC

Post by Jim Xia
The
use of pointers itself shouldn't prevent optimizations since Fortran
has rigid rules against "invalid aliasing" and good compilers should
have sufficient information on the usages of all the pointers.

Except that most of those Fortran rules about "invalid aliasing" are for
non-pointers. If both sides of an alias are pointers, then that usually
makes it "valid aliasing."

Admittedly, if you take a pointer and pass it as an actual argument to a
non-pointer dummy, then the non-pointer rules do come into play.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain