Skip to content

Conversation

@tyzhang1993
Copy link
Contributor

@tyzhang1993 tyzhang1993 commented Sep 10, 2018

This PR optimizes the algorithm to choose the best contraction permutation.

Summary

  • Corrected error in computation of cpu_cost.
  • Modify the algorithm to consider permutation cost.
  • Avoid avoidable permutation of intermediate tensors in contraction.
  • Avoid permuting PA, PB, PC in contraction if they are already the outer most indices.
  • Avoid double permutation of the same tensor in the batched algorithm.
  • Taking care of permutation symmetry.

Detail and Examples

Corrected error in computation of cpu_cost.

Example: C2["ijrs"] += B["gar"] * B["gbs"] * T2["ijab"]
Dimensions: i(4), j(8), r(60), s(8), g(280), a(60), b(60)
The two ways of contraction permutations are:

  1. C2[ijrs]+={$[arbs]=B[gar]*B[gbs]}*T2[ijab]
    Original code will give cpu_cost = g(ar+bs) + ab(ij+rs)= 2,985,600
    Corrected code will give cpu_cost = garbs + abijrs= 539,136,000 FLOPs

  2. C2[ijrs]+={$[ijags]=T2[ijab]*B[gbs]}*B[gar]
    Original code will give cpu_cost = b(ija+gs) + ga(ijs+r)= 5,558,400
    Corrected code will give cpu_cost = bijags + gaijsr= 516,096,000 FLOPs

In this situation, the original code will choose contraction permutation 1, but the corrected code will choose 2.
Note here that in the original code, cpu_cost actually give the memory cost of all the contracted tensors. (eg. in 1, the total number of elements in $[arbs], B[gar], B[gbs], T2[ijab] is 2,985,600)

Avoid avoidable permutation of intermediate tensors in contraction.

Example: D["efgh"] = A["abdc"] * B["cdef"] * C["bagh"]
Assuming the optimizer decide to contract A and B first, then the result with C.
According to the current implementation, the intermediated tensor will be decided to be $[abef], and the contraction will be performed in two steps:
(1) $[abef] = A["abdc"] * B["cdef"]
(2) D["efgh"] = $[abef] * C["bagh"]
In case the size of A["abdc"] < B["cdef"], $[abef] < C["bagh"]:
In (1), A["abdc"] will be permuted to A["abcd"], then in (2), $[abef] will be permuted to $[baef].
Actually, the latter can be avoided if the intermediate tensor is decided to be $[baef]:
(1) $[baef] = A["abdc"] * B["cdef"]
(2) D["efgh"] = $[baef] * C["bagh"]
The computational cost of the permutation from A["abdc"] to A["abcd"] is the same as permutation to A["bacd"].
The computational cost of the intermediate permutation can be avoided by a wiser decision on the index order of the intermediate tensor.

Avoid permuting PA, PB, PC in contraction if they are already the outer most indices.

Example: C["abij"] = A["baik"] * B["abkj"]
Since a and b are Hadamard indices and are in different orders in C["abij"] and A["baik"], the current implementation will permute A["baik"] to A["abik"] first.
However, this permutation can be avoided by locating the correct ik batch in A when looping over Hadamard indices rather than explicitly permute them.

NOTE: Currently, the decision of permuting regarding P is after the decision of i,j,k, but this should be moved before the decision of i,j,k permutation.

Avoid double permutation of the same tensor in the batched algorithm.

Example: C["abcd"] = batched("a", A["eadf"] * B["efbc"])
In the current implementation, A["eadf"] will be first permute to A["aedf"]. Then in the contraction, each batch A["edf"] need to be permuted again to A["def"].
The latter can actually be avoided if in the first permutation A["eadf"] is permuted to A["adef"].

Status

  • Ready to go.

@jturney
Copy link
Owner

jturney commented Sep 11, 2018

Do you have any timings that support this? I'm just curious.

@tyzhang1993
Copy link
Contributor Author

tyzhang1993 commented Sep 11, 2018

@jturney I have a little. I will definitely add some benchmarks in the PR.
This PR is intended to be one investigating more ways to promote the performance. For example, some permutations may have the same CPU_cost and memory_cost by the current code, but they may experience different numbers of index permutations which is not taken into account in the current algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants