Add padding #199
Conversation
I think this is it. Another thing I’ve removed for now is the check for the periodicity boolean tensor. Previously, we allowed only 2D and 3D tensors; to enable batching, I removed these checks, so a 1D or 0D tensor is no longer rejected.

Note: batching is now allowed only for systems of the same type: either all systems are PBC or all are non-PBC. @sirmarcel probably implemented something fancier in the JAX PME version, but I think it’s impossible to do here without creating a separate calculator. Training on mixed PBC systems would therefore require two dataloaders, which I think is fine.
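The two-dataloader approach for mixed PBC datasets could look roughly like this; the `System` container and its `pbc` flag are illustrative stand-ins, not the torch-pme API:

```python
# Hypothetical sketch: train on mixed PBC data by partitioning into two
# dataloaders, one for periodic and one for non-periodic systems, so every
# batch is homogeneous. `System` here is a toy container, not torch-pme API.
from dataclasses import dataclass

import torch
from torch.utils.data import DataLoader


@dataclass
class System:
    positions: torch.Tensor  # (n_atoms, 3)
    pbc: bool  # True if the system is periodic


systems = [
    System(torch.rand(4, 3), pbc=True),
    System(torch.rand(6, 3), pbc=False),
    System(torch.rand(5, 3), pbc=True),
]

# Partition by periodicity so each batch contains only one kind of system.
periodic = [s for s in systems if s.pbc]
non_periodic = [s for s in systems if not s.pbc]

# A custom collate_fn keeps the systems as a plain list per batch.
loader_pbc = DataLoader(periodic, batch_size=2, collate_fn=lambda b: b)
loader_free = DataLoader(non_periodic, batch_size=2, collate_fn=lambda b: b)

print(len(periodic), len(non_periodic))  # 2 1
```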
PicoCentauri
left a comment
Very nice!
I think we should add a tutorial showing how the batching works in a very simple training setup?
tests/calculators/test_padding.py
Outdated
    pair_mask,
    kvectors,
)
batched_time = time.time() - start_batched
A timing assertion like this tends to fail intermittently. Maybe we do it over 5 runs and average?
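Averaging the measurement could be sketched like this; `timed_average` and the workload are placeholders, not part of the test suite:

```python
# Sketch of making the timing test less flaky: average the elapsed wall-clock
# time over several runs instead of asserting on a single measurement.
import time


def timed_average(fn, n_runs=5):
    """Return the mean wall-clock time of ``fn`` over ``n_runs`` calls."""
    total = 0.0
    for _ in range(n_runs):
        start = time.time()
        fn()
        total += time.time() - start
    return total / n_runs


# Placeholder workload standing in for the batched potential evaluation.
batched_time = timed_average(lambda: sum(i * i for i in range(10_000)))
```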
I think we should also add tests for the potentials?
Resolved most of the comments; in the next commit, I will introduce an example script.
PicoCentauri
left a comment
Amazing. I just have some comments regarding the example. Maybe also update the changelog?
examples/12-padding-example.py
Outdated
This example demonstrates how to compute Ewald potentials for a batch of systems
with different numbers of atoms using padding. The idea is to pad atomic positions,
charges, and neighbor lists to the same length and use masks to ignore padded entries
during computation.
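A minimal sketch of that padding scheme; the `pad_batch` helper below is hypothetical, not part of torch-pme:

```python
# Sketch of the padding idea from the example's docstring: pad per-system
# positions and charges to a common length and build a boolean mask so the
# padded entries can be ignored later. `pad_batch` is illustrative only.
import torch


def pad_batch(positions_list, charges_list):
    n_max = max(p.shape[0] for p in positions_list)
    batch = len(positions_list)
    positions = torch.zeros(batch, n_max, 3)
    charges = torch.zeros(batch, n_max, 1)
    mask = torch.zeros(batch, n_max, dtype=torch.bool)
    for i, (p, q) in enumerate(zip(positions_list, charges_list)):
        n = p.shape[0]
        positions[i, :n] = p  # real atoms go first ...
        charges[i, :n] = q
        mask[i, :n] = True  # ... and the mask marks them as valid
    return positions, charges, mask


pos, q, mask = pad_batch(
    [torch.rand(3, 3), torch.rand(5, 3)],
    [torch.rand(3, 1), torch.rand(5, 1)],
)
print(tuple(pos.shape), mask.sum(dim=1))  # (2, 5, 3) tensor([3, 5])
```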
Maybe we should mention that batching with padding will speed up training a model. It should be well known, but it doesn't hurt to say it again.
We could also show a speed comparison between the batched and the looped implementation...
Might be nice :-)
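Such a comparison could be sketched like this, using a generic masked pairwise sum as a stand-in for the real Ewald calculator (all function names here are illustrative):

```python
# Sketch of a looped-vs-batched comparison. A simple masked 1/r pairwise sum
# stands in for the actual potential; with few small systems the loop can win,
# which matches the observation that five systems were needed before batching
# paid off.
import time

import torch


def looped_energy(systems):
    # One system at a time: no padding needed.
    out = []
    for pos, q in systems:
        d = torch.cdist(pos, pos)
        inv = torch.where(d > 0, 1.0 / d, torch.zeros_like(d))
        out.append(0.5 * (q[:, None] * q[None, :] * inv).sum())
    return torch.stack(out)


def batched_energy(positions, charges, mask):
    # Padded batch: the pair mask zeroes out padded rows and columns.
    d = torch.cdist(positions, positions)
    inv = torch.where(d > 0, 1.0 / d, torch.zeros_like(d))
    pair_mask = mask[:, :, None] & mask[:, None, :]
    qq = charges[:, :, None] * charges[:, None, :]
    return 0.5 * (qq * inv * pair_mask).sum(dim=(1, 2))


systems = [(torch.rand(n, 3), torch.rand(n)) for n in (3, 5, 4, 6, 2)]

# Pad positions and charges to the largest system.
n_max = max(p.shape[0] for p, _ in systems)
B = len(systems)
positions = torch.zeros(B, n_max, 3)
charges = torch.zeros(B, n_max)
mask = torch.zeros(B, n_max, dtype=torch.bool)
for i, (p, q) in enumerate(systems):
    positions[i, : len(p)] = p
    charges[i, : len(p)] = q
    mask[i, : len(p)] = True

# Both paths must agree on the energies.
e_loop = looped_energy(systems)
e_batch = batched_energy(positions, charges, mask)
assert torch.allclose(e_loop, e_batch, atol=1e-5)

# Crude timing over repeated calls; which one wins depends on batch size.
start = time.time()
for _ in range(50):
    looped_energy(systems)
loop_t = time.time() - start

start = time.time()
for _ in range(50):
    batched_energy(positions, charges, mask)
batch_t = time.time() - start
print(f"looped: {loop_t:.4f}s, batched: {batch_t:.4f}s")
```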
I had to add 3 more systems to yield 5 in total in the example; otherwise the looped implementation is faster :)
PicoCentauri
left a comment
Awesome. Can you update the changelog so we can merge this beauty!
At some point, if we want to proceed with batching, we need to introduce padding to the inputs. This is the first commit that introduces initial padding only for the direct calculator and the Coulomb potential.
Contributor (creator of pull-request) checklist
Reviewer checklist
📚 Documentation preview 📚: https://torch-pme--199.org.readthedocs.build/en/199/