Apparently this can't be done with plain <audio> tags alone. A library like Gapless-5 would probably be a good starting point, though sadly its API won't fit nicely into our system: https://github.com/regosen/Gapless-5
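
For reference, here is a rough sketch of what going beyond <audio> tags could look like. It assumes the goal is gapless playback between consecutive tracks (which the linked project's name suggests) and uses the Web Audio API's sample-accurate `start(when)` scheduling. This is not Gapless-5's API. The names `startTimes`, `scheduleGapless`, `leadTime`, and the `*Like` interfaces are hypothetical and exist only for illustration.

```typescript
// Minimal structural types so the sketch type-checks without DOM typings.
// They are assumptions modelled on the Web Audio API's AudioBuffer,
// AudioBufferSourceNode and AudioContext; they are not part of any library.
interface BufferLike {
  duration: number; // length of the decoded audio, in seconds
}
interface SourceLike {
  buffer: BufferLike | null;
  connect(destination: unknown): unknown;
  start(when?: number): void;
}
interface ContextLike {
  currentTime: number;
  destination: unknown;
  createBufferSource(): SourceLike;
}

// Compute back-to-back start times: each track begins exactly where the
// previous one ends, so no silence is scheduled between them.
function startTimes(durations: number[], firstStart: number): number[] {
  const times: number[] = [];
  let t = firstStart;
  for (const d of durations) {
    times.push(t);
    t += d;
  }
  return times;
}

// Schedule every buffer up front on the audio clock instead of waiting for
// an "ended" event, which is where the audible gap usually creeps in.
// leadTime (hypothetical default) gives the audio thread a moment to start.
function scheduleGapless(
  ctx: ContextLike,
  buffers: BufferLike[],
  leadTime = 0.1
): number[] {
  const times = startTimes(
    buffers.map((b) => b.duration),
    ctx.currentTime + leadTime
  );
  buffers.forEach((buffer, i) => {
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(ctx.destination);
    source.start(times[i]);
  });
  return times;
}
```

In a browser, `ctx` would be a real `AudioContext` and the buffers would come from `decodeAudioData`. That means each track is fully decoded into memory, which is probably part of why a drop-in replacement for <audio> is awkward.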