Skip to content

Conversation

@sisp
Copy link
Member

@sisp sisp commented Oct 28, 2025

I have introduced a new update algorithm that is centered around the high-level git merge command, performing a 3-way merge of a fresh project based on the new template version into the current project with the common ancestor being a fresh project based on the old template version. This is – in a nutshell – what the previous update algorithm was doing using a sequence of low-level Git commands and other hacks. With this refactoring, I'm hoping to simplify the update algorithm implementation, reduce the risk of missed edge cases by relying on a battle-tested high-level command, and paving the way for new opportunities like improving merge conflict handling through custom merge drivers like https://mergiraf.org and supporting an update algorithm variant that does not rely on replaying a fresh project based on the old template version (see #1170).

I have marked this PR as a draft because this is a significant (and breaking, see below for more details) change that needs to be thoroughly reviewed and shouldn't be rushed.

According to my research, Copier's update algorithm has a long history, starting with #106 and evolving from a 2-way merge to a 3-way merge with merge conflicts stored first in .rej files and later also available as inline markers. Many edge cases have shown up along the way and their fixes contributed to the current complexity. At some point, I realized that the current algorithm is essentially a 3-way merge, implemented with a sequence of low-level commands and hacks. But git merge also performs a 3-way merge with just a single command, given that the Git graph contains three commits with the appropriate relationship. This Git graph does not exist naturally in our case, as a Copier template needs to be instantiated before updating a project, and these template instances are not tracked via Git. But it is possible to construct this relationship in a synthetic Git graph which consists of the three commits necessary to perform the 3-way merge. git merge can be applied on this synthetic graph, and with some tricks it looks like a familiar merge. Thanks to using plain git merge, delicate Git index manipulation to induce a mid-merge state for merge conflicts, especially to enable visual merge conflicts in IDEs like VS Code, becomes obsolete, aborting a Copier update becomes a simple git merge --abort command, and edge cases like add/add merge conflicts are handled without special treatment. In a way, copier copy is like git clone, and copier update is like git merge – but extended to handling parametrized starters with some extra sugar.

This PR introduces a few breaking changes:

  • git merge can only produce inline conflict markers but no .rej files, so I've removed the conflict setting. I suppose that nobody uses .rej files anymore, as inline conflict markers have been the default setting for a while and are familiar to Git users.
  • git merge does not support a configurable context size, so I've removed the context_lines setting.
  • git merge does not support full customization of the conflict maker labels. To compensate for this limitation at least a little bit, a temporary ref copier/after-updating points to the commit of the fresh project based on the new template version, which is used as the conflict marker label for "incoming" changes. The conflict marker label for the current state is HEAD.
  • .gitattributes settings affect Copier's internal git merge call. This may have (unexpected) side effects which are difficult to anticipate. But it may be exactly what we want, glancing at a custom merge driver like https://mergiraf.org again.

If we decide to (eventually) merge this PR, I suggest to slip in deprecation warnings first, release them, and wait for a little while before creating a new major release to give Copier users a smoother transition experience.

WDYT, @copier-org/maintainers?

Additionally looping in some contributors who have been active around the update algorithm: @tpluscode @tguillemot @barrywhart @thurse93

@codecov
Copy link

codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.20%. Comparing base (de88305) to head (0bd952f).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2376      +/-   ##
==========================================
- Coverage   97.24%   97.20%   -0.05%     
==========================================
  Files          55       55              
  Lines        6247     6156      -91     
==========================================
- Hits         6075     5984      -91     
  Misses        172      172              
Flag Coverage Δ
unittests 97.20% <100.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sisp sisp force-pushed the refactor/new-update-algorithm branch from c173f5c to 27e3597 Compare October 28, 2025 14:20
Copy link
Member

@yajo yajo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty amazed with the new algorithm. I have not tested it yet, but with reduced low-level plumbing and half of the lines, all tests pass! At least, except the ones you're inherently breaking.

I suggest to add BREAKING CHANGE: trailers to the git commit message listing specific things that break, so they appear in the changelog once released.

Thank you! ❤️

@sisp sisp force-pushed the refactor/new-update-algorithm branch 4 times, most recently from 0b9745c to 36f5b80 Compare October 28, 2025 20:11
@sisp
Copy link
Member Author

sisp commented Oct 28, 2025

Thanks for your prompt review and the most positive feedback, @yajo! 🙇 🙏

I've added BREAKING CHANGE trailers to the commit message as you suggested. I've also found a few internal helpers that are now obsolete and removed them. And I've fixed the failing tests on Windows; it turns out that trailing whitespaces in filenames are stripped on Windows, so the two previously failing parametrizations don't make much sense on Windows and are now skipped there.

BREAKING CHANGE: The `-o,--conflict` flag is removed, as `git merge` only supports inline conflicts.
BREAKING CHANGE: The `-c,--context-lines` flag is removed, as `git merge` does not support configurable context sizes.
BREAKING CHANGE: Inline conflict marker labels are different, as `git merge` does not support their customization.
BREAKING CHANGE: `gitattributes` settings affect Copier's internal `git merge` call.
@sisp sisp force-pushed the refactor/new-update-algorithm branch from 36f5b80 to 0bd952f Compare October 28, 2025 20:15
@barrywhart
Copy link
Contributor

Sounds great to me! I contributed the PR about 3 years ago which initially added support for inline conflict markers, for the same reasons you mention -- it's the default with git. In fact, I had never seen an .rej file before using copier.

No concerns at all about removing the context setting either -- I've never used it.

Simpler code sounds cool -- git can do so many surprising things if you know how to approach it.

@pawamoy
Copy link
Contributor

pawamoy commented Oct 29, 2025

Yep, so cool that we can get rid of the index manipulation hacks 👏 🚀 Didn't fully check the code yet but it looks much cleaner 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants