bid-merge.txt
115 lines
| 5.3 KiB
| text/plain
|
TextLexer
r45619 | Bid merge is a feature introduced in Mercurial 3.0, a merge algorithm for | |||
dealing with complicated merges. | ||||
Bid merge is controled by the `merge.preferancestor` configuration option. The | ||||
default is set to `merge.preferancetors=*` and enable bid merge. Mercurial will | ||||
perform a bid merge in the cases where a merge otherwise would emit a note: | ||||
using X as ancestor of X and X message. | ||||
Problem it is solving | ||||
===================== | ||||
Mercurial's core merge algorithm is the traditional "three-way merge". This | ||||
algorithm combines all the changes in two changesets relative to a common | ||||
ancestor. But with complex DAGs, it is often possible to have more than one | ||||
"best" common ancestor, with no easy way to distinguish between them. | ||||
For example, C and D has 2 common ancestors in the following graph:: | ||||
C D | ||||
|\ /| | ||||
| x | | ||||
|/ \| | ||||
A B | ||||
\ / | ||||
R | ||||
Mercurial used to arbitrarily chooses the first of these, which can result in | ||||
various issues: | ||||
* unexpected hard 3-way merges that would have been completely trivial if | ||||
another ancestor had been used | ||||
* conflicts that have already been resolved may reappear | ||||
* changes that have been reversed can silently oscillate | ||||
One common problem is a merge which with the "right" ancestor would be trivial | ||||
to resolve because only one side changed. Using another ancestor where the same | ||||
lines are different, it will give an annoying 3-way merge. | ||||
Other systems like Git have attacked some of these problems with a so-called | ||||
"recursive" merge strategy, that internally merges all the possible ancestors | ||||
to produce a single "virtual" ancestor to merge against. This is awkward as the | ||||
internal merge itself may involve conflicts (and possibly even multiple levels | ||||
of recursion), which either requires choosing a conflict disposition (e.g. | ||||
always choose the local version) or exposing the user to extremely confusing | ||||
merge prompts for old revisions. Generating the virtual merge also potentially | ||||
involves invoking filters and extensions. | ||||
Concept | ||||
======= | ||||
(Bid merge is pretty much the same as Consensus merge.) | ||||
Bid merge is a strategy that attempts to sensibly combine the results of the | ||||
multiple possible three-way merges directly without producing a virtual | ||||
ancestor. The basic idea is that for each ancestor, we perform a top-level | ||||
manifest merge and generate a list of proposed actions, which we consider | ||||
"bids". We then make an "auction" among all the bids for each file and pick the | ||||
most favourable. Some files might be trivial to merge with one ancestor, other | ||||
files with another ancestor. | ||||
The most obvious advantage of considering multiple ancestors is the case where | ||||
some of the bids for a file is a "real" (interactive) merge but where one or | ||||
more bids just take on of the parent revisions. A bid for just taking an | ||||
existing revision is very simple and low risk and is an obvious winner. | ||||
The auction algorithm for merging the bids is so far very simple: | ||||
* If there is consensus from all the ancestors, there is no doubt what to do. A | ||||
clever result will be indistinguishable from just picking a random bid. The | ||||
consensus case is thus not only trivial, it is also already handled | ||||
perfectly. | ||||
* If "keep local" or "get from other" actions is an option (and there is only | ||||
one such option), just do it. | ||||
* If the auction doesn't have a single clear winner, pick one of the bids | ||||
"randomly" - just as it would have done if only one ancestor was considered. | ||||
This meta merge algorithm has room for future improvements, especially for | ||||
doing better than picking a random bid. | ||||
Some observations | ||||
================= | ||||
Experience with bid merge shows that many merges that actually have a very | ||||
simple solution (because only one side changed) only can be solved efficiently | ||||
when we start looking at file content in filemerge ... and it thus also | ||||
requires all ancestors passed to filemerge. That is because Mercurial includes | ||||
the history in filelog hashes. A file with changes that ends up not changing | ||||
the content (could be change + backout or graft + merge or criss cross merges) | ||||
still shows up as a changed file to manifestmerge. (The git data model has an | ||||
advantage here when it uses hashes of content without history.) One way to | ||||
handle that would be to refactor manifestmerge, mergestate/resolve and | ||||
filemerge so they become more of the same thing. | ||||
There is also cases where different conflicting chunks could benefit from using | ||||
multiple ancestors in filemerge - but that will require merge tools with fancy | ||||
support for using multiple ancestors in 3+-way merge. That is left as an | ||||
exercise for another day. That seems to be a case where "recursive merge" has | ||||
an advantage. | ||||
The current manifest merge actions are very low level imperative and not | ||||
symmetrical. They do not only describe how two manifests should be merged, they | ||||
also describe a strategy for changing a context from a state where it is one of | ||||
the parents to the state where it is the result of the merge with the other | ||||
parent. I can imagine that manifestmerge could be simplified (and made more | ||||
suitable for in memory merges) by separating the abstract merge actions from | ||||
the actual file system operation actions. A more clever wcontext could perhaps | ||||
also take care of some of the branchmerge special cases. | ||||
We assume that the definition of Mercurial manifest merge will make sure that | ||||
exactly the same files will be produced, no matter which ancestor is used. That | ||||
assumption might be wrong in very rare cases that really not is a problem. | ||||