@@ -5,6 +5,40 @@
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
+"""
+The algorithm works in the following way. You have two repositories: local
+and remote. They both contain a DAG of changelists.
+
+The goal of the discovery protocol is to find one set of nodes, *common*,
+the set of nodes shared by local and remote.
+
+One of the issues with the original protocol was latency: it could
+potentially require lots of roundtrips to discover that the local repo was a
+subset of remote (which is a very common case; you usually have few changes
+compared to upstream, while upstream probably had lots of development).
+
+The new protocol only requires one interface for the remote repo: `known()`,
+which, given a set of changelists, tells you whether they are present in the DAG.
+
+The algorithm then works as follows:
+
+- We will be using three sets, `common`, `missing`, `unknown`. Initially
+  all nodes are in `unknown`.
+- Take a sample from `unknown` and call `remote.known(sample)`.
+- For each node that remote knows, move it and all its ancestors to `common`.
+- For each node that remote doesn't know, move it and all its descendants
+  to `missing`.
+- Iterate until `unknown` is empty.
+
+There are a couple of optimizations. First, instead of starting with a
+random sample of `unknown`, start by sending all heads; in the case where
+the local repo is a subset, you compute the answer in one round trip.
+
+Then you can do something similar to the bisecting strategy used when
+finding faulty changesets: instead of random samples, try picking nodes
+that will maximize the number of nodes classified along with them (since
+all of their ancestors or descendants will be marked as well).
+"""
 
 from node import nullid
 from i18n import _
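The discovery loop described in the docstring above can be sketched as follows. This is a minimal illustration, not Mercurial's actual setdiscovery code: `parents`, `remote_known`, and `discover` are hypothetical names, the remote is modeled as a plain callback, and the sampling after the first heads-based round is deliberately naive (the real implementation picks samples to maximize how many nodes each answer classifies).

```python
def ancestors(node, parents):
    """All ancestors of node, including node itself."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def descendants(node, parents):
    """All descendants of node, including node itself."""
    children = {}
    for n, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(n)
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(children.get(n, ()))
    return seen

def discover(parents, heads, remote_known, sample_size=2):
    """Classify every local node as common or missing.

    parents: dict mapping each node to a tuple of its parent nodes.
    heads: the local DAG heads, sent as the first sample.
    remote_known: stand-in for the remote `known()` call; takes a list of
    nodes and returns a list of booleans.
    """
    unknown = set(parents)
    common, missing = set(), set()
    sample = list(heads)  # first round: send all heads
    while unknown:
        for node, known in zip(sample, remote_known(sample)):
            if known:
                # Remote has this node, so it has all its ancestors too.
                common |= ancestors(node, parents) & unknown
            else:
                # Remote lacks this node, so it lacks all its descendants.
                missing |= descendants(node, parents) & unknown
        unknown -= common | missing
        # Naive next sample; sorted only to keep the sketch deterministic.
        sample = sorted(unknown)[:sample_size]
    return common, missing
```

Each sampled node classifies at least itself, so `unknown` shrinks every round and the loop terminates. When the local repo is a subset of remote, every head is known, so the first round already moves everything to `common`.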