@@ -1,184 +1,184 b'' | |||||
1 | # This software may be used and distributed according to the terms of the |
|
1 | # This software may be used and distributed according to the terms of the | |
2 | # GNU General Public License version 2 or any later version. |
|
2 | # GNU General Public License version 2 or any later version. | |
3 |
|
3 | |||
4 | """advertise pre-generated bundles to seed clones |
|
4 | """advertise pre-generated bundles to seed clones | |
5 |
|
5 | |||
6 | "clonebundles" is a server-side extension used to advertise the existence |
|
6 | "clonebundles" is a server-side extension used to advertise the existence | |
7 | of pre-generated, externally hosted bundle files to clients that are |
|
7 | of pre-generated, externally hosted bundle files to clients that are | |
8 | cloning so that cloning can be faster, more reliable, and require less |
|
8 | cloning so that cloning can be faster, more reliable, and require less | |
9 | resources on the server. |
|
9 | resources on the server. | |
10 |
|
10 | |||
11 | Cloning can be a CPU and I/O intensive operation on servers. Traditionally, |
|
11 | Cloning can be a CPU and I/O intensive operation on servers. Traditionally, | |
12 | the server, in response to a client's request to clone, dynamically generates |
|
12 | the server, in response to a client's request to clone, dynamically generates | |
13 | a bundle containing the entire repository content and sends it to the client. |
|
13 | a bundle containing the entire repository content and sends it to the client. | |
14 | There is no caching on the server and the server will have to redundantly |
|
14 | There is no caching on the server and the server will have to redundantly | |
15 | generate the same outgoing bundle in response to each clone request. For |
|
15 | generate the same outgoing bundle in response to each clone request. For | |
16 | servers with large repositories or with high clone volume, the load from |
|
16 | servers with large repositories or with high clone volume, the load from | |
17 | clones can make scaling the server challenging and costly. |
|
17 | clones can make scaling the server challenging and costly. | |
18 |
|
18 | |||
19 | This extension provides server operators the ability to offload potentially |
|
19 | This extension provides server operators the ability to offload potentially | |
20 | expensive clone load to an external service. Here's how it works. |
|
20 | expensive clone load to an external service. Here's how it works. | |
21 |
|
21 | |||
22 | 1. A server operator establishes a mechanism for making bundle files available |
|
22 | 1. A server operator establishes a mechanism for making bundle files available | |
23 | on a hosting service where Mercurial clients can fetch them. |
|
23 | on a hosting service where Mercurial clients can fetch them. | |
24 | 2. A manifest file listing available bundle URLs and some optional metadata |
|
24 | 2. A manifest file listing available bundle URLs and some optional metadata | |
25 | is added to the Mercurial repository on the server. |
|
25 | is added to the Mercurial repository on the server. | |
26 | 3. A client initiates a clone against a clone bundles aware server. |
|
26 | 3. A client initiates a clone against a clone bundles aware server. | |
27 | 4. The client sees the server is advertising clone bundles and fetches the |
|
27 | 4. The client sees the server is advertising clone bundles and fetches the | |
28 | manifest listing available bundles. |
|
28 | manifest listing available bundles. | |
29 | 5. The client filters and sorts the available bundles based on what it |
|
29 | 5. The client filters and sorts the available bundles based on what it | |
30 | supports and prefers. |
|
30 | supports and prefers. | |
31 | 6. The client downloads and applies an available bundle from the |
|
31 | 6. The client downloads and applies an available bundle from the | |
32 | server-specified URL. |
|
32 | server-specified URL. | |
33 | 7. The client reconnects to the original server and performs the equivalent |
|
33 | 7. The client reconnects to the original server and performs the equivalent | |
34 | of :hg:`pull` to retrieve all repository data not in the bundle. (The |
|
34 | of :hg:`pull` to retrieve all repository data not in the bundle. (The | |
35 | repository could have been updated between when the bundle was created |
|
35 | repository could have been updated between when the bundle was created | |
36 | and when the client started the clone.) |
|
36 | and when the client started the clone.) | |
37 |
|
37 | |||
38 | Instead of the server generating full repository bundles for every clone |
|
38 | Instead of the server generating full repository bundles for every clone | |
39 | request, it generates full bundles once and they are subsequently reused to |
|
39 | request, it generates full bundles once and they are subsequently reused to | |
40 | bootstrap new clones. The server may still transfer data at clone time. |
|
40 | bootstrap new clones. The server may still transfer data at clone time. | |
41 | However, this is only data that has been added/changed since the bundle was |
|
41 | However, this is only data that has been added/changed since the bundle was | |
42 | created. For large, established repositories, this can reduce server load for |
|
42 | created. For large, established repositories, this can reduce server load for | |
43 | clones to less than 1% of original. |
|
43 | clones to less than 1% of original. | |
44 |
|
44 | |||
45 | To work, this extension requires the following of server operators: |
|
45 | To work, this extension requires the following of server operators: | |
46 |
|
46 | |||
47 | * Generating bundle files of repository content (typically periodically, |
|
47 | * Generating bundle files of repository content (typically periodically, | |
48 | such as once per day). |
|
48 | such as once per day). | |
49 | * A file server that clients have network access to and that Python knows |
|
49 | * A file server that clients have network access to and that Python knows | |
50 | how to talk to through its normal URL handling facility (typically an |
|
50 | how to talk to through its normal URL handling facility (typically an | |
51 | HTTP server). |
|
51 | HTTP server). | |
52 | * A process for keeping the bundles manifest in sync with available bundle |
|
52 | * A process for keeping the bundles manifest in sync with available bundle | |
53 | files. |
|
53 | files. | |
54 |
|
54 | |||
55 | Strictly speaking, using a static file hosting server isn't required: a server |
|
55 | Strictly speaking, using a static file hosting server isn't required: a server | |
56 | operator could use a dynamic service for retrieving bundle data. However, |
|
56 | operator could use a dynamic service for retrieving bundle data. However, | |
57 | static file hosting services are simple and scalable and should be sufficient |
|
57 | static file hosting services are simple and scalable and should be sufficient | |
58 | for most needs. |
|
58 | for most needs. | |
59 |
|
59 | |||
60 | Bundle files can be generated with the :hg:`bundle` command. Typically |
|
60 | Bundle files can be generated with the :hg:`bundle` command. Typically | |
61 | :hg:`bundle --all` is used to produce a bundle of the entire repository. |
|
61 | :hg:`bundle --all` is used to produce a bundle of the entire repository. | |
62 |
|
62 | |||
63 | :hg:`debugcreatestreamclonebundle` can be used to produce a special |
|
63 | :hg:`debugcreatestreamclonebundle` can be used to produce a special | |
64 | *streaming clone bundle*. These are bundle files that are extremely efficient |
|
64 | *streaming clone bundle*. These are bundle files that are extremely efficient | |
65 | to produce and consume (read: fast). However, they are larger than |
|
65 | to produce and consume (read: fast). However, they are larger than | |
66 | traditional bundle formats and require that clients support the exact set |
|
66 | traditional bundle formats and require that clients support the exact set | |
67 | of repository data store formats in use by the repository that created them. |
|
67 | of repository data store formats in use by the repository that created them. | |
68 | Typically, a newer server can serve data that is compatible with older clients. |
|
68 | Typically, a newer server can serve data that is compatible with older clients. | |
69 | However, *streaming clone bundles* don't have this guarantee. **Server |
|
69 | However, *streaming clone bundles* don't have this guarantee. **Server | |
70 | operators need to be aware that newer versions of Mercurial may produce |
|
70 | operators need to be aware that newer versions of Mercurial may produce | |
71 | streaming clone bundles incompatible with older Mercurial versions.** |
|
71 | streaming clone bundles incompatible with older Mercurial versions.** | |
72 |
|
72 | |||
73 | The list of requirements printed by :hg:`debugcreatestreamclonebundle` should |
|
|||
74 | be specified in the ``requirements`` parameter of the *bundle specification |
|
|||
75 | string* for the ``BUNDLESPEC`` manifest property described below. e.g. |
|
|||
76 | ``BUNDLESPEC=none-packed1;requirements%3Drevlogv1``. |
|
|||
77 |
|
||||
78 | A server operator is responsible for creating a ``.hg/clonebundles.manifest`` |
|
73 | A server operator is responsible for creating a ``.hg/clonebundles.manifest`` | |
79 | file containing the list of available bundle files suitable for seeding |
|
74 | file containing the list of available bundle files suitable for seeding | |
80 | clones. If this file does not exist, the repository will not advertise the |
|
75 | clones. If this file does not exist, the repository will not advertise the | |
81 | existence of clone bundles when clients connect. |
|
76 | existence of clone bundles when clients connect. | |
82 |
|
77 | |||
83 | The manifest file contains a newline (\n) delimited list of entries. |
|
78 | The manifest file contains a newline (\n) delimited list of entries. | |
84 |
|
79 | |||
85 | Each line in this file defines an available bundle. Lines have the format: |
|
80 | Each line in this file defines an available bundle. Lines have the format: | |
86 |
|
81 | |||
87 | <URL> [<key>=<value>[ <key>=<value>]] |
|
82 | <URL> [<key>=<value>[ <key>=<value>]] | |
88 |
|
83 | |||
89 | That is, a URL followed by an optional, space-delimited list of key=value |
|
84 | That is, a URL followed by an optional, space-delimited list of key=value | |
90 | pairs describing additional properties of this bundle. Both keys and values |
|
85 | pairs describing additional properties of this bundle. Both keys and values | |
91 | are URI encoded. |
|
86 | are URI encoded. | |
92 |
|
87 | |||
93 | Keys in UPPERCASE are reserved for use by Mercurial and are defined below. |
|
88 | Keys in UPPERCASE are reserved for use by Mercurial and are defined below. | |
94 | All non-uppercase keys can be used by site installations. An example use |
|
89 | All non-uppercase keys can be used by site installations. An example use | |
95 | for custom properties is to use the *datacenter* attribute to define which |
|
90 | for custom properties is to use the *datacenter* attribute to define which | |
96 | data center a file is hosted in. Clients could then prefer a server in the |
|
91 | data center a file is hosted in. Clients could then prefer a server in the | |
97 | data center closest to them. |
|
92 | data center closest to them. | |
98 |
|
93 | |||
99 | The following reserved keys are currently defined: |
|
94 | The following reserved keys are currently defined: | |
100 |
|
95 | |||
101 | BUNDLESPEC |
|
96 | BUNDLESPEC | |
102 | A "bundle specification" string that describes the type of the bundle. |
|
97 | A "bundle specification" string that describes the type of the bundle. | |
103 |
|
98 | |||
104 | These are string values that are accepted by the "--type" argument of |
|
99 | These are string values that are accepted by the "--type" argument of | |
105 | :hg:`bundle`. |
|
100 | :hg:`bundle`. | |
106 |
|
101 | |||
107 | The values are parsed in strict mode, which means they must be of the |
|
102 | The values are parsed in strict mode, which means they must be of the | |
108 | "<compression>-<type>" form. See |
|
103 | "<compression>-<type>" form. See | |
109 | mercurial.exchange.parsebundlespec() for more details. |
|
104 | mercurial.exchange.parsebundlespec() for more details. | |
110 |
|
105 | |||
|
106 | :hg:`debugbundle --spec` can be used to print the bundle specification | |||
|
107 | string for a bundle file. The output of this command can be used verbatim | |||
|
108 | for the value of ``BUNDLESPEC`` (it is already escaped). | |||
|
109 | ||||
111 | Clients will automatically filter out specifications that are unknown or |
|
110 | Clients will automatically filter out specifications that are unknown or | |
112 | unsupported so they won't attempt to download something that likely won't |
|
111 | unsupported so they won't attempt to download something that likely won't | |
113 | apply. |
|
112 | apply. | |
114 |
|
113 | |||
115 | The actual value doesn't impact client behavior beyond filtering: |
|
114 | The actual value doesn't impact client behavior beyond filtering: | |
116 | clients will still sniff the bundle type from the header of downloaded |
|
115 | clients will still sniff the bundle type from the header of downloaded | |
117 | files. |
|
116 | files. | |
118 |
|
117 | |||
119 | **Use of this key is highly recommended**, as it allows clients to |
|
118 | **Use of this key is highly recommended**, as it allows clients to | |
120 | easily skip unsupported bundles. |
|
119 | easily skip unsupported bundles. If this key is not defined, an old | |
|
120 | client may attempt to apply a bundle that it is incapable of reading. | |||
121 |
|
121 | |||
122 | REQUIRESNI |
|
122 | REQUIRESNI | |
123 | Whether Server Name Indication (SNI) is required to connect to the URL. |
|
123 | Whether Server Name Indication (SNI) is required to connect to the URL. | |
124 | SNI allows servers to use multiple certificates on the same IP. It is |
|
124 | SNI allows servers to use multiple certificates on the same IP. It is | |
125 | somewhat common in CDNs and other hosting providers. Older Python |
|
125 | somewhat common in CDNs and other hosting providers. Older Python | |
126 | versions do not support SNI. Defining this attribute enables clients |
|
126 | versions do not support SNI. Defining this attribute enables clients | |
127 | with older Python versions to filter this entry without experiencing |
|
127 | with older Python versions to filter this entry without experiencing | |
128 | an opaque SSL failure at connection time. |
|
128 | an opaque SSL failure at connection time. | |
129 |
|
129 | |||
130 | If this is defined, it is important to advertise a non-SNI fallback |
|
130 | If this is defined, it is important to advertise a non-SNI fallback | |
131 | URL or clients running old Python releases may not be able to clone |
|
131 | URL or clients running old Python releases may not be able to clone | |
132 | with the clonebundles facility. |
|
132 | with the clonebundles facility. | |
133 |
|
133 | |||
134 | Value should be "true". |
|
134 | Value should be "true". | |
135 |
|
135 | |||
136 | Manifests can contain multiple entries. Assuming metadata is defined, clients |
|
136 | Manifests can contain multiple entries. Assuming metadata is defined, clients | |
137 | will filter entries from the manifest that they don't support. The remaining |
|
137 | will filter entries from the manifest that they don't support. The remaining | |
138 | entries are optionally sorted by client preferences |
|
138 | entries are optionally sorted by client preferences | |
139 | (``experimental.clonebundleprefers`` config option). The client then attempts |
|
139 | (``experimental.clonebundleprefers`` config option). The client then attempts | |
140 | to fetch the bundle at the first URL in the remaining list. |
|
140 | to fetch the bundle at the first URL in the remaining list. | |
141 |
|
141 | |||
142 | **Errors when downloading a bundle will fail the entire clone operation: |
|
142 | **Errors when downloading a bundle will fail the entire clone operation: | |
143 | clients do not automatically fall back to a traditional clone.** The reason |
|
143 | clients do not automatically fall back to a traditional clone.** The reason | |
144 | for this is that if a server is using clone bundles, it is probably doing so |
|
144 | for this is that if a server is using clone bundles, it is probably doing so | |
145 | because the feature is necessary to help it scale. In other words, there |
|
145 | because the feature is necessary to help it scale. In other words, there | |
146 | is an assumption that clone load will be offloaded to another service and |
|
146 | is an assumption that clone load will be offloaded to another service and | |
147 | that the Mercurial server isn't responsible for serving this clone load. |
|
147 | that the Mercurial server isn't responsible for serving this clone load. | |
148 | If that other service experiences issues and clients start mass falling back to |
|
148 | If that other service experiences issues and clients start mass falling back to | |
149 | the original Mercurial server, the added clone load could overwhelm the server |
|
149 | the original Mercurial server, the added clone load could overwhelm the server | |
150 | due to unexpected load and effectively take it offline. Not having clients |
|
150 | due to unexpected load and effectively take it offline. Not having clients | |
151 | automatically fall back to cloning from the original server mitigates this |
|
151 | automatically fall back to cloning from the original server mitigates this | |
152 | scenario. |
|
152 | scenario. | |
153 |
|
153 | |||
154 | Because there is no automatic Mercurial server fallback on failure of the |
|
154 | Because there is no automatic Mercurial server fallback on failure of the | |
155 | bundle hosting service, it is important for server operators to view the bundle |
|
155 | bundle hosting service, it is important for server operators to view the bundle | |
156 | hosting service as an extension of the Mercurial server in terms of |
|
156 | hosting service as an extension of the Mercurial server in terms of | |
157 | availability and service level agreements: if the bundle hosting service goes |
|
157 | availability and service level agreements: if the bundle hosting service goes | |
158 | down, so does the ability for clients to clone. Note: clients will see a |
|
158 | down, so does the ability for clients to clone. Note: clients will see a | |
159 | message informing them how to bypass the clone bundles facility when a failure |
|
159 | message informing them how to bypass the clone bundles facility when a failure | |
160 | occurs. So server operators should prepare for some people to follow these |
|
160 | occurs. So server operators should prepare for some people to follow these | |
161 | instructions when a failure occurs, thus driving more load to the original |
|
161 | instructions when a failure occurs, thus driving more load to the original | |
162 | Mercurial server when the bundle hosting service fails. |
|
162 | Mercurial server when the bundle hosting service fails. | |
163 | """ |
|
163 | """ | |
164 |
|
164 | |||
165 | from mercurial import ( |
|
165 | from mercurial import ( | |
166 | extensions, |
|
166 | extensions, | |
167 | wireproto, |
|
167 | wireproto, | |
168 | ) |
|
168 | ) | |
169 |
|
169 | |||
170 | testedwith = 'internal' |
|
170 | testedwith = 'internal' | |
171 |
|
171 | |||
172 | def capabilities(orig, repo, proto): |
|
172 | def capabilities(orig, repo, proto): | |
173 | caps = orig(repo, proto) |
|
173 | caps = orig(repo, proto) | |
174 |
|
174 | |||
175 | # Only advertise if a manifest exists. This does add some I/O to requests. |
|
175 | # Only advertise if a manifest exists. This does add some I/O to requests. | |
176 | # But this should be cheaper than a wasted network round trip due to |
|
176 | # But this should be cheaper than a wasted network round trip due to | |
177 | # missing file. |
|
177 | # missing file. | |
178 | if repo.opener.exists('clonebundles.manifest'): |
|
178 | if repo.opener.exists('clonebundles.manifest'): | |
179 | caps.append('clonebundles') |
|
179 | caps.append('clonebundles') | |
180 |
|
180 | |||
181 | return caps |
|
181 | return caps | |
182 |
|
182 | |||
183 | def extsetup(ui): |
|
183 | def extsetup(ui): | |
184 | extensions.wrapfunction(wireproto, '_capabilities', capabilities) |
|
184 | extensions.wrapfunction(wireproto, '_capabilities', capabilities) |
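Note on the manifest format described in the docstring: the sketch below shows one way a client could read `<URL> [<key>=<value>[ <key>=<value>]]` lines and drop entries it cannot use. It is a minimal illustration, not Mercurial's actual client code. The function names `parse_manifest` and `usable_entries`, the example URLs, and the choice to compare only the `<compression>-<type>` part of `BUNDLESPEC` (ignoring any `;`-separated parameters) are all assumptions made for this example.

```python
from urllib.parse import unquote


def parse_manifest(text):
    """Parse manifest text into a list of dicts, one per non-empty line.

    Hypothetical helper: the URL is stored under the "URL" key, and every
    key=value pair is URI-decoded, as the docstring says keys and values
    are URI encoded.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        entry = {"URL": fields[0]}
        for pair in fields[1:]:
            # Split on the first "=" only; an encoded "=" appears as %3D.
            key, _, value = pair.partition("=")
            entry[unquote(key)] = unquote(value)
        entries.append(entry)
    return entries


def usable_entries(entries, supported_specs, have_sni):
    """Filter out entries the client cannot use, preserving manifest order.

    Assumption: only the "<compression>-<type>" prefix of BUNDLESPEC is
    checked; parameters after ";" are ignored in this sketch.
    """
    kept = []
    for entry in entries:
        spec = entry.get("BUNDLESPEC")
        if spec is not None and spec.split(";", 1)[0] not in supported_specs:
            continue
        if entry.get("REQUIRESNI") == "true" and not have_sni:
            continue
        kept.append(entry)
    return kept


# Illustrative manifest; the hosts and the "datacenter" key are made up.
MANIFEST = (
    "https://cdn.example.com/full.hg BUNDLESPEC=gzip-v1 REQUIRESNI=true\n"
    "https://cdn.example.com/packed.hg "
    "BUNDLESPEC=none-packed1;requirements%3Drevlogv1\n"
    "https://mirror.example.com/full.hg BUNDLESPEC=bzip2-v1 datacenter=us-west\n"
)

entries = parse_manifest(MANIFEST)
usable = usable_entries(entries, {"gzip-v1", "bzip2-v1"}, have_sni=False)
for entry in usable:
    print(entry["URL"])
```

With `have_sni=False`, the first entry is skipped because it sets `REQUIRESNI=true`, and the second is skipped because `none-packed1` is not in the supported set, so only the mirror URL remains. Entries without `BUNDLESPEC` pass the filter here, which matches the docstring's warning that an old client may then attempt a bundle it cannot read.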