docs: update instructions for shared exception store in cluster setup
marcink
r3067:9b7d4cbb default
@@ -1,383 +1,395 b''
1 1 .. _scale-horizontal-cluster:
2 2
3 3
4 4 Scale Horizontally / RhodeCode Cluster
5 5 --------------------------------------
6 6
7 7 |RCE| is built in a way that supports horizontal scaling across multiple machines.
8 8 There are three main prerequisites for that:
9 9
10 10 - Shared storage that each machine can access, using NFS or another shared storage system.
11 11 - Shared DB connection across machines, using `MySQL`/`PostgreSQL` that each node can access.
12 12 - |RCE| user sessions and caches need to use shared storage (e.g. `Redis`_/`Memcached`)
13 13
14 14
15 15 Horizontal scaling means adding more machines or workers into your pool of
16 16 resources. Horizontally scaling |RCE| gives a huge performance increase,
17 17 especially under large traffic scenarios with a high number of requests.
18 18 This is very beneficial when |RCE| is serving many users simultaneously,
19 19 or if continuous integration servers are automatically pulling and pushing code.
20 20 It also adds High-Availability to your running system.
21 21
22 22
23 23 Cluster Overview
24 24 ^^^^^^^^^^^^^^^^
25 25
26 26 Below we'll present a configuration example that will use two separate nodes to serve
27 27 |RCE| in a load-balanced environment. The 3rd node will act as a shared storage/cache
28 28 and handle load-balancing. In addition, the 3rd node will be used as the shared database instance.
29 29
30 30 This setup can be used either in a Docker-based configuration or with individual
31 31 physical/virtual machines. Using the 3rd node for Storage/Redis/PostgreSQL/Nginx is
32 32 optional. All those components can be installed on one of the two nodes used for |RCE|.
33 33 We'll use the following naming for our nodes:
34 34
35 35 - `rc-node-1` (NFS, DB, Cache node)
36 36 - `rc-node-2` (Worker node1)
37 37 - `rc-node-3` (Worker node2)
38 38
39 39 Our shared NFS storage in this example is located at `/home/rcdev/storage` and
40 40 it's RW accessible on **each** node (see the mount sketch below).
41 41
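As a minimal sketch of how the worker nodes could mount that storage (assuming `rc-node-1` exports `/home/rcdev/storage` over NFS; the export itself and the mount options depend on your environment):

.. code-block:: bash

    # on rc-node-2 and rc-node-3: mount the shared storage exported by rc-node-1
    sudo mkdir -p /home/rcdev/storage
    sudo mount -t nfs rc-node-1:/home/rcdev/storage /home/rcdev/storage

    # or make it permanent via /etc/fstab (example line, adjust options to taste)
    # rc-node-1:/home/rcdev/storage  /home/rcdev/storage  nfs  defaults,noatime  0 0
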
42 42 In this example we use certain recommended components, however many
43 43 of them can be replaced by alternatives your organization may already use, for example:
44 44
45 45 - `MySQL`/`PostgreSQL`: Aren't replaceable and are the only two supported databases.
46 46 - `Nginx`_ on `rc-node-1` can be replaced by: `Hardware Load Balancer (F5)`, `Apache`_, `HA-Proxy` etc.
47 47 - `Nginx`_ on rc-node-2/3 acts as a reverse proxy and can be replaced by another HTTP server
48 48 acting as a reverse proxy, such as `Apache`_.
49 49 - `Redis`_ on `rc-node-1` can be replaced by: `Memcached`
50 50
51 51
52 52 Here's an overview of which components should be installed/set up on each server in our example:
53 53
54 54 - **rc-node-1**:
55 55
56 56 - main storage acting as NFS host.
57 57 - `nginx` acting as a load-balancer.
58 58 - `postgresql-server` used for database and sessions.
59 59 - `redis-server` used for storing shared caches.
60 60 - optionally `rabbitmq-server` for `Celery` if used.
61 61 - optionally, if `Celery` is used, an Enterprise/Community instance + VCSServer.
62 62 - optionally a mail server that can be shared by other instances.
63 63 - optionally a channelstream server to handle live communication for all instances.
64 64
65 65
66 66 - **rc-node-2/3**:
67 67
68 68 - `nginx` acting as a reverse proxy to handle requests to |RCE|.
69 69 - 1x RhodeCode Enterprise/Community instance.
70 70 - 1x VCSServer instance.
71 71 - optionally for testing connection: postgresql-client, redis-client (redis-tools).
72 72
73 73
74 74 Before we start, here are a few assumptions that should be fulfilled (a quick verification sketch follows this list):
75 75
76 76 - make sure the nodes can access each other.
77 77 - make sure `Redis`_/`MySQL`/`PostgreSQL`/`RabbitMQ`_ are running on `rc-node-1`
78 78 - make sure both `rc-node-2`/`3` can access NFS storage with RW access
79 79 - make sure rc-node-2/3 can access `Redis`_ and the `PostgreSQL`/`MySQL` database on `rc-node-1`.
80 80 - make sure `Redis`_/Database/`RabbitMQ`_ are password protected and accessible only from rc-node-2/3.
81 81
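A quick way to verify those assumptions from `rc-node-2`/`rc-node-3` before continuing (a sketch; it assumes the example `qweqwe` password and `rhodecode` database used later in this guide):

.. code-block:: bash

    # NFS storage is mounted and writable
    touch /home/rcdev/storage/.rw-test && rm /home/rcdev/storage/.rw-test

    # PostgreSQL on rc-node-1 accepts remote connections
    psql -h rc-node-1 -U postgres -d rhodecode -c 'SELECT 1;'

    # Redis on rc-node-1 answers and requires the password
    redis-cli -h rc-node-1 -a qweqwe ping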
82 82
83 83
84 84 Setup rc-node-2/3
85 85 ^^^^^^^^^^^^^^^^^
86 86
87 87 Initially, before `rc-node-1`, we'll configure both nodes 2 and 3 to operate as standalone
88 88 nodes with their own hostnames. Use the default installation settings, and use
89 89 the default local addresses (127.0.0.1) to configure VCSServer and Community/Enterprise instances.
90 90 All external connectivity will be handled by the reverse proxy (`Nginx`_ in our example).
91 91
92 92 This way we can ensure each individual host works,
93 93 accepts connections, and that we can run some operations explicitly on a chosen node.
94 94
95 95 In addition this allows us to explicitly direct certain traffic to a node, e.g.
96 96 a CI server will only call `rc-node-3` directly. This should be done similarly to a normal
97 97 installation, so check out the `Nginx`_/`Apache`_ configuration example to configure each host.
98 98 Each one should already connect to the shared database during installation.
99 99
100 100
101 101 1) Assuming our final URL will be http://rc-node-1, configure `instance_id` and `app.base_url`
102 102
103 103 a) On **rc-node-2** find the following settings and edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`
104 104
105 105 .. code-block:: ini
106 106
107 107 ## required format is: *NAME-
108 108 instance_id = *rc-node-2-
109 109 app.base_url = http://rc-node-1
110 110
111 111
112 112 b) On **rc-node-3** find the following settings and edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`
113 113
114 114 .. code-block:: ini
115 115
116 116 ## required format is: *NAME-
117 117 instance_id = *rc-node-3-
118 118 app.base_url = http://rc-node-1
119 119
120 120
121 121
122 122 2) Configure `User Session` to use a shared database. Example config that should be
123 changed on both node 2 and 3. Edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`
123 changed on both **rc-node-2** and **rc-node-3**.
124 Edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`
124 125
125 126 .. code-block:: ini
126 127
127 128 ####################################
128 129 ### BEAKER SESSION ####
129 130 ####################################
130 131
131 132 ## Disable the default `file` sessions
132 133 #beaker.session.type = file
133 134 #beaker.session.data_dir = %(here)s/data/sessions
134 135
135 136 ## use shared db-based sessions; fast, and allows easy management of logged-in users
136 137 beaker.session.type = ext:database
137 138 beaker.session.table_name = db_session
138 139 # use our rc-node-1 here
139 140 beaker.session.sa.url = postgresql://postgres:qweqwe@rc-node-1/rhodecode
140 141 beaker.session.sa.pool_recycle = 3600
141 142 beaker.session.sa.echo = false
142 143
143 144 In addition make sure both instances use the same `session.secret` so users have
144 145 persistent sessions across nodes. Please generate a different one than in this example (see the sketch below for one way to do it).
145 146
146 147 .. code-block:: ini
147 148
148 149 # use a unique generated long string
149 150 beaker.session.secret = 70e116cae2274656ba7265fd860aebbd
150 151
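One way to generate such a string (a sketch; any long random value that is identical on both nodes will do):

.. code-block:: bash

    # prints 32 random bytes as a hex string, suitable for beaker.session.secret
    openssl rand -hex 32
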
151 152 3) Configure the cache and archive cache storage to use our shared NFS on `rc-node-1`
152 153
153 154 .. code-block:: ini
154 155
155 156 # note the `_` prefix that allows using a directory without
156 157 # remap and rescan checking for vcs inside it.
157 158 cache_dir = /home/rcdev/storage/_cache_dir/data
158 159 # note archive cache dir is disabled by default, however if you enable
159 160 # it, it also needs to be shared
160 161 #archive_cache_dir = /home/rcdev/storage/_tarball_cache_dir
161 162
162 163
163 4) Change cache backends to use `Redis`_ based caches. Below full example config
164 4) Use a shared exception store. Example config that should be
165 changed on both **rc-node-2** and **rc-node-3**, and also for VCSServer.
166 Edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini` and
167 :file:`/home/{user}/.rccontrol/{vcsserver-instance-id}/vcsserver.ini`
168 and add/change the following setting. Make sure the target directory exists on the shared storage (see the sketch after the snippet).
169
170 .. code-block:: ini
171
172 exception_tracker.store_path = /home/rcdev/storage/_exception_store_data
173
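The path has to exist on the shared NFS storage and be writable by the user running |RCE| and VCSServer on every node, for example (a sketch; adjust the owner to your setup):

.. code-block:: bash

    mkdir -p /home/rcdev/storage/_exception_store_data
    chown rcdev:rcdev /home/rcdev/storage/_exception_store_data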
174
175 5) Change cache backends to use `Redis`_-based caches. Below is a full example config
164 176 that replaces the default file-based cache with shared `Redis`_ using a distributed lock.
165 177
166 178
167 179 .. code-block:: ini
168 180
169 181 #####################################
170 182 ### DOGPILE CACHE ####
171 183 #####################################
172 184
173 185 ## `cache_perms` cache settings for permission tree, auth TTL.
174 186 #rc_cache.cache_perms.backend = dogpile.cache.rc.file_namespace
175 187 #rc_cache.cache_perms.expiration_time = 300
176 188
177 189 ## alternative `cache_perms` redis backend with distributed lock
178 190 rc_cache.cache_perms.backend = dogpile.cache.rc.redis
179 191 rc_cache.cache_perms.expiration_time = 300
180 192 ## redis_expiration_time needs to be greater than expiration_time
181 193 rc_cache.cache_perms.arguments.redis_expiration_time = 7200
182 194 rc_cache.cache_perms.arguments.socket_timeout = 30
183 195 rc_cache.cache_perms.arguments.host = rc-node-1
184 196 rc_cache.cache_perms.arguments.password = qweqwe
185 197 rc_cache.cache_perms.arguments.port = 6379
186 198 rc_cache.cache_perms.arguments.db = 0
187 199 rc_cache.cache_perms.arguments.distributed_lock = true
188 200
189 201 ## `cache_repo` cache settings for FileTree, Readme, RSS FEEDS
190 202 #rc_cache.cache_repo.backend = dogpile.cache.rc.file_namespace
191 203 #rc_cache.cache_repo.expiration_time = 2592000
192 204
193 205 ## alternative `cache_repo` redis backend with distributed lock
194 206 rc_cache.cache_repo.backend = dogpile.cache.rc.redis
195 207 rc_cache.cache_repo.expiration_time = 2592000
196 208 ## redis_expiration_time needs to be greater than expiration_time
197 209 rc_cache.cache_repo.arguments.redis_expiration_time = 2678400
198 210 rc_cache.cache_repo.arguments.socket_timeout = 30
199 211 rc_cache.cache_repo.arguments.host = rc-node-1
200 212 rc_cache.cache_repo.arguments.password = qweqwe
201 213 rc_cache.cache_repo.arguments.port = 6379
202 214 rc_cache.cache_repo.arguments.db = 1
203 215 rc_cache.cache_repo.arguments.distributed_lock = true
204 216
205 217 ## cache settings for SQL queries, this needs to use memory type backend
206 218 rc_cache.sql_cache_short.backend = dogpile.cache.rc.memory_lru
207 219 rc_cache.sql_cache_short.expiration_time = 30
208 220
209 221 ## `cache_repo_longterm` cache for repo object instances, this needs to use memory
210 222 ## type backend as the objects kept are not pickle serializable
211 223 rc_cache.cache_repo_longterm.backend = dogpile.cache.rc.memory_lru
212 224 ## by default we use 96H, this is using invalidation on push anyway
213 225 rc_cache.cache_repo_longterm.expiration_time = 345600
214 226 ## max items in LRU cache, reduce this number to save memory, and expire last used
215 227 ## cached objects
216 228 rc_cache.cache_repo_longterm.max_size = 10000
217 229
218 230
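After editing the `.ini` files the instances have to be restarted for the new settings to take effect. With RhodeCode Control that could look like the following (a sketch; the instance names are examples, use the ones reported by `rccontrol status`):

.. code-block:: bash

    rccontrol restart enterprise-1
    rccontrol restart vcsserver-1
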
219 4) Configure `Nginx`_ as reverse proxy on `rc-node-2/3`:
231 6) Configure `Nginx`_ as reverse proxy on `rc-node-2/3`:
220 232 Minimal `Nginx`_ config used:
221 233
222 234
223 235 .. code-block:: nginx
224 236
225 237 ## rate limiter for certain pages to prevent brute force attacks
226 238 limit_req_zone $binary_remote_addr zone=req_limit:10m rate=1r/s;
227 239
228 240 ## custom log format
229 241 log_format log_custom '$remote_addr - $remote_user [$time_local] '
230 242 '"$request" $status $body_bytes_sent '
231 243 '"$http_referer" "$http_user_agent" '
232 244 '$request_time $upstream_response_time $pipe';
233 245
234 246 server {
235 247 listen 80;
236 248 server_name rc-node-2;
237 249 #server_name rc-node-3;
238 250
239 251 access_log /var/log/nginx/rhodecode.access.log log_custom;
240 252 error_log /var/log/nginx/rhodecode.error.log;
241 253
242 254 # example of proxy.conf can be found in our docs.
243 255 include /etc/nginx/proxy.conf;
244 256
245 257 ## serve static files by Nginx, recommended for performance
246 258 location /_static/rhodecode {
247 259 gzip on;
248 260 gzip_min_length 500;
249 261 gzip_proxied any;
250 262 gzip_comp_level 4;
251 263 gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/json application/xml application/rss+xml font/truetype font/opentype application/vnd.ms-fontobject image/svg+xml;
252 264 gzip_vary on;
253 265 gzip_disable "msie6";
254 266 #alias /home/rcdev/.rccontrol/community-1/static;
255 267 alias /home/rcdev/.rccontrol/enterprise-1/static;
256 268 }
257 269
258 270
259 271 location /_admin/login {
260 272 limit_req zone=req_limit burst=10 nodelay;
261 273 try_files $uri @rhode;
262 274 }
263 275
264 276 location / {
265 277 try_files $uri @rhode;
266 278 }
267 279
268 280 location @rhode {
269 281 # Url to running RhodeCode instance.
270 282 # This is shown as `- URL: <host>` in output from rccontrol status.
271 283 proxy_pass http://127.0.0.1:10020;
272 284 }
273 285
274 286 ## custom 502 error page. Will be displayed while RhodeCode server
275 287 ## is turned off
276 288 error_page 502 /502.html;
277 289 location = /502.html {
278 290 #root /home/rcdev/.rccontrol/community-1/static;
279 291 root /home/rcdev/.rccontrol/enterprise-1/static;
280 292 }
281 293 }
282 294
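The `include /etc/nginx/proxy.conf` line above refers to the proxy settings documented elsewhere in our docs. If you don't have that file yet, a minimal sketch could look like this (standard proxy headers only; extend it as needed):

.. code-block:: nginx

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_redirect off;
    proxy_buffering off;
    proxy_read_timeout 3600;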
283 295
284 5) Optional: Full text search, in case you use `Whoosh` full text search we also need a
296 7) Optional: Full text search. In case you use `Whoosh` full text search we also need a
285 297 shared storage for the index. In our example our NFS is mounted at `/home/rcdev/storage`
286 298 which represents our storage, so we can use the following:
287 299
288 300 .. code-block:: ini
289 301
290 302 # note the `_` prefix that allows using a directory without
291 303 # remap and rescan checking for vcs inside it.
292 304 search.location = /home/rcdev/storage/_index_data/index
293 305
294 306
295 307 .. note::
296 308
297 309 If you use ElasticSearch it's shared by default, and simply running an ES node is
298 310 cluster compatible by default.
299 311
300 312
301 6) Optional: If you intend to use mailing all instances need to use either a shared
302 mailing node, or each will use individual local mailagent. Simply put node-1/2/3 needs
303 to use same mailing configuration.
313 8) Optional: If you intend to use mailing, all instances need to use either a shared
314 mailing node, or each will use an individual local mail agent. Simply put, node-1/2/3
315 need to use the same mailing configuration.
304 316
305 317
306 318
307 319 Setup rc-node-1
308 320 ^^^^^^^^^^^^^^^
309 321
310 322
311 323 Configure `Nginx`_ as Load Balancer to rc-node-2/3.
312 324 Minimal `Nginx`_ example below:
313 325
314 326 .. code-block:: nginx
315 327
316 328 ## define rc-cluster which contains a pool of our instances to connect to
317 329 upstream rc-cluster {
318 330 # rc-node-2/3 are stored in /etc/hosts with correct IP addresses
319 331 server rc-node-2:80;
320 332 server rc-node-3:80;
321 333 }
322 334
323 335 server {
324 336 listen 80;
325 337 server_name rc-node-1;
326 338
327 339 location / {
328 340 proxy_pass http://rc-cluster;
329 341 }
330 342 }
331 343
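`rc-node-1` also has to accept the remote, password-protected `PostgreSQL` and `Redis`_ connections coming from rc-node-2/3. A minimal sketch of the relevant settings (file locations, the `10.0.0.0/24` subnet and the `qweqwe` password are examples; restrict access to your actual node addresses):

.. code-block:: ini

    # postgresql.conf: listen on all interfaces (or the cluster interface only)
    listen_addresses = '*'

    # pg_hba.conf: allow the worker nodes' subnet (example subnet) with password auth
    # host  rhodecode  all  10.0.0.0/24  md5

    # redis.conf: listen beyond localhost and require the shared password
    bind 0.0.0.0
    requirepass qweqwe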
332 344
333 345 .. note::
334 346
335 347 You should configure your load balancing accordingly. We recommend writing
336 348 load balancing rules that will separate regular user traffic from
337 349 automated process traffic like continuous integration servers or build bots. Sticky sessions
338 350 are not required. One possible approach is sketched below.
339 351
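One possible way to implement such a split on the `rc-node-1` load balancer is to route known automation addresses to a dedicated upstream (a sketch; `10.0.0.50` is a made-up CI server address):

.. code-block:: nginx

    ## pick a backend pool based on the client address
    geo $backend_pool {
        default      rc-cluster;
        10.0.0.50/32 rc-ci;
    }

    ## dedicated pool for automated traffic, e.g. only rc-node-3
    upstream rc-ci {
        server rc-node-3:80;
    }

    server {
        listen 80;
        server_name rc-node-1;

        location / {
            proxy_pass http://$backend_pool;
        }
    }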
340 352
341 353 Show which instance handles a request
342 354 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
343 355
344 356 You can easily check if load-balancing is working as expected. Visit our main node
345 357 `rc-node-1` URL which at that point should already handle incoming requests and balance
346 358 them across node-2/3.
347 359
348 360 Add a special GET param `?showrcid=1` to show the current instance handling your request.
349 361
350 362 For example: visiting the url `http://rc-node-1/?showrcid=1` will show, at the bottom
351 363 of the screen, the cluster instance info,
352 364 e.g: `RhodeCode instance id: rc-node-3-rc-node-3-3246`
353 365 which is generated from::
354 366
355 367 <NODE_HOSTNAME>-<INSTANCE_ID>-<WORKER_PID>
356 368
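From a shell this can be checked with `curl` (a sketch; the exact page markup may differ, so just look for the instance id string in the response):

.. code-block:: bash

    # repeat a few times; the reported node should alternate between rc-node-2/3
    curl -s 'http://rc-node-1/?showrcid=1' | grep -oi 'rhodecode instance id: [^<"]*'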
357 369
358 370 Using Celery with cluster
359 371 ^^^^^^^^^^^^^^^^^^^^^^^^^
360 372
361 373
362 374 If `Celery` is used we recommend also setting up an instance of Enterprise/Community + VCSServer
363 375 on the node that is running `RabbitMQ`_. Those instances will be used to execute async
364 376 tasks on `rc-node-1`. This is the most efficient setup. `Celery` usually
365 377 handles tasks such as sending emails, forking repositories, importing
366 378 repositories from external locations etc. Using workers on the instance that has
367 379 direct access to the disks used by NFS as well as to the email server gives a noticeable
368 380 performance boost. Running workers local to the NFS storage results in faster
369 381 execution when forking large repositories or sending lots of emails.
370 382
371 383 Those instances need to be configured in the same way as for other nodes.
372 384 The instance on rc-node-1 can be added to the cluster, but we don't recommend doing it.
373 385 For best results, let it be isolated to only executing `Celery` tasks in the cluster setup.
374 386
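A sketch of the relevant `rhodecode.ini` settings for such a worker node (the exact key names can differ between |RCE| versions, and the user/vhost below are placeholders; compare with the `rhodecode.ini` template shipped with your release):

.. code-block:: ini

    ## enable celery usage in the instance
    use_celery = true
    ## point the broker at RabbitMQ running on rc-node-1 (placeholder credentials/vhost)
    celery.broker_url = amqp://rabbitmq_user:qweqwe@rc-node-1:5672/rabbitmq_vhost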
375 387
376 388 .. _Gunicorn: http://gunicorn.org/
377 389 .. _Whoosh: https://pypi.python.org/pypi/Whoosh/
378 390 .. _Elasticsearch: https://www.elastic.co/
379 391 .. _RabbitMQ: http://www.rabbitmq.com/
380 392 .. _Nginx: http://nginx.io
381 393 .. _Apache: http://httpd.apache.org/
382 394 .. _Redis: http://redis.io
383 395