.. _scale-horizontal-cluster:


Scale Horizontally / RhodeCode Cluster
--------------------------------------

|RCE| is built in a way that supports horizontal scaling across multiple machines.
There are three main pre-requisites for that:

- Shared storage that each machine can access. Using NFS or another shared storage system.
- Shared DB connection across machines. Using `MySQL`/`PostgreSQL` that each node can access.
- |RCE| user sessions and caches need to use a shared storage (e.g. `Redis`_/`Memcached`)


Horizontal scaling means adding more machines or workers into your pool of
resources. Horizontally scaling |RCE| gives a huge performance increase,
especially under large traffic scenarios with a high number of requests.
This is very beneficial when |RCE| is serving many users simultaneously,
or if continuous integration servers are automatically pulling and pushing code.
It also adds High-Availability to your running system.


Cluster Overview
^^^^^^^^^^^^^^^^

Below we'll present a configuration example that will use two separate nodes to serve
|RCE| in a load-balanced environment. The 3rd node will act as a shared storage/cache
and handle load-balancing. In addition, the 3rd node will be used as a shared database instance.

This setup can be used both in a Docker based configuration or with individual
physical/virtual machines. Using the 3rd node for Storage/Redis/PostgreSQL/Nginx is
optional. All those components can be installed on one of the two nodes used for |RCE|.
We'll use the following naming for our nodes:

- `rc-node-1` (NFS, DB, Cache node)
- `rc-node-2` (Worker node1)
- `rc-node-3` (Worker node2)

Our shared NFS storage in the example is located on `/home/rcdev/storage` and
it's RW accessible on **each** node.

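
The exact NFS setup is outside of RhodeCode itself, but as a minimal sketch (assuming a
standard Linux NFS server on `rc-node-1` and the usual NFS client packages on the worker
nodes) the share could be exported and mounted like this:

.. code-block:: bash

    # on rc-node-1: export the storage directory to the worker nodes
    echo '/home/rcdev/storage rc-node-2(rw,sync,no_subtree_check) rc-node-3(rw,sync,no_subtree_check)' >> /etc/exports
    exportfs -ra

    # on rc-node-2 and rc-node-3: mount it under the same path
    mkdir -p /home/rcdev/storage
    mount -t nfs rc-node-1:/home/rcdev/storage /home/rcdev/storage
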
In this example we used certain recommended components, however many
of those can be replaced by others, in case your organization already uses them, for example:

- `MySQL`/`PostgreSQL`: aren't replaceable and are the only two supported databases.
- `Nginx`_ on `rc-node-1` can be replaced by: `Hardware Load Balancer (F5)`, `Apache`_, `HA-Proxy` etc.
- `Nginx`_ on rc-node-2/3 acts as a reverse proxy and can be replaced by another HTTP server
  acting as reverse proxy such as `Apache`_.
- `Redis`_ on `rc-node-1` can be replaced by: `Memcached`


Here's an overview of what components should be installed/set up on each server in our example:

- **rc-node-1**:

  - main storage acting as NFS host.
  - `nginx` acting as a load-balancer.
  - `postgresql-server` used for database and sessions.
  - `redis-server` used for storing shared caches.
  - optionally `rabbitmq-server` for `Celery` if used.
  - optionally, if `Celery` is used, an Enterprise/Community instance + VCSServer.
  - optionally a mailserver that can be shared by other instances.
  - optionally a channelstream server to handle live communication for all instances.


- **rc-node-2/3**:

  - `nginx` acting as a reverse proxy to handle requests to |RCE|.
  - 1x RhodeCode Enterprise/Community instance.
  - 1x VCSServer instance.
  - optionally, for testing the connection: postgresql-client, redis-client (redis-tools).


Before we start, here are a few assumptions that should be fulfilled; a quick way to verify
the connectivity ones is sketched right after this list:

- make sure all nodes can reach each other.
- make sure `Redis`_/`MySQL`/`PostgreSQL`/`RabbitMQ`_ are running on `rc-node-1`
- make sure both `rc-node-2`/`3` can access the NFS storage with RW access
- make sure rc-node-2/3 can access the `Redis`_/`PostgreSQL`, `MySQL` database on `rc-node-1`.
- make sure `Redis`_/Database/`RabbitMQ`_ are password protected and accessible only from rc-node-2/3.

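
A quick, optional way to verify the connectivity assumptions from the worker nodes (this
sketch assumes the `postgresql-client` and `redis-tools` packages listed above are
installed, and re-uses the example password `qweqwe`):

.. code-block:: bash

    # run on rc-node-2 and rc-node-3

    # check that the PostgreSQL database on rc-node-1 accepts connections
    psql -h rc-node-1 -U postgres -d rhodecode -c 'SELECT 1;'

    # check that Redis on rc-node-1 answers (should reply with PONG)
    redis-cli -h rc-node-1 -a qweqwe ping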


Setup rc-node-2/3
^^^^^^^^^^^^^^^^^

Initially, before `rc-node-1`, we'll configure both nodes 2 and 3 to operate as standalone
nodes with their own hostnames. Use default installation settings, and use
the default local addresses (127.0.0.1) to configure the VCSServer and Community/Enterprise instances.
All external connectivity will be handled by the reverse proxy (`Nginx`_ in our example).

This way we can ensure each individual host works, accepts connections, and that we can
run some operations explicitly on a chosen node.

In addition this allows us to explicitly direct certain traffic to a node, e.g.
a CI server that only calls `rc-node-3` directly. This should be done similarly to a normal
installation, so check out the `Nginx`_/`Apache`_ configuration examples to configure each host.
Each one should already connect to the shared database during installation.


1) Assuming our final URL will be http://rc-node-1, configure `instance_id` and `app.base_url`.

   a) On **rc-node-2** find the following settings and edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`

   .. code-block:: ini

       ## required format is: *NAME-
       instance_id = *rc-node-2-
       app.base_url = http://rc-node-1


   b) On **rc-node-3** find the following settings and edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`

   .. code-block:: ini

       ## required format is: *NAME-
       instance_id = *rc-node-3-
       app.base_url = http://rc-node-1

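
   After changing these values, the instances need to be restarted to pick them up. As a
   sketch, assuming a default `rccontrol` managed installation and a hypothetical instance
   name `enterprise-1` (check the output of `rccontrol status` for your actual instance names):

   .. code-block:: bash

       # restart the Community/Enterprise instance after editing rhodecode.ini
       rccontrol restart enterprise-1
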
2) Configure `User Session` to use a shared database. Example config that should be
   changed on both **rc-node-2** and **rc-node-3**.
   Edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`

   .. code-block:: ini

       ####################################
       ### BEAKER SESSION ####
       ####################################

       ## Disable the default `file` sessions
       #beaker.session.type = file
       #beaker.session.data_dir = %(here)s/data/sessions

       ## use shared db based session, fast, and allows easy management of logged in users
       beaker.session.type = ext:database
       beaker.session.table_name = db_session
       # use our rc-node-1 here
       beaker.session.sa.url = postgresql://postgres:qweqwe@rc-node-1/rhodecode
       beaker.session.sa.pool_recycle = 3600
       beaker.session.sa.echo = false

   In addition make sure both instances use the same `session.secret` so users have
   persistent sessions across nodes. Please generate a different one than the one used
   in this example.

   .. code-block:: ini

       # use a unique generated long string
       beaker.session.secret = 70e116cae2274656ba7265fd860aebbd

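
   To confirm that sessions really end up in the shared database rather than in local files,
   you can log in on one node and then query the session table on `rc-node-1`. This is just a
   sanity-check sketch; the `db_session` table only appears after the first session is stored:

   .. code-block:: bash

       # run from any node that can reach the database on rc-node-1
       psql -h rc-node-1 -U postgres -d rhodecode -c 'SELECT count(*) FROM db_session;'
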
3) Configure the stored cache/archive cache to use our shared NFS storage on `rc-node-1`.

   .. code-block:: ini

       # note the `_` prefix that allows using a directory without
       # remap and rescan checking for vcs inside it.
       cache_dir = /home/rcdev/storage/_cache_dir/data
       # note archive cache dir is disabled by default, however if you enable
       # it also needs to be shared
       #archive_cache_dir = /home/rcdev/storage/_tarball_cache_dir

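
   Since these cache directories, just like the repositories, live on the shared NFS mount,
   it's worth double-checking that writes from one worker are visible on the other. A trivial
   sanity-check sketch:

   .. code-block:: bash

       # on rc-node-2: create a test file on the shared storage
       touch /home/rcdev/storage/_nfs_write_test

       # on rc-node-3: the file should be visible, then clean it up
       ls -l /home/rcdev/storage/_nfs_write_test
       rm /home/rcdev/storage/_nfs_write_test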

4) Use a shared exception store. Example config that should be
   changed on both **rc-node-2** and **rc-node-3**, and also for the VCSServer.
   Edit :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini` and
   :file:`/home/{user}/.rccontrol/{vcsserver-instance-id}/vcsserver.ini`
   and add/change the following setting.

   .. code-block:: ini

       exception_tracker.store_path = /home/rcdev/storage/_exception_store_data


5) Change the cache backends to use `Redis`_ based caches. Below is a full example config
   that replaces the default file-based cache with a shared `Redis`_ cache using a
   distributed lock.

   .. code-block:: ini

       #####################################
       ### DOGPILE CACHE ####
       #####################################

       ## `cache_perms` cache settings for permission tree, auth TTL.
       #rc_cache.cache_perms.backend = dogpile.cache.rc.file_namespace
       #rc_cache.cache_perms.expiration_time = 300

       ## alternative `cache_perms` redis backend with distributed lock
       rc_cache.cache_perms.backend = dogpile.cache.rc.redis
       rc_cache.cache_perms.expiration_time = 300
       ## redis_expiration_time needs to be greater than expiration_time
       rc_cache.cache_perms.arguments.redis_expiration_time = 7200
       rc_cache.cache_perms.arguments.socket_timeout = 30
       rc_cache.cache_perms.arguments.host = rc-node-1
       rc_cache.cache_perms.arguments.password = qweqwe
       rc_cache.cache_perms.arguments.port = 6379
       rc_cache.cache_perms.arguments.db = 0
       rc_cache.cache_perms.arguments.distributed_lock = true

       ## `cache_repo` cache settings for FileTree, Readme, RSS FEEDS
       #rc_cache.cache_repo.backend = dogpile.cache.rc.file_namespace
       #rc_cache.cache_repo.expiration_time = 2592000

       ## alternative `cache_repo` redis backend with distributed lock
       rc_cache.cache_repo.backend = dogpile.cache.rc.redis
       rc_cache.cache_repo.expiration_time = 2592000
       ## redis_expiration_time needs to be greater than expiration_time
       rc_cache.cache_repo.arguments.redis_expiration_time = 2678400
       rc_cache.cache_repo.arguments.socket_timeout = 30
       rc_cache.cache_repo.arguments.host = rc-node-1
       rc_cache.cache_repo.arguments.password = qweqwe
       rc_cache.cache_repo.arguments.port = 6379
       rc_cache.cache_repo.arguments.db = 1
       rc_cache.cache_repo.arguments.distributed_lock = true

       ## cache settings for SQL queries, this needs to use memory type backend
       rc_cache.sql_cache_short.backend = dogpile.cache.rc.memory_lru
       rc_cache.sql_cache_short.expiration_time = 30

       ## `cache_repo_longterm` cache for repo object instances, this needs to use memory
       ## type backend as the objects kept are not pickle serializable
       rc_cache.cache_repo_longterm.backend = dogpile.cache.rc.memory_lru
       ## by default we use 96H, this is using invalidation on push anyway
       rc_cache.cache_repo_longterm.expiration_time = 345600
       ## max items in LRU cache, reduce this number to save memory, and expire last used
       ## cached objects
       rc_cache.cache_repo_longterm.max_size = 10000

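
   After restarting the instances you can check that cache keys are being written to the
   shared `Redis`_ instead of local files. A small sanity-check sketch using `redis-cli`,
   re-using the example password `qweqwe` (db 0 holds `cache_perms`, db 1 holds `cache_repo`
   in the config above):

   .. code-block:: bash

       # both counters should grow as pages are visited on any node
       redis-cli -h rc-node-1 -a qweqwe -n 0 dbsize
       redis-cli -h rc-node-1 -a qweqwe -n 1 dbsize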


6) Configure `Nginx`_ as a reverse proxy on `rc-node-2/3`.
   Minimal `Nginx`_ config used:

   .. code-block:: nginx

       ## rate limiter for certain pages to prevent brute force attacks
       limit_req_zone $binary_remote_addr zone=req_limit:10m rate=1r/s;

       ## custom log format
       log_format log_custom '$remote_addr - $remote_user [$time_local] '
                             '"$request" $status $body_bytes_sent '
                             '"$http_referer" "$http_user_agent" '
                             '$request_time $upstream_response_time $pipe';

       server {
           listen 80;
           server_name rc-node-2;
           #server_name rc-node-3;

           access_log /var/log/nginx/rhodecode.access.log log_custom;
           error_log  /var/log/nginx/rhodecode.error.log;

           # example of proxy.conf can be found in our docs.
           include /etc/nginx/proxy.conf;

           ## serve static files by Nginx, recommended for performance
           location /_static/rhodecode {
               gzip on;
               gzip_min_length 500;
               gzip_proxied any;
               gzip_comp_level 4;
               gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/json application/xml application/rss+xml font/truetype font/opentype application/vnd.ms-fontobject image/svg+xml;
               gzip_vary on;
               gzip_disable "msie6";
               #alias /home/rcdev/.rccontrol/community-1/static;
               alias /home/rcdev/.rccontrol/enterprise-1/static;
           }

           location /_admin/login {
               limit_req zone=req_limit burst=10 nodelay;
               try_files $uri @rhode;
           }

           location / {
               try_files $uri @rhode;
           }

           location @rhode {
               # Url to running RhodeCode instance.
               # This is shown as `- URL: <host>` in output from rccontrol status.
               proxy_pass http://127.0.0.1:10020;
           }

           ## custom 502 error page. Will be displayed while RhodeCode server
           ## is turned off
           error_page 502 /502.html;
           location = /502.html {
               #root /home/rcdev/.rccontrol/community-1/static;
               root /home/rcdev/.rccontrol/enterprise-1/static;
           }
       }

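
   After placing the config (for example under `/etc/nginx/sites-enabled/` on a typical
   Debian/Ubuntu layout, which is an assumption of this sketch), validate and reload
   `Nginx`_ and check that the proxy answers locally:

   .. code-block:: bash

       # validate the configuration and reload nginx
       nginx -t && systemctl reload nginx

       # the reverse proxy on the node itself should now answer
       curl -I http://127.0.0.1/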


7) Optional: Full text search. In case you use `Whoosh` full text search we also need a
   shared storage for the index. In our example our NFS is mounted at `/home/rcdev/storage`,
   which represents our storage, so we can use the following:

   .. code-block:: ini

       # note the `_` prefix that allows using a directory without
       # remap and rescan checking for vcs inside it.
       search.location = /home/rcdev/storage/_index_data/index


   .. note::

      If you use ElasticSearch it's shared by default, and simply running an ES node is
      cluster compatible by default.


8) Optional: If you intend to use mailing, all instances need to use either a shared
   mailing node, or each will use an individual local mail agent. Simply put, node-1/2/3
   need to use the same mailing configuration.



Setup rc-node-1
^^^^^^^^^^^^^^^

Configure `Nginx`_ as a Load Balancer for rc-node-2/3.
Minimal `Nginx`_ example below:

.. code-block:: nginx

    ## define rc-cluster which contains a pool of our instances to connect to
    upstream rc-cluster {
        # rc-node-2/3 are stored in /etc/hosts with correct IP addresses
        server rc-node-2:80;
        server rc-node-3:80;
    }

    server {
        listen 80;
        server_name rc-node-1;

        location / {
            proxy_pass http://rc-cluster;
        }
    }

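
The upstream entries above rely on `rc-node-2`/`rc-node-3` being resolvable from `rc-node-1`.
If you don't manage this via DNS, a simple `/etc/hosts` sketch can do; the IP addresses below
are purely hypothetical placeholders for your own network:

.. code-block:: bash

    # append the worker nodes to /etc/hosts on rc-node-1
    cat >> /etc/hosts <<'EOF'
    10.0.0.2  rc-node-2
    10.0.0.3  rc-node-3
    EOF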


.. note::

   You should configure your load balancing accordingly. We recommend writing
   load balancing rules that will separate regular user traffic from
   automated process traffic like continuous integration servers or build bots.
   Sticky sessions are not required.


Show which instance handles a request
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can easily check if load-balancing is working as expected. Visit our main node
`rc-node-1` URL, which at that point should already handle incoming requests and balance
them across node-2/3.

Add a special GET param `?showrcid=1` to show the current instance handling your request.

For example: visiting the url `http://rc-node-1/?showrcid=1` will show, at the bottom
of the screen, the cluster instance info,
e.g: `RhodeCode instance id: rc-node-3-rc-node-3-3246`,
which is generated from::

    <NODE_HOSTNAME>-<INSTANCE_ID>-<WORKER_PID>

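
A quick way to watch the balancing from the command line is to request the page a few times
and extract the instance id. The exact markup around the id may differ between versions, so
treat the `grep` pattern as a rough sketch:

.. code-block:: bash

    # each request should alternate between instances on rc-node-2 and rc-node-3
    for i in 1 2 3 4; do
        curl -s 'http://rc-node-1/?showrcid=1' | grep -o 'RhodeCode instance id: [^<"]*'
    done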


Using Celery with cluster
^^^^^^^^^^^^^^^^^^^^^^^^^


If `Celery` is used we recommend also setting up an instance of Enterprise/Community + VCSServer
on the node that is running `RabbitMQ`_. Those instances will be used to execute async
tasks on `rc-node-1`. This is the most efficient setup. `Celery` usually
handles tasks such as sending emails, forking repositories, importing
repositories from external locations etc. Using workers on an instance that has
direct access to the disks used by NFS, as well as to the email server, gives a noticeable
performance boost. Running workers local to the NFS storage results in faster
execution when forking large repositories or sending lots of emails.

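
`Celery` needs a message broker, which in this layout is the `RabbitMQ`_ server on
`rc-node-1`. As a minimal sketch of preparing a dedicated user and vhost for it, assuming a
standard `RabbitMQ`_ installation (the user, vhost and password below are hypothetical
examples, match them to your own Celery broker settings):

.. code-block:: bash

    # on rc-node-1
    rabbitmqctl add_user rcuser secret_password
    rabbitmqctl add_vhost rhodevhost
    rabbitmqctl set_permissions -p rhodevhost rcuser ".*" ".*" ".*"
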
Those instances need to be configured in the same way as the other nodes.
The instance on rc-node-1 can be added to the cluster, but we don't recommend doing it.
For best results let it be isolated to executing only `Celery` tasks in the cluster setup.


.. _Gunicorn: http://gunicorn.org/
.. _Whoosh: https://pypi.python.org/pypi/Whoosh/
.. _Elasticsearch: https://www.elastic.co/
.. _RabbitMQ: http://www.rabbitmq.com/
.. _Nginx: http://nginx.org/
.. _Apache: https://httpd.apache.org/
.. _Redis: http://redis.io/