docs: add elasticsearch docs
dan -
r153:42d94537 default
@@ -1,235 +1,272 b''
1 .. _indexing-ref:
1 .. _indexing-ref:
2
2
3 Full-text Search
3 Full-text Search
4 ----------------
4 ----------------
5
5
6 By default |RCM| uses `Whoosh`_ to index |repos| and provide full-text search.
6 By default |RC| is configured to use `Whoosh`_ to index |repos| and
7 provide full-text search.
8
9 |RCE| also provides support for `Elasticsearch`_ as a backend for scalable
10 search. See :ref:`enable-elasticsearch` for details.
11
12 Indexing
13 ^^^^^^^^
14
7 To run the indexer you need to use an |authtoken| with admin rights to all
15 To run the indexer you need to use an |authtoken| with admin rights to all
8 |repos|.
16 |repos|.
9
17
10 To index newly added content, you can set up the indexer in a number of
18 To index newly added content, you can set up the indexer in a number of
11 ways, for example:
19 ways, for example:
12
20
13 * Call the indexer via a cron job. We recommend running this nightly,
21 * Call the indexer via a cron job. We recommend running this nightly,
14 unless you need everything indexed immediately.
22 unless you need everything indexed immediately.
15 * Set the indexer to infinitely loop and reindex as soon as it has run its
23 * Set the indexer to infinitely loop and reindex as soon as it has run its
16 cycle.
24 cycle.
17 * Hook the indexer up with your CI server to reindex after each push.
25 * Hook the indexer up with your CI server to reindex after each push.
18
26
19 The indexer works by indexing new commits added since the last run. If you
27 The indexer works by indexing new commits added since the last run. If you
20 wish to build a brand new index from scratch each time,
28 wish to build a brand new index from scratch each time,
21 use the ``force`` option in the configuration file.
29 use the ``force`` option in the configuration file.
22
30
23 .. important::
31 .. important::
24
32
25 You need to have |RCT| installed, see :ref:`install-tools`. Since |RCE|
33 You need to have |RCT| installed, see :ref:`install-tools`. Since |RCE|
26 3.5.0 they are installed by default.
34 3.5.0 they are installed by default.
27
35
28 To set up indexing, use the following steps:
36 To set up indexing, use the following steps:
29
37
30 1. :ref:`config-rhoderc`, if running tools remotely.
38 1. :ref:`config-rhoderc`, if running tools remotely.
31 2. :ref:`run-index`
39 2. :ref:`run-index`
32 3. :ref:`set-index`
40 3. :ref:`set-index`
33 4. :ref:`advanced-indexing`
41 4. :ref:`advanced-indexing`
34
42
35 .. _config-rhoderc:
43 .. _config-rhoderc:
36
44
37 Configure the ``.rhoderc`` File
45 Configure the ``.rhoderc`` File
38 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
46 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
39
47
40 |RCT| uses the :file:`/home/{user}/.rhoderc` file for connection details
48 |RCT| uses the :file:`/home/{user}/.rhoderc` file for connection details
41 to |RCM| instances. If this file is not automatically created,
49 to |RCM| instances. If this file is not automatically created,
42 you can configure it using the following example. You need to configure the
50 you can configure it using the following example. You need to configure the
43 details for each instance you want to index.
51 details for each instance you want to index.
44
52
45 .. code-block:: bash
53 .. code-block:: bash
46
54
47 # Check the instance details
55 # Check the instance details
48 # of the instance you want to index
56 # of the instance you want to index
49 $ rccontrol status
57 $ rccontrol status
50
58
51 - NAME: enterprise-1
59 - NAME: enterprise-1
52 - STATUS: RUNNING
60 - STATUS: RUNNING
53 - TYPE: Momentum
61 - TYPE: Momentum
54 - VERSION: 1.5.0
62 - VERSION: 1.5.0
55 - URL: http://127.0.0.1:10000
63 - URL: http://127.0.0.1:10000
56
64
57 To get your API Token, on the |RCM| interface go to
65 To get your API Token, on the |RCM| interface go to
58 :menuselection:`username --> My Account --> Auth tokens`
66 :menuselection:`username --> My Account --> Auth tokens`
59
67
60 .. code-block:: ini
68 .. code-block:: ini
61
69
62 # Configure .rhoderc with matching details
70 # Configure .rhoderc with matching details
63 # This allows the indexer to connect to the instance
71 # This allows the indexer to connect to the instance
64 [instance:enterprise-1]
72 [instance:enterprise-1]
65 api_host = http://127.0.0.1:10000
73 api_host = http://127.0.0.1:10000
66 api_key = <auth token goes here>
74 api_key = <auth token goes here>
67 repo_dir = /home/<username>/repos
75 repo_dir = /home/<username>/repos
68
76
69 .. _run-index:
77 .. _run-index:
70
78
71 Run the Indexer
79 Run the Indexer
72 ^^^^^^^^^^^^^^^
80 ^^^^^^^^^^^^^^^
73
81
74 Run the indexer using the following command, and specify the instance you
82 Run the indexer using the following command, and specify the instance you
75 want to index:
83 want to index:
76
84
77 .. code-block:: bash
85 .. code-block:: bash
78
86
79 # From inside a virtualenv
87 # From inside a virtualenv
80 (venv)$ rhodecode-index --instance-name=enterprise-1
88 (venv)$ rhodecode-index --instance-name=enterprise-1
81
89
82 # Using default installation
90 # Using default installation
83 $ /home/user/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
91 $ /home/user/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
84 --instance-name=enterprise-4
92 --instance-name=enterprise-4
85
93
86 # Using a custom mapping file
94 # Using a custom mapping file
87 $ /home/user/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
95 $ /home/user/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
88 --instance-name=enterprise-4 \
96 --instance-name=enterprise-4 \
89 --mapping=/home/user/.rccontrol/enterprise-4/mapping.ini
97 --mapping=/home/user/.rccontrol/enterprise-4/mapping.ini
90
98
91 .. note::
99 .. note::
92
100
93 |RCT| requires |PY| 2.7 to run.
101 |RCT| requires |PY| 2.7 to run.
94
102
95 .. _set-index:
103 .. _set-index:
96
104
97 Schedule the Indexer
105 Schedule the Indexer
98 ^^^^^^^^^^^^^^^^^^^^
106 ^^^^^^^^^^^^^^^^^^^^
99
107
100 To schedule the indexer, configure the crontab file to run the indexer inside
108 To schedule the indexer, configure the crontab file to run the indexer inside
101 your |RCT| virtualenv using the following steps.
109 your |RCT| virtualenv using the following steps.
102
110
103 1. Open the crontab file, using ``crontab -e``.
111 1. Open the crontab file, using ``crontab -e``.
104 2. Add the indexer to the crontab, and schedule it to run as regularly as you
112 2. Add the indexer to the crontab, and schedule it to run as regularly as you
105 wish.
113 wish.
106 3. Save the file.
114 3. Save the file.
107
115
108 .. code-block:: bash
116 .. code-block:: bash
109
117
110 $ crontab -e
118 $ crontab -e
111
119
112 # The virtualenv can be called using its full path, so for example you can
120 # The virtualenv can be called using its full path, so for example you can
113 # put this example into the crontab
121 # put this example into the crontab
114
122
115 # Run the indexer daily at 4am using the default mapping settings
123 # Run the indexer daily at 4am using the default mapping settings
116 0 4 * * * /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \
124 0 4 * * * /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \
117 --instance-name=enterprise-1
125 --instance-name=enterprise-1
118
126
119 # Run the indexer every Sunday at 3am using default mapping
127 # Run the indexer every Sunday at 3am using default mapping
120 0 3 * * 0 /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \
128 0 3 * * 0 /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \
121 --instance-name=enterprise-1
129 --instance-name=enterprise-1
122
130
123 # Run the indexer every 15 minutes
131 # Run the indexer every 15 minutes
124 # using a specially configured mapping file
132 # using a specially configured mapping file
125 */15 * * * * ~/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
133 */15 * * * * ~/.rccontrol/enterprise-4/profile/bin/rhodecode-index \
126 --instance-name=enterprise-4 \
134 --instance-name=enterprise-4 \
127 --mapping=/home/user/.rccontrol/enterprise-4/mapping.ini
135 --mapping=/home/user/.rccontrol/enterprise-4/mapping.ini
128
136
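A crontab entry starts with five time fields: minute, hour, day of month, month, and day of week. A ``*`` in a field matches every value, so the minute field decides whether a job runs once or sixty times within an hour. The following is a minimal sketch of that behaviour, not a full cron parser. It only understands ``*``, ``*/n``, and plain integers, and the helper names are illustrative.

```python
def field_matches(field, value):
    """Match one crontab field; supports '*', '*/n' and a plain integer."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value


def runs_per_day(expression):
    """Count the minutes of one day on which the minute and hour fields match."""
    minute, hour = expression.split()[:2]
    return sum(
        field_matches(minute, m) and field_matches(hour, h)
        for h in range(24)
        for m in range(60)
    )


print(runs_per_day("0 4 * * *"))     # once, at 04:00
print(runs_per_day("* 4 * * *"))     # every minute from 04:00 to 04:59
print(runs_per_day("*/15 * * * *"))  # every 15 minutes, all day
```

Checking an expression this way before saving the crontab helps avoid starting the indexer far more often than intended.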
129 .. _advanced-indexing:
137 .. _advanced-indexing:
130
138
131 Advanced Indexing
139 Advanced Indexing
132 ^^^^^^^^^^^^^^^^^
140 ^^^^^^^^^^^^^^^^^
133
141
134 |RCT| indexes based on the :file:`mapping.ini` file. To configure your index,
142 |RCT| indexes based on the :file:`mapping.ini` file. To configure your index,
135 you can specify different options in this file. The default location is:
143 you can specify different options in this file. The default location is:
136
144
137 * :file:`/home/{user}/.rccontrol/{instance-id}/mapping.ini`, using default
145 * :file:`/home/{user}/.rccontrol/{instance-id}/mapping.ini`, using default
138 |RCT|.
146 |RCT|.
139 * :file:`~/venv/lib/python2.7/site-packages/rhodecode_tools/templates/mapping.ini`,
147 * :file:`~/venv/lib/python2.7/site-packages/rhodecode_tools/templates/mapping.ini`,
140 when using ``virtualenv``.
148 when using ``virtualenv``.
141
149
142 .. note::
150 .. note::
143
151
144 If you need to create the :file:`mapping.ini` file, use the |RCT|
152 If you need to create the :file:`mapping.ini` file, use the |RCT|
145 ``rhodecode-index --create-mapping path/to/file`` command. For details,
153 ``rhodecode-index --create-mapping path/to/file`` command. For details,
146 see the :ref:`tools-cli` section.
154 see the :ref:`tools-cli` section.
147
155
148 The indexer runs in a random order to prevent a failing |repo| from stopping
156 The indexer runs in a random order to prevent a failing |repo| from stopping
149 a build. To configure different indexing scenarios, set the following options
157 a build. To configure different indexing scenarios, set the following options
150 inside the :file:`mapping.ini` and specify the altered file using the
158 inside the :file:`mapping.ini` and specify the altered file using the
151 ``--mapping`` option.
159 ``--mapping`` option.
152
160
153 * ``index_files`` : Index the specified file types.
161 * ``index_files`` : Index the specified file types.
154 * ``skip_files`` : Do not index the specified file types.
162 * ``skip_files`` : Do not index the specified file types.
155 * ``index_files_content`` : Index the content of the specified file types.
163 * ``index_files_content`` : Index the content of the specified file types.
156 * ``skip_files_content`` : Do not index the content of the specified files.
164 * ``skip_files_content`` : Do not index the content of the specified files.
157 * ``force`` : Create a fresh index on each run.
165 * ``force`` : Create a fresh index on each run.
158 * ``max_filesize`` : Files larger than the set size will not be indexed.
166 * ``max_filesize`` : Files larger than the set size will not be indexed.
159 * ``commit_parse_limit`` : Set the batch size when indexing commit messages.
167 * ``commit_parse_limit`` : Set the batch size when indexing commit messages.
160 Set to a lower number to lessen memory load.
168 Set to a lower number to lessen memory load.
161 * ``repo_limit`` : Set the maximum number of |repos| indexed per run.
169 * ``repo_limit`` : Set the maximum number of |repos| indexed per run.
162 * ``[__INCLUDE__]`` : Set |repos| you want indexed. This takes precedence over
170 * ``[__INCLUDE__]`` : Set |repos| you want indexed. This takes precedence over
163 ``[__EXCLUDE__]``.
171 ``[__EXCLUDE__]``.
164 * ``[__EXCLUDE__]`` : Set |repos| you do not want indexed. Exclude can be used to
172 * ``[__EXCLUDE__]`` : Set |repos| you do not want indexed. Exclude can be used to
165 not index branches, forks, or log |repos|.
173 not index branches, forks, or log |repos|.
166
174
167 At the end of the file you can specify conditions for specific |repos| that
175 At the end of the file you can specify conditions for specific |repos| that
168 will override the default values. To configure your indexer,
176 will override the default values. To configure your indexer,
169 use the following example :file:`mapping.ini` file.
177 use the following example :file:`mapping.ini` file.
170
178
171 .. code-block:: ini
179 .. code-block:: ini
172
180
173 [__DEFAULT__]
181 [__DEFAULT__]
174 # default patterns for indexing files and content of files.
182 # default patterns for indexing files and content of files.
175 # Binary files are skipped by default.
183 # Binary files are skipped by default.
176
184
177 # Index python and markdown files
185 # Index python and markdown files
178 index_files = *.py, *.md
186 index_files = *.py, *.md
179
187
180 # Do not index these file types
188 # Do not index these file types
181 skip_files = *.svg, *.log, *.dump, *.txt
189 skip_files = *.svg, *.log, *.dump, *.txt
182
190
183 # Index both file types and their content
191 # Index both file types and their content
184 index_files_content = *.cpp, *.ini, *.py
192 index_files_content = *.cpp, *.ini, *.py
185
193
186 # Index file names, but not file content
194 # Index file names, but not file content
187 skip_files_content = *.svg,
195 skip_files_content = *.svg,
188
196
189 # Force rebuilding an index from scratch. Each repository will be rebuilt
197 # Force rebuilding an index from scratch. Each repository will be rebuilt
190 # from scratch with a global flag. Use local flag to rebuild single repos
198 # from scratch with a global flag. Use local flag to rebuild single repos
191 force = false
199 force = false
192
200
193 # Do not index files larger than 385KB
201 # Do not index files larger than 385KB
194 max_filesize = 385KB
202 max_filesize = 385KB
195
203
196 # Limit commit indexing to 500 per batch
204 # Limit commit indexing to 500 per batch
197 commit_parse_limit = 500
205 commit_parse_limit = 500
198
206
199 # Limit each index run to 25 repos
207 # Limit each index run to 25 repos
200 repo_limit = 25
208 repo_limit = 25
201
209
202 # __INCLUDE__ is more important than __EXCLUDE__.
210 # __INCLUDE__ is more important than __EXCLUDE__.
203
211
204 [__INCLUDE__]
212 [__INCLUDE__]
205 # Include all repos with these names
213 # Include all repos with these names
206
214
207 docs/* = 1
215 docs/* = 1
208 lib/* = 1
216 lib/* = 1
209
217
210 [__EXCLUDE__]
218 [__EXCLUDE__]
211 # Do not include the following repos in the index
219 # Do not include the following repos in the index
212
220
213 dev-docs/* = 1
221 dev-docs/* = 1
214 legacy-repos/* = 1
222 legacy-repos/* = 1
215 *-dev/* = 1
223 *-dev/* = 1
216
224
217 # Each repo that needs special indexing is a separate section below.
225 # Each repo that needs special indexing is a separate section below.
218 # In each section set the options to override the global configuration
226 # In each section set the options to override the global configuration
219 # parameters above.
227 # parameters above.
220 # If special settings are not configured, the global configuration values
228 # If special settings are not configured, the global configuration values
221 # above are inherited. If no special repositories are
229 # above are inherited. If no special repositories are
222 # defined here RhodeCode will use the API to ask for all repositories
230 # defined here RhodeCode will use the API to ask for all repositories
223
231
224 # For this repo use different settings
232 # For this repo use different settings
225 [special-repo]
233 [special-repo]
226 commit_parse_limit = 20,
234 commit_parse_limit = 20,
227 skip_files = *.idea, *.xml,
235 skip_files = *.idea, *.xml,
228
236
229 # For another repo use different settings
237 # For another repo use different settings
230 [another-special-repo]
238 [another-special-repo]
231 index_files = *,
239 index_files = *,
232 max_filesize = 800MB
240 max_filesize = 800MB
233 commit_parse_limit = 20000
241 commit_parse_limit = 20000
234
242
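The override rule described above, where a per-repository section inherits every global value it does not set itself, can be illustrated with Python's standard ``configparser``. This is only a sketch of the lookup order, assuming a shortened mapping file. It is not how ``rhodecode-index`` itself reads the file, and the ``effective_options`` helper is hypothetical.

```python
import configparser

# Shortened, illustrative mapping file based on the example above.
MAPPING = """
[__DEFAULT__]
max_filesize = 385KB
commit_parse_limit = 500
repo_limit = 25

[special-repo]
commit_parse_limit = 20
"""


def effective_options(parser, repo):
    """Return the global options, overridden by the repo's own section if any."""
    options = dict(parser["__DEFAULT__"])
    if parser.has_section(repo):
        options.update(parser[repo])
    return options


parser = configparser.ConfigParser(interpolation=None)
parser.read_string(MAPPING)

special = effective_options(parser, "special-repo")
other = effective_options(parser, "some-other-repo")

print(special["commit_parse_limit"], special["max_filesize"])
print(other["commit_parse_limit"], other["max_filesize"])
```

Here ``special-repo`` uses its own ``commit_parse_limit`` but inherits ``max_filesize``, while a repository without a section of its own gets the global values unchanged.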
243 .. _enable-elasticsearch:
244
245 Enabling Elasticsearch
246 ^^^^^^^^^^^^^^^^^^^^^^
247
248 1. Open the :file:`rhodecode.ini` file for the instance you wish to edit. The
249 default location is
250 :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini`
251 2. Find the search configuration section:
252
253 .. code-block:: ini
254
255 ###################################
256 ## SEARCH INDEXING CONFIGURATION ##
257 ###################################
258
259 search.module = rhodecode.lib.index.whoosh
260 search.location = %(here)s/data/index
261
262 and change it to:
263
264 .. code-block:: ini
265
266 search.module = rc_elasticsearch
267 search.location = http://localhost:9200/
268
269 where ``search.location`` points to the Elasticsearch server.
270
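After editing the file, it can be useful to confirm which search backend the instance will pick up. The following is a minimal sketch using Python's standard ``configparser``. The ``[app:main]`` section name and the ``search_settings`` helper are assumptions made for illustration, so adjust them to match your own :file:`rhodecode.ini`.

```python
import configparser

# Shortened, hypothetical rhodecode.ini content; the [app:main] section
# name is an assumption and may differ in your file.
INI_TEXT = """
[app:main]
search.module = rc_elasticsearch
search.location = http://localhost:9200/
"""


def search_settings(ini_text, section="app:main"):
    """Return (module, location) for the configured search backend."""
    # interpolation=None keeps values such as %(here)s/data/index literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return (
        parser.get(section, "search.module"),
        parser.get(section, "search.location"),
    )


module, location = search_settings(INI_TEXT)
print(module, location)
```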
235 .. _Whoosh: https://pypi.python.org/pypi/Whoosh/
271 .. _Whoosh: https://pypi.python.org/pypi/Whoosh/
272 .. _Elasticsearch: https://www.elastic.co/
\ No newline at end of file