@@ -1,363 +1,359 b'' | |||||
1 | .. _indexing-ref: |
|
1 | .. _indexing-ref: | |
2 |
|
2 | |||
3 | Full-text Search |
|
3 | Full-text Search | |
4 | ---------------- |
|
4 | ---------------- | |
5 |
|
5 | |||
|
6 | RhodeCode provides full-text search capabilities for searching file content, | |||

7 | commit messages, and file paths. Indexing is not enabled by default; to use | |||

8 | full-text search, you must first build an index. | |||
|
9 | ||||
6 | By default RhodeCode is configured to use `Whoosh`_ to index |repos| and |
|
10 | By default RhodeCode is configured to use `Whoosh`_ to index |repos| and | |
7 | provide full-text search. |
|
11 | provide full-text search. `Whoosh`_ works well for small amounts of data, but | |

12 | it shouldn't be used for large code bases or large numbers of repositories. | |||
8 |
|
13 | |||
9 | |RCE| also provides support for `Elastic |

14 | |RCE| also provides support for `ElasticSearch 6`_ as a backend for more advanced | |
10 | and scalable search. See :ref:`enable-elasticsearch` for details. |
|
15 | and scalable search. See :ref:`enable-elasticsearch` for details. | |
11 |
|
16 | |||
12 | Indexing |
|
17 | Indexing | |
13 | ^^^^^^^^ |
|
18 | ^^^^^^^^ | |
14 |
|
19 | |||
15 | To run the indexer you need to have an |authtoken| with admin rights to all |repos|. |
|
20 | To run the indexer you need to have an |authtoken| with admin rights to all |repos|. | |
16 |
|
21 | |||
17 | To index |
|
22 | To index repositories stored in RhodeCode, you have the option to set the indexer up in a | |
18 | number of ways, for example: |
|
23 | number of ways, for example: | |
19 |
|
24 | |||
20 | * Call the indexer via a cron job. We recommend running this once at night. |
|
25 | * Call the indexer via a cron job. We recommend running this once at night. | |
21 | If you need everything indexed sooner, it's possible to index a few |

26 | If you need everything indexed sooner, it's possible to index a few | |
22 | times during the day. |

27 | times during the day. The indexer has a locking mechanism that prevents | |

28 | two instances from running at once, so it is safe to run it as often as every hour. | |||
23 | * Set the indexer to infinitely loop and reindex as soon as it has run its previous cycle. |
|
29 | * Set the indexer to infinitely loop and reindex as soon as it has run its previous cycle. | |
24 | * Hook the indexer up with your CI server to reindex after each push. |
|
30 | * Hook the indexer up with your CI server to reindex after each push. | |
25 |
|
31 | |||
26 | The indexer works by indexing new commits added since the last run |
|
32 | The indexer works by indexing new commits added since the last run, and comparing | |
27 | wish to build a brand new index from scratch each time, |
|
33 | file changes to index only new or modified files. | |
28 | use the ``force`` option in the configuration file. |
|
34 | If you wish to build a brand new index from scratch each time, use the ``force`` | |
|
35 | option in the configuration file, or run the indexer with the ``--force`` flag. | |||
29 |
|
36 | |||
30 | .. important:: |
|
37 | .. important:: | |
31 |
|
38 | |||
32 | You need to have |RCT| installed, see :ref:`install-tools`. Since |RCE| |
|
39 | You need to have |RCT| installed, see :ref:`install-tools`. Since |RCE| | |
33 | 3.5.0 they are installed by default and available with community/enterprise installations. |
|
40 | 3.5.0 they are installed by default and available with community/enterprise installations. | |
34 |
|
41 | |||
35 | To set up indexing, use the following steps: |
|
42 | To set up indexing, use the following steps: | |
36 |
|
43 | |||
37 | 1. :ref:`config-rhoderc`, if running tools remotely. |
|
44 | 1. :ref:`config-rhoderc`, if running tools remotely. | |
38 | 2. :ref:`run-index` |
|
45 | 2. :ref:`run-index` | |
39 | 3. :ref:`set-index` |
|
46 | 3. :ref:`set-index` | |
40 | 4. :ref:`advanced-indexing` |
|
47 | 4. :ref:`advanced-indexing` | |
41 |
|
48 | |||
42 | .. _config-rhoderc: |
|
49 | .. _config-rhoderc: | |
43 |
|
50 | |||
44 | Configure the ``.rhoderc`` File |
|
51 | Configure the ``.rhoderc`` File | |
45 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
52 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
46 |
|
53 | |||
47 | .. note:: |
|
54 | .. note:: | |
48 |
|
55 | |||
49 | It is also possible to use the indexer without the ``.rhoderc`` file. Instead of |

56 | It is also possible to use the indexer without the ``.rhoderc`` file. Instead of | |
50 | passing `--instance-name=enterprise-1`, provide the host and token |

57 | passing `--instance-name=enterprise-1`, provide the host and token | |
51 | directly: `--api-host=http://127.0.0.1:10000 --api-key=<auth |

58 | directly: `--api-host=http://127.0.0.1:10000 --api-key=<auth-token-goes-here>` | |
52 |
|
59 | |||
53 |
|
60 | |||
54 | |RCT| uses the :file:`/home/{user}/.rhoderc` file for connection details |
|
61 | |RCT| uses the :file:`/home/{user}/.rhoderc` file for connection details | |
55 | to |RCE| instances. If this file is not automatically created, |
|
62 | to |RCE| instances. If this file is not automatically created, | |
56 | you can configure it using the following example. You need to configure the |
|
63 | you can configure it using the following example. You need to configure the | |
57 | details for each instance you want to index. |
|
64 | details for each instance you want to index. | |
58 |
|
65 | |||
59 | .. code-block:: bash |
|
66 | .. code-block:: bash | |
60 |
|
67 | |||
61 | # Check the instance details |
|
68 | # Check the instance details | |
62 | # of the instance you want to index |
|
69 | # of the instance you want to index | |
63 | $ rccontrol status |
|
70 | $ rccontrol status | |
64 |
|
71 | |||
65 | - NAME: enterprise-1 |
|
72 | - NAME: enterprise-1 | |
66 | - STATUS: RUNNING |
|
73 | - STATUS: RUNNING | |
67 | - TYPE: Enterprise |
|
74 | - TYPE: Enterprise | |
68 | - VERSION: 4.1.0 |
|
75 | - VERSION: 4.1.0 | |
69 | - URL: http://127.0.0.1:10003 |
|
76 | - URL: http://127.0.0.1:10003 | |
70 |
|
77 | |||
71 | To get your API Token, on the |RCE| interface go to |
|
78 | To get your API Token, on the |RCE| interface go to | |
72 | :menuselection:`username --> My Account --> Auth tokens` |
|
79 | :menuselection:`username --> My Account --> Auth tokens` | |
73 |
|
80 | |||
74 | .. code-block:: ini |
|
81 | .. code-block:: ini | |
75 |
|
82 | |||
76 | # Configure .rhoderc with matching details |
|
83 | # Configure .rhoderc with matching details | |
77 | # This allows the indexer to connect to the instance |
|
84 | # This allows the indexer to connect to the instance | |
78 | [instance:enterprise-1] |
|
85 | [instance:enterprise-1] | |
79 | api_host = http://127.0.0.1:10000 |
|
86 | api_host = http://127.0.0.1:10000 | |
80 | api_key = <auth token goes here> |
|
87 | api_key = <auth token goes here> | |
81 |
|
88 | |||
82 |
|
89 | |||
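The relationship between a ``.rhoderc`` entry and the direct ``--api-host`` / ``--api-key`` invocation mentioned in the note above can be sketched in Python. This is only an illustration, not how |RCT| itself reads the file: ``indexer_args`` is a hypothetical helper, and the sketch assumes the file is plain INI as in the example.

.. code-block:: python

    import configparser

    # Same content as the .rhoderc example above.
    RHODERC_EXAMPLE = """
    [instance:enterprise-1]
    api_host = http://127.0.0.1:10000
    api_key = <auth token goes here>
    """


    def indexer_args(rhoderc_text, instance_name):
        """Build the direct command-line form for one .rhoderc instance.

        Hypothetical helper: it only shows which values the
        --api-host / --api-key flags would carry.
        """
        # Interpolation is disabled so that '%' in a token is kept literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(rhoderc_text)
        section = parser["instance:" + instance_name]
        return [
            "rhodecode-index",
            "--api-host=" + section["api_host"],
            "--api-key=" + section["api_key"],
        ]


    if __name__ == "__main__":
        print(" ".join(indexer_args(RHODERC_EXAMPLE, "enterprise-1")))

Passing ``--instance-name=enterprise-1`` with the ``.rhoderc`` file in place should be equivalent to passing these two flags by hand.
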
83 | .. _run-index: |
|
90 | .. _run-index: | |
84 |
|
91 | |||
85 | Run the Indexer |
|
92 | Run the Indexer | |
86 | ^^^^^^^^^^^^^^^ |
|
93 | ^^^^^^^^^^^^^^^ | |
87 |
|
94 | |||
88 | Run the indexer using the following command, and specify the instance you want to index: |
|
95 | Run the indexer using the following command, and specify the instance you want to index: | |
89 |
|
96 | |||
90 | .. code-block:: bash |
|
97 | .. code-block:: bash | |
91 |
|
98 | |||
92 | # Using default |

99 | # Using the default, simplest indexing of all repositories | |
93 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ |
|
100 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ | |
94 | --instance-name=enterprise-1 |
|
101 | --instance-name=enterprise-1 | |
95 |
|
102 | |||
96 | # Using a custom mapping file |
|
103 | # Using a custom mapping file with indexing rules, and using elasticsearch 6 backend | |
97 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ |
|
104 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ | |
98 | --instance-name=enterprise-1 \ |
|
105 | --instance-name=enterprise-1 \ | |
99 | --mapping=/home/user/.rccontrol/enterprise-1/search_mapping.ini |
|
106 | --mapping=/home/user/.rccontrol/enterprise-1/search_mapping.ini \ | |
|
107 | --es-version=6 --engine-location=http://elasticsearch-host:9200 | |||
100 |
|
108 | |||
101 | # Using a custom mapping file and invocation without ``.rhoderc`` |
|
109 | # Using a custom mapping file and invocation without ``.rhoderc`` | |
102 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ |
|
110 | $ /home/user/.rccontrol/enterprise-1/profile/bin/rhodecode-index \ | |
103 | --api-host=http://rhodecodecode.myserver.com --api-key=xxxxx \ |
|
111 | --api-host=http://rhodecodecode.myserver.com --api-key=xxxxx \ | |
104 | --mapping=/home/user/.rccontrol/enterprise-1/search_mapping.ini |
|
112 | --mapping=/home/user/.rccontrol/enterprise-1/search_mapping.ini | |
105 |
|
113 | |||
106 | # From inside a virtualenv on your local machine or CI server. |

114 | # From inside a virtualenv on your local machine or CI server. | |
107 | (venv)$ rhodecode-index --instance-name=enterprise-1 |
|
115 | (venv)$ rhodecode-index --instance-name=enterprise-1 | |
108 |
|
116 | |||
109 |
|
117 | |||
110 | .. note:: |
|
118 | .. note:: | |
111 |
|
119 | |||
112 | Frequent indexing can fragment the index. The most common symptom is |

120 | Frequent indexing can fragment the index. The most common symptom is | |
113 | a `too many open files` error. To fix this, run the indexer with the |

121 | a `too many open files` error. To fix this, run the indexer with the | |
114 | --optimize flag, e.g. `rhodecode-index --instance-name=enterprise-1 --optimize`. |

122 | --optimize flag, e.g. `rhodecode-index --instance-name=enterprise-1 --optimize`. | |
115 | Run this regularly; once a week is recommended. |

123 | Run this regularly; once a week is recommended. | |
116 |
|
124 | |||
117 |
|
125 | |||
118 | .. _set-index: |
|
126 | .. _set-index: | |
119 |
|
127 | |||
120 | Schedule the Indexer |
|
128 | Schedule the Indexer | |
121 | ^^^^^^^^^^^^^^^^^^^^ |
|
129 | ^^^^^^^^^^^^^^^^^^^^ | |
122 |
|
130 | |||
123 | To schedule the indexer, configure the crontab file to run the indexer inside |
|
131 | To schedule the indexer, configure the crontab file to run the indexer inside | |
124 | your |RCT| virtualenv using the following steps. |
|
132 | your |RCT| virtualenv using the following steps. | |
125 |
|
133 | |||
126 | 1. Open the crontab file, using ``crontab -e``. |
|
134 | 1. Open the crontab file, using ``crontab -e``. | |
127 | 2. Add the indexer to the crontab, and schedule it to run as regularly as you |
|
135 | 2. Add the indexer to the crontab, and schedule it to run as regularly as you | |
128 | wish. |
|
136 | wish. | |
129 | 3. Save the file. |
|
137 | 3. Save the file. | |
130 |
|
138 | |||
131 | .. code-block:: bash |
|
139 | .. code-block:: bash | |
132 |
|
140 | |||
133 | $ crontab -e |
|
141 | $ crontab -e | |
134 |
|
142 | |||
135 | # The virtualenv can be called using its full path, so for example you can |
|
143 | # The virtualenv can be called using its full path, so for example you can | |
136 | # put this example into the crontab |
|
144 | # put this example into the crontab | |
137 |
|
145 | |||
138 | # Run the indexer daily at 4am using the default mapping settings |
|
146 | # Run the indexer daily at 4am using the default mapping settings | |
139 | 0 4 * * * /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \ |

147 | 0 4 * * * /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \ | |
140 | --instance-name=enterprise-1 |
|
148 | --instance-name=enterprise-1 | |
141 |
|
149 | |||
142 | # Run the indexer every Sunday at 3am using default mapping |
|
150 | # Run the indexer every Sunday at 3am using default mapping | |
143 | 0 3 * * 0 /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \ |

151 | 0 3 * * 0 /home/ubuntu/.virtualenv/rhodecode-venv/bin/rhodecode-index \ | |
144 | --instance-name=enterprise-1 |
|
152 | --instance-name=enterprise-1 | |
145 |
|
153 | |||
146 | # Run the indexer every 15 minutes |
|
154 | # Run the indexer every 15 minutes | |
147 | # using a specially configured mapping file |
|
155 | # using a specially configured mapping file | |
148 | */15 * * * * ~/.rccontrol/enterprise-4/profile/bin/rhodecode-index \ |
|
156 | */15 * * * * ~/.rccontrol/enterprise-4/profile/bin/rhodecode-index \ | |
149 | --instance-name=enterprise-4 \ |
|
157 | --instance-name=enterprise-4 \ | |
150 | --mapping=/home/user/.rccontrol/enterprise-4/search_mapping.ini |
|
158 | --mapping=/home/user/.rccontrol/enterprise-4/search_mapping.ini | |
151 |
|
159 | |||
152 | .. _advanced-indexing: |
|
160 | .. _advanced-indexing: | |
153 |
|
161 | |||
154 | Advanced Indexing |
|
162 | Advanced Indexing | |
155 | ^^^^^^^^^^^^^^^^^ |
|
163 | ^^^^^^^^^^^^^^^^^ | |
156 |
|
164 | |||
157 |
|
165 | |||
158 | Force Re-Indexing single repository |
|
166 | Force Re-Indexing single repository | |
159 | +++++++++++++++++++++++++++++++++++ |
|
167 | +++++++++++++++++++++++++++++++++++ | |
160 |
|
168 | |||
161 | It is often necessary to re-index a whole repository after repository changes, |

169 | It is often necessary to re-index a whole repository after repository changes, | |
162 | or to remove indexed secrets or files. The indexer has a `--repo-name=` flag |

170 | or to remove indexed secrets or files. The indexer has a `--repo-name=` flag | |
163 | that limits execution to a single repository. For example, to force a re-index of a |

171 | that limits execution to a single repository. For example, to force a re-index of a | |
164 | single repository, run:: |

172 | single repository, run:: | |
165 |
|
173 | |||
166 | rhodecode-index --instance-name=enterprise-1 --force --repo-name=rhodecode-vcsserver |
|
174 | rhodecode-index --instance-name=enterprise-1 --force --repo-name=rhodecode-vcsserver | |
167 |
|
175 | |||
168 |
|
176 | |||
169 | Removing repositories from index |
|
177 | Removing repositories from index | |
170 | ++++++++++++++++++++++++++++++++ |
|
178 | ++++++++++++++++++++++++++++++++ | |
171 |
|
179 | |||
172 | The indexer automatically removes renamed repositories and builds the index for the new names. |

180 | The indexer automatically removes renamed repositories and builds the index for the new names. | |

181 | Likewise, if a repository listed in mapping.ini is not reported as existing by the | |||

182 | server, it is removed from the index. | |||
173 | To remove an indexed repository manually, run:: |

183 | To remove an indexed repository manually, run:: | |
174 |
|
184 | |||
175 | rhodecode-index --instance-name=enterprise-1 --remove-only --repo-name=rhodecode-vcsserver |
|
185 | rhodecode-index --instance-name=enterprise-1 --remove-only --repo-name=rhodecode-vcsserver | |
176 |
|
186 | |||
177 |
|
187 | |||
178 | Using search_mapping.ini file for advanced index rules |
|
188 | Using search_mapping.ini file for advanced index rules | |
179 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
|
189 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++ | |
180 |
|
190 | |||
181 | By default, rhodecode-index indexes all repositories and all files, with parsing limits |

191 | By default, rhodecode-index indexes all repositories and all files, with parsing limits | |
182 | defined by the CLI default arguments. You can change those limits by passing |

192 | defined by the CLI default arguments. You can change those limits by passing | |
183 | different flags such as `--max-filesize |

193 | different flags such as `--max-filesize=2048kb` or `--repo-limit=10` | |
184 |
|
194 | |||
185 | For more advanced execution logic, you can use a configuration file that |

195 | For more advanced execution logic, you can use a configuration file that | |
186 | defines detailed rules for which repositories are indexed, and how. |

196 | defines detailed rules for which repositories are indexed, and how. | |
187 |
|
197 | |||
188 | |RCT| provides an example index configuration file called :file:`search_mapping.ini`. |
|
198 | |RCT| provides an example index configuration file called :file:`search_mapping.ini`. | |
189 | This file is created by default during installation and is located at: |
|
199 | This file is created by default during installation and is located at: | |
190 |
|
200 | |||
191 | * :file:`/home/{user}/.rccontrol/{instance-id}/search_mapping.ini`, using default |RCT|. |
|
201 | * :file:`/home/{user}/.rccontrol/{instance-id}/search_mapping.ini`, using default |RCT|. | |
192 | * :file:`~/venv/lib/python2.7/site-packages/rhodecode_tools/templates/mapping.ini`, |
|
202 | * :file:`~/venv/lib/python2.7/site-packages/rhodecode_tools/templates/mapping.ini`, | |
193 | when using ``virtualenv``. |
|
203 | when using ``virtualenv``. | |
194 |
|
204 | |||
195 | .. note:: |
|
205 | .. note:: | |
196 |
|
206 | |||
197 | If you need to create the :file:`search_mapping.ini` file manually, use the |RCT| |
|
207 | If you need to create the :file:`search_mapping.ini` file manually, use the |RCT| | |
198 | ``rhodecode-index --create-mapping path/to/search_mapping.ini`` command. |

208 | ``rhodecode-index --create-mapping path/to/search_mapping.ini`` command. | |
199 | For details, see the :ref:`tools-cli` section. |
|
209 | For details, see the :ref:`tools-cli` section. | |
200 |
|
210 | |||
201 | To run the indexer with a mapping file, pass it using the `--mapping` flag:: |

211 | To run the indexer with a mapping file, pass it using the `--mapping` flag:: | |
202 |
|
212 | |||
203 | rhodecode-index --instance-name=enterprise-1 --mapping=/my/path/search_mapping.ini |
|
213 | rhodecode-index --instance-name=enterprise-1 --mapping=/my/path/search_mapping.ini | |
204 |
|
214 | |||
205 |
|
215 | |||
206 | Here's a detailed example of a :file:`search_mapping.ini` file. |

216 | Here's a detailed example of a :file:`search_mapping.ini` file. | |
207 |
|
217 | |||
208 | .. code-block:: ini |
|
218 | .. code-block:: ini | |
209 |
|
219 | |||
210 | [__DEFAULT__] |
|
220 | [__DEFAULT__] | |
211 | ; Create index on commits data, and files data in this order. Available options |
|
221 | ; Create index on commits data, and files data in this order. Available options | |
212 | ; are `commits`, `files` |
|
222 | ; are `commits`, `files` | |
213 | index_types = commits,files |
|
223 | index_types = commits,files | |
214 |
|
224 | |||
215 | ; Commit fetch limit. In what amount of chunks commits should be fetched |
|
225 | ; Commit fetch limit. In what amount of chunks commits should be fetched | |
216 | ; via api and parsed. This allows server to transfer smaller chunks and be less loaded |
|
226 | ; via api and parsed. This allows server to transfer smaller chunks and be less loaded | |
217 | commit_fetch_limit = 1000 |
|
227 | commit_fetch_limit = 1000 | |
218 |
|
228 | |||
219 | ; Commit process limit. Limit the number of commits indexer should fetch, and |
|
229 | ; Commit process limit. Limit the number of commits indexer should fetch, and | |
220 | ; store inside the full text search index. eg. if repo has 2000 commits, and |
|
230 | ; store inside the full text search index. eg. if repo has 2000 commits, and | |
221 | ; limit is 1000, on the first run it will process commits 0-1000 and on the |
|
231 | ; limit is 1000, on the first run it will process commits 0-1000 and on the | |
222 | ; second 1000-2000 commits. Help reduce memory usage, default is 50000 |
|
232 | ; second 1000-2000 commits. Help reduce memory usage, default is 50000 | |
223 | ; (set -1 for unlimited) |
|
233 | ; (set -1 for unlimited) | |
224 | commit_process_limit = |
|
234 | commit_process_limit = 20000 | |
225 |
|
235 | |||
226 | ; Limit of how many repositories each run can process, default is -1 (unlimited) |
|
236 | ; Limit of how many repositories each run can process, default is -1 (unlimited) | |
227 | ; in case of 1000s of repositories it's better to execute in chunks to not overload |
|
237 | ; in case of 1000s of repositories it's better to execute in chunks to not overload | |
228 | ; the server. |
|
238 | ; the server. | |
229 | repo_limit = -1 |
|
239 | repo_limit = -1 | |
230 |
|
240 | |||
231 | ; Default patterns for indexing files and content of files. Binary files |
|
241 | ; Default patterns for indexing files and content of files. Binary files | |
232 | ; are skipped by default. |
|
242 | ; are skipped by default. | |
233 |
|
243 | |||
234 | ; Add to index those comma separated files; globs syntax |
|
244 | ; Add to index those comma separated files; globs syntax | |
235 | ; e.g index_files = *.py, *.c, *.h, *.js |
|
245 | ; e.g index_files = *.py, *.c, *.h, *.js | |
236 | index_files = *, |
|
246 | index_files = *, | |
237 |
|
247 | |||
238 | ; Do not add to index those comma separated files, this excludes |
|
248 | ; Do not add to index those comma separated files, this excludes | |
239 | ; both search by name and content; globs syntax |
|
249 | ; both search by name and content; globs syntax | |
240 | ; e.g index_files = *.key, *.sql, *.xml |
|
250 | ; e.g skip_files = *.key, *.sql, *.xml, *.pem, *.crt | |
241 | skip_files = , |
|
251 | skip_files = , | |
242 |
|
252 | |||
243 | ; Add to index content of those comma separated files; globs syntax |
|
253 | ; Add to index content of those comma separated files; globs syntax | |
244 | ; e.g index_files_content = *.h, *.obj |

254 | ; e.g index_files_content = *.h, *.obj | |
245 | index_files_content = *, |
|
255 | index_files_content = *, | |
246 |
|
256 | |||
247 | ; Do not add to index content of those comma separated files; globs syntax |
|
257 | ; Do not add to index content of those comma separated files; globs syntax | |
248 | ; e.g index_files = *.exe, *.bin, *.log, *.dump |
|
258 | ; Binary files are not indexed by default. | |
|
259 | ; e.g skip_files_content = *.min.js, *.xml, *.dump, *.log | |||
249 | skip_files_content = , |
|
260 | skip_files_content = , | |
250 |
|
261 | |||
251 | ; Force rebuilding an index from scratch. Each repository will be rebuilt from |

262 | ; Force rebuilding an index from scratch. Each repository will be rebuilt from | |
252 | ; scratch with a global flag. Use --repo-name=NAME --force to rebuild single repo |
|
263 | ; scratch with a global flag. Use --repo-name=NAME --force to rebuild single repo | |
253 | force = false |
|
264 | force = false | |
254 |
|
265 | |||
255 | ; maximum file size that indexer will use, files above that limit are not going |
|
266 | ; maximum file size that indexer will use, files above that limit are not going | |
256 | ; to have their content indexed. |

267 | ; to have their content indexed. | |
257 | ; Possible options are KB (kilobytes), MB (megabytes), eg 1MB or 1024KB |
|
268 | ; Possible options are KB (kilobytes), MB (megabytes), eg 1MB or 1024KB | |
258 | max_filesize = |
|
269 | max_filesize = 10MB | |
259 |
|
270 | |||
260 |
|
271 | |||
261 | [__INDEX_RULES__] |
|
272 | [__INDEX_RULES__] | |
262 | ; Ordered match rules for repositories. A list of all repositories will be fetched |
|
273 | ; Ordered match rules for repositories. A list of all repositories will be fetched | |
263 | ; using API and this list will be filtered using those rules. |
|
274 | ; using API and this list will be filtered using those rules. | |
264 | ; Syntax for entry: `glob_pattern_OR_full_repo_name = 0 OR 1` where 0=exclude, 1=include |
|
275 | ; Syntax for entry: `glob_pattern_OR_full_repo_name = 0 OR 1` where 0=exclude, 1=include | |
265 | ; When this ordered list is traversed first match will return the include/exclude marker |
|
276 | ; When this ordered list is traversed first match will return the include/exclude marker | |
266 | ; For example: |
|
277 | ; For example: | |
267 | ; upstream/binary_repo = 0 |
|
278 | ; upstream/binary_repo = 0 | |
268 | ; upstream/subrepo/xml_files = 0 |
|
279 | ; upstream/subrepo/xml_files = 0 | |
269 | ; upstream/* = 1 |
|
280 | ; upstream/* = 1 | |
270 | ; special-repo = 1 |
|
281 | ; special-repo = 1 | |
271 | ; * = 0 |
|
282 | ; * = 0 | |
272 | ; This will index all repositories under upstream/*, but skip upstream/binary_repo |
|
283 | ; This will index all repositories under upstream/*, but skip upstream/binary_repo | |
273 | ; and upstream/subrepo/xml_files, last * = 0 means skip all other matches |

284 | ; and upstream/subrepo/xml_files, last * = 0 means skip all other matches | |
274 |
|
285 | |||
275 | ; Another example: |
|
|||
276 | ; *-fork = 0 |
|
|||
277 | ; * = 1 |
|
|||
278 | ; This will index all repositories, except those that have -fork as suffix. |
|
|||
279 |
|
||||
280 | rhodecode-vcsserver = 1 |
|
|||
281 | rhodecode-enterprise-ce = 1 |
|
|||
282 | upstream/mozilla/firefox-repo = 0 |
|
|||
283 | upstream/git-binaries = 0 |
|
|||
284 | upstream/* = 1 |
|
|||
285 | * = 0 |
|
|||
286 |
|
286 | |||
287 | ; == EXPLICIT REPOSITORY INDEXING == |
|
287 | ; == EXPLICIT REPOSITORY INDEXING == | |
288 | ; If defined this will skip using __INDEX_RULES__, and will not use API to fetch |
|
288 | ; If defined this will skip using __INDEX_RULES__, and will not use API to fetch | |
289 | ; list of repositories, it will explicitly take names defined with [NAME] format and |
|
289 | ; list of repositories, it will explicitly take names defined with [NAME] format and | |
290 | ; try to build the index, to build index just for repo_name_1 and special-repo use: |
|
290 | ; try to build the index, to build index just for repo_name_1 and special-repo use: | |
291 | ; [repo_name_1] |
|
291 | ; [repo_name_1] | |
292 | ; [special-repo] |
|
292 | ; [special-repo] | |
293 |
|
293 | |||
294 | ; == PER REPOSITORY CONFIGURATION == |
|
294 | ; == PER REPOSITORY CONFIGURATION == | |
295 | ; This allows overriding the global configuration per repository. |
|
295 | ; This allows overriding the global configuration per repository. | |
296 | ; example to set specific file limit, and skip certain files for repository special-repo |
|
296 | ; example to set specific file limit, and skip certain files for repository special-repo | |
|
297 | ; CLI flags do not override the conf settings. | |||
297 | ; [conf:special-repo] |
|
298 | ; [conf:special-repo] | |
298 | ; max_filesize = 5mb |
|
299 | ; max_filesize = 5mb | |
299 | ; skip_files = *.xml, *.sql |
|
300 | ; skip_files = *.xml, *.sql | |
300 | ; index_types = files, |
|
|||
301 |
|
301 | |||
302 | [conf:rhodecode-vcsserver] |
|
|||
303 | index_types = files, |
|
|||
304 | max_filesize = 5mb |
|
|||
305 | skip_files = *.xml, *.sql |
|
|||
306 | index_files = *.py, *.c, *.h, *.js |
|
|||
307 |
|
302 | |||
308 |
|
303 | |||
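The ``commit_process_limit`` comment in the mapping file describes how a large repository is indexed over several runs. A minimal Python sketch of that arithmetic follows. ``commits_for_run`` is a hypothetical helper written only for illustration, and the exact boundary handling inside the indexer is an assumption.

.. code-block:: python

    def commits_for_run(total_commits, already_indexed, commit_process_limit=50000):
        """Return the (start, end) commit range a single run would process.

        Hypothetical helper; -1 means unlimited, as in the mapping file.
        """
        if commit_process_limit == -1:
            end = total_commits
        else:
            end = min(already_indexed + commit_process_limit, total_commits)
        return already_indexed, end


    if __name__ == "__main__":
        # A repository with 2000 commits and a limit of 1000 needs two runs.
        first = commits_for_run(2000, 0, 1000)
        second = commits_for_run(2000, first[1], 1000)
        print(first, second)
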
309 | In case of 1000s of repositories it can be tricky to write the include/exclude rules at first. |
|
304 | In case of 1000s of repositories it can be tricky to write the include/exclude rules at first. | |
310 | There's a special flag to test the mapping file rules and list repositories that would |
|
305 | There's a special flag to test the mapping file rules and list repositories that would | |
311 | be indexed. Run the indexer with `--show-matched-repos` to list only the |

306 | be indexed. Run the indexer with `--show-matched-repos` to list only the | |

307 | repositories matched by the rules in the .ini file:: | |||
312 |
|
308 | |||
313 | rhodecode-index --instance-name=enterprise-1 --show-matched-repos --mapping=/my/path/search_mapping.ini |
|
309 | rhodecode-index --instance-name=enterprise-1 --show-matched-repos --mapping=/my/path/search_mapping.ini | |
314 |
|
310 | |||
315 |
|
311 | |||
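The ordered, first-match behaviour described for ``[__INDEX_RULES__]`` can also be sketched in Python when drafting rules. This is only an illustration: it assumes the glob patterns behave like Python's ``fnmatch`` module, which the text above does not confirm, and ``is_indexed`` and ``matched_repos`` are hypothetical helper names, not part of |RCT|. The authoritative check remains `--show-matched-repos`.

.. code-block:: python

    from fnmatch import fnmatchcase

    # Ordered (pattern, marker) pairs, 1 = include and 0 = exclude.
    # Order matters: the first pattern that matches decides the outcome.
    RULES = [
        ("rhodecode-vcsserver", 1),
        ("rhodecode-enterprise-ce", 1),
        ("upstream/mozilla/firefox-repo", 0),
        ("upstream/git-binaries", 0),
        ("upstream/*", 1),
        ("*", 0),
    ]


    def is_indexed(repo_name, rules=RULES):
        """Return True if the first rule matching repo_name is an include rule."""
        for pattern, marker in rules:
            if fnmatchcase(repo_name, pattern):
                return marker == 1
        # Assumption: a repository that matches no rule is not indexed.
        return False


    def matched_repos(repo_names, rules=RULES):
        """Keep only the repository names that the rules include."""
        return [name for name in repo_names if is_indexed(name, rules)]


    if __name__ == "__main__":
        repos = ["rhodecode-vcsserver", "upstream/git-binaries",
                 "upstream/other", "scratch-repo"]
        print(matched_repos(repos))

Because the first match wins, the specific exclusions must appear before the broader ``upstream/*`` rule. Placed after it, they would never be reached.
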
316 | .. _enable-elasticsearch: |
|
312 | .. _enable-elasticsearch: | |
317 |
|
313 | |||
318 | Enabling Elastic |
|
314 | Enabling ElasticSearch | |
319 | ^^^^^^^^^^^^^^^^^^^^^^ |
|
315 | ^^^^^^^^^^^^^^^^^^^^^^ | |
320 |
|
316 | |||
321 | Elastic |

317 | ElasticSearch is available in the EE edition only. It provides much more scalable and advanced | |
322 | search capabilities. While Whoosh is fine for upto 1-2GB of data beyond that amount |

318 | search capabilities. While Whoosh is fine for up to 1-2GB of data, beyond that amount it | |
323 |
|
|
319 | starts slowing down, and can cause other problems. | |
324 | much more advanced query language allowing advanced filtering by file paths, extensions |
|
320 | ElasticSearch 6 also provides a much more advanced query language. | |
325 | OR statements, ranges etc. Please check query language examples in the search field for |

321 | It allows advanced filtering by file paths and extensions, OR statements, ranges, etc. | |
326 | some advanced query language usage. |
|
322 | Please check query language examples in the search field for some advanced query language usage. | |
327 |
|
323 | |||
328 |
|
324 | |||
329 | 1. Open the :file:`rhodecode.ini` file for the instance you wish to edit. The |
|
325 | 1. Open the :file:`rhodecode.ini` file for the instance you wish to edit. The | |
330 | default location is |
|
326 | default location is | |
331 | :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini` |

327 | :file:`/home/{user}/.rccontrol/{instance-id}/rhodecode.ini` | |
332 | 2. Find the search configuration section: |
|
328 | 2. Find the search configuration section: | |
333 |
|
329 | |||
334 | .. code-block:: ini |
|
330 | .. code-block:: ini | |
335 |
|
331 | |||
336 | ################################### |
|
332 | ################################### | |
337 | ## SEARCH INDEXING CONFIGURATION ## |
|
333 | ## SEARCH INDEXING CONFIGURATION ## | |
338 | ################################### |
|
334 | ################################### | |
339 |
|
335 | |||
340 | search.module = rhodecode.lib.index.whoosh |
|
336 | search.module = rhodecode.lib.index.whoosh | |
341 | search.location = %(here)s/data/index |
|
337 | search.location = %(here)s/data/index | |
342 |
|
338 | |||
343 | and change it to: |
|
339 | and change it to: | |
344 |
|
340 | |||
345 | .. code-block:: ini |
|
341 | .. code-block:: ini | |
346 |
|
342 | |||
347 | search.module = rc_elasticsearch |
|
343 | search.module = rc_elasticsearch | |
348 | search.location = http://localhost:9200 |
|
344 | search.location = http://localhost:9200 | |
349 | ## specify Elastic Search version, 6 for latest or 2 for legacy |
|
345 | ## specify Elastic Search version, 6 for latest or 2 for legacy | |
350 | search.es_version = 6 |
|
346 | search.es_version = 6 | |
351 |
|
347 | |||
352 | where ``search.location`` points to the |
|
348 | where ``search.location`` points to the ElasticSearch server | |
353 | by default running on port 9200. |
|
349 | by default running on port 9200. | |
354 |
|
350 | |||
355 | The indexer invocation also needs to change. Provide the --es-version= and |

351 | The indexer invocation also needs to change. Provide the --es-version= and | |
356 | --engine-location= parameters to define |

352 | --engine-location= parameters to define the ElasticSearch server location and its version. | |
357 | For example:: |
|
353 | For example:: | |
358 |
|
354 | |||
359 | rhodecode-index --instance-name=enterprise-1 --es-version=6 --engine-location=http://localhost:9200 |

355 | rhodecode-index --instance-name=enterprise-1 --es-version=6 --engine-location=http://localhost:9200 | |
360 |
|
356 | |||
361 |
|
357 | |||
362 | .. _Whoosh: https://pypi.python.org/pypi/Whoosh/ |
|
358 | .. _Whoosh: https://pypi.python.org/pypi/Whoosh/ | |
363 | .. _Elastic |
|
359 | .. _ElasticSearch 6: https://www.elastic.co/ |