upstream/kallithea Commit - r1302:f0e90465

moved LANGUAGE_EXTENSION_MAP to lib, and made whoosh indexer use the same map

marcink -

r1302:f0e90465 beta

docs/changelog.rst

0 +3 -1

              .. _changelog:
              Changelog
              =========
              1.2.0 (**2011-XX-XX**)
              ======================
              :status: in-progress
              :branch: beta
              news
              ----
              - implemented #89 Can setup google analytics code from settings menu
              - implemented #91 added nicer looking archive urls with more download options
                like tags, branches
              - implemented #44 into file browsing, and added follow branch option
              - implemented #84 downloads can be enabled/disabled for each repository
              - anonymous repository can be cloned without having to pass default:default
                into clone url
              - fixed #90 whoosh indexer can index chosen repositories passed in command
                line
              - extended journal with day aggregates and paging
              - implemented #107 source code lines highlight ranges
              - implemented #93 customizable changelog on combined revision ranges -
                equivalent of githubs compare view
              - implemented #108 extended and more powerful LDAP configuration
              - implemented #56 users groups
              - major code rewrites, optimized code for speed and memory usage
              - raw and diff downloads are now in git format
              - setup command checks for write access to given path
              - fixed many issues with international characters and unicode. It uses utf8
                decode with replace to provide less errors even with non utf8 encoded strings
              - #125 added API KEY access to feeds
              - #109 Repository can be created from external Mercurial link (aka. remote
                repository), and manually updated (via pull) from admin panel
              - beta git support - push/pull server + basic view for git repos
-             - added followers page
+             - added followers page and forks page
              fixes
              -----
              - fixed file browser bug, when switching into given form revision the url was
                not changing
              - fixed propagation to error controller on simplehg and simplegit middlewares
              - fixed error when trying to make a download on empty repository
              - fixed problem with '[' chars in commit messages in journal
              - fixed #99 Unicode errors, on file node paths with non utf-8 characters
              - journal fork fixes
              - removed issue with space inside renamed repository after deletion
              - fixed strange issue on formencode imports
              - fixed #126 Deleting repository on Windows, rename used incompatible chars.
              - #150 fixes for errors on repositories mapped in db but corrupted in
                filesystem
              - fixed problem with ascendant characters in realm #181
+             - fixed problem with sqlite file based database connection pool
+             - whoosh indexer and code stats share the same dynamic extensions map
              1.1.8 (**2011-04-12**)
              ======================
              news
              ----
              - improved windows support
              fixes
              -----
              - fixed #140 freeze of python dateutil library, since new version is python2.x
                incompatible
              - setup-app will check for write permission in given path
              - cleaned up license info issue #149
              - fixes for issues #137,#116 and problems with unicode and accented characters.
              - fixes crashes on gravatar, when passed in email as unicode
              - fixed tooltip flickering problems
              - fixed came_from redirection on windows
              - fixed logging modules, and sql formatters
              - windows fixes for os.kill issue #133
              - fixes path splitting for windows issues #148
              - fixed issue #143 wrong import on migration to 1.1.X
              - fixed problems with displaying binary files, thanks to Thomas Waldmann
              - removed name from archive files since it's breaking ui for long repo names
              - fixed issue with archive headers sent to browser, thanks to Thomas Waldmann
              - fixed compatibility for 1024px displays, and larger dpi settings, thanks to
                Thomas Waldmann
              - fixed issue #166 summary pager was skipping 10 revisions on second page
              1.1.7 (**2011-03-23**)
              ======================
              news
              ----
              fixes
              -----
              - fixed (again) #136 installation support for FreeBSD
              1.1.6 (**2011-03-21**)
              ======================
              news
              ----
              fixes
              -----
              - fixed #136 installation support for FreeBSD
              - RhodeCode will check for python version during installation
              1.1.5 (**2011-03-17**)
              ======================
              news
              ----
              - basic windows support, by exchanging pybcrypt into sha256 for windows only
                highly inspired by idea of mantis406
              fixes
              -----
              - fixed sorting by author in main page
              - fixed crashes with diffs on binary files
              - fixed #131 problem with boolean values for LDAP
              - fixed #122 mysql problems thanks to striker69
              - fixed problem with errors on calling raw/raw_files/annotate functions
                with unknown revisions
              - fixed returned rawfiles attachment names with international character
              - cleaned out docs, big thanks to Jason Harris
              1.1.4 (**2011-02-19**)
              ======================
              news
              ----
              fixes
              -----
              - fixed formencode import problem on settings page, that caused server crash
                when that page was accessed as first after server start
              - journal fixes
              - fixed option to access repository just by entering http://server/<repo_name>
              1.1.3 (**2011-02-16**)
              ======================
              news
              ----
              - implemented #102 allowing the '.' character in username
              - added option to access repository just by entering http://server/<repo_name>
              - celery task ignores result for better performance
              fixes
              -----
              - fixed ehlo command and non auth mail servers on smtp_lib. Thanks to
                apollo13 and Johan Walles
              - small fixes in journal
              - fixed problems with getting setting for celery from .ini files
              - registration, password reset and login boxes share the same title as main
                application now
              - fixed #113: too high permissions to fork repository
              - fixed problem with '[' chars in commit messages in journal
              - removed issue with space inside renamed repository after deletion
              - db transaction fixes when filesystem repository creation failed
              - fixed #106 relation issues on databases different than sqlite
              - fixed static files paths links to use of url() method
              1.1.2 (**2011-01-12**)
              ======================
              news
              ----
              fixes
              -----
              - fixes #98 protection against float division of percentage stats
              - fixed graph bug
              - forced webhelpers version since it was making troubles during installation
              1.1.1 (**2011-01-06**)
              ======================
              news
              ----
              - added force https option into ini files for easier https usage (no need to
                set server headers with this options)
              - small css updates
              fixes
              -----
              - fixed #96 redirect loop on files view on repositories without changesets
              - fixed #97 unicode string passed into server header in special cases (mod_wsgi)
                and server crashed with errors
              - fixed large tooltips problems on main page
              - fixed #92 whoosh indexer is more error proof
              1.1.0 (**2010-12-18**)
              ======================
              news
              ----
              - rewrite of internals for vcs >=0.1.10
              - uses mercurial 1.7 with dotencode disabled for maintaining compatibility
                with older clients
              - anonymous access, authentication via ldap
              - performance upgrade for cached repos list - each repository has its own
                cache that's invalidated when needed.
              - performance upgrades on repositories with large amount of commits (20K+)
              - main page quick filter for filtering repositories
              - user dashboards with ability to follow chosen repositories actions
              - sends email to admin on new user registration
              - added cache/statistics reset options into repository settings
              - more detailed action logger (based on hooks) with pushed changesets lists
                and options to disable those hooks from admin panel
              - introduced new enhanced changelog for merges that shows more accurate results
              - new improved and faster code stats (based on pygments lexers mapping tables),
                showing up to 10 trending sources for each repository. Additionally stats
                can be disabled in repository settings.
              - gui optimizations, fixed application width to 1024px
              - added cut off (for large files/changesets) limit into config files
              - whoosh, celeryd, upgrade moved to paster command
              - other than sqlite database backends can be used
              fixes
              -----
              - fixes #61 forked repo was showing only after cache expired
              - fixes #76 no confirmation on user deletes
              - fixes #66 Name field misspelled
              - fixes #72 block user removal when he owns repositories
              - fixes #69 added password confirmation fields
              - fixes #87 RhodeCode crashes occasionally on updating repository owner
              - fixes #82 broken annotations on files with more than 1 blank line at the end
              - a lot of fixes and tweaks for file browser
              - fixed detached session issues
              - fixed when user had no repos he would see all repos listed in my account
              - fixed ui() instance bug when global hgrc settings was loaded for server
                instance and all hgrc options were merged with our db ui() object
              - numerous small bugfixes
              (special thanks for TkSoh for detailed feedback)
              1.0.2 (**2010-11-12**)
              ======================
              news
              ----
              - tested under python2.7
              - bumped sqlalchemy and celery versions
              fixes
              -----
              - fixed #59 missing graph.js
              - fixed repo_size crash when repository had broken symlinks
              - fixed python2.5 crashes.
              1.0.1 (**2010-11-10**)
              ======================
              news
              ----
              - small css updates
              fixes
              -----
              - fixed #53 python2.5 incompatible enumerate calls
              - fixed #52 disable mercurial extension for web
              - fixed #51 deleting repositories doesn't delete their dependent objects
              1.0.0 (**2010-11-02**)
              ======================
              - security bugfix simplehg wasn't checking for permissions on commands
                other than pull or push.
              - fixed doubled messages after push or pull in admin journal
              - templating and css corrections, fixed repo switcher on chrome, updated titles
              - admin menu accessible from options menu on repository view
              - permissions cached queries
              1.0.0rc4  (**2010-10-12**)
              ==========================
              - fixed python2.5 missing simplejson imports (thanks to Jens Bäckman)
              - removed cache_manager settings from sqlalchemy meta
              - added sqlalchemy cache settings to ini files
              - validated password length and added second try of failure on paster setup-app
              - fixed setup database destroy prompt even when there was no db
              1.0.0rc3 (**2010-10-11**)
              =========================
              - fixed i18n during installation.
              1.0.0rc2 (**2010-10-11**)
              =========================
              - Disabled dirsize in file browser, it's causing a nasty bug when dir renames
                occur. After vcs is fixed it'll be put back again.
              - templating/css rewrites, optimized css.
  No newline at end of file

rhodecode/lib/__init__.py

0 +42 0

              # -*- coding: utf-8 -*-
              """
                  rhodecode.lib.__init__
                  ~~~~~~~~~~~~~~~~~~~~~~~
                  Some simple helper functions
                  :created_on: Jan 5, 2011
                  :author: marcink
                  :copyright: (C) 2009-2010 Marcin Kuzminski <marcin@python-works.com>
                  :license: GPLv3, see COPYING for more details.
              """
              # This program is free software: you can redistribute it and/or modify
              # it under the terms of the GNU General Public License as published by
              # the Free Software Foundation, either version 3 of the License, or
              # (at your option) any later version.
              #
              # This program is distributed in the hope that it will be useful,
              # but WITHOUT ANY WARRANTY; without even the implied warranty of
              # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
              # GNU General Public License for more details.
              #
              # You should have received a copy of the GNU General Public License
              # along with this program.  If not, see <http://www.gnu.org/licenses/>.
+             def __get_lem():
+                 from pygments import lexers
+                 from string import lower
+                 from collections import defaultdict
+                 d = defaultdict(lambda: [])
+                 def __clean(s):
+                     s = s.lstrip('*')
+                     s = s.lstrip('.')
+                     if s.find('[') != -1:
+                         exts = []
+                         start, stop = s.find('['), s.find(']')
+                         for suffix in s[start + 1:stop]:
+                             exts.append(s[:s.find('[')] + suffix)
+                         return map(lower, exts)
+                     else:
+                         return map(lower, [s])
+                 for lx, t in sorted(lexers.LEXERS.items()):
+                     m = map(__clean, t[-2])
+                     if m:
+                         m = reduce(lambda x, y: x + y, m)
+                         for ext in m:
+                             desc = lx.replace('Lexer', '')
+                             d[ext].append(desc)
+                 return dict(d)
+             # the language map is also used by the whoosh indexer, which indexes the
+             # content of files with those extensions
+             LANGUAGES_EXTENSIONS_MAP = __get_lem()
+             # Additional mappings that are not present in the pygments lexers
+             # NOTE: this will override any mappings in LANGUAGES_EXTENSIONS_MAP
+             ADDITIONAL_MAPPINGS = {'xaml': 'XAML'}
+             LANGUAGES_EXTENSIONS_MAP.update(ADDITIONAL_MAPPINGS)
              def str2bool(_str):
                  """
                  returns True/False value from given string, it tries to translate the
                  string into boolean
                  :param _str: string value to translate into boolean
                  :rtype: boolean
                  :returns: boolean from given string
                  """
                  if _str is None:
                      return False
                  if _str in (True, False):
                      return _str
                  _str = str(_str).strip().lower()
                  return _str in ('t', 'true', 'y', 'yes', 'on', '1')
              def generate_api_key(username, salt=None):
                  """
                  Generates a unique API key for the given username; if salt is not given,
                  it'll be generated from some random string
                  :param username: username as string
                  :param salt: salt used to generate the KEY hash
                  :rtype: str
                  :returns: sha1 hash from username+salt
                  """
                  from tempfile import _RandomNameSequence
                  import hashlib
                  if salt is None:
                      salt = _RandomNameSequence().next()
                  return hashlib.sha1(username + salt).hexdigest()
              def safe_unicode(_str, from_encoding='utf8'):
                  """
                  safe unicode function. In case of UnicodeDecode error we try to return
                  unicode with errors replace
                  :param _str: string to decode
                  :rtype: unicode
                  :returns: unicode object
                  """
                  if isinstance(_str, unicode):
                      return _str
                  try:
                      u_str = unicode(_str, from_encoding)
                  except UnicodeDecodeError:
                      u_str = unicode(_str, from_encoding, 'replace')
                  return u_str
              def engine_from_config(configuration, prefix='sqlalchemy.', **kwargs):
                  """
                  Custom engine_from_config function that makes sure we use NullPool for
                  file based sqlite databases. This prevents errors on sqlite.
                  """
                  from sqlalchemy import engine_from_config as efc
                  from sqlalchemy.pool import NullPool
                  url = configuration[prefix + 'url']
                  if url.startswith('sqlite'):
                      kwargs.update({'poolclass':NullPool})
                  return efc(configuration, prefix, **kwargs)
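The extension map built by `__get_lem()` above can be tried in isolation. The following is a minimal sketch, not the committed code: it targets Python 3 (the module itself is Python 2), and it replaces `pygments.lexers.LEXERS` with a small hypothetical table, `SAMPLE_LEXERS`, that maps a lexer name straight to its filename globs. The names `expand_pattern`, `build_extension_map`, `SAMPLE_LEXERS` and `EXTENSION_MAP` are invented for illustration. Unlike the committed function, the sketch skips duplicate descriptions (as the removed code in tasks.py did) and tolerates a glob with no closing bracket.

```python
from collections import defaultdict


def expand_pattern(pattern):
    """Turn a filename glob such as '*.php[345]' into lowercase extensions."""
    s = pattern.lstrip('*').lstrip('.')
    start, stop = s.find('['), s.find(']')
    if start != -1 and stop > start:
        # One extension per character listed inside the brackets.
        return [(s[:start] + ch).lower() for ch in s[start + 1:stop]]
    return [s.lower()]


def build_extension_map(lexers):
    """Map each extension to the lexer descriptions that claim it."""
    ext_map = defaultdict(list)
    for name, patterns in sorted(lexers.items()):
        desc = name.replace('Lexer', '')
        for pattern in patterns:
            for ext in expand_pattern(pattern):
                # Assumption: keep each description once per extension.
                if desc not in ext_map[ext]:
                    ext_map[ext].append(desc)
    return dict(ext_map)


# Hypothetical stand-in for the pygments lexer table: name -> filename globs.
SAMPLE_LEXERS = {
    'PythonLexer': ('*.py', '*.pyw'),
    'PhpLexer': ('*.php', '*.php[345]'),
    'CLexer': ('*.c', '*.h'),
    'CppLexer': ('*.cpp', '*.h'),
}

EXTENSION_MAP = build_extension_map(SAMPLE_LEXERS)
```

A consumer such as an indexer or a statistics task could then test `ext in EXTENSION_MAP` to decide whether a file is worth processing, and read `EXTENSION_MAP[ext]` for the language descriptions.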

rhodecode/lib/celerylib/tasks.py

0 +1 -36

              # -*- coding: utf-8 -*-
              """
                  rhodecode.lib.celerylib.tasks
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  RhodeCode task modules, containing all tasks that are supposed to be run
                  by the celery daemon
                  :created_on: Oct 6, 2010
                  :author: marcink
                  :copyright: (C) 2009-2011 Marcin Kuzminski <marcin@python-works.com>
                  :license: GPLv3, see COPYING for more details.
              """
              # This program is free software: you can redistribute it and/or modify
              # it under the terms of the GNU General Public License as published by
              # the Free Software Foundation, either version 3 of the License, or
              # (at your option) any later version.
              #
              # This program is distributed in the hope that it will be useful,
              # but WITHOUT ANY WARRANTY; without even the implied warranty of
              # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
              # GNU General Public License for more details.
              #
              # You should have received a copy of the GNU General Public License
              # along with this program.  If not, see <http://www.gnu.org/licenses/>.
              from celery.decorators import task
              import os
              import traceback
              import logging
              from time import mktime
              from operator import itemgetter
-             from pygments import lexers
              from string import lower
              from pylons import config
              from pylons.i18n.translation import _
+             from rhodecode.lib import LANGUAGES_EXTENSIONS_MAP
              from rhodecode.lib.celerylib import run_task, locked_task, str2bool, \
                  __get_lockkey, LockHeld, DaemonLock
              from rhodecode.lib.helpers import person
              from rhodecode.lib.smtp_mailer import SmtpMailer
              from rhodecode.lib.utils import OrderedDict, add_cache
              from rhodecode.model import init_model
              from rhodecode.model import meta
              from rhodecode.model.db import RhodeCodeUi, Statistics, Repository
              from vcs.backends import get_repo
              from sqlalchemy import engine_from_config
              add_cache(config)
              try:
                  import json
              except ImportError:
                  #python 2.5 compatibility
                  import simplejson as json
              __all__ = ['whoosh_index', 'get_commits_stats',
                         'reset_user_password', 'send_email']
              CELERY_ON = str2bool(config['app_conf'].get('use_celery'))
-             LANGUAGES_EXTENSIONS_MAP = {}
-             def __clean(s):
-                 s = s.lstrip('*')
-                 s = s.lstrip('.')
-                 if s.find('[') != -1:
-                     exts = []
-                     start, stop = s.find('['), s.find(']')
-                     for suffix in s[start + 1:stop]:
-                         exts.append(s[:s.find('[')] + suffix)
-                     return map(lower, exts)
-                 else:
-                     return map(lower, [s])
-             for lx, t in sorted(lexers.LEXERS.items()):
-                 m = map(__clean, t[-2])
-                 if m:
-                     m = reduce(lambda x, y: x + y, m)
-                     for ext in m:
-                         desc = lx.replace('Lexer', '')
-                         if ext in LANGUAGES_EXTENSIONS_MAP:
-                             if desc not in LANGUAGES_EXTENSIONS_MAP[ext]:
-                                 LANGUAGES_EXTENSIONS_MAP[ext].append(desc)
-                         else:
-                             LANGUAGES_EXTENSIONS_MAP[ext] = [desc]
-             #Additional mappings that are not present in the pygments lexers
-             # NOTE: that this will overide any mappings in LANGUAGES_EXTENSIONS_MAP
-             ADDITIONAL_MAPPINGS = {'xaml': 'XAML'}
-             LANGUAGES_EXTENSIONS_MAP.update(ADDITIONAL_MAPPINGS)
              def get_session():
                  if CELERY_ON:
                      engine = engine_from_config(config, 'sqlalchemy.db1.')
                      init_model(engine)
                  sa = meta.Session()
                  return sa
              def get_repos_path():
                  sa = get_session()
                  q = sa.query(RhodeCodeUi).filter(RhodeCodeUi.ui_key == '/').one()
                  return q.ui_value
              @task(ignore_result=True)
              @locked_task
              def whoosh_index(repo_location, full_index):
                  #log = whoosh_index.get_logger()
                  from rhodecode.lib.indexers.daemon import WhooshIndexingDaemon
                  index_location = config['index_dir']
                  WhooshIndexingDaemon(index_location=index_location,
                                       repo_location=repo_location, sa=get_session())\
                                       .run(full_index=full_index)
              @task(ignore_result=True)
              def get_commits_stats(repo_name, ts_min_y, ts_max_y):
                  try:
                      log = get_commits_stats.get_logger()
                  except:
                      log = logging.getLogger(__name__)
                  lockkey = __get_lockkey('get_commits_stats', repo_name, ts_min_y,
                                          ts_max_y)
                  log.info('running task with lockkey %s', lockkey)
                  try:
                      lock = DaemonLock(lockkey)
                      #for js data compatibility, strips double quotes from the person key
                      akc = lambda k: person(k).replace('"', "")
                      co_day_auth_aggr = {}
                      commits_by_day_aggregate = {}
                      repos_path = get_repos_path()
                      p = os.path.join(repos_path, repo_name)
                      repo = get_repo(p)
                      repo_size = len(repo.revisions)
                      #return if repo has no revisions
                      if repo_size < 1:
                          lock.release()
                          return True
                      skip_date_limit = True
                      parse_limit = int(config['app_conf'].get('commit_parse_limit'))
                      last_rev = 0
                      last_cs = None
                      timegetter = itemgetter('time')
                      sa = get_session()
                      dbrepo = sa.query(Repository)\
                          .filter(Repository.repo_name == repo_name).scalar()
                      cur_stats = sa.query(Statistics)\
                          .filter(Statistics.repository == dbrepo).scalar()
                      if cur_stats is not None:
                          last_rev = cur_stats.stat_on_revision
                      if last_rev == repo.get_changeset().revision and repo_size > 1:
                          #pass silently without any work if we're not on first revision or
                          #current state of parsing revision(from db marker) is the
                          #last revision
                          lock.release()
                          return True
                      if cur_stats:
                          commits_by_day_aggregate = OrderedDict(json.loads(
                                                      cur_stats.commit_activity_combined))
                          co_day_auth_aggr = json.loads(cur_stats.commit_activity)
                      log.debug('starting parsing %s', parse_limit)
                      lmktime = mktime
                      last_rev = last_rev + 1 if last_rev > 0 else last_rev
                      for cs in repo[last_rev:last_rev + parse_limit]:
                          last_cs = cs  # remember last parsed changeset
                          k = lmktime([cs.date.timetuple()[0], cs.date.timetuple()[1],
                                        cs.date.timetuple()[2], 0, 0, 0, 0, 0, 0])
                          if akc(cs.author) in co_day_auth_aggr:
                              try:
                                  l = [timegetter(x) for x in
                                       co_day_auth_aggr[akc(cs.author)]['data']]
                                  time_pos = l.index(k)
                              except ValueError:
                                  time_pos = False
                              if time_pos >= 0 and time_pos is not False:
                                  datadict = \
                                      co_day_auth_aggr[akc(cs.author)]['data'][time_pos]
                                  datadict["commits"] += 1
                                  datadict["added"] += len(cs.added)
                                  datadict["changed"] += len(cs.changed)
                                  datadict["removed"] += len(cs.removed)
                              else:
                                  if k >= ts_min_y and k <= ts_max_y or skip_date_limit:
                                      datadict = {"time": k,
                                                  "commits": 1,
                                                  "added": len(cs.added),
                                                  "changed": len(cs.changed),
                                                  "removed": len(cs.removed),
                                                 }
                                      co_day_auth_aggr[akc(cs.author)]['data']\
                                          .append(datadict)
                          else:
                              if k >= ts_min_y and k <= ts_max_y or skip_date_limit:
                                  co_day_auth_aggr[akc(cs.author)] = {
                                                      "label": akc(cs.author),
                                                      "data": [{"time":k,
                                                               "commits":1,
                                                               "added":len(cs.added),
                                                               "changed":len(cs.changed),
                                                               "removed":len(cs.removed),
                                                               }],
                                                      "schema": ["commits"],
                                                      }
                          #gather all data by day
                          if k in commits_by_day_aggregate:
                              commits_by_day_aggregate[k] += 1
                          else:
                              commits_by_day_aggregate[k] = 1
                      overview_data = sorted(commits_by_day_aggregate.items(),
                                             key=itemgetter(0))
                      if not co_day_auth_aggr:
                          co_day_auth_aggr[akc(repo.contact)] = {
                              "label": akc(repo.contact),
                              "data": [0, 1],
                              "schema": ["commits"],
                          }
                      stats = cur_stats if cur_stats else Statistics()
                      stats.commit_activity = json.dumps(co_day_auth_aggr)
                      stats.commit_activity_combined = json.dumps(overview_data)
                      log.debug('last revision %s', last_rev)
                      leftovers = len(repo.revisions[last_rev:])
                      log.debug('revisions to parse %s', leftovers)
                      if last_rev == 0 or leftovers < parse_limit:
                          log.debug('getting code trending stats')
                          stats.languages = json.dumps(__get_codes_stats(repo_name))
                      try:
                          stats.repository = dbrepo
                          stats.stat_on_revision = last_cs.revision if last_cs else 0
                          sa.add(stats)
                          sa.commit()
                      except:
                          log.error(traceback.format_exc())
                          sa.rollback()
                          lock.release()
                          return False
                      #final release
                      lock.release()
                      #execute another task if celery is enabled
                      if len(repo.revisions) > 1 and CELERY_ON:
                          run_task(get_commits_stats, repo_name, ts_min_y, ts_max_y)
                      return True
                  except LockHeld:
                      log.info('LockHeld')
                      return 'Task with key %s already running' % lockkey
              @task(ignore_result=True)
              def reset_user_password(user_email):
                  try:
                      log = reset_user_password.get_logger()
                  except:
                      log = logging.getLogger(__name__)
                  from rhodecode.lib import auth
                  from rhodecode.model.db import User
                  try:
                      try:
                          sa = get_session()
                          user = sa.query(User).filter(User.email == user_email).scalar()
                          new_passwd = auth.PasswordGenerator().gen_password(8,
                                           auth.PasswordGenerator.ALPHABETS_BIG_SMALL)
                          if user:
                              user.password = auth.get_crypt_password(new_passwd)
                              user.api_key = auth.generate_api_key(user.username)
                              sa.add(user)
                              sa.commit()
                              log.info('change password for %s', user_email)
                          if new_passwd is None:
                              raise Exception('unable to generate new password')
                      except:
                          log.error(traceback.format_exc())
                          sa.rollback()
                      run_task(send_email, user_email,
                               "Your new rhodecode password",
                               'Your new rhodecode password:%s' % (new_passwd))
                      log.info('send new password mail to %s', user_email)
                  except:
                      log.error('Failed to update user password')
                      log.error(traceback.format_exc())
                  return True
              @task(ignore_result=True)
              def send_email(recipients, subject, body):
                  """
                  Sends an email with defined parameters from the .ini files.
                  :param recipients: list of recipients, if this is empty the defined email
                      address from field 'email_to' is used instead
                  :param subject: subject of the mail
                  :param body: body of the mail
                  """
                  try:
                      log = send_email.get_logger()
                  except:
                      log = logging.getLogger(__name__)
                  email_config = config
                  if not recipients:
                      recipients = [email_config.get('email_to')]
                  mail_from = email_config.get('app_email_from')
                  user = email_config.get('smtp_username')
                  passwd = email_config.get('smtp_password')
                  mail_server = email_config.get('smtp_server')
                  mail_port = email_config.get('smtp_port')
                  tls = str2bool(email_config.get('smtp_use_tls'))
                  ssl = str2bool(email_config.get('smtp_use_ssl'))
                  debug = str2bool(config.get('debug'))
                  try:
                      m = SmtpMailer(mail_from, user, passwd, mail_server,
                                     mail_port, ssl, tls, debug=debug)
                      m.send(recipients, subject, body)
                  except:
                      log.error('Mail sending failed')
                      log.error(traceback.format_exc())
                      return False
                  return True
              @task(ignore_result=True)
              def create_repo_fork(form_data, cur_user):
                  from rhodecode.model.repo import RepoModel
                  from vcs import get_backend
                  try:
                      log = create_repo_fork.get_logger()
                  except:
                      log = logging.getLogger(__name__)
                  repo_model = RepoModel(get_session())
                  repo_model.create(form_data, cur_user, just_db=True, fork=True)
                  repo_name = form_data['repo_name']
                  repos_path = get_repos_path()
                  repo_path = os.path.join(repos_path, repo_name)
                  repo_fork_path = os.path.join(repos_path, form_data['fork_name'])
                  alias = form_data['repo_type']
                  log.info('creating repo fork %s as %s', repo_name, repo_path)
                  backend = get_backend(alias)
                  backend(str(repo_fork_path), create=True, src_url=str(repo_path))
              def __get_codes_stats(repo_name):
                  repos_path = get_repos_path()
                  p = os.path.join(repos_path, repo_name)
                  repo = get_repo(p)
                  tip = repo.get_changeset()
                  code_stats = {}
                  def aggregate(cs):
                      for f in cs[2]:
                          ext = lower(f.extension)
                          if ext in LANGUAGES_EXTENSIONS_MAP.keys() and not f.is_binary:
                              if ext in code_stats:
                                  code_stats[ext] += 1
                              else:
                                  code_stats[ext] = 1
                  map(aggregate, tip.walk('/'))
                  return code_stats or {}
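
The extension counting done by __get_codes_stats can be sketched in isolation as follows. This is a minimal illustration, not the project's code: EXTENSIONS_MAP is a hypothetical stand-in for LANGUAGES_EXTENSIONS_MAP (whose contents are not shown in this diff), and files are modelled as plain (extension, is_binary) pairs instead of vcs file nodes. A plain loop is used because map() is lazy on Python 3 and would not run the aggregation.

```python
from collections import Counter

# Hypothetical stand-in for rhodecode.lib.LANGUAGES_EXTENSIONS_MAP;
# the real map's contents are not part of this diff.
EXTENSIONS_MAP = {"py": "Python", "js": "Javascript", "rst": "Rst"}


def count_extensions(files, extensions_map=EXTENSIONS_MAP):
    """Count non-binary files per known extension.

    ``files`` is an iterable of (extension, is_binary) pairs, an
    assumption made here to keep the sketch free of vcs objects.
    """
    stats = Counter()
    for extension, is_binary in files:
        ext = extension.lower()
        # Membership on the dict itself; no need to build a key list.
        if ext in extensions_map and not is_binary:
            stats[ext] += 1
    return dict(stats)


sample = [("py", False), ("PY", False), ("js", False),
          ("png", True), ("rst", True)]
print(count_extensions(sample))  # {'py': 2, 'js': 1}
```

Because both the statistics task and the indexer read the same map, adding an extension in one place changes what is counted and what is indexed.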

rhodecode/lib/indexers/__init__.py

0 +13 -19

              # -*- coding: utf-8 -*-
              """
                  rhodecode.lib.indexers.__init__
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  Whoosh indexing module for RhodeCode
                  :created_on: Aug 17, 2010
                  :author: marcink
                  :copyright: (C) 2009-2010 Marcin Kuzminski <marcin@python-works.com>
                  :license: GPLv3, see COPYING for more details.
              """
              # This program is free software: you can redistribute it and/or modify
              # it under the terms of the GNU General Public License as published by
              # the Free Software Foundation, either version 3 of the License, or
              # (at your option) any later version.
              #
              # This program is distributed in the hope that it will be useful,
              # but WITHOUT ANY WARRANTY; without even the implied warranty of
              # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
              # GNU General Public License for more details.
              #
              # You should have received a copy of the GNU General Public License
              # along with this program.  If not, see <http://www.gnu.org/licenses/>.
              import os
              import sys
              import traceback
              from os.path import dirname as dn, join as jn
              #to get the rhodecode import
              sys.path.append(dn(dn(dn(os.path.realpath(__file__)))))
              from string import strip
-             from rhodecode.model import init_model
-             from rhodecode.model.scm import ScmModel
-             from rhodecode.config.environment import load_environment
-             from rhodecode.lib.utils import BasePasterCommand, Command, add_cache
              from shutil import rmtree
-             from webhelpers.html.builder import escape
-             from vcs.utils.lazy import LazyProperty
-             from sqlalchemy import engine_from_config
              from whoosh.analysis import RegexTokenizer, LowercaseFilter, StopFilter
              from whoosh.fields import TEXT, ID, STORED, Schema, FieldType
              from whoosh.index import create_in, open_dir
              from whoosh.formats import Characters
              from whoosh.highlight import highlight, SimpleFragmenter, HtmlFormatter
+             from webhelpers.html.builder import escape
+             from sqlalchemy import engine_from_config
+             from vcs.utils.lazy import LazyProperty
+             from rhodecode.model import init_model
+             from rhodecode.model.scm import ScmModel
+             from rhodecode.config.environment import load_environment
+             from rhodecode.lib import LANGUAGES_EXTENSIONS_MAP
+             from rhodecode.lib.utils import BasePasterCommand, Command, add_cache
              #EXTENSIONS WE WANT TO INDEX CONTENT OF
-             INDEX_EXTENSIONS = ['action', 'adp', 'ashx', 'asmx', 'aspx', 'asx', 'axd', 'c',
-                                 'cfg', 'cfm', 'cpp', 'cs', 'css', 'diff', 'do', 'el', 'erl',
-                                 'h', 'htm', 'html', 'ini', 'java', 'js', 'jsp', 'jspx', 'lisp',
-                                 'lua', 'm', 'mako', 'ml', 'pas', 'patch', 'php', 'php3',
-                                 'php4', 'phtml', 'pm', 'py', 'rb', 'rst', 's', 'sh', 'sql',
-                                 'tpl', 'txt', 'vim', 'wss', 'xhtml', 'xml', 'xsl', 'xslt',
-                                 'yaws']
+             INDEX_EXTENSIONS = LANGUAGES_EXTENSIONS_MAP.keys()
              #CUSTOM ANALYZER wordsplit + lowercase filter
              ANALYZER = RegexTokenizer(expression=r"\w+") | LowercaseFilter()
              #INDEX SCHEMA DEFINITION
              SCHEMA = Schema(owner=TEXT(),
                              repository=TEXT(stored=True),
                              path=TEXT(stored=True),
                              content=FieldType(format=Characters(ANALYZER),
                                           scorable=True, stored=True),
                              modtime=STORED(), extension=TEXT(stored=True))
              IDX_NAME = 'HG_INDEX'
              FORMATTER = HtmlFormatter('span', between='\n<span class="break">...</span>\n')
              FRAGMENTER = SimpleFragmenter(200)
              class MakeIndex(BasePasterCommand):
                  max_args = 1
                  min_args = 1
                  usage = "CONFIG_FILE"
                  summary = "Creates index for full text search given configuration file"
                  group_name = "RhodeCode"
                  takes_config_file = -1
                  parser = Command.standard_parser(verbose=True)
                  def command(self):
                      from pylons import config
                      add_cache(config)
                      engine = engine_from_config(config, 'sqlalchemy.db1.')
                      init_model(engine)
                      index_location = config['index_dir']
                      repo_location = self.options.repo_location
                      repo_list = map(strip, self.options.repo_list.split(',')) \
                          if self.options.repo_list else None
                      #======================================================================
                      # WHOOSH DAEMON
                      #======================================================================
                      from rhodecode.lib.pidlock import LockHeld, DaemonLock
                      from rhodecode.lib.indexers.daemon import WhooshIndexingDaemon
                      try:
                          l = DaemonLock()
                          WhooshIndexingDaemon(index_location=index_location,
                                               repo_location=repo_location,
                                               repo_list=repo_list)\
                              .run(full_index=self.options.full_index)
                          l.release()
                      except LockHeld:
                          sys.exit(1)
                  def update_parser(self):
                      self.parser.add_option('--repo-location',
                                        action='store',
                                        dest='repo_location',
                                        help="Specifies repositories location to index REQUIRED",
                                        )
                      self.parser.add_option('--index-only',
                                        action='store',
                                        dest='repo_list',
                                        help="Specifies a comma separated list of repositories "
                                              "to build index on OPTIONAL",
                                        )
                      self.parser.add_option('-f',
                                        action='store_true',
                                        dest='full_index',
                                        help="Specifies that index should be made full i.e"
                                              " destroy old and build from scratch",
                                        default=False)
              class ResultWrapper(object):
                  def __init__(self, search_type, searcher, matcher, highlight_items):
                      self.search_type = search_type
                      self.searcher = searcher
                      self.matcher = matcher
                      self.highlight_items = highlight_items
                      self.fragment_size = 200 / 2
                  @LazyProperty
                  def doc_ids(self):
                      docs_id = []
                      while self.matcher.is_active():
                          docnum = self.matcher.id()
                          chunks = [offsets for offsets in self.get_chunks()]
                          docs_id.append([docnum, chunks])
                          self.matcher.next()
                      return docs_id
                  def __str__(self):
                      return '<%s at %s>' % (self.__class__.__name__, len(self.doc_ids))
                  def __repr__(self):
                      return self.__str__()
                  def __len__(self):
                      return len(self.doc_ids)
                  def __iter__(self):
                      """
                      Allows iteration over results and lazily generates content.
                      *Requires* implementation of ``__getitem__`` method.
                      """
                      for docid in self.doc_ids:
                          yield self.get_full_content(docid)
                  def __getitem__(self, key):
                      """
                      Slicing of resultWrapper
                      """
                      i, j = key.start, key.stop
                      slice = []
                      for docid in self.doc_ids[i:j]:
                          slice.append(self.get_full_content(docid))
                      return slice
                  def get_full_content(self, docid):
                      res = self.searcher.stored_fields(docid[0])
                      f_path = res['path'][res['path'].find(res['repository']) \
                                           + len(res['repository']):].lstrip('/')
                      content_short = self.get_short_content(res, docid[1])
                      res.update({'content_short':content_short,
                                  'content_short_hl':self.highlight(content_short),
                                  'f_path':f_path})
                      return res
                  def get_short_content(self, res, chunks):
                      return ''.join([res['content'][chunk[0]:chunk[1]] for chunk in chunks])
                  def get_chunks(self):
                      """
                      Splits the content into chunks around each match without
                      letting the chunks overlap, so that closely spaced
                      occurrences are not highlighted twice.
-                     @param matcher:
-                     @param size:
+                     :param matcher:
+                     :param size:
                      """
                      memory = [(0, 0)]
                      for span in self.matcher.spans():
                          start = span.startchar or 0
                          end = span.endchar or 0
                          start_offseted = max(0, start - self.fragment_size)
                          end_offseted = end + self.fragment_size
                          if start_offseted < memory[-1][1]:
                              start_offseted = memory[-1][1]
                          memory.append((start_offseted, end_offseted,))
                          yield (start_offseted, end_offseted,)
                  def highlight(self, content, top=5):
                      if self.search_type != 'content':
                          return ''
                      hl = highlight(escape(content),
                               self.highlight_items,
                               analyzer=ANALYZER,
                               fragmenter=FRAGMENTER,
                               formatter=FORMATTER,
                               top=top)
                      return hl
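
The windowing logic in get_chunks can be tried outside of whoosh with a small standalone sketch. It assumes the match spans arrive as (startchar, endchar) tuples in ascending order; the function name and the tuple input are illustrative and are not part of ResultWrapper.

```python
def non_overlapping_chunks(spans, fragment_size):
    """Yield (start, end) windows around each match span.

    Each window extends ``fragment_size`` characters to both sides of
    the match. A window that would begin inside the previous one is
    moved forward to start where the previous window ended.
    """
    last_end = 0
    for start, end in spans:
        chunk_start = max(0, start - fragment_size)
        chunk_end = end + fragment_size
        if chunk_start < last_end:
            chunk_start = last_end
        last_end = chunk_end
        yield (chunk_start, chunk_end)


# Two matches close together, then one far away.
spans = [(10, 15), (20, 25), (300, 305)]
print(list(non_overlapping_chunks(spans, 100)))
# [(0, 115), (115, 125), (200, 405)]
```

The second window starts at 115 instead of 0, so the text shared by the first two matches appears only once in the shortened content.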
