Mercurial Frequently Asked Questions

Section 1: General Usage
------------------------

Q. I did an 'hg pull' and my working directory is empty!

There are two parts to Mercurial: the repository and the working
directory. 'hg pull' pulls all new changes from a remote repository
into the local one but doesn't alter the working directory.

This keeps you from upsetting your work in progress, which may not be
ready to merge with the new changes you've pulled, and also allows you
to manage merging more easily (see below about best practices).

To update your working directory, run 'hg update'. If you're sure you
want to update your working directory on a pull, you can also use
'hg pull -u'. This will refuse to merge or overwrite local changes.

Q. What is the difference between revision numbers, changeset IDs,
and tags?

Mercurial will generally allow you to refer to a revision in three
ways: by revision number, by changeset ID, and by tag.

A revision number is a simple decimal number that corresponds to the
ordering of commits in the local repository. It is important to
understand that this ordering can change from machine to machine due
to Mercurial's distributed, decentralized architecture.

This is where changeset IDs come in. A changeset ID is a 160-bit
identifier that uniquely describes a changeset and its position in the
change history, regardless of which machine it's on. This is
represented to the user as a 40-digit hexadecimal number. As that
tends to be unwieldy, Mercurial will accept any unambiguous substring
of that number when specifying versions. It will also generally print
these numbers in "short form", which is the first 12 digits.

You should always use some form of changeset ID rather than the local
revision number when discussing revisions with other Mercurial users,
as they may have different revision numbering on their system.

Finally, a tag is an arbitrary string that has been assigned a
correspondence to a changeset ID.
This lets you refer to revisions symbolically.

Q. What are branches, heads, and the tip?

The central concept of Mercurial is branching. A 'branch' is simply
an independent line of development. In most other version control
systems, all users generally commit to the same line of development
called 'the trunk' or 'the main branch'. In Mercurial, every developer
effectively works on a private branch and there is no internal concept
of 'the main branch'.

Thus Mercurial works hard to make repeated merging between branches
easy. Simply run 'hg pull' and 'hg update -m' and commit the result.

'Heads' are simply the most recent commits on a branch. Technically,
they are changesets which have no children. Merging is the process of
joining points on two branches into one, usually at their current
heads. Use 'hg heads' to find the heads in the current repository.

The 'tip' is the most recently changed head, and also the highest
numbered revision. If you have just made a commit, that commit will be
the tip. Alternately, if you have just pulled from another repository,
the tip of that repository becomes the current tip.

The 'tip' is the default revision for many commands such as update,
and also functions as a special symbolic tag.

Q. How does merging work?

The merge process is simple. Usually you will want to merge the tip
into your working directory. Thus you run 'hg update -m' and Mercurial
will incorporate the changes from tip into your local changes.

The first step of this process is tracing back through the history of
changesets and finding the 'common ancestor' of the two versions that
are being merged. This is done on both a project-wide and a
file-by-file basis.

For files that have been changed in both projects, a three-way merge
is attempted to add the changes made remotely into the changes made
locally. If there are conflicts between these changes, the user is
prompted to interactively resolve them.
Mercurial uses a helper tool for this, which is usually found by the
hgmerge script. Example tools include tkdiff, kdiff3, and the classic
RCS merge.

After you've completed the merge and you're satisfied that the results
are correct, it's a good idea to commit your changes. Mercurial won't
allow you to perform another merge until you've done this commit, as
that would lose important history that will be needed for future
merges.

Q. How do tags work in Mercurial?

Tags work slightly differently in Mercurial than in most revision
control systems. The design attempts to meet the following
requirements:

- be version controlled and mergeable just like any other file
- allow signing of tags
- allow adding a tag to an already committed changeset
- allow changing tags in the future

Thus Mercurial stores tags as a file in the working directory. This
file is called .hgtags and consists of a list of changeset IDs and
their corresponding tags. To add a tag to the system, simply add a
line to this file and then commit it for it to take effect. The
'hg tag' command will do this for you and 'hg tags' will show the
currently effective tags.

Note that because tags refer to changeset IDs and the changeset ID is
effectively the sum of all the contents of the repository for that
change, it is impossible in Mercurial to simultaneously commit and add
a tag. Thus tagging a revision must be done as a second step.

Q. How do tags work with multiple heads?

The tags that are in effect at any given time are the tags specified
in each head, with heads closer to the tip taking precedence.

Q. What are some best practices for distributed development with
Mercurial?

First, merge often! This makes merging easier for everyone and you
find out about conflicts (which are often rooted in incompatible
design decisions) earlier.

Second, don't hesitate to use multiple trees locally. Mercurial makes
this fast and light-weight.
Typical usage is to have an incoming tree, an outgoing tree, and a
separate tree for each area being worked on.

The incoming tree is best maintained as a pristine copy of the
upstream repository. This works as a cache so that you don't have to
pull multiple copies over the network. No need to check files out here
as you won't be changing them.

The outgoing tree contains all the changes you intend for merger into
upstream. Publish this tree with 'hg serve' or hgweb.cgi or use
'hg push' to push it to another publicly available repository.

Then, for each feature you work on, create a new tree. Commit early
and commit often, merge with incoming regularly, and once you're
satisfied with your feature, pull the changes into your outgoing tree.

Q. How do I import from a repository created in a different SCM?

Take a look at contrib/convert-repo. This is an extensible framework
for converting between repository types.

Q. What about Windows support?

Patches to support Windows are being actively integrated; a fully
working Windows version is probably not far off.

Section 2: Technical
--------------------

Q. What limits does Mercurial have?

Mercurial currently assumes that single files, indices, and manifests
can fit in memory for efficiency.

Offsets in revlogs are currently tracked with 32 bits, so a revlog for
a single file cannot currently grow beyond 4G.

There should otherwise be no limits on file name length, file size,
file contents, number of files, or number of revisions.

The network protocol is big-endian.

File names cannot contain the null character. Committer addresses
cannot contain newlines.

Mercurial is primarily developed for UNIX systems, so some UNIXisms
may be present in ports.

Q. How does signing work?

Take a look at the hgeditor script for an example. The basic idea is
to sign the manifest ID inside that changelog entry.
The manifest ID is a recursive hash of all of the files in the system
and their complete history, and thus signing the manifest hash signs
the entire project to that point.

More precisely: each file hash is an SHA1 hash of the contents of that
file and the hashes of its parent revisions. The manifest contains a
list of each file in the project along with its current file hash.
This manifest is hashed similarly to the file hashes, incorporating
the hashes of the parent revisions.

Q. What about hash collisions? What about weaknesses in SHA1?

The SHA1 hashes are large enough that the odds of accidental hash
collision are negligible for projects that could be handled by the
human race. The known weaknesses in SHA1 are currently still not
practical to attack, and Mercurial will switch to SHA256 hashing
before that becomes a realistic concern.

Collisions with the "short hashes" are not a concern as they're always
checked for ambiguity and are still long enough that they're not
likely to happen for reasonably-sized projects (< 1M changes).
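
As a rough illustration of the last two answers, the Python sketch
below hashes a revision from its contents and its parent hashes, and
estimates the odds of a short-hash collision with the usual birthday
bound. It is not Mercurial's actual code: the names node_hash,
collision_probability, and NULL_ID are hypothetical, and sorting the
parents and hashing them before the contents is an assumption that the
text above does not spell out.

```python
import hashlib
import math

# Assumed placeholder for "no parent": 20 zero bytes (hypothetical name).
NULL_ID = b"\0" * 20


def node_hash(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """SHA1 over both parent hashes followed by the contents.

    Sorting the parents first is an assumption made for this sketch so
    that the result does not depend on the order of the parents.
    """
    first, second = sorted((p1, p2))
    h = hashlib.sha1(first)
    h.update(second)
    h.update(text)
    return h.digest()


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday-bound estimate that any two of n_items share a hash
    of the given width in bits."""
    space = 2.0 ** bits
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


# A root revision, then a child whose hash also covers its parent.
root = node_hash(b"hello\n")
child = node_hash(b"hello world\n", p1=root)

print(child.hex())        # full 40-digit form
print(child.hex()[:12])   # 12-digit "short form"

# 12 hex digits carry 48 bits; the full hash carries 160 bits.
print(collision_probability(1_000_000, 48))
print(collision_probability(1_000_000, 160))
```

Under these assumptions, a million revisions give a chance of roughly
0.2% that any two share the same 12-digit prefix, which is small but
not zero; this is why short forms are still checked for ambiguity. The
same estimate for the full 160-bit hash is vanishingly small.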