Mercurial Frequently Asked Questions
====================================

Section 1: General Usage
------------------------

.Q. I did an "hg pull" and my working directory is empty!

There are two parts to Mercurial: the repository and the working
directory. "hg pull" pulls all new changes from a remote repository
into the local one but doesn't alter the working directory.

This keeps you from upsetting your work in progress, which may not be
ready to merge with the new changes you've pulled, and also allows you
to manage merging more easily (see below about best practices).

To update your working directory, run "hg update". If you're sure you
want to update your working directory on a pull, you can also use "hg
pull -u". This will refuse to merge or overwrite local changes.

.Q. What are revision numbers, changeset IDs, and tags?

Mercurial will generally allow you to refer to a revision in three
ways: by revision number, by changeset ID, and by tag.

A revision number is a simple decimal number that corresponds to the
ordering of commits in the local repository. It is important to
understand that this ordering can change from machine to machine due
to Mercurial's distributed, decentralized architecture.

This is where changeset IDs come in. A changeset ID is a 160-bit
identifier that uniquely describes a changeset and its position in the
change history, regardless of which machine it's on. This is
represented to the user as a 40-digit hexadecimal number. As that
tends to be unwieldy, Mercurial will accept any unambiguous substring
of that number when specifying versions. It will also generally print
these numbers in "short form", which is the first 12 digits.

You should always use some form of changeset ID rather than the local
revision number when discussing revisions with other Mercurial users
as they may have different revision numbering on their system.

Finally, a tag is an arbitrary string that has been assigned a
correspondence to a changeset ID. This lets you refer to revisions
symbolically.

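To make the short form and the ambiguity check concrete, here is a
minimal Python sketch. The names short() and lookup() and the sample
IDs are made up for illustration and are not Mercurial's own code; the
sketch only assumes that IDs are 40-digit hex strings and that a
fragment is accepted when exactly one known ID contains it.

```python
import hashlib

def short(hex_id):
    """Return the 12-digit "short form" of a 40-digit hex changeset ID."""
    return hex_id[:12]

def lookup(fragment, known_ids):
    """Resolve a fragment to the single known ID containing it, or raise."""
    matches = [h for h in known_ids if fragment in h]
    if not matches:
        raise LookupError("no match for %r" % fragment)
    if len(matches) > 1:
        raise LookupError("ambiguous identifier %r" % fragment)
    return matches[0]

# Made-up 160-bit IDs for illustration (SHA-1 of arbitrary strings).
ids = [hashlib.sha1(s).hexdigest() for s in (b"first", b"second", b"third")]
```

Calling lookup(short(ids[1]), ids) should hand back the full ID, while
a fragment that matches several IDs is rejected rather than guessed at.
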
.Q. What are branches, heads, and the tip?

The central concept of Mercurial is branching. A 'branch' is simply
an independent line of development. In most other version control
systems, all users generally commit to the same line of development
called 'the trunk' or 'the main branch'. In Mercurial, every developer
effectively works on a private branch and there is no internal concept
of 'the main branch'.

Thus Mercurial works hard to make repeated merging between branches
easy. Simply run "hg pull" and "hg update -m" and commit the result.

'Heads' are simply the most recent commits on a branch. Technically,
they are changesets which have no children. Merging is the process of
joining points on two branches into one, usually at their current
heads. Use "hg heads" to find the heads in the current repository.

The 'tip' is the most recently changed head, and also the highest
numbered revision. If you have just made a commit, that commit will be
the tip. Alternately, if you have just pulled from another
repository, the tip of that repository becomes the current tip.

The 'tip' is the default revision for many commands such as update,
and also functions as a special symbolic tag.

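The "no children" definition of a head is easy to sketch in Python.
The parents mapping and find_heads() below are hypothetical stand-ins
for illustration; they only assume that each revision records up to
two parent revision numbers.

```python
def find_heads(parents):
    """Return the revisions that no other revision lists as a parent."""
    has_child = set()
    for p1, p2 in parents.values():
        has_child.update(p for p in (p1, p2) if p is not None)
    return sorted(rev for rev in parents if rev not in has_child)

# Hypothetical history: 0 - 1 - 2 on one line, 1 - 3 on another.
parents = {0: (None, None), 1: (0, None), 2: (1, None), 3: (1, None)}

heads = find_heads(parents)   # revisions 2 and 3 have no children
tip = max(parents)            # the highest-numbered revision
```

Adding a merge revision whose parents are 2 and 3 would leave that new
revision as the only head.
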
.Q. How does merging work?

The merge process is simple. Usually you will want to merge the tip
into your working directory. Thus you run "hg update -m" and Mercurial
will incorporate the changes from tip into your local changes.

The first step of this process is tracing back through the history of
changesets and finding the 'common ancestor' of the two versions that
are being merged. This is done on a project-wide and a file-by-file
basis.

For files that have been changed in both projects, a three-way merge
is attempted to add the changes made remotely into the changes made
locally. If there are conflicts between these changes, the user is
prompted to interactively resolve them.

Mercurial uses a helper tool for this, which is usually found by the
hgmerge script. Example tools include tkdiff, kdiff3, and the classic
RCS merge.

After you've completed the merge and you're satisfied that the results
are correct, it's a good idea to commit your changes. Mercurial won't
allow you to perform another merge until you've done this commit as
that would lose important history that will be needed for future
merges.

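The ancestor search can be sketched roughly as follows. This is a
simplified illustration and not necessarily the algorithm Mercurial
uses: it assumes a hypothetical parents mapping in which parents always
have lower revision numbers than their children, and it simply picks
the highest-numbered shared ancestor.

```python
def ancestors(rev, parents):
    """Return the set containing rev and everything reachable via parents."""
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r is not None and r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen

def common_ancestor(a, b, parents):
    """Pick the highest-numbered revision that is an ancestor of both."""
    shared = ancestors(a, parents) & ancestors(b, parents)
    return max(shared) if shared else None

# Hypothetical history: revisions 2 and 3 both descend from revision 1.
parents = {0: (None, None), 1: (0, None), 2: (1, None), 3: (1, None)}
base = common_ancestor(2, 3, parents)
```

A three-way merge then compares each side against the file contents at
that base revision.
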
.Q. How do tags work in Mercurial?

Tags work slightly differently in Mercurial than in most revision
control systems. The design attempts to meet the following
requirements:

- be version controlled and mergeable just like any other file
- allow signing of tags
- allow adding a tag to an already committed changeset
- allow changing tags in the future

Thus Mercurial stores tags as a file in the working dir. This file is
called .hgtags and consists of a list of changeset IDs and their
corresponding tags. To add a tag to the system, simply add a line to
this file and then commit it for it to take effect. The "hg tag"
command will do this for you and "hg tags" will show the currently
effective tags.

Note that because tags refer to changeset IDs and the changeset ID is
effectively the sum of all the contents of the repository for that
change, it is impossible in Mercurial to simultaneously commit and add
a tag. Thus tagging a revision must be done as a second step.

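As an illustration, a tags file of this kind could be read with a few
lines of Python. The sketch assumes one "changeset ID, space, tag name"
pair per line and that a later line for the same tag replaces an
earlier one; parse_hgtags() and the sample IDs are made up for the
example.

```python
def parse_hgtags(text):
    """Map tag names to changeset IDs from "<id> <tag>" lines."""
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        tags[name] = node   # a later line for the same tag wins
    return tags

# Made-up file contents; real IDs would be 40 hex digits of a changeset.
sample = (
    "1111111111111111111111111111111111111111 v0.1\n"
    "2222222222222222222222222222222222222222 v0.2\n"
    "3333333333333333333333333333333333333333 v0.1\n"
)
tags = parse_hgtags(sample)
```

Because the file is versioned like any other, moving a tag is just
another committed change to it.
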
.Q. What if I want to just keep local tags?

You can add a section called "[tags]" to your .hg/hgrc which contains
a list of tag = changeset ID pairs. Unlike traditional tags, these are
only visible in the local repository, but otherwise act just like
normal tags.

.Q. How do tags work with multiple heads?

The tags that are in effect at any given time are the tags specified
in each head, with heads closer to the tip taking precedence. Local
tags override all other tags.

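The precedence rules can be sketched like this. The function
effective_tags(), the sample hgrc text, and the tag values are all
hypothetical; the sketch assumes the per-head tags have already been
read into dictionaries, ordered from the head farthest from the tip to
the tip itself, and that the "[tags]" section can be read with Python's
configparser module.

```python
import configparser

def effective_tags(head_tags, local_tags):
    """Combine per-head tags (tip last), then apply local overrides."""
    result = {}
    for tags in head_tags:      # entries closer to the tip overwrite earlier ones
        result.update(tags)
    result.update(local_tags)   # local tags override all other tags
    return result

# Made-up "[tags]" section such as one might keep in .hg/hgrc.
hgrc = """
[tags]
stable = 1111111111111111111111111111111111111111
"""
parser = configparser.ConfigParser()
parser.read_string(hgrc)
local = dict(parser["tags"])

older_head = {"v1": "aaaa", "stable": "bbbb"}
tip_head = {"v1": "cccc"}
tags = effective_tags([older_head, tip_head], local)
```

Here "v1" resolves to the value from the head nearest the tip, and
"stable" resolves to the local value.
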
.Q. What are some best practices for distributed development with Mercurial?

First, merge often! This makes merging easier for everyone and you
find out about conflicts (which are often rooted in incompatible
design decisions) earlier.

Second, don't hesitate to use multiple trees locally. Mercurial makes
this fast and light-weight. Typical usage is to have an incoming tree,
an outgoing tree, and a separate tree for each area being worked on.

The incoming tree is best maintained as a pristine copy of the
upstream repository. This works as a cache so that you don't have to
pull multiple copies over the network. No need to check files out here
as you won't be changing them.

The outgoing tree contains all the changes you intend for merger into
upstream. Publish this tree with "hg serve" or hgweb.cgi or use "hg
push" to push it to another publicly available repository.

Then, for each feature you work on, create a new tree. Commit early
and commit often, merge with incoming regularly, and once you're
satisfied with your feature, pull the changes into your outgoing tree.

.Q. How do I import from a repository created in a different SCM?

Take a look at contrib/convert-repo. This is an extensible
framework for converting between repository types.

.Q. What about Windows support?

Patches to support Windows are being actively integrated; a fully
working Windows version is probably not far off.

Section 2: Technical
--------------------

.Q. What limits does Mercurial have?

Mercurial currently assumes that single files, indices, and manifests
can fit in memory for efficiency.

Offsets in revlogs are currently tracked with 32 bits, so a revlog for
a single file currently cannot grow beyond 4G.

There should otherwise be no limits on file name length, file size,
file contents, number of files, or number of revisions.

The network protocol is big-endian.

File names cannot contain the null character. Committer addresses
cannot contain newlines.

Mercurial is primarily developed for UNIX systems, so some UNIXisms
may be present in ports.

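The 4G figure follows directly from the width of the offset field, as
the short Python sketch below shows. It assumes the 32-bit offset is
treated as unsigned (which is what a 4G ceiling implies) and uses the
standard struct module only to illustrate big-endian byte order; it is
not taken from Mercurial's own code.

```python
import struct

MAX_OFFSET = 2**32 - 1        # largest value an unsigned 32-bit field can hold
limit_gib = 2**32 / 2**30     # 4.0, i.e. the "4G" ceiling

# ">" selects big-endian byte order: most significant byte first.
packed_one = struct.pack(">I", 1)
packed_max = struct.pack(">I", MAX_OFFSET)
```

Trying to pack 2**32 into the same field fails with struct.error, which
is the kind of overflow the limit is about.
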
.Q. How does Mercurial store its data?

The fundamental storage type in Mercurial is a "revlog". A revlog is
the set of all revisions of a named object. Each revision is either
stored compressed in its entirety or as a compressed binary delta
against the previous version. The decision of when to store a full
version is made based on how much data would be needed to reconstruct
the file. This lets us ensure that we never need to read huge amounts
of data to reconstruct an object, regardless of how many revisions of
it we store.

In fact, we should always be able to do it with a single read,
provided we know when and where to read. This is where the index comes
in. Each revlog has an index containing a special hash (nodeid) of the
text, hashes for its parents, and where and how much of the revlog
data we need to read to reconstruct it. Thus, with one read of the
index and one read of the data, we can reconstruct any version in time
proportional to the object size.

Similarly, revlogs and their indices are append-only. This means that
adding a new version is also O(1) seeks.

Revlogs are used to represent all revisions of files, manifests, and
changesets. Compression for typical objects with lots of revisions can
range from 100 to 1 for things like project makefiles to over 2000 to
1 for objects like the manifest.

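The scheme can be illustrated with a toy model in Python. Everything
here is a sketch under stated assumptions rather than the real on-disk
format: ToyRevlog, make_delta() and apply_delta() are made-up names,
the delta encoding is a deliberately crude "replace one middle section"
scheme, the index omits the hashes, and the rule "store a full version
once the chain to read exceeds twice the text size" is an assumed
threshold chosen for the example.

```python
import struct
import zlib

def make_delta(old, new):
    """Toy delta: 8-byte header (start, end) plus bytes replacing old[start:end]."""
    limit = min(len(old), len(new))
    start = 0
    while start < limit and old[start] == new[start]:
        start += 1
    tail = 0
    while (tail < limit - start
           and old[len(old) - 1 - tail] == new[len(new) - 1 - tail]):
        tail += 1
    return struct.pack(">II", start, len(old) - tail) + new[start:len(new) - tail]

def apply_delta(old, delta):
    start, end = struct.unpack(">II", delta[:8])
    return old[:start] + delta[8:] + old[end:]

class ToyRevlog:
    def __init__(self):
        self.data = b""    # append-only storage for compressed chunks
        self.index = []    # append-only: (base revision, offset, length)

    def add(self, text):
        rev = len(self.index)
        offset = len(self.data)
        chunk, base = zlib.compress(text), rev       # default: full version
        if rev > 0:
            prev_base = self.index[rev - 1][0]
            delta = zlib.compress(make_delta(self.get(rev - 1), text))
            # Bytes we would have to read from the base through this delta.
            chain = offset + len(delta) - self.index[prev_base][1]
            if chain <= 2 * len(text):               # assumed threshold
                chunk, base = delta, prev_base
        self.data += chunk
        self.index.append((base, offset, len(chunk)))
        return rev

    def get(self, rev):
        base = self.index[rev][0]
        start = self.index[base][1]
        end = self.index[rev][1] + self.index[rev][2]
        span = self.data[start:end]                  # one contiguous read
        pos = self.index[base][2]
        text = zlib.decompress(span[:pos])           # full version at the base
        for r in range(base + 1, rev + 1):
            length = self.index[r][2]
            text = apply_delta(text, zlib.decompress(span[pos:pos + length]))
            pos += length
        return text
```

Because a full version is stored whenever the chain grows too long, the
span read by get() stays proportional to the size of the text being
rebuilt, no matter how many revisions came before it.
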
.Q. How are manifests and changesets stored?

A manifest is simply a list of all files in a given revision of a
project along with the nodeids of the corresponding file revisions. So
grabbing a given version of the project means simply looking up its
manifest and reconstructing all the file revisions pointed to by it.

A changeset is a list of all files changed in a check-in along with a
change description and some metadata like user and date. It also
contains a nodeid to the relevant revision of the manifest.

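The relationship between the three kinds of objects can be sketched
with plain Python dictionaries. The field names, the text layout of the
manifest, and the simplified nodeid() (which hashes contents only and
ignores parents) are assumptions made for this example, not the real
storage format.

```python
import hashlib

def nodeid(data):
    """Simplified stand-in for a nodeid: hashes the contents only."""
    return hashlib.sha1(data).hexdigest()

# Hypothetical project contents; "store" stands in for the file revlogs.
files = {"README": b"hello\n", "Makefile": b"all:\n"}
store = {nodeid(text): text for text in files.values()}

# The manifest maps each file name to the nodeid of its revision.
manifest = {name: nodeid(text) for name, text in files.items()}
manifest_text = "".join(
    "%s %s\n" % (name, node) for name, node in sorted(manifest.items())
).encode()

# The changeset points at the manifest and carries the metadata.
changeset = {
    "manifest": nodeid(manifest_text),
    "user": "someone@example.com",
    "files": sorted(files),
    "description": "initial check-in",
}

def checkout(manifest, store):
    """Rebuild every file revision that the manifest points to."""
    return {name: store[node] for name, node in manifest.items()}
```

Checking out a project revision is then a matter of following the
changeset to its manifest and the manifest to each file revision.
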
.Q. How do Mercurial hashes get calculated?

Mercurial hashes both the contents of an object and the hash of its
parents to create an identifier that uniquely identifies an object's
contents and history. This greatly simplifies merging of histories
because it avoids graph cycles that can occur when an object is
reverted to an earlier state.

All file revisions have an associated hash value. These are listed in
the manifest of a given project revision, and the manifest hash is
listed in the changeset. The changeset hash is again a hash of the
changeset contents and its parents, so it uniquely identifies the
entire history of the project to that point.

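A minimal Python sketch of such a hash follows. The function name
node_id(), the all-zero NULL_ID used for a missing parent, and the
choice to feed the parents in sorted order ahead of the text are
assumptions made for illustration; the point is only that the parents
take part in the hash.

```python
import hashlib

NULL_ID = b"\0" * 20   # assumed placeholder meaning "no parent"

def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Hash the two parent IDs (in sorted order) followed by the text."""
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()

first = node_id(b"hello\n")
second = node_id(b"goodbye\n", first)
reverted = node_id(b"hello\n", second)   # same contents as first, new history
```

Although "reverted" has exactly the contents of "first", its ID differs
because its parent differs, so the revert becomes a new node instead of
a link back to an old one.
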
.Q. What checks are there on repository integrity?

Every time a revlog object is retrieved, it is checked against its
hash for integrity. It is also incidentally double-checked by the
Adler32 checksum used by the underlying zlib compression.

Running "hg verify" decompresses and reconstitutes each revision of
each object in the repository and cross-checks all of the index
metadata with those contents.

But this alone is not enough to ensure that someone hasn't tampered
with a repository. For that, you need cryptographic signing.

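Both layers of checking can be seen in a small Python sketch. The
helper names store() and retrieve() are made up, and the hash here
covers the contents only (no parents) to keep the example short; the
zlib behavior shown is that of Python's standard zlib module.

```python
import hashlib
import zlib

def store(text):
    """Return (hash, compressed bytes) for a piece of text."""
    return hashlib.sha1(text).digest(), zlib.compress(text)

def retrieve(expected, blob):
    """Decompress blob; raise ValueError if either check fails."""
    try:
        text = zlib.decompress(blob)   # zlib verifies its Adler-32 trailer
    except zlib.error as exc:
        raise ValueError("corrupt compressed data: %s" % exc)
    if hashlib.sha1(text).digest() != expected:
        raise ValueError("hash mismatch")
    return text

node, blob = store(b"some file contents\n")
text = retrieve(node, blob)
```

Damaging the compressed bytes trips the zlib check, while data that
decompresses cleanly but is not what the index promised trips the hash
check.
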
.Q. How does signing work with Mercurial?

Take a look at the hgeditor script for an example. The basic idea is
to use GPG to sign the manifest ID inside that changelog entry. The
manifest ID is a recursive hash of all of the files in the system and
their complete history, and thus signing the manifest hash signs the
entire project contents.

.Q. What about hash collisions? What about weaknesses in SHA1?

The SHA1 hashes are large enough that the odds of accidental hash
collision are negligible for projects that could be handled by the
human race.

The known weaknesses in SHA1 are currently still not practical to
attack, and Mercurial will switch to SHA256 hashing before that
becomes a realistic concern.

Collisions with the "short hashes" are not a concern as they're always
checked for ambiguity and are still long enough that they're not
likely to happen for reasonably-sized projects (< 1M changes).

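For a rough sense of scale, the usual birthday-bound approximation can
be worked out in a few lines of Python. The function collision_odds()
is made up for this example, and the estimate (number of pairs divided
by number of possible values) is only meaningful while the result is
small.

```python
def collision_odds(items, bits):
    """Birthday-bound estimate: n * (n - 1) / 2 pairs over 2**bits values."""
    return items * (items - 1) / 2 / 2**bits

full = collision_odds(10**6, 160)    # full 40-digit IDs
short = collision_odds(10**6, 48)    # 12 hex digits at 4 bits each
```

With a million changes the estimate for full IDs is vanishingly small,
and for 12-digit short hashes it comes out at a bit under 0.002, which
is why the ambiguity check is kept as a backstop.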