diff --git a/mercurial/help/internals/changegroups.txt b/mercurial/help/internals/changegroups.txt
new file mode 100644
--- /dev/null
+++ b/mercurial/help/internals/changegroups.txt
@@ -0,0 +1,142 @@
+Changegroups
+============
+
+Changegroups are representations of repository revlog data, specifically
+the changelog, manifest, and filelogs.
+
+There are 2 versions of changegroups: ``1`` and ``2``. At a high
+level, they are almost exactly the same, with the only difference
+being the *delta header* that precedes each entry.
+
+A changegroup consists of 3 logical segments::
+
+   +---------------------------------+
+   |           |          |          |
+   | changeset | manifest | filelogs |
+   |           |          |          |
+   +---------------------------------+
+
+The principal building block of each segment is a *chunk*. A *chunk*
+is a framed piece of data::
+
+   +---------------------------------------+
+   |           |                           |
+   |  length   |           data            |
+   | (32 bits) |     <length - 4> bytes    |
+   |           |                           |
+   +---------------------------------------+
+
+Each chunk starts with a 32-bit big-endian signed integer indicating
+the length of the entire chunk, including the length field itself.
+
+There is a special case chunk that has 0 length (``0x00000000``). We
+call this an *empty chunk*.
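The framing can be sketched in a few lines of Python. This is a minimal illustration, not Mercurial's own code: the names `read_chunk`, `write_chunk`, and `EMPTY_CHUNK` are hypothetical, and the sketch assumes the length value counts its own four bytes, so a chunk carrying N bytes of data is written with a length of N + 4.

```python
import io
import struct


def read_chunk(stream):
    """Return the data of the next chunk, or b"" for the empty chunk."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise ValueError("truncated chunk length")
    # 32-bit big-endian signed integer.
    (length,) = struct.unpack(">l", raw)
    if length == 0:
        return b""  # the empty chunk
    if length <= 4:
        # Assumption: the length includes its own 4 bytes, so 1..4 is invalid.
        raise ValueError("invalid chunk length %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise ValueError("truncated chunk data")
    return data


def write_chunk(data):
    """Frame data as a chunk (length prefix followed by the data)."""
    return struct.pack(">l", len(data) + 4) + data


EMPTY_CHUNK = struct.pack(">l", 0)
```

Reading from `io.BytesIO(write_chunk(b"hello") + EMPTY_CHUNK)` would yield the five data bytes first and then the empty chunk.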
+
+Delta Groups
+------------
+
+A *delta group* expresses the content of a revlog as a series of deltas,
+or patches against previous revisions.
+
+Delta groups consist of 0 or more *chunks* followed by the *empty chunk*
+to signal the end of the delta group::
+
+  +------------------------------------------------------------------------+
+  |                |             |               |             |           |
+  | chunk0 length  | chunk0 data | chunk1 length | chunk1 data |    0x0    |
+  |   (32 bits)    |  (various)  |   (32 bits)   |  (various)  | (32 bits) |
+  |                |             |               |             |           |
+  +------------------------------------------------------------------------+
+
+Each *chunk*'s data consists of the following::
+
+  +-----------------------------------------+
+  |              |              |           |
+  | delta header | mdiff header |   delta   |
+  |  (various)   |  (12 bytes)  | (various) |
+  |              |              |           |
+  +-----------------------------------------+
+
+The chunk's *length* field covers all 3 of these logical pieces of data.
+The *delta* is a diff from an existing entry in the same revlog.
+
+The *delta header* is different between versions ``1`` and ``2`` of the
+changegroup format.
+
+Version 1::
+
+   +------------------------------------------------------+
+   |            |             |             |             |
+   |    node    |   p1 node   |   p2 node   |  link node  |
+   | (20 bytes) |  (20 bytes) |  (20 bytes) |  (20 bytes) |
+   |            |             |             |             |
+   +------------------------------------------------------+
+
+Version 2::
+
+   +------------------------------------------------------------------+
+   |            |             |             |            |            |
+   |    node    |   p1 node   |   p2 node   | base node  | link node  |
+   | (20 bytes) |  (20 bytes) |  (20 bytes) | (20 bytes) | (20 bytes) |
+   |            |             |             |            |            |
+   +------------------------------------------------------------------+
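The two header layouts can be unpacked as follows. This is a sketch that only mirrors the diagrams above; `parse_delta_header` and the dictionary keys are hypothetical names chosen for illustration.

```python
import struct

# Four or five consecutive 20-byte nodes, as drawn in the diagrams above.
V1_HEADER = struct.Struct(">20s20s20s20s")
V2_HEADER = struct.Struct(">20s20s20s20s20s")


def parse_delta_header(version, data):
    """Split chunk data into (header dict, remaining bytes)."""
    if version == 1:
        node, p1, p2, link = V1_HEADER.unpack_from(data)
        base = None  # version 1 does not encode a base node
        size = V1_HEADER.size
    elif version == 2:
        node, p1, p2, base, link = V2_HEADER.unpack_from(data)
        size = V2_HEADER.size
    else:
        raise ValueError("unsupported changegroup version %r" % (version,))
    header = {"node": node, "p1": p1, "p2": p2, "base": base, "link": link}
    return header, data[size:]
```

The version 1 header occupies 80 bytes and the version 2 header 100 bytes; whatever follows is returned untouched.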
+
+The *mdiff header* consists of three 32-bit big-endian signed integers
+describing offsets at which to apply the following delta content::
+
+   +-------------------------------------+
+   |           |            |            |
+   |  offset   | old length | new length |
+   | (32 bits) |  (32 bits) |  (32 bits) |
+   |           |            |            |
+   +-------------------------------------+
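Unpacking these 12 bytes is a single `struct` call. In the sketch below the function name is hypothetical and the field names are taken directly from the diagram above; it makes no claim about how the values are subsequently applied.

```python
import struct

# Three 32-bit big-endian signed integers.
MDIFF_HEADER = struct.Struct(">lll")


def parse_mdiff_header(data):
    """Return ((offset, old_length, new_length), remaining bytes)."""
    offset, old_length, new_length = MDIFF_HEADER.unpack_from(data)
    return (offset, old_length, new_length), data[MDIFF_HEADER.size:]
```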
+
+In version 1, the delta is always applied against the previous node from
+the changegroup, or against the first parent if this is the first entry
+in the changegroup.
+
+In version 2, the delta base node is encoded in the entry in the
+changegroup. This allows the delta to be expressed against any parent,
+which can result in smaller deltas and more efficient encoding of data.
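The rule for choosing the delta base can be written as a small function. This is an illustrative sketch: `delta_base` is a hypothetical helper, and `header` is assumed to be a mapping with `p1` and `base` entries.

```python
def delta_base(version, header, previous_node):
    """Return the node that a chunk's delta applies against.

    previous_node is the node of the preceding entry in the same delta
    group, or None when this is the first entry.
    """
    if version == 2:
        # Version 2 names the base explicitly in the delta header.
        return header["base"]
    if previous_node is not None:
        # Version 1: delta against the previous entry in the changegroup.
        return previous_node
    # Version 1, first entry: delta against the first parent.
    return header["p1"]
```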
+
+Changeset Segment
+-----------------
+
+The *changeset segment* consists of a single *delta group* holding
+changelog data. The *empty chunk* that ends the delta group also denotes
+the boundary to the *manifest segment*.
+
+Manifest Segment
+----------------
+
+The *manifest segment* consists of a single *delta group* holding
+manifest data. The *empty chunk* that ends the delta group also denotes
+the boundary to the *filelogs segment*.
+
+Filelogs Segment
+----------------
+
+The *filelogs segment* consists of multiple sub-segments, each
+corresponding to an individual file whose data is being described::
+
+   +--------------------------------------+
+   |          |          |          |     |
+   | filelog0 | filelog1 | filelog2 | ... |
+   |          |          |          |     |
+   +--------------------------------------+
+
+The final filelog sub-segment is followed by an *empty chunk* to denote
+the end of the segment and the overall changegroup.
+
+Each filelog sub-segment consists of the following::
+
+   +------------------------------------------+
+   |               |            |             |
+   | filename size |  filename  | delta group |
+   |   (32 bits)   |  (various) |  (various)  |
+   |               |            |             |
+   +------------------------------------------+
+
+That is, a *chunk* consisting of the filename (not terminated or padded)
+followed by N chunks constituting the *delta group* for this file.
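Putting the three segments together, a reader for an uncompressed changegroup stream might look like the sketch below. All function names are hypothetical. It assumes that the length value counts its own four bytes, and that each delta group is terminated by exactly one empty chunk which also serves as the segment boundary. Chunk data is returned raw, without parsing the delta headers.

```python
import io
import struct


def read_chunk(stream):
    """Return the data of the next chunk, or b"" for the empty chunk."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise ValueError("truncated chunk length")
    (length,) = struct.unpack(">l", raw)
    if length == 0:
        return b""
    if length <= 4:
        raise ValueError("invalid chunk length %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise ValueError("truncated chunk data")
    return data


def read_delta_group(stream):
    """Collect chunks until the empty chunk that ends the delta group."""
    chunks = []
    while True:
        chunk = read_chunk(stream)
        if not chunk:
            return chunks
        chunks.append(chunk)


def read_changegroup(stream):
    """Return (changeset chunks, manifest chunks, {filename: chunks})."""
    changesets = read_delta_group(stream)
    manifests = read_delta_group(stream)
    filelogs = {}
    while True:
        # Each filelog sub-segment starts with a chunk holding the filename.
        filename = read_chunk(stream)
        if not filename:
            break  # the empty chunk ends the filelogs segment
        filelogs[filename] = read_delta_group(stream)
    return changesets, manifests, filelogs
```

Because an empty chunk appears where the next filename would be, the loop over filelog sub-segments needs no count in advance.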
+