@@ -157,6 +157,28 b' def partition(lst, nslices):' | |||
|
157 | 157 | The current strategy takes every Nth element from the input. If |
|
158 | 158 | we ever write workers that need to preserve grouping in input |
|
159 | 159 | we should consider allowing callers to specify a partition strategy. |
|
160 | ||
|
161 | mpm is not a fan of this partitioning strategy when files are involved. | |
|
162 | In his words: | |
|
163 | ||
|
164 | Single-threaded Mercurial makes a point of creating and visiting | |
|
165 | files in a fixed order (alphabetical). When creating files in order, | |
|
166 | a typical filesystem is likely to allocate them on nearby regions on | |
|
167 | disk. Thus, when revisiting in the same order, locality is maximized | |
|
168 | and various forms of OS and disk-level caching and read-ahead get a | |
|
169 | chance to work. | |
|
170 | ||
|
171 | This effect can be quite significant on spinning disks. I discovered it | |
|
172 | circa Mercurial v0.4 when revlogs were named by hashes of filenames. | |
|
173 | Tarring a repo and copying it to another disk effectively randomized | |
|
174 | the revlog ordering on disk by sorting the revlogs by hash and suddenly | |
|
175 | performance of my kernel checkout benchmark dropped by ~10x because the | |
|
176 | "working set" of sectors visited no longer fit in the drive's cache and | |
|
177 | the workload switched from streaming to random I/O. | |
|
178 | ||
|
179 | What we should really be doing is have workers read filenames from an | |
|
180 | ordered queue. This preserves locality and also keeps any worker from | |
|
181 | getting more than one file out of balance. | |
|
160 | 182 | ''' |
|
161 | 183 | for i in range(nslices): |
|
162 | 184 | yield lst[i::nslices] |
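The contrast described in the docstring can be sketched as follows. This is a minimal illustration, not Mercurial's worker code: `partition` is copied from the hunk, while `run_ordered`, its parameters, and the use of threads with `queue.Queue` are assumptions made for the example.

```python
import queue
import threading


def partition(lst, nslices):
    """Strided partitioning, as in the hunk: slice i gets every Nth item."""
    for i in range(nslices):
        yield lst[i::nslices]


def run_ordered(filenames, nworkers, func):
    """Hypothetical alternative: workers pull names from one ordered queue.

    Threads are used only to keep the sketch short; they are an assumption,
    not a description of how Mercurial runs its workers.
    """
    pending = queue.Queue()
    for name in sorted(filenames):  # fixed (alphabetical) visiting order
        pending.put(name)

    # handled[i] records which names worker i processed, in order.
    handled = [[] for _ in range(nworkers)]

    def worker(idx):
        while True:
            try:
                name = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, worker is done
            func(name)
            handled[idx].append(name)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

With `partition`, each worker walks its own strided slice (`a, d, g` for the first of three workers over `a`..`g`), so the workers are spread across the sorted list from the start. With the queue, the workers collectively consume names in sorted order, and a slow file delays only the worker that picked it up.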