Okay. I updated mold to v2.0.0
. Added "-Z", "time-passes"
to get link times, ran cargo with --timings
to get CPU utilization graphs.
Tested on two projects of mine (the one from yesterday is “X”).
Link times are picked as the best from 3-4 runs, changing only white space on main.rs
.
lto="fat" |
lld | mold |
---|---|---|
project X (cu=1) | 105.923 | 106.380 |
Project X (cu=8) | 103.512 | 103.513 |
Project S (cu=1) | 94.290 | 94.969 |
Project S (cu=8) | 100.118 | 100.449 |
Observations (lto="fat"
): As expected, not a lot of utilization of multi-core. Using codegen-units
larger than 1 may even cause a regression in link time. Choice of linker between lld
and mold
appears to be of no significance.
lto="thin" |
lld | mold |
---|---|---|
project X (cu=1) | 46.596 | 47.118 |
Project X (cu=8) | 34.167 | 33.839 |
Project X (cu=16) | 36.296 | 36.621 |
Project S (cu=1) | 41.817 | 41.404 |
Project S (cu=8) | 32.062 | 32.162 |
Project S (cu=16) | 35.780 | 36.074 |
Observations (lto="thin"
): Here, we see parallel LLVM_lto_optimize
runs kicking in. Testing with codegen-units=16
was also done. In that case, the number of parallel LLVM_lto_optimize
runs was so big, the synchronization overhead caused a regression running that test on a humble workstation powered by an Intel i7-7700K processor (4 physical, 8 logical cores only). The results will probably look different running this test case (cu=16) in a more powerful setup. But still, the choice of linker between lld
and mold
appears to be of no significance.
lto=false |
lld | mold |
---|---|---|
project X (cu=1) | 29.160 | 29.231 |
Project X (cu=8) | 8.130 | 8.293 |
Project X (cu=16) | 7.076 | 6.953 |
Project S (cu=1) | 11.996 | 12.069 |
Project S (cu=8) | 4.418 | 4.462 |
Project S (cu=16) | 4.357 | 4.455 |
Observations (lto=false
): Here, codegen-units
becomes the dominant factor with no heavy LLVM_lto_optimize
runs involved. Going above codegen-units=8
does not hurt link time. Still, the choice of linker between lld
and mold
appears to be of no significance.
lto="off" |
lld | mold |
---|---|---|
project X (cu=1) | 29.109 | 29.201 |
Project X (cu=8) | 5.896 | 6.117 |
Project X (cu=16) | 3.479 | 3.637 |
Project S (cu=1) | 11.732 | 11.742 |
Project S (cu=8) | 2.354 | 2.355 |
Project S (cu=16) | 1.517 | 1.499 |
Observations (lto="off"
): Same observations as lto=false
. Still, the choice of linker between lld
and mold
appears to be of no significance.
Debug builds link in <.4 seconds.
Thank you for working on this. I found
Okhsl
to be of great utility.