Dissolve-MPI running slower than Dissolve-Serial on 0.7.0

An issue has been found in 0.7.0 with the parallel version running much slower than the serial version. I ran a simple benzene equilibration for 100 iterations and the time taken to complete was as follows:

-Dissolve (serial) - 5 mins 32s
-Dissolve-mpi (np=1) - 11 mins 1s
-Dissolve-mpi (np=2) - 8 mins 56s
-Dissolve-mpi (np=4) - 10 mins 15s
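
To put the wall-clock times above on a common footing, here is a minimal sketch that converts them to seconds and expresses each run as a multiple of the serial time. The helper names (`to_seconds`, `slowdown_vs_serial`) are made up for illustration and are not part of Dissolve.

```python
# Wall-clock times reported above, converted to seconds.
def to_seconds(minutes, seconds):
    return 60 * minutes + seconds

timings = {
    "serial": to_seconds(5, 32),
    "mpi np=1": to_seconds(11, 1),
    "mpi np=2": to_seconds(8, 56),
    "mpi np=4": to_seconds(10, 15),
}

def slowdown_vs_serial(times):
    """Return each run's wall time divided by the serial wall time."""
    base = times["serial"]
    return {name: t / base for name, t in times.items()}

ratios = slowdown_vs_serial(timings)
for name, ratio in ratios.items():
    print(f"{name:10s} {timings[name]:4d} s  {ratio:.2f}x serial")
```

On these numbers every MPI run takes roughly 1.6 to 2 times as long as the serial run, and adding processes does not bring the time down consistently.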

I initially suspected the new averaging feature, since all the sims started taking longer once it was put in. However, I set the averaging values to their defaults and still obtained these results. I have analysed the output files, and the MD step seems to be the cause: it is 4-10x slower in the parallel sims than in the serial one. All the other modules complete at similar speeds.

I’ve just tried a fresh “release-style” compilation of dissolve-mpi and I see the following timings on my machine (for the MD system test):

1:      -->                   MD  (MD01)          3.2 s/iter  (1 iterations)
1:      -->               Forces  (Forces01)     0.44 s/iter  (1 iterations)
1:Total time taken for 1 iterations was 3.61 seconds (3.6 s/iter).

2:      -->                   MD  (MD01)          1.6 s/iter  (1 iterations)
2:      -->               Forces  (Forces01)     0.31 s/iter  (1 iterations)
2:Total time taken for 1 iterations was 1.90 seconds (1.9 s/iter).

3:      -->                   MD  (MD01)          1.1 s/iter  (1 iterations)
3:      -->               Forces  (Forces01)     0.27 s/iter  (1 iterations)
3:Total time taken for 1 iterations was 1.40 seconds (1.4 s/iter).

4:      -->                   MD  (MD01)         0.83 s/iter  (1 iterations)
4:      -->               Forces  (Forces01)     0.24 s/iter  (1 iterations)
4:Total time taken for 1 iterations was 1.07 seconds (1.1 s/iter).
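
For comparing per-module timings between runs, the lines above can be pulled out of an output file with a short script. This is only a sketch: it assumes the timing lines always look like the ones pasted here (an arrow, the module type, the module name in brackets, then a value in s/iter), and the function name `module_timings` is hypothetical.

```python
import re

# Matches lines such as:
#   "      -->                   MD  (MD01)          3.2 s/iter  (1 iterations)"
# Assumption: timings are always reported in s/iter, as in the output above.
TIMING_LINE = re.compile(r"-->\s+(\w+)\s+\((\w+)\)\s+([\d.]+)\s+s/iter")

def module_timings(text):
    """Map module name (e.g. 'MD01') to seconds per iteration."""
    result = {}
    for line in text.splitlines():
        match = TIMING_LINE.search(line)
        if match:
            _module_type, name, seconds = match.groups()
            result[name] = float(seconds)
    return result

sample = """\
      -->                   MD  (MD01)          3.2 s/iter  (1 iterations)
      -->               Forces  (Forces01)     0.44 s/iter  (1 iterations)
Total time taken for 1 iterations was 3.61 seconds (3.6 s/iter).
"""
print(module_timings(sample))
```

Running the same function over the serial and MPI output files would show directly which module accounts for the difference.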

If I compile with all debugging and checking turned on, then the timings are worse, but still scale as expected:

1:      -->                   MD  (MD01)           15 s/iter  (1 iterations)
1:      -->               Forces  (Forces01)      2.3 s/iter  (1 iterations)
1:Total time taken for 1 iterations was 17.36 seconds (17 s/iter).

2:      -->                   MD  (MD01)          7.7 s/iter  (1 iterations)
2:      -->               Forces  (Forces01)      1.7 s/iter  (1 iterations)
2:Total time taken for 1 iterations was 9.37 seconds (9.4 s/iter).

3:      -->                   MD  (MD01)          5.3 s/iter  (1 iterations)
3:      -->               Forces  (Forces01)      1.4 s/iter  (1 iterations)
3:Total time taken for 1 iterations was 6.70 seconds (6.7 s/iter).

4:      -->                   MD  (MD01)            4 s/iter  (1 iterations)
4:      -->               Forces  (Forces01)      1.3 s/iter  (1 iterations)
4:Total time taken for 1 iterations was 5.31 seconds (5.3 s/iter).
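
As a rough check on the scaling, the MD timings above can be turned into speedup and parallel efficiency relative to the single-process run. The sketch assumes the leading 1: to 4: labels are the number of MPI processes; the function name `scaling` is made up for illustration.

```python
def scaling(times_by_np):
    """Return {np: (speedup, efficiency)} relative to the np=1 time."""
    base = times_by_np[1]
    return {n: (base / t, base / t / n) for n, t in times_by_np.items()}

# MD (MD01) s/iter values copied from the two sets of timings above.
release_md = {1: 3.2, 2: 1.6, 3: 1.1, 4: 0.83}
debug_md = {1: 15.0, 2: 7.7, 3: 5.3, 4: 4.0}

for label, data in (("release", release_md), ("debug", debug_md)):
    for n, (speedup, efficiency) in scaling(data).items():
        print(f"{label:8s} np={n}  speedup {speedup:.2f}  efficiency {efficiency:.0%}")
```

For both builds the MD step stays above 90% parallel efficiency up to four processes, which is what "scale as expected" means here.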

Did you compile this version yourself, or download it from the website? If the former, what was the exact cmake command used to set up the build (e.g. cmake ../ -DPARALLEL:bool=true)?

I should say this is with the current continuous version on the website (b21b833 at time of writing).

I compiled it myself, using the continuous build as of 3rd Jan, which I assume is the previous continuous build.

I compiled it as per the instructions you sent, and the cmake command was:
cmake …/dissolve -DPARALLEL:bool=true -DANTLR_EXECUTABLE:path=/where/i/downloaded/antlr4.9.jar

I have just compiled the latest version (the one you used) with Ninja, adding the relevant options to the cmake line along with -DBUILD_ANTLR_RUNTIME:bool=true.

The timings are similar to before:
-Dissolve-mpi (np=1) - 10 mins 2s
-Dissolve-mpi (np=2) - 10 mins 5s
-Dissolve-mpi (np=4) - 8 mins 34s
-Dissolve (serial) - 5 mins 53s

I will send you the input and output files so you can check the relevant stats yourself and see whether you can reproduce what I am seeing, in case it is a me issue!