openoffice-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andre Fischer <awf....@gmail.com>
Subject Bottom up build
Date Wed, 29 Jan 2014 09:18:45 GMT
I would like to report some observations that I made when thinking about 
how to make building OpenOffice with one global makefile feasible.  It 
will probably the last of build related mails in the near future.

Traditional make uses a top-down approach.  It starts with a target, 
'all' by default, and looks at its dependencies.  When one of those has 
to be made or is newer then the target then the target also has to be 
made.  This is done recursively and depth-first.  Every file on which 
'all' has a direct or indirect dependency has to be checked.  If we 
would build OpenOffice with one makefile (included makefiles don't 
count) then that are a lot of files to check.  There are about 9800 cxx 
files, 3500 c files, 12000 hxx files, and lot of external headers.  
Checking the modification time of so many files is one of the reasons 
for the delay in , say, sw/ between starting make and its first action.

As I don't have all global dependencies in a format that would allow 
experimation, I tried how long it would take to get the mtime (time of 
last modification) for all files, source, generated, compiled, linked, 
about 120000.  I wrote a small Java program for that.  With a warm cache 
that takes about 23s.  When run in 4 threads this reduced to less than 
8s.  Could be worse.

But it also could be better because in general there are only a few 
files modified, usually files that you modified yourself in an editor.  
There is another approach, introduced, as far as I know, by the tup [1] 
build tool, that is bottom up.  If you had something similar to the 
oracle of complexity theory, that gave you the list of modified files 
since the last build, you could find the depending files much faster.  
Faster for two reasons. Firstly, there is only one path in the 
dependency tree up towards the root (while there are many down from the 
root).  Only targets on this path are affected by the modified file. 
Secondly, the dependency analysis is comparatively cheap.  The expensive 
part is to determine the file modification times.  If they where 
miraculously given then even the top-down approach would not take 
noticably longer.

So I made another experiment to see if such an oracle can be created.  
Java 7 has the java.nio.file.WatchService that lets you monitor file 
modfifications in one directory.  I registered it to all directories in 
our source tree (some 16000 directories).  With the WatchService in 
place every file modification can be recorded and stored for later.  On 
the next build you only have to check the set of modified files, not all 
files.  Registering the directory watchers takes a couple of seconds but 
after that it does not cause any noticeable CPU activity. Any file 
modifications are reported almost at once.  I do not have the framework 
in place to start a build with this information but I would expect it to 
be as fast as compiling the modified files and linking takes.

The tup website references a paper [2] in which the established top-down 
approaches are called alpha alogithms and the new bottom-up approach is 
called beta algorithm. Tup has implemented a file modification watcher 
(in C or C++) only for Linux.  On Windows it just scans all files (for 
which it needs a little more time than my Java program, maybe it does 
not use more than one thread).


This is something that we should keep in mind for when we ever should 
get a build solution with global dependencies and this build tool would 
turn out to be too slow.


If can find the source code of my Java experiments at [3]. If nothing 
else you can see an application of the ForkJoinPool that allowed my to 
write the parallel file system scan in just a few lines.  There is also 
an alternative implementation that uses the ExecutorService (with a 
fixed thread pool) which needs a few more lines of code.  And there is 
of course the use of the WatchService.


Regards,
Andre


[1] http://gittup.org/tup/
[2] http://gittup.org/tup/build_system_rules_and_algorithms.pdf
[3] http://people.apache.org/~af/test.zip

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@openoffice.apache.org
For additional commands, e-mail: dev-help@openoffice.apache.org


Mime
View raw message