gcc 4.7 and newer support a new option, -gsplit-dwarf, that creates two object files: the normal .o file, without dwarf info, and a .dwo file, with the dwarf info. When using ccache, only the .o is kept around in the cache. Removing both files and running ccache again will only recreate the .o.
I think the added support will be very useful for big project like Mozilla thunderbird, or firefox. Of course, the problem is that ccache seems to assume single source file -> single object file mapping in its original design. I took a quick look at ccache.c file. There is a data object called cached_obj. http://gitweb.samba.org/?p=ccache.git;a=blob;f=ccache.c;h=377bfa1d73ce6364155678dbc0ef532f653c51aa;hb=HEAD#l101 101 /* 102 * Name (represented as a struct file_hash) of the file containing the cached 103 * object code. 104 */ 105 static struct file_hash *cached_obj_hash; 106 This is referenced about 70 times in the file, and I think those places where cached_obj_hash is referenced are where we can change the code to add support for the multiple object files (or object-like files produced from a single source file.) It may be that the lower level code/function that takes cached_obj_hash could be modified to handle such multiple output files, but I think it is better initially to support the changes at the top level in ccache.c to make the multiple-objects changes obvious and stand out to catch people's attention. But then I realize that ccache resorts to invoking the original compiler under some circumstances, and we need to handle those cases as well. I have not figured out where we should change the source file. (Not forgetting to copy and manage the created multiple object-like files from the original compiler in the cache.: i.e., .o and .dwo file in this case.) A good grad student or systems programming course project ?
For those interested, please check out https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support/ I have modified ccheck to the point it works with -gsplit-dwarf for successful compilations (and some corner-cases as well). Checkout, and ./autogen.sh configure make should get you a working binary. I am testing this with mozilla thunderbird build.
ishikawa@yk.rim.or.jp: Thanks for thinking/working on this. Meanwhile, I will make ccache 3.1.x bail out (i.e., disable itself) when -gsplit-dwarf is used (just like other similar options producing multiple output files, e.g. --coverage).
(In reply to comment #3) > ishikawa@yk.rim.or.jp: Thanks for thinking/working on this. > > Meanwhile, I will make ccache 3.1.x bail out (i.e., disable itself) when > -gsplit-dwarf is used (just like other similar options producing multiple > output files, e.g. --coverage). Thank you for your comment. I will get my version tested out by some users, and then monitor the latest development of the official ccache 3.1.x and will provide a cleaned up patch hopefully in a couple of months. TIA
Hi, I have been using the modified version of ccache that supports -gsplit-dwarf of gcc 4.7 (and up) so that .o and .dwo files created for a single source files are handled. I synched with the latest repository. And the current version works as far as the linking mozilla thunderbird works. So far, so good. However, I have been struggling with running gdb against the created binary. gdb could not find the symbol information. I am not sure if my version of gdb is the latest version that would understand the split-dwarf scheme. But I now know for certain that there is a problem with ccache as well. I have been posting my woes in mozilla bugzilla. https://bugzilla.mozilla.org/show_bug.cgi?id=905646 Here is a quote from the latest post. Basically, I was trying to figure out how to tell |gdb| where to look for symbol information. --- begin quote --- After searching for a clue using Google search engine, I could find an obscure reference to a utility called |dwp|. This is mentioned in a linux tool chain mail post: http://www.cygwin.com/ml/binutils/2012-10/msg00365.html and a RedHat document (there are a few similar documents): https://access.redhat.com/site/documentation/en-US/Red_Hat_Developer_Toolset/2/html/User_Guide/sect-binutils-Resources.html However, there is no on-line document about how to use |dwp| as far as I could tell. Also, the stock GNU gold directory in GNU binutils does't have |dwp| in it. Only CVS version of gold source tree at sourceware.org has it. ftp://sourceware.org/pub/binutils/snapshots/ I downloaded the latest CVS source and compiled gold. dwp utitily is also compiled and installed in my default /usr/local/bin directory. Running |dwp -help| printed out something like this. ishikawa@vm-debian-amd64:~/Dropbox$ dwp --help Usage: dwp [options] [file...] -h, --help Print this help message -e EXE, --exec EXE Get list of dwo files from EXE (defaults output to EXE.dwp) -o FILE, --output FILE Set output dwp file name -v, --verbose Verbose output -V, --version Print version number Now some of you may remember that I am using a hacked version of ccache that supports -gsplit-dwarf linking: this version of ccache supports the two outputs .o and .dwo and copies them into the cache and retrieves thenm when an identical compilation of .cpp file is requested (or so I thought. And as far as that is concerned, I was right. But read on.) Well, a surprise. When I tested |dwp| to see if "-e thunderbird-bin" spews out something meaningful, I got the following output. ishikawa@vm-debian-amd64:~/Dropbox$ dwp -v -e /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin /FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.dwo dwp: error: cannot open /FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.dwo: No such file or directory dwp: fatal error: /FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.dwo: can't open ishikawa@vm-debian-amd64:~/Dropbox$ WHOA!? But those of you who know the innards of ccache would immediately recognize the problem. --- end quote The problem is that |gcc-4.8 -gsplit-dwarf ...| embeds the exact source file pathname in the debug information section of the produced .o file. And that path name is looked up for searching the source file for debugging (and presumably lookin for .dwo file as well: .dwo file's path is pathname-A.dwo if .o file's path is pathname-A.o. If I am not mistaken, my version of ccache, which I modified to support -gsplit-dwarf, creates an intermediate preprocessed file, .i file, to calculate a hash value. If a search based on this hash produces a match, the corresponding .o (and .dwo file as well if -gsplit-dwarf is used) is returned as the compilation result. If a search failed, ccache compiles this .i file (with the pathname that looks like "/FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.i" created by ccache. Unfortunately, the use of this temporary pathname for preprocessed .i file is not good for the Fission scheme invoked by -gsplit-dwarf. I wonder the following modification is possible. It adds a certain overhead (thrown-away preprocessing once), but if there is no hit in the cache, instead of compiling the temporarily created .i file, can we invoke the real compiler to compile the original file (with the correct path of the source that would be used for debugging at run time) and put the produced .o (and .dwo if -gsplit-dwarf is given) into the cache? I looked at the ccache manual but there is no explicit mention of an option that would override the behavior of intermediate file pathname creation. TIA
(In reply to comment #5) > I wonder the following modification is possible. > It adds a certain overhead (thrown-away preprocessing once), but > if there is no hit in the cache, > instead of compiling the temporarily created .i file, > can we invoke the real compiler to compile the original file (with the correct > path of the source that would be used for debugging at run time) > and put the produced .o (and .dwo if -gsplit-dwarf is given) into the cache? Sounds like you want to set conf->run_second_cpp (AKA CCACHE_CPP2) to true if -gsplit-dwarf is used.
Sorry for my late reply. Your suggestion seems to work to an extent. I have been using the option on one of my TWO PCs where I installed the patched ccache/ But the added CPP preprocessing for a new compilation is not quite optimum and defeats the purpose of ccache somewhat. I will continue to use the option, and if the added overhead becomes unbearable (some files in mozilla source code expands to like 1 million lines of code !), I would ask GCC people if they can support an additional option that takes a filename argument and embeds THAT file name as the source file name in the produced debug information. It may be difficult to persuade GCC developers to add such a strange request :) But I presume simple GDB and -g usage also suffers from this problem. So there may be some users who need such options. TIA PS: Come to think of it, if CPP can output #file the_source_file_name or something like that at the very beginning of the output, the following gdb uses that file as the source file name (???)
(In reply to comment #5) > If a search failed, ccache compiles this .i file (with the pathname > that looks like > "/FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.i" > created by ccache. > > Unfortunately, the use of this temporary pathname for preprocessed .i file > is not good for the Fission scheme invoked by -gsplit-dwarf. This is incorrect. Both the location and the .o. in the filename suggest that this is a path of the output object file, rather than the input .i file. What in fact happens is that -gsplit-dwarf puts the .dwo in the same location as the .o and records this location. However ccache creates the .o in its cache directory instead of wherever the -o option passed to it said, and so that cache location gets included in the .o as the location, where the .dwo file will be looked for. You can run you ccache command wrapped in something like 'strace -f -e execve -s 1024' if you want to check. If it's not feasible to make ccache create the .o directly in the proper location, then gcc's -fdebug-map-prefix and clang's -fdebug-compilation-dir should be enough to fake it.
(In reply to comment #8) > (In reply to comment #5) > > If a search failed, ccache compiles this .i file (with the pathname > > that looks like > > "/FF-NEW/cache-dir/b/0/9d439fe3e01509d1030519704724ea-1758240.o.tmp.vm-debian-amd64.i" > > created by ccache. > > > > Unfortunately, the use of this temporary pathname for preprocessed .i file > > is not good for the Fission scheme invoked by -gsplit-dwarf. > > This is incorrect. Both the location and the .o. in the filename suggest that > this is a path of the output object file, rather than the input .i file. > Hi, thank you for your analysis. > What in fact happens is that -gsplit-dwarf puts the .dwo in the same location > as the .o and records this location. However ccache creates the .o in its cache > directory instead of wherever the -o option passed to it said, and so that > cache location gets included in the .o as the location, where the .dwo file > will be looked for. You can run you ccache command wrapped in something like > 'strace -f -e execve -s 1024' if you want to check. I will have something to do over the weekend. I will verify what is going and see if I can tweak where the .dwo will go, etc. > If it's not feasible to make ccache create the .o directly in the proper > location, then gcc's -fdebug-map-prefix and clang's -fdebug-compilation-dir > should be enough to fake it. I think your suggested -fdebug-map-prefix or clag's -debug-compilation-dir may be a better approach overall. (This is because processing .dwo location solely for this -fsplit-dwarf issue seems an overkill and unnecessarily complication for NON-split-dwarf compilation.) Anyway, thank you for your suggestion and let me find out what can be done quickly. TIA
(In reply to comment #8) > (In reply to comment #5) [...] > This is incorrect. Both the location and the .o. in the filename suggest that > this is a path of the output object file, rather than the input .i file. > You were right! > What in fact happens is that -gsplit-dwarf puts the .dwo in the same location > as the .o and records this location. However ccache creates the .o in its cache > directory instead of wherever the -o option passed to it said, and so that > cache location gets included in the .o as the location, where the .dwo file > will be looked for. You can run you ccache command wrapped in something like > 'strace -f -e execve -s 1024' if you want to check. Indeed this was the case! My eyes were confused with the path names based on the hashed values and did not catch this subtle issue. I also thought setting CCACHE_CPP2 solved the issue earlier and did not pay close attention to the issue until recently :-( > If it's not feasible to make ccache create the .o directly in the proper > location, then gcc's -fdebug-map-prefix and clang's -fdebug-compilation-dir > should be enough to fake it. Well, by tweaking my modified version of ccache's to create .o directly in the proper location (I am hoping I did not break anything: I followed the processing of profile option where .o is produced directly), I could finally make my binaries work with gdb (with proper setting of debug-file-directory to search for scattered .dwo files, etc.) (!) I no longer have to set CCACHE_CPP2 either (!) Latest version of my modified ccache is at https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support People are welcome to test it, and eventually incorporate the patch to the mainline. Thank you again for your comment. ccache works now with -gsplit-dwarf correctly! Here is a long writeup about how the bug was found and fixed. A short summary: Hi, I analyzed a little bit more, and found the following. I modified ccache further and now all is well. I found that the .dwo file pathname stored in debug section of .o object file contains an incorrect pathname (that of a temporary file pathname, and this file disappears after compilation and caching.) |.dwo| file pathname ought to be directly based on the original C/C++ source file (or at least that of a persistent cached data file.) I found a quick fix for this problem, and now the recorded |.dwo| pathname in the object file is properly the_basename_of_original_source_path.dwo (!) and gdb session works with proper setting for debug-file-directory (to look for created .dwo files in object file directories), etc. A longer summary: The current handling of the name of the object file created by the native C-compiler (using -o) and subsequent renaming, etc. for caching purposes is OK for the case of ordinary object files. However, the particular order of creation and renaming/copying is not correct any more if |-gsplit-dwarf| is given. (It must be similar to the handling of auxiliary file(s) for profiling. Regrettably, I mimicked the file processing for profiling files when I modified ccache to support .dwo file, but failed to note the peculiar order used for profiling files, and did not follow this peculiar order for .dwo file processing. Now I modified ccache so that it follows this peculiar order.) The object file name given to compiler (-o) must not be a temporary file that disappears after the compilation and saving to the ccache cache area because that path is used to create the path of .dwo file pathname which is stored in .o file. Specifically, in the scheme of my currently modified version of ccache, with -gsplit-dwarf, the object path name should not be that of temporary file that disappears after compilation (such a name has trailing suffix after the ".o" part.) Otherwise the embedded file pathname in the object file to look for the split .dwo file (created at the time of object generation) contains the temporary file's pathname that disappears after the compilation and caching of the object. This is bad. The first compilation and any subsequent re-compilations with cache hit end up with an object that contains the pathname of NON-existent .dwo pathname in it and so gdb can't find it. That was the root cause of the problems I had (aside from the failure to specify the directories where .dwo files ought to be searched). The following is a more lengthy explanation about how I found this, and a possible few methods to work around this issue. (Finally I chose the first method, i.e, to change the creation order of the object file and cached file a la the handling of profiling-related files.) How I found the issue: After compilation of thunderbird binary using my modified version of ccache to support "-gsplit-dwarf" (found in https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support ) and specifying -gsplit-dwarf to gcc command, I ran thunderbird-bin with gdb. I still get the missing information error as follows: [ Below, /REF-OBJ-DIR/objdir-tb3 is my MOZ_OBJ directory. ] [invoking gdb with thunderbird-bin binary... ] <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin...done. Dwarf Error: CU at offset 0x0 references unknown DWO with ID 296369af0b4237a4 [in module /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin] (gdb) quit So I checked why gdb still can't find proper .dwo file: I dumped the debug info stored in thunderbird-bin to learn what are the path names that point at separate dwo file(s). The following command was used. readelf -wi /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin | grep -3 dwo (output produced is in OUTPUT-1 near the end of this memo) I found that Dwarf ID 296369af0b4237a4 is assumed to be in the following file. /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.o.tmp.vm-debian-amd64.dwo [ /FF-NEW/cache-dir is my CCACHE's top directory. # ccache export CCACHE_DIR=/FF-NEW/cache-dir ] Note the trailing suffix of ".vm-debian-amd64.dwo" to the .dwo file pathname. [ |vm-debian-amd64| is the hostname of my local PC.] Now, I knew that something is wrong already. Why do we have a path in ccache's cache area? Also, when I look at the ccache's directory, the files left there are actually as follows. They are found in the cache directory "0/2/". /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.dwo /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.o /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.d Here, please note the suffix ".tmp.vm-debian-amd64.dwo" that is part of the pathname stored in the object file is not part of the existing files' path names. So the object file contains a path (for .dwo file) that refers to the ccache's cached directory, and the exact file path is not correct (module the trailing suffix for making it unique during compilation). This tells me that my modified version of ccache is copying the dwo file using the name of temporary object file before it is finally copied/renamed to an object file with the basename being equal to the basename of the original C source pathname! Something is wrong here. By checking the content of ".d" file, /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.d I found it is the object file produced from nsMailApp.c. [ ".d" file's first line starts with: nsMailApp.o: /REF-COMM-CENTRAL/comm-central/mail/app/nsMailApp.cpp \ ] [Analysis-2] I recompiled nsMailApp.cpp by inserting a blank line to force the real compilation to occur. Here is the log from this compilation using the modified version of ccache: --- From ccache log: [2014-03-21T23:25:28.337991 19890] (cached_obj) Object file /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o not in cache [2014-03-21T23:25:28.338037 19890] (tmp_obj) Outputting to temporary destination: /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.19890 [2014-03-21T23:25:28.338043 19890] Setting tmp_dwo to /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.dwo [2014-03-21T23:25:28.338060 19890] Running real compiler [2014-03-21T23:25:28.338066 19890] Executing /usr/bin/g++-4.8 -fno-builtin-strlen -Wl,--gdb-index -fPIC -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Werror=int-to-pointer-cast -Wtype-limits -Wempty-body -Wsign-compare -Wno-invalid-offsetof -Wcast-align -fno-exceptions -fno-strict-aliasing -fno-rtti -fno-exceptions -fno-math-errno -std=gnu++0x -pthread -pipe -gsplit-dwarf -g -O2 -freorder-blocks -fno-omit-frame-pointer -DDEBUG=1 -DDEBUG_4GB_CHECK -DUSEHELGRIND=1 -I../../mozilla/dist/system_wrappers -include /REF-COMM-CENTRAL/comm-central/mozilla/config/gcc_hidden.h -DMOZ_SOURCE_STAMP=cc3144d8a709 -DMOZ_SOURCE_REPO=http://hg.mozilla.org/comm-central/ -DAB_CD=en-US -DAPP_VERSION=30.0a1 -DTHUNDERBIRD_ICO="../../mozilla/dist/branding/thunderbird.ico" -DGRE_MILESTONE=30.0a1 -DGRE_BUILDID=20140321232458 -DXPCOM_GLUE -DMOZ_GLUE_IN_PROGRAM -DOSTYPE="Linux3" -DOSARCH=Linux -DNO_NSPR_10_SUPPORT -I/REF-COMM-CENTRAL/comm-central/mozilla/toolkit/xre -I/REF-COMM-CENTRAL/comm-central/mozilla/xpcom/base -I/REF-COMM-CENTRAL/comm-central/mozilla/xpcom/build -I/REF-COMM-CENTRAL/comm-central/mail/app -I. -I../../mozilla/dist/include -I../../mozilla/dist/include/nsprpub -I/REF-OBJ-DIR/objdir-tb3/mozilla/dist/include/nspr -I/REF-OBJ-DIR/objdir-tb3/mozilla/dist/include/nss -I/REF-OBJ-DIR/objdir-tb3/mozilla/dist/include -DDEBUG -D_DEBUG -DTRACING -DMOZILLA_CLIENT -include ../../mozilla/mozilla-config.h -c -o /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.19890 /REF-COMM-CENTRAL/comm-central/mail/app/nsMailApp.cpp [2014-03-21T23:25:28.338093 19890] Unlink /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.stdout.vm-debian-amd64.19890 [2014-03-21T23:25:28.338139 19890] Unlink /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.stderr.vm-debian-amd64.19890 [2014-03-21T23:25:34.603952 19890] Unlink /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.stdout.vm-debian-amd64.19890 [2014-03-21T23:25:34.603998 19890] Unlink /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.stderr.vm-debian-amd64.19890 My comment: The following log shows the moving of the object and .dwo files: temporaryfilepathname.o.tmp.vm-debian-amd64.19890 to temporaryfilepathname.o temporaryfilepathname.dwo.tmp.vm-debian-amd64.19890 to temporaryfilepathname.dwo ... then ... copy temporaryfilepathname.o to nsMailApp.o copy temporaryfilepathname.dwo to nsMailApp.dwo [2014-03-21T23:25:34.604029 19890] Successfully moved /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.19890 to /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o [2014-03-21T23:25:34.604046 19890] Successfully moved /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.dwo to /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.dwo [2014-03-21T23:25:34.604052 19890] (cached_obj) Stored in cache: /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o [2014-03-21T23:25:34.604058 19890] (cached_dwo) Stored in cache: /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.dwo [2014-03-21T23:25:34.604072 19890] Unlink nsMailApp.o via nsMailApp.o.vm-debian-amd64.19890.rmXXXXXX [2014-03-21T23:25:34.604183 19890] Copying /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o to nsMailApp.o via nsMailApp.o.vm-debian-amd64.19890.upgjyl (uncompressed) [2014-03-21T23:25:34.604390 19890] Created nsMailApp.o from /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o [2014-03-21T23:25:34.604401 19890] Unlink nsMailApp.dwo via nsMailApp.dwo.vm-debian-amd64.19890.rmXXXXXX [2014-03-21T23:25:34.604451 19890] Copying /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.dwo to nsMailApp.dwo via nsMailApp.dwo.vm-debian-amd64.19890.td6SMO (uncompressed) [2014-03-21T23:25:34.604613 19890] Created nsMailApp.dwo from /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.dwo [2014-03-21T23:25:34.604645 19890] Copying .deps/nsMailApp.o.pp to /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.d via /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.d.vm-debian-amd64.19890.72Ev1h (uncompressed) [2014-03-21T23:25:34.604722 19890] (cached_dep) Stored in cache: /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.d [2014-03-21T23:25:34.625238 19890] Added object file hash to /FF-NEW/cache-dir/a/8/136f90b139bba19aaf1d63ea1cad76-8672.manifest [2014-03-21T23:25:34.625272 19890] Unlink /FF-NEW/cache-dir/tmp/nsMailApp.tmp.vm-debian-amd64.19890.ii [[BUG/PROBLEM]] Here I noticed the smoking-gun: My modified version of ccache (and the original) first runs C compiler to creates the object in a temporary file created under cache directory (by specifying "-o filepathname" to the compiler command) with extra suffix, and then if successful, it renames/copies it into a cached object file path name ending in ".o" (without the extra suffix), then it copies this binary to the requested object file (on the original CC command line), i.e., nsMailApp.o. Note the -o option in the above log: -o /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.19890 When this object is compiled, it seems that gcc uses this temporary pathname (with ".tmp.vm-debian-amd64.19890") to generate the object file pathname, AND then append .dwo to it (after removing .o) to come up with the ".dwo" file pathname and stores this path in the generated .o object file (!) This is bad, and not desirable at all. Note the source file given to C compiler is /REF-COMM-CENTRAL/comm-central/mail/app/nsMailApp.cpp cf. The following is the output of .debug_info section of nsMailApp.o readelf -wi /REF-OBJ-DIR/objdir-tb3/mail/app/nsMailApp.o Contents of the .debug_info section: Compilation Unit @ offset 0x0: Length: 0xa6 (32-bit) Version: 4 Abbrev Offset: 0x0 Pointer Size: 8 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit) <c> DW_AT_ranges : 0x490 <10> DW_AT_low_pc : 0x0 <18> DW_AT_stmt_list : 0x0 <1c> DW_AT_GNU_dwo_id : 0x40e3bc5eda47ddbb <24> DW_AT_GNU_ranges_base: 0x0 <28> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mail/app <49> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/2/8/04130f169d4e2510f15cf12d1a985d-1808008.o.tmp.vm-debian-amd64.19890.dwo <a6> DW_AT_GNU_pubnames: 1 <a6> DW_AT_GNU_addr_base: 0x0 [[Workaround/Fix]] So, I think, disregarding subtle issues of using a single ccache cache by multiple users from remote machines where source files may be mounted (possibly with different top mount directory on different machines, I think for a single user on a local machine, I can pick up one of the following options to fix the problem. (The use of |.tmp.hostname.number| may help avoid undesirable collision when locking is not used [ccache does use locking, but I think for generating object using the real compiler, it may not use a lock to avoid unnecessary lock contention assuming that this suffix is enough to make the temporary file unique even if multiple users from remote machines try compiling]. Well, at least, my final modification is simply based on the same processing done for profiling-related files, and so should be as bug-free as the handling of profiling option.] 1: A direct approach if I can swap the order of creation of the final object file and cached file, things should be OK. OLD order: original.c (C/C++) -> hashedpath.o in cache -> copy back to original.o NEW order: original.c (C/C++) -> original.o -> save/copy to the hashedpath.o in cache Then, my attempts to use -gsplit-dwarf with ccache should succeed eventually. Initially, I was not so sure whether I could hack ccache in this manner easily. So I thought of a couple of ad-hoc fixes as noted below. But, as it tuned out, there is a variable |output_to_real_object_first| within ccache source. When this variable is set to |true|, ccache behaves in the desired new way I described above (!). Well, there is no direct way to set this variable to |true|. It is set only when profiling option is given to gcc. So, I modified my version of ccache to support -gsplit-dwarf to set this variable internally to |true| when -gsplit-dwarf is given (i.e., when internally split_dwarf_p is |true|. Where I should do this setting was not quite clear, so I set it in two places for now. That should not be a big problem.) Patched. Tested. ==> It works !!! The source is in https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support 2: Maybe a slightly easier approach If I create the first object copy directly to "*.o" without the extra suffix such as ".tmp.vm-debian-amd64.19890", then at least .dwo file will be created in a file that will remain in the cache before it is purged. (Currently, pathname.o.tmp.vmdeiban-amd64.19890.dwo is removed.) [[Eventually not tried since the approach 1 was chosen since, already the correct order of file processing was provided for profiling data, and I re-used the algorithm.]] 3. Maybe the easiest approach if it works. When attaching ".dwo" suffix, look for ".o." in the pathname and replaces the ".o." plus trailing data with ".dwo", and all should be well. [[Eventually not tried since the approach 1 was chosen.]] TIA! cf. OUTPUT-1: vm-debian-amd64:/tmp$ readelf -wi /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin | grep -3 dwo <c> DW_AT_ranges : 0x490 <10> DW_AT_low_pc : 0x0 <18> DW_AT_stmt_list : 0x0 <1c> DW_AT_GNU_dwo_id : 0x296369af0b4237a4 <24> DW_AT_GNU_ranges_base: 0x0 <28> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mail/app <49> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.o.tmp.vm-debian-amd64.dwo <a0> DW_AT_GNU_pubnames: 1 <a0> DW_AT_GNU_addr_base: 0x0 Compilation Unit @ offset 0xa4: -- <b0> DW_AT_low_pc : 0x403c30 <b8> DW_AT_high_pc : 0x6 <c0> DW_AT_stmt_list : 0x838 <c4> DW_AT_GNU_dwo_id : 0x45569e416fcef84d <cc> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mozglue/build <fa> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/f/1/5f35e06d9e4564c095e292e9f62eee-1684.o.tmp.vm-debian-amd64.26130.dwo <154> DW_AT_GNU_pubnames: 1 <154> DW_AT_GNU_addr_base: 0x5f0 Compilation Unit @ offset 0x158: -- <164> DW_AT_ranges : 0x680 <168> DW_AT_low_pc : 0x0 <170> DW_AT_stmt_list : 0x8a9 <174> DW_AT_GNU_dwo_id : 0x7cd0f272112f3a8c <17c> DW_AT_GNU_ranges_base: 0x4e0 <180> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mozglue/build <1ae> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/f/a/b3a3ba33e2e25bc0e8b32b02c92695-8199.o.tmp.vm-debian-amd64.26131.dwo <208> DW_AT_GNU_pubnames: 1 <208> DW_AT_GNU_addr_base: 0x5f8 Compilation Unit @ offset 0x20c: -- <218> DW_AT_ranges : 0x1310 <21c> DW_AT_low_pc : 0x0 <224> DW_AT_stmt_list : 0x9cd <228> DW_AT_GNU_dwo_id : 0x495e4f5948368bff <230> DW_AT_GNU_ranges_base: 0x6a0 <234> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <259> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/b/4/2c6e719b6fd0b12c6d22c2f0401b88-161397.o.tmp.vm-debian-amd64.32293.dwo <2b5> DW_AT_GNU_pubnames: 1 <2b5> DW_AT_GNU_addr_base: 0x758 Compilation Unit @ offset 0x2b9: -- <2c5> DW_AT_ranges : 0x3cd0 <2c9> DW_AT_low_pc : 0x0 <2d1> DW_AT_stmt_list : 0x1489 <2d5> DW_AT_GNU_dwo_id : 0xc176a2365120c362 <2dd> DW_AT_GNU_ranges_base: 0x1350 <2e1> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <306> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/7/8/2b87baacdd3bbd2b978d24f11e6d95-1615712.o.tmp.vm-debian-amd64.25544.dwo <363> DW_AT_GNU_pubnames: 1 <363> DW_AT_GNU_addr_base: 0x2178 Compilation Unit @ offset 0x367: -- <373> DW_AT_low_pc : 0x4090e0 <37b> DW_AT_high_pc : 0x23 <383> DW_AT_stmt_list : 0x2ace <387> DW_AT_GNU_dwo_id : 0x70c9ecac365d0a7b <38f> DW_AT_GNU_ranges_base: 0x3d30 <393> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <3b8> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/c/0/410b5b275568a95cd02479711227da-199575.o.tmp.vm-debian-amd64.25546.dwo <414> DW_AT_GNU_pubnames: 1 <414> DW_AT_GNU_addr_base: 0x4ba8 Compilation Unit @ offset 0x418: -- <424> DW_AT_ranges : 0x3e80 <428> DW_AT_low_pc : 0x0 <430> DW_AT_stmt_list : 0x2c5f <434> DW_AT_GNU_dwo_id : 0xdd08bcc5a8ae2b91 <43c> DW_AT_GNU_ranges_base: 0x3d80 <440> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <465> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/8/4/b4b2fda0986d83d6ac98aa93d0f917-95255.o.tmp.vm-debian-amd64.32298.dwo <4c0> DW_AT_GNU_pubnames: 1 <4c0> DW_AT_GNU_addr_base: 0x4bc0 Compilation Unit @ offset 0x4c4: -- <4d0> DW_AT_low_pc : 0x409190 <4d8> DW_AT_high_pc : 0x4e <4e0> DW_AT_stmt_list : 0x2e54 <4e4> DW_AT_GNU_dwo_id : 0xdf696b86ee50555f <4ec> DW_AT_GNU_ranges_base: 0x3eb0 <4f0> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <515> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/c/7/9fbb92e2020cead603dab4c2a35723-102275.o.tmp.vm-debian-amd64.25568.dwo <571> DW_AT_GNU_pubnames: 1 <571> DW_AT_GNU_addr_base: 0x4d00 Compilation Unit @ offset 0x575: -- <581> DW_AT_ranges : 0x51e0 <585> DW_AT_low_pc : 0x0 <58d> DW_AT_stmt_list : 0x2f8a <591> DW_AT_GNU_dwo_id : 0xca82fcbb2cdd3cbb <599> DW_AT_GNU_ranges_base: 0x3f10 <59d> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <5c2> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/d/a/f23e4f14bb239c19907c2fa2a0e976-146445.o.tmp.vm-debian-amd64.32313.dwo <61e> DW_AT_GNU_pubnames: 1 <61e> DW_AT_GNU_addr_base: 0x4d50 Compilation Unit @ offset 0x622: -- <62e> DW_AT_low_pc : 0x40a770 <636> DW_AT_high_pc : 0xda8 <63e> DW_AT_stmt_list : 0x3ba6 <642> DW_AT_GNU_dwo_id : 0xc68db1e1816e5f0d <64a> DW_AT_GNU_ranges_base: 0x5210 <64e> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <673> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/9/4/32c5adb3db58ce092d6a5c123574b1-160450.o.tmp.vm-debian-amd64.25586.dwo <6cf> DW_AT_GNU_pubnames: 1 <6cf> DW_AT_GNU_addr_base: 0x62e0 Compilation Unit @ offset 0x6d3: -- <6df> DW_AT_ranges : 0x7b00 <6e3> DW_AT_low_pc : 0x0 <6eb> DW_AT_stmt_list : 0x40fc <6ef> DW_AT_GNU_dwo_id : 0x45bd1a7f61c5bc01 <6f7> DW_AT_GNU_ranges_base: 0x5a70 <6fb> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <720> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/d/9/7d24168f3dcf711154ed6ba5f400a4-125500.o.tmp.vm-debian-amd64.32324.dwo <77c> DW_AT_GNU_pubnames: 1 <77c> DW_AT_GNU_addr_base: 0x6d20 Compilation Unit @ offset 0x780: -- <78c> DW_AT_ranges : 0x7bf0 <790> DW_AT_low_pc : 0x0 <798> DW_AT_stmt_list : 0x4e1a <79c> DW_AT_GNU_dwo_id : 0x64af6f4caa0816a4 <7a4> DW_AT_GNU_ranges_base: 0x7b40 <7a8> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <7cd> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/0/b/2fd25244ee8e9098b2958721ad1ac3-143519.o.tmp.vm-debian-amd64.32334.dwo <829> DW_AT_GNU_pubnames: 1 <829> DW_AT_GNU_addr_base: 0x8950 Compilation Unit @ offset 0x82d: -- <839> DW_AT_low_pc : 0x40dc50 <841> DW_AT_high_pc : 0x71 <849> DW_AT_stmt_list : 0x503f <84d> DW_AT_GNU_dwo_id : 0xc506ebd4061ce66a <855> DW_AT_GNU_ranges_base: 0x7c20 <859> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <87e> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/1/4/fe95b17c99ae20a74f2fd6b81bd0f0-99325.o.tmp.vm-debian-amd64.25618.dwo <8d9> DW_AT_GNU_pubnames: 1 <8d9> DW_AT_GNU_addr_base: 0x8bb0 Compilation Unit @ offset 0x8dd: -- <8e9> DW_AT_ranges : 0x93b0 <8ed> DW_AT_low_pc : 0x0 <8f5> DW_AT_stmt_list : 0x518c <8f9> DW_AT_GNU_dwo_id : 0x491969f8e52b54d5 <901> DW_AT_GNU_ranges_base: 0x7c70 <905> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <92a> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/f/d/9e07192934646eba029ff511f74076-184402.o.tmp.vm-debian-amd64.25620.dwo <986> DW_AT_GNU_pubnames: 1 <986> DW_AT_GNU_addr_base: 0x8be8 Compilation Unit @ offset 0x98a: -- <996> DW_AT_ranges : 0xa1e0 <99a> DW_AT_low_pc : 0x0 <9a2> DW_AT_stmt_list : 0x5e3b <9a6> DW_AT_GNU_dwo_id : 0xa6676a86d5caf46b <9ae> DW_AT_GNU_ranges_base: 0x9400 <9b2> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <9d7> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/e/3/c8d61de611430e16c4d554bf0d8998-129641.o.tmp.vm-debian-amd64.25634.dwo <a33> DW_AT_GNU_pubnames: 1 <a33> DW_AT_GNU_addr_base: 0xa590 Compilation Unit @ offset 0xa37: -- <a43> DW_AT_low_pc : 0x411510 <a4b> DW_AT_high_pc : 0xc82 <a53> DW_AT_stmt_list : 0x650f <a57> DW_AT_GNU_dwo_id : 0xead599ce9759370 <a5f> DW_AT_GNU_ranges_base: 0xa220 <a63> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <a88> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/b/a/2544de0fbf1cc47196de30b56f4019-151744.o.tmp.vm-debian-amd64.32353.dwo <ae4> DW_AT_GNU_pubnames: 1 <ae4> DW_AT_GNU_addr_base: 0xb298 Compilation Unit @ offset 0xae8: -- <af4> DW_AT_ranges : 0xb730 <af8> DW_AT_low_pc : 0x0 <b00> DW_AT_stmt_list : 0x6a8c <b04> DW_AT_GNU_dwo_id : 0xe5293ce56c3e90b5 <b0c> DW_AT_GNU_ranges_base: 0xab90 <b10> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/mfbt <b35> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/9/d/b32682b808a86c64ed7cde72b30a4e-135703.o.tmp.vm-debian-amd64.25652.dwo <b91> DW_AT_GNU_pubnames: 1 <b91> DW_AT_GNU_addr_base: 0xbca0 Compilation Unit @ offset 0xb95: -- <ba1> DW_AT_low_pc : 0x4133f0 <ba9> DW_AT_high_pc : 0x473 <bb1> DW_AT_stmt_list : 0x70f8 <bb5> DW_AT_GNU_dwo_id : 0xa47dc76a09b2d291 <bbd> DW_AT_GNU_ranges_base: 0xb780 <bc1> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/xpcom/glue/standalone <bf7> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/f/5/3836fa1ede70eda91d49d8e5baa889-281482.o.tmp.vm-debian-amd64.dwo <c4d> DW_AT_GNU_pubnames: 1 <c4d> DW_AT_GNU_addr_base: 0xca08 Compilation Unit @ offset 0xc51: -- <c5d> DW_AT_low_pc : 0x413870 <c65> DW_AT_high_pc : 0xb0c <c6d> DW_AT_stmt_list : 0x7486 <c71> DW_AT_GNU_dwo_id : 0xd85c7c1f48186606 <c79> DW_AT_GNU_ranges_base: 0xbc00 <c7d> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/xpcom/glue/standalone <cb3> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/0/5/d6ab16869aacc7ee3212bd2db5750f-682862.o.tmp.vm-debian-amd64.dwo <d09> DW_AT_GNU_pubnames: 1 <d09> DW_AT_GNU_addr_base: 0xcef0 Compilation Unit @ offset 0xd0d: -- <d19> DW_AT_low_pc : 0x414380 <d21> DW_AT_high_pc : 0x1da <d29> DW_AT_stmt_list : 0x7b09 <d2d> DW_AT_GNU_dwo_id : 0x862034e6e3bac28b <d35> DW_AT_GNU_ranges_base: 0xbe70 <d39> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/xpcom/glue/standalone <d6f> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/4/d/32897988686492c0b7dcf5a2b77ca6-492831.o.tmp.vm-debian-amd64.dwo <dc5> DW_AT_GNU_pubnames: 1 <dc5> DW_AT_GNU_addr_base: 0xd860 Compilation Unit @ offset 0xdc9: -- <dd5> DW_AT_low_pc : 0x414560 <ddd> DW_AT_high_pc : 0xab3 <de5> DW_AT_stmt_list : 0x7e02 <de9> DW_AT_GNU_dwo_id : 0xc2bea6c4680fd391 <df1> DW_AT_GNU_ranges_base: 0xbf30 <df5> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mozilla/xpcom/glue/standalone <e2b> DW_AT_GNU_dwo_name: /FF-NEW/cache-dir/a/6/4b1e20c0eb03484712372e7205f369-300266.o.tmp.vm-debian-amd64.dwo <e81> DW_AT_GNU_pubnames: 1 <e81> DW_AT_GNU_addr_base: 0xd9d8 OUTPUT-2: After the latest fix to ccache, the following is in the debug info sction of nsMailApp.o. Note the .dwo file name, nsMailApp.dwo. (This was /FF-NEW/cache-dir/0/2/056a98ef332fca9ec9462de72994c7-1808007.o.tmp.vm-debian-amd64.dwo before the fix.) env LANG=C LC_ALL=C readelf -wi /REF-OBJ-DIR/objdir-tb3/mail/app/nsMailApp.o Contents of the .debug_info section: Compilation Unit @ offset 0x0: Length: 0x57 (32-bit) Version: 4 Abbrev Offset: 0x0 Pointer Size: 8 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit) <c> DW_AT_ranges : 0x490 <10> DW_AT_low_pc : 0x0 <18> DW_AT_stmt_list : 0x0 <1c> DW_AT_GNU_dwo_id : 0xaaa120ce2d0c5e1d <24> DW_AT_GNU_ranges_base: 0x0 <28> DW_AT_comp_dir : /REF-OBJ-DIR/objdir-tb3/mail/app <49> DW_AT_GNU_dwo_name: nsMailApp.dwo <57> DW_AT_GNU_pubnames: 1 <57> DW_AT_GNU_addr_base: 0x0 [end of memo]
(In reply to comment #10) ... > Latest version of my modified ccache is at > https://bitbucket.org/zephyrus00jp/ccache-gsplit-dwarf-support > People are welcome to test it, and eventually incorporate the patch to the > mainline. The modified source above was created based on the development snapshot at https://github.com/jrosdahl/ccache.git So merging to the latest github source would be much easier than to somewhat diverged ccache 3.1.9 source files. TIA
BTW, now the modified version passes all the tests make test and so it is in a good shape, maybe we can begin considering how to merge this into mainline. TIA
(In reply to ishikawa from comment #12) > BTW, now the modified version passes all the tests > make test > and so it is in a good shape, maybe we can begin considering how to merge > this into mainline. > > TIA Any idea how to land the patches to the supported ccache source tree? (Should I create a broken down patches as a patch set, etc.?) TIA
(In reply to ishikawa from comment #13) Yes, a patch set applying to latest master would be preferable.
(In reply to Joel Rosdahl from comment #14) I will try to create a patch hopefully before the end of the year... (My day job keeps me busy for the next several weeks.) TIA
OK, sounds good!
Created attachment 10567 [details] Patch to support -gsplit-dwarf Hi, Here is the long awaited patch for supporting -gsplit-dwarf of GCC 4.8 and later. The patch is created against the git repository. Aside from the main change, I have added a few error output to utility functions such as x_unlink to detect when somehow the deletion failed. (This seems to suggest some unneeded deletion attempts of temporary files. But I may be wrong.) Some logging functions have now extra error checks so that ccache prints something to stderr when the logging directory fills up and ccache can't write to it any more. Also, I added a prefix to logged output to tell me which internal variable (basically file path) is used for moving/copying of files. This helped me in detecting an unnecessary extra copy when the compiler generates an output object directly (and it must do so when -gsplit-dwarf is invoked since the source pathname is embedded in the created .dwo file and this is used later by gdb to look at the original source files). There is a fix to omit the unnecessary extra copy step in such cases. Keeping such extra prefix string is up to you. I would be happy if the major changes to support -split-dwarf is accepted. I have been using this modified ccache for several months and did not see any problems and at least |make test| runs so I have not introduced any bugs to normal operations when "-gsplit-dwarf" is not given. I have not used the pch option to precompile header, and so there may be some unintended interactions. So testing by brave users are welcome. But again you don't have to be so brave since the modification works most of the time. (I have been compiling mozilla thunderbird using this. successfully both under 32-bit and 64-bit linux. Using -gsplit-dwarf and GNU gold linker has been indispensable to compile thunderbird under 32-bit Debian GNU/Linux. Without -gsplit-dwarf and GNU gold linker, the linking runs out of virtual space under 32-bit linux!) Enjoy! TIA
(In reply to ishikawa from comment #17) Thanks! I've started to have a look at your patch now, sorry about the delay. Here are a couple of initial questions: 1. Could you comment on the usage of output_to_real_object_first? Is it really necessary to introduce it again (it was removed in https://git.samba.org/?p=ccache.git;a=commit;h=18a645451194becb832bd1ff4fee1c1e9f3d0dc5)? 2. There is some commented out code in the patch. Is it just something that could/should be removed? 3. called_compiler should be a bool. However, isn't it the case that called_compiler is true when (and only when) mode == FROMCACHE_COMPILED_MODE? If so, there is no need for the extra flag, just check "mode" instead. 4. Not sure I understand the comment "should be ATOMIC, but since we already created cp using x_strdup, and format, we should be relatively safe." - could you clarify if there are any potential issues? 5. There are 14 TODOs left in the code - I guess they should be addressed or removed? 6. "Fixed typo: form -> from": Nope, that's not a typo. :-) "form" is used as a verb in this case; it has the same meaning as "construct". 7. "reindented:" - why? 8. Please see HACKING.txt for some hints regarding the code style. Or just follow the existing code style. For instance, don't compare pointers against NULL, keep lines up to 80 characters with tab size 2, put one space between if/switch/for/while/do and opening curly brace. 9. 'not sure if we should put .dwo file hash in manifest.. Is it possible that the content of manifest is used to clear/remove files when "ccache -C" is issued? If so, then we need to put .dwo file entry in manifest.': No, I don't see a reason to put .dwo stuff in the manifest. 10. "perror(fname); /* to catch strange errors. */" - that's not a good idea. ccache should never print error messages on its own without also calling fatal(). So it's either "print error + fatal()" or "cc_log() + failed()".
(In reply to Joel Rosdahl from comment #18) Sorry I missed this post since I was down with type A flu. I will read this and hopefully have answers tomorrow. TIA
(In reply to Joel Rosdahl from comment #18) In reply to ishikawa from comment #17) >Thanks! > >I've started to have a look at your patch now, sorry about the delay. > >Here are a couple of initial questions: Thank you for your comment. > >1. Could you comment on the usage of output_to_real_object_first? Is >it really necessary to introduce it again (it was removed in >https://git.samba.org/?p=ccache.git;a=commit;h=18a645451194becb832bd1ff4fee1c1e9f3d0dc5)? OK, let me re-explain the background. The issue was that I wanted to perform the direct generation of the object in the target object file first so that the correct pathname for the object is embedded in target object. So I modified the handling of object generation when "-gsplit-dwarf" is given (a la profiling file handling, etc.) OLD order: original.c (C/C++) -> hashedpath.o in cache -> copy back to original.o NEW order: original.c (C/C++) -> original.o -> save/copy to the hashedpath.o in cache It worked like a charm, but then I noticed a performance problem last year. ccache was doing AN EXTRA UNNECESSARY COPY: >NEW order: > original.c (C/C++) -> original.o -> save/copy to the hashedpath.o in cache After the copy to the hashdedpath.o in cache, ccache DID the EXTRA COPY of hashedpath.o in cache -> orginal.o An additional flag +/* in conjunction with -gsplit-dwarf, try to reduce copying */ +static int called_compiler = 0; was introduced to avoid this extra copy. Well, to be honest my main change was done before the removal |output_to_real_object_first| and so it is possible that I can do away with it. But this needs careful review. Does profiling work without |output_to_real_object_first|? And it does not suffer from the extra copying I mentioned? (Sorry I am writing this on a computer where is not convenient to replace ccache.) >2. There is some commented out code in the patch. Is it just something >that could/should be removed? Yes: there are three #if 0, #endif lines, and a few block comments similar to the following: + /* + * if something failed internal to get_file_from_cache + * we may want to remove these files: + * if(output_dwo) x_unlink(output_dwo); + * if(cached_dwo) x_unlink(cached_dwo); + * x_unlink(output_obj); + * x_unlink(cached_stderr); + * x_unlink(cached_obj); + * x_unlink(cached_dep); + *x_unlink(cached_dia); + */ The former can be safely removed. The latter, the block comment, is left since I noticed that calling convenion and internal processing of |get_file_from_cache()| was modified. Previously if there was an error encountered by get_file_from_cache, ccache removed the possibly corrupt/bogus/irrelevant files left around. But I don't see the code for removal any more, so I left the comment. If this omission of removal of files is not an oversight, I think it is OK to remove these block comments, too. >3. called_compiler should be a bool. However, isn't it the case that >called_compiler is true when (and only when) mode == >FROMCACHE_COMPILED_MODE? If so, there is no need for the extra flag, >just check "mode" instead. Your suggestion makes sense. History: I was not familiar with the internal of ccache and was not sure how I can check if I am calling the compiler to produce an object and not copying the cached object. So I introduced |called_compiler| and used it. I was not so sure if I can rely on (mode == FROMCACHE_COMPILED_MODE) exclusively. But to see if that was the case, I added assert() if that would hold in practice to check called_compiler => assert(mode == FROMCACHE_COMPILED_MODE); !called_compiler => assert(mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE); e.g.: + if (!called_compiler) { + assert(mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE); + get_file_from_cache(cached_dwo, output_dwo); + + } else { + assert(mode == FROMCACHE_COMPILED_MODE); + cc_log("[direct object generation]: skipped the copy from cached_dwo=%s to output_dwo=%s", + cached_dwo, output_dwo); + } But now that you mention it, I think I can safely use (mode == FROMCACHE_COMPILED_MODE) instead. >4. Not sure I understand the comment "should be ATOMIC, but since we > already created cp using x_strdup, and format, we should be > relatively safe." - could you clarify if there are any potential > issues? The comment is from this portion: + cp = strrchr(tmp_obj, '.'); + assert(cp != NULL); + { + /* should be ATOMIC, but since we already created cp using + * x_strdup, and format, we should be relatively safe. + */ + *cp = '\0'; + tmp_dwo = format("%s.dwo", tmp_obj); + *cp = '.'; + } As you can see, I am inserting a NUL character into a string of the form "string-possibly-with-dot-within.suffix-without-dot" to obtain "string-possibly-with-dot-within" so that I can produce a filepath "string-possibly-with-dot-within.dwo". Then afterward, I re-inserted the overwritten "." into the string to recover the original value. Very tricky. In a multi-threaded program, this type of variable mutation may not be a great idea. But ccache is single-threaded, it should be OK. I wrote the comment at the very early stage when I tinkered with ccache, and was not sure if doing this kind of data mutation was OK or not. Even if ccache becomes multi-threaded, here the program is working on a duped copy and so it should be relatively safe operation from the viewpoint of unintended data change done by a competing thread. I can probably modify the comment to something like String is modified in place. Since we already created the string x_strdup, and format, we should be relatively safe and not cause unintended change even if ccache becomes multi-threaded. >5. There are 14 TODOs left in the code - I guess they should be addressed or removed? Here are comments on each appearance of "TODO": Some are still active todo item, but others have been handled one way or the other and so either be removed or a new modified comment is appropriate. (i) + if (tmp_dia) { /* CI ??? TODO/FIXME: dont we need this? */ + tmp_unlink(tmp_dia); + } This is a real todo item to investigate. This call to tmp_unlink(tmp_dia) one is missing from ccache. Should we not want to remove this? (ii) fd = open(tmp_stderr, O_RDONLY | O_BINARY); if (fd != -1) { - if (str_eq(output_obj, "/dev/null") || errno == ENOENT) { + /* TODO/FIXME: the order of OR clauses + * may have some issue. + */ + if ( str_eq(output_obj, "/dev/null") + || errno == ENOENT + ) { /* we can use a quick method of getting the failed output */ copy_fd(fd, 2); I think I put the comment based on the earlier code. This one can be safely taken out. To be honest, I am not entirely sure why we are checking /dev/null is here. Does anyone want to run C compiler and generate output to /dev/null? (What about .dwo file in that case?) Maybe for testing? (iii) + /* + * TODO/FIXME: check return value of stat() and act + * accordingly. + */ if (conf->compression) { - stat(cached_stderr, &st); + /* + * The file was compressed, + * so obtain the compressed size again. + */ + if ( stat(cached_stderr, &st) == 0) { + stats_update_size(file_size(&st), 1); + } else { + cc_log("stat on cached_stderr=%s failed (%s)", cached_stderr, strerror(errno)); This TODO/FIXME should be changed to TODO/DONE or DONE: Previously the return value of stat was not checked. Suggested change: Even "DONE" is spurious. + /* + * check return value of stat() and act accordingly. + */ or can be removed. (iv) - put_file_in_cache(output_obj, cached_obj); + /* TODO/DONE : + if(output_to_real_object_first) { + Perform the extra copy of .dwo file if --gsplit-dwarf is used. + } else if(move_uncompressed_file(...) != 0) { + The above line of move_uncompressed_file() + ought to be repeated for .dwo file if --gsplit-dwarf is used. + } + */ This one is the comment about what is done for |--gsplit-dwarf|. I put it in since the code in if() {...} else { ...} is rather long-winding and it was difficult to see what we are doing. "TODO/DONE" can be changed to "Description of what is done below." (v) + /* TODO/FIXME: if we output to real object file first, then + tmp_obj *IS* the real object, and so we should not remove it! + and instead, we should not copy back the cached file to the final + destination object. This is a waste of time. + Same can be said of .dwo file: real and cached one below. + Somehow dependency file was not copied this way. + */ TODO/FIXME => TODO/DONE => DONE => "" so that the first line of the comment block reads + /* If we output to real object file first, then BTW, this is where the extra unnecessary copying from the cache file to the final destination is done when we output to final destination object file first. (vi) + /* TODO/FIXME: if we output to real object file first, then + tmp_dwo *IS* the real .dwo file, and so we should not remove it! + and instead, we should not copy back the cached file to the final + destination .dwo. This is a waste of time. + Same can be said of .o file above: real and cached one. + Somehow dependency file was not copied this way. + */ + /* tmp_unlink(tmp_dwo); */ + } This is the repeat of the longer comment above in (v). This can be simply changed to + /* If we output to real object file first, then + tmp_dwo *IS* the real .dwo file, and so we should not remove it! + */ + /* tmp_unlink(tmp_dwo); */ I left /* tmp_unlink(tmp_dwo) */ to stress the change and what should not be done here. (vii) + /* TODO: if split_dwarf_p is not true and we get non-null + * get_path_in_cache(object_name, ".dwo"), we probably should purge + * that file! + */ + if(!split_dwarf_p) { + char *cp; + cp = get_path_in_cache(object_name, ".dwo"); + if(!cp) { + cc_log("We are not using split dwarf but, we have .dwo. Maybe we should purge it.:%s", cp); + } + } This is a real todo item: somehow a .dwo file (the extra debug information file generated when |-gsplit-dwarf| is used) is left lying around. It is unlikely to happen, come to think of it since the chance of hash generated for a compiler command without |-gsplit-dwarf| matching to an existing entry with |-gsplit-dwarf|. But I inserted this code while modifying ccache to support |-gsplit-dwarf|. Strange things do happen and abort due to errors may leave such files around. (viii) - if (!str_eq(output_obj, "/dev/null")) { + /* TODO/DONE: In the else clause, do a similar processing for + split dwarf file. cached_dwo. in case -gsplit-dwarf is specified. */ + + if (str_eq(output_obj, "/dev/null")) { I can simply take out "TODO/DONE" here to say + /* In the else clause, do a similar processing for + split dwarf file. cached_dwo. in case -gsplit-dwarf is specified. */ (ix) + /* TODO/FIXME: + * if something failed internal to get_file_from_cache + * we may want to remove files. + * ... This is the removal of files that seemed to have disappeared which I mention in the answer to your question No. 2 above. It can be safely taken out if you think it is OK. (x) + /* TODO/FIXME: not sure if we should put .dwo file hash in manifest.. + * Is it possible that the content of manifest is used to clear/remove + * files when "ccache -C" is issued? If so, then we need to put + * .dwo file entry in manifest. + */ + This is answered in your comment number 9 below. So I can change this whole comment into + /* We do not have to put .dwo file hash in manifest. + */ (xi) - stat(manifest_path, &st); - stats_update_size(file_size(&st) - old_size, old_size == 0 ? 1 : 0); + + /* + * TODO/FIXME: Check the return value of stat and act accordingly. + */ + if (stat(manifest_path, &st) == 0) { + stats_update_size((file_size(&st) - old_size), + old_size == 0 ? 1 : 0); + } else { + cc_log("stat on manifest (manifest_path) %s failed (%s), so size statistics is incorrect.", + manifest_path, strerror(errno)); + } In the original code, the return value of stat() was not checked. It is in the patch now. So I can simply say + /* + * Check the return value of stat and act accordingly. + */ (xii) + /* TODO/FIXME: Is there an angle in which split_dwarf_p processing + has anything to do with "-" output ? */ + /* don't try to second guess the compilers heuristics for stdout handling */ This is an actual todo item to investigate. What should we do if |-gsplit-dwarf| is given when "-" is given as the output. As the existing comment suggests we may not want to try to second guess the compilers hueristics and simply bail out if "-" is given as the output and |-gsplit-dwarf| is given at the same time. Your opinion is appreciated. (xiii) + /* TODO/DONE: Do we need to cope with strange + output_obj file name in case split_dwarf_p ? */ + I put in this comment during the modification of ccache. I think I can safely remove this now, or put + /* "-gsplit-dwarf" may encounter corner cases + when strange file names are given. */ This is a real possibility. (xiv) + /* TODO/DONE: cc_process_args ought to have generated + * output_dwo. But we may not want to do so yet before + * generating_dependencies ? + */ To be honest, this comment is a little hazy. But look at the comment below . @@ -2337,6 +2896,18 @@ cc_process_args(struct args *args, struct args **preprocessor_args, } } + /* -gsplit-dwarf support + * + * -gsplit-dwarf processing should not affect dependency generation. + * BUT, given the command line below, + * dependency file is generated first and then + * ccache.o is generated. + * Note the ordering and take care of that. + * + * ccache /usr/bin/gcc-4.8 -gsplit-dwarf -DHAVE_CONFIG_H -DSYSCONFDIR=/usr/local/etc -I. -I. -MD -MP -MF .deps/ccache.c.d -g -O2 -Wall -W -Werror -c -o ccache.o ccache.c + * + */ + So maybe the dubious comment should be re-written as + /* DONE: cc_process_args ought to have generated + * output_dwo. But we may not want to do so before + * generating_dependencies because it has been the + * implicit ordering that the dependency + * is generated first and then object file is generated. + */ So in a nutshell: i real todo item to investigate ii the comment can be taken out (it refers to the old code, I think) iii This has been taken care of and so a positive comment that describes the current behavior may be in order, or can be removed iv "TODO/DONE" can be changed to "Description of what is done below." v "TODO/FIXME" is actually now done so "/* DONE: if we output to real object file first, then" ... vi This is a repetition of (v) and can be shortened to /* If we output to real object file first, then tmp_dwo *IS* the real .dwo file, and so we should not remove it! */ vii This is a real todo item to handle a very rare but possible case of unwanted .dwo file somehow creeping into the cache and stay there. viii Simply take out "TODO/DONE" and say /* In the else clause, do a similar processing for split dwarf file. cached_dwo. in case -gsplit-dwarf is specified. */ ix Answered to Question 2 above. Can be safely removed if the missing unlink function calls are intentiona. x Answered in Question/comment 9 below. So a short comment as follows suffices. /* We do not have to put .dwo file hash in manifest. */ xi So I can simply say, by removing "TODO/FIXME" /* * Check the return value of stat and act accordingly. */ xii This is an acutal to do item: what should we do when "-" is given as the output at the same time "-gsplit-dwarf" is given. xiii can be safely removed. xiv A reworded comment should be put in. >6. "Fixed typo: form -> from": Nope, that's not a typo. :-) "form" is used as a verb in this case; it has the same meaning as "construct". > Mea culpa. I was not reading the comment very carefully :-) > 7. "reindented:" - why? On my screen, the particular line fills up to the right very much. What is the adopted maximum line length? (Oh, maybe I should check HACKING.txt.) If 80 characters are the maximu, it fits certainly. However, when I was tinkering with the comment line to change a word or two, emacs, the editor I was using, tried to re-indent the line. Thus I re-indented the whole paragraph. Maybe I should try to set the enviromental setting of Emacs to set the right margin for comment line to 79 or 80 column. >8. Please see HACKING.txt for some hints regarding the code style. Or > just follow the existing code style. For instance, don't compare > pointers against NULL, keep lines up to 80 characters with tab size > 2, put one space between if/switch/for/while/do and opening curly > brace. >For instance, don't compare pointers against NULL OK. I found the use of |if (pointer)| rather odd, but if the current existing style mandates it. Fine. > keep lines up to 80 characters with tab size 2 Do you know by chance if there is a good emacs mode line to enforce "tab size 2"? BTW, this |tab size 2| does not have any thing to do with the indentation, does it? I mean I see code like the following which indent lines with full 8 characters (??) --- begin quote --- static void add_prefix(struct args *args) { char *e; char *tok, *saveptr = NULL; struct args *prefix; int i; if (str_eq(conf->prefix_command, "")) { return; ---- end quote ---- Oh now it has dawned on me. I am using emacs and the tab characters have the effect of full 8 column indentation: tab skips to the next tab position (which is 8 chars apart). So I see "char *e;" the first argument declaration line at the 8th stop on the screen. But you say "tab stop 2". Do you use an editor that sets tab positions two characters apart? Does this mean thta maybe I should use REAL TWO SPACE CHARACTER indentation or what? (But that will produce many whitespace only changes.) I need to understand - what editor you are using, - how tab spacing is handled by the editor for display, AND - how tab input is expanded into space characters (or not expannded at all.) I suspect that you use tab to indent the line, and the editor keeps the input tab characters as tab internally, but the skipping of tab on your editor screen is done at tab stops two characters apart. Correct? In my case, emacs uses tab (or new line) to move the next input position for a line to the intended indent position and expand the TAB character to the desired spacing so that the default indentation is two SPACE characters before starting a declaration (this behavior can be changed by one's customization.) So tab character is usually absent in the source file. I have just realized that real TABs are used in ccache source files. I wonder if these real tab characters have the effect of skipping to tab stop positions two characters apart. So maybe the above code snipet which has 8 character indentation in my emacs buffer looks as follows in your editor? (I replaced tab character with two space characters.) --- This is from ccache.c static void add_prefix(struct args *args) { char *e; char *tok, *saveptr = NULL; struct args *prefix; int i; if (str_eq(conf->prefix_command, "")) { return; ---- My modification does use TAB character since when I insert a new line after an existing line, emacs tries to follow the starting position of the previous line, and if the indented space is multiple of 8 [this is the case since in my setting, tab characters skip to the next tab position 8 characters apart], emacs uses the appropriate number of tab characters for indentation (again, this use of tab character can be controlled by customization by emacs.). Now that I realize this subtle issue, I rechecked the indentation in ccache.c: it seems that there have been different contributors since there are place where TAB characters have been used for indentation, but there are four space characters for indentation as well. Oh well. I will check "one space between if/switch/for/while/do and opening curly brace" manually. >9. 'not sure if we should put .dwo file hash in manifest.. Is it >possible that the content of manifest is used to clear/remove files >when "ccache -C" is issued? If so, then we need to put .dwo file entry >in manifest.': No, I don't see a reason to put .dwo stuff in the >manifest. Thank you for the clarification. 10. "perror(fname); /* to catch strange errors. */" - that's not a good idea. ccache should never print error messages on its own without also calling fatal(). So it's either "print error + fatal()" or "cc_log() + failed()". So this piece of patch fd = open(fname, O_RDONLY|O_BINARY); if (fd == -1) { + perror(fname); /* to catch strange errors. */ return false; } should be fd = open(fname, O_RDONLY|O_BINARY); if (fd == -1) { + perror(fname); /* to catch strange errors. */ + fatal(); return false; } This is it. TIA for your comment. I am afarid that I am away from computers over the weekend, but once you digest my comments and repost your idea of going about modifying the patch and integration, I will follow the plan. TIA
Created attachment 10813 [details] Patch to support -gsplit-dwarf (rev01) Here is a modified patch. I have tried to address the comments in the new patch. However, I have not addressed the following issues yet. 1. the use of output_to_real_object_first, and 2. most importantly the tab size 2 issue: I still can't figure out what emacs mode I should be using even after I read HACKING.txt and inserted an experimental mode line at the top of ccache.c and another mode setting comment at the end of ccache.c. (With the current mode line and mode setting done at the end of the file, I *SEE* something like the following in Emacs buffer. That is Emacs shows the initial TAB character on the first line as a skipping to the tab position (now at 2.) --- This is from ccache.c static void add_prefix(struct args *args) { char *e; char *tok, *saveptr = NULL; struct args *prefix; int i; if (str_eq(conf->prefix_command, "")) { return; ---- *HOWEVER*, if I re-indent the whole file, I think I get something very different from what joel would like to have and very unexpected result from my own view, too.. I seem to get four tab characters for the initial 8 character indentation (8 = 4 * 2). This certainly messes up the file. If someone figures out how I should be using these modes to fit into the style of HACKING.txt, please let me know. Aside from these, I modified the comments along the line of previous discussion (slightly modified, though) and did away with the use of perror() and instead use cc_log() now, etc. I have run |make test| and it succeeded and so it did not introduce new errors when we don't use -gsplit-dwarf. I am yet to run it through the build of mozilla thunderbird yet. TIA
Created attachment 10815 [details] Patch to support -gsplit-dwarf (rev02) I have noticed that the ccache git repository has incorporated a few changes to accommodate win32 environment and such. So I have re-created a diff (sorry again, this is not a git patch). - The diff is against the latest ccache git. - I have cleaned up some comments and hopefully, they are more palatable. - Comparison of pointer to NULL in getopt_long.c and snprintf.c were there before I touched the ccache, so the changes in the previous patch were reverted. The files are not touched at all. (Obviously, some people, like me, regards the use of "if (pointer)" rather strange looking.) - still not checking/reviewing usage of output_to_real_object_first. - tab spacing issue. I am struggling with it. I removed the smart-tab submode since it does not work as intended when I modified the source code. I wonder if someone can suggest a set of suitable parameters GNU indent so that can run through my local copy of ccache.c, for example, to match the indentation style that is expected. TIA
Created attachment 10819 [details] Patch to support -gsplit-dwarf (rev03) hopefully close to final :-) This is the final cleaned up patch. (Rev03) I handled the two remaining issues. - |output_to_real_object_first|, and - the space issues. I changed the source finally to reflect the removal of |output_to_real_object_first|. In so doing, I found |tmp_obj|, and |tmp_dia| are no longer then required and modified the code accordingly. In one place, a copy of diagnostic output file previously done in my code seems unnecessary if "output to real object first" is always done. But just in case, I simply kept the code of block in #if 0/#endif. That part can be removed safely when you merge the changes. I also found the real cause of the the space issues and fixed them finally, I hope: I didn't realize that I was creating the diff using diff -ibcwr (note the flags for ignoring the whitespaces, etc.) After removing the flags, I noticed where my modified code (re-indented in Emacs, I think) is different from ccache git repository from the viewpoint of spaces and fixed most of the space issues by hand. So diff should be easy to check now. However, a few diffs caused by strange use of all spaces without using tab at all in git repository files remain. TIA CI
(In reply to ishikawa from comment #23) I hasten to add that - |make test| passed and so I did not add bugs when "-gsplit-dwarf" is not used, and - I tested the operation of "-gsplit-dwarf" on local PC and it works ! TIA
(In reply to ishikawa from comment #20) > [...] > Does profiling work without |output_to_real_object_first|? The test suite thinks so, at least. :) > And it does not suffer from the extra copying I mentioned? Yes, it does. Thanks for mentioning it! Fixed in 38301c0dad27523f59d874759595944558ba022d. This also means that your patch is affected, so please move your changes forward. So, please remove the output_to_real_object_first (commented out) logic, the "if (mode == FROMCACHE_COMPILED_MODE /* called compier */) {" check, etc. > The latter, the block comment, is left since I noticed that calling > convenion and internal processing of |get_file_from_cache()| was > modified. Previously if there was an error encountered by > get_file_from_cache, ccache removed the possibly > corrupt/bogus/irrelevant files left around. But I don't see the code > for removal any more, so I left the comment. It's OK to leave commented away code for work-in-progress patches, but if you indent to produce potentially final versions of the patch, please remove left-over code and comments as much as possible. > If this omission of removal of files is not an oversight, I think it > is OK to remove these block comments, too. I don't think it was an oversight. But even if it was, it's not related to gsplit-dwarf support, right? Please keep unrelated changes apart in different patches - it's just so much harder to review and understand a big patch with many different types of changes than one to-the-point patch. > As you can see, I am inserting a NUL character into a string [...] Very tricky. I suggest doing this instead: char *base_name = remove_extension(output_obj); tmp_dwo = format("%s.dwo", base_name); free(base_name); And it only needs to be done if split_dwarf_p is true, right? > Even if ccache becomes multi-threaded, here the program is working > on a duped copy and so it should be relatively safe operation from > the viewpoint of unintended data change done by a competing thread. No need to try to account for potential future changes; it will only complicate the logic without much gain, I would say. > This call to tmp_unlink(tmp_dia) one is missing from ccache. Should > we not want to remove this? There is no longer any tmp_dia file, so it can't be removed. But again, this shouldn't be related to -gsplit-dwarf support. > Does anyone want to run C compiler and generate output to /dev/null? Not related to -gsplit-dwarf support. > Previously the return value of stat was not checked. Not related to -gsplit-dwarf support, please revert it and put in another patch suggestion. > What should we do if |-gsplit-dwarf| is given when "-" is given as > the output. As the existing comment suggests we may not want to try > to second guess the compilers hueristics and simply bail out if "-" > is given as the output and |-gsplit-dwarf| is given at the same > time. I think that "-o -" only is supported by non-GCC compilers since the GCC versions I have lying around fail when specifying "-" as the output. I don't think you need to do anything special for the -gsplit-dwarf case since compilation with GCC will fail anyway with or without -gsplit-dwarf. > Do you know by chance if there is a good emacs mode line to enforce "tab size 2"? Not sure I understand your question, but: (setq tab-width 2). > - what editor you are using, > - how tab spacing is handled by the editor for display, AND > - how tab input is expanded into space characters (or not expannded > at all.) I'm using Emacs with http://www.emacswiki.org/emacs/SmartTabs for ccache code. Indentation is done with one tab character per level and alignment with space characters. You can then use whatever tab width (= tab display size) you want. I personally use tab width 2. (I actually prefer using spaces for indentation, but since the ccache code used tabs for indentation when I took over maintainership, that's the way it is.) > diff -ibcwr (note the flags for ignoring the whitespaces, etc.) Why don't you use Git instead? For one thing, you wouldn't get bogus changes in the patch (e.g. for version.c) and you would be able to more easily rebase your changes to new changes on the branch. I have not started to look at your latest patch yet. You produce so much text and questions that it takes a long time to just read and understand it. :-) I would appreciate less braindumping, to be honest. Please generate the patch against latest master. And try to make it as small and neat as possible: split unrelated changes to different patches, remove lengthy comments about stuff to maybe investigate and do. Thanks!
(In reply to Joel Rosdahl from comment #25) Thank you for your comments. I will regenerate a new patch and split them into a series of small patches so that you can decide whether a particular change or two should be in the final ccache, etc. As you decide to pickup hunks or not, I will re-generate the subsequent patches, etc. Yes, I will use git. I started to hack on ccache.c source code before I realized there was master git for ccache and eventually realized that there is one. So my local changes are in a git repository, but it started on a frozen image of some years ago... I decided to merge my local changes into newly created repository from ccache git master and along the merge, will create the small patches (logically separated I hope), but minor interactions between the patches are hard to avoid... TIA CI
(In reply to Joel Rosdahl from comment #25) Before uploading a series of patch, here is to recap what I did: >> [...] >> Does profiling work without |output_to_real_object_first|? > > The test suite thinks so, at least. :) I have removed the use of |output_to_real_object_first| per the latest git repository. >> And it does not suffer from the extra copying I mentioned? > > Yes, it does. Thanks for mentioning it! Fixed in > 38301c0dad27523f59d874759595944558ba022d. This also means that your > patch is affected, so please move your changes forward. > > So, please remove the output_to_real_object_first (commented out) > logic, the "if (mode == FROMCACHE_COMPILED_MODE /* called compier */) > {" check, etc. I merged my local patch with the latest git repository. I left a few assert statements as a sanity check in my patch. Support for -gsplit-dwarf is a major change after all. We can remove them after half a year or so. >> The latter, the block comment, is left since I noticed that calling >> convenion and internal processing of |get_file_from_cache()| was >> modified. Previously if there was an error encountered by >> get_file_from_cache, ccache removed the possibly >> corrupt/bogus/irrelevant files left around. But I don't see the code >> for removal any more, so I left the comment. > > It's OK to leave commented away code for work-in-progress patches, but > if you indent to produce potentially final versions of the patch, > please remove left-over code and comments as much as possible. Will do. >> If this omission of removal of files is not an oversight, I think it >> is OK to remove these block comments, too. > > I don't think it was an oversight. But even if it was, it's not > related to gsplit-dwarf support, right? Please keep unrelated changes > apart in different patches - it's just so much harder to review and > understand a big patch with many different types of changes than one > to-the-point patch. Yes, for this particular comment, I created a separate patch so that we can decide whether the disappearance of x_unlink() was intentional/wise or it is not a major issue after all, etc. >> As you can see, I am inserting a NUL character into a string [...] Very tricky. > > I suggest doing this instead: > > char *base_name = remove_extension(output_obj); > tmp_dwo = format("%s.dwo", base_name); > free(base_name); > Thank you for the hint. > And it only needs to be done if split_dwarf_p is true, right? Ahem. There is a complication. The way -gsplit-dwarf is handled as command option, I am afraid I have to create tmp_dwo as soon as I learn the output file name (by -o I think) fairly early even before I see "-gsplit-dwarf" eventually and so I am afraid that I have to create tmp_dwo here always... >> Even if ccache becomes multi-threaded, here the program is working >> on a duped copy and so it should be relatively safe operation from >> the viewpoint of unintended data change done by a competing thread. > > No need to try to account for potential future changes; it will only > complicate the logic without much gain, I would say. OK, fair enough. >> This call to tmp_unlink(tmp_dia) one is missing from ccache. Should >> we not want to remove this? > > There is no longer any tmp_dia file, so it can't be removed. But > again, this shouldn't be related to -gsplit-dwarf support. You are right. |tmp_dia| is gone and so this no longer matters. >> Does anyone want to run C compiler and generate output to /dev/null? > > Not related to -gsplit-dwarf support. Actually it does. Although you figured out (later in the post) that gcc doesn't handle /dev/null with -gsplit-dwarf well, I needed to ask what we should do for .dwo file name generation, etc. But if gcc does not work, this is a no starter. Let us forget it. >>> Previously the return value of stat was not checked. > > Not related to -gsplit-dwarf support, please revert it and put in > another patch suggestion. I created a separate patch for this. >> What should we do if |-gsplit-dwarf| is given when "-" is given as >> the output. As the existing comment suggests we may not want to try >> to second guess the compilers hueristics and simply bail out if "-" >> is given as the output and |-gsplit-dwarf| is given at the same >> time. > > I think that "-o -" only is supported by non-GCC compilers since the > GCC versions I have lying around fail when specifying "-" as the > output. I don't think you need to do anything special for the > -gsplit-dwarf case since compilation with GCC will fail anyway with or > without -gsplit-dwarf. All right. Fair enough. I did not realize other compilers support "-o -". >> Do you know by chance if there is a good emacs mode line to enforce "tab size 2"? > > Not sure I understand your question, but: (setq tab-width 2). Well, actually I tried this, but not quite well in conjunction with SmartTabs mode as you mention (also in HACKING.txt). I am using emacs 24.3 from GNU ftp server, and I am not sure what I am doing wrong, but the next patches I will upload hopefully are better generated in terms of white space: I checked strange diffs caused by the different usage of space/tab characters and modified the affected lines manually. >> - what editor you are using, >> - how tab spacing is handled by the editor for display, AND >> - how tab input is expanded into space characters (or not expannded >> at all.) > I'm using Emacs with http://www.emacswiki.org/emacs/SmartTabs for > ccache code. Indentation is done with one tab character per level and > alignment with space characters. You can then use whatever tab width > (= tab display size) you want. I personally use tab width 2. I tried SmartTabs, but somehow it did not work as I hope it would, and so reverted to adjusting tab/space manually using "tabify" occasionally since as you mention, the original code used tabs for indentation. > (I actually prefer using spaces for indentation, but since the ccache > code used tabs for indentation when I took over maintainership, that's > the way it is.) Maybe as the maintainer, you can force a new style by announcing a flag day in the future when you run all the source code of ccache through gnu indent to your liking :-) [I am half serious. In linux kernel, Alan Cox did exactly that when he was tired of the sphagetti code of SCSI subsystem. He created a singlet patch set that is the result of running gnu indent on SCSI subsystem source code. White space change only patch set about a dozen years ago or so. ] >> diff -ibcwr (note the flags for ignoring the whitespaces, etc.) > > Why don't you use Git instead? For one thing, you wouldn't get bogus > changes in the patch (e.g. for version.c) and you would be able to > more easily rebase your changes to new changes on the branch. My next revision consist of a few separate patch sets. Yes, using git. > I have not started to look at your latest patch yet. You produce so > much text and questions that it takes a long time to just read and > understand it. :-) I would appreciate less braindumping, to be honest. > > Please generate the patch against latest master. And try to make it as > small and neat as possible: split unrelated changes to different > patches, remove lengthy comments about stuff to maybe investigate and > do. Thanks! Some lengthy comments may still remain since I am not sure if the issue is resolved completely now. But the patch sets as a whole would be much easier to understand, I hope. TIA
I am uploading these files. CAVEAT: The following patches are created during the work to support -gsplit-dwarf (FissionDwarf) of GCC. (See bug 10005 ) Some of the patches have interdependence and if earlier patches are not applied, the later ones may fail. List of patches. 0001-Harden-logging-perror-replaced-with-cc_log-and-fatal.patch 0002-hash-file-creation-failure-is-recorded-now.patch 0003-Use-x_unlink-instead-of-unlink-per-HACKING.txt.patch 0004-make-sure-the-result-of-stat-call-is-checked.patch 0005-Enhance-log-message-add-prefix-var.-name-and-a-few-a.patch 0006-warning-about-missing-x_unlinks-when-get_file_from_c.patch 0007-Support-for-gsplit-dwarf.patch 0008-Missing-copy-of-diagnostics-file-in-case-of-failure.patch Short explanation. 1: 0001-Harden-logging-perror-replaced-with-cc_log-and-fatal.patch harden logging: trace errors and/or abort perror() -> cc_fail and fatal This was necessary when I realized that log was not written to a smallish partition for testing and the partition filled up. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. This is to util.c only and so can be independent of main ccache.c change to support -gsplit-dwarf. 2: 0002-hash-file-creation-failure-is-recorded-now.patch hash file creation failure is recorded now. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. This is to hash.c only and so can be independent of main ccache.c change to support -gsplit-dwarf. 3: 0003-Use-x_unlink-instead-of-unlink-per-HACKING.txt.patch This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository. 4: 0004-make-sure-the-result-of-stat-call-is-checked.patch make sure the result of stat() call is checked. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository. 5: 0005-Enhance-log-message-add-prefix-var.-name-and-a-few-a.patch Enhance log message: add prefix (var. name) and a few additions. These enhanced log messages were crucial in developing and debugging the support for -gsplit-dwarf. Also, I could find the extra unnecessary copy between the cached and real object files by these log messages. So I think this change is very important for enhancing ccache in the future or when subtle bugs may creep in. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository. 6: 0006-warning-about-missing-x_unlinks-when-get_file_from_c.patch This is about comment that warns about missing x_unlinks when get_file_from_cache() failed. (Not all of the warning comments are in this patch.) Somehow certain unlinking of files using x_unlink when get_file_from_cache failed (these unlinking calls were in the summer of 2014, I think.) I wonder if this is quite intentional, and I just wanted to remind developers these are now gone by inserting comments. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository. BTW, the changes for pure support of -gsplit-dwarf also have similar comments. 7: 0007-Support-for-gsplit-dwarf.patch -gsplit-dwarf support Main patch. Some part of the diff/patch overlap with lines modified by the previous patches and so these patches must be applied in the order given here. 8: 0008-Missing-copy-of-diagnostics-file-in-case-of-failure.patch This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience and since this was in the previous patch. I noticed that when an error occurs, a premature exit path is taken, but maybe we might want to copy the temporary diagnostic (*_dia) file so that it is stored permanently somewhere. I tried to do something to this effect, but then the change of default behavior of ccache to produce output in the final destination first (and then copy them back to cache destination) may have unnecessitated this change. But I am not so sure. So again to alert the devlopers (and myself) about this, I left a comment to this effect. I will upload the files now, but filing different bugzilla for some of these will be done tomorrow.
Created attachment 10843 [details] patch 1: Harden logging by tracing errors and/or abort, etc. 1: 0001-Harden-logging-perror-replaced-with-cc_log-and-fatal.patch harden logging: trace errors and/or abort perror() -> cc_fail and fatal This was necessary when I realized that log was not written to a smallish partition for testing and the partition filled up. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. This is to util.c only and so can be independent of main ccache.c change to support -gsplit-dwarf.
Created attachment 10844 [details] patch 2: hash file creation failure is recorded now
Created attachment 10845 [details] patch 3: use x_unlink instead of unlink per HACKING.txt 3: 0003-Use-x_unlink-instead-of-unlink-per-HACKING.txt.patch This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository.
Created attachment 10846 [details] patch 4: make sure the result of stat call is checked. 4: 0004-make-sure-the-result-of-stat-call-is-checked.patch make sure the result of stat() call is checked. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository.
Created attachment 10847 [details] patch 5: enhance log messages and add a few cc_log calls. 5: 0005-Enhance-log-message-add-prefix-var.-name-and-a-few-a.patch Enhance log message: add prefix (var. name) and a few additions. These enhanced log messages were crucial in developing and debugging the support for -gsplit-dwarf. Also, I could find the extra unnecessary copy between the cached and real object files by these log messages. So I think this change is very important for enhancing ccache in the future or when subtle bugs may creep in. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository.
Created attachment 10848 [details] part 6: warning about missing x_unlinks 6: 0006-warning-about-missing-x_unlinks-when-get_file_from_c.patch This is about comment that warns about missing x_unlinks when get_file_from_cache() failed. (Not all of the warning comments are in this patch.) Somehow certain unlinking of files using x_unlink when get_file_from_cache failed (these unlinking calls were in the summer of 2014, I think.) I wonder if this is quite intentional, and I just wanted to remind developers these are now gone by inserting comments. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. Since this changes ccache.c, it may interfere with the later patch application if this is not applied to ccache repository. BTW, the changes for pure support of -gsplit-dwarf also have similar comments.
Created attachment 10849 [details] patch 7: -gsplit-support 7: 0007-Support-for-gsplit-dwarf.patch -gsplit-dwarf support Main patch. Some part of the diff/patch overlap with lines modified by the previous patches and so these patches must be applied in the order given here.
Created attachment 10850 [details] patch 8: Missing copy of diagnostics file in case of failure 8: 0008-Missing-copy-of-diagnostics-file-in-case-of-failure.patch This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience and since this was in the previous patch. I noticed that when an error occurs, a premature exit path is taken, but maybe we might want to copy the temporary diagnostic (*_dia) file so that it is stored permanently somewhere. I tried to do something to this effect, but then the change of default behavior of ccache to produce output in the final destination first (and then copy them back to cache destination) may have unnecessitated this change. But I am not so sure. So again to alert the developers (and myself) about this, I left a comment to this effect.
Comment on attachment 10844 [details] patch 2: hash file creation failure is recorded now I forgot attach this excerpt to patch 2. 2: 0002-hash-file-creation-failure-is-recorded-now.patch hash file creation failure is recorded now. This is actually not related to -gsplit-dwarf and I will file a separate bugzilla, but anyway I post the identical file here for convenience. This is to hash.c only and so can be independent of main ccache.c change to support -gsplit-dwarf. Anyway, I will bugzilla for patches unrelated to -gsplit-dwarf directly tomorrow. (But then again, if ccache.c is modified by the patch(es), they may have overlapping lines with main patch (for -gsplit-dwarf) and patch(es) may not applied cleanly unless the patches are applied in the order given. TIA
My goodness, I have created the patch using the mail feature of git, but obviously this was not good :-( You have to download it to view it (Or maybe I can change mime-type as an afterhought?)
Now I see if I click on the "patch" box in "Details" of the attachment, I can see the patch inside the web browser (after I changed the mime type to application/x-diff). Sorry for the trouble. TIA
I have filed separate bugzillas for the following: >List of patches. >0001-Harden-logging-perror-replaced-with-cc_log-and-fatal.patch in bug 11154 >0002-hash-file-creation-failure-is-recorded-now.patch in bug 11155 > 0003-Use-x_unlink-instead-of-unlink-per-HACKING.txt.patch in bug 11156 > 0004-make-sure-the-result-of-stat-call-is-checked.patch in bug 11157 > 0005-Enhance-log-message-add-prefix-var.-name-and-a-few-a.patch in bug 11158 > 0006-warning-about-missing-x_unlinks-when-get_file_from_c.patch in bug 11159 0007-Support-for-gsplit-dwarf.patch This is the main patch for *THIS* bugzilla entry (bug 10005) > 0008-Missing-copy-of-diagnostics-file-in-case-of-failure.patch in bug 11160 TIA
Thanks for splitting up the patch! I'll have a look at the new ones in a not too distant future.
Created attachment 10996 [details] Part 7 (updated gsplit-dwarf support) Dear Joel, After thinking the merging strategy, I decided to put the patch 5 "enhance log messages and a few cc_log calls" on hold. That patch can wait until the major part of the patch, namely the core support for FissionDwarf is merged. (part7 above.) So I refreshed the patch 7 (to skip the enhanced log portion) and resynced with the main git repository and uploading it. I noticed that profiling support is in the git repository now. You may notice that -gsplit-dwarf support is very much like the profiling support since ".dwo" file is handled in a similar manner as .cov file is. So I think merging of gsplit-dwarf support should be easier now that you know how the processing of compiler to produce an additional file beside the main object file is like. As a matter of fact, many added code snippets appear near the handling of profiling file. Thank you in advance for your attention. Chiaki Ishikawa
Thanks for the patch update and your work on -gsplit-dwarf support! > Note: cc_process_args ought to have generated/set output_dwo. But we may not > want to do so before generating dependencies (or set generating_dependencies) > because it has been the implicit ordering that the dependency is generated > first and then object file is generated. (CI failed to record the particular > command line, etc. which necessitated him to insert the assert() statement > below.) See a block comment that starts with "-gsplit-dwarf support: Ordering > of dependency generation." in the earlier part of this file. I have read the above and the comment it refers to, but I still don't understand what dependency generation has to do with -gsplit-dwarf support. Could you please explaind the correlation between the two in more details? Have you considered writing some test cases for -gsplit-dwarf support (see test.sh)? Do you think that it would be feasible? Now some comments on the patch itself: > +/* -gsplit-dwarf support: Split dwarf information (GCC 4.8 and up). Contains pathname if not NULL. */ This line (and other lines) are too long, see HACKING.txt. > +bool split_dwarf_p = false; I happen to think I know what "_p" means, but I doubt most developers do. :-) I suggest renaming it to something like "using_split_dwarf". > +/* Note: we do not need to put .dwo file in MANIFEST file. */ Remove unnecessary comment. > + cc_log("stat on manifest (manifest_path) %s failed (%s), so size statistics is incorrect.", > + manifest_path, strerror(errno)); Please fix alignment of arguments (align with spaces). > + /* -gsplit-dwarf support: creating .dwo file name > + * > + * Question: What is the reliable way of creating the .dwo file name > + * in the case of using |-o output_obj| to the compiler ? > + * Answer: Heuristics based on one run of GCC 4.8 is placed below, and > + * it works with gcc 4.9 as well. > + * > + * BTW, no matter whether -gsplit-warning is placed on the command > + * line, or if so, whether it has been seen or not, we need to set > + * tmp_dwo in this part of the code to reflect the value of the > + * current output_obj to tmp_dwo. > + * If -gsplit-dwarf is specified on the command line eventually, this > + * tmp_dwo will be used. > + */ > + > + // Creation of .dwo file name. > + // In the extreme simple case, > + // Try replacing .o with .dwo. > + // > + // Fact: gcc-4.8 produced the following .dwo file for the command > + // below: > + // t-test.o.tmp.dwo > + // > + // /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -o t-test.o.tmp.gazonk > + // > + // So here is the heuristics (works with gcc 4.9 also). > + // Strip anything after the final '.' and add "dwo" instead. I suggest replacing the above comments with this: /* GCC (at least 4.8 and 4.9) forms the .dwo file name by removing everything * after (and including) the last "." from the object file name and then * appending ".dwo". */ > + { /* code suggested by Joel. */ Thanks for mentioning, but I think the comment should be removed. ;-) > + /* -gsplit-dwarf support requires the removal of a few more files.*/ Remove unnecessary comment. > + /* gsplit-support modification */ Remove unnecessary comment (twice). > + /* > + * OBSERVATION: processing stderr does not have to be changed in the > + * presence of "-gsplit-dwarf". > + */ Remove unnecessary comment. > + if (tmp_dwo) { > + tmp_unlink(tmp_dwo); > + } tmp_dwo will never be NULL here, right? So the check can be removed. > + /* -gsplit-dwarf support: > + * The code immediately above created object file (output_obj). > + * We should also repeat the similar procedurek for .dwo file > + * when "-gsplit-dwarf is given. > + * Because of the order in which tmp_dwo is set when command line is > + * scanned, we can have non-null tmp_dwo even when split_swarf_p is > + * false. Check only when split_dwarf_p is true. > + */ Remove unnecessary comment. > + if (tmp_dwo && split_dwarf_p) { tmp_dwo can never be NULL here, so remove that condition. > + /* Description of processing below.: > + * Perform the copy of .dwo file if --gsplit-dwarf is used. > + */ > + > + /* stats_update(STATS_TOCACHE) is now called AFTER the processing of > + * .dwo file below. > + */ > + > + /* > + * Caution: if we output to real object file first, then > + * output_obj *IS* the real object, and so we should not remove it! > + * Also, in this case, we should NOT copy back the cached file to the > + * final destination object. This is a waste of time. > + * Same can be said of .dwo file: real and cached one below. > + * > + * Somehow dependency file was not copied this way. > + */ > + > + /* -gsplit-dwarf support: > + * if -gsplit-dwarf is given on the compiler command line, > + * .dwo file ought to be put into cache as well. So let's do it here. > + */ Remove unnecessary comments. > + if (split_dwarf_p) > + { Put brace on previous line, see HACKING.txt. > + /* Caution: if we output to real object file first, then > + * tmp_dwo *IS* the real .dwo file, and so we should not > + * remove it! The original code is left as a comment line > + * below to stress this important point. > + */ > + /* tmp_unlink(tmp_dwo); */ Remove. > + if (tmp_dwo) > + free(tmp_dwo); No need to check for non-NULL here. (free(NULL) is a no-operation.) > + /* TODO/NOTE: Sanity Check. > + * If split_dwarf_p is NOT true and we get non-null > + * get_path_in_cache(object_name, ".dwo"), we probably should purge > + * that file! Unlikely to happen, but non-zero probability. > + * But this is not so critical. We will purge the file eventually by > + * LRU algorithm. > + * So just leave a log message. > + */ > + if (!split_dwarf_p) { > + char *cp; > + cp = get_path_in_cache(object_name, ".dwo"); > + if (!cp) { > + cc_log("We are not using split dwarf but, we have .dwo. Maybe we should purge it.:%s", cp); > + } > + } Looks like you have misunderstood what get_path_in_cache does: it only formats a path string; it doesn't check whether there is a file in the cache. In other words, cp will never be NULL. I suggest to just remove the comment and the code. > + /* OBSERVATION: processing for profile_use is a good example > + * which we can mimic to implement the support for > + * --gsplit-dwarf. > + * So the support for -gsplit-dwarf was modeled after the > + * processing to support profile. > + */ Remove unnecessary comment. > + * For -gsplit-dwarf support, check for .dwo as well. See below. > + * cached_obj contains the path for cached object file. Remove unnecessary comment. > + * Notice the logic. We only check when output_dia is not NULL. Remove unnecessary comment. > + * We should repeat the above check for cached_dwo. Remove unnecessary comment. > + /* we get file(s) from cache unless we generate object in direct mode */ I don't understand this comment. What do you mean by generating object in direct mode? > + /* sanity check for -gsplit-dwarf support */ > + assert( (mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE)); The FROMCACHE_COMPILED_MODE state was removed in 38301c0dad27523f59d874759595944558ba022d, so the assert can be removed. > + /* We repeat the same procedure as above for .dwo file if split_dwarf_p */ Remove unnecessary comment. > + /* > + * we don't need to get file from cache when we called the > + * compiler in this run of ccache since the output_dwo would > + * have been created by the compiler. > + */ > + /* internal sanity check: the following means compiler > + * was not called. We are in from_cache() remember?_*/ > + assert(mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE); Remove. > + /* > + * Note: if something failed internal to get_file_from_cache > + * we may want to remove a flurry of files. > + * There were calls to x_unlinks in the original code > + * circa 2014 summer. > + */ Remove unnecessary comment. > + /* NOTE: The error processing above assumed we only do a copy/hardlink > + * for cached_obj. We may need to repeat the copy and error > + * processing for cached_dwo in a similar manner instead of doing the > + * copy/link only. > + * > + * But do note that the unlinking of related files > + * MUST BE REPEATED in the both cases of the failures of copy/link > + * of cached_obj -> output_obj > + * and cached_dwo -> output_dwo if (split_dwarf_p). > + */ Remove unnecessary comment. > + /* > + * Note: if something failed internal to get_file_from_cache > + * we may want to remove a flurry of files. > + * There were calls to x_unlinks in the original code circa 2014 summer. > + */ Remove unnecessary comment. > + /* OBSERVATION: The above seems to be the best place where we > + * can copy object file and dwo file safely (-gsplit-dwarf support). > + */ Remove unnecessary comment. > + /* -gsplit-dwarf support */ Remove unnecessary comment. > + /* Note: we do not need to put .dwo file in MANIFEST file. */ Remove unnecessary comment. > + /* -gsplit-dwarf support: internal sanity check */ Remove unnecessary comment. > + /* OBSERVATION: We need to create output_dwo if > + -gsplit-dwarf is given. But we should not do here > + YET since --gsplit-dwarf may be given later on the > + command line !*/ Remove unnecessary comment. > + /* -gsplit-dwarf support > + * This place is chosen as the place to handle -gsplit-dwarf. > + */ Remove unnecessary comment. > + /* we should add this option to the compiler option > + when the real compiler is invoked. */ Remove unnecessary comment. > + /* found_S_opt needs to be handled with caution with split_dwarf_p */ Remove unnecessary comment. > + * So split_dwarf_p is set to false. Remove unnecessary comment. > + * Note: Just a tidbit. When -S is given on the command line, > + * the cached assember output file in the cache spool actually > + * has .o suffix. Interesting, isn't it? > + */ Well, .o stands for "whatever output the compiler produces", not "object file" per se. :-) Remove the comment. > + /* Naive question: Is there an angle in which split_dwarf_p processing > + * has anything to do with "-" output that is handled specialy below? > + * For example, what do we want to do with .dwo file (produce it > + * somewhere, but what path name to use)? > + * => > + * As it turns out, GCC does not work with output with "-" and > + * since -gsplit-dwarf is only for GCC at this moment (March, 2015), > + * we don't have to worry about this. > + */ Remove unnecessary comment. > + /* -gsplit-dwarf support: > + * We have probably already seen and check for > + * -gsplit-dwarf by the time we reach here. (We did it somewhere above.) > + * Here is the chance to create output_dwo based on the value of output_obj. > + * The basic outline would be > + * output_obj = basename(input_file) > + * add ".dwo" to it with minor tweaks to handle special cases. > + */ > + > + /* > + * Hueristics is based on the following observation. > + * Special case 1: What do we do if we need to output .s file > + * for output and -gsplit-dwarf is specified? > + * What file(s) does GCC 4.8 produce exactly? > + * Answer: > + * /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -S > + * generated > + * t-test.s > + * that contains assembler directives to generate data (dwarf data). > + * This means that if we are only producing assembler source > + * we do not need to handle ".dwo" file at all. > + * > + * /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -S -o t-test.s.gazonk > + * produced > + * t-test.s.gazonk > + * > + * DONE: > + * The above special cases need to be handled. > + * The modification below more or less assumed that .dwo file is produced > + * always (together with the paired .o file) unless "-S" is specified. > + * > + * File-scope global variable |found_S_opt| is set to true > + * when "-S" is specified. > + * > + * Special case 2: /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.s > + * produced > + * t-test.o > + * t-test.o.dwo. > + * > + * So let us disable split_dwarf_p to false if > + * found_S_opt is true. > + * This is done at the end of a loop in argument processing. > + */ Remove unnecessary comments. > + /* -gsplit-dwarf support: create output_dwo */ Remove unnecessary comment. > + if (split_dwarf_p && !found_S_opt) { Since you set split_dwarf_p to false if found_S_opt is true, you can remove the "&& !found_S_opt" part. > + { /* code suggested by Joel. */ Remove unnecessary comment. > + /* Long-term question: Do we need to cope with strange output_obj file > + * name in case split_dwarf_p? "-gsplit-dwarf" support may encounter > + * corner cases when strange file names are given. > + * Answer: Don't worry too much. GCC itself may not run with strange > + * path name in the first placbe. We may find the answer in the long > + * run. > + */ Remove unnecessary comment. > + if (output_dwo) free(output_dwo); output_dwo = NULL; > ... > + if (cached_dwo) free(cached_dwo); cached_dwo = NULL; Remove superfluous "if" check.
(In reply to Joel Rosdahl from comment #43) Thank you for the comment. Regarding the dependency generation and what it has to do with -gsplit-dwarf support, I need to investigate myself. It has been quite a while since I wrote the patch in a piecemeal fashion by trials and errors. So I have to refresh my memory somehow. Regarding the test a la test.sh, I have not looked inside test.sh very carefully yet although the thought occurred to me before. But it may have to wait a little. (I wonder if profiling support has its own test suite by now.) As for the individual comments for the patch, I will address them. I notice a couple of things: - Is free(NULL) a no-op on all the supported platforms? Maybe the C standard has mandated it is OK today, but 15 years ago, the assumption was not quite portable. - tmp_dwo can be null when an error path is taken if I recall correctly. I think that is why |if (tmp_dwo)| is done before free (tmp_dwo) or in some places. Anyway, I will report back with the new patch which addresses your comment. TIA
Created attachment 11014 [details] Part 7 (updated gsplit-dwarf support) Take 2: tried to address initial comments Here is the new patch that incorporate the suggestions from the initial comment. I am afraid that my tab setting is not quite in line with yours (still struggling with whitespace issues although now TAB is shown as visible symbol in my emacs buffer.) Here are responses to your earlier comments. >> Note: cc_process_args ought to have generated/set output_dwo. But we may not >> want to do so before generating dependencies (or set generating_dependencies) >> because it has been the implicit ordering that the dependency is generated >> first and then object file is generated. (CI failed to record the particular >> command line, etc. which necessitated him to insert the assert() statement >> below.) See a block comment that starts with "-gsplit-dwarf support: Ordering >> of dependency generation." in the earlier part of this file. > >I have read the above and the comment it refers to, but I still don't understand what dependency generation has to do with -gsplit-dwarf support. Could you please explaind the correlation between the two in more details? I tried to recall what led to this comment. After tinkering with dependency relations this time around, I came to realize that I thought this had something to do with a compilation command that is meant to generate ONLY dependency relation although -gsplit-dwarf is specified (using -M). It has nothing to do with "Ordering". I think the comment was not updated properly. I mean, with a command a la gcc -M -gsplit-dwarf ... -o dependency-file.d we may generate only the dependency and not the object file (and .dwo file). But I think this was based on my misunderstanding somewhere. It turns out that ccache does NOT support "-M" option at all.: ccache /usr/bin/gcc-4.8 -gsplit-dwarf -DHAVE_CONFIG_H -DSYSCONFDIR=/usr/local/etc -I. -I. -M -g -O2 -Wall -W -Werror -o .deps/ccache.c.d ccache.c This leaves a log in logfile: [2015-05-04T06:01:54.935840 18731] Working directory: /home/ishikawa/repos/ccache [2015-05-04T06:01:54.935870 18731] using_split_dwarf is set to true due to -gsplit-dwarf. [2015-05-04T06:01:54.935910 18731] Compiler option -M is unsupported [2015-05-04T06:01:54.935935 18731] Failed; falling back to running the real compiler [2015-05-04T06:01:54.935955 18731] Executing /usr/bin/gcc-4.8 -gsplit-dwarf -DHAVE_CONFIG_H -DSYSCONFDIR=/usr/local/etc -I. -I. -M -g -O2 -Wall -W -Werror -o .deps/ccache.c.d ccache.c [2015-05-04T06:01:54.936036 18731] Acquired lock /FF-NEW/cache-dir/8/stats.lock [2015-05-04T06:01:54.936349 18731] Releasing lock /FF-NEW/cache-dir/8/stats.lock [2015-05-04T06:01:54.936387 18731] Unlink /FF-NEW/cache-dir/8/stats.lock [2015-05-04T06:01:54.936433 18731] Result: unsupported compiler option Originally, I *think* I wanted to make sure that ccache behaves sanely when only the dependency file was generated but no .dwo file is generated even though -gsplit-dwarf is given on the command line. (But the comment was not quite updated to reflect that.) But given that -M is not handled, maybe my concern was moot. Hmm... Maybe I should just leave a comment as follows. /* If and when "-M" option is supported by ccache, this means only * the dependency file is generated (presumably by specifying the * destination using -o destinationa_filepath), -gsplit-dwarf, -gp * handling and even ordinary handling of object file must be * modified. */ and so I did with a sample command line.. Your comment is appreciated. >Have you considered writing some test cases for -gsplit-dwarf support (see test.sh)? Do you think that it would be feasible? > Like I said, testing in test.sh has to wait for a while until I figure out the complete framework. I may add a simple case first and then enhance it later. I hope this is acceptable. Right now |make test| passes and so it should be relatively safe :-) And I am using it for my local mozilla thunderbird build with good effect. >Now some comments on the patch itself: > >> +/* -gsplit-dwarf support: Split dwarf information (GCC 4.8 and up). Contains pathname if not NULL. */ > >This line (and other lines) are too long, see HACKING.txt. I tried to set the right margin to 78 and re-indent this and other longish comment lines. But then later on, I realize that maybe your tab setting might make long lines (in my editor buffer) that goes way over 80 char column position not exceeding the 80 char column because the indentation is used by tabs? Please let me know which lines are still too long for your viewing in your editor. (In the end, you may need tweak the whitespace like you did for other patches. Sorry about this.) >> +bool split_dwarf_p = false; > >I happen to think I know what "_p" means, but I doubt most developers do. :-) I suggest renaming it to something like "using_split_dwarf". Fixed. As you might have guessed, "_p" or "p" at the end is very customary in Lisp :-) > >> +/* Note: we do not need to put .dwo file in MANIFEST file. */ > >Remove unnecessary comment. > Done. >> + cc_log("stat on manifest (manifest_path) %s failed (%s), so size statistics is incorrect.", >> + manifest_path, strerror(errno)); > >Please fix alignment of arguments (align with spaces). I hope I have done it right. But again, I am not entirely sure if we are looking at the same layout of lines in our editor buffer as far as whitespaces are concerned :-( > >> + /* -gsplit-dwarf support: creating .dwo file name >> + * >> + * Question: What is the reliable way of creating the .dwo file name >> + * in the case of using |-o output_obj| to the compiler ? >> + * Answer: Heuristics based on one run of GCC 4.8 is placed below, and >> + * it works with gcc 4.9 as well. >> + * >> + * BTW, no matter whether -gsplit-warning is placed on the command >> + * line, or if so, whether it has been seen or not, we need to set >> + * tmp_dwo in this part of the code to reflect the value of the >> + * current output_obj to tmp_dwo. >> + * If -gsplit-dwarf is specified on the command line eventually, this >> + * tmp_dwo will be used. >> + */ >> + >> + // Creation of .dwo file name. >> + // In the extreme simple case, >> + // Try replacing .o with .dwo. >> + // >> + // Fact: gcc-4.8 produced the following .dwo file for the command >> + // below: >> + // t-test.o.tmp.dwo >> + // >> + // /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -o t-test.o.tmp.gazonk >> + // >> + // So here is the heuristics (works with gcc 4.9 also). >> + // Strip anything after the final '.' and add "dwo" instead. > >I suggest replacing the above comments with this: > >/* GCC (at least 4.8 and 4.9) forms the .dwo file name by removing everything > * after (and including) the last "." from the object file name and then > * appending ".dwo". > */ > Done. >> + { /* code suggested by Joel. */ > >Thanks for mentioning, but I think the comment should be removed. ;-) > Done. (In one other place as well.) >> + /* -gsplit-dwarf support requires the removal of a few more files.*/ > >Remove unnecessary comment. Done. > >> + /* gsplit-support modification */ > >Remove unnecessary comment (twice). Done. > >> + /* >> + * OBSERVATION: processing stderr does not have to be changed in the >> + * presence of "-gsplit-dwarf". >> + */ > >Remove unnecessary comment. Done. > >> + if (tmp_dwo) { >> + tmp_unlink(tmp_dwo); >> + } > >tmp_dwo will never be NULL here, right? So the check can be removed. Well, I have to see if this is the case... Well, I should have known. This seems to be the case. |make test| runs without a hitch if I remove the null-check. It seems |cached_tmp| may be null sometimes, and I got confused. Done. >> + /* -gsplit-dwarf support: >> + * The code immediately above created object file (output_obj). >> + * We should also repeat the similar procedurek for .dwo file >> + * when "-gsplit-dwarf is given. >> + * Because of the order in which tmp_dwo is set when command line is >> + * scanned, we can have non-null tmp_dwo even when split_swarf_p is >> + * false. Check only when split_dwarf_p is true. >> + */ > >Remove unnecessary comment. > Done >> + if (tmp_dwo && split_dwarf_p) { > >tmp_dwo can never be NULL here, so remove that condition. > Right, now it reads |if (using_split_dwarf) }| >> + /* Description of processing below.: >> + * Perform the copy of .dwo file if --gsplit-dwarf is used. >> + */ >> + >> + /* stats_update(STATS_TOCACHE) is now called AFTER the processing of >> + * .dwo file below. >> + */ >> + >> + /* >> + * Caution: if we output to real object file first, then >> + * output_obj *IS* the real object, and so we should not remove it! >> + * Also, in this case, we should NOT copy back the cached file to the >> + * final destination object. This is a waste of time. >> + * Same can be said of .dwo file: real and cached one below. >> + * >> + * Somehow dependency file was not copied this way. >> + */ >> + >> + /* -gsplit-dwarf support: >> + * if -gsplit-dwarf is given on the compiler command line, >> + * .dwo file ought to be put into cache as well. So let's do it here. >> + */ > >Remove unnecessary comments. > Done. >> + if (split_dwarf_p) >> + { > >Put brace on previous line, see HACKING.txt. Oops. Sorry, ccache follows BSD-style, whereas my default editing mode in Emacs seems to follow GNU coding style. When I created a new local repository to check in the latest git repository, I forgot to put in .dirs-local file with the following content: ---- quote --- ;;; Directory Local Variables ;;; See Info node `(emacs) Directory Variables' for more information. ((c-mode (show-trailing-whitespace . t) (c-file-style . "BSD"))) --- end quote Now it is done, I should be less free of this style issue. >> + /* Caution: if we output to real object file first, then >> + * tmp_dwo *IS* the real .dwo file, and so we should not >> + * remove it! The original code is left as a comment line >> + * below to stress this important point. >> + */ >> + /* tmp_unlink(tmp_dwo); */ > >Remove. > Done. >> + if (tmp_dwo) >> + free(tmp_dwo); > >No need to check for non-NULL here. (free(NULL) is a no-operation.) Right!. I checked and found the latest C specifiation calls for free(NULL) shall be a no-op. Long time ago, I was bitten by a feature of a library for an embdded device when free(NULL) caused an implementation-defined or undefined behavior according to the then current C specification, and have been cautiois about this. But we are on a modern C worksation/PC/server with up-to-date library, and so free(NULL) is a no-op. Done. >> + /* TODO/NOTE: Sanity Check. >> + * If split_dwarf_p is NOT true and we get non-null >> + * get_path_in_cache(object_name, ".dwo"), we probably should purge >> + * that file! Unlikely to happen, but non-zero probability. >> + * But this is not so critical. We will purge the file eventually by >> + * LRU algorithm. >> + * So just leave a log message. >> + */ >> + if (!split_dwarf_p) { >> + char *cp; >> + cp = get_path_in_cache(object_name, ".dwo"); >> + if (!cp) { >> + cc_log("We are not using split dwarf but, we have .dwo. Maybe we should purge it.:%s", cp); >> + } >> + } > >Looks like you have misunderstood what get_path_in_cache does: it only formats a path string; it doesn't check whether there is a file in the cache. In other words, cp will never be NULL. I suggest to just remove the comment and the code. > Done. I am afraid I certainly did misunderstand. >> + /* OBSERVATION: processing for profile_use is a good example >> + * which we can mimic to implement the support for >> + * --gsplit-dwarf. >> + * So the support for -gsplit-dwarf was modeled after the >> + * processing to support profile. >> + */ > >Remove unnecessary comment. Done. > >> + * For -gsplit-dwarf support, check for .dwo as well. See below. >> + * cached_obj contains the path for cached object file. > >Remove unnecessary comment. Done. > >> + * Notice the logic. We only check when output_dia is not NULL. > >Remove unnecessary comment. Done. > >> + * We should repeat the above check for cached_dwo. > >Remove unnecessary comment. Done. > >> + /* we get file(s) from cache unless we generate object in direct mode */ > >I don't understand this comment. What do you mean by generating >> object in direct mode? Please recall that at one point in time ccache generated the output to tempory files and then copy them to final destinations. OTOH, there was a flag to force the generation to final destinations. I am referring to that mode as direct mode, but now that ccache generates the files in destination files first, I took the comment out. > >> + /* sanity check for -gsplit-dwarf support */ >> + assert( (mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE)); > >The FROMCACHE_COMPILED_MODE state was removed in 38301c0dad27523f59d874759595944558ba022d, so the assert can be removed. > Done. >> + /* We repeat the same procedure as above for .dwo file if split_dwarf_p */ > >Remove unnecessary comment. > Done. > >> + /* >> + * we don't need to get file from cache when we called the >> + * compiler in this run of ccache since the output_dwo would >> + * have been created by the compiler. >> + */ >> + /* internal sanity check: the following means compiler >> + * was not called. We are in from_cache() remember?_*/ >> + assert(mode == FROMCACHE_CPP_MODE || mode == FROMCACHE_DIRECT_MODE); > Done. >> + /* >> + * Note: if something failed internal to get_file_from_cache >> + * we may want to remove a flurry of files. >> + * There were calls to x_unlinks in the original code >> + * circa 2014 summer. >> + */ > >Remove unnecessary comment. Done. > >> + /* NOTE: The error processing above assumed we only do a copy/hardlink >> + * for cached_obj. We may need to repeat the copy and error >> + * processing for cached_dwo in a similar manner instead of doing the >> + * copy/link only. >> + * >> + * But do note that the unlinking of related files >> + * MUST BE REPEATED in the both cases of the failures of copy/link >> + * of cached_obj -> output_obj >> + * and cached_dwo -> output_dwo if (split_dwarf_p). >> + */ > >Remove unnecessary comment. > Done. >> + /* >> + * Note: if something failed internal to get_file_from_cache >> + * we may want to remove a flurry of files. >> + * There were calls to x_unlinks in the original code circa 2014 summer. >> + */ > >Remove unnecessary comment. > Done >> + /* OBSERVATION: The above seems to be the best place where we >> + * can copy object file and dwo file safely (-gsplit-dwarf support). >> + */ > >Remove unnecessary comment. > Done. >> + /* -gsplit-dwarf support */ > >Remove unnecessary comment. > Done >> + /* Note: we do not need to put .dwo file in MANIFEST file. */ > >Remove unnecessary comment. > Done. >> + /* -gsplit-dwarf support: internal sanity check */ > >Remove unnecessary comment. > DONE >> + /* OBSERVATION: We need to create output_dwo if >> + -gsplit-dwarf is given. But we should not do here >> + YET since --gsplit-dwarf may be given later on the >> + command line !*/ > >Remove unnecessary comment. > Done >> + /* -gsplit-dwarf support >> + * This place is chosen as the place to handle -gsplit-dwarf. >> + */ > >Remove unnecessary comment. > Done. >> + /* we should add this option to the compiler option >> + when the real compiler is invoked. */ > >Remove unnecessary comment. Done. > >> + /* found_S_opt needs to be handled with caution with split_dwarf_p */ > >Remove unnecessary comment. > >> + * So split_dwarf_p is set to false. > >Remove unnecessary comment. Done. > >> + * Note: Just a tidbit. When -S is given on the command line, >> + * the cached assember output file in the cache spool actually >> + * has .o suffix. Interesting, isn't it? >> + */ > >Well, .o stands for "whatever output the compiler produces", not "object file" per se. :-) Remove the comment. > Yeah, right. Done. >> + /* Naive question: Is there an angle in which split_dwarf_p processing >> + * has anything to do with "-" output that is handled specialy below? >> + * For example, what do we want to do with .dwo file (produce it >> + * somewhere, but what path name to use)? >> + * => >> + * As it turns out, GCC does not work with output with "-" and >> + * since -gsplit-dwarf is only for GCC at this moment (March, 2015), >> + * we don't have to worry about this. >> + */ > >Remove unnecessary comment. Done. > >> + /* -gsplit-dwarf support: >> + * We have probably already seen and check for >> + * -gsplit-dwarf by the time we reach here. (We did it somewhere above.) >> + * Here is the chance to create output_dwo based on the value of output_obj. >> + * The basic outline would be >> + * output_obj = basename(input_file) >> + * add ".dwo" to it with minor tweaks to handle special cases. >> + */ >> + >> + /* >> + * Hueristics is based on the following observation. >> + * Special case 1: What do we do if we need to output .s file >> + * for output and -gsplit-dwarf is specified? >> + * What file(s) does GCC 4.8 produce exactly? >> + * Answer: >> + * /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -S >> + * generated >> + * t-test.s >> + * that contains assembler directives to generate data (dwarf data). >> + * This means that if we are only producing assembler source >> + * we do not need to handle ".dwo" file at all. >> + * >> + * /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.c -S -o t-test.s.gazonk >> + * produced >> + * t-test.s.gazonk >> + * >> + * DONE: >> + * The above special cases need to be handled. >> + * The modification below more or less assumed that .dwo file is produced >> + * always (together with the paired .o file) unless "-S" is specified. >> + * >> + * File-scope global variable |found_S_opt| is set to true >> + * when "-S" is specified. >> + * >> + * Special case 2: /usr/bin/gcc-4.8 -gsplit-dwarf -c t-test.s >> + * produced >> + * t-test.o >> + * t-test.o.dwo. >> + * >> + * So let us disable split_dwarf_p to false if >> + * found_S_opt is true. >> + * This is done at the end of a loop in argument processing. >> + */ > >Remove unnecessary comments. > Done. >> + /* -gsplit-dwarf support: create output_dwo */ > >Remove unnecessary comment. Done. > >> + if (split_dwarf_p && !found_S_opt) { > >Since you set split_dwarf_p to false if found_S_opt is true, you can remove the "&& !found_S_opt" part. > You are right. Done. >> + { /* code suggested by Joel. */ > >Remove unnecessary comment. > Done. >> + /* Long-term question: Do we need to cope with strange output_obj file >> + * name in case split_dwarf_p? "-gsplit-dwarf" support may encounter >> + * corner cases when strange file names are given. >> + * Answer: Don't worry too much. GCC itself may not run with strange >> + * path name in the first placbe. We may find the answer in the long >> + * run. >> + */ > >Remove unnecessary comment. > Done. >> + if (output_dwo) free(output_dwo); output_dwo = NULL; >> ... >> + if (cached_dwo) free(cached_dwo); cached_dwo = NULL; > >Remove superfluous "if" check. Done. Please check if the new patch makes sense. TIA
Comment on attachment 11014 [details] Part 7 (updated gsplit-dwarf support) Take 2: tried to address initial comments I thought I removed all the unnecessary comments, but I may have overlooked some by mistake. Please let me know if you think the comment about dependency generation is appropriate and other outstanding issues. I will remove the unnecessary comments after hearing the new comments. TIA
Hi again, Thanks for the patch update. Almost there! :-) > /* > - * Name (represented as a struct file_hash) of the file containing the cached > - * object code. > + * Name (represented as a struct file_hash) of the file containing the > + * cached object code. > */ Why this change? > + /* Here cached_obj contains a file name for object */ Isn't cached_obj's documentation enough? > + { > + char *base_name = remove_extension(output_obj); > + tmp_dwo = format("%s.dwo", base_name); > + free(base_name); > + } > + cc_log("Setting tmp_dwo to %s", tmp_dwo); I suggest doing this only when using_split_dwarf is true. > + if (using_split_dwarf) { > + assert(tmp_dwo); > + assert(cached_dwo); > + put_file_in_cache(tmp_dwo, cached_dwo); > + > + /* internal sanity check */ > + if (!generating_dependencies) > + assert(tmp_dwo && cached_dwo); Does this assert check something that the two asserts a couple of lines up don't check? > /* Check if the diagnostic file is there. */ > - if (output_dia && stat(cached_dia, &st) != 0) { > + if (output_dia && x_stat(cached_dia, &st) != 0) { > cc_log("Diagnostic file %s not in cache", cached_dia); This makes a missing cached_dia file produce two log messages instead of one. I think that one is enough. > + if (output_dwo) { > + if (x_stat(cached_dwo, &st) != 0) { > + cc_log("Split dwarf file %s not in cache", cached_dwo); This makes a missing cached_dwo file produce two log messages instead of one. I think that one is enough. > cc_log("Succeeded getting cached result"); > stats_update(STATS_CACHEHIT_CPP); > break; > + default: > + assert(0); The compiler warns about missing cases for enums, but it won't do it if there is a "default" branch in the switch, so I'd say that the above change actually makes the code less safe. Better remove it. > + if (str_eq(argv[i], "-gsplit-dwarf")) { > + cc_log("using_split_dwarf is set to true due to -gsplit-dwarf."); Since other log messages are written without referring to internal variable names, I suggest rephrasing to something like "Enabling caching of dwarf files since -gsplit-dwarf is used". > + if (found_S_opt /* || generating_dependencies*/ ) { What about the commented away code, should it be deleted or uncommented? > + cc_log("using_split_dwarf is reset to false due to found_S_opt"); Please rephrase without referring to variable names.
(In reply to Joel Rosdahl from comment #47) Hi, Thank you for your feedback. I will study your comment and come up with a patch hopefully within this week. TIA CI
Created attachment 11197 [details] Part 7 (updated gsplit-dwarf support) Take 3: tried to address follow-up comments Attached is the updated patch to reflect the latest comment. >Thanks for the patch update. Almost there! :-) Last mile is always hard, isn't it? >> /* >> - * Name (represented as a struct file_hash) of the file containing the cached >> - * object code. >> + * Name (represented as a struct file_hash) of the file containing the >> + * cached object code. >> */ >Why this change? I am afraid I must have reformatted this comment block with my somewhat short fill-column setting. Reverted to the original line break. >> + /* Here cached_obj contains a file name for object */ >Isn't cached_obj's documentation enough? cached_obj's documentation states the purpose of the variable. But we don't know where it is set, when it becomes invalid, etc. What I tried to say here (well, English is not my mother tongue) was /* When control reaches here, cached_obj contains a file name for object */ and so I rephrased the comment. I agree if you feel this unnecessary, we can remove it. Just let me know in your next comment. >> + { >> + char *base_name = remove_extension(output_obj); >> + tmp_dwo = format("%s.dwo", base_name); >> + free(base_name); >> + } >> + cc_log("Setting tmp_dwo to %s", tmp_dwo); > I suggest doing this only when using_split_dwarf is true. DONE as suggested: I need to check if leaving tmp_dwo to NULL does not result in error. I was forced to set an initial value of NULL to tmp_dwo since GCC complained about uninitialized value usage. --- begin compiler log --- ccache.c: In function 'ccache': ccache.c:1059:3: error: 'tmp_dwo' may be used uninitialized in this function [-Werror=maybe-uninitialized] put_file_in_cache(tmp_dwo, cached_dwo); ^ ccache.c:825:8: note: 'tmp_dwo' was declared here char *tmp_dwo; /* for supporting -gsplit-dwarf */ ^ --- end compiler log --- Checked by |make test| and real compilation steps of mozilla thunderbird. >> + if (using_split_dwarf) { >> + assert(tmp_dwo); >> + assert(cached_dwo); >> + put_file_in_cache(tmp_dwo, cached_dwo); >> + >> + /* internal sanity check */ >> + if (!generating_dependencies) >> + assert(tmp_dwo && cached_dwo); >Does this assert check something that the two asserts a couple of lines up don't check? Removed: Good catch. I think the particular if() and two assert() statements were left from previous rewriting efforts. They are useless now, and so removed. >> /* Check if the diagnostic file is there. */ >> - if (output_dia && stat(cached_dia, &st) != 0) { >> + if (output_dia && x_stat(cached_dia, &st) != 0) { >> cc_log("Diagnostic file %s not in cache", cached_dia); >This makes a missing cached_dia file produce two log messages instead > of one. I think that one is enough. REVERTED to stat(): I did not realize x_stat() produced a log of its own. (I thought x_stat() had something to do with remote file system as x_unlink() does. I was wrong.) >> + if (output_dwo) { >> + if (x_stat(cached_dwo, &st) != 0) { >> + cc_log("Split dwarf file %s not in cache", cached_dwo); >This makes a missing cached_dwo file produce two log messages instead > of one. I think that one is enough. DONE: Using stat() instead of x_stat(). There is a place I noticed x_stat() is used and this was reverted to stat(): maybe I did this in my previous patch. NEW mods: - if (x_stat(output_obj, &st) != 0) { + if (stat(output_obj, &st) != 0) { cc_log("Compiler didn't produce an object file"); stats_update(STATS_NOOUTPUT); failed(); >> cc_log("Succeeded getting cached result"); >> stats_update(STATS_CACHEHIT_CPP); >> break; >> + default: >> + assert(0); > The compiler warns about missing cases for enums, but it won't do it > if there is a "default" branch in the switch, so I'd say that the > above change actually makes the code less safe. Better remove it. REMOVED. >> + if (str_eq(argv[i], "-gsplit-dwarf")) { >> + cc_log("using_split_dwarf is set to true due to -gsplit-dwarf."); > Since other log messages are written without referring to internal > variable names, I suggest rephrasing to something like "Enabling > caching of dwarf files since -gsplit-dwarf is used". Changed as suggested. >> + if (found_S_opt /* || generating_dependencies*/ ) { > > What about the commented away code, should it be deleted or uncommented? DELETED: When I started hacking on the code, I was not quite sure how to handle dependencies with -gsplit-dwarf. So I did not delete the commented out code right away to see if I needed to resurrect it any time soon. But then later on, I realized support of split-dwarf and dependency generation is orthogonal and they do not quite interfere with each other. So let us delete it. >> + cc_log("using_split_dwarf is reset to false due to found_S_opt"); > >Please rephrase without referring to variable names. Change as suggested: New message is "Disabling caching of dwarf files since -S is used". Hope this helps. TIA
Merged in 35b5c0fbae74189c18c80477bd074bb5c6e8071c, thanks! > REVERTED to stat(): I did not realize x_stat() produced a log of its own. > (I thought x_stat() had something to do with remote file system as > x_unlink() does. I was wrong.) Yes, I agree that x_stat is not a very good name. I'll think about renaming it.
(In reply to Joel Rosdahl from comment #50) > Merged in 35b5c0fbae74189c18c80477bd074bb5c6e8071c, thanks! You are very welcome and I will use the git head version to see if anything went awry during the merge process. Thank YOU! >> REVERTED to stat(): I did not realize x_stat() produced a log of its own. >> (I thought x_stat() had something to do with remote file system as >> x_unlink() does. I was wrong.) > > Yes, I agree that x_stat is not a very good name. I'll think about renaming it. x_stat() does something extra, i.e., stat() and if stat() fails, complain with a log message. So stat_and_complain() or stat_and_log() or log_stat(), etc. Anyway, something different would be less confusing. Thank you again for maintaining and making the great software available.
Included in 3.2.3.