Archive for April, 2007

controlling symbol ordering

Tuesday, April 24th, 2007

So as an experiment I wanted to be able to control the order of the symbol addresses in the shared libraries we create in OpenOffice.org, e.g. libsw680li.so, the major component of writer (for i386 linux), to see if placing the methods used during a standard startup together made any difference to startup performance.

Firstly I need to know what those methods are:

gcc provides -finstrument-functions, which instruments the code to call __cyg_profile_func_enter on entry to each function and __cyg_profile_func_exit on exit, passing them the address of the function in question. In this case I made a little shared library for use with LD_PRELOAD which collects what was called, uses libbfd to get the names, and outputs them in order of execution. There are various more sophisticated optimal orderings mooted elsewhere, but let's keep it simple and output in call order.
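As a rough illustration of the hook side, here is a minimal sketch of the two entry points that -finstrument-functions calls. It is not the tool described above: the `seen` array, `seen_count` and `MAX_SEEN` are made-up names, there is no libbfd name lookup, and it assumes a single thread.

```c
#include <stddef.h>

#define MAX_SEEN 65536             /* arbitrary cap for this sketch */

static void *seen[MAX_SEEN];       /* function addresses in first-call order */
static size_t seen_count;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    /* Record each function once, the first time it is entered.
     * A linear scan keeps the sketch short; a hash set would be faster. */
    for (size_t i = 0; i < seen_count; i++)
        if (seen[i] == this_fn)
            return;
    if (seen_count < MAX_SEEN)
        seen[seen_count++] = this_fn;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    /* Nothing to do on exit when only call order is wanted. */
    (void)this_fn;
    (void)call_site;
}
```

Built into a shared library and loaded with LD_PRELOAD, something like this would leave the call order in `seen`, ready to be mapped to names and written out at exit.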

Secondly I need to be able to somehow control their ordering in the final shared object:

gcc provides -ffunction-sections to stick each function in a section of its own which, from some random googling, apparently makes it possible to create a custom linker script that specifies the order of the functions. Firefox has some simple tooling which takes the default linker script from the output of ld --verbose and munges the desired ordering of the sections into it, so as to generate a custom linker script which can be passed to ld with -T/--script.
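To make the section ordering concrete, here is a small sketch that turns an ordered list of symbols into linker script input-section patterns. It assumes the usual -ffunction-sections naming of `.text.` followed by the (mangled) symbol name; the function names `format_section_entry` and `emit_ordered_entries` are invented for this example and are not part of the Firefox tooling.

```c
#include <stdio.h>

/* Write the input-section pattern for one function into buf.
 * Assumes -ffunction-sections naming: ".text." + the (mangled) symbol.
 * Returns what snprintf returns: the length the full line needs. */
int format_section_entry(const char *symbol, char *buf, size_t size)
{
    return snprintf(buf, size, "    *(.text.%s)", symbol);
}

/* Emit one pattern per symbol, in the order given. The output is meant to
 * be spliced into the .text output section of the default script, ahead of
 * the generic *(.text ...) pattern, so these sections are placed first. */
void emit_ordered_entries(FILE *out, const char *const *symbols, size_t count)
{
    char line[1024];
    for (size_t i = 0; i < count; i++) {
        int n = format_section_entry(symbols[i], line, sizeof line);
        if (n > 0 && (size_t)n < sizeof line)   /* skip truncated names */
            fprintf(out, "%s\n", line);
    }
}
```

Feeding it the call-order list from the instrumented run gives the lines to merge into the script from ld --verbose.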

So, taking the basic instrumenter and its output for a startup of writer built with -finstrument-functions, and munging that output through the mozilla mklinkscript to give ld --script on a recompile with -ffunction-sections, does it achieve anything?

The results indicate a 0.12 second warm start improvement, which appears quite promising.

instrumenting and ldscript tools

backtraces and prelink

Monday, April 16th, 2007

So, for something like openoffice.org the debuginfo is about 400 MB in size, which is rather large to ask someone to download. Anyway, crashes happen when the user isn’t prepared, so they never have a debugging OOo to hand, and if they do install the debuginfo afterwards they won’t be able to reproduce the problem. So it’s useful for apps like this to print out their callstack on crash, so that it can be submitted to developers, who can map that stack back to a debugging version of the deployed OOo using offline debuginfo.

i.e. using glibc backtrace (documentation) and friends; here’s a basic Linux Journal article on the topic.

a) But shared libraries complicate this a little, in that the address you get from backtrace for something in a shared library is where the code ended up when the lib was loaded. So you really want that location relative to the base address of the library, so that you can use addr2line -e that_shared_library (abs_address - base_address) to get the position within the offline .so. Using dladdr on the result from backtrace will give the name of the library it belongs to in dli_fname and the base address in dli_fbase.
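A minimal sketch of that step, combining backtrace with dladdr to print each frame as library plus offset. `print_callstack` and `MAX_FRAMES` are names made up for this example, and fprintf is used for brevity even though it is not safe to call from a real crash signal handler.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for dladdr() and Dl_info */
#endif
#include <dlfcn.h>
#include <execinfo.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_FRAMES 64          /* arbitrary depth limit for this sketch */

/* Print "absolute address: library + offset" for each frame of the
 * current call stack. Returns the number of frames captured. */
int print_callstack(FILE *out)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    for (int i = 0; i < count; i++) {
        Dl_info info;
        if (dladdr(frames[i], &info) && info.dli_fname) {
            /* Offset of the frame relative to where the object was loaded. */
            uintptr_t offset = (uintptr_t)frames[i] - (uintptr_t)info.dli_fbase;
            fprintf(out, "%p: %s + 0x%lx\n", frames[i], info.dli_fname,
                    (unsigned long)offset);
        } else {
            fprintf(out, "%p: ??\n", frames[i]);
        }
    }
    return count;
}
```

On older glibc versions this needs linking with -ldl for dladdr.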

b) All ok so far, but there’s another problem. prelink ate your binaries. While you weren’t looking prelink has come along and optimized the .sos on your computer and added some extra stuff to them. So the callstack from your copy of OOo doesn’t match the offline debuginfo copy of OOo anymore.

e.g. the original not-prelinked line of an address in libsw680li
0x064987e9: /usr/lib/openoffice.org/program/libsw680li.so + 0x5557e9

and the equivalent line after it has been prelinked
0x494bf7e9: /usr/lib/openoffice.org/program/libsw680li.so + 0x58d7e9

0x5557e9 has moved to 0x58d7e9, an increase of 0x38000

So we need a way to fix this problem. Sometimes you can get a symbol name and the symbol’s address in dli_sname and dli_saddr from dladdr. If so, then you could print that info in the call stack. As a bodge you can get the location of that symbol in the offline original library and adjust all the prelinked values for that library accordingly from that reference. But you’re likely to get nothing back for those values in plenty of cases, leaving unmappable gaps in the callstack. A better way to know that the values for libsw680li.so have to be adjusted by 0x38000 to get the offline position is needed.

so, readelf on the original unprelinked library shows…
[20] .dynamic DYNAMIC 00865944 865944
and on the prelinked…
[20] .dynamic DYNAMIC 497cf944 89d944

The diff of 0x38000 in that section’s offset looks useful. Can we include that info in our callstack, so that offline we can compare it to our offline libs and make the adjustment? By using dl_iterate_phdr we can look for the PT_DYNAMIC segment of each loaded library and spit out its p_offset value in our callstack. i.e. now we get…
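For illustration, a minimal sketch of walking the loaded objects with dl_iterate_phdr and printing the p_offset of each PT_DYNAMIC program header. The callback name `print_dynamic_offset` and the wrapper `dump_dynamic_offsets` are invented here; a real crash reporter would look the value up per library and merge it into each callstack line rather than printing a separate list.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for dl_iterate_phdr() */
#endif
#include <link.h>
#include <stdio.h>

/* Callback: print the file offset of this object's PT_DYNAMIC header. */
static int print_dynamic_offset(struct dl_phdr_info *info, size_t size, void *data)
{
    int *found = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type == PT_DYNAMIC) {
            /* dlpi_name is an empty string for the main program. */
            printf("0x%08lx: %s\n", (unsigned long)ph->p_offset,
                   info->dlpi_name[0] ? info->dlpi_name : "(unnamed)");
            (*found)++;
        }
    }
    return 0;                  /* non-zero would stop the iteration */
}

/* Returns how many loaded objects had a PT_DYNAMIC program header. */
int dump_dynamic_offsets(void)
{
    int found = 0;
    dl_iterate_phdr(print_dynamic_offset, &found);
    return found;
}
```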

0x064987e9: 0x00865944: /usr/lib/openoffice.org/program/libsw680li.so + 0x5557e9 for a nonprelinked
0x494bf7e9: 0x0089d944: /usr/lib/openoffice.org/program/libsw680li.so + 0x58d7e9 for prelinked

So if we get a prelinked crash, we can compare the dynamic section offset included for a given lib with that of the non-prelinked offline copy, subtract the difference from the offsets on our callstack to restore them to the non-prelinked case, and then use addr2line on those values to recover the original function names and positions in the source. ta-da.
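The offline adjustment is just a subtraction. A tiny sketch, using a made-up helper name `unprelinked_offset` and assuming, as in the libsw680li.so example, that the code in question moved by the same amount as the dynamic section:

```c
#include <stdint.h>

/* Map a library-relative offset from a crash report back to the offline,
 * non-prelinked copy of the library.
 *   crash_offset            offset printed in the callstack
 *   crash_dynamic_offset    PT_DYNAMIC p_offset printed in the callstack
 *   original_dynamic_offset .dynamic offset of the offline library */
uint64_t unprelinked_offset(uint64_t crash_offset,
                            uint64_t crash_dynamic_offset,
                            uint64_t original_dynamic_offset)
{
    /* How far prelink pushed things along; zero if not prelinked. */
    uint64_t shift = crash_dynamic_offset - original_dynamic_offset;
    return crash_offset - shift;
}
```

With the values above, `unprelinked_offset(0x58d7e9, 0x0089d944, 0x00865944)` gives 0x5557e9, which is the value to hand to addr2line -e with the offline library.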

source code