11th July, 2011

unused code, libreoffice style

The callcatcher-derived lists of unused code in LibreOffice are back. I tweaked callcatcher to understand the additional gcc command-line options used by the new gbuild module, so it can be dropped in as a gcc replacement in that environment.

There’s now a findunusedcode target in the top-level Makefile and a cached list of easy-to-remove methods in the tree as unusedcode.easy. These are non-virtual C++ methods which are neither called directly nor have their address taken by any code in a stock debug-level Linux build.

What distinguishes unusedcode.easy from not-easy is simply that the easy list is restricted to C++ name-mangled class-level symbols, and so omits any non-mangled C-style symbols which might be looked up via dlsym from some hard-to-find entry point.
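As a rough illustration of that split, here is a minimal sketch. It assumes the raw symbols follow the Itanium C++ ABI that gcc uses on Linux, where mangled names begin with `_Z`. The function names and the two sample symbols are made up for the example, and it does not try to narrow the result down to class-level symbols, so it is not how callcatcher or the Makefile target actually does it.

```python
from typing import Iterable, List, Tuple


def is_cxx_mangled(symbol: str) -> bool:
    """True if the symbol looks like an Itanium-ABI mangled C++ name."""
    # Assumption: gcc on Linux, where mangled C++ names start with "_Z".
    return symbol.startswith("_Z")


def split_easy(symbols: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split symbols into (easy, not_easy) by whether they are mangled."""
    easy: List[str] = []
    not_easy: List[str] = []
    for sym in symbols:
        # Non-mangled C-style names might be dlsym targets, so keep them apart.
        (easy if is_cxx_mangled(sym) else not_easy).append(sym)
    return easy, not_easy


# Illustrative symbols only: one mangled method, one C-style entry point.
easy, not_easy = split_easy(["_ZN3Foo3barEv", "component_getFactory"])
```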

At a count of 5176 easy unused methods there’s enough there to be getting on with for the moment, and I can revisit the C-style symbols with a whitelist of known dlsym names once those are done.

Posted at 1:34 pm | Comments Off

11th July, 2011

regression testing libreoffice filters

For regression testing LibreOffice filters I’ve now arranged things so that each import filter’s cppunit test comprises three data dirs: a pass dir, a fail dir and an indeterminate dir. Files in pass must parse without error. Those in fail are expected to fail, but to fail gracefully by returning an error or throwing an exception, i.e. a crash is a fail on a “fail” test, while “can’t parse” is the expected pass state.

The pass/fail dirs are typically pre-filled in the tree with a small sample of tricky documents which get tested at every build time to ensure they remain working.

indeterminate dirs, on the other hand, are expected to be empty in the tree, and the cppunit tests don’t care if their contents can be parsed or not, only that they don’t crash. This is really convenient for searching for crashes in a large document collection (hoard), given that it’s an order of magnitude faster than using the full application to load and lay out the results.
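The three-directory rule can be sketched in a few lines. This is only an illustration of the pass/fail/indeterminate logic, not the actual cppunit code: `load` is a hypothetical stand-in for an import filter that returns true on success and either returns false or raises on a graceful failure. A real crash would take down the whole test process, which is what makes it a failure in every directory.

```python
import enum
from typing import Callable, Iterable, List


class Expect(enum.Enum):
    PASS = "pass"
    FAIL = "fail"
    INDETERMINATE = "indeterminate"


def outcome_ok(expect: Expect, parsed: bool) -> bool:
    """Does a non-crashing parse result satisfy the directory's expectation?"""
    if expect is Expect.PASS:
        return parsed          # files in pass must parse
    if expect is Expect.FAIL:
        return not parsed      # files in fail must be rejected gracefully
    return True                # indeterminate: either result is acceptable


def check_dir(files: Iterable[str], expect: Expect,
              load: Callable[[str], bool]) -> List[str]:
    """Return the files whose result violates the expectation."""
    bad = []
    for name in files:
        try:
            parsed = bool(load(name))
        except Exception:
            parsed = False     # an exception counts as a graceful failure
        if not outcome_ok(expect, parsed):
            bad.append(name)
    return bad
```

With this shape, an empty result from `check_dir` for a directory means every file in it behaved as expected.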

So I/we can just take a large document hoard of e.g. .doc files, throw them in sw/qa/core/data/ww8/indeterminate, run make -sr in sw, and sit back and wait to see if anything in there is a crasher at the parser level. For extra goodness, export VALGRIND=memcheck to run the whole lot under valgrind.
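For anyone scripting that, a minimal sketch of the staging step might look like the following. The helper names and the counter prefix on the copied files are assumptions made for the example. Only the target directory, `make -sr` and `VALGRIND=memcheck` come from the description above.

```python
import os
import shutil
from pathlib import Path
from typing import Dict


def stage_documents(hoard: Path, indeterminate: Path) -> int:
    """Copy every regular file under `hoard` into `indeterminate`."""
    indeterminate.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(hoard.rglob("*")):
        if src.is_file():
            # Prefix a counter so same-named attachments don't overwrite each other.
            shutil.copy(src, indeterminate / f"{count:05d}-{src.name}")
            count += 1
    return count


def make_environment(valgrind: bool = False) -> Dict[str, str]:
    """Environment for the make run, optionally asking for valgrind."""
    env = dict(os.environ)
    if valgrind:
        env["VALGRIND"] = "memcheck"
    return env
```

After staging, the run itself would be something like `subprocess.run(["make", "-sr"], cwd="sw", env=make_environment(valgrind=True))` from the top of the source tree.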

FWIW, today anyway

  1. All 3721 attachments of (alleged) mime-type application/msword in openoffice.org’s bugzilla pass without crash when placed into ww8/indeterminate. To be re-run under valgrind later.
  2. And all (ok, only 128) attachments of (alleged) mime-type application/msword in freedesktop.org’s bugzilla pass under VALGRIND=memcheck when placed into indeterminate.

I’ve got doc, rtf, qpro, wmf, emf, hwp, lwp and sxw organized and pre-seeded with a sample handful of files so far. There are plenty more filters than that of course, but .doc is my current focus as the richest vein of available had-bugs-reported documents.

Posted at 12:58 pm | Comments Off

4th July, 2011

Inkscape, textext and pixelated fonts

I use Inkscape a lot, particularly for drawing diagrams for my research. I use the TexText plugin for Inkscape to allow me to embed arbitrary LaTeX in documents. However, recently I’ve had an issue with it.

TexText uses pdf2svg to convert PDFs to SVGs to embed into an Inkscape project. Under the hood it runs pdflatex on the LaTeX fragment you provide, then it converts this to an SVG with a tight bounding box. This has been generating horribly pixelated fonts for me. The output of X+y=z is the following:
Garbled output from pdf2svg

Having looked into it there are two ways to solve the problem

I do wonder though, why LaTeX prefers type3 fonts over their type1 brethren.

Posted at 11:37 pm | Comments Off