Friday, 7 September 2018

macOS deployment: shipping a Java Runtime (JRE) with an Eclipse RCP product

This is not entirely straightforward. Follow these instructions to ship your RCP product with a JRE on macOS.


You need a working RCP product export with Maven Tycho to build a macOS product.

1. Install the Java Development Kit (JDK) on a macOS system

Download the installation package from Oracle and install it on a macOS target system. You need a JDK; a JRE is not sufficient. Don't worry about the size, since we will delete everything but the JRE later on.

2. Copy the JDK to your product

The current JDK at the time of writing (1.8.0_181) is installed into the following location:


Open a shell and copy the JDK to your development project:
user@mac: cp -R /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk /Users/peter/...

To uninstall the JDK, you just have to delete the JDK directory:
user@mac: rm -rf /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk

Place the content of the JDK directory in the following directory of your exported RCP product:

Open a shell and verify that the following command works properly:
user@mac: /path/to/
     Contents/Home/jre/bin/java -version

3. Specify VM in product configuration

Modify the file YourProduct.ini and add the following 2 lines:


The line break between -vm and the path is important!

Since the path is resolved relative to the directory the product is started from, ../Eclipse/jre leads to the JDK we provided.

You can integrate the -vm option into your product build via the product configuration editor: tab Launching => Launching Arguments => macosx => Program Arguments. Note that the line break is not needed there; it is added by the build process.

4. Launch the product to test whether everything works so far

Uninstall the JDK (rm -rf /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk) and launch your RCP product. If everything is correct, it will use the JDK you provided. Otherwise it will fail to launch.
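To confirm from inside the running product that the bundled runtime is really the one in use, the java.home system property can be inspected. The following is only a minimal sketch: the class name RuntimeCheck, the method isBundledRuntime and the idea of passing the product directory as an argument are hypothetical and not part of the setup described here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class RuntimeCheck {

    /** Returns true if the given Java home lies inside the given product directory. */
    static boolean isBundledRuntime(String javaHome, String productDir) {
        Path home = Paths.get(javaHome).toAbsolutePath().normalize();
        Path product = Paths.get(productDir).toAbsolutePath().normalize();
        return home.startsWith(product);
    }

    public static void main(String[] args) {
        // java.home points to the runtime the current JVM was started from
        String javaHome = System.getProperty("java.home");
        System.out.println("java.home = " + javaHome);
        if (args.length > 0) {
            // args[0]: path of the exported product (assumption for this sketch)
            System.out.println("bundled runtime: " + isBundledRuntime(javaHome, args[0]));
        }
    }
}
```

Logging this value once at product start-up makes it easy to spot a launch that silently fell back to a system-wide Java installation.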

5. Remove unneeded files from JDK

You can remove a lot of files and directories from the JDK folder. You just need:

Remove all other files and directories. Your product will not start anymore but don't worry.

6. Fix Info.plist

The Info.plist contains the option CFBundleExecutable. The setting points to a symbolic link. This link gets broken in the deployment process.

The solution is to simply change CFBundleExecutable:


7. Test and deploy

Launch the product to test that everything works as expected. Then integrate the deployment of the JRE into your Tycho build.

Saturday, 16 December 2017

Simplify text based editing of DocBook XML documentation

Writing and maintaining technical documentation is a really wide topic. Our product System Concept DMS needs documentation.

Here is what we need:

  1. PDF output with TOC, page-numbers etc.
  2. HTML help page to publish on the web
  3. Files for the Eclipse help system (programme online help)
All this should be generated out of one single source.

I started with DocBook about 1.5 years ago. As a programmer I have no difficulty editing XML files, but I realized that writing the documentation in plain DocBook XML is too much work. I also expect problems concerning image parameters: they are present in each DocBook file, so changes must be rolled out to all files.

So I left DocBook again and went back to LibreOffice to at least collect knowledge for later.

I have now reactivated the DocBook work. The idea is to simplify the editing process by pre-processing the source files.

Here is my current draft of a simplified input file:

 <section id="function_objekte_entfernen">  
 #height 10cm  
 <title>Objekte entfernen</title>  
 Mit der Aktion #b Objekte entfernen# können mit System Concept DMS aufgebrachte   
 #link Haftnotizen function_haftnotiz#, Markierungen und Schwärzungen aus   
 einem Dokument entfernt werden.  
 Sie können die Funktion per #b Rechtsklick-Objekte entfernen# direkt aus der   
 Kachelansicht oder aus einem geöffneten Dokument aufrufen.  
 Es wird der Dialog zum Entfernen von Objekten geöffnet. Markieren Sie den entsprechende Eintrag  
 in der Tabelle und Bestätigen Sie mit #b OK#. Das Objekte wird entfernt und die  
 Dokumentansicht aktualisiert.  
 #img img/web/function_objekte_entfernen_1.png 12cm 'Dialog zum Entfernen von Objekten'#  

As you can see, some DocBook XML elements are still present. Frequently used or complex elements are simplified to a #tag # syntax.

Two empty lines create a paragraph break (</para><para>).

With the #img src title# tag, the pre-processing centralizes the XML representation of images. So if a change is necessary, I just re-generate the XML files.

I chose the '#' character because Shift is not needed to type it. Writing the documentation text must be as easy as possible.

#b: emphasis
#img: mediaobject
#icon: inlinemediaobject
#l: itemizedlist
#li: listitem + para
#-: end sequence (e.g. for #l or #li)
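How such a pre-processor could expand a few of these tags can be sketched with regular expressions. This is a minimal sketch only, not the actual implementation: the class name DocBookPreprocessor and the method expand are hypothetical, and only #b, #link and #height are handled, in the form used in the draft input file.

```java
import java.util.regex.Pattern;

final class DocBookPreprocessor {

    // "#b some text#" -> <emphasis>some text</emphasis>
    private static final Pattern BOLD = Pattern.compile("#b ([^#]+)#");
    // "#link link text target_id#" -> <link linkend="target_id">link text</link>
    private static final Pattern LINK = Pattern.compile("#link ([^#]+) ([^#\\s]+)#");
    // "#height 10cm" -> <?dbfo-need height="10cm" ?>
    private static final Pattern HEIGHT = Pattern.compile("#height (\\S+)");

    /** Expands the simplified tags in the given source text to DocBook XML. */
    static String expand(String source) {
        String out = HEIGHT.matcher(source).replaceAll("<?dbfo-need height=\"$1\" ?>");
        out = BOLD.matcher(out).replaceAll("<emphasis>$1</emphasis>");
        out = LINK.matcher(out).replaceAll("<link linkend=\"$2\">$1</link>");
        return out;
    }
}
```

Because each tag maps to XML in exactly one place, changing the generated markup (for example for images) only requires changing the corresponding replacement and re-generating the files.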

The above example will produce the following DocBook XML section:

 <section id="function_objekte_entfernen">  
 <?dbfo-need height="10cm" ?>  
 <title>Objekte entfernen</title>  
 Mit der Aktion <emphasis>Objekte entfernen</emphasis> können mit System Concept DMS aufgebrachte   
 <link linkend="function_haftnotiz">Haftnotizen</link>, Markierungen und Schwärzungen aus   
 einem Dokument entfernt werden.  
 Sie können die Funktion per <emphasis>Rechtsklick-Objekte entfernen</emphasis> direkt aus der   
 Kachelansicht oder aus einem geöffneten Dokument aufrufen.  
 Es wird der Dialog zum Entfernen von Objekten geöffnet. Markieren Sie den entsprechende Eintrag  
 in der Tabelle und Bestätigen Sie mit <emphasis>OK</emphasis>. Das Objekte wird entfernt und die  
 Dokumentansicht aktualisiert.  
 <imageobject condition="print">  
 <imagedata fileref="img/web/function_objekte_entfernen_1.png" format="PNG" contentdepth="12cm" />  
 <phrase>Dialog zum Entfernen von Objekten</phrase>  
 <para>Dialog zum Entfernen von Objekten</para>  

This is much further away from "just writing" and almost twice as long.

The next point will be to divide the source into logical files (sections) and re-combine them for different purposes. This can be done via XML entities.

Tuesday, 30 May 2017

Ideas and problems with yellow pin notes on PDF documents

The idea is quite simple: pin these little yellow notes on a digital PDF document in an easy-to-use way. You can do it with Adobe Reader, but there are some problems:

  • It is not easy to use - e.g. you will have to use "save as" and cannot overwrite the existing document.
  • Adobe Reader is not available on every system
  • There is no way to integrate the Reader functions into the System Concept DMS product.

The System Concept DMS software has been able to place notes in an easy-to-use way for about 2 years. But until now there was no way to remove or edit the notes.

The SC DMS feature uses Apache PDFBox and draws a note in 3 steps:

  1. yellow box (addRect + fill)
  2. text (beginText + showText + endText)
  3. border (addRect + stroke)

The user interface provides a two-step wizard to enter the text and choose a position for the note.

PDF document with a nice yellow note pinned

Make it removable

There was customer feedback that it would be great if notes were at least removable. This is not a simple task, since a note consists of a number of drawing operations which are not connected in any way within the PDF.
I found a solution for that: I use PDF comments (lines beginning with '%') to identify content streams which contain removable objects like notes.

So far so good. It turned out that content streams are merged by a certain page function of PDFBox. This resulted in an empty page if the user removed a note.
The reason was that the note META comment was still in the page, but all content had been put into one single stream.

Use annotations

I tried to rewrite the note feature and make use of PDF annotations. Doing some reverse engineering I found out that Adobe Reader produces annotations.

Apache PDFBox is able to manage annotations, too:

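// Assumes Apache PDFBox 2.x and an already loaded PDDocument named doc
PDPage page = doc.getPage(0);
List<PDAnnotation> annotations = page.getAnnotations();
PDAnnotationMarkup freeTextMark = new PDAnnotationMarkup();
// Mark the annotation as a free text annotation
freeTextMark.getCOSObject().setName(COSName.SUBTYPE, PDAnnotationMarkup.SUB_TYPE_FREETEXT);
freeTextMark.setAnnotationName("SCDMS:Note:Peter Pinnau");

// Yellow color for background
PDColor yellow = new PDColor(new float[] { 1, 1, 0 }, PDDeviceRGB.INSTANCE);
freeTextMark.setColor(yellow);
// Position for the annotation (coordinates not shown here)
PDRectangle position = new PDRectangle();
freeTextMark.setRectangle(position);
// Set some data
freeTextMark.setTitlePopup("Peter Pinnau");
freeTextMark.setContents("This is the text\nENTER1\nENTER2");
// Color black, "Helv" font, 11 point
freeTextMark.getCOSObject().setString(COSName.DA, "0 0 0 rg /Helv 11 Tf");
// Add the annotation
annotations.add(freeTextMark);
// Save the document
doc.save(new File("..."));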

The above code places a nice multi-line yellow note in the PDF. It is visible and editable in Adobe Reader. It is also visible in the PDF viewer shipped with Ubuntu.
But it is NOT visible in Mozilla's PDF.js viewer. Unfortunately SCDMS uses PDF.js to view PDF documents.

I found out that Apache PDFBox and PDF.js do not implement a so-called default appearance for annotations. Since the annotation has no appearance, it is not visible.

Adobe Reader creates a default appearance and displays the annotation correctly. Once the PDF has been saved from Adobe Reader, the annotations also become visible in PDF.js.

There are two open issues concerning that:



The best way to solve this problem for SCDMS will of course be to add a correct appearance stream when generating the annotation.
Unfortunately this goes deep into PDF internals, so I hope that PDFBOX-2019 will be solved in the near future.

For now I have switched back to the old implementation and found another way to do the above-mentioned page operations, so that the empty-page problem is solved in this particular case.

The content stream merging is done by (page is a page with content from a present document):

PDDocument.importPage(PDPage page)

I now use:

PDDocument.addPage(PDPage page)

and content streams are not put together anymore.

Friday, 3 March 2017

Noise filter for QR-Code detection in scanned documents

I am using zxing in a project to detect QR codes within scanned documents. The goal is to achieve almost 100% recognition, but there were some issues to solve:

  1. zxing does not find small codes within a document page. Since the QR code stickers are pinned on the documents manually, the user has to pin the sticker in one of the 4 corners.
    The processor then cuts out corner by corner and searches for the code there.
  2. Unfortunately there were still unrecognized codes. Detection relies on the printing quality of the stickers, which may not be accurate in every situation.
    I did some tests and corrected unrecognized codes with GIMP until they worked. I came to the conclusion that a filter is needed to eliminate false pixels as well as possible.

I spent an evening on that and finally found a specialized solution. Take a look at the sample images:

Left: original, Right: filter applied

The result is amazing, isn't it? zxing is now able to recognize the code.

How does it work?

My first idea was to use OpenCV to implement the filter, but I then tried a very simple "self-made" algorithm:

  1. Input has to be already black/white pixel data
  2. Iterate over the pixels (by rows and columns)
  3. Leave white pixels as they are
  4. For each black pixel, count the black pixels in the surrounding 7x7 square
  5. Calculate the ratio: black pixels in the 7x7 square / 49
  6. If the ratio is less than 0.4 -> set the pixel to white

It is important to work on a copy of the input data. The filter must not analyse pixels which have been modified by the algorithm.

Since QR codes consist of rectangular patterns, the filter hardly destroys any real data as long as the stickers are pinned reasonably straight.
Typical noise from bad printers or scanning failures is reduced very well.

Close to the borders there is no full 7x7 square available. It would be possible to skip those areas. I decided to shrink the square according to the position and process the data in the same way.

Of course the 7x7 size has to be adjusted to the QR code size and the scanning resolution.
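The filter described above can be sketched in a few lines of Java. This is a minimal sketch under stated assumptions, not the original code: the class name NoiseFilter and the boolean[][] pixel representation (true = black) are hypothetical, the current pixel itself is not counted, and at the borders the ratio is taken relative to the shrunk square.

```java
final class NoiseFilter {

    static final int RADIUS = 3;          // 3 pixels in each direction => 7x7 square
    static final double MIN_RATIO = 0.4;  // below this ratio a black pixel is erased

    /** Returns a filtered copy; black[y][x] == true means black. The input is not modified. */
    static boolean[][] filter(boolean[][] black) {
        int height = black.length;
        int width = height == 0 ? 0 : black[0].length;
        boolean[][] result = new boolean[height][width]; // all white initially
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!black[y][x]) continue; // white pixels stay white
                // Shrink the square at the image borders
                int y0 = Math.max(0, y - RADIUS), y1 = Math.min(height - 1, y + RADIUS);
                int x0 = Math.max(0, x - RADIUS), x1 = Math.min(width - 1, x + RADIUS);
                int count = 0;
                for (int yy = y0; yy <= y1; yy++) {
                    for (int xx = x0; xx <= x1; xx++) {
                        // Always read from the unmodified input, skip the current pixel
                        if (black[yy][xx] && !(yy == y && xx == x)) count++;
                    }
                }
                int area = (y1 - y0 + 1) * (x1 - x0 + 1);
                result[y][x] = (double) count / area >= MIN_RATIO;
            }
        }
        return result;
    }
}
```

Writing into a separate result array is what guarantees that the filter never analyses pixels it has already changed.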

The following illustration shows the 7x7 square around the current pixel. The black pixel count in that case equals 5 (current pixel not included). The current pixel will be erased and set to white.

Illustration 7x7 square

Make it simpler

A friend of mine pointed out that calculating the ratio is not necessary. The size of the square is always 7x7 = 49 pixels, so the threshold of 0.4 can be precalculated as 0.4 * 49 = 19.6, i.e. about 20 pixels.

Exception: the border areas of the image. The square is shrunk there, but it is no problem to use the precalculated threshold. The algorithm is then a little more "aggressive" at the image borders (first 3 pixels).

Close areas

The next step is to use the algorithm to close areas: if the black pixel count is greater than 40, the pixel is set to black.

The following image shows the progress. Please enlarge the picture and compare the middle and right sample. You will see that some white pixels in the data blocks have been closed.

Left: original, middle: cleared, right: closed

Thursday, 16 February 2017

# c0ders wanted #

If the statistics are to be believed, this blog seems to have some followers. Now it's up to you. Inspired by a friend of mine, I have created a little programming riddle.

I pinned it up at the Mensa (cafeteria) of the technical university near our office. The purpose is to find a student who will be able to support us.
Unfortunately the semester holidays just started. Perfect timing ;)

Feel free to solve the task and let me know (peter AT

Thursday, 12 January 2017

PDF File comparison for automated unit testing

Unit testing and automation of those tests is one of the key concepts to implement solid testing for your software project.

Unit tests of the System Concept DMS project need to validate the produced PDF documents in various scenarios, e.g.:

  • Split a (scanned) stack into page documents
  • Split a scanned stack into single semantic documents using a barcode/ QR-code mechanism
  • Combine single page documents to one document
  • ...
To abstract document-oriented unit tests, I implemented a base class which can process a flexible number of test document sets in a directory structure.
A document set consists of defined input documents and the corresponding output documents. The following image illustrates the test case AutoIndexBarcodeTest. The test contains 1 document set. The set consists of 1 input document (input1.pdf) and 4 output documents in the subfolder output.

Image 1: test document set, input and output documents
The concrete implementation for the barcode split unit test overrides the method
doFile(File inputFile). The implementation processes the input document and then needs to validate the created output documents against the output templates from the test definition (see image).

protected void doFile(File inputFile) throws Exception {
   // Process input file

   // Validate output files
   //  - count of files
   //  - location/name of the files
   //  - file content
}

The checks for count and location/name of the output documents were quite simple.

Problem - PDF file comparison using checksum

To compare the PDF documents by content, a file checksum like MD5 seemed to be an appropriate solution: easy to implement and more or less 'bug-safe', which is important for tests.

Unfortunately, valid PDF output produced checksums different from those of the test templates. The diff command-line tool reported 1 difference near the end of the documents:

peter@grogonzola:~$ diff -Naur result.pdf template.pdf
 /Root 1 0 R
 /Info 4 0 R
-/ID [<0e90025fbadc9f6434f3d192980836f3> <0e90025fbadc9f6434f3d192980836f3>]
+/ID [<bf69242a5582735a45e72ab0ed370876> <bf69242a5582735a45e72ab0ed370876>]
 /Size 11

The reason for the different checksums is the PDF file identifier (/ID), which is created individually per file and always differs, even if 2 documents are produced using the exact same piece of code.

If you want further information about it, take a look at the following discussion on Stack Overflow:

Mr. Lowagie explains the reason for different checksums and the need for the PDF file identifier. So far so good - but that is really awful for document validation against defined templates and test automation.

Solution 1 - Adapted checksum calculation

One possible solution is to implement an adapted checksum calculation. Look for the /ID line and do not pass the data into the checksum calculation (the same for /CreationDate).
I implemented a quick hack which does the trick but uses a BufferedReader. That is not the first choice when dealing with binary data but it works for now:

public byte[] getChecksum(InputStream stream) throws Exception {
  BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
  try {
    MessageDigest md = MessageDigest.getInstance("MD5");
    String line;
    while ((line = reader.readLine()) != null) {
      // Skip the file identifier line, feed everything else into the digest
      if (!line.startsWith("/ID [<")) {
        md.update(line.getBytes());
      }
    }
    return md.digest();
  } finally {
    // close the stream
    try { reader.close(); } catch (Exception e) {}
  }
}

The problem here is that BufferedReader is meant for text files and is line-oriented. The plan is to implement a real binary solution using a byte[] buffer array. Basically it is easy to find the "/ID [<" snippet in the byte array. The difficulty is to also find it when it is split across different buffers during the read process.
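One possible way around the split-buffer problem is to keep reading with a byte[] buffer but collect the bytes of the current line separately, so that the "/ID [<" check always sees a complete line. The following is only a sketch of that idea: the class name PdfChecksum is hypothetical, it assumes that the /ID entry starts at the beginning of a line terminated by a line feed, and it wraps checked exceptions to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PdfChecksum {

    private static final byte[] ID_PREFIX = "/ID [<".getBytes(StandardCharsets.US_ASCII);

    /** MD5 over all bytes of the stream except lines starting with "/ID [<". */
    static byte[] checksum(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    line.write(buffer[i]);
                    // A line may span several buffers; it is only checked once complete
                    if (buffer[i] == '\n') digestLine(md, line);
                }
            }
            digestLine(md, line); // remaining bytes without trailing line feed
            return md.digest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void digestLine(MessageDigest md, ByteArrayOutputStream line) {
        byte[] bytes = line.toByteArray();
        line.reset();
        if (!startsWith(bytes, ID_PREFIX)) md.update(bytes);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

The trade-off is memory: a long binary stretch without any line feed is held in the line buffer until the next line feed arrives.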

Solution 2 - Using external tools/ libraries for comparison

Another solution is to use external tools/ libraries to compare PDF documents. Possible tools can be:

  • Jpdfunit - Apache PDFBox based framework for JUnit
  • diffpdf (GUI) to get an inspiration
  • command-line based tool
  • imagemagick (compare)

All of these tools more or less interpret/render the documents and so do a comparison based on their structure and appearance and not on the binary file content itself.

Especially concerning Jpdfunit I see a problem when using the same library for test validation that is used within the project to create the PDF documents (System Concept DMS uses Apache PDFBox).


Two PDF files created by the same piece of code at different times are not binary equal. So a checksum algorithm produces different fingerprints and a comparison fails.


  • Unique PDF file identifier (/ID)
  • Creation date (/CreationDate)

Possible solutions are

  1. Adapted checksum calculation which omits /ID and /CreationDate
  2. External tools/libraries to compare PDF documents by their structure/appearance and not binary content

Advantages of 1. (adapted checksum)

  • no dependencies
  • fast
  • metadata is validated, too
  • will fail if anything but the omitted parts differs

Advantages of 2. (tools)

  • may also work for document comparison from different sources

To keep tests simple and reduce dependencies on external tools I decided to use the adapted checksum solution.

Tuesday, 11 October 2016

ResponsiveGridLayout - a responsive grid-based layout for SWT

I have implemented a grid-based responsive SWT layout for my Document Management Software.
I believe it could be quite useful for other projects, so I am releasing it under the Eclipse Public License (EPL).

The layout arranges components in a grid where all cells have equal dimensions. The number of columns and the column width adapt responsively to the available width.
The row height adapts to the column width using a configurable ratio.

To see it in action please take a look at the demo video:

Here comes the source code. Please leave the header with the license and copyright information intact.

 package biz.pinnau.dms.rcp;

 import org.eclipse.swt.SWT;
 import org.eclipse.swt.graphics.Point;
 import org.eclipse.swt.widgets.Composite;
 import org.eclipse.swt.widgets.Control;
 import org.eclipse.swt.widgets.Layout;

 /**
  * Released under Eclipse Public License (EPL) 1.0
  * Copyright: Peter Pinnau (
  *
  * Gridbased SWT Layout
  * 1. All cells have equal dimensions
  * 2. Cell width increases/decreases automatically and adjusts to
  *    available width of the container
  * 3. Cell height is calculated with a ratio + offset from calculated
  *    cell width: height = width * ratio + offset
  * 4. Column count can be automatically calculated according to available width
  *
  * @author Peter Pinnau (
  * @version 1.0
  * Last modified: 2016-10-11
  *
  * Class was implemented for System Concept DMS
  * System Concept DMS - Das papierlose Büro
  */
 public class ResponsiveGridLayout extends Layout {

      /** Fixed number of columns. Default = 0 => adaptive number of columns */
      private int fixedNumColumns = 0;

      /** Minimum column width */
      private int minColumnWidth;

      /** Ratio to calculate row height */
      public float ratio = 1f;

      /** Offset for row height calculation */
      public int offset = 0;

      /** Margin around grid */
      public int margin = 0;

      /** Spacing between cells */
      public int spacing = 10;

      /**
       * Creates an instance of the layout using a fixed number of columns
       * @param numColumns fixed number of columns
       * @param minimumColumnWidth minimum column width
       */
      public ResponsiveGridLayout(int numColumns, int minimumColumnWidth) {
           this.fixedNumColumns = numColumns;
           this.minColumnWidth = minimumColumnWidth;
      }

      /**
       * Creates an instance of the layout with adaptive number of columns depending
       * on available width and minColumnWidth
       * @param minimumColumnWidth minimum column width
       */
      public ResponsiveGridLayout(int minimumColumnWidth) {
           this.minColumnWidth = minimumColumnWidth;
      }

      private int getBoxHeight(int boxWidth) {
           return (int) (boxWidth * ratio) + offset;
      }

      private int getNumColumns(int availableWidth) {
           // Fixed number of columns
           if (fixedNumColumns > 0) return fixedNumColumns;
           // No available width specified
           if (availableWidth == SWT.DEFAULT) {
                // assume 1 column
                return 1;
           }
           // Start with width of 1 column
           int totalWidth = minColumnWidth + 2 * margin;
           int numColumns = 0;
           // Increase column count until totalWidth exceeds availableWidth
           while (totalWidth < availableWidth) {
                numColumns++;
                // Add spacing + minColumnWidth (1 new column)
                totalWidth = totalWidth + spacing + minColumnWidth;
           }
           if (numColumns == 0) numColumns = 1;
           return numColumns;
      }

      @Override
      protected Point computeSize(Composite composite, int wHint, int hHint, boolean flushCache) {
           Point size = new Point(0, 0);
           // Get number of columns
           int numColumns = getNumColumns(wHint);
           int boxWidth = getAvailableBoxWidth(wHint, numColumns);
           int componentCount = composite.getChildren().length;
           int rows = (int) Math.ceil(componentCount / (double) numColumns);
           if (rows == 0) rows = 1;
           size.x = numColumns * boxWidth + (numColumns-1) * spacing + 2 * margin;
           size.y = rows * getBoxHeight(boxWidth) + (rows-1) * spacing + 2 * margin;
           return size;
      }

      /**
       * Calculates the available cell width
       * @param width available width of the container
       * @param numColumns number of columns
       * @return cell width, at least minColumnWidth
       */
      private int getAvailableBoxWidth(int width, int numColumns) {
           if (width == SWT.DEFAULT) return minColumnWidth;
           int boxWidth = (width - 2 * margin - (numColumns * spacing)) / numColumns;
           if (boxWidth < minColumnWidth) return minColumnWidth;
           return boxWidth;
      }

      @Override
      protected void layout(Composite composite, boolean flushCache) {
           Control[] children = composite.getChildren();
           // Get number of columns
           int numColumns = getNumColumns(composite.getClientArea().width);
           int boxWidth = getAvailableBoxWidth(composite.getSize().x, numColumns);
           int boxHeight = getBoxHeight(boxWidth);
           // Build the grid
           int x = margin;
           int y = margin;
           for (int i=0; i<children.length; i++) {
                if (i % numColumns == 0) {
                     // First cell of a row: back to the left margin
                     x = margin;
                     if (i > 0) {
                          y = y + spacing + boxHeight;
                     } else {
                          y = margin;
                     }
                } else {
                     x = x + boxWidth + spacing;
                }
                children[i].setSize(boxWidth, boxHeight);
                children[i].setLocation(x, y);
           }
      }
 }

The following example shows how to use the layout together with a ScrolledComposite.

scroll = new ScrolledComposite(parent, SWT.V_SCROLL | SWT.H_SCROLL);
container = new Composite(scroll, SWT.NONE);
// Create adaptive Grid
layout = new ResponsiveGridLayout(150);
layout.offset = 50;
layout.ratio = 1.5f;
layout.margin = 5;
layout.spacing = 5;
container.setLayout(layout);
// Set the child as the scrolled content of the ScrolledComposite
scroll.setContent(container);
// Expand both horizontally and vertically
scroll.setExpandHorizontal(true);
scroll.setExpandVertical(true);
scroll.addListener(SWT.Resize, new Listener() {
      public void handleEvent(Event event) {
            scroll.setMinSize(container.computeSize(scroll.getClientArea().width, SWT.DEFAULT));
      }
});
// Add some controls to container ...