PHPUnit: compare generated PDF files with Imagick

Sometimes you need to compare a generated PDF file with a reference file in PHP, for example to check in a PHPUnit test that your PDF generation works as expected. Quite often, though, two generated PDF files look identical while their binary content differs. With FPDF, for example, the following assertion can fail because the files contain different metadata.

$pdfContent1 = file_get_contents('path_to_pdf1');
$pdfContent2 = file_get_contents('path_to_pdf2');
$this->assertSame($pdfContent1, $pdfContent2);
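
To see why such a byte comparison is fragile, here is a minimal sketch with two made-up PDF info dictionaries. The strings are hypothetical fragments chosen for illustration, not real FPDF output; they describe the same document "created" two seconds apart.

```php
<?php
// Hypothetical info dictionaries as they might appear inside two PDF files
// rendered two seconds apart. Real files contain much more than this.
$info1 = "<< /Producer (FPDF) /CreationDate (D:20150101120000) >>";
$info2 = "<< /Producer (FPDF) /CreationDate (D:20150101120002) >>";

// A strict comparison fails although only the timestamp differs.
var_dump($info1 === $info2); // bool(false)

// Count the differing bytes; both strings have the same length here.
$differingBytes = 0;
for ($i = 0, $n = strlen($info1); $i < $n; $i++) {
    if ($info1[$i] !== $info2[$i]) {
        $differingBytes++;
    }
}
var_dump($differingBytes); // int(1)
```

A single changed byte in the metadata is enough to make assertSame fail, even though both documents would render identically.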

My solution to this problem is Imagick. The idea is to create an Imagick object from each PDF file, append the pages of each document into a single image, and compare the two resulting images. If the assertion fails, we can write the diff of both images to a file.

A simple PDF generator as example

To explain the approach in more detail, I have created a small class PdfGenerator that requires fpdf\FPDF.

<?php
namespace PHPPDFUnit;

use fpdf\FPDF;

/**
 * A minimal PdfGenerator as example.
 */
class PdfGenerator
{
    /**
     * @var array
     */
    private $pageContent;

    /**
     * @param array $pageContent An array of strings, each string on a new page after render.
     */
    public function __construct(array $pageContent)
    {
        $this->pageContent = $pageContent;
    }

    /**
     * Creates a PDF document and returns the content of the file as a binary string.
     *
     * @return string
     */
    public function render()
    {
        $pdf = new FPDF();
        $pdf->SetFont('Arial', 'B', 100);
        foreach ($this->pageContent as $content) {
            $pdf->AddPage();
            $pdf->Cell(40, 40, $content);
        }
        return $pdf->Output('doc.pdf', 'S');
    }
}

This class takes an array of strings as $pageContent. Calling render() writes every string on its own page and returns the content of the PDF file as a binary string. Here is a small usage example.

$pdfGenerator = new PdfGenerator(['Hello', 'PHP']);
file_put_contents('file.pdf', $pdfGenerator->render());

[Image: Example PDF]

Testing the PDF generator

Now we can start testing the PDF generator. As mentioned above, simply comparing the content would fail with "Failed asserting that two strings are identical." We provoke this by sleeping for two seconds between the two render calls, so that the creation timestamps differ. To make sure that both content strings really are different, the next example checks this with assertNotSame first.

public function testRenderSuccess()
{
    $pageContent = ['Hello', 'PHP'];
    $assertedPdfGenerator = new PdfGenerator($pageContent);
    $testPdfGenerator = new PdfGenerator($pageContent);
    $assertedPdfContent = $assertedPdfGenerator->render();
    sleep(2);
    $testPdfContent = $testPdfGenerator->render();

    // the content is different because of the /CreationDate
    $this->assertNotSame($assertedPdfContent, $testPdfContent);

    // render both PDFs to images and stack the pages vertically
    $assertedImagick = new \Imagick();
    $assertedImagick->readImageBlob($assertedPdfContent);
    $assertedImagick->resetIterator();
    $assertedImagick = $assertedImagick->appendImages(true);
    $testImagick = new \Imagick();
    $testImagick->readImageBlob($testPdfContent);
    $testImagick->resetIterator();
    $testImagick = $testImagick->appendImages(true);

    // $diff[0] is the diff image, $diff[1] the measured difference
    $diff = $assertedImagick->compareImages($testImagick, \Imagick::METRIC_ABSOLUTEERRORMETRIC);
    $this->assertSame(0.0, $diff[1]);
}
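
Since the Imagick steps are the same for both PDFs, they could be pulled into small helpers. The following is only a sketch: the function names pdfContentToImage and comparePdfContent are my own and hypothetical, and the code assumes the Imagick extension is installed and that ImageMagick can read PDFs (usually via Ghostscript).

```php
<?php
/**
 * Hypothetical helper: renders PDF content to one tall image.
 * Requires the Imagick extension and a PDF delegate such as Ghostscript.
 */
function pdfContentToImage(string $pdfContent): \Imagick
{
    $imagick = new \Imagick();
    // Every PDF page becomes one image in the Imagick sequence.
    $imagick->readImageBlob($pdfContent);
    $imagick->resetIterator();
    // true stacks the pages vertically into a single image.
    return $imagick->appendImages(true);
}

/**
 * Hypothetical helper: returns the comparison result for two PDF strings.
 * Index 0 holds the diff image, index 1 the numeric difference.
 */
function comparePdfContent(string $expectedPdf, string $actualPdf): array
{
    $expected = pdfContentToImage($expectedPdf);
    $actual = pdfContentToImage($actualPdf);

    return $expected->compareImages($actual, \Imagick::METRIC_ABSOLUTEERRORMETRIC);
}
```

Inside a test, the comparison would then shrink to two lines: call comparePdfContent($assertedPdfContent, $testPdfContent) and assert that index 1 of the result is 0.0.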

Failing example

The test above passes with two assertions. Now let's look at a failing example: the same test, but with different strings.

public function testRenderFailure()
{
    $assertedPdfGenerator = new PdfGenerator(['Hello', 'PHP']);
    $testPdfGenerator = new PdfGenerator(['Hello', 'HHVM']);
    $assertedPdfContent = $assertedPdfGenerator->render();
    sleep(2);
    $testPdfContent = $testPdfGenerator->render();

    // the content is different because of the text and the /CreationDate
    $this->assertNotSame($assertedPdfContent, $testPdfContent);

    $assertedImagick = new \Imagick();
    $assertedImagick->readImageBlob($assertedPdfContent);
    $assertedImagick->resetIterator();
    $assertedImagick = $assertedImagick->appendImages(true);
    $testImagick = new \Imagick();
    $testImagick->readImageBlob($testPdfContent);
    $testImagick->resetIterator();
    $testImagick = $testImagick->appendImages(true);

    $diff = $assertedImagick->compareImages($testImagick, \Imagick::METRIC_ABSOLUTEERRORMETRIC);
    $this->assertSame(0.0, $diff[1]);
}

As expected, we get one failure, "Failed asserting that 9825.0 is identical to 0.0.", because the rendered PDFs differ.
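
What does a number like 9825.0 mean? Assuming the metric value 1 passed to compareImages is ImageMagick's absolute error metric, it is the count of pixels that differ between the two images. The toy example below illustrates that idea in plain PHP with two hypothetical 3x3 "images"; it is not how Imagick works internally.

```php
<?php
// Two tiny hypothetical images; 0 is a white pixel, 1 a black one.
$expected = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
];
$actual = [
    [0, 1, 0],
    [0, 0, 1], // the black pixel moved one column to the right
    [0, 1, 0],
];

// Count every position where the two images disagree.
$differingPixels = 0;
foreach ($expected as $y => $row) {
    foreach ($row as $x => $pixel) {
        if ($pixel !== $actual[$y][$x]) {
            $differingPixels++;
        }
    }
}

var_dump((float) $differingPixels); // float(2)
```

Only a result of exactly 0.0 means that the rendered pages are pixel-identical.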

Visualize the difference

The failing test above does its job, but the message "9825.0 is identical to 0.0" isn't really helpful for a developer. One option is to write the content of the test PDF to a file whenever the test fails. Even nicer is to write the difference between both PDF images to a file, for example like this:

$diff[0]->writeImages('diff.png', false);
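
To avoid writing a file on every run, the diff could be written only when the images actually differ. The helper below is a sketch under that assumption; the name compareAndDumpDiff and the $diffPath parameter are hypothetical, and it uses writeImage instead of writeImages because the diff is a single image.

```php
<?php
/**
 * Hypothetical helper: compares two images and writes a diff image to
 * $diffPath when they differ. Returns the number reported by Imagick.
 */
function compareAndDumpDiff(\Imagick $expected, \Imagick $actual, string $diffPath): float
{
    $result = $expected->compareImages($actual, \Imagick::METRIC_ABSOLUTEERRORMETRIC);
    $difference = (float) $result[1];

    if ($difference > 0.0) {
        // $result[0] is an Imagick object that highlights the changed pixels.
        $result[0]->setImageFormat('png');
        $result[0]->writeImage($diffPath);
    }

    return $difference;
}
```

In the tests above, the final assertion could then read $this->assertSame(0.0, compareAndDumpDiff($assertedImagick, $testImagick, 'diff.png')); so that a failing run leaves diff.png behind for inspection.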

Please note that this test is really time-consuming, and whenever it fails, somebody has to check the result visually. So I use it sparingly. But in my view, applications that really depend on the creation of PDF files may need it.

[Image: Diff PDF]