Saturday 22 December 2012

Overriding system libraries

Regression testing is great, but when you're testing your application's interactions with other libraries (especially ones that do lots of work) while also doing memory checking (valgrind), things can get really, really slow. openDIAS testing is no exception.

Since we don't want to test these libraries - we just want to test openDIAS - I've decided to 'stub out' some of the larger application libraries (some parts of SANE, tesseract and PDF parsing) while running regression tests.

Some very cool stuff. Here are two links I've recently found very useful in that task.

I'm pleased to say, they do their job very well, and I'm very happy with the results.
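
For anyone curious what a stub looks like, here is a minimal sketch. It is not the actual openDIAS stub code: the function name ocr_page_to_text and its signature are invented for illustration, standing in for whatever slow library call is being replaced. The idea is simply to provide a function with the same name and signature that returns canned data straight away.

```c
/* stub_ocr.c: hypothetical stand-in for a slow OCR library call.
 * The name ocr_page_to_text and its signature are assumptions for
 * this sketch, not part of any real library. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same name and signature as the (hypothetical) real function, so calls
 * resolve here when this stub is found before the real library. */
char *ocr_page_to_text(const unsigned char *pixels, int width, int height)
{
    static const char canned[] = "This is the stubbed OCR text.";
    char *out;

    (void)pixels;                        /* the stub never looks at the image */
    assert(width >= 0 && height >= 0);   /* still fail loudly on nonsense input */

    out = malloc(sizeof canned);
    if (out != NULL)
        memcpy(out, canned, sizeof canned);
    return out;                          /* caller frees, as with a real library */
}
```

On Linux, one way to put such a stub in front of the real library is to build it as a shared object (gcc -shared -fPIC stub_ocr.c -o libstub_ocr.so) and start the test run with LD_PRELOAD=./libstub_ocr.so, so the dynamic linker resolves the symbol from the stub first.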

Sunday 25 November 2012

Serious Authentication

The big news for the 0.9 series is the migration of location-based privileges to a proper user authentication mechanism.
The first stage of this is making the application session aware, and that's what's been going on recently.
This looks to be going well and gives us some extra flexibility (with fancy user-specific functionality in the future).
Once session handling is stable, user profiles, management and authentication will follow, and finally the migration of the location-based privileges.

Tuesday 20 November 2012

Let's just get on with it!

So, I've created a new development branch for the 0.9 series.

First on the list is removing all the hard-coded directory locations. Hopefully by the end we'll be able to install the application into any location (e.g. /tmp/opendias_test/) so that no root access is required for testing (or local deployment).
The idea is to take all location config from the GNU automake "--prefix" variable.
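
A minimal sketch of how that could look on the C side, assuming the build passes the prefix in as a compile-time define. The macro name INSTALL_PREFIX, the helper prefixed_path and the fallback value are all made up for this example; with automake, something like AM_CPPFLAGS = -DINSTALL_PREFIX='"$(prefix)"' in Makefile.am would be one way to supply it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Normally injected by the build system; the macro name INSTALL_PREFIX and
 * the fallback value are assumptions made for this sketch. */
#ifndef INSTALL_PREFIX
#define INSTALL_PREFIX "/usr/local"
#endif

/* Write "<prefix>/<relative>" into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int prefixed_path(char *buf, size_t buflen, const char *relative)
{
    size_t need;

    assert(buf != NULL && relative != NULL);

    /* prefix + '/' + relative + terminating NUL */
    need = strlen(INSTALL_PREFIX) + 1 + strlen(relative) + 1;
    if (need > buflen)
        return -1;

    snprintf(buf, buflen, "%s/%s", INSTALL_PREFIX, relative);
    return 0;
}
```

Every place that currently has a literal path would then ask for it instead, for example prefixed_path(buf, sizeof buf, "etc/opendias/opendias.conf") (the relative path here is only an example).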

Monday 19 November 2012

openDIAS version 0.8.1 released today.

The openDIAS development team are pleased to announce the release, today, of version 0.8 of openDIAS, the open source document imaging and archive system.

OpenDIAS is a product designed to allow the home or office user to store and catalog documents, mainly (but not exclusively) from scanned physical documents. Documents are stored on a central machine and can be scanned/ imported into the archive, searched for by OCR content, date, title or assigned tag from any machine on the network.

Opendias 0.8.1 Document Detail

New features in the 0.8 release include:
  • Document linkage to other docs,
  • Scanning device locking,
  • Document list has been migrated to an auto loaded list,
  • Imported PDF, ODF and Image objects now generate a thumbnail and have OCR performed,
  • Introduction of a localisation framework (initially with English and German lang packs),
  • Various other frontend and backend improvements, tweaks and optimisations,
  • Better testing of various hardware and OS configs (including 64-bit, Ubuntu, Redhat, Debian).


OpenDIAS is available in deb and rpm packages, from a source tarball or GitHub, with a comprehensive testing suite.

Opendias 0.8.1 Document List Page

We encourage all users to upgrade their production systems to this newest version.


For further information or help, visit the project homepage at http://opendias.essentialcollections.co.uk/

Opendias 0.8.1 Document Scanning

Wednesday 24 October 2012

The next cycle begins

Work has started on the new development branch (the 0.9 cycle).

Already started is the 'configurable install location' and 'access controls by username/password'.

No rest...

Sunday 21 October 2012

Version 0.8.1 Released for public beta

The openDIAS development team are pleased to announce the release, today, of version 0.8 of openDIAS, the open source document imaging and archive system, as a public beta.

OpenDIAS is available in deb and rpm packages, from a source tarball or GitHub, with a comprehensive testing suite.

Opendias 0.8.1 Document List Page

We invite all users, whether new to or familiar with openDIAS, to install it (onto non-production environments) and give feedback before the scheduled release date of Monday the 19th November 2012.

For further information or help, visit the project homepage at http://opendias.essentialcollections.co.uk/

Sunday 19 August 2012

0.8 alpha release has been frozen


The 0.8 alpha release has been frozen.


We expect the release to be final before the end of the year.

Sunday 12 August 2012

Take time to re-focus efforts

I've recently pulled back from the project to get a little perspective. I think for the past few months, things were getting carried away with updates and changes just for the sake of them rather than to fulfill the big picture. I'm highly aware that people are waiting for the latest version, and the longer things just trundle on, the longer the work that has already been done gets wasted.

To that end, I've taken the opportunity to re-focus on generating the next release. I'll be posting shortly with the plan to get to just that point.

Friday 6 July 2012

Done client side testing (plus test)

OK, so we have a fair smattering of client side testing now. Does what it says on the tin. No great shakes, but extra coverage is always good coverage. Unfortunately we're not going to ensure that the layout is not busted on various browsers, but still.

Thursday 21 June 2012

Client side testing


I've had the opportunity recently to experiment with various means of website client-side functionality testing. The one that's far out in front, at least as far as testing openDIAS is concerned, is QUnit. Our app uses a lot of jQuery for frontend functionality, so using their own testing framework to help test openDIAS looks to fit really well.

Best of all, this can be integrated into TestSwarm (and further, Jenkins), to allow testing over multiple clients on multiple devices: Linux, Mac, Pads, Phones and even (if I dust off that VM) Windows.

I'm going to generate a test suite that will integrate into the current regression tests, and will fill the 'client functionality testing gap'.

Friday 15 June 2012

English language pack created

The moving of English text out of the application was more involved than I'd thought. Not that it was difficult. It was just that there was more text than I'd thought.

Anyway, we now have an EN language pack.

Efforts have started to translate this pack into German, but since I don't know German and am mainly relying on 'Google translate', this is slow going. Thankfully this is a task that can run in the background.

Wednesday 30 May 2012

Finished the localisation framework


Well, that was much easier than I'd imagined it would be. The end product is almost exactly how it was described in my earlier post. We just need to move all the current 'hard coded' English text into a language pack and replace it with the framework variables.

Then it's time to localise the EN language pack into some suitable languages - Gulp!

Tuesday 22 May 2012

Experiment to create a localisation framework

As an experiment, I'm looking into how much work is required to add a localisation framework. The idea is that the app web frontend will allow the user to specify a language (from the allowed 'installed languages'), or leave it to the browser client to negotiate the best language depending on the user's machine settings. The available languages will be negotiated depending on the 'language packs' installed on the backend, defaulting to English if all options fail.
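
To make the fallback order concrete, here is a rough sketch of the negotiation: explicit user choice first, then the browser's Accept-Language header, then English. It is an illustration only, not the openDIAS implementation. The installed pack list and the function names are made up, and as a simplification the q-values in the header are ignored, so the entries are assumed to be listed in order of preference.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of installed language packs. */
static const char *installed[] = { "en", "de" };
static const size_t n_installed = sizeof installed / sizeof installed[0];

/* Case-insensitive match of the first len bytes of tag against a pack name. */
static int same_tag(const char *tag, size_t len, const char *pack)
{
    size_t i;
    if (strlen(pack) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)tag[i]) != tolower((unsigned char)pack[i]))
            return 0;
    return 1;
}

/* Return the installed pack matching tag, or NULL if there is none. */
static const char *match_pack(const char *tag, size_t len)
{
    size_t i;
    assert(tag != NULL);
    for (i = 0; i < n_installed; i++)
        if (same_tag(tag, len, installed[i]))
            return installed[i];
    return NULL;
}

/* Explicit user choice first, then Accept-Language entries in the order
 * listed (q-values ignored in this sketch), then English. */
const char *negotiate_language(const char *user_choice, const char *accept_language)
{
    const char *hit;
    const char *p = accept_language;

    if (user_choice != NULL) {
        hit = match_pack(user_choice, strlen(user_choice));
        if (hit != NULL)
            return hit;
    }

    while (p != NULL && *p != '\0') {
        size_t len;
        p += strspn(p, " ,");           /* skip separators between entries */
        len = strcspn(p, "-;, ");       /* primary subtag, e.g. "de" in "de-AT;q=0.8" */
        if (len > 0) {
            hit = match_pack(p, len);
            if (hit != NULL)
                return hit;
        }
        p += strcspn(p, ",");           /* move on to the next entry */
    }
    return "en";                        /* default if all options fail */
}
```

With the two packs above, negotiate_language(NULL, "fr-FR,de;q=0.8,en;q=0.5") picks "de", and a header containing only unsupported languages falls back to "en".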

Thursday 17 May 2012

Library Updates


The new version of Ubuntu does not support tesseract v2. The OCR libraries have therefore been updated to the latest version. This, and the inevitable refactor that followed, updated several other key elements of openDIAS.
  • Leptonica has replaced libtiff and freeimage, just because that's the one that's integrated into tesseract, and it can be used for all the other image processing we are doing.
  • Poppler has been introduced to parse PDF files. We can now get thumbnail images for PDFs as well as imported images and scanned docs. A temporary hook has been put in place that will allow PDFs that have already been imported to be re-parsed to get the accurate OCR text and thumbnail image.



Sunday 22 April 2012

Now 64-bit capable


Steve Meifert's changes have fixed various issues seen on 64-bit systems.

Big Thanks to Steve for his input.

Wednesday 4 April 2012

UTF8 FTW


Tests have been added to ensure that UTF8 data, posted to the app, or generated by OCR is stored and rendered correctly by the API and application frontend.

I thought I might have some trouble with this, but it looks like everything is in order and no (or only very minimal) changes are required. This opens the door for the full multi-language support that is planned for the future.

Friday 30 March 2012

Frontend Timeout


What has long been a source of problems is the way that the frontend handles errors. If the API call to the backend timed out, or worse, the backend had totally died, the frontend would just sit there.

Updates have now been added that will catch AJAX/API timeouts/errors and let the user know in as informative a way as possible.

This had a fair impact on testing, to handle slow hosts, but I think things are covered now. I'm sure things will highlight themselves in the coming months if any problems remain.

Friday 23 March 2012

Sane has errors

So, I've started on moving the SANE calls about, as described in an earlier post. However, I'm still getting problems in some simple scenarios. I've isolated these out into their own very simple program so they can be demonstrated.
And as a result, I've raised a new bug against the sane project.

Tuesday 21 February 2012

Sane Threads Fail

This weekend I decided to run the new test suite, across a 'range of machines' and all my 'testing OS's virtual machines'. These test runs showed a number of failures - segfaults on exit. Investigations showed that this is related to the calling of sane functions while within an httpd thread.

I knew that sane is not thread safe, but I thought I'd be OK if I did a 'sane_init ... sane work ... sane_exit', all within a single thread. But alas not.

So, I'm going to have to do a bit of refactoring to solve this problem. The current plan is to:
  • Move the sane_init and sane_exit into the main startup/shutdown methods.
  • Create a 'command socket' within the main block of the program.
  • Isolate sane activity from the current methods that do sane work and replace them with calls into the command socket.
  • Use a listening loop on the command socket to dispatch sane work - that will be done within the main part of the program (and not a thread).
Hopefully this will solve the problems, and has the added advantage of making 'sane and/or device' locking easier to implement. Therefore I'll be doing both of these at the same time.
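
The plan above can be sketched roughly as follows. This is only an illustration of the shape of it, not the real refactor: the one-byte command protocol, the names and the use of socketpair() are all assumptions of mine, and the SANE calls themselves appear only as comments. A request thread never touches the scanner; it writes a command to the socket, and the main thread reads commands and does the work.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical one-byte commands sent over the command socket. */
enum command { CMD_SCAN = 1, CMD_QUIT = 2 };

/* cmd_fds[0] is read by the main loop, cmd_fds[1] is written by threads. */
static int cmd_fds[2];

/* Called from a request thread: queue work instead of touching the scanner. */
static int send_command(enum command c)
{
    unsigned char byte = (unsigned char)c;
    assert(c == CMD_SCAN || c == CMD_QUIT);
    return write(cmd_fds[1], &byte, 1) == 1 ? 0 : -1;
}

/* Stand-in for an httpd worker thread handling a "scan" request. */
static void *request_thread(void *arg)
{
    (void)arg;
    send_command(CMD_SCAN);
    send_command(CMD_QUIT);
    return NULL;
}

/* Runs in the main thread only; returns how many scans were dispatched. */
static int command_loop(void)
{
    int scans = 0;
    unsigned char byte;

    while (read(cmd_fds[0], &byte, 1) == 1) {
        if (byte == CMD_QUIT)
            break;
        if (byte == CMD_SCAN)
            scans++;            /* the real scanner calls would happen here */
    }
    return scans;
}

/* Wire it together: returns the number of scans handled, or -1 on error. */
int run_command_socket_demo(void)
{
    pthread_t worker;
    int scans;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, cmd_fds) != 0)
        return -1;
    /* sane_init() would be called here, once, at startup */
    if (pthread_create(&worker, NULL, request_thread, NULL) != 0) {
        close(cmd_fds[0]);
        close(cmd_fds[1]);
        return -1;
    }
    scans = command_loop();
    pthread_join(worker, NULL);
    /* sane_exit() would be called here, at shutdown */
    close(cmd_fds[0]);
    close(cmd_fds[1]);
    return scans;
}
```

Because every command passes through the one loop, that loop is also a natural place to hang device locking later: it can simply refuse or queue a scan while another one is in progress.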

Friday 17 February 2012

Testing is back up to scratch

The marathon that is 'better.testing' is over!

The old testing was based mainly around using the frontend to exercise the backend. This was a logical approach, but was flawed in a few ways.

  • First, it left little room for edge case testing - 'happy path' testing does not find issues.
  • Second, using the frontend and a valgrind run backend was painfully slow.
So, I've updated things a bit. First, the current tests have been split into 'service starting/stopping scenarios' and the 'current frontend tests' (with valgrind checking removed).

A new set of tests has been created that fires HTTP requests at the services, just as an AJAX request would do. The responses are checked against what is expected, and the database is compared to ensure the actions have been performed correctly. These tests are also run under valgrind, to give confidence with regard to memory violations.

These changes not only cover a greater range, but also allow greater control over the tests we have and flexibility for new testing in the future.

Will be available in the 0.8 releases.

Thursday 26 January 2012

Swiftly moving on.

The document linking work went really well, and I've got something pushed (to dev) that I'm fairly happy with. (Document linking will be available in the 0.8 releases.)
While that settles in my mind, I've started on the 'better testing' branch - Loooong overdue.

Saturday 21 January 2012

Started work on 'document linking'

Created development branches and started the initial work for the document linking solution. Things are going well so far.

Wednesday 11 January 2012

Wow - It's finally here - 0.7.3.

Wow !

It's finally here - 0.7.3.
Go get it people, enjoy:



  • Issue #10 - Scan images display is not permission checked
  • Remove the de-skew functionality
  • Improve OCR operations accuracy
  • Allow documents to be opened full screen
  • Revamp doc list and filter functionality
  • Added an 'action required' flag to documents
  • Added an "ideas" link and fixed typos/spelling errors on the application homepage.
  • Removed the dependency on glib
  • Added new admin app to easily set config options
  • Improved the selection and assignment of tags
  • Fix an issue with bad setting of SANE options. (some users could not set the resolution)
  • Added language selection for OCR operations
  • Fix install problem if installation is already found
  • Improved build instructions
  • Fix build problem on Fedora
  • Laid the groundwork for device locking
  • Various tidy-ups
  • Unified the backend response structure and defined an API

Sunday 8 January 2012

Sunday 1 January 2012

Release 0.7 out for testing

All open issues for the 0.7 branch have now been closed and I've tagged the codebase (0.7.2).
Testers are giving the new release a thrashing, so the full public release will be cut shortly.