sk-spell

Slovak language support in Open Source programs

less is better   

last modified: 1 January 2019





Nowadays file size is usually not an issue. But there are situations where the size of a library matters, e.g. on devices with limited resources such as phones.

You could experiment with compiler optimization options, but you should expect additional problems (e.g. unexpected crashes or, in the case of tesseract, decreased OCR accuracy).

Another approach is to leave out functionality or symbols that are not needed. Below are the ‘./configure’ options that tesseract-ocr itself provides for this. All tests were done on Linux (openSUSE 13.2, 64-bit).

--disable-graphics

This option disables ScrollView support. ScrollView is used to display tesseract's internal state, so that you can view its segmentation and recognition. It is useful only for advanced tesseract desktop users, so it makes sense to disable it if you are compiling the tesseract library for use in an external application. Some issues with this option were fixed after the 3.04.00 release, so make sure that you use the latest code or apply a patch that fixes them.

--enable-visibility

This feature hides most of the ELF symbols that would previously (and unnecessarily) have been public. A detailed description is in the gcc wiki (it is applied by default in the Visual Studio build).
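One way to check the effect of hidden symbols is to count what a library exports before and after rebuilding. The following is only a sketch: the function name count_exported is made up for illustration, and it assumes GNU nm from binutils is installed.

```shell
# Count the dynamic symbols that a shared library defines (and therefore exports).
# Assumes GNU nm; count_exported is a hypothetical helper name.
# Usage: count_exported libtesseract.so.3.0.4
count_exported() {
    nm -D --defined-only "$1" | wc -l
}
```

Running it on a library built with and without --enable-visibility should show a lower count for the former.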

--disable-cube

This option disables the cube engine in tesseract. The cube engine is available only for a few languages and its training is not documented. This option was implemented after the 3.04.00 release, so you can use it only with the current code from the tesseract-ocr repository.
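The three options described above can be combined in a single configure run. This is a minimal sketch, assuming you are inside a tesseract-ocr source tree in which ./configure has already been generated, and that the code is recent enough to know --disable-cube. The variable name CONFIGURE_FLAGS is only illustrative.

```shell
# Size-reducing options taken from this article.
CONFIGURE_FLAGS="--disable-graphics --disable-cube --enable-visibility"

# Only run when a configure script is present in the current directory.
# $CONFIGURE_FLAGS is intentionally unquoted so that it splits into separate arguments.
if [ -x ./configure ]; then
    ./configure $CONFIGURE_FLAGS && make
fi
```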

strip

strip is not a ‘./configure’ option but a Linux program that discards symbols from object files. You can use it like this:

strip libtesseract.so.3.0.4
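To see how much strip saves without touching the original file, you can strip a copy and compare the two sizes. This is a sketch only; the helper names file_size and compare_strip are made up for illustration.

```shell
# Print the size of a file in bytes (wc -c is POSIX, so GNU stat is not needed).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Strip a copy of the library and report both sizes; the original stays untouched.
# Usage: compare_strip libtesseract.so.3.0.4
compare_strip() {
    lib="$1"
    cp "$lib" "$lib.stripped" || return 1
    strip "$lib.stripped" || return 1
    echo "before: $(file_size "$lib") after: $(file_size "$lib.stripped")"
}
```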

overview

I also compared library sizes to see whether the compiler makes a difference. The result was not interesting: the gcc-compiled library was 3765643 bytes, while the clang-compiled library was 3750108 bytes.

configure options                                        size without strip   size after strip
(none)                                                   3765643              3259600
--disable-graphics                                       3593894              3113552
--disable-cube                                           3456687              3003112
--enable-visibility                                      3173954              2666416
--disable-graphics --disable-cube                        3280545              2852936
--disable-graphics --disable-cube --enable-visibility    2782323              2353512
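The relative savings are easier to judge as percentages of the default build. A small sketch that computes them from the measured sizes (the function name pct_saved is only illustrative):

```shell
# Percentage saved when going from size $1 to size $2, to one decimal place.
pct_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

pct_saved 3765643 3259600   # strip only: 13.4
pct_saved 3765643 2782323   # all three configure options, not stripped: 26.1
pct_saved 3765643 2353512   # all three configure options plus strip: 37.5
```

So the combination of all three options and strip removes a bit more than a third of the default library size.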



