
test-cli: drop dependency, add check #22


Open · wants to merge 5 commits into master

Conversation


@bertsky commented Mar 22, 2022

No description provided.

@bertsky (Author) commented Mar 23, 2022

@crater2150, regarding 22cbdcb, could you please comment on what kind of image input the default and legacy models expect? From my experiments, it looks like:

  • RGB plus binary does not work (the first input must have only 1 channel in its last axis)
  • grayscale plus binary does not work (there are no detections whatsoever)
  • binary * 255 plus binary kind of works (with the same low-quality results as test-cli, which takes only a binary image as input)

In particular, if only binarized images were seen during training, I wonder why there are two inputs at all. Also, should we do cropping and deskewing first? And what about the xheight and resize_height parameters? (They seem to have a large influence, so I'd like some guidance beyond the ocrd-tool descriptions.)
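For reference, here is roughly how I fed the two inputs in these experiments (a minimal sketch; the `model` handle and the input order are my assumptions, not the actual pixel-classifier API):

```python
# Minimal sketch of the combinations I tried; `model` and the input
# order are assumptions on my side, not the actual pixel-classifier API.
import numpy as np
from PIL import Image

page = Image.open("page.png")
binary = (np.array(page.convert("L")) < 128).astype(np.float32)  # 1 = ink

rgb = np.array(page.convert("RGB"), dtype=np.float32)            # H x W x 3
gray = np.array(page.convert("L"), dtype=np.float32)[..., None]  # H x W x 1
bin255 = (binary * 255)[..., None]                               # H x W x 1
bin01 = binary[..., None]                                        # H x W x 1

# 1. RGB + binary: rejected, first input must have 1 channel in the last axis
# model.predict([rgb[None], bin01[None]])
# 2. grayscale + binary: runs, but yields no detections whatsoever
# model.predict([gray[None], bin01[None]])
# 3. binary * 255 + binary: kind of works, same low quality as test-cli
# model.predict([bin255[None], bin01[None]])
```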

@crater2150 (Contributor) commented

@bertsky

> regarding 22cbdcb, could you please comment on what kind of image input the default and legacy models expect?

> In particular, if only binarized images were seen during training, I wonder why there are two inputs at all.

As far as I can remember, the models were trained on the binary images only, but it is certainly possible to train a model on color images too. The bundled models are only for separating text from non-text, which worked better on binary images in our experiments at the time. The reason for the two inputs is that some postprocessing relies on the binary image to distinguish foreground from background, so it is required even when training on full-color images.
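Schematically, the binary image enters postprocessing like this (illustration only, not the actual code):

```python
import numpy as np

def restrict_to_foreground(prediction: np.ndarray, binary: np.ndarray) -> np.ndarray:
    """Illustration only: keep predicted class labels only on actual
    foreground (ink) pixels, as given by the binary image.

    prediction: H x W per-pixel class labels from the network
    binary:     H x W mask, 1 = foreground, 0 = background
    """
    refined = prediction.copy()
    refined[binary == 0] = 0  # treat class 0 as background
    return refined
```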

> Also, should we do cropping and deskewing first?

The models were trained on deskewed images. The input images weren't cropped, but in my experience, cropping images before prediction reduces errors, even if the model was trained on uncropped images.
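If your workflow does not already deskew, a generic projection-profile search is usually good enough (just a sketch of that common technique, not what our training pipeline used):

```python
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary: np.ndarray, max_angle: float = 3.0, steps: int = 31) -> float:
    """Generic projection-profile deskew sketch: the row-sum profile of
    a binary page has maximal variance when text lines are horizontal."""
    best_angle, best_score = 0.0, -1.0
    for angle in np.linspace(-max_angle, max_angle, steps):
        rotated = rotate(binary, angle, order=0, reshape=False)
        score = float(np.var(rotated.sum(axis=1)))
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```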

> And what about the xheight and resize_height parameters? (They seem to have a large influence, so I'd like some guidance beyond the ocrd-tool descriptions.)

The xheight parameter should be set to match the scaling used during training of the model; for the bundled models it was left at the default value of 6. The preprocessing estimates the average line height of the input and scales the image so that a lowercase letter such as x ends up about xheight pixels tall. Maybe setting model to __DEFAULT__ should also set xheight?
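In effect, the preprocessing does something like the following (a sketch, not the actual implementation; `estimated_xheight` stands in for whatever the line-height estimation yields):

```python
from PIL import Image

def scale_to_xheight(image: Image.Image, estimated_xheight: float,
                     xheight: int = 6) -> Image.Image:
    """Scale the page so a lowercase x ends up about `xheight` pixels
    tall. `estimated_xheight` is what line-height estimation returns
    for the input; this is a sketch, not the actual implementation."""
    factor = xheight / estimated_xheight
    return image.resize((round(image.width * factor),
                         round(image.height * factor)), Image.LANCZOS)
```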

The resize_height parameter is model-independent; it is only used for scaling down the output of the neural network before postprocessing (splitting regions etc.). It acts as an upper limit on the image height, so it has no effect for inputs smaller than this value. For larger images, it reduces the resolution and may therefore affect the quality of the segmentation. Setting it higher than the height of the largest input image disables downscaling; if performance or memory usage is a problem, the value can be decreased.
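So the logic is just an upper bound, roughly (again a sketch, not the actual code):

```python
from PIL import Image

def cap_height(image: Image.Image, resize_height: int) -> Image.Image:
    """Downscale only if the image is taller than resize_height;
    smaller inputs pass through unchanged (sketch, not actual code)."""
    if image.height <= resize_height:
        return image  # no effect for small inputs
    factor = resize_height / image.height
    return image.resize((round(image.width * factor), resize_height),
                        Image.LANCZOS)
```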
