• 0 Posts
  • 13 Comments
Joined 2 years ago
Cake day: August 8th, 2023


  • How good is “good”, would you say?

    We got pretty good results, with a CER of 4% and a WER of 15%!

    This was on a limited dataset used for both training and testing, which most likely means that if you tested on a larger dataset with greater variation in handwriting styles, the numbers could be even worse.

    Very simplified: roughly one wrong character in every 25 and one wrong word in every 7. The SER was around 20%.

    There’s a reason why no one has released a good model for western letters yet, and why companies pay up to 1€ for capturing data from 10 handwritten pages.

    It will come but OCR isn’t as sexy as developing text2image solutions.
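For anyone wondering how numbers like CER and WER are usually computed, here is a minimal sketch. It assumes the common definition (edit distance between the recognized text and the ground truth, divided by the length of the ground truth); it is not the evaluation code behind the figures above, and the function names are just illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits per reference word (whitespace tokenized)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

One misread character in a two-word line already makes half the words wrong, which is why WER is always so much higher than CER.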




  • To train an AI to recognize handwriting you need a huge dataset of handwriting examples: millions of samples of handwritten text, plus information about what the written text says in every example.

    This is why the best engines only exist as cloud services. The OCR engines you can install locally that are acceptable, but far from perfect, are commercial. Parascript FormXtra is one of the better commercial ones.

    The only OCR Engine that’s free and really good is Tesseract OCR but it doesn’t handle handwritten text.



  • mindlight@lemm.ee to Open Source@lemmy.ml · text in image translation
    11 months ago

    I don’t have the answer you’re looking for, but maybe a pointer on where to look and what to look for …

    What you want is essentially done in two steps.

    1. Optical Character Recognition - an image consists of pixels. There is no text, just pixels. You need a program that can see the difference between pixels forming an A and a B. Tesseract is a very competent program for this and it’s free. It’s command-line only, but I know there are GUI applications based on Tesseract.

    2. Translate text from one language to another - maybe Dialect?
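The two steps can be chained roughly like this. It is only a sketch and assumes the `tesseract` command is installed and on your PATH. The `translate` argument is a placeholder for whatever translation tool you end up with, and the helper names are made up for illustration.

```python
import subprocess


def build_tesseract_cmd(image_path, lang="eng"):
    """Build the Tesseract command line for one image.

    Passing "stdout" as the output base makes Tesseract print the
    recognized text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Step 1: run Tesseract and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout.strip()


def ocr_then_translate(image_path, translate, lang="eng"):
    """Step 2: hand the recognized text to a translation function.

    `translate` is a hypothetical callable (str -> str) that you supply.
    """
    return translate(ocr_image(image_path, lang))
```

The `lang` value is the language of the text in the image (for example `deu` for German), and the matching Tesseract language data has to be installed.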



  • You can set the camera to store the pictures as JPEG. I am happy with JPEG for my holiday photos. Just check that you have the best quality setting since JPEG uses lossy compression.

    While HEIF is not the doomsday thing some describe it as, it is currently somewhat problematic.

    For example, 10-bit and HDR images can cause problems because hardware vendors implement the format differently.


  • mindlight@lemm.ee to Selfhosted@lemmy.world · TrueNAS/Nextcloud HEIC support?
    1 year ago

    I replied to a statement about HEIF being an Apple image format. It is not.

    Furthermore, HEIF is something that most major mobile device vendors support. Some, like Samsung, even set it as the default on some of their devices. So the whole “Apple always supporting non-open standards” thing is just tiresome at this point.

    99.999% of all Android users are de facto locked in by Google. Yes, Android might be open, but Play services are not. Google works hard to lock in Android users.

    At least Apple are open and honest about locking in iOS users.



  • mindlight@lemm.ee to Selfhosted@lemmy.world · TrueNAS/Nextcloud HEIC support?
    1 year ago

    No.

    It’s a container for image data developed by the Moving Picture Experts Group (“MPEG”, try to guess what else they have created).

    While there are some compatibility issues between vendors, HEIC still offers a greater set of features than, e.g., JFIF (you probably know it as JPEG/JPG).

    Apple was one of the early adopters (2017) and (as usual?) the industry has followed. Microsoft wants money for the codec in Windows and that’s probably one of the reasons why it’s not commonly used…yet.
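If you want to see the “container” part for yourself, a rough sketch like the one below tells a HEIF file from a JPEG by its first bytes. HEIF files are ISO base media files that start with an `ftyp` box naming a brand, while JPEG files start with the bytes FF D8 FF. The brand list here is a small, non-exhaustive assumption, and the function name is made up.

```python
# A few brands commonly seen in HEIF/HEIC files (assumed, not a complete list).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


def sniff_image_container(header: bytes) -> str:
    """Guess the file type from the first bytes of a file."""
    # JPEG/JFIF files begin with the start-of-image marker FF D8 FF.
    if header[:3] == b"\xff\xd8\xff":
        return "jpeg"
    # ISO base media files: 4-byte box size, then "ftyp", then the major brand.
    if len(header) >= 12 and header[4:8] == b"ftyp":
        if header[8:12] in HEIF_BRANDS:
            return "heif"
        return "isobmff"  # same container family, e.g. an MP4 video
    return "unknown"
```

Reading the first 12 bytes of a file, for example with `open(path, "rb").read(12)`, is enough input for this check.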