Convert scanned Khmer documents into Khmer Unicode using

There have been many attempts at converting scanned Khmer text into Khmer Unicode, but all have fallen short of being useful in practice. This one, however, uses machine learning, and with additional training data provided by volunteers it can “learn” to convert new fonts with very high accuracy. This makes the solution flexible and viable, because non-programmers can “teach” the software to correctly convert a scanned document into Khmer Unicode.


2 Comments

  • That is pretty awesome. OCR has been used to convert English characters for a long time, but Khmer is much harder: it is difficult to map the shapes of the characters and the levels at which they are grouped into a word.

    Appreciate their hard work.

  • Dear Programmer, would it be possible to scan something as large as the Khmer Tipitaka?

