Channel: AutoIt v3 - General Help and Support

OCR Image Pre-processing


I'm trying to write a script that screen-captures a window of on-screen text (a single line of 1 to 10 characters) and returns the OCR'd result in a variable that I can then use however I want. A sample image is attached.

 

So far I've used the Tesseract UDF (actually a modified one called "Simple Tesseract", found on these forums). I can successfully capture the screen, crop to the desired region, and send the resulting image through the Tesseract engine. However, Tesseract always returns an empty string. I've tried MODI OCR too, and it can't recognize any text either.

 

"Simple Tesseract" UDF:

(Requires Tesseract OCR to be installed)

AutoIt         
#include-once
#Include <Array.au3>
#Include <File.au3>
#include <GDIPlus.au3>
#include <ScreenCapture.au3>
#include <WinAPI.au3>
#include <ScrollBarConstants.au3>
#include <WindowsConstants.au3>
#Include <GuiComboBox.au3>
#Include <GuiListBox.au3>
#EndRegion Header
#Region Global Variables and Constants
Global $last_capture
Global $tesseract_temp_path = "C:\"
#EndRegion Global Variables and Constants
#Region Core functions
; #FUNCTION# ;===============================================================================
;
; Name...........:  _TesseractTempPathSet()
; Description ...:  Sets the location where Tesseract functions temporary store their files.
;                       You must have read and write access to this location.
;                       The default location is "C:\".
; Syntax.........:  _TesseractTempPathSet($temp_path)
; Parameters ....:  $temp_path  - The path to use for temporary file storage.
;                                   This path must not contain any spaces (see "Remarks" below).
; Return values .:  On Success  - Returns 1.
;                   On Failure  - Returns 0.
; Author ........:  seangriffin
; Modified.......:
; Remarks .......:  The current version of Tesseract doesn't support paths with spaces.
; Related .......:
; Link ..........:
; Example .......:  No
;
; ;==========================================================================================
func _TesseractTempPathSet($temp_path)
    $tesseract_temp_path = $temp_path
    Return 1
EndFunc

; #FUNCTION# ;===============================================================================
;
; Name...........:  _TesseractScreenCapture()
; Description ...:  Captures text from the screen.
; Syntax.........:  _TesseractScreenCapture($get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0, $show_capture = 0)
; Parameters ....:  $get_last_capture   - Retrieve the text of the last capture, rather than
;                                           performing another capture.  Useful if the text in
;                                           the window or control hasn't changed since the last capture.
;                                           0 = do not retrieve the last capture (default)
;                                           1 = retrieve the last capture
;                   $delimiter          - Optional: The string that delimits elements in the text.
;                                           A string of text will be returned if this isn't provided.
;                                           An array of delimited text will be returned if this is provided.
;                                           Eg. Use @CRLF to return the items of a listbox as an array.
;                   $cleanup            - Optional: Remove invalid text recognised
;                                           0 = do not remove invalid text
;                                           1 = remove invalid text (default)
;                   $scale              - Optional: The scaling factor of the screenshot prior to text recognition.
;                                           Increase this number to improve accuracy.
;                                           The default is 2.
;                   $iLeft              - x-Left coordinate
;                   $iTop               - y-Top coordinate
;                   $iRight             - x-Right coordinate
;                   $iBottom            - y-Bottom coordinate
;                   $show_capture       - Display screenshot and text captures
;                                           (for debugging purposes).
;                                           0 = do not display the screenshot taken (default)
;                                           1 = display the screenshot taken and exit
; Return values .:  On Success  - Returns an array of text that was captured.
;                   On Failure  - Returns an empty array.
; Author ........:  seangriffin
; Modified.......:
; Remarks .......:  Use the default values for first time use.  If the text recognition accuracy is low,
;                   I suggest setting $show_capture to 1 and rerunning.  If the screenshot of the
;                   window or control includes borders or erroneous pixels that may interfere with
;                   the text recognition process, then use $left_indent, $top_indent, $right_indent and
;                   $bottom_indent to adjust the portion of the screen being captured, to
;                   exclude these non-textural elements.
;                   If text accuracy is still low, increase the $scale parameter.  In general, the higher
;                   the scale the clearer the font and the more accurate the text recognition.
; Related .......:
; Link ..........:
; Example .......:  No
;
; ;==========================================================================================
func _TesseractScreenCapture($get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $iLeft = 0, $iTop = 0, $iRight = 1, $iBottom = 1, $show_capture = 0)
    Local $tInfo
    dim $aArray, $final_ocr[1], $xyPos_old = -1, $capture_scale = 3
    Local $tSCROLLINFO = DllStructCreate($tagSCROLLINFO)
    DllStructSetData($tSCROLLINFO, "cbSize", DllStructGetSize($tSCROLLINFO))
    DllStructSetData($tSCROLLINFO, "fMask", $SIF_ALL)
    If $last_capture = "" Then
        $last_capture = ObjCreate("Scripting.Dictionary")
    EndIf
    ; if last capture is requested, and one exists.
    If $get_last_capture = 1 And $last_capture.item(0) <> "" Then
        Return $last_capture.item(0)
    EndIf
    $capture_filename = _TempFile($tesseract_temp_path, "~", ".tif")
    $ocr_filename = StringLeft($capture_filename, StringLen($capture_filename) - 4)
    $ocr_filename_and_ext = $ocr_filename & ".txt"
    CaptureToTIFF("", "", "", $capture_filename, $scale, $iLeft , $iTop , $iRight , $iBottom )
    ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename & " digits")
    ; If no delimter specified, then return a string
    If StringCompare($delimiter, "") = 0 Then
        $final_ocr = FileRead($ocr_filename_and_ext)
    Else
        _FileReadToArray($ocr_filename_and_ext, $aArray)
        _ArrayDelete($aArray, 0)
        ; Append the recognised text to a final array
        _ArrayConcatenate($final_ocr, $aArray)
    EndIf
    ; If the captures are to be displayed
    If $show_capture = 1 Then
        GUICreate("Tesseract Screen Capture.  Note: image displayed is not to scale", 640, 480, 0, 0, $WS_SIZEBOX + $WS_SYSMENU)  ; will create a dialog box that when displayed is centered
        GUISetBkColor(0xE0FFFF)
        $Obj1 = ObjCreate("Preview.Preview.1")
        $Obj1_ctrl = GUICtrlCreateObj($Obj1, 0, 0, 640, 480)
        $Obj1.ShowFile ($capture_filename, 1)
        GUISetState()
        If IsArray($final_ocr) Then
            _ArrayDisplay($aArray, "Tesseract Text Capture")
        Else
            MsgBox(0, "Tesseract Text Capture", $final_ocr)
        EndIf
        GUIDelete()
    EndIf
    FileDelete($ocr_filename & ".*")
    ; Cleanup
    If IsArray($final_ocr) And $cleanup = 1 Then
        ; Cleanup the items
        For $final_ocr_num = 1 to (UBound($final_ocr)-1)
            ; Remove erroneous characters
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ".", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], "'", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ",", "")
            $final_ocr[$final_ocr_num] = StringStripWS($final_ocr[$final_ocr_num], 3)
        Next
        ; Remove duplicate and blank items
        For $each in $final_ocr
            $found_item = _ArrayFindAll($final_ocr, $each)
            ; Remove blank items
            If IsArray($found_item) Then
                If StringCompare($final_ocr[$found_item[0]], "") = 0 Then
                    _ArrayDelete($final_ocr, $found_item[0])
                EndIf
            EndIf
            ; Remove duplicate items
            For $found_item_num = 2 to UBound($found_item)
                _ArrayDelete($final_ocr, $found_item[$found_item_num-1])
            Next
        Next
    EndIf
    ; Store a copy of the capture
    If $last_capture.item(0) = "" Then
        $last_capture.item(0) = $final_ocr
    EndIf
    Return $final_ocr
EndFunc

; #FUNCTION# ;===============================================================================
;
; Name...........:  CaptureToTIFF()
; Description ...:  Captures an image of the screen, a window or a control, and saves it to a TIFF file.
; Syntax.........:  CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0)
; Parameters ....:  $win_title      - The title of the window to capture an image of.
;                   $win_text       - Optional: The text of the window to capture an image of.
;                   $ctrl_id        - Optional: The ID of the control to capture an image of.
;                                       An image of the window will be returned if one isn't provided.
;                   $sOutImage      - The filename to store the image in.
;                   $scale          - Optional: The scaling factor of the capture.
;                   $iLeft          - x-Left coordinate
;                   $iTop           - y-Top coordinate
;                   $iRight         - x-Right coordinate
;                   $iBottom        - y-Bottom coordinate
;                   $bottom_indent  - A number of pixels to indent the screen capture from the
;                                       bottom of the window or control.
; Return values .:  None
; Author ........:  seangriffin
; Modified.......:
; Remarks .......:
; Related .......:
; Link ..........:
; Example .......:  No
;
; ;==========================================================================================
Func CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $iLeft = 0, $iTop = 0, $iRight = 1, $iBottom = 1)
    Local $hWnd, $hwnd2, $hDC, $hBMP, $hImage1, $hGraphic, $CLSID, $tParams, $pParams, $tData, $i = 0, $hImage2, $pos[4]
    Local $Ext = StringUpper(StringMid($sOutImage, StringInStr($sOutImage, ".", 0, -1) + 1))
    Local $giTIFColorDepth = 24
    Local $giTIFCompression = $GDIP_EVTCOMPRESSIONNONE
    ; If capturing a control
    if StringCompare($ctrl_id, "") <> 0 Then
        $hwnd2 = ControlGetHandle($win_title, $win_text, $ctrl_id)
        $pos[0] = 0
        $pos[1] = 0
        $pos[2] = $iRight - $iLeft
        $pos[3] = $iBottom - $iTop
    Else
        ; If capturing a window
        if StringCompare($win_title, "") <> 0 Then
            $hwnd2 = WinGetHandle($win_title, $win_text)
            $pos[0] = 0
            $pos[1] = 0
            $pos[2] = $iRight - $iLeft
            $pos[3] = $iBottom - $iTop
        Else
            ; If capturing the desktop
            $hwnd2 = ""
            $pos[0] = 0
            $pos[1] = 0
            $pos[2] = $iRight - $iLeft
            $pos[3] = $iBottom - $iTop
        EndIf
    EndIf
    ; Capture an image of the window / control
    if IsHWnd($hwnd2) Then
        WinActivate($win_title, $win_text)
        $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, $iLeft, $iTop, $iRight, $iBottom, False)
    Else
        $hBitmap2 = _ScreenCapture_Capture("", $iLeft, $iTop, $iRight, $iBottom, False)
    EndIf
    _GDIPlus_Startup ()
    ; Convert the image to a bitmap
    $hImage2 = _GDIPlus_BitmapCreateFromHBITMAP ($hBitmap2)
    $hWnd = _WinAPI_GetDesktopWindow()
    $hDC = _WinAPI_GetDC($hWnd)
    $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, $pos[2] * $scale , $pos[3] * $scale)
    _WinAPI_ReleaseDC($hWnd, $hDC)
    $hImage1 = _GDIPlus_BitmapCreateFromHBITMAP ($hBMP)
    $hGraphic = _GDIPlus_ImageGetGraphicsContext($hImage1)
    _GDIPLus_GraphicsDrawImageRect($hGraphic, $hImage2, 0 , 0 , $pos[2] * $scale, $pos[3] * $scale)
    $CLSID = _GDIPlus_EncodersGetCLSID($Ext)
    ; Set TIFF parameters
    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")
    DllStructSetData($tData, "ColorDepth", $giTIFColorDepth)
    DllStructSetData($tData, "Compression", $giTIFCompression)
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))
    If IsDllStruct($tParams) Then $pParams = DllStructGetPtr($tParams)
    ; Save TIFF and cleanup
    _GDIPlus_ImageSaveToFileEx($hImage1, $sOutImage, $CLSID, $pParams)
    _GDIPlus_ImageDispose($hImage1)
    _GDIPlus_ImageDispose($hImage2)
    _GDIPlus_GraphicsDispose ($hGraphic)
    _WinAPI_DeleteObject($hBMP)
    _GDIPlus_Shutdown()
EndFunc

Test Code:

(The 626,148,654,167 parameters are the screen coordinates to crop the screen capture to. The resulting image is a white "18" on a red background, and is attached to the post.)

#include <SimpleTesseract.au3>
$OCR_Result = _TesseractScreenCapture(0,"",1,1,626,148,654,167,1)

However, if I put the captured image through http://www.free-ocr.com/ (which itself uses Tesseract), the text is recognized with 100% accuracy every time. The FAQ on the free-ocr.com website says that the only pre-processing they do before Tesseract is reducing background noise and adjusting the resolution. This leads me to believe that I need to perform some pre-OCR processing.

 

So, my question is: how can I perform this OCR image pre-processing through AutoIt (maybe through GDI Plus, or through a command-line interface)?
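For reference, here is a minimal sketch of what such pre-processing could look like with GDI+ alone. It is a guess at the two steps the free-ocr.com FAQ mentions (resolution and background), not their actual pipeline. The function name _PreprocessForOCR, its parameters, and the default scale and threshold are all hypothetical, and it assumes an AutoIt release whose GDIPlus.au3 provides _GDIPlus_BitmapCreateFromScan0, _GDIPlus_BitmapGetPixel and _GDIPlus_BitmapSetPixel. Note that the test call above passes $scale = 1, so the 28 x 19 pixel crop reaches Tesseract at its original size.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper, not part of the "Simple Tesseract" UDF.
; Enlarges an image, then turns bright pixels (the white digits) black
; and everything else (the red background) white.
Func _PreprocessForOCR($sInFile, $sOutFile, $iScale = 4, $iThreshold = 160)
    _GDIPlus_Startup()
    Local $hSrc = _GDIPlus_ImageLoadFromFile($sInFile)
    If @error Then
        _GDIPlus_Shutdown()
        Return SetError(1, 0, False)
    EndIf

    Local $iW = _GDIPlus_ImageGetWidth($hSrc) * $iScale
    Local $iH = _GDIPlus_ImageGetHeight($hSrc) * $iScale

    ; Draw the source onto a larger bitmap using bicubic interpolation.
    Local $hDst = _GDIPlus_BitmapCreateFromScan0($iW, $iH)
    Local $hGfx = _GDIPlus_ImageGetGraphicsContext($hDst)
    _GDIPlus_GraphicsSetInterpolationMode($hGfx, $GDIP_INTERPOLATIONMODE_HIGHQUALITYBICUBIC)
    _GDIPlus_GraphicsDrawImageRect($hGfx, $hSrc, 0, 0, $iW, $iH)

    ; Threshold every pixel on its approximate luminance.
    ; The threshold value is an assumption and may need tuning.
    Local $iARGB, $iR, $iG, $iB, $iLum
    For $y = 0 To $iH - 1
        For $x = 0 To $iW - 1
            $iARGB = _GDIPlus_BitmapGetPixel($hDst, $x, $y)
            $iR = BitAND(BitShift($iARGB, 16), 0xFF)
            $iG = BitAND(BitShift($iARGB, 8), 0xFF)
            $iB = BitAND($iARGB, 0xFF)
            $iLum = 0.299 * $iR + 0.587 * $iG + 0.114 * $iB
            If $iLum > $iThreshold Then
                _GDIPlus_BitmapSetPixel($hDst, $x, $y, 0xFF000000) ; bright pixel -> black text
            Else
                _GDIPlus_BitmapSetPixel($hDst, $x, $y, 0xFFFFFFFF) ; everything else -> white
            EndIf
        Next
    Next

    ; The encoder is chosen from the file extension of $sOutFile.
    Local $bSaved = _GDIPlus_ImageSaveToFile($hDst, $sOutFile)

    _GDIPlus_GraphicsDispose($hGfx)
    _GDIPlus_BitmapDispose($hDst)
    _GDIPlus_ImageDispose($hSrc)
    _GDIPlus_Shutdown()
    Return $bSaved
EndFunc
```

A call such as _PreprocessForOCR("C:\capture.tif", "C:\capture_bw.tif") (both paths are placeholders) would produce a file that could then be passed to tesseract.exe in the same way the UDF's ShellExecuteWait line does. The per-pixel loop is slow in AutoIt, but that should matter little for a crop this small.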

 

Of course, I'm open to alternatives to Tesseract, or to OCR altogether, if the right solution comes along. I'm relatively new to AutoIt and am not too familiar with a lot of its deeper built-in functionality or how it can interface with Windows.

 

Thanks!
