Extracting Text with Python - Image Preprocessing


In an earlier post I installed tesseract as a way to extract text from images with Python.
Today I'll look at how to load an image and extract text from it, and at image preprocessing.

1. Loading the image

path = "C:/Users/****/"
testimg = cv2.imread(path+"filename.jpg", cv2.IMREAD_COLOR)

Set the path and file name of the image to load, then read it in as a color image.

2. Resizing the image

imageHeight, imageWidth = testimg.shape[:2]
resizeHeight = int(0.3 * imageHeight)
resizeWidth = int(0.3 * imageWidth)
img = cv2.resize(testimg, (resizeWidth, resizeHeight), interpolation = cv2.INTER_CUBIC)

The image I was using was too large, so I resized it.
Adjust the width and height scale factor to suit your image.

3. Converting to grayscale

def gray_scale(image):
    result = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return result
img_gray = gray_scale(img)

Convert the loaded image to grayscale.
cv2.cvtColor converts an image's color space; pass cv2.COLOR_BGR2GRAY as its conversion code.
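To see what the conversion does, here is a rough NumPy-only reproduction using the standard luma weights (0.299 R + 0.587 G + 0.114 B). The function name gray_scale_manual is hypothetical, and OpenCV's own implementation uses its own rounding, so results may differ by one gray level in places. Treat it as an illustration, not a replacement.

```python
import numpy as np

# Luma weights in B, G, R order (OpenCV stores color images as BGR)
BGR_WEIGHTS = np.array([0.114, 0.587, 0.299])


def gray_scale_manual(image_bgr):
    """Weighted channel sum, roughly what COLOR_BGR2GRAY computes."""
    gray = image_bgr.astype(np.float64) @ BGR_WEIGHTS
    return np.round(gray).astype(np.uint8)


# One pure-blue, one pure-green, and one pure-red pixel (BGR order)
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
print(gray_scale_manual(pixels))
```

Green contributes the most to the gray value and blue the least, which is why blue ink on a white page still comes out fairly dark.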

4. Binarizing the image

def img_threshold(image):
    result2 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    return result2
img_binary = img_threshold(img_gray)

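Convert the grayscale image into a binary image whose pixels are either 0 or 255. With cv2.THRESH_OTSU the threshold is chosen automatically, so the 0 passed as the threshold argument is ignored.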

5. Removing noise

def remove_noise(image, kernel_size=3):
    result3 = cv2.medianBlur(image, ksize=kernel_size)
    return result3
img_rm = remove_noise(img_binary)

Remove specks from the image.
This uses cv2.medianBlur, a noise-removal function that replaces each pixel with the median of its neighborhood.

6. Dilating the image

def dilation(image):
    kernel = np.ones((3,3), np.uint8)
    result4 = cv2.dilate(image, kernel, iterations=1)
    return result4
img_dilate = dilation(img_binary)

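I expected the text to thicken along its edges, but it did not work well, perhaps because the characters are small. Note that cv2.dilate grows the bright regions, so on dark text over a white background it thins the strokes rather than thickening them.
I probably won't use this step.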

7. Morphological operations

def morphology(image):
    kernel = np.ones((8,8), np.uint8)
    result5 = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return result5
img_morp = morphology(img_rm)

Apply a morphological operation, which combines dilation and erosion.
open: erosion followed by dilation. The erosion shrinks bright regions and grows dark ones.
Others include close, gradient, tophat, blackhat, and hitmiss.

Full code

import os
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
import cv2
import numpy as np

# tesseract install path / on 32-bit Windows use Program Files (x86)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
tessdata_dir_config = '--tessdata-dir "C:\\Program Files\\Tesseract-OCR\\tessdata"'

# Load the image (the path needs a trailing slash because the file name is appended to it)
path = "C:/Users/****/"
testimg = cv2.imread(path+"filename.jpg", cv2.IMREAD_COLOR)

# If the loaded image is too large, resize it (tune the factor while checking the result)
imageHeight, imageWidth = testimg.shape[:2]
resizeHeight = int(0.3 * imageHeight)
resizeWidth = int(0.3 * imageWidth)
img = cv2.resize(testimg, (resizeWidth, resizeHeight), interpolation = cv2.INTER_CUBIC)


# Convert the image to grayscale
def gray_scale(image):
    result = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return result
img_gray = gray_scale(img)

# Convert the grayscale image to binary
def img_threshold(image):
    result2 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
    return result2
img_binary = img_threshold(img_gray)

# Remove noise from the image
def remove_noise(image, kernel_size=3):
    result3 = cv2.medianBlur(image, ksize=kernel_size)
    return result3
img_rm = remove_noise(img_binary)

# Morphological operation (opening)
def morphology(image):
    kernel = np.ones((8,8), np.uint8)
    # Operate on the argument, not on the global img_rm
    result5 = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return result5
img_morp = morphology(img_rm)

#########################################

# Save to a temporary file for text extraction
filename = "{}.png".format(os.getpid())
cv2.imwrite(filename, img_morp)

# Extract the text
text = pytesseract.image_to_string(Image.open(filename), config=tessdata_dir_config, lang='eng')
os.remove(filename)

# Check the extracted text
print(text)

# Check the processed image
cv2.imshow("Image", img_morp)
cv2.waitKey(0)
cv2.destroyAllWindows()

Even with the preprocessing applied, no text was extracted.
I'll try again next time with different options.
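As a starting point for those other options, here is a hedged sketch, not something I have verified on the image above. Tesseract's --psm flag sets the page segmentation mode (6 assumes a single uniform block of text) and --oem selects the OCR engine mode. The helper names build_tesseract_config and extract_text are hypothetical. pytesseract.image_to_string also accepts a NumPy array directly, so the temporary file can be skipped.

```python
def build_tesseract_config(psm=6, oem=3):
    """Build a config string from Tesseract's --psm and --oem flags."""
    return f"--oem {oem} --psm {psm}"


def extract_text(image, lang="eng", psm=6):
    """Run OCR on a NumPy image array without writing a temporary file."""
    # Imported here so the helper above can be used without Tesseract installed
    import pytesseract

    return pytesseract.image_to_string(image, lang=lang, config=build_tesseract_config(psm))


print(build_tesseract_config(psm=6))
```

Calling extract_text(img_binary) and then extract_text(img_morp) would show whether the later preprocessing steps help or hurt. Shrinking the image to 30% and opening with an 8x8 kernel may be erasing the characters themselves, so it is worth looking at each intermediate image before blaming the OCR options.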