Captcha Recognition (02): Image Binarization

I use Python and its imaging library, PIL, to crack captchas. The reference for PIL is The PIL Handbook. In this post, I will talk about image binarization.

About RGB

When the intensities of the red, green, and blue components differ, the result is a colorized hue, more or less saturated depending on the difference between the strongest and the weakest of the primary color intensities.

The RGB color model itself does not define what is meant by red, green, and blue colorimetrically, and so the results of mixing them are not specified as absolute, but relative to the primary colors. When the exact chromaticities of the red, green, and blue primaries are defined, the color model then becomes an absolute color space, such as sRGB or Adobe RGB.
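
To make the point about saturation concrete, here is a minimal sketch using the standard library's `colorsys` module. HSV saturation is only one common way to quantify the spread between the strongest and the weakest primary, and the helper name `saturation` and the sample colors are made up for illustration.

```python
import colorsys


def saturation(r, g, b):
    """Return the HSV saturation (0.0 to 1.0) of an 8-bit RGB color.

    Hypothetical helper for illustration; HSV is an assumed choice of measure.
    """
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return s


grey = saturation(128, 128, 128)      # equal intensities: no hue at all
pale_red = saturation(255, 204, 204)  # small gap between strongest and weakest
pure_red = saturation(255, 0, 0)      # largest possible gap

print(grey, pale_red, pure_red)
```

Equal intensities give a saturation of zero, which is exactly the grey-scale case that the binarization steps rely on.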

Binarization

  1. Convert the captcha to a grey-scale image:

     img = img.convert('L')  # convert() returns a new image, so keep the result
    
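  2. Convert the grey-scale image to a binary image. table is a list of 256 binary elements that defines the conversion map: a pixel's grey value is used as an index into table, and the element at that index becomes the pixel's value in the binary image. With the table below, grey values from 0 to 229 map to 0 and grey values from 230 to 255 map to 1.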

     table = [0] * 230 + [1] * 26   # 230 zeros followed by 26 ones, 256 entries in total
     img = img.point(table, '1')    # point() returns a new image in mode '1'
    

    img.point() returns a copy of the image in which each pixel has been mapped through the given lookup table. The table should contain 256 values per band in the image. If a function is used instead, it should take a single argument. The function is called once for each possible pixel value, and the resulting table is applied to all bands of the image.
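
The two steps above can also be traced by hand without PIL. The following is a minimal pure-Python sketch, not PIL's implementation: the helper names `rgb_to_grey`, `threshold_table`, `table_from_function`, and `apply_table` are hypothetical, the sample pixels are made up, and the grey weights are the ITU-R 601-2 luma weights that the PIL documentation gives for `convert('L')`, so PIL's own rounding may differ slightly.

```python
def rgb_to_grey(r, g, b):
    """Approximate the grey value step 1 assigns to one RGB pixel.

    Assumption: ITU-R 601-2 luma weights, as documented for convert('L').
    """
    return (r * 299 + g * 587 + b * 114) // 1000


def threshold_table(threshold=230):
    """Build the 256-entry table from step 2: 0 below the threshold, 1 otherwise."""
    return [0 if grey < threshold else 1 for grey in range(256)]


def table_from_function(func):
    """Build a table by calling func once for each possible pixel value."""
    return [func(grey) for grey in range(256)]


def apply_table(grey_rows, table):
    """Map every grey value through the table (single-band image as nested lists)."""
    return [[table[grey] for grey in row] for row in grey_rows]


# A made-up 2x3 "captcha": dark ink on a light, slightly noisy background.
rgb_rows = [
    [(20, 20, 20), (250, 250, 250), (200, 180, 190)],
    [(255, 255, 255), (30, 40, 50), (240, 240, 240)],
]

grey_rows = [[rgb_to_grey(*pixel) for pixel in row] for row in rgb_rows]
table = threshold_table(230)
binary_rows = apply_table(grey_rows, table)

print(grey_rows)
print(binary_rows)
```

The list returned by `threshold_table(230)` has the same contents as table in step 2, so it could be passed to `img.point()` in its place. Lowering the threshold keeps more of the light background noise as 1, so the value 230 is something to tune per captcha source.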

Original captcha
Grey-scale captcha
Binary captcha