What is CV_8UC1 and why do I need the image to be that to make a contour out of it?

Dear Huggingface community and @John6666 @Alanturner2 or maybe @Nafnlaus

I was going to rotate my images, but rotating with just PIL.Image.rotate(degree) leaves a black area around the result. So I tried to find a method and found out that I need to make a contour of the image to make it fit into the “frame”, or so I think.

When I apply it to my image, OpenCV asks for CV_8UC1, with an error like this:


---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
Cell In[22], line 23
     20 morphed_resized = (morphed > 0.5).astype(np.uint8) * 255
     22 # Find the max-area contour
---> 23 cnts = cv2.findContours(morphed_resized, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
     24 cnt = sorted(cnts, key=cv2.contourArea)[-1]
     26 ## This will extract the rotated rect from the contour

error: OpenCV(4.10.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\contours_new.cpp:330: error: (-2:Unspecified error) in function 'class std::shared_ptr<struct ContourScanner_> __cdecl ContourScanner_::create(class cv::Mat,int,int,class cv::Point_<int>)'
> Modes other than RETR_FLOODFILL and RETR_CCOMP support only CV_8UC1 images (expected: 'img.type() == CV_8UC1'), where
>     'img.type()' is 16 (CV_8UC3)
> must be equal to
>     'CV_8UC1' is 0 (CV_8UC1)

A little googling let me find out that CV_8UC1 is an image format, as written here and here. Basically, from those two I gather that I need to convert the image into the channel layout required by cv2.findContours, which, according to its documentation, requires a binary image. Is that what CV_8UC1 is?

Oh wait, can I do it with the same method as before, like
.convert("L") / 255? The guide did it like

morphed_resized = (morphed > 0.5).astype(np.uint8) * 255

but to no avail; the error persists.

Hmm, maybe I should also try dividing it by 255 to make it relative to black and white. Am I doing this correctly? I'll find out in a minute by trying that.
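For what it's worth, a quick check of what .convert("L") actually produces (the sizes and values here are just illustrative):

```python
import numpy as np
from PIL import Image

im = Image.new("RGB", (32, 32), (200, 200, 200))

# .convert("L") alone yields a single-channel uint8 array (CV_8UC1-compatible)
arr = np.array(im.convert("L"))
print(arr.dtype, arr.shape)   # uint8 (32, 32)

# dividing by 255 silently promotes to float64 -- no longer CV_8UC1
print((arr / 255).dtype)      # float64
```

So the division step is exactly what would break the CV_8UC1 requirement.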


It’s easy to convert between OpenCV and PIL, but be careful because the order of the color channels is different. For reference:

Come to think of it I did import the image with PIL.

I changed the import method to cv2 like this

image_rgba = cv2.imread(os.path.join(ROOT_DIR, image))

Now the error becomes this:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
Cell In[25], line 23
     16 morphed = cv2.morphologyEx(threshed, cv2.MORPH_CLOSE, kernel)
     18 # resized to CV_8UC1
     19 # see https://github.com/orgs/ultralytics/discussions/16282
     20 #morphed_resized = (morphed > 0.5).astype(np.uint8) * 255
     21 
     22 # Find the max-area contour
---> 23 cnts = cv2.findContours(morphed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
     24 cnt = sorted(cnts, key=cv2.contourArea)[-1]
     26 ## This will extract the rotated rect from the contour

error: OpenCV(4.10.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\contours_new.cpp:330: error: (-2:Unspecified error) in function 'class std::shared_ptr<struct ContourScanner_> __cdecl ContourScanner_::create(class cv::Mat,int,int,class cv::Point_<int>)'
> Modes other than RETR_FLOODFILL and RETR_CCOMP support only CV_8UC1 images (expected: 'img.type() == CV_8UC1'), where
>     'img.type()' is 16 (CV_8UC3)
> must be equal to
>     'CV_8UC1' is 0 (CV_8UC1)

Updating for posterity; I will try to understand what CV_8UC1 and CV_8UC3 are.


Maybe the C means channels.

This is weird: the code never converts anything to a numpy array, yet the output is a numpy array.

for image in large_image_stack_512:
    image_rgba = cv2.imread(os.path.join(ROOT_DIR,image))
    # image_rotate = image_rgba.rotate(45, expand=True)
    
    # Convert to gray, and threshold
    gray = cv2.cvtColor(image_rgba, cv2.COLOR_RGB2BGR)
    th, threshed = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)

    # Morph-op to remove noise
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    morphed = cv2.morphologyEx(threshed, cv2.MORPH_CLOSE, kernel)

    # resized to CV_8UC1
    # see https://github.com/orgs/ultralytics/discussions/16282
    #morphed_resized = (morphed > 0.5).astype(np.uint8) * 255

    print("a", type(morphed))
    # Find the max-area contour
    cnts = cv2.findContours(morphed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    ^ # where error is

Output

a <class 'numpy.ndarray'>
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
Cell In[7], line 24
     22 print("a", type(morphed))
     23 # Find the max-area contour
---> 24 cnts = cv2.findContours(morphed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
     25 cnt = sorted(cnts, key=cv2.contourArea)[-1]
     27 ## This will extract the rotated rect from the contour

error: OpenCV(4.10.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\contours_new.cpp:330: error: (-2:Unspecified error) in function 'class std::shared_ptr<struct ContourScanner_> __cdecl ContourScanner_::create(class cv::Mat,int,int,class cv::Point_<int>)'
> Modes other than RETR_FLOODFILL and RETR_CCOMP support only CV_8UC1 images (expected: 'img.type() == CV_8UC1'), where
>     'img.type()' is 16 (CV_8UC3)
> must be equal to
>     'CV_8UC1' is 0 (CV_8UC1)

Yes, thank you for the link; I'm looking at it right now.


The data type read by cv2.imread() is originally an np.ndarray.
It’s a pain when the order of colors and other things get mixed up, so I try to use PIL to load and save images as much as possible. I only use numpy or Torch Tensor when processing.

Oh man, I still got mixed up because I thought I'd only get a numpy.ndarray when explicitly converting with np.array(image). Thanks for the clarification.

Okay, so the first step using PIL is correct; I'll continue from there. It's a pain, isn't it, handling these three libraries for image processing.


Yeah, that's true. But the fact that they are all libraries with a long history means they have a lot of accumulated know-how, which is a big help. We can usually manage to find what we're looking for by searching.

The strength of Python is not so much that Python itself is good, but rather the richness of the libraries and information. Well, I like the readability of Python. It’s easy to copy and paste.

Okay, after several fixes I found this, quoting from the text:

For instance, “CV_8UC3” in opencv is the same as np.uint8 in numpy. Or “CV_32SC1” is the same as np.int32. Moreover, opencv images are the same as numpy arrays. With that in mind, the way to change the data type of an opencv image is to use numpy’s .astype function. For instance, if img is an opencv image and I want to make it of “CV_8UC1” type, I would just type: img = img.astype('uint8'). And then use img in my opencv function. In the same vein, if I want to check the type of my opencv image, I simply do: print(img.dtype). If it prints out, say, uint8, I know for sure it is “CV_8UC?”. But I don’t know the number of channels (that’s why I put the question mark). To find out the number of channels, I simply do: print(img.shape). If the shape has only two numbers, e.g., (780,1240), I know it is a single-channel 780x1240px image. If the shape has 3 numbers, e.g., (780,1240,3), I know it is a 3-channel 780x1240px image. What’s interesting, though, is that you can’t convert a 1-channel image to a 3-channel image and vice versa using numpy. For this operation you have to use an opencv native function. To convert a 3-channel image to a 1-channel image, you can use cv2.cvtColor(img, cv2.COLOR_BGR2GRAY); to convert a 1-channel image to a 3-channel image, you can use cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB).

So basically CV_8UC1 maps to a numpy dtype the same way CV_32SC1 maps to np.int32? Since the previous discussion with John also confirmed that OpenCV images and NumPy arrays share the same format (np.array), that supports the quote's claim that they can be converted back and forth with each library's functions.

I tried to follow the guide in the quote to see what my image looks like with print(gray.shape), and what I got is:


gray = cv2.cvtColor(np.array(image_rgba), cv2.COLOR_RGB2BGR)
print(gray.shape)

this

(512, 512, 3)

Then I realized that, although the comment says it's converting to gray, I somehow (or the code somehow) only converted it into BGR, which is, long story short, just RGB with the channels in a different order.

For reference, this is the np.info output for the variable gray:


class:  ndarray
shape:  (512, 512, 3)
strides:  (1536, 3, 1)
itemsize:  1
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0x1c36a60a0a0
byteorder:  little
byteswap:  False
type: uint8
None

So I fixed the code by converting it to grayscale like this

 gray = cv2.cvtColor(gray_BGR, cv2.COLOR_BGR2GRAY)

and now the np.info become

class:  ndarray
shape:  (512, 512)
strides:  (512, 1)
itemsize:  1
aligned:  True
contiguous:  True
fortran:  False
data pointer: 0x1c36a5ca090
byteorder:  little
byteswap:  False
type: uint8
None

The program ran smoothly until this function, which is apparently deprecated beyond OpenCV 3.0 (mine is 4.10.0.84):

M = cv2.estimateRigidTransform(current_pts, model_pts, True)

I tried changing it to
M = cv2.estimateAffine2D(current_pts, model_pts, True)

and get this error

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
Cell In[32], line 59
     57 print(M)
     58 # Wrap the image
---> 59 wrap_gray = cv2.warpAffine(gray, M, (int(sx), int(sy)))
     61 cv2.imshow("dst", wrap_gray)
     62 cv2.waitKey(0)

error: OpenCV(4.10.0) :-1: error: (-5:Bad argument) in function 'warpAffine'
> Overload resolution failed:
>  - M is not a numerical tuple
>  - Expected Ptr<cv::UMat> for argument 'M'

When it said it's not a tuple, I checked with print(M) and got
(array([[ -0., 1., -0.], [ -1., -0., 511.]]), array([[1], [1], [1], [1]], dtype=uint8))

I don't understand; it is a tuple containing two matrices. Currently checking whether the error means "the image doesn't fit the size (sx, sy)".


I just realized both gray images are actually uint8, so it's not about the 8U part of CV_8UC3 after all? Just grayscale vs RGB?


Just grayscale and RGB?

True…

The problem actually comes back when I reach estimateAffine2D; it asks for

Expected Ptr<cv::UMat> for argument 'M'

which, as I searched around, is a UMat format problem or something.

Ugh, is there an easier way to rotate an image and erase the black around the rotated result…


This?

estimateAffine2D returns two values

Yes, it returns two values, one of which is the matrix and the other a… uh… one-dimensional array that has all 1s in it?

Do you think that if I train on rotated images like this it will affect the loss? Somehow I think it would; that's why I'm looking for a way to erase the black area around the image.

The code from here would be perfect if not for the deprecation…


It’s a waste, but let’s cut out a square from the center and throw away the rest. Do the same with the mask. It’s cropping.

I actually made some cropping code previously, like this:

# Size of the image in pixels (size of original image)
# not mandatory
width, height = im.size
print(height)
print(width)

#(left, upper, right, lower) means two points
# 1. (left, upper)
# 2. (right, lower)
# 800x600 pixel image means the left upper point is (0,0) 
# and the right lower point is (800,600)

# Setting the points for the cropped image
# (sample values; adjust to taste)
left = 10
top = height / 2
right = 20
bottom = 3 * height / 2
# Cropped image of above dimension
# (It will not change the original image)
im1 = im.crop((left, top, right, bottom))

#shows the image in image viewer
im1.show()

Then I can just change the coordinates to my liking, given the specification of where the upper point and lower point are. I don't know if I can apply this code with a diagonal "cut", though.

But after you mentioned masks, isn't this like an "inverse mask", where you keep the 'white' but let go of the 'black'? Hm, wait, what am I saying…


Instead of cutting diagonally, use only the central square, without the black. The mask is processed the same way.
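A rough sketch of that idea: rotate with PIL, then keep only the central square the black corners can't reach (the helper name is mine, and it assumes a square input; expect up to a pixel of rounding at the edges):

```python
import numpy as np
from PIL import Image

def rotate_and_center_crop(im, degrees):
    """Rotate, then crop the largest centered axis-aligned square
    that stays inside the rotated image (hypothetical helper)."""
    w, h = im.size
    theta = np.deg2rad(degrees % 90)
    # side of the largest axis-aligned square inside the rotated square
    side = int(w / (np.sin(theta) + np.cos(theta)))
    rotated = im.rotate(degrees, expand=True)
    cx, cy = rotated.size[0] // 2, rotated.size[1] // 2
    return rotated.crop((cx - side // 2, cy - side // 2,
                         cx + side // 2, cy + side // 2))

out = rotate_and_center_crop(Image.new("L", (512, 512), 255), 45)
print(out.size)   # (362, 362) -- roughly 512 / sqrt(2)
```

The same call on the mask keeps image and mask aligned.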


I just realized this on my way home:

What if I just proceed with making masks that way anyway? Since the padding is black, the model will think it's not water, no? If it then sees the river diagonally, we get what we want: making sure the model understands diagonal rivers.


Oh, so the strategy is to leave the image as it is and just use the black color to represent “not river”?
If that’s the case, I think it would only improve the ability to recognize rivers when given strange images like this, and it would stray from the theme of images of rivers from different angles…:thinking:

Ideally, there would be a proper dataset of rivers from different angles, but that would take too much time, so we're trying to make something that looks like it using mechanical transformations.
Also, "rotation" is just an example; I thought this would involve various kinds of image processing, such as projective transformation and 3D processing. Is that not the case?