Bottle Cap Object Detection - Black shiny Caps

I'm training on a dataset of black shiny caps (only the caps, not the bottles). The cap belongs to a makeup product bottle. I want the cap top view (smooth, reflective, shiny closed side) to be class 1 and the bottom view (open side) to be class 2.

Inside the cap, viewed from the bottom, there is a transparent plastic semi-sphere. This semi-sphere causes light glare, reflections, and strong refractions onto the top surface of the cap from inside.

On the live camera, the bottom view is now detected as class 2 (correct), but the top view is also detected as class 2 (not 1). How can I make the top class distinguishable from the bottom class?

From the dataset images I can see that the top and bottom views look almost the same, with a lot of reflections.

Can anyone please help?

Fix the optics first, then the pipeline, then the data. Make “top” and “bottom” look different to the camera, then let the model exploit that separation. Train with glare-heavy augmentations and hard negatives. Use a two-stage detect→crop→classify flow for the subtle top/bottom attribute.

1) Make the views separable with light

  • Cross-polarize. Put a linear polarizer on the light and an orthogonal analyzer on the lens. This suppresses specular glare from shiny plastics, which currently makes top look like bottom. Expect some exposure loss and residual speculars on curves. (Edmund Optics)
  • Diffuse the field for glossy curves. Use a dome (or flat-dome) light or a coaxial on-axis light. Domes create “cloudy-day” illumination on curved, reflective parts; coaxial gives uniform top-down light on flatter regions. Both reduce highlight hotspots that collapse your classes. (Advanced Illumination)
  • Backlight when geometry allows. Put a backlight under the imaging plane. The open bottom shows a bright aperture. The closed top stays solid. This simple silhouette cue often solves orientation. (Edmund Optics)
  • Wavelength tricks (optional). IR can de-emphasize color and sometimes tame uneven contrast on dark plastics. Test it only if you cannot change geometry. (NI)
    Background: robust inspection comes from lighting that maximizes contrast for the feature you care about, not from heavier models. Use a standard, feature-appropriate lighting method and verify with quick A/B shots; a scoring sketch follows this list. (Advanced Illumination)
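
To score those A/B shots, a minimal sketch is below. It assumes a hypothetical folder layout like ab_test/<setup>/top and ab_test/<setup>/bottom, and compares the fraction of blown-out (specular) pixels plus a grayscale-histogram distance between the two views; the setup with the largest separation wins.

# Score top-vs-bottom separability for each lighting setup (hypothetical ab_test/ layout).
import cv2, numpy as np, glob, os

def view_stats(folder):
    specular, hists = [], []
    for path in glob.glob(os.path.join(folder, "*.jpg")):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        specular.append((gray > 240).mean())                       # blown-out pixel fraction
        h = cv2.calcHist([gray], [0], None, [64], [0, 256]).ravel()
        hists.append(h / h.sum())
    return float(np.mean(specular)), np.mean(hists, axis=0).astype("float32")

for setup in ["current", "cross_pol", "dome", "coaxial", "backlight"]:
    spec_t, hist_t = view_stats(f"ab_test/{setup}/top")
    spec_b, hist_b = view_stats(f"ab_test/{setup}/bottom")
    sep = cv2.compareHist(hist_t, hist_b, cv2.HISTCMP_BHATTACHARYYA)  # larger = more separable
    print(f"{setup:10s} specular top={spec_t:.2f} bottom={spec_b:.2f} separation={sep:.3f}")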

2) Change the model structure

  • Detect once, then classify the crop. Train the detector with a single class, cap. For each box, run a tiny classifier for top vs bottom. This raises attribute accuracy on near-identical views and is the recommended pattern in the Ultralytics ecosystem. (Ultralytics)
  • Why. The detector learns geometry and placement. The classifier focuses on subtle cues inside the crop (aperture rim, inner insert hint) without background noise. Ultralytics docs and community threads outline this split; a crop-export sketch for building the classifier dataset follows this list. (Ultralytics Docs)
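
A minimal data-prep sketch for that split, assuming your existing two-class YOLO-format detection labels sit under images/ and labels/ (hypothetical paths) with class 0 = top and class 1 = bottom; it crops every labeled box into the ImageFolder layout the crop classifier trains on.

# Build the crop-classifier dataset from existing YOLO-format detection labels.
# Assumed layout: images/*.jpg and labels/*.txt ("class cx cy w h", normalized).
import cv2, glob, os

names = {0: "top", 1: "bottom"}   # assumed mapping from the current annotations
for cls_name in names.values():
    os.makedirs(f"top_bottom_cls/train/{cls_name}", exist_ok=True)

for img_path in glob.glob("images/*.jpg"):
    label_path = "labels/" + os.path.basename(img_path).rsplit(".", 1)[0] + ".txt"
    if not os.path.exists(label_path):
        continue
    img = cv2.imread(img_path)
    H, W = img.shape[:2]
    with open(label_path) as f:
        lines = f.read().splitlines()
    for i, line in enumerate(lines):
        c, cx, cy, w, h = line.split()
        cx, cy, w, h = float(cx) * W, float(cy) * H, float(w) * W, float(h) * H
        x1, y1 = max(0, int(cx - w / 2)), max(0, int(cy - h / 2))
        x2, y2 = min(W, int(cx + w / 2)), min(H, int(cy + h / 2))
        crop = img[y1:y2, x1:x2]
        if crop.size == 0:
            continue
        out = f"top_bottom_cls/train/{names[int(c)]}/{os.path.basename(img_path)}_{i}.jpg"
        cv2.imwrite(out, cv2.resize(crop, (224, 224)))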

3) Make the data teach shape, not shine

  • Hard-negative mining loop. Save every live mistake where a top is predicted as bottom. Add those crops back labeled top and retrain. YOLOv8/11 do not have built-in online HNM, but the workflow is supported and documented in issues. (GitHub)
  • Glare-centric augmentation. In the classifier, use RandomBrightnessContrast and occasional RandomSunFlare. Add mild HSV jitter and some grayscale to break color reliance. Keep geometry stable; this task is reflectance-driven. A pipeline sketch follows this list. (Albumentations)
  • Loss choices. If you see imbalance or many easy negatives, focal-style objectives help the detector focus on hard cases; modern IoU-aware variants like Varifocal Loss improve ranking of detections. (arXiv)
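
A minimal sketch of that glare-heavy pipeline; the probabilities and limits are illustrative starting points, not tuned values, and you would apply it offline to the crops or inside a custom dataloader rather than through the built-in YOLO classify augmentations.

# Glare-centric augmentation for the crop classifier (Albumentations).
# Geometry stays untouched on purpose; the nuisance here is reflectance, not pose.
import albumentations as A

glare_aug = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=0.7),
    A.RandomSunFlare(src_radius=60, p=0.2),   # synthetic specular blob
    A.HueSaturationValue(hue_shift_limit=5, sat_shift_limit=20, val_shift_limit=20, p=0.3),
    A.ToGray(p=0.1),                          # break reliance on color
])
# usage: augmented = glare_aug(image=crop_rgb)["image"]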

4) If data is thin, synthesize the hard cases

  • Omniverse Replicator. Randomize HDR domes, light positions, roughness, IOR, and camera tilt. Emit COCO boxes for detect and 224×224 crops for classify. Replicator’s randomizers cover lights/materials/poses out of the box. (Omniverse Docs)
  • BlenderProc. Established sim-to-real pipeline for photoreal rendering and domain randomization; heavily used in industrial object datasets. (Robotics Simulation Shop)
  • Example. Edge Impulse’s Replicator tutorial shows object-detection datasets generated via domain randomization improving generalization. (docs.edgeimpulse.com)

5) Optional hardware: polarization cameras

  • A Sony Polarsens sensor gives per-pixel 0°/45°/90°/135° frames. Compute DoLP/AoLP images where specular highlights pop differently than diffuse regions. Top and bottom often separate in DoLP/AoLP even when RGB looks identical. Vendor notes and app papers cover use and calibration. A computation sketch follows. (Sony Semiconductor Solutions)
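
A minimal sketch of that DoLP/AoLP computation, assuming the four polarization channels are already demosaiced into same-size float arrays; feed the resulting maps (or an S0/DoLP/AoLP stack) to the classifier instead of plain RGB.

# Degree and angle of linear polarization from a four-channel polarization capture.
import numpy as np

def polarization_maps(I0, I45, I90, I135, eps=1e-6):
    S0 = 0.5 * (I0 + I45 + I90 + I135)            # total intensity
    S1 = I0 - I90
    S2 = I45 - I135
    dolp = np.sqrt(S1**2 + S2**2) / (S0 + eps)    # degree of linear polarization, 0..1
    aolp = 0.5 * np.arctan2(S2, S1)               # angle of linear polarization, radians
    return dolp, aolp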

6) Step-by-step playbook

  1. Optics A/B test. Capture 20 tops and 20 bottoms for each setup: current, cross-pol, dome, coaxial, and backlight. Pick the simplest setup giving the largest top/bottom contrast. (Advanced Illumination)
  2. Train a 1-class detector for “cap”.
  3. Train a 2-class classifier on 224×224 crops with glare-heavy augmentations. (Ultralytics Docs)
  4. Close the loop. Mine hard negatives from live video weekly. Retrain the classifier with these and keep detector weights stable unless localization degrades. (GitHub)
  5. Consider Varifocal/quality-aware losses if detector ranking is poor near decision thresholds; a plain focal-loss sketch follows this list. (CVF Open Access)
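
If you end up training the crop classifier with a custom loop, a plain focal loss (Lin et al.) is the simplest of these objectives; the sketch below uses the paper's common defaults, which are assumptions rather than values tuned for this dataset. Varifocal adds IoU-aware weighting on top and lives inside the detector head, so it is not shown here.

# Plain focal loss for a custom classification training loop.
# gamma=2.0 and alpha=0.25 are the usual defaults, not tuned for this data.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample cross-entropy
    pt = torch.exp(-ce)                                      # probability of the true class
    return (alpha * (1.0 - pt) ** gamma * ce).mean()         # down-weight easy examples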

7) Minimal reference code (detect→crop→classify)

# deps:
#   pip install ultralytics==8.3.0 albumentations==1.4.8 opencv-python
# docs:
#   YOLO classify: https://docs.ultralytics.com/tasks/classify/
#   Albumentations transforms: https://albumentations.ai/docs/2-core-concepts/transforms/
from ultralytics import YOLO
import cv2, numpy as np

# 1) train detector separately (one class: cap)
# yolo task=detect mode=train model=yolov8n.pt data=cap.yaml imgsz=640 epochs=100  # https://docs.ultralytics.com/

# 2) train classifier on crops (two classes: top, bottom) with glare aug
# yolo task=classify mode=train model=yolov8n-cls.pt data=top_bottom_cls/ imgsz=224 epochs=50

detector = YOLO("runs/detect/cap/weights/best.pt")
classifier = YOLO("runs/classify/topbottom/weights/best.pt")

def detect_and_classify(img_bgr):
    H, W = img_bgr.shape[:2]
    det = detector(img_bgr, imgsz=640, conf=0.25)[0]
    out = []
    for (x1,y1,x2,y2) in det.boxes.xyxy.int().cpu().numpy():
        x1,y1 = max(0,x1), max(0,y1)
        x2,y2 = min(W-1,x2), min(H-1,y2)
        if x2 <= x1 or y2 <= y1:
            continue  # skip degenerate boxes
        crop = cv2.resize(img_bgr[y1:y2, x1:x2], (224,224))
        pred = classifier(crop, imgsz=224)[0]  # Ultralytics treats numpy input as BGR
        cls = int(pred.probs.top1); conf = float(pred.probs.top1conf)
        out.append(((x1,y1,x2,y2), classifier.names[cls], conf))
    return out

This keeps the attribute decision in a classifier that you can harden with mined mistakes and glare aug. (Ultralytics Docs)
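
A minimal sketch of that loop on top of detect_and_classify above; the 0.85 threshold and the review/ folder are illustrative choices. Anything logged here gets relabeled by a human and folded into the next classifier training run.

# Log crops the classifier is unsure about so they can be relabeled and retrained on.
import os, time, cv2

os.makedirs("review", exist_ok=True)

def log_uncertain(img_bgr, results, conf_threshold=0.85):
    for (x1, y1, x2, y2), name, conf in results:
        if conf < conf_threshold:
            fname = f"review/{int(time.time()*1000)}_{name}_{conf:.2f}.jpg"
            cv2.imwrite(fname, img_bgr[y1:y2, x1:x2])

# in the live loop:
#   results = detect_and_classify(frame)
#   log_uncertain(frame, results)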

8) Evaluate the right thing

  • Per-class PR and confusion matrix. Check whether errors are classification or localization. Ultralytics saves PR curves and confusion matrices; FiftyOne gives interactive views and “hardness” mining. A short validation snippet follows. (Ultralytics Docs)
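
A short validation sketch using the Ultralytics val API; the weight paths and dataset names are the same illustrative ones used in the reference code above.

# Validate both stages; Ultralytics writes PR curves and the confusion matrix
# for the detector under runs/ automatically.
from ultralytics import YOLO

det_metrics = YOLO("runs/detect/cap/weights/best.pt").val(data="cap.yaml")
cls_metrics = YOLO("runs/classify/topbottom/weights/best.pt").val(data="top_bottom_cls")
print(cls_metrics.top1)  # top-1 accuracy on the top/bottom crops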

9) Similar problems and why they matter

  • Texture-less industrial parts are inherently ambiguous under glare. Study T-LESS and BOP write-ups for lighting and pose challenges; they explain why lighting and synthetic data help. (arXiv)

Curated materials with purpose

Lighting and optics

  • Advanced Illumination: Practical lighting guide. How to choose feature-appropriate lighting. (Advanced Illumination)
  • UnitX Labs: Dome vs coaxial explainers with reflective examples. Fresh summaries. (UnitX)
  • Edmund Optics: Silhouetting/backlight application note. Orientation via apertures. (Edmund Optics)
  • Edmund Optics: Successful polarization techniques. Cross-pol to remove glare. (Edmund Optics)

Two-stage pattern and metrics

  • Ultralytics glossary: Two-stage detectors overview and when to use them. (Ultralytics)
  • Ultralytics docs: Classification task quickstart for the crop classifier. (Ultralytics Docs)
  • Ultralytics: Validation/metrics including PR curves and confusion matrices. (Ultralytics Docs)
  • FiftyOne docs: Detection evaluation and hard sample mining. Interactive confusion matrices and hardness scores. (docs.voxel51.com)

Hard negatives, losses, imbalance

  • Ultralytics issues: Hard-negative mining is not built-in; use an external loop. (GitHub)
  • Focal Loss and Varifocal Loss papers for dense detection and ranking. (arXiv)

Augmentations

  • Albumentations: RandomBrightnessContrast and RandomSunFlare docs. Use to stress speculars. (Albumentations)

Synthetic data

  • NVIDIA Omniverse Replicator docs: randomizers for lights, materials, and poses; COCO-style outputs. (Omniverse Docs)
  • BlenderProc: photoreal rendering and domain randomization for industrial datasets. (Robotics Simulation Shop)
  • Edge Impulse: Replicator tutorial generating object-detection data via domain randomization. (docs.edgeimpulse.com)

Polarization cameras

  • Sony Polarsens sensor notes: per-pixel 0°/45°/90°/135° capture; DoLP/AoLP use and calibration. (Sony Semiconductor Solutions)


What to do this week

  1. Add cross-pol sheets and test a small dome or coaxial head; capture 20 tops/20 bottoms per setup; pick the best. (Advanced Illumination)
  2. Switch to detect→crop→classify. Train detector once. Train classifier with glare aug. (Ultralytics Docs)
  3. Start a hard-negative loop from your live stream. Retrain the classifier weekly. (GitHub)
  4. If data is scarce, generate 5–20k synthetic crops varying lights/materials with Replicator or BlenderProc. (Omniverse Docs)

That sequence usually flips your live top views from class 2 to class 1 with minimal disruption.