Hiring Lead AI/ML Engineer | Building the design agent (Multimodal Image-to-CAD) | Stealth Startup

The Mission:

We are building a high-end AI Design Atelier that transforms creative human intent into production-ready jewelry. We aren’t just building another chatbot; we are building a multimodal pipeline that converts voice and text into 2D images, and then into structured 3D geometry for manufacturing.

We are looking for a founding-level AI/ML Engineer to architect the bridge between Gemini 1.5 Pro/2.5 and parametric 3D CAD engines.

The Stack:

Foundation: Gemini 1.5 Pro / Vertex AI (Multimodal focus).

Infrastructure: Python, FastAPI, Hugging Face Transformers.

3D/Graphics: Experience with Three.js, GLB/USDZ, or Rhino/Grasshopper API is a massive plus.

Data: RAG (Retrieval-Augmented Generation) for technical jewelry specs and material properties.

What You’ll Build:

The Vision Pipeline: Tuning multimodal prompts so that Gemini reasons reliably about jewelry-specific spatial structure (prongs, shanks, stone settings).

The Geometry Engine: Converting Gemini’s JSON outputs into 3D meshes that meet real-world casting tolerances.

The “Album” Logic: Implementing an agentic memory system where the AI “remembers” a user’s aesthetic style across an entire collection.
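
To give a flavor of the Geometry Engine problem, here is a minimal sketch of the first step: parsing a model's JSON reply into a typed spec and rejecting anything that could not be cast. The field names, the `RingSpec` type, and the 0.8 mm minimum wall thickness are hypothetical placeholders for illustration, not our actual schema or tolerances.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical minimum wall thickness in millimetres (illustrative value only).
MIN_WALL_MM = 0.8


@dataclass(frozen=True)
class RingSpec:
    """Hypothetical structured output for a plain ring band."""
    inner_diameter_mm: float
    band_width_mm: float
    band_thickness_mm: float


def parse_ring_spec(raw: str) -> RingSpec:
    """Parse a JSON string into a RingSpec, raising on missing or uncastable values."""
    data = json.loads(raw)
    # A missing field raises KeyError; a non-numeric value raises ValueError/TypeError.
    spec = RingSpec(
        inner_diameter_mm=float(data["inner_diameter_mm"]),
        band_width_mm=float(data["band_width_mm"]),
        band_thickness_mm=float(data["band_thickness_mm"]),
    )
    # Every dimension must be strictly positive.
    for name, value in asdict(spec).items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    # Reject bands thinner than the assumed casting minimum.
    if spec.band_thickness_mm < MIN_WALL_MM:
        raise ValueError(
            f"band_thickness_mm {spec.band_thickness_mm} is below {MIN_WALL_MM} mm"
        )
    return spec
```

In a real pipeline a check like this would sit between the model call and mesh generation, so that an invalid spec fails fast instead of producing geometry that fails at the foundry.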

Who You Are:

• You are bored with generic LLM wrappers and want to solve a hard spatial problem.

• You have deep experience with Structured Outputs (JSON) and Function Calling.

• You understand that in the luxury world, a 0.5mm error is the difference between a masterpiece and a failed cast.

• You’ve shipped production-grade AI agents before.

Why Join Us?

We are led by founders with deep domain expertise in the jewelry space. We have the industry DNA; we need your mathematical and architectural brain to make it “production-ready.”

Location: Remote / Atlantic Canada (optional).

Compensation: Open to discussion.

How to Apply: DM me here or send a link to your most impressive Hugging Face Space or GitHub repo that shows off your work with multimodal models.