About the role

Data Engineer, 3-5 Years, Full-Time

Who are we?

We are Spyne, redefining how cars are marketed and sold with cutting-edge Generative AI. What started as a bold idea (using AI-powered visuals to help dealers sell online faster) has evolved into a full-fledged AI-first automotive retail ecosystem.

Backed by $16M in Series A funding from Vertex Ventures, Accel, and other top investors, we're scaling fast:

✔ Expanded across the US & EU markets
✔ Launched industry-first AI-powered Image & 360° solutions
✔ Achieved a 5× revenue surge in 15 months, aiming for 3–4× growth this year 🚀

Know Our Journey

2020: Launched as a visual merchandising platform
2023: Pivoted to AI-driven automotive retail solutions
2024: Achieved 5× revenue growth in 15 months, aiming for 3–4× more
Today: Driving the GenAI revolution with AI-powered sourcing, pricing, CRM, and Agentic AI for dealerships

👉 Read more about us:

Studio AI Product
Vini AI Product
Series A Announcement on YourStory
Series A Coverage in Economic Times
Autocar Pro News

What Are We Looking For?

We're seeking a highly skilled Data Engineer to establish and own Spyne's first dedicated data engineering function. As Spyne AI accelerates market penetration across US rooftops and processes massive, high-velocity streams of unstructured computer vision payloads (Studio AI) and conversational state events (Vini AI), our data warehousing volume and complexity have scaled exponentially.

This is not a maintenance role. You will actively restructure our entire data platform, taking full ownership from our DevOps team and raising our data infrastructure to true tech-industry standards. You'll build the foundational data layer that powers our BI platforms, ML observability, and long-term analytics roadmap.

📍 Location: Gurugram (Work from Office, 5 days a week)
🖥 Role: Full-Time, Data Engineer

What Will You Do?

Data Warehousing Architecture & Modeling: Design, scale, and own our core enterprise Data Warehouse built on ClickHouse Cloud, implementing robust data modeling methodologies, efficient time-based partitioning, and the centralized foundational data layer that powers all downstream BI platforms.

CDC & Schema Evolution: Spearhead our transition from self-hosted Debezium/Kafka to managed ClickPipes; design performant ELT pipelines using ClickHouse Materialized Views, JSONExtract, and arrayJoin functions to parse deep, complex MongoDB Atlas JSON arrays into clean, flattened analytical tables.

Advanced ClickHouse Engine Tuning: Manage SharedReplacingMergeTree tables partitioned by time; handle complex edge cases including cross-partition physical deletions (MongoDB tombstone events) and eliminate Cartesian explosions during array joins and LEFT JOIN operations.

Event Sourcing for ML Pipelines: Maintain and optimize our append-only observability architecture (SQS → Lambda → ClickHouse Async Inserts) to track GPU and CPU ML workloads orchestrated via AWS Step Functions and AWS Batch, leveraging AggregatingMergeTree and anyLast state combinators to unify partial state updates.

Performance Optimization & OOM Prevention: Troubleshoot and optimize heavy analytical queries to prevent concurrent Out-Of-Memory crashes when Metabase dashboards fire heavy models simultaneously; aggressively push down filters and leverage GLOBAL IN hash-lookups to eliminate broadcast overhead.

AWS Data Networking: Navigate secure cross-account data transit within public-internet-denied cloud perimeters using AWS PrivateLink, VPC Lattice Service Networks, and MSK Multi-VPC connectivity secured via IAM authentication.

Data Platform Ownership: Define and drive our long-term data warehousing roadmap; document architecture decisions, establish data quality standards, and reduce the workload currently falling on the DevOps team.

Collaboration: Partner closely with ML Engineering, Product, and DevOps teams to ensure data pipelines are reliable, observable, and aligned with evolving product

