Job Description
About Nebula Dynamics:
We are at the forefront of the cognitive computing revolution. Our mission is to engineer the infrastructure that will define the era of 2026 and beyond. As a high-growth tech unicorn, we are looking for a visionary Principal AI Architect to spearhead our next-generation model deployment and scalable infrastructure strategy. If you are passionate about building the future of AI and want to lead a team that shapes the technological landscape, this is your opportunity.
Why Join Us?
- Work on cutting-edge Generative AI and LLM applications.
- Competitive compensation package and equity options.
- Flexible remote-first culture with a San Francisco office hub.
- Opportunity to define the technical roadmap for 2026 and beyond.
The Role:
We are seeking a seasoned technical leader to design and implement the architectural backbone of our AI systems. You will bridge the gap between theoretical research and production-grade engineering, ensuring our platforms are scalable, secure, and capable of handling massive data throughput.
Responsibilities
- Design and implement scalable, high-performance AI infrastructure that is ready for 2026 and beyond.
- Lead the end-to-end development of Large Language Model (LLM) deployment pipelines and MLOps workflows.
- Collaborate with cross-functional teams including Data Scientists, Product Managers, and Engineers to translate business requirements into technical solutions.
- Define and enforce best practices for code quality, testing, and system reliability.
- Drive innovation in optimization techniques for GPU clusters and distributed computing environments.
- Mentor and guide junior architects and senior engineers to foster a culture of technical excellence.
- Stay ahead of emerging AI trends to ensure our technology stack remains future-proof.
Qualifications
- Master's or Ph.D. in Computer Science, Machine Learning, or a related quantitative field.
- 10+ years of experience in software engineering, with at least 5 years in a senior or principal architecture role.
- Deep expertise in Python, PyTorch, TensorFlow, and other modern deep learning frameworks.
- Proven experience architecting and deploying Large Language Models (LLMs) at scale.
- Strong proficiency in cloud platforms (AWS, GCP, or Azure) and containerization technologies (Kubernetes, Docker).
- Experience with distributed systems, microservices, and high-availability architecture.
- Excellent communication skills with the ability to articulate complex technical concepts to non-technical stakeholders.