SF Office | USA | Full-time | ashby | 2026-06-16
Why this is a real AI job: The role is explicitly focused on designing, building, and evaluating LLM-powered systems for a core healthcare application. The job description heavily emphasizes LLM APIs, agentic workflows, and model evaluation, indicating a primary focus on AI/ML.
ABOUT ABRIDGE Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation efficiencies while enabling clinicians to focus on what matters most—their patients. Our e…
SF Office | USA | Full-time | ashby | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building and scaling a GenAI platform for healthcare, with core responsibilities revolving around LLMs, agent orchestration, and evaluation. The job description heavily emphasizes AI/ML concepts and technologies.
ABOUT ABRIDGE Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation efficiencies while enabling clinicians to focus on what matters most—their patients. Our e…
India Office | USA | greenhouse | 2026-06-16
Why this is a real AI job: The role is explicitly focused on bringing up and optimizing ML frameworks and models on specialized hardware. The responsibilities directly involve core AI/ML tasks like model architecture translation, compiler optimizations, and performance tuning. The requ…
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training…
San Francisco | USA | Full-time | ashby | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building and optimizing infrastructure for large-scale LLM inference, a core AI task. The description heavily emphasizes AI/ML technologies and their application.
About Anyscale At Anyscale (https://www.anyscale.com/), we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray (https://docs.ray.io/en/latest/), a popular open-source project that's creating an ecosystem of li…
New York, San Mateo | USA | greenhouse | 2026-06-16
Why this is a real AI job: The role explicitly focuses on developing, fine-tuning, and operationalizing machine learning models, with a strong emphasis on generative AI and LLM inference. The responsibilities are heavily centered around AI/ML engineering tasks.
About Us: At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge in…
San Francisco | USA | Full-time | ashby | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building, scaling, and optimizing LLM inference workloads. The team is a 'Forward Deployed Engineering' team working directly with customers on AI deployments. The requirements clearly state experience with LLMs and ML infere…
ABOUT BASETEN Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the fronti…
Redwood City, CA | USA | Full-time | ashby | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building and maintaining infrastructure for ML research and serving, with a strong emphasis on large language models and GPU utilization. The requirements and responsibilities directly relate to core AI/ML engineering tasks.
ABOUT THE ROLE We’re looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research. Responsibilities:
- Provide infrastructure support to our ML research and product
- Build tooling to diagnose cluster issues…
San Francisco | USA | greenhouse | 2026-06-16
Why this is a real AI job: The role explicitly focuses on developing systems for LLM inference and fine-tuning, requiring deep expertise in ML and related technologies. The company is a research-driven AI company.
About the Role Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models from simple models up to the lar…
San Francisco | USA | greenhouse | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building and optimizing AI inference systems for large language models. The responsibilities and requirements heavily emphasize ML engineering, performance optimization, and working with cutting-edge AI technologies.
About the Role Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and…
San Francisco | USA | greenhouse | 2026-06-16
Why this is a real AI job: The role is explicitly focused on building and optimizing the model serving layer for voice applications, working with state-of-the-art voice models and inference engines. The responsibilities are heavily centered around ML engineering tasks.
About the Role Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability. We're looking for a…