Deep Learning Engineer

$60k – $140k • 0.0% – 0.5% | Remote • New York City | No experience required | Full Time

Reposted: 3 months ago

Visa Sponsorship

Available

Remote Work Policy

Onsite or remote

Hires remotely in

North America

Preferred Timezones

Eastern Time

Relocation

Allowed

Skills

Machine Learning

Artificial Intelligence

Research

Deep Learning

NLP

About the job

Vision

Stochastic’s vision is to build an efficient AI system in which everyone has access to personalized AI that maximizes their productivity and creativity. Just as computers evolved from centralized, enterprise-only form factors to personal computers, we believe in a future of personalized AI that can help everyone with their day-to-day work. The currently popular approach of scaling language models ever larger, taken by a few companies, centralizes AI power, neither protects privacy nor leverages individuality, and further accelerates carbon emissions. By focusing on more efficient language models that self-improve on user data and interactions, which we call Evolutionary Language Models, we plan to deliver a truly personalized AI that will be your best partner.

Team

Stochastic was founded by Harvard University AI systems researchers who built the world's first Bayesian and LLM inference accelerators and a real-time speech NLP engine that ran on the edge. They have been joined by AI researchers and engineers with a passion for making AI more accessible to everyone. Our recent research includes latency-optimized transformer architectures, quantized parameter-efficient fine-tuning, and sparsity-aware throughput maximization on GPUs.

Business

Stochastic serves a diverse range of clients, including Fortune 500 companies and one of the world's largest asset managers. Our proprietary technology, xChat, streamlines the creation of customized LLM chatbots for both automating customer support and enhancing internal knowledge management, offering the industry's most cost-effective solutions.

Role

We are looking for Machine Learning Engineers who are interested in implementing the best optimization techniques on state-of-the-art ML models. You should have a strong interest in solving the challenges of accelerating ML models. You are research-oriented, deeply knowledgeable about best practices in your field, and highly self-motivated and self-directed.

As a Machine Learning Engineer, you will:

Help design and build the tech stack to ensure high system scalability and reliability
Fine-tune, accelerate, and deploy LLMs in our existing pipelines
Conduct research and experiments on the latest fine-tuning and acceleration techniques
Manage the specification, development, testing, and release of new features
Provide support for strategic customers on deployment and scalability issues
Support strategic planning of xChat and xCloud, the two main products of Stochastic

You are a good fit if you have:

Degree in Computer Science/Machine Learning/Statistics
Experience with RAG systems
Experience with Python and MongoDB
Experience fine-tuning Deep Learning models with the PyTorch and Transformers libraries
Experience deploying Deep Learning models in production environments
Experience with at least one of the main public cloud providers (AWS, Azure, or GCP)
Experience with Kubernetes

Strong Pluses: