Staff Software Engineer, Data Infrastructure
- Full Time
About the job
Join our team at ASAPP, where we're developing transformative Vertical AI designed to improve customer experience. Recognized by Forbes AI 50, ASAPP designs generative AI solutions that transform the customer engagement practices of Fortune 500 companies. With our automation and simplified work processes, we empower people to reach their full potential and create exceptional experiences for everyone involved. Work with our team of talented researchers, engineers, scientists, and specialists to help solve some of the biggest and most complex problems the world is facing.
The Data Engineering team at ASAPP designs, builds and maintains our mission-critical core data infrastructure and analytics platform. Accurate, easy-to-access, and secure data is critical to our natural language processing (NLP) customer interaction platform which interacts with tens of millions of end-users in real-time.
We're looking to hire a skilled engineer to contribute to the software systems that manage our data infrastructure and handle our ever-growing volumes of data and the demands we place on it. Our infrastructure is driven by a complex assortment of code repositories. Much of it was built in Scala, which we are migrating away from in favor of Python and PySpark. The ideal candidate has built, tested, and deployed code, preferably in a production environment; you'll maintain and debug production-level code. As part of our fast-growing Data Engineering team, you'll also play an integral role in shaping the future of our data infrastructure as it applies to improving our existing metric-driven development and machine learning capabilities.
Applicants with all or some relevant combination of the requirements listed below are encouraged to apply. We are able to consider remote and hybrid candidates for this role.
What you'll do
- Build, code, test, deploy and debug production level code
- Contribute directly to the software systems that manage our data infrastructure
- Expand our logging and monitoring processes to discover and resolve anomalies and issues before they become problems
- Develop state-of-the-art automation and solutions in Python, Spark and Flink
- Maintain, manage, and monitor our infrastructure, including Kafka, Kubernetes, Spark, Flink, Jenkins, general OLAP and RDBMS databases, S3 object buckets, and permissions
- Increase the efficiency, accuracy, and repeatability of our ETL processes
- Know how to make the tradeoffs required to ship without compromising quality
What you'll need
- 12+ years of experience in general software development and/or DevOps/SRE roles in AWS
- 5+ years of experience in data engineering, data systems, and pipeline and stream processing
- Expertise in at least one flavor of SQL, e.g. Redshift, Postgres, MySQL, Presto/Trino, Spark SQL, Hive
- Proficiency in one or more high-level programming languages. We use Python, Scala, Java, Kotlin, and Go
- Experience with CI/CD (continuous integration and deployment)
- Experience with workflow management systems such as Airflow, Oozie, Luigi, or Azkaban
- Experience implementing data governance, i.e. access management policies, data retention, IAM, etc.
- Confidence operating in a DevOps-like capacity with AWS, Kubernetes, Jenkins, Terraform, and other declarative infrastructure, with attention to automation, alerting, monitoring, and security
What we'd like to see
- Bachelor's Degree in a field of science, technology, engineering, or math, or equivalent hands-on experience
- Experience in maintaining and managing Kafka (not just using)
- Experience in maintaining and managing OLAP/HA database systems (not just using)
- Familiarity handling Kubernetes clusters for various jobs, apps, and high throughput
- Technical knowledge of data exchange and serialization formats such as Protobuf, Avro, or Thrift
- Experience creating and deploying Spark (Scala) and/or Flink applications