Modern Data Quality Platform
  • B2B
  • Early Stage
    Startup in initial stages

Lead Data Engineer

  • Remote • South Pasadena
  • 6 years of experience
  • Full Time
Posted: 3 years ago
Visa Sponsorship

Not Available

Hires remotely in
Relocation
Allowed
Skills
Python
Java
MongoDB
Scala
MSSQL
Snowflake
Microsoft SQL Server
Kafka
Amazon Redshift
Amazon Kinesis
Druid
Kubernetes
Hadoop/Hive/Spark/Scala/MLlib
Apache Spark/Databricks
Flink
Apache Airflow
Apache Pulsar

About the job

We are looking for an individual who will bring expertise in a wide variety of big data processing frameworks (both open source and proprietary), large-scale database systems (Big Data, OLAP, and OLTP), stream data processing, API development, machine learning operationalization, and cloud automation to build and support all the data needs across our data platform.

Responsibilities

Design and develop the data platform to efficiently and cost-effectively address various data needs across the business.
Build software across our entire data platform, including event-driven data processing, storage, and serving through scalable and highly available APIs, using cutting-edge technologies.
Ensure performance isn’t a weakness by implementing and refining robust data processing, REST services, RPC (in and out of HTTP), and caching technologies.
Build processes and tools to maintain machine learning pipelines in production.
Develop and enforce data engineering, security, and data quality standards through automation.
Participate in supporting the data platform 24x7.

Qualifications

Bachelor’s degree in Computer Science or a similar discipline.
6+ years of experience in software engineering.
3+ years of experience in data engineering.
Ability to work in a fast-paced, high-pressure, agile environment, and willingness to learn new technologies and apply them at work to stay ahead of the curve.
Expertise in at least a few programming languages, such as Java, Scala, Python, or similar.
Expertise in building and managing large-volume data processing platforms (both streaming and batch) is a must.
Expertise in stream processing systems such as Kafka, Kinesis, Pulsar, or similar.
Expertise in building microservices and managing containerized deployments, preferably using Kubernetes.
Expertise in distributed data processing frameworks such as Apache Spark, Databricks, Flink, or similar.
Expertise in SQL, Spark SQL, Hive, etc.
Expertise in OLAP databases such as MSSQL, Snowflake, or Redshift.
NoSQL experience (MongoDB or similar) is a plus.
Experience in operationalizing and scaling machine learning models is a huge plus.
Experience with a variety of data tools and frameworks (e.g., Apache Airflow, Druid) is a huge plus.
Strong interpersonal, communication, and presentation skills.
Strong team focus with outstanding organizational and resource management skills.

About the company

Modern Data Quality Platform
11-50 Employees
  • B2B
  • Early Stage
    Startup in initial stages

Founders

Raj Joseph
Founder • 3 years