Data Engineer
- Full Time
About the job
We believe small businesses are at the heart of our communities, and championing them is worth fighting for. We empower small business owners to manage their finances fearlessly by offering the simplest, all-in-one financial management solution they can't live without.
As a Data Engineer reporting to the Senior Manager of AI & Data Platform, you will build tools and infrastructure that support the Data Products and Insights & Innovation teams, and the business as a whole.
We’re looking for a talented, curious self-starter who is driven to solve complex problems and can juggle multiple domains and stakeholders. This highly technical individual will collaborate with all levels of the Data and AI team as well as the various engineering teams to develop data solutions, scale our data infrastructure and advance Wave to the next stage in our transformation as a data-centric organization.
This role is for someone with proven experience in complex product environments. Strong communication skills are a must to bridge the gap between technical and non-technical audiences across a spectrum of data maturity.
Here’s How You Make an Impact:
- You’re a builder. You’ll be responsible for designing, building and deploying the components of a modern data stack, including CDC ingestion (using Debezium), a centralized Hudi data lake, and a variety of batch, incremental and stream-based pipelines (see the illustrative sketch after this list).
- You’ll make things better. You enjoy the challenge of helping build and manage a fault-tolerant data platform that scales economically, while balancing innovation with operational stability by maintaining legacy Python ELT scripts and accelerating the transition to dbt models in Redshift.
- You’re all about collaboration and relationships. You will work within a cross-functional team to plan and roll out data infrastructure and processing pipelines that serve workloads across analytics, machine learning and GenAI services. You enjoy working with different teams across Wave and helping them succeed by ensuring that their data, analytics, and AI insights are reliably delivered.
- You’re self-motivated and can work autonomously. We count on you to thrive in ambiguous conditions by independently identifying opportunities to optimize pipelines and improve data workflows under tight deadlines.
- You will resolve and mitigate incidents. You will respond to PagerDuty alerts and proactively implement monitoring solutions to minimize future incidents, ensuring high availability and reliability of data systems.
- You're a strong communicator. As a data practitioner, you’ll have people coming to you for technical assistance, and your outstanding ability to listen and communicate will reassure them as you help address their concerns.
- You love helping customers. You will assess existing systems, optimize data accessibility, and provide innovative solutions to help internal teams surface actionable insights that enhance external customer satisfaction.
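
For illustration only: a minimal sketch of the CDC-to-Hudi pattern referenced in the first bullet above, assuming a PySpark batch job, a hypothetical Debezium topic for an `invoices` table, and placeholder MSK and S3 locations (none of these names come from the posting).

```python
# Illustrative sketch only: upsert simplified Debezium CDC records from Kafka (MSK)
# into an Apache Hudi table on S3 with PySpark. All names and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import LongType, StringType, StructField, StructType

spark = (
    SparkSession.builder
    .appName("cdc-to-hudi-sketch")
    # Assumes the Hudi Spark bundle is already on the cluster classpath.
    .getOrCreate()
)

# Simplified record schema; a real Debezium envelope carries before/after images and metadata.
schema = StructType([
    StructField("id", StringType()),
    StructField("amount", StringType()),
    StructField("updated_at", LongType()),
])

# Batch-read a slice of CDC events from a Kafka topic (broker address is a placeholder).
raw = (
    spark.read.format("kafka")
    .option("kafka.bootstrap.servers", "msk-broker:9092")
    .option("subscribe", "dbserver.public.invoices")
    .load()
)
events = raw.select(from_json(col("value").cast("string"), schema).alias("r")).select("r.*")

# Upsert into a Hudi table keyed on `id`, using `updated_at` to resolve late-arriving records.
hudi_options = {
    "hoodie.table.name": "invoices",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.operation": "upsert",
}
(
    events.write.format("hudi")
    .options(**hudi_options)
    .mode("append")
    .save("s3://example-data-lake/invoices/")
)
```

In practice this would more likely run as an incremental or streaming job with schema handling, partitioning, and monitoring around it; the sketch shows only the core write path.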
You Thrive Here By Possessing the Following:
- Data Engineering Expertise: Bring 3+ years of experience in building data pipelines and managing a secure, modern data stack. This includes CDC streaming ingestion using tools like Debezium into a Hudi data lake that supports AI/ML workloads and a curated Redshift data warehouse.
- AWS Cloud Proficiency: At least 3 years of experience working with AWS cloud infrastructure, including Kafka (MSK), Spark / AWS Glue, and infrastructure as code (IaC) using Terraform.
- Strong Coding Skills: Write and review high-quality, maintainable code that enhances the reliability and scalability of our data platform. We use Python, SQL, and dbt extensively, and you should be comfortable leveraging third-party frameworks to accelerate development.
- Data Lake Development: Prior experience building data lakes on S3 using Apache Hudi with Parquet, Avro, JSON, and CSV file formats.
- Workflow Automation: Build and manage multi-stage workflows using serverless Lambdas and AWS Step Functions to automate and orchestrate data processing pipelines (a brief sketch follows this list).
- Data Governance Knowledge: Familiarity with data governance practices, including data quality, lineage, and privacy, as well as experience using cataloging tools to enhance discoverability and compliance.
- CI/CD Best Practices: Experience developing and deploying data pipeline solutions using CI/CD best practices to ensure reliability and scalability.
- Data Integration Tools: Working knowledge of tools such as Stitch and Segment CDP for integrating diverse data sources into a cohesive ecosystem.
- Analytical and ML Tools Expertise: Knowledge and practical experience with Athena, Redshift, or SageMaker Feature Store to support analytical and machine learning workflows is a definite bonus!
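
For illustration only: a minimal sketch of the Lambda-plus-Step Functions orchestration pattern mentioned in the Workflow Automation bullet, assuming a hypothetical state machine ARN supplied via an environment variable and an EventBridge-style S3 event (all names are placeholders, not details from the posting).

```python
# Illustrative sketch only: a Lambda handler that starts a Step Functions execution
# for a data-processing workflow. ARNs, env vars, and event shape are assumptions.
import json
import os

import boto3

sfn = boto3.client("stepfunctions")


def handler(event, context):
    """Triggered (e.g. by an EventBridge rule on S3 object creation) to start a pipeline run."""
    object_key = event.get("detail", {}).get("object", {}).get("key", "")
    execution = sfn.start_execution(
        stateMachineArn=os.environ["PIPELINE_STATE_MACHINE_ARN"],  # placeholder env var
        input=json.dumps({"source_key": object_key}),
    )
    return {"executionArn": execution["executionArn"]}
```

The state machine itself would typically be defined in Terraform alongside the Lambda, chaining Glue or dbt steps with retries and failure notifications; this sketch covers only the trigger.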
About the company
- B2B
- Scale Stage: Rapidly increasing operations
- Top Investors: This company has received a significant amount of investment from top investors