Principal Data Engineer
- Remote •+9
- 10+ years of experience
- Full Time
About the job
Material Bank is a fast-paced, high-growth technology company that created the world's largest material marketplace for the Architecture and Design industry, providing the fastest and most powerful way to start and manage a design project. Learn more about us at www.materialbank.com or see below.
--
Material Bank is seeking a Principal Data Engineer/Architect with deep expertise in data architecture and data platform design to enable and enhance Material Bank’s internal catalog systems, tailored specifically for eCommerce applications and supporting complex Bill of Materials (BOM) structures. As a Principal Data Architect at Material Bank, you will take a hands-on role as an individual contributor (IC), designing, architecting, and implementing scalable database systems that support a comprehensive material catalog, central to eCommerce operations, and the creation and management of BOMs.
Key Responsibilities:
Architect eCommerce Catalogs: Lead the design and development of robust, scalable database systems that underlie Material Bank's comprehensive material catalog. Focus on developing optimal schemas and attributes that ensure data integrity and efficient retrieval, and that support the detailed specification of materials in BOMs across databases and PIM systems.
Bill of Materials Integration: Develop and enhance data structures that support the creation, management, and retrieval of BOMs. This includes addressing the unique challenges of BOM management, such as ensuring data consistency across multiple components, handling complex hierarchical structures, and linking various materials, components, and attributes required for diverse construction and design projects.
Decentralized Data Ownership and Architecture: Organize data management to ensure that product data remains independent for each business domain. Each domain will have control over its product data. You will create and maintain a platform that enables decentralized data ownership and architecture, ensuring that domain teams can operate independently while preserving data integrity and consistency across the organization.
Self-Serve Data Infrastructure as a Platform: Develop and implement a self-serve data platform that enables domain teams to autonomously create and consume data products. This platform will provide the necessary infrastructure and tools to manage data without requiring specialized technical knowledge, reducing bottlenecks and empowering teams to work independently. You will be a key enabler in the creation and success of this platform.
Hands-On Data Modeling and Development: Engage directly in the creation of detailed data models and the hands-on implementation of scalable, performant systems using tools like DBT, Airflow, and AWS data services such as AWS Glue. You will ensure the integration of data sources and support complex relationships inherent in BOMs.
Ingestion Pipelines: Develop and maintain ingestion pipelines for catalog and BOM data from multiple sources, including third-party data providers, brand partners, and web scraping. Utilize tools such as DBT, Airbyte, Airflow, and AWS Glue to ensure timely and accurate data flow.
Data Governance, Provenance, and Quality: Establish and enforce data governance policies that ensure data quality, integrity, and provenance across catalog and BOM-related data. Develop and implement a Data Quality Matrix to continuously monitor, assess, and improve the accuracy, completeness, and reliability of the data.
Cross-functional Collaboration: Work closely with cross-functional teams, including product managers, engineers, designers, and other technical stakeholders, ensuring that technical specifications are met with high standards of precision and performance.
Continuous Improvement: Drive continuous innovation by refining and optimizing database architectures, introducing new tools and methods, and applying best practices to keep Material Bank at the forefront of eCommerce and BOM management technology.
Success Metrics:
The success of an individual in this role will be measured by the quality and completeness of data, adherence to compliance standards, and the efficiency of data governance processes. Key metrics include data accuracy, consistency, and relevance, as well as the speed and cost-effectiveness of data product development. Success will also be evaluated based on the value generated from data products, user satisfaction, operational efficiency, and the trust and reliability of data systems, ensuring that data governance positively impacts business outcomes and decision-making.
What you'll bring:
10+ years of experience in data architecture and data platform design, with a strong focus on eCommerce and BOM applications.
In-depth knowledge of product catalog design, including taxonomy, attribute management, BOM data integration, and data provenance.
Proven ability to architect and optimize schemas for large-scale databases, particularly in supporting complex BOMs in an eCommerce environment.
Extensive hands-on experience with AWS data services such as AWS Glue, and with tools like DBT, Airbyte, and Airflow, for data modeling, pipeline development, performance optimization, and data provenance.
Proficiency in programming languages commonly used in data engineering and architecture, such as Python, SQL, and Java, with the ability to write efficient, maintainable code that supports large-scale data operations, and experience with Kafka for processing data streams.
Experience in developing and implementing a Data Quality Matrix to continuously monitor and improve data quality across systems.
Experience with decentralized data ownership and architecture, enabling domain teams to manage their own data while maintaining organizational data integrity.
Proven track record in developing self-serve data platforms that empower domain teams to autonomously manage and utilize data products.
Excellent collaboration and communication skills, with the ability to convey complex technical concepts to non-technical stakeholders.
Relevant certifications (e.g., AWS Certified Solutions Architect) are a plus.