Junior Data Engineer
- ₹11L – ₹16L • No equity
- Remote
- 1 year of experience
- Full Time
Not Available
Remote only
About the job
Century Health is at the forefront of transforming patient care through cutting-edge technology. Our mission is to accelerate patient access to breakthrough treatments by harnessing the power of AI to analyze real-world clinical data. By joining us, you become part of a dynamic team dedicated to building a marketplace for real-world clinical data.
We are seeking a highly skilled and motivated junior data engineer to join our growing team! This role is crucial for developing and optimizing our data pipelines and for ensuring data quality and accessibility for advanced analytics and AI models. The ideal candidate will have a strong background in data engineering, with proven experience in data pipelining and orchestration, hands-on use of data tools such as SQL, dbt, Spark, and Python, and familiarity with cloud infrastructure and database systems. Experience building and fine-tuning LLMs for different use cases (text abstraction, Text2SQL, data modeling) is a big plus.
As one of the first hires at our fast-growing startup, you will be given a lot of responsibility and the opportunity to shape our product and data architecture from the ground up! You will also receive direct mentorship from the CTO, work closely with our full-stack and data engineering team, and participate fully in all team events.
Key responsibilities:
- Design, build, and maintain efficient, reliable, and scalable data pipelines, from raw data ingestion to data cleaning to managing front-end outputs
- Implement data transformation workflows using SQL and dbt, with Airflow for orchestration
- Develop and optimize data processing tasks using Python and Spark
- Leverage AWS cloud services to enhance our data infrastructure's scalability and performance
- Work with database management systems such as PostgreSQL
- Collaborate with full-stack engineers to manage hand-off between data insights and front-end visualization
- Maintain high-quality data governance and security practices
- Experience with building, fine-tuning, and deploying LLMs is extremely valuable.
Skills Required:
SQL, dbt, Python, Spark, AWS, PostgreSQL