Data Engineer - Generative AI Applications - Mid Level
- $150k – $200k • 0.1% – 1.0%
- Remote (onsite optional)
- 3 years of experience
- Full Time
About the job
About wover.ai
Our contextually aware AI empowers employees to be more productive and focused on what really drives your business forward. Applying the latest in generative AI models, wover.ai supercharges your critical data, processes, and decision-making frameworks into seamless automated interactions, freeing your employees to stay hyper-focused on what matters and spend less time on what can be automated.
We are on the lookout for AI superstars to contribute to the development of wover.ai’s innovative AI solutions that redefine how we work and get $#!+ done.
Interested? Come work with us.
About your role
We seek pioneers. If you are the take-a-walk-in-the-woods type or have a healthy appetite for operating at the edge of the possible, and you happen to be an AI researcher interested in Domain-Specific Models (DSMs) and injecting contextuality into AI models, we want to hear from you.
We are seeking highly capable and curious Data Engineers (DEs) to join our Research team, which is laser-focused on building the context-aware, domain-smart AI platform that empowers employees to reach their highest aspirations.
As a wover.ai DE, you will be responsible for developing and implementing cutting-edge solutions that leverage the latest in generative AI and ML and inject contextuality into NLP technologies.
You will collaborate with a multidisciplinary team of researchers, engineers, and data scientists to explore and implement novel techniques that enable NLP models to comprehend and generate contextually relevant responses. This role requires a strong background in NLP, AI, and ML, combined with a passion for pushing the boundaries of language understanding and generation.
You will work closely with cross-functional teams to understand business requirements and available data, translate them into experiments, build the architecture, and train showcase models. You will also work closely with Full Stack Engineers to expose the models as APIs and usable software applications. This is a highly technical Applied Research role that requires expertise in Large Language Models (LLMs) and in all data pipeline tasks.
Role Summary:
As a Data Engineer, you will be part of our growing team of ML experts. As a team leader, you will evolve and optimize our data and data pipeline architecture, as well as optimize data flow and collection for cross-functional teams. You are an expert data pipeline builder and data wrangler who enjoys optimizing data systems and evolving them. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives, and will ensure that an optimal data and model DevOps (DataOps) architecture is applied consistently across ongoing projects. You are self-directed and comfortable supporting the DataOps needs of multiple teams, systems, and products, and you will be responsible for integrating them with the architecture used across the company. The right candidate will be excited by the prospect of optimizing, or even re-designing, our company’s DataOps architecture to support our existing and next generation of ML-driven products and solution initiatives.
Key Responsibilities:
• Lead Data Engineering features and enhancements within a given business area.
• Define the business metrics of success for DE projects and translate them into performance or scalability metrics.
• Design and implement the data platform, and build a reporting and analytics engine/platform.
• Create and maintain an optimal DataOps pipeline architecture for data and models.
• Assemble large, complex data sets that meet functional / non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and cloud-based ‘big data’ technologies from AWS, Azure and others.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Keep data separated and secure across national boundaries through multiple data centers and strategic customers/partners.
• Create toolchains for analytics and data science team members that help them build and optimize our product into an innovative industry leader.
• Work with data and machine learning experts to strive for greater functionality in our data and model life cycle management systems.
• Support DataOps competence build-up across our business units and customer-serving teams.
Key Qualifications:
• Bachelor’s, Master’s, or Ph.D. in Computer Science, Information Systems, Data Science, Artificial Intelligence, Machine Learning, Electrical Engineering, or a related discipline from a reputed institution; First Class, preferably with Distinction.
• 10+ years of overall industry experience, including at least 5 years as a Data Engineer.
• 5+ years of experience in the following:
o Software/tools: Hadoop, Spark, Kafka, etc.
o Relational SQL and NoSQL databases, including Postgres and Cassandra.
o Data and Model pipeline and workflow management tools: Azkaban, Luigi, Airflow, Dataiku, etc.
o Stream-processing systems: Storm, Spark-Streaming, etc.
o Object-oriented/functional scripting languages: Python, Java, Scala (advanced level in at least one).
• Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and seek opportunities for improvement.
• Experience in Data warehouse design and dimensional modeling
• Strong analytic skills related to working with unstructured datasets.
• Experience building processes supporting data transformation, data structures, metadata, dependency and workload management.
• Advanced SQL knowledge and experience with relational databases and query authoring, as well as working familiarity with a variety of other databases and data sources.
• Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
• Experience with Docker containers, orchestration systems (e.g. Kubernetes), continuous integration and job schedulers.
• Familiarity with functional programming and scripting languages such as JavaScript or Go.
• Knowledge of serverless architectures (e.g., AWS Lambda, Kinesis, Glue).
• Experience with cloud native technologies, microservices design and REST APIs.
• Familiarity with agile development and lean principles.
• Contributor to or owner of a GitHub repository.
• Strong project management and interpersonal skills.
• Experience supporting and working with cross-functional teams in a dynamic environment.
• Good communication skills in written and spoken English.
• Creativity and the ability to formulate problems and solve them independently.
• Ability to build and nurture internal and external communities.
• Experience in writing and presenting white papers, journal articles, and technical blog posts on your results.
Additional Requirements:
• Application/domain knowledge in telecommunications and/or IoT is a plus.
• Experience with data visualization and dashboard creation is a plus.
• Ability to work independently with high energy, enthusiasm, and persistence.
• Experience in partnering and collaborative co-creation, i.e., working with complex, multi-stakeholder business units, global customers, and technology and other ecosystem partners in a multicultural, global matrix organization, with sensitivity and persistence.