
Data Engineer
Location: Montreal, Canada

Posted: 2019-08-21
Job Type: Full-time only

Data-Related Responsibilities

  • Architect, engineer, deploy and maintain data pipelines (Airflow DAGs) that are fault tolerant, temporally consistent, idempotent, replayable, and generally awe-inspiring
  • Engineer tested and automated data transformations using PySpark, SQL, and Pandas
  • Ensure the highest standard of data governance by crafting data contracts and service-level agreements and by automating data lineage tracking, data cataloging, and runtime validation
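
Purely for illustration, the kind of idempotent, replayable transformation described above might look like this minimal Pandas sketch (the table, column names, and `transform_orders` helper are all hypothetical):

```python
import pandas as pd

def transform_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate raw order events and aggregate per customer.

    Idempotent: running it twice over the same input yields the same
    result, which is what makes a pipeline task safely replayable.
    """
    # Late-arriving duplicates win: keep the last event per order_id.
    deduped = raw.drop_duplicates(subset="order_id", keep="last")
    return (
        deduped.groupby("customer_id", as_index=False)["amount"]
        .sum()
        .sort_values("customer_id", ignore_index=True)
    )

raw = pd.DataFrame({
    "order_id": [1, 1, 2, 3],            # order 1 appears twice (late event)
    "customer_id": ["a", "a", "a", "b"],
    "amount": [10.0, 12.0, 5.0, 7.0],
})
once = transform_orders(raw)
twice = transform_orders(pd.concat([raw, raw]))  # replaying the same input
assert once.equals(twice)                        # idempotent and replayable
```

Because replaying the same input produces identical output, a backfill or DAG re-run cannot double-count data.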

Technical Responsibilities

  • Ensure high code quality and engineering standards
  • Work with Jenkins for continuous integration and deployment, Docker for containerization, Git for version control, and Kubernetes for orchestration
  • Write infrastructure-as-code scripts with Terraform to support and improve our data lake's AWS infrastructure
  • Engage with technical challenges in the domains of storage, pipelining and schema management
  • Work on problems related to data access and security
  • Collaborate with other teams and contribute code to other technical projects when necessary
  • Provide rigorous code reviews and help manage our repositories
  • Write comprehensive tests and resolve errors in a timely manner
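
The testing and runtime-validation duties above can be sketched with a minimal, hypothetical data-contract check (the `validate` helper, the required columns, and the expected error messages are illustrative assumptions, not a real contract):

```python
import pandas as pd

# Hypothetical data contract: the published table must have these
# columns, and "amount" must never be null.
REQUIRED_COLUMNS = {"customer_id", "amount"}

def validate(frame: pd.DataFrame) -> list[str]:
    """Return a list of contract violations; an empty list means the frame passes."""
    errors = []
    missing = REQUIRED_COLUMNS - set(frame.columns)
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
    elif frame["amount"].isna().any():
        errors.append("null values in 'amount'")
    return errors

good = pd.DataFrame({"customer_id": ["a"], "amount": [10.0]})
bad = pd.DataFrame({"customer_id": ["a"]})
assert validate(good) == []
assert validate(bad) == ["missing columns: ['amount']"]
```

A check like this runs at the end of a pipeline task, failing the run loudly instead of publishing data that breaks a downstream consumer.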

Non-Technical Responsibilities

  • Take end-to-end ownership of data pipelines, ensuring that every stakeholder's business needs are well understood and delivered accordingly
  • Support peers as necessary, both within and outside of your team
  • Act as a subject matter expert for all Data Engineering related matters within the company
  • Mentor peers and contribute meaningfully to the technical culture at the client

Qualifications

  • Relevant academic background and/or verifiable domain expertise in Data Engineering
  • A minimum of 2 years programming experience, preferably in high-level Object-Oriented or Functional languages. Fluency in Python is a major asset
  • Experience working with cloud-based infrastructure and DevOps, AWS-based work experience an asset
  • Extensive experience working with batch data pipelining frameworks such as Airflow or Luigi, experience with stream processing frameworks an asset
  • Deep understanding of data lakes, data warehouses or other analytics solutions
  • Deep understanding of data transformation techniques and ETL scripting, knowledge of Spark and Pandas a strong asset
  • Extensive experience writing and optimizing SQL queries
  • Domain expertise in architecting and maintaining distributed data systems
  • Knowledge of source control with Git, CI/CD pipelining, testing, containerization and orchestrated deployment
  • Experience working in an Agile ecosystem, an asset
  • Strong written and verbal communication skills in English, French an asset
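
As a minimal illustration of the SQL requirement, the query below computes per-customer totals against an in-memory SQLite table (the `orders` schema is a made-up example, not a real data model):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("b", 7.0)],
)

# Per-customer totals, largest first: the kind of aggregate a data
# engineer writes (and later indexes and optimizes) constantly.
rows = conn.execute(
    """
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
    ORDER BY total DESC
    """
).fetchall()
assert rows == [("a", 15.0), ("b", 7.0)]
```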

Soft Skills

  • Highly analytical and detail oriented
  • Creative thinker with excellent problem solving abilities
  • Ability to thrive in a fast-paced, performance-driven environment
  • Team player with solid interpersonal skills
