Overview
We are seeking a Data & Software Engineer to work with a small team building complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale. The candidate must have experience with the following:
What will you do?
* Building end-to-end data pipelines leveraging Python
* Using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs
* Containerizing and deploying applications in cloud environments like AWS
* Working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex, analytical workloads
* Leveraging industry standard tools for code control (Git, IaaC control, etc.)
* Working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial
* Using Bash scripting for automation and data processing tasks
* Integrating AI/ML services and models
* Work with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
* Leverage strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks
* Leverage a background in large-scale data migration or platform modernization efforts
* Contribute to data engineering documentation, best practices, and design patterns
Do you have what it takes?
- Active TS/SCI W/ Polygraph required.
- Bachelor's degree in Computer Science, Engineering, Finance, or a related technical field, or equivalent practical experience.
- Minimum of 5 years' experience with:
  * Apache Spark & PySpark
  * Advanced Python skills (including Pandas & NumPy)
  * Docker, Podman
  * AWS S3, Lambda & Step Functions
  * Apache Iceberg, Airflow, etc.
  * SQL (with Trino)
  * NoSQL, DynamoDB
  * Unity Catalog OSS, Apache Polaris
  * Apache Superset
  * Terraform or CloudFormation
  * OpenLineage
  * H3, PostGIS