Work with the brightest minds at one of the largest financial institutions in the world. This is a long-term contract opportunity that includes a competitive benefits package!
Our client has been around for over 150 years and is continuously innovating in today's digital age. If you want to work for a company that is not only a household name, but also truly cares about satisfying customers' financial needs and helping people succeed financially, apply today.
Position: Big Data Engineer
Location: Philadelphia, Pennsylvania 19130
Term: 12 months
Responsibilities:
- Lead complex initiatives on selected domains.
- Ensure systems are monitored to increase operational efficiency and managed to mitigate risk.
- Define opportunities to maximize resource utilization and improve processes while reducing cost.
- Lead, design, develop, test and implement applications and system components, tools and utilities, models, simulation, and analytics to manage complex business functions using sophisticated technologies.
- Resolve coding, testing and escalated platform issues of a technically challenging nature.
- Lead team to ensure compliance and risk management requirements for supported area are met and work with other stakeholders to implement key risk initiatives.
- Mentor less experienced software engineers.
- Collaborate with and influence professionals at all levels, including managers.
- Lead team to achieve objectives.
- Partner effectively with production support and platform engineering teams.
Is this a good fit? (Requirements):
- 7+ years of Big Data Platform (data lake) and data warehouse engineering.
- Very strong Python skills and strong functional programming skills.
- 5+ years of experience with the Hadoop stack: HDFS, Hive, SQL, Spark, Spark Streaming, Spark SQL, HBase, Kafka, Sqoop, Atlas, Flink, Cloudera Manager, Airflow, Impala, Tez, Hue, and a variety of source data connectors.
- Hands-on experience designing and building microservices, APIs, and MySQL databases.
- Hands-on experience building modern, resilient, and secure data pipelines, including movement, collection, integration, and transformation of structured and unstructured data, with built-in automated data controls, built-in logging/monitoring/alerting, and pipeline orchestration managed to operational SLAs.
- Hands-on experience developing and managing technical and business metadata.
- Hands-on experience creating and managing time-series data from full data snapshots or incremental data changes.
- Hands-on experience implementing fine-grained access controls, such as attribute-based access control (ABAC), using Apache Ranger.
- Experience automating data quality (DQ) validation in data pipelines.
- Experience implementing automated data change management for both code and schema, including versioning, QA, CI/CD, and rollback processing.
- Experience automating the end-to-end data lifecycle in the big data ecosystem.
- Experience with managing automated schema evolution within data pipelines.
- Experience with implementing masking and/or other forms of obfuscating data.
- Experience automating data pipelines to leverage static and operational metadata.
- Experience developing automated solutions for Data Registration and Classification of technical metadata.
- Advanced understanding of SQL and NoSQL DB schemas.
- Advanced understanding of partitioned Parquet, ORC, and Avro, as well as various compression formats.
- Experience with any one of the following: Ansible, Chef, Puppet, Python, or Linux scripting.
- Experience developing automation for DevOps-style data pipeline deployments and rollbacks.
- Experience developing containerized microservices and APIs.
- Understanding of Data Governance policies and standards.
- Google Cloud data services experience (bonus).
- Familiarity with key concepts implemented by Apache Hudi, Apache Iceberg, or Databricks Delta Lake.
- 5+ years of Specialty Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.