Staff Data Engineer
Job Description
The Staff Data Engineer develops scalable data architectures and pipelines for eSimplicity's government projects, collaborating with cross-functional teams to enhance data delivery and optimize existing systems.
Responsibilities:
- Identify and own all technical solution requirements in developing enterprise-wide data architecture.
- Create project-specific technical designs, product and vendor selections, and application and technical architectures.
- Provide subject matter expertise on data and data pipeline architecture, and lead the decision process to identify the best options.
- Own complex data architectures, with an eye toward constant reengineering and refactoring to keep each system as simple and elegant as possible while accomplishing the desired need.
- Ensure strategic alignment of technical design and architecture with business growth and direction, and stay on top of emerging technologies.
- Develop and manage product roadmaps, backlogs, and measurable success criteria, and write user stories.
- Expand and optimize our data and data pipeline architecture, and optimize data flow and collection for cross-functional teams.
- Support software developers, database architects, data analysts, and data scientists on data initiatives, and ensure that the optimal data delivery architecture is consistent throughout ongoing projects.
- Create new pipelines and maintain existing ones; update the Extract, Transform, Load (ETL) process; develop new ETL features; and build proofs of concept (PoCs) with Redshift Spectrum, Databricks, and similar tools (a minimal ETL sketch follows this list).
- Implement, with the support of project data specialists, large-dataset engineering: data augmentation, data quality analysis, data analytics (anomalies and trends), data profiling, data algorithms, and the measurement and development of data maturity models; develop data strategy recommendations.
- Assemble large, complex data sets that meet non-functional and functional business requirements.
- Identify, design, and implement internal process improvements, including re-designing data infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies.
- Build analytical tools that use the data pipeline to provide actionable insight into key business performance metrics, including operational efficiency and customer acquisition.
- Work with data, design, product, and government stakeholders, and assist them with data-related technical issues.
- Write unit and integration tests for all data processing code.
- Work with DevOps engineers on continuous integration (CI), continuous delivery (CD), and infrastructure as code (IaC).
- Read specs and translate them into code and design documents.
- Perform code reviews and develop processes for improving code quality.
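The ETL responsibilities above, in outline: extract raw records from a source, transform them, and load the result into a target store, with unit tests over the transformation logic. Below is a minimal sketch in Python against an in-memory SQLite target; the file name, columns, and table are hypothetical illustrations, not taken from this posting.

```python
"""Minimal ETL sketch: extract CSV rows, transform them, load into SQLite.

Illustrative only -- the file name, columns, and target table are
hypothetical and do not come from the job description.
"""
import csv
import sqlite3


def extract(path):
    # Read raw records from a source CSV file.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows):
    # Example transformation: drop rows missing a required id and
    # normalize the 'state' column to uppercase.
    return [
        {**row, "state": row["state"].strip().upper()}
        for row in rows
        if row.get("id")
    ]


def load(rows, conn):
    # Write curated rows into the target table.
    conn.executemany(
        "INSERT INTO claims (id, state) VALUES (?, ?)",
        [(row["id"], row["state"]) for row in rows],
    )
    conn.commit()


def test_transform_normalizes_state():
    # Unit test over the transformation logic (runnable with pytest),
    # in the spirit of the testing responsibility above.
    assert transform([{"id": "1", "state": " md "}])[0]["state"] == "MD"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE claims (id TEXT, state TEXT)")
    load(transform(extract("claims.csv")), conn)
```

Keeping extract, transform, and load as separate functions is what makes the transformation step unit-testable in isolation, as the test function shows.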
Requirements:
- All candidates must pass a Public Trust clearance through the U.S. Federal Government.
- Bachelor’s degree in Computer Science, Engineering, or a related technical field; in lieu of a degree, 10 additional years of relevant professional experience and 8 years of specialized experience may be substituted.
- 8+ years of total professional experience in the technology or data engineering field.
- Extensive data pipeline experience using Python, Java, and cloud technologies.
- Expert data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
- Self-sufficient and comfortable supporting the data needs of multiple teams, systems, and products.
- Experienced in designing data architecture for shared services, scalability, and performance.
- Experienced in designing data services including API, metadata, and data catalog.
- Experienced in data governance processes for ingesting (batch and stream), curating, and sharing data with upstream and downstream data users.
- Ability to build and optimize data sets, ‘big data’ pipelines, and architecture.
- Ability to perform root cause analysis on external and internal processes and data to identify opportunities for improvement and answer questions.
- Excellent analytical skills for working with unstructured datasets.
- Ability to build processes that support data transformation, workload management, data structures, dependency, and metadata.
- Demonstrated understanding of, and experience with, software and tools across the data stack (see the Airflow sketch after this list):
  - Big data tools such as Kafka, Spark, and Hadoop.
  - Relational SQL and NoSQL databases, including Postgres and Cassandra.
  - Workflow management and pipeline tools such as Airflow, Luigi, and Azkaban.
  - AWS cloud services, including Redshift, RDS, EMR, and EC2.
  - Stream-processing systems such as Spark Streaming and Storm.
  - Object-oriented and functional programming languages, including Scala, C++, Java, and Python.
- Flexible and willing to accept a change in priorities as necessary.
- Ability to work in a fast-paced, team-oriented environment.
- Experience with Agile methodology and test-driven development.
- Experience with Atlassian Jira/Confluence.
- Excellent command of written and spoken English.
- Ability to obtain and maintain a Public Trust clearance; must reside in the United States.
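As a concrete reference point for the workflow-management tools listed above, here is a minimal sketch of a daily batch DAG in Apache Airflow 2.x; the DAG id, task names, and callables are hypothetical placeholders, not part of this posting.

```python
"""Minimal sketch of a daily batch DAG in Apache Airflow 2.4+.

The DAG id, task names, and callables are hypothetical placeholders;
a real pipeline would call project-specific code.
"""
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from the source system")


def transform():
    print("clean and reshape the extracted data")


def load():
    print("write curated data to the warehouse")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # 'schedule_interval' on Airflow versions before 2.4
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Declare task ordering: extract, then transform, then load.
    t_extract >> t_transform >> t_load
```

A real DAG would replace the placeholder callables with project-specific extract, transform, and load code.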
Benefits:
- Full healthcare benefits.