Data Engineer
Must have:
- Date of birth
- Full resume
- Fully updated LinkedIn profile
Overview:
We are seeking an experienced data engineer to deliver high-quality, scalable data solutions on Databricks and AWS for one of our Big Four clients. You will build and optimize pipelines, implement medallion architecture, integrate streaming and batch sources, and enforce strong governance and access controls to support analytics and ML use cases.
Key Responsibilities:
- Build and Maintain Data Pipelines: Develop scalable data pipelines using PySpark and Spark within the Databricks environment.
- Implement Medallion Architecture: Design workflows using raw, trusted, and refined layers to drive reliable data processing.
- Integrate Diverse Data Sources: Ingest data from Kafka streams, batch extract channels, and APIs.
- Data Cataloging and Governance: Model and register datasets in enterprise data catalogs, ensuring robust governance and accessibility.
- Access Control: Manage secure, role-based access patterns to support analytics, AI, and ML needs.
- Team Collaboration: Work closely with peers to achieve required code coverage and deliver high-quality, well-tested solutions.
- Optimize and Operationalize: Tune Spark jobs (partitioning, caching, broadcast joins, AQE), manage Delta Lake performance (Z-Ordering, OPTIMIZE, VACUUM), and implement cost and reliability best practices on AWS.
- Data Quality and Testing: Implement data quality checks and validations (e.g., Great Expectations, custom PySpark checks), unit/integration tests, and CI/CD for Databricks Jobs/Workflows.
- Infrastructure as Code: Provision and manage Databricks and AWS resources using Terraform (workspaces, clusters, jobs, secret scopes, Unity Catalog objects, S3, IAM).
- Monitoring and Observability: Set up logging, metrics, and alerts (CloudWatch, Datadog, Databricks audit logs) for pipelines and jobs.
- Documentation: Produce clear technical documentation, runbooks, and data lineage for governed datasets.
Required Skills & Qualifications:
- Databricks: 6-9 years of experience with expert-level proficiency
- PySpark/Spark: 6-9 years of advanced hands-on experience
- AWS: 6-9 years of experience with strong competency, including S3 and Terraform for infrastructure-as-code
- Data Architecture: Solid knowledge of the medallion pattern and data warehousing best practices
- Data Pipelines: Proven ability to build, optimize, and govern enterprise data pipelines
- Delta Lake and Unity Catalog: Expertise in Delta Lake internals, time travel, schema evolution/enforcement, and Unity Catalog RBAC/ABAC
- Streaming: Hands-on experience with Spark Structured Streaming, Kafka, checkpointing, exactly-once semantics, and late-arriving data handling
- CI/CD: Experience with Git-based workflows and CI/CD for Databricks (e.g., Databricks Repos, dbx, GitHub Actions, Azure DevOps, or Jenkins)
- Security and Compliance: Experience with IAM, KMS, encryption, secrets management, token/credential rotation, and PII governance
- Performance and Cost: Demonstrated ability to tune Spark jobs and optimize Databricks cluster configurations and AWS usage for cost and throughput
- Collaboration: Experience working in Agile/Scrum teams, peer reviews, and achieving code coverage targets
Preferred Skills & Qualifications:
- Certifications: Databricks Data Engineer Professional, AWS Solutions Architect/Developer, HashiCorp Terraform Associate
- Data Catalogs: Experience with enterprise catalogs such as Collibra or Alation, and lineage tooling such as OpenLineage
- Orchestration: Databricks Workflows and/or Airflow
- Additional AWS: Glue, Lambda, Step Functions, CloudWatch, Secrets Manager
- Testing: pytest, chispa, Great Expectations, dbx test
- Domain Experience: Analytics and ML feature pipelines, MLOps integrations