Data Engineer
Samba TV
We are seeking a skilled Data Engineer to join our Data Technology team in Warsaw. Our team builds and maintains the data platform that powers the entire organization: from ingestion and analytics to reporting, and from viewership and contextual datasets to the scalable applications that enable data-driven decision making. You will contribute to the design, development, and maintenance of our data infrastructure, working primarily with AWS, Databricks, BigQuery, and Snowflake.
At this level, you are a self-sufficient contributor who takes clear ownership of well-scoped pipeline components and features, collaborates effectively with teammates and cross-functional stakeholders, and begins developing the breadth to navigate the full data lifecycle. You bring 2–4 years of hands-on experience, write production-quality code, and are ready to grow toward greater technical autonomy.
What You'll Do
- Design, build, and maintain scalable data pipelines supporting both internal and external data consumers, using Apache Spark (PySpark), Airflow, Databricks, and BigQuery/Snowflake
- Develop and optimize data transformations for large-scale datasets, applying modern table formats such as Delta Lake and Iceberg (illustrated in the first sketch after this list)
- Own and operate Airflow DAGs and orchestration workflows, ensuring reliable and timely delivery of data products (second sketch below)
- Participate in the modernization of data frameworks and integrations across Databricks and BigQuery environments
- Build and integrate data validation and quality assurance tooling using frameworks such as Great Expectations or similar (third sketch below)
- Implement monitoring, logging, and alerting for data workflows to ensure production reliability
- Debug and resolve pipeline issues across distributed environments, including cloud storage (AWS S3/GCS), databases, and orchestration tools
- Contribute to the implementation of data governance and access controls using Databricks Unity Catalog
- Collaborate with data scientists, analysts, and software engineers to deliver governed, reusable data assets
- Participate in code reviews, contribute to documentation, and help raise engineering standards within the team
- Identify bottlenecks in the development lifecycle and contribute ideas for removing them
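To make the day-to-day concrete, here is a minimal sketch of the kind of PySpark transformation described above, writing a daily aggregate to a Delta table. The bucket paths, column names, and aggregation logic are hypothetical illustrations, not actual Samba TV pipelines, and the snippet assumes a Spark session with Delta Lake support configured (as provided on Databricks).

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("viewership_agg").getOrCreate()

# Hypothetical raw input path; substitute your own S3/GCS location.
raw = spark.read.parquet("s3://example-bucket/raw/viewership/")

# Aggregate raw events into one row per date and channel.
daily = (
    raw
    .withColumn("view_date", F.to_date("event_ts"))
    .groupBy("view_date", "channel_id")
    .agg(
        F.countDistinct("device_id").alias("unique_devices"),
        F.sum("watch_seconds").alias("total_watch_seconds"),
    )
)

# Write the curated table in Delta format, partitioned by date.
(
    daily.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("view_date")
    .save("s3://example-bucket/curated/daily_viewership/")  # hypothetical path
)
```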
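The orchestration responsibility maps to owning Airflow DAGs; the second sketch shows a minimal daily DAG with retries, written against the Airflow 2.x API. The DAG id, task names, and schedule are hypothetical examples.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull one day's partition of raw data from object storage.
    print(f"extracting partition for {context['ds']}")


def transform(**context):
    # Placeholder: trigger the Spark job that cleans and aggregates the partition.
    print(f"transforming partition for {context['ds']}")


with DAG(
    dag_id="daily_viewership_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Run extraction before transformation.
    extract_task >> transform_task
```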
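For the data-quality responsibility, the third sketch shows the sort of validation gate that frameworks like Great Expectations formalize. It uses plain PySpark assertions rather than the Great Expectations API, whose interface varies by version; the column names and checks are hypothetical.

```python
from pyspark.sql import DataFrame, functions as F


def validate_daily_viewership(df: DataFrame) -> None:
    """Fail fast if the curated table violates basic expectations."""
    total = df.count()
    if total == 0:
        raise ValueError("daily_viewership is empty")

    # Key metric columns must never be NULL.
    null_devices = df.filter(F.col("unique_devices").isNull()).count()
    if null_devices > 0:
        raise ValueError(f"{null_devices} rows have NULL unique_devices")

    # Watch time must be non-negative.
    negative = df.filter(F.col("total_watch_seconds") < 0).count()
    if negative > 0:
        raise ValueError(f"{negative} rows have negative watch time")
```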
What You'll Bring
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related technical field, or equivalent practical experience
- 2–4 years of professional experience in data engineering or a closely related role
- Strong proficiency in Python and SQL; hands-on experience building and debugging data pipelines and automation scripts
- Experience with Apache Airflow for workflow orchestration, including DAG development, operator configuration, and troubleshooting
- Hands-on experience with Databricks and Apache Spark (PySpark) for large-scale data processing
- Familiarity with modern table formats such as Delta Lake or Iceberg
- Experience with cloud infrastructure (AWS and/or GCP), including S3/GCS bucket management and cloud data services
- Experience with database migration workflows and version-controlled configuration management (Git)
- Strong debugging and problem-solving skills with the ability to trace issues across distributed systems
- Ability to manage a queue of operational tickets and prioritize based on SLA urgency
- Strong communication skills; comfortable working asynchronously in a distributed, cross-timezone team
Nice to Have
- Experience with BigQuery or Snowflake as primary data warehousing platforms
- Familiarity with Databricks Unity Catalog for data governance and metadata management
- Experience with Kubernetes (EKS, GKE) or containerized data workflows
- Knowledge of CI/CD pipelines for data platform environments
- Experience working with REST APIs for data access, automation, or integration
- Familiarity with FinOps concepts or cloud cost management principles
- Background in ad tech, media measurement, or streaming data domains
170,000–235,000 PLN a year
