Data Engineer

Apify|Dev Tools
Prague
Full-time | Product

Job Description

Apify is the largest marketplace of tools for AI, with 30,000+ Actors helping people and agents get real-time web data, track competitors, generate leads, or integrate their apps. Actors are built by a global creator community that now earns more than $1M every month.

Join us to help people put the web to work. Apify can find missing children, protect consumers from fake discounts across the EU, and feed data to AI chatbots.

We're looking for a Data Engineer to own the integration layer between Snowflake and the operational tools that run Apify's go-to-market and product motion: HubSpot, Intercom, Mixpanel, and Segment. You'll make sure the right data lands in the right system at the right time, with the right shape, so Sales, Marketing, Customer Success, and Product teams can act on it.

You'll be the 9th member of the data team - joining a mix of analytics engineers, analysts, and data scientists - at the moment Segment is being rolled out as Apify's CDP. That's yours to land end-to-end.

What you'll be working on:

  • Own the integration domain end to end - all pipelines, transformations, and Snowflake models that connect HubSpot, Intercom, Mixpanel, and Segment to the rest of the platform, in both directions.

  • Design event tracking and the CDP layer with the RevOps team as Segment becomes the source of truth for behavioral data flowing into product, marketing, and CRM systems.

  • Build reliable, observable pipelines in Keboola and dbt - with clear data contracts, schema tests, freshness guarantees, and alerting.

  • Model integration data in Snowflake so HubSpot, Intercom, Mixpanel, and Segment data lands in well-defined tables that downstream consumers can trust, with documentation that analysts and scientists can actually use.

  • Power lifecycle automations - PQA scores back into HubSpot, behavioral campaigns in Intercom and customer.io, product usage signals - by shipping the data they depend on.

  • Diagnose and resolve pipeline incidents independently - trace lineage across multiple components, find root causes, fix, and write the runbook so it doesn't bite the next person.

Tech stack

  • Snowflake - data warehouse

  • Keboola - extractors, writers, and orchestration

  • dbt - transformations on Snowflake (orchestrated by Keboola; this is where we're actively migrating existing transformation logic)

  • Tableau and Redash - BI

  • n8n - workflow automation

  • Segment - CDP, currently being rolled out end-to-end

Who we're looking for:

  • 3+ years of data engineering experience, with meaningful time spent on integrations between a cloud warehouse and operational SaaS tools (HubSpot, Salesforce, Intercom, Zendesk, Mixpanel, Amplitude, Segment, RudderStack, or similar).

  • Fluent in SQL (window functions, CTEs, complex multi-source joins, query optimization) and comfortable in Python for the parts a no-code tool can't handle.

  • Production experience with Snowflake (or BigQuery, Databricks, Redshift), and an understanding of the cost, performance, and access-control tradeoffs of a usage-based warehouse.

  • Experience building end-to-end pipelines combining an orchestration or ELT platform (Keboola, Fivetran, Airflow, Dagster, Prefect, Matillion) with a transformation framework like dbt.

  • Hands-on experience with a CDP (Segment, RudderStack, mParticle) - tracking plans, schemas, identity resolution, downstream consumers - not just installing the snippet.

  • You think in data contracts - schema stability, freshness SLAs, documented field definitions - and treat the boundary between your domain and downstream consumers as a first-class interface.

  • Comfortable with reverse ETL (Census, Keboola, or hand-rolled), and you understand what it means to write back to a CRM that humans are also editing.

  • Pragmatic about tooling - happy to use n8n for the right job, and equally happy to write proper code when that's the right call.

  • Able to explain to non-technical stakeholders in Sales, Marketing, and Customer Success why a dashboard moved and what it means, in English, both in writing and in person.

Nice to have:

  • Experience with usage-based billing or product-led growth data models.

  • Exposure to LLM-assisted workflows in the data stack.

  • Prior experience at a SaaS company between 50 and 500 people.

By the end of the first month, we expect you to:

  • Know the data team, the RevOps and Growth stakeholders who depend on the integration layer, and the workflows that flow through HubSpot, Intercom, Mixpanel, and Segment.

  • Work through the existing Keboola components and dbt models to understand what's in place, what's fragile, and where the silent failures live.

  • Trace a typical record from each source system through to the Snowflake tables analysts use.

By the end of the first 3 months, we expect you to:

  • Have a complete map of the integration domain - what flows where, what's owned by whom, where the silent failures are - and a documented six-month plan for the work ahead.

  • Have at least one end-to-end improvement shipped with monitoring in place.

  • Be the go-to person on the data team for HubSpot, Intercom, Mixpanel, and Segment data questions.

By the end of the first 6 months, we expect you to:

  • Have Segment operating as the durable CDP for Apify, with a published tracking plan and reliable event flows into Snowflake and downstream tools.

  • Have core tables from HubSpot, Intercom, Mixpanel, and Segment with documented data contracts - schema, freshness SLA, ownership - and tests and alerting in place.

  • Have driven measurable improvements in data freshness, pipeline reliability, and incident response time, tracked publicly, and shipped at least one cross-team initiative where the data integration unlocked a business outcome (conversion lift, churn reduction, ops automation).

Why should you work at Apify?

  • Space, support, and autonomy for personal growth, with a direct impact on our success

  • Full-time position in Prague (Lucerna Palace)

  • Flexible working hours (perfect for both night owls 🦉 and early birds 🐥)

  • Nobody counts holidays as long as the work gets done 💪

  • Unlimited Claude for every Apifier. We don't count tokens. Just use them well 🤖

  • Stock options and profit sharing 💰

  • Free Multisport card

  • We welcome pets, kids, and bikes in the office

  • Epic team buildings and offsites 🚢 with biking, canoeing, and other adventures 🪂

  • Solid education and training budget, conference tickets, internal “Eat & Learn” sessions, and the possibility to work across teams

  • Generous hardware budget 💻

  • Free lunches every day when working from the office 🌮🥡

  • Unlimited supply of ☕ & 🍺 and snacks

  • Free entry to the wonderful Prague and Brno Zoo 🐘

  • Ping-pong, chess, PS5, lightsabers, foosball league after lunch.

For more details about Apify and what it’s like to work with us, see our Careers page.

First seen: May 29, 2026
Last updated: May 29, 2026