
Data Engineer - Stream Data Processing - Distributed Data Processing

pathwaycom · 30+ days ago
$150k+
Full-time
Remote

About Pathway

Pathway is a deep-tech start-up founded in March 2020.

  • Our primary developer offering is an ultra-performant Data Processing Framework (unified streaming + batch) with a Python API, distributed Rust engine, and capabilities for data source integration & transformation at scale (Kafka, S3, databases/CDC,...).
  • The single-machine version is provided on a free-to-use license (`pip install pathway`).
  • Major data use cases are around event-stream data (including real-world data such as IoT), and graph data that changes over time.
  • Our enterprise offering is currently used by leaders of the logistics industry, such as DB Schenker or La Poste, and tested across multiple industries. Pathway has been featured in Gartner's market guide for Event Stream Processing.
  • Learn more at http://pathway.com/ and https://github.com/pathwaycom/.

Pathway is VC-funded, with outstanding business angels from the AI space and industry. We have operations across Europe and in the US. We are headquartered in Paris, with significant support from the French ecosystem (BPI, Agoranov, WILCO,...).


The Team

Pathway is built by and for overachievers. Its co-founders and employees have worked in some of the best AI labs in the world (Microsoft Research, Google Brain, ETH Zurich) and at Google, and graduated from top universities (Polytechnique, ENSAE, Sciences Po, HEC Paris, a PhD obtained at the age of 20, etc.). Pathway’s CTO is a co-author with Geoffrey Hinton and Yoshua Bengio. The management team also includes the co-founder of Spoj.com (1M+ developer users) and NK.pl (13.5M+ users), and an experienced growth leader who has scaled companies through multiple exits.


The opportunity

We are searching for a person with a Data Processing or Data Engineering profile, willing to work with live client datasets, and to test, benchmark, and showcase our brand-new stream data processing technology.

The end-users of our product are mostly developers and data engineers working in corporate environments. We expect our development framework to become part of their preferred stack for analytics projects at work: their daily bread and butter.


You Will

You will be working closely with our CTO, Head of Product, as well as key developers. You will be expected to:

  • Implement the flow of data from its locations in clients' warehouses up to Pathway's ingress.
  • Set up CDC interfaces for change streams between client data stores and the I/O data processed by Pathway, ensuring data persistence for Pathway outputs.
  • Design ETL pipelines within Pathway.
  • Contribute to benchmark framework design (throughput / latency / memory footprint; consistency), including in a distributed system setup.
  • Contribute to building open-source test frameworks for simulated streaming data scenarios on public datasets.
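The CDC work described above amounts to consuming a change stream and keeping downstream state consistent with it. A minimal sketch of the idea in plain Python (this is not Pathway's API; the event shape with `op`/`key`/`row` fields is an illustrative assumption, loosely modeled on common CDC formats):

```python
# Toy CDC consumer: replays a change stream (insert/update/delete events)
# into an in-memory table keyed by primary key. The event format here is
# an illustrative assumption, not Pathway's actual wire format.

def apply_change_stream(events):
    """Fold a sequence of CDC events into the final table state."""
    table = {}
    for event in events:
        op, key = event["op"], event["key"]
        if op == "insert":
            table[key] = event["row"]
        elif op == "update":
            # Merge changed columns into the existing row.
            table[key] = {**table.get(key, {}), **event["row"]}
        elif op == "delete":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown CDC op: {op}")
    return table

stream = [
    {"op": "insert", "key": 1, "row": {"source": "kafka", "lag": 0}},
    {"op": "insert", "key": 2, "row": {"source": "s3", "lag": 5}},
    {"op": "update", "key": 2, "row": {"lag": 0}},
    {"op": "delete", "key": 1},
]
print(apply_change_stream(stream))  # {2: {'source': 's3', 'lag': 0}}
```

A real deployment would consume these events from a broker (e.g. Kafka via Debezium) rather than a list, but the fold-into-state logic is the same.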

Requirements

  • Inside-out understanding of at least one major distributed data processing framework (Spark, Dask, Ray,...)
  • 6+ months of experience working with a streaming dataflow framework (e.g., Flink, Kafka Streams or ksqlDB, Spark in streaming mode, Beam/Dataflow).
  • Ability to set up distributed dataflows independently.
  • Experience with data streams: message queues, message brokers (Kafka), CDC.
  • Working familiarity with data schema and schema versioning concepts; Avro, Protobuf, or others.
  • Familiarity with Kubernetes.
  • Familiarity with deployments in both Azure and AWS clouds.
  • Good working knowledge of Python.
  • Good working knowledge of SQL.
  • Experience working at an innovative tech company (SaaS, IT infrastructure, or similar preferred) with a long-term vision.
  • Warmly disposed towards open-source and open-core software, but pragmatic about licensing.


Bonus Points

  • Know the ways of developers in a corporate environment.
  • Passionate about trends in data.
  • Proficiency in Rust.
  • Experience with Machine Learning pipelines or MLOps.
  • Familiarity with any modern data transformation workflow tooling (dbt, Airflow, Dagster, Prefect,...)
  • Familiarity with Databricks Data Lakehouse architecture.
  • Familiarity with Snowflake's data product vision (2022+).
  • Experience in a startup environment.


Why You Should Apply

  • Intellectually stimulating work environment. Be a pioneer: you get to work with a new type of stream processing framework.
  • Work in one of the hottest data startups in France, with exciting career prospects.
  • Responsibility and the ability to make a significant contribution to the company’s success.
  • Compensation: contract of up to $150k (full-time equivalent) + employee stock option plan.
  • Inclusive workplace culture.


Further details

  • Type of contract: Flexible / remote
  • Preferable joining date: early 2023.
  • Location: Remote work from home. Possibility to meet with other team members in one of our offices:
    • Paris – Agoranov (where Doctolib, Alan, and Criteo were born) near Saint-Placide Metro (75006).
    • Paris Area – Drahi X-Novation Center, Ecole Polytechnique, Palaiseau.
    • Wroclaw – University area.

Candidates based anywhere in the United States and Canada will be considered.

Last updated on Mar 8, 2024
