As a Staff Software Engineer, you will take us beyond traditional monolithic SQL engines and batch pipelines by building the next generation of distributed data storage and processing systems. These systems will scale indefinitely and surpass traditional query performance while keeping the interfaces to that data simple, expressive, and cleanly abstracted. Your interfaces will support a broad array of data consumers, from our web application to business analytics and artificial intelligence.
Primary Duties:
- Identify and develop scalable and performant solutions.
- Work across disciplines to shape product strategy and execution.
- Develop the foundations of code architecture and quality.
- Mentor and coach engineers.
- Set and uphold the standard for engineering processes to support high-quality engineering.
Minimum Qualifications:
- BS/BTech (or higher) in Computer Science, Engineering or a related field required.
- 8+ years of production-level experience as an engineer building highly scalable systems.
- 4+ years of experience acting as a trusted technical decision-maker in a team setting, solving for short-term and long-term business value.
- 4+ years of experience working with SQL or other database querying languages on large multi-table data sets.
- Experience architecting, developing, and deploying large-scale distributed systems.
- Experience with cloud technologies, e.g., AWS, Azure, GCP.
- Experience building continuous integration and continuous delivery (CI/CD) pipelines.
- Strong familiarity with server-side web technologies (e.g., Java, Python, Scala, C#, C++, Go).
Preferred KSAs:
- Deep understanding of one or more tools like Apache Spark, SQL, and Python for data analysis, manipulation, and processing.
- Familiarity with data technologies and architectures (e.g., event-based architecture, distributed computing, in-memory data processing).
- Experience working with SQL and NoSQL databases (e.g., MySQL, PostgreSQL, Cassandra, MongoDB), focusing on high-performance querying and optimization for analytical workloads.
- Experience in designing, deploying, and managing data warehouses (e.g., Snowflake, Amazon Redshift) for analytics and business intelligence applications.
- Knowledge of data partitioning, sharding, and indexing strategies to ensure optimal performance in high-load environments.
- Proficiency in designing data models that support analytical requirements, ensuring efficient data retrieval and storage.
- Knowledge of data pipeline architecture, including ETL/ELT processes and both batch and real-time data processing.
- Skilled in optimizing data pipelines for scalability and performance, ensuring efficient data ingestion, storage, and retrieval.
- Ability to use caching strategies and indexing techniques to reduce query and processing times.
- Knowledge of tools for data pipeline creation and orchestration such as Apache Airflow, AWS Glue, and Apache Kafka.
- Knowledge of data security principles and of how to ensure compliance with regulations (e.g., GDPR, HIPAA) through proper data governance practices.
Physical Requirements:
- Sitting for prolonged periods of time. Extensive use of computers and keyboard. Occasional walking and lifting may be required.
Last updated on Sep 16, 2024