Data Engineer III
Want to help everyday Americans build wealth? Financial inequality is increasing and too many people are getting left behind. At Stash, we believe in the power of simplifying investing, making it easy and affordable for everyday Americans to build wealth and achieve their financial goals.
We’re one of the fastest-growing fintechs in the U.S. and have had another record-breaking year. In 2021, we almost doubled our headcount and valuation. Our personal finance app makes investing easy and affordable; this year 6 million customers set aside more than $3 billion with Stash.
Prioritizing People is one of our core values and has been key to a healthy work-life balance and a great sense of fulfillment and inclusion. We employ a true people-first hybrid model: live and work where you feel the most productive, whether that is in your home, in an office, or a combination of both, anywhere in the US or UK.
Let’s solve complex problems and tackle wealth inequality.
At Stash, data is at the core of how we make decisions and build great products for millions of users. As a Data Engineer working on data management, you will build and maintain our world-class data platform that democratizes data at Stash. The new Data Management Team is building a “Data Gateway” to help consumers of data find value more efficiently.
Our vision is for all downstream consumers of the data platform to access it through the Data Gateway. All data will be aligned to our business and customers, with aggregations and transformations in place and quality checks backed by monitoring and alerting. The Data Gateway will be highly available, scalable, reliable, secure, and performant. Our mantra is “data you can trust.”
Our stakeholders include Analytics, Machine Learning, User Research, Product, Engineering, Marketing, Customer Service and Fraud. We partner closely with these stakeholders to understand their needs and the customer and business problems we need to solve.
Tools and technologies in our tech stack (evolving):
- Hadoop, YARN, Spark, Hive, Redshift
- dbt running on Spark SQL
- AWS (EMR, Glue, S3, Redshift, etc.), Terraform
- Future: Spark Streaming, GraphQL, and multi-model DBs
What you’ll do:
Help build the Data Gateway from the ground up! Our roadmap consists of two workstreams: 1) data modeling, and 2) infrastructure.
For the data modeling workstream, you would work with stakeholders to understand their data needs, design a data model to meet these needs, interface with engineering to determine how to create these data elements from raw data, and then write SQL to perform the transformations. The current tech stack for this workstream is dbt running on Spark SQL, Python, Airflow, and AWS (EMR, Glue, S3, Redshift, etc.).
For the infrastructure workstream, you would upgrade our infrastructure to meet ambitious technical requirements, including near-real-time data freshness and low-latency retrieval of data elements. We’re still in the planning phase for the tech stack here, but are considering Spark Streaming, GraphQL, and multi-model DBs.
Who we’re looking for:
- 3+ years of professional experience working with data
- Ability to work with stakeholders to understand business problems and data needs, and then model and transform raw data to meet these needs
- Experience with large-scale distributed data processing and working with Spark, Hadoop, or the MapReduce framework
- Experience building large-scale data products and an understanding of the tradeoffs made when building them
- Experience with dbt
- Experience optimizing Spark jobs
- Experience with Spark Streaming
- Experience building performant GraphQL APIs
- Experience working with Apache Airflow
- Experience working on a cloud platform such as AWS
*No recruiters please*
To apply for this job please visit grnh.se.