Friday, August 26, 2022

LinkedIn profile: Data engineer | Zach Wilson

 

Staff Data Engineer (commercial products)
Airbnb · Full-time
Feb 2021 - Present · 1 yr 7 mos
San Francisco Bay Area
    • Decreased the landing time of Airbnb's unit economics dataset from 3 days to 1.5 days.
    • Built a long-term roadmap to further improve the quality of this critical dataset.
    • Improved the quality of online systems at Airbnb so that their data can be easily compiled by Spark pipelines to compute metrics offline.
    • Led a team of 6 engineers to deliver the MIDAS initiative, which aims to increase the data quality of the Commercial Products org.
    • Upleveled smart pricing at Airbnb by improving the feature engineering and latency of the data used to train the smart pricing model.
Netflix
        • Built a machine learning feedback system that allowed security engineers to label corporate user behavior as risky or not risky. Built Asset Inventory - a graph database solution that is a map of all of Netflix's cloud infrastructure.
        • Skills: Big Data · Scala · Apache Spark · Cybersecurity · SQL · Machine Learning · Apache Airflow · Team Leadership · Data Visualization · Data Analysis · Java · Python · JavaScript · HTML · Cascading Style Sheets (CSS) · Node.js · React.js · D3.js · Linux · Git · PostgreSQL · REST APIs · Spring Framework · Googling
        • Built a pipeline that measures the cloud infrastructure impact on A/B tests, saving Netflix millions by enabling smarter A/B test rollout decisions.
Data Engineer
Facebook
Aug 2016 - May 2018 · 1 yr 10 mos
Menlo Park, California
    • Managed a 10 PB+ Hive data warehouse.
    • Consolidated and conformed company-wide growth metrics (across WhatsApp, Instagram, Messenger, and Facebook) into a single view.
    • Optimized machine learning feature set generation pipelines (200+ TB/day) from a 4-day latency to a 1-day latency, while also cutting compute costs for those pipelines 4x.
    • Reduced core notification dataset latencies from 36 hours to under 8 hours.
    • Migrated 50% of notifications pipelines from Hive to Spark, Presto, or real-time streaming.
    • Cut the compute cost of notifications pipelines by 40% over 9 months.
