Tuesday, August 24, 2021

Transforming to a Data Business - Dow Jones DNA (Cloud Next '19)


Here is the link. 

We hope our lessons learned will spare other engineers restless nights spent tossing and turning over infrastructure considerations when transitioning to a data business. The Dow Jones DNA (Data, News, and Analytics) platform was our answer to evolving customer requests to leverage premium news for text mining, machine learning, and AI solutions.

This talk details the decisions we made as we migrated our 30-year archive of big data to Google Cloud Storage and BigQuery. We will discuss the architecture trade-offs we faced in migrating 50 TB of historical data to the cloud while supporting an ever-growing corpus that ingests 1.3 million articles daily. Dow Jones DNA required data migration, ongoing data processing at scale, and performant query responses.

The DNA platform started with a team of two data engineers and grew to a team of five. Managed services were key to making the platform possible with a small team. We will detail how we balanced real-world constraints such as small team size, performance versus cost, and the upper limits of quotas as our data continues to grow.

Build with Google Cloud → https://bit.ly/2KaUXgA
Watch more: Next '19 Architecture Sessions → https://bit.ly/Next19Architecture
Next '19 All Sessions playlist → https://bit.ly/Next19AllSessions
Subscribe to the GCP Channel → https://bit.ly/GCloudPlatform

Speakers: Patricia Walsh, Dylan Roy
Session ID: ARC204
Event: Google Cloud Next 2019

