Articles of the Week of May 5, 2025

May 10, 2025

Purpose: To keep up with reading an article a (week)day and add my own thoughts to it.

xkcd #238

#1: I spent 5 hours understanding how Uber built their ETL pipelines.

This article covers how Uber built Apache Hudi, a data lake platform for processing real-time, incremental data efficiently in terms of both performance and cost. It then briefly covers the storage architecture and the types of operations that Hudi supports. I haven’t used it before, but will check it out later.
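To make the incremental idea concrete, here is a toy sketch in plain Python of two operations Hudi is known for: upserts and incremental reads by commit. It does not use Hudi's actual API. `ToyTable`, `upsert`, `snapshot`, and `incremental` are made-up names for illustration, and the real storage layout is far more involved.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    value: dict
    commit_time: int  # the commit that last wrote this record


class ToyTable:
    """In-memory stand-in for a table with upsert and incremental reads.

    Hypothetical illustration only; not how Hudi stores data.
    """

    def __init__(self):
        self._rows = {}   # record key -> Record
        self._commit = 0  # monotonically increasing commit counter

    def upsert(self, batch):
        """Insert new keys and overwrite existing ones as a single commit."""
        self._commit += 1
        for key, value in batch.items():
            self._rows[key] = Record(key, value, self._commit)
        return self._commit

    def snapshot(self):
        """Latest value for every key (a full read)."""
        return {k: r.value for k, r in self._rows.items()}

    def incremental(self, since_commit):
        """Only the records written after `since_commit`."""
        return {
            k: r.value
            for k, r in self._rows.items()
            if r.commit_time > since_commit
        }


table = ToyTable()
first = table.upsert({"trip-1": {"fare": 10}, "trip-2": {"fare": 7}})
table.upsert({"trip-2": {"fare": 9}, "trip-3": {"fare": 4}})

# A downstream job that already processed `first` only sees what changed.
changed = table.incremental(first)
```

The point of the incremental read is that a downstream job does not have to rescan the whole table on every run; it only picks up the keys touched since its last checkpoint.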

#2: A visual guide to LLM agents

This article gives a nice visual overview of LLM agents, insights into their components, how multiple agents can work together, and how to build them, which brings me to the next article.

#3: QueryGPT – Natural Language to SQL Using Generative AI

Large-scale text-to-SQL generation using LLM agents. Beneficial mostly to non-technical users, newcomers, or engineers/analysts starting out on a new project.
The article didn’t mention dataset-level or table-level permissions, though I assume the ACLs are most likely verified after the Intent agent step, or users are limited to tables within their workspace.
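As a rough sketch of that guess (the article does not describe this, so everything here is an assumption), a table-level check could sit right after intent detection and before any SQL is generated. The function names, the intent-to-table mapping, and the grants structure below are all hypothetical.

```python
def filter_allowed_tables(candidate_tables, user_grants):
    """Split candidate tables into those the user may and may not query."""
    allowed = [t for t in candidate_tables if t in user_grants]
    denied = [t for t in candidate_tables if t not in user_grants]
    return allowed, denied


def tables_for_prompt(user, intent, intent_to_tables, grants_by_user):
    """Return only the tables that may be shown to the SQL-generating step.

    Assumed flow: intent detected -> candidate tables looked up -> ACL filter.
    """
    candidates = intent_to_tables.get(intent, [])
    allowed, denied = filter_allowed_tables(
        candidates, grants_by_user.get(user, set())
    )
    if not allowed:
        raise PermissionError(
            f"{user} has no access to any table for intent {intent!r}: {denied}"
        )
    return allowed


# Hypothetical data for illustration.
intent_to_tables = {"mobility": ["trips", "drivers", "payouts"]}
grants_by_user = {"analyst": {"trips", "drivers"}}

visible = tables_for_prompt("analyst", "mobility", intent_to_tables, grants_by_user)
```

Filtering before generation means the model never sees schemas the user is not allowed to read, rather than relying on the warehouse to reject the query afterwards.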

#4: On becoming competitive when joining a new company

Quite an interesting post to read, especially for early-to-mid-career professionals like me.
I believe that doing exciting and meaningful work in your 20s to mid-30s has the greatest impact on long-term career growth, and this article gives some good tips on how to make the most of it.

#5: WTF: The Who to Follow Service at Twitter

High-level introduction to the Who to Follow service at Twitter (sorry, X), which is a recommendation system that suggests accounts for users to follow.
Since the original paper is quite old now, I am curious to see how the system has evolved over the years, especially as they have open-sourced some of their stuff (e.g., tweepcred).
