We brought a whole team to San Francisco to present at and attend this year’s Data and AI Summit, and it was a blast! I would consider the event a success, both in the attendance at the Scribd-hosted talks and in the number of talks that discussed patterns we have adopted in our own data and ML platform. The three talks I wrote about previously were well received and have since been posted to YouTube along with hundreds of other talks.

  • Christian Williams shared some of the work he has done developing kafka-delta-ingest in his talk: Streaming Data into Delta Lake with Rust and Kafka
  • QP Hou, Scribd Emeritus, presented on his foundational work to ensure correctness within delta-rs during his session: Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal Verification
  • R Tyler Croy co-presented with Gavin Edgley from Databricks on the cost-analysis work Scribd has done to efficiently grow our data platform with: Doubling the size of the data lake without doubling the cost

Members of the Scribd team participated in a panel on the expo floor to discuss the past, present, and future of Delta Lake. We also took advantage of the time to have multiple discussions with our colleagues at Databricks about their product and engineering roadmap, and about where we can work together to improve the future of Delta Lake, Unity Catalog, and more.

For those working in the data, ML, or infrastructure space, there are a lot of great talks from the event available online, which I highly recommend checking out. Data and AI Summit is a great event for leaders in the industry to get together, so we’ll definitely be back next year!