
Mar 1, 2015 - Large-scale data science and engineering with Spark

posted Feb 10, 2015, 11:15 PM by Mao Ye   [ updated Mar 3, 2015, 11:54 PM ]


- Registration form for this event:
- Event details link:
- Question collection for this event:
- If you would like to join our mailing list, please visit:

Time: 1:30pm - 4pm, Sunday, 03/01/2015

Location: 1601 McCarthy Boulevard, Milpitas, CA 95035 (TIPark Silicon Valley)

Tech Talk overview:

Apache Spark has taken Big Data by storm, subsuming Hadoop MapReduce. In this talk, Reynold Xin from Databricks will give a quick introduction to Spark, with a focus on the latest development activities aimed at making large-scale data science and engineering more approachable. In particular, the following will be discussed:

- Spark's basic programming API

- the new DataFrame API for big data

- machine learning pipeline integration

- Databricks Cloud


Reynold Xin is a committer and PMC member on Apache Spark. He is also a co-founder of Databricks. He has been instrumental in the development of Spark as the maintainer of many components. He recently led an effort to scale up Spark and set a new world record in 100 TB sorting (Daytona Gray). Before Databricks, he was pursuing a PhD at the UC Berkeley AMPLab. He wrote the two most highly cited papers at SIGMOD 2011 and SIGMOD 2013.


1:30pm - 1:50pm reception and social time

1:50pm - 2:10pm recruiting time: 20 minutes

2:10pm - 3:30pm talk and Q&A

3:30pm - 4pm offline networking



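Organizer: 湾区同学技术沙龙 (Bay Area Alumni Tech Salon)

Co-organizers: TIPark Silicon Valley (thanks to TIPark for sponsoring the venue), Nanjing University Silicon Valley Alumni Association, Silicon Valley Tsinghua Network, USTC Alumni Association Entrepreneurship Club, Zhejiang University Alumni Association Haina Innovation and Entrepreneurship Club, Peking University Alumni Association of Northern California, Wuhan University Alumni Association of Northern California, Southeast University Silicon Valley Alumni Association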
