The overwhelming rise of data is no longer a matter of debate or an interesting prediction; it is simply a pervasive reality across virtually every domain of human activity. Data acquisition, transfer, and storage have become rich, varied, and inexpensive; sophisticated computational platforms are at our fingertips (given a credit card number offered to a cloud provider); and the software that powers it all is largely built on open-source, freely available tools. Have we then reached our fabled data-laden Utopia?

Well… We should indeed celebrate the success of openly and collaboratively building tools that democratize access to data and power a more inclusive education and cutting-edge research. I will discuss some of the history that got us here, from the perspective of the Scientific Python ecosystem and Project Jupyter. But I will also explore the new challenges we face as these tools mature and intersect both with the needs and power of industry titans, and with the incentives of institutionalized academia. Finally, I will discuss questions that arise as small groups of tightly connected volunteers grow into large communities, where matters of diversity and governance become critical factors.

We are entering an era where these tools and modes of knowledge creation will have the opportunity to scale in unprecedented ways; I hope the perspective we have gained in Project Jupyter can help illuminate some relevant questions and frame the discussion moving forward.

About Prof. Fernando Pérez
Fernando Pérez is an assistant professor in Statistics at UC Berkeley and a Faculty Scientist in the Department of Data Science and Technology at Lawrence Berkeley National Laboratory. After completing a PhD in particle physics at the University of Colorado at Boulder, his postdoctoral research in applied mathematics centered on the development of fast algorithms for the solution of partial differential equations. Today, his research focuses on creating tools for modern computational research and data science across domain disciplines, with an emphasis on high-level languages, interactive and literate computing, and reproducible research. He created IPython while a graduate student in 2001 and co-founded its successor, Project Jupyter. The Jupyter team collaborates openly to create the next generation of tools for human-driven computational exploration, data analysis, scientific insight, and education.
He is a National Academy of Science Kavli Frontiers of Science Fellow and a Senior Fellow and founding co-investigator of the Berkeley Institute for Data Science. He is a co-founder of the NumFOCUS Foundation, and a member of the Python Software Foundation. He is a recipient of the 2012 FSF Award for the Advancement of Free Software, and of the 2017 ACM Software System Award.