An Introduction to Public Data

On September 6, from 9am to 5pm, the Brown Institute is proud to host a day devoted to public data. It is designed for students in journalism, statistics and data science, and essentially anyone who has an interest in understanding their neighborhoods, their cities, their state and even the nation through data. Throughout the day, students will learn what data are available and why, what might be missing and why, and how data are used to answer hard questions about who we are and how we live. The day is divided between data publishers and consumers, ending with a panel specifically designed for journalists. A rough schedule is outlined below. The event will be held in the Lecture Hall in Pulitzer Hall. Register at http://brwn.co/datatix

09:00-09:15 Opening Remarks

09:15-10:30 Keynote Discussion

Nancy Potok, Chief Statistician of the United States, interviewed by Margo Anderson, historian of statistics, University of Wisconsin-Milwaukee

10:30-10:45 Break

10:45-12:15 NYC Panel

Local government produces a large number of data products for the public, ranging from surveys—like the Housing Vacancy Survey and the Community Health Survey—to administrative records—like CompStat crime reports and school report cards. This panel brings together major data providers in New York City, who will introduce some of the data products that are important for data journalists and data scientists investigating New York City. The panelists will discuss some common pitfalls public-data users should watch out for, and provide advice and resources (such as online tools) for using these datasets responsibly.

Joe Salvo, Head of the Population Division at the Department of City Planning

Elyzabeth Gaumer, Assistant Commissioner for Research & Evaluation for the Department of Housing Preservation & Development 

Kinjia Hinterland, Director of Data Communications at the Department of Health and Mental Hygiene

12:15-01:30 Lunch

01:30-02:00 Keynote Talk

Jeffrey Chen, Chief Innovation Officer for the Bureau of Economic Analysis, part of the Department of Commerce.

02:00-03:30 A Research View

The digital revolution has vastly expanded the quantity and quality of data collected by government. New surveys and Freedom of Information/Open Data laws have helped put this data in the hands of data journalists and data scientists. From the Census FactFinder and BEA realtime GDP to school report cards and participatory budgeting, the public is privy to more government information and decision making than ever before.

At the same time, data has made policy far more complex, enabling computer algorithms to make complicated decisions about complex topics like redistricting and recidivism. Has public data truly increased transparency and accountability? Is policy more or less accessible to journalists and the public? This panel will rate the current state of government data, provide an academic context for journalists and data scientists, and anticipate the future of public data.

Ester Fuchs, Urban and Social Policy Program, Columbia SIPA

Andrew Young, NYU’s Governance Lab

John Mollenkopf, Center for Urban Research at the CUNY Graduate Center

03:30-03:45 Break

03:45-05:15 Journalism Finale 

We close the day with a panel of data journalists who will examine how they find and tell stories using public data. These are important lessons for aspiring data journalists and data scientists. Why were the data collected? What were the incentives and motivations behind their collection? How were they collected? How have they been used before? And are there any gaps: who is not being counted? Five practicing data journalists will share their experiences.

Tom Meagher, The Marshall Project

Sarah Ryley, The Trace

Tom McGinty, The Wall Street Journal

Annie Waldman, ProPublica

Laura Bliss, CityLab, The Atlantic


Short Bios

Margo Anderson is Distinguished Professor Emerita (History & Urban Studies) at the University of Wisconsin-Milwaukee. She specializes in American social, urban and women’s history and has research interests in both urban history and the history of the social sciences and the development of statistical data systems, particularly the census. Her publications include the second edition of The American Census: A Social History (Yale University Press, 2015); Encyclopedia of the U.S. Census: From the Constitution to the American Community Survey (ACS), 2d ed. (Washington, D.C.: CQ Press, 2011), coedited with Constance F. Citro and Joseph J. Salvo; and a co-edited volume with Victor Greene, Perspectives on Milwaukee’s Past (University of Illinois Press, 2009). With UWM Professor Amanda Seligman, she is Lead Editor of the Encyclopedia of Milwaukee, https://emke.uwm.edu. In 2006 she served as the President of the Social Science History Association.

Jonathan Auerbach is a PhD candidate in Columbia University’s Statistics Department. Prior to joining the program, he was a researcher at CUNY’s Center for Urban Research and an analyst at New York City’s legislature, the City Council. His interests include public policy and statistical methodology.

Laura Bliss is an award-winning staff writer at The Atlantic’s CityLab, covering urban politics and policy with a focus on transportation. She also authors MapLab, a biweekly newsletter about maps. Her work has appeared in the New York Times, The Atlantic, Mother Jones, Los Angeles magazine, GOOD, and beyond. She tweets @mslaurabliss.

Craig Campbell is a Special Advisor at the Mayor’s Office of Data Analytics, where he supports strategic communications and policy development for the NYC Open Data and NYC Analytics programs. Prior to working in local government, he researched trends in municipal data analytics and government technology at the Harvard Kennedy School, where he supported a variety of national policy networks and research programs at the Ash Center for Democratic Governance and Innovation. He holds a degree in architecture and mathematics from Amherst College.

Jeff Chen is a statistician and data science leader, currently serving as the Chief Innovation Officer of the Bureau of Economic Analysis. In this role, he is responsible for integrating innovations in data science and machine learning to improve measurement of the US economy. He has extensive experience in launching and leading data science initiatives in over 40 domains and working with diverse stakeholders such as firefighters, economists, climatologists, and technologists. Previously, he served as the U.S. Department of Commerce’s Chief Data Scientist; a White House Presidential Innovation Fellow with NASA and the White House Office of Science and Technology Policy focused on data science for the environment; the first Director of Analytics at the NYC Fire Department, where he engineered pioneering algorithms for fire prediction; and one of the first data scientists at the NYC Mayor’s Office of Operations during the Bloomberg Administration. Jeff started his career as an econometrician at an engineering consultancy where he developed forecasting and prediction models supporting large-scale infrastructure investment projects. In the evenings, he is an adjunct professor at Georgetown University’s McCourt School of Public Policy where he teaches a graduate course on data science. He holds a B.A. in economics from Tufts University and an M.A. in Quantitative Methods in the Social Sciences from Columbia University.

Ester R. Fuchs is Professor of International and Public Affairs and Political Science and Director of the Urban and Social Policy Program at Columbia University’s School of International and Public Affairs. She previously chaired Urban Studies at Barnard and Columbia Colleges. Fuchs serves as Director of Whosontheballot.org, an online voter engagement initiative. Fuchs is also a member of the Faculty Steering Committee of the Eric Holder Initiative for Civil and Political Rights, the Columbia Provost’s Just Societies Task Force, and Columbia’s Data Science Institute and its Smart Cities Center. She recently received the Bella Abzug Leadership Award, the City & State Above & Beyond Exceptional New York Women Award for Education and an Award for Outstanding Teaching at SIPA.

Fuchs’s academic research is in urban politics and policy, American parties and elections, workforce development, smart cities and urban environmental sustainability policy. Fuchs served as Special Advisor to the Mayor for Governance and Strategic Planning under New York City Mayor Michael R. Bloomberg from 2001 to 2005, and in 2005 she was the first woman to chair a NYC Charter Revision Commission. Fuchs serves on numerous boards, advises businesses and political campaigns and is a frequent political commentator on TV, radio and new media. Fuchs received a BA from Queens College, CUNY; an MA from Brown University; and a PhD in Political Science from the University of Chicago.

Elyzabeth Gaumer is Assistant Commissioner of Research and Evaluation at the New York City Department of Housing Preservation and Development where she leads the agency’s efforts to evaluate the impact of City-sponsored programs and policies on families and neighborhoods and promote evidence-based policymaking. She represents HPD’s various research activities to a broad range of policy stakeholders at the local and national levels and acts as advisor and contributor to several inter-agency research efforts and working groups. Gaumer is co-Principal Investigator for the Housing and Neighborhood Study (HANS), a randomized control trial jointly led by HPD and Jeanne Brooks-Gunn at Columbia University that evaluates the impact of affordable housing on the health and well-being of low-income New Yorkers. Since 2014, she has also been the Survey Director for the New York City Housing and Vacancy Survey (NYC HVS), the City’s representative survey of the housing stock and population conducted every three years by the US Census Bureau. Her own research interests include rent regulation, neighborhood effects, age stratification in urban areas, and use of paradata to refine survey design and operations.

Kinjia Hinterland, MPH, has over 10 years of experience at the NYC Department of Health and Mental Hygiene and currently serves as the Director of the Data Communications Unit in the Bureau of Epidemiology Services. The Unit is dedicated to effectively communicating health-related data to inform public health policy and programs in NYC. The Data Communications team works with programs across the Health Department to make data available via ongoing publication series, special reports, and EpiQuery, the interactive data analysis tool. The team’s mission is to fulfill community data needs by providing accessible information to audiences with an interest in public health data. Ms. Hinterland received her Master of Public Health degree from the Columbia University Mailman School of Public Health, with a concentration in Sociomedical Sciences. She tweets @kinjiah.

Tom McGinty is a reporter/editor on The Wall Street Journal’s investigative-reporting team. He joined The Journal in 2008. Previously, he was a reporter on the investigative team of Newsday, Long Island’s daily newspaper, from 2001 through 2007. He previously was the training director of Investigative Reporters and Editors and the National Institute for Computer-Assisted Reporting. He was part of a Wall Street Journal team that won a Pulitzer Prize for Investigative Reporting in 2015 for a series on abuses in Medicare, as well as a Gerald Loeb Award and an Investigative Reporters and Editors Freedom of Information Award for the same coverage. He tweets @mcgint.

Tom Meagher is the deputy managing editor of The Marshall Project, a nonprofit news organization dedicated to covering criminal justice in America. A veteran reporter and editor, he previously led an interactive team for the Digital First Media newspaper chain and was the data editor at the Newark Star-Ledger. His reporting at The Marshall Project has won several honors, including a Data Journalism Award for a piece examining changing crime trends. He co-founded Hack Jersey, a group that brings journalists and developers together to work on open source news projects, and he helped to organize the first Open Data Summit in the state of New Jersey. He tweets @ultracasual.

John Mollenkopf is Distinguished Professor of Political Science and Sociology at the CUNY Graduate Center and directs its Center for Urban Research. He has published eighteen books on urban politics, urban policy, and race, ethnicity, and immigration. His current research analyzes how the rise of new immigrant communities has reshaped electoral politics in New York City since 2001. He and colleagues at the Center for Urban Research have worked extensively with many large administrative databases, including the voter registration and voter history files, Homeless Services application files, and administrative records from the Housing Recovery Office’s Build It Back program. Much of their analysis involves matching different files, geocoding the data, and mapping the results. CUR also maintains the Oasis on-line mapping system for New York City at http://www.oasisnyc.net/.

Dr. Nancy Potok is Chief Statistician of the United States at OMB. She previously served as Deputy Director and Chief Operating Officer of the U.S. Census Bureau. Dr. Potok has over 30 years of leadership experience in the public, non-profit, and private sectors. She served as Deputy Under Secretary for Economic Affairs at the US Department of Commerce; Principal Associate Director and CFO at the Census Bureau; Senior Vice President for Economic, Labor, and Population Studies at NORC at the University of Chicago; and Chief Operating Officer at McManis & Monsalve Associates, a business analytics consulting firm. She is an adjunct professor at the Trachtenberg School of Public Policy and Public Administration and a senior fellow at the Center for Excellence in Public Leadership at The George Washington University. She is the recipient of the Presidential Rank Award, the Secretary of Commerce Gold Medal and Silver Medal for outstanding achievements, the Arthur S. Flemming Award, the Enterprise Risk Manager of the Year Award given by the Association for Federal Enterprise Risk Management, and the Distinguished Alumni Award from The George Washington University. Dr. Potok is a member of the American Statistical Association, an elected Fellow of the National Academy of Public Administration (NAPA) and serves on the Board of Trustees for the Institute for Pure and Applied Mathematics at UCLA. She received her Ph.D. in public policy and public administration at The George Washington University.

Sarah Ryley is an investigative reporter at The Trace, a non-profit news outlet that covers gun issues. Prior to joining The Trace, she was an investigative reporter and editor at the New York Daily News. Her work there primarily focused on criminal justice and was the catalyst for numerous reforms. Her investigation on the NYPD’s abuse of eviction laws, done in partnership with ProPublica, was awarded the Pulitzer Prize for Public Service in 2017. She tweets @MissRyley.

Joseph J. Salvo is Director of the Population Division at the New York City Department of City Planning. The Population Division serves as the city’s in-house demographic consultant, providing expertise to agencies on applications involving assessments of need, program planning and targeting, and policy formulation. He has testified before Congress, and served as an advisor to the Census Bureau and the National Academy of Sciences. He has co-authored articles on settlement patterns of race/ethnic groups, census methods, and survey evaluation. Dr. Salvo is presently leading a team making technical preparations for the 2020 Census and is active nationally in promoting the use of methods that will provide a more accurate count of the city’s population. He received M.A. and Ph.D. degrees from Fordham University, is a recipient of the Sloan Public Service Award from the Fund for the City of New York, and is a Fellow of the American Statistical Association. NYC’s Dept. of City Planning tweets @NYCPlanning.

Annie Waldman is a staff reporter at ProPublica, with a focus on data, education and healthcare. She was part of the ProPublica-NPR team that investigated the United States’ maternal mortality rates, a series of stories that was a finalist for a 2018 Pulitzer Prize in explanatory reporting.

She has been a finalist twice and won two awards from the Education Writers Association for her education reporting. She has won an award from the Society of American Business Editors and Writers and was a finalist for the Loeb Awards for her reporting with Paul Kiel and Al Shaw on the racial disparity of wage garnishment. A piece she published with The New York Times on a New Jersey student debt agency prompted a new law and several new bills, aimed at increasing consumer protections for student borrowers and their families. Following her reporting on the largest accreditor of for-profit colleges, the U.S. Department of Education stripped the agency of its powers.

Prior to joining ProPublica, she was a recipient of a Fulbright Fellowship to Israel, where she reported on the plight of refugees from Darfur and Eritrea. She had a documentary film in the 2009 Sundance Film Festival, on the lives of homeless high school students after Hurricane Katrina, which was later broadcast nationally on PBS, and recently produced a documentary film that premiered at the 2018 Tribeca Film Festival on adolescence in rural industrial towns.

She graduated with honors from the Columbia Graduate School of Journalism and the School of International and Public Affairs at Columbia, where she was the recipient of the Pulitzer Traveling Fellowship and the Brown Institute Computational Journalism Award. Her stories have been published in The New York Times, the Atlantic, Vice, BBC News, The Chronicle of Higher Education and Consumer Reports. She tweets @AnnieWaldman.

Andrew Young is the Knowledge Director at The GovLab, where he leads research efforts focusing on the impact of technology on public institutions. Among the grant-funded projects he has directed are a global assessment of the impact of open government data; comparative benchmarking of government innovation efforts against those of other countries; a methodology for leveraging corporate data to benefit the public good; and crafting the experimental design for testing the adoption of technology innovations in federal agencies. Andrew has authored or co-authored a number of extended works on new approaches for improving governance with technology, including the books The Global Impact of Open Data and Open Data in Developing Economies. His writings can be found in Harvard Business Review, Stanford Social Innovation Review, GrantCraft, and Governing, among others. He tweets @_AndrewYoung.


Links to tools and datasets mentioned during the day

American FactFinder

Longitudinal Employer-Household Dynamics

NYC Population FactFinder

Map Reliability Calculator

New York City Housing and Vacancy Survey

EpiQuery

NYC Community Health Survey

New York City Community Health Profiles

Who’s on the Ballot?

OasisNYC.net