Spokes Projects

The National Science Foundation (NSF) has awarded $11 million to Big Data Spokes Projects and associated planning activities, and plans to invest more than $110 million in Big Data research in Fiscal Year 2017. “The BD Spokes advance the goals and regional priorities of each BD Hub, fusing the strengths of a range of institutions and investigators and applying them to problems that affect the communities and populations within their regions,” says Jim Kurose, Assistant Director of NSF for Computer and Information Science and Engineering. “We are pleased to be making this substantial investment today to accelerate the nation’s big data R&D innovation ecosystem.”

Read more in this blog post and explore the NSF-funded Spoke Projects of the West Big Data Innovation Hub (WBDIH) below. We encourage anyone interested in participating in these community-building efforts to contact us and the teams below, and to join the conversation on social media with #BDHubs. Guidance for the FY2018 Spokes Solicitation has been sent to our mailing list and posted here.

Increasing Collaborations in Proteogenomics Applications of Genetic Variations (Planning Grant)


Many well-known diseases can be caused by genetic variants that affect important protein features such as enzyme active sites. The scientific community has catalogued millions of genetic variants and thousands of protein structures. However, these two types of information are not currently linked, or easily linkable, in a way that supports exploring the relationships between variants and their structural locations. This NSF-funded Spoke Project Planning Grant will promote and facilitate interactions between experts from the genomic variant and protein structure communities, with the shared goal of developing methods for integrating these data comprehensively.


  • Eric Deutsch, Institute for Systems Biology
  • Andreas Prlic, Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB)

Get Involved:

  • Learn more on the project website
  • Save the Date for January 2017 in Seattle, WA: Join a workshop to discuss and plan the integration of genetic variants and protein structures. If you are interested in participating in this workshop, please contact Jennifer Dougherty for further information.
  • Chat with PI Eric Deutsch during the NSF Twitter Chat Tues October 18th, 9am PT. Follow @westbigdatahub and #BDHubs to participate

Big Data & Criminal Justice in the Western United States (Planning Grant)


Big data, when used well, can augment the services offered by our criminal justice system and may lead to greater efficiency, government transparency, and community trust. At the same time, incorporating data-driven techniques can create challenges for the criminal justice system, highlighting unresolved issues with data quality, trustworthiness, and ethical use. To address the challenges and opportunities that big data techniques hold for the criminal justice system, data scientists, end users, and the communities in which these techniques are implemented must be given opportunities for substantive collaboration. By holding workshops in three locations in the western region of the United States, this project will help create the infrastructure needed to develop, launch, and sustain these collaborative relationships and projects.


Get Involved:

  • More details will be added here soon, including multiple workshops and a 2017 WBDIH Stakeholder Roundtable focused on Public Policy and Big Data in Boise, ID
  • Chat with the team during the NSF Twitter Chat Tues October 18th, 9am PT. Follow @westbigdatahub and #BDHubs to participate
  • Follow the PPRC team on Twitter @pprcboisestate

Accelerating & Catalyzing Reproducibility In Scientific Computation & Data Synthesis (Spoke Grant)


CoMSES Net (Network for Computational Modeling in Social and Ecological Sciences) is dedicated to promoting and enabling open and reproducible scientific computation through cyberinfrastructure and community development. This NSF-funded Spoke Project will update and extend the CoMSES Net Computational Model Library to enhance the user experience and facilitate knowledge discovery of model-based science and model code. To better enable reproducible science, the team plans to build shareable containerized environments for computational models. These environments will archive the essential provenance metadata, dependencies, computational pipelines, and analyses used to derive visualizations or statistically significant findings. The Spoke Project team will work with the WBDIH to convene a regional Working Group to develop community standards and best practices for open model-based science.


  • Michael Barton, Center for Social Dynamics & Complexity and School of Human Evolution & Social Change, Arizona State University
  • Ken Buetow, Computational Science and Informatics, Complex Adaptive Systems, Arizona State University
  • Marco Janssen, Center for Behavior, Institutions & the Environment and School of Sustainability, Arizona State University
  • Allen Lee, Center for Behavior, Institutions & the Environment, Arizona State University

Senior Personnel:

  • Lillian Na’ia Alessa, State of Alaska EPSCoR Program, University of Alaska Fairbanks and University of Idaho

Other Collaborators:

  • Fernando Perez, Project Jupyter (jupyter.org)
  • Parker Antin, CyVerse (www.cyverse.org)
  • Bill Michener, DataOne (www.dataone.org/)
  • Li An, Complex Human-Environment Systems Group, San Diego State University
  • Dawn Parker, Waterloo Institute for Complexity and Innovation, University of Waterloo
  • John Murphy, Decision and Information Sciences Division, Argonne National Laboratory

Get Involved:

  • Learn more about the project at www.comses.net
  • Join the free CoMSES Network and publish your code in the Computational Model Library at www.openabm.org. This project aims to significantly expand the Model Library (currently 360 models) over the next three years.
  • Chat with the team during the NSF Twitter Chat Tues October 18th, 9am PT. Follow @westbigdatahub, @openabm_comses, and #BDHubs to participate

MetroInsight: Knowledge Discovery & Real-Time Interventions From Sensory Data Flows In Urban Spaces (Spoke Grant)


The MetroInsight Spokes Project is building an end-to-end system for knowledge discovery using high-dimensional sensor time series and real-time data streams. These data streams will support metropolitan infrastructure through data-driven analytics, effective workforce development, and policy support. MetroInsight aims to develop new models and methods to transform information into population-level data suitable for dynamic processing, real-time monitoring, and visualization. In partnership with the MetroLab Network, the team will bring together a diverse set of partners (utilities, universities, companies, and cities) that can contribute novel tools and urban sensor data to translate knowledge into action. This Spokes Project will create new learning modules, certification programs on energy and sustainability, online courses on sensor data analytics, and new capstone projects for data science educational programs.

  • Rajesh Gupta, University of California, San Diego
  • Mani Srivastava, University of California, Los Angeles
  • Shade Shutters, Arizona State University
  • Ilkay Altintas, University of California, San Diego
  • Rajit Gadh, University of California, Los Angeles
  • Julian McAuley, University of California, San Diego

Get Involved:

  • Meet the MetroInsight Spoke Project Team at SuperComputing 2016 in Salt Lake City, Utah (November 2016). Follow @westbigdatahub on Twitter for updates
  • Chat with PI Rajesh Gupta, Co-PI Ilkay Altintas, and the rest of the team during the NSF Twitter Chat Tues October 18th, 9am PT. Follow @westbigdatahub and #BDHubs to participate