Rice’s Data Science Initiative has a dual focus: making fundamental contributions to data science and advancing the state of data-driven research in specific application areas. In Fall 2015, a faculty committee selected three application areas to provide a focus for the initiative: Health and Medicine, Urban Data Analytics, and High-Velocity Data-Intensive Science.
Data science will revolutionize our understanding of health and medicine. From genomic analysis of patients to demographic studies of widespread problems such as asthma, data science has the potential to fundamentally change the ways that we understand health and the ways that we practice medicine. Personalized, data-driven medicine has great promise, but will require advances in our ability to collect and abstract the appropriate data, to share it among practitioners and systems in ways that respect privacy laws, and to aggregate it in ways that expose new kinds of knowledge.
The initiative actively seeks data scientists who will work with researchers at Rice, in the Texas Medical Center, and beyond, to address the problems of understanding health and delivering medical care in the data age. Rice researchers today are active across the breadth of health and medicine, from imaging to electronic medical records, from studying population statistics on disease to building diagnostic and prosthetic devices. Problems in this area include extracting, compiling, and safely storing electronic medical records; analyzing very large health datasets; connecting health data to administrative and environmental data; and working with real-time data to generate public health alerts.
The last seventy years have seen a global trend toward urbanization; more than 3.5 billion people now live in urban areas. This change has created a variety of problems that range from traffic congestion and environmental degradation to housing shortages and fundamental questions about our ability to meet basic human needs such as education. Rice’s location in Houston, one of the most diverse and dynamic cities in the United States, offers an ideal environment for place-based research and education on urban issues.
The initiative actively seeks data scientists who will work with urban policy researchers in the Kinder Institute, the Baker Institute, and Rice’s academic schools to understand the unique problems of urban areas. Problems in this area include, but are not limited to, improving the collection and analysis of complex urban data, from Internet-of-things sensors to traffic citations, building permits, and water bills; understanding, predicting, and ameliorating the impact of catastrophic events such as floods, severe storms, and industrial accidents; and using data-driven techniques to understand and improve the factors that affect quality of life for urban residents.
Researchers in many disciplines today are using petabytes or exabytes of data that may hold the promise of important new discoveries. While these researchers face many of the same issues found in other data-driven efforts, the use of truly large datasets and datasets that arrive at high velocity introduces new challenges for both the underlying computer systems (hardware and software) and the analytical techniques applied to the data. These challenges are fundamentally different from those seen at smaller scales.
The initiative actively seeks data scientists who will work on problems related to high-velocity, data-intensive science, including but not limited to new hardware and software data architectures to deal with high-volume, high-velocity data collection, curation, and analysis; highly parallel analytics; and managing and processing distributed or federated data.