Overview
The emergence of “big data” offers scientists, decision makers, and the public at large unprecedented opportunities, at least in principle, to use this wealth of information to drive not only scientific discovery but also decisions and policies. Modern data analytics techniques that integrate sophisticated probabilistic models, statistical inference, and data structures into machine learning algorithms have yielded powerful ways to extract actionable knowledge from data in virtually every area of human endeavor. However, realizing the full potential of big data to accelerate advances in the biological, physical, health, and social sciences, in engineering, and in public policy, national security, education, economics, and commerce presents many challenges in data management, data integration, data analytics, and predictive modeling and simulation.
The Center for Big Data Analytics and Discovery Informatics aims to address challenges such as:
- How can we efficiently organize, store, retrieve, manipulate, and analyze data and knowledge from massive, heterogeneous, geographically distributed, autonomous information sources?
- How can we glean useful information from large data sets using automated or semi-automated approaches derived from state-of-the-art algorithms and techniques from artificial intelligence, machine learning, data mining, and statistical inference?
- How can we represent and manipulate scientific knowledge using computational abstractions in a form that lends itself to automated processing by computers and is, at the same time, comprehensible and communicable to humans?
- How can we create cognitive tools that could dramatically accelerate scientific progress by leveraging and extending the reach of human intellect and by partnering with scientists, including citizen scientists, with a broad range of skills and expertise?