IBM’s Center for Open-Source Data and AI Technologies (CODAIT) is releasing a new toolkit to help developers and data scientists answer questions about the COVID-19 outbreak, media reported. The COVID notebooks are designed to help with tasks such as obtaining authoritative data about the current state of the outbreak, cleaning up the most serious data quality issues, organizing the data into a format that is easy to analyze with tools such as Pandas and Scikit-Learn, and building an initial set of sample reports and diagrams.
By handling these tasks, the toolkit frees developers and data scientists to focus on advanced analytics and modeling without worrying about data formats and data cleansing. The repository uses developer-friendly Jupyter notebooks to cover every initial data analysis step. It also includes data processing pipelines built with the Elyra Notebook Pipelines Visual Editor and Kubeflow Pipelines.
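The kind of cleanup described above can be sketched in a few lines of Python. This is a minimal illustration and is not taken from the COVID notebooks repository: the function name `tidy_cumulative_counts`, the column names, and the toy figures are all hypothetical. It assumes cumulative case counts arrive in a wide layout with one column per date, fills a missing entry with the last known value, and prevents a cumulative count from decreasing.

```python
import csv
import io

# Hypothetical toy input (made-up numbers): one row per region, one column per date.
# Region A has a missing value; Region B has a cumulative count that drops.
RAW_CSV = """region,2020-03-01,2020-03-02,2020-03-03,2020-03-04
Region A,10,12,,15
Region B,5,4,8,9
"""


def tidy_cumulative_counts(raw_csv):
    """Reshape wide cumulative counts into one record per (region, date)."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        region = row.pop("region")
        running_max = 0   # highest cumulative count seen so far for this region
        previous = None   # cleaned count for the previous date
        # ISO-formatted dates sort chronologically as plain strings.
        for date in sorted(row):
            text = (row[date] or "").strip()
            # Assumption: a missing entry carries the last known value forward.
            value = int(text) if text else running_max
            # Assumption: a cumulative series never decreases, so clamp dips.
            running_max = max(running_max, value)
            new_cases = 0 if previous is None else running_max - previous
            records.append({
                "region": region,
                "date": date,
                "confirmed": running_max,
                "new_cases": new_cases,
            })
            previous = running_max
    return records


if __name__ == "__main__":
    for record in tidy_cumulative_counts(RAW_CSV):
        print(record)
```

The resulting list of dictionaries is in a long, one-observation-per-row layout, so it can be passed directly to `pandas.DataFrame(records)` for further analysis. Whether carrying values forward and clamping dips is appropriate depends on the data source, and a real pipeline may prefer to flag such rows instead.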
“The information landscape is overwhelming for data scientists and policy makers who are analyzing the COVID-19 effect and are trying to come up with actionable plans based on data,” said Frederick Reiss, chief architect of IBM’s Center for Open-Source Data and AI Technologies. “Data from research reports, the news media, social media and health organizations is almost constant, making the task of making data analysis useful almost impossible. Developers and data scientists need to answer their questions about data sources, tools, and how to draw meaningful, statistically valid conclusions from changing data.”
The COVID notebooks tool is now available on GitHub, and more details can be found on the IBM Developer Blog.