Special Session 1:
EnGeoData’2019: Environmental and Geo-spatial Data Analytics
Aims and scope:
Environmental and, more generally, geo-spatial information is now provided both by crowdsourcing and by public administrations in the context of open data policies. Analysing such data remains challenging, firstly because of its heterogeneity (structural, semantic, spatial and temporal), and secondly because of the difficulty of choosing the “best” knowledge discovery process to apply, according to the needs of the experts in the field. This special session aims to gather high-quality research covering all or part of the challenges mentioned above, from a theoretical or experimental point of view.
Topics of interest:
- Pre- and post-processing of environmental and agricultural data
- Geographical information retrieval
- Spatial data mining and spatial data warehousing
- Knowledge discovery use-cases dedicated to environmental data
- Spatial text mining
- Spatial ontology
- Spatial recommendations and personalization
- Visual analytics for geospatial data
- Dedicated applications:
  - Spatio-temporal analytics platform
  - Agricultural Decision Support Systems
  - Urban traffic systems
  - Land-use and urban policies
  - Land-use and urban planning analysis
  - Spatio-temporal analysis in Ecology and Agriculture
Organizers
Mathieu Roche
Cirad, TETIS, France
Mail: mathieu.roche@cirad.fr
Web: http://textmining.biz/Staff/Roche
Maguelonne Teisseire
Irstea, TETIS, France
Mail: maguelonne.teisseire@irstea.fr
Web: http://textmining.biz/Staff/Teisseire
Juan Antonio Lossio
University of Florida, USA
Mail: Diana.Inkpen@uottawa.ca
Web: https://simbig.org/lossio-ventura/
Special Session 2:
MLAI4N: Machine Learning and Artificial Intelligence for Biomedical Health Data
Aims and scope:
Recent breakthroughs in data science (deep neural networks including convolutional and tensor networks, geometrical and topological data analysis, statistical methods in functional imaging consisting of EEG/MEG data or time series of 3D images, etc.) have not only made it possible to create effective diagnostic and prognostic tools, but have also laid the groundwork for facilitating early detection of mental disorders and brain tumors, accelerating the search for life-saving pharmaceuticals, and providing insights into the molecular pathways of neurodegenerative diseases.
The workshop is oriented to all potential applications of data science technologies in feature extraction, classification, recognition, segmentation, enhancement, clustering, anomaly detection, and prediction of neurodegenerative disease states, as applied to biomedical time series and images of different nature, especially multi-modal brain data (X-ray, MRI/fMRI/CT, EEG/MEG, and biomarker assays). Special attention will be paid to biomedical signal processing for diagnostics and treatment outcome prediction, and to natural language processing for case records and medical history.
Topics of interest:
- Deep Learning for healthcare
- Data fusion for healthcare, especially biomedical images of different nature (X-ray, CT, etc.)
- Early diagnosis of specific diseases such as Alzheimer's disease, ADHD, ASD, etc.
- Computational neuroscience; neuroimaging and time series data (including MRI/fMRI/CT, EEG/MEG, etc.) studies
- Novel methods of data analysis and pattern recognition applied to biomedical images of different nature
- Deep learning in neuroimaging data analysis
- Matrix and tensor methods in neuroimaging data analysis
- Dimensionality reduction in neuroimaging data analysis
- Nonparametric and computational Bayesian methods in neuroimaging data analysis
- Manifold learning, classification, clustering and regression in neuroimaging data analysis
Organizers:
Professor Evgeny Burnaev – contact person, E.Burnaev@skoltech.ru
Professor Andrzej Cichocki, A.Cichocki@skoltech.ru
Professor Alexander Bernstein, A.Bernstein@skoltech.ru
Leading researcher Maxim Sharaev, M.Sharaev@skoltech.ru
Special Session 3:
Data and Information Quality: Toward Better Data Science
Aims and scope:
Data quality is one of the main pillars of data science, ensuring the validity of the analytics, outcomes and inferences that drive important decision making in a wide range of applications: self-driving cars, e-commerce, high-frequency trading, social media feeds, network traffic routing and many others. In this session we focus on the importance of data quality for a broad spectrum of data science endeavors. Many data quality issues, e.g., missing values (gaps), incomplete data (abnormally low counts) and duplicates (abnormally high counts), can introduce unintended bias and variability into the data, with a potentially life-changing impact, e.g., in the justice system or in healthcare applications.
Once data quality issues have been identified, managing them by interpreting them, prioritizing them and singling out the actionable ones is a non-trivial and important research topic. Over-treating them might statistically distort the original data and change its very nature, while leaving them untreated could lead to bad data-driven decisions down the road. This special session aims to bring together data science researchers and practitioners who are interested in the theory, methodology, applications, case studies and practical solutions related to data quality.
Topics of interest:
- Data quality and stream analytics
- Machine learning based methods to improve data quality
- Information quality in spatio-temporal streams
- Data quality of video/image data
- Prioritizing, interpreting and explaining data quality issues
- Visualization methods for data quality
- Effect of bad data on fairness
- Effect of data quality on data science in health care
- Impact of data quality on public policy and social good
- Case studies, experience and applications of data quality from diverse fields
Relevance to the main conference tracks and topics:
Data quality monitoring and assurance is an important part of the data science process to ensure the validity and statistical confidence of the results.
Organizers:
Tamraparni Dasu, AT&T Labs-Research
Yaron Kanza, AT&T Labs-Research
Special Session 4:
Beyond IID – Non-IID Learning
Aims and scope
Learning from big data is increasingly becoming both a major challenge and an opportunity for big business and for innovative learning theories and tools. One of the most critical challenges of learning from big data is uncovering the explicit and implicit coupling relationships embedded in mixed heterogeneous data from single or multiple sources. This coupling and heterogeneity are the essence of big data and of most real-world applications: the data is non-IID (not independent and identically distributed).
Most classic theoretical systems and tools in statistics, data mining, databases, knowledge management and machine learning assume that the underlying objects, features and values are independent and identically distributed (IID). Such theories and tools may yield a misleading or incorrect understanding of real-life data complexities. Non-IID learning in big data is a foundational theoretical problem in AI and data science, which considers the complex couplings and heterogeneity between entities, properties, interactions and contexts.
Topics of interest
- Statistical foundation for non-IID learning
- Mathematical foundation for non-IID learning
- Probabilistic methods for non-IID learning
- Statistical machine learning for non-IID learning
- Non-IID learning theory and foundation
- Non-IID data characterization
- Non-IID data transformation
- Non-IID data representation and encoding
- Non-IID learning models and algorithms
- Non-IID single-source analytics
- Non-IID multi-source analytics
- Non-IID clustering
- Non-IID classification
- Non-IID recommender systems
- Non-IID text mining and document analysis
- Non-IID image and video analytics
Organizers
Yinghuan Shi, Email: syh@nju.edu.cn
Guansong Pang, Email: pangguansong@gmail.com
Chengzhang Zhu, Email: kevin.zhu.china@gmail.com
Special Session 5:
Data Science in Computational Psychiatry and Psychiatric Research
Aims and scope
Psychiatric research has entered the age of big data, with patient databases now available that contain thousands of clinical, demographic, social, environmental, neuroimaging, genomic, proteomic and other -omic measures. The analysis of such data is often more challenging than in other medical research areas because i) psychiatrists study traits that are not easily measurable and need to be measured indirectly, e.g. by questionnaires, ii) the definition of a mental disease is often very broad and often includes distinct but unknown subcategories, iii) there is a high proportion of drop-out in many studies and patients often do not adhere to the treatment, and iv) treatment interventions often have several interacting components that are difficult to measure (complex interventions). Machine learning techniques are increasingly being used to address problems in psychiatric and psychological research, including bioinformatics, neuroimaging, prediction modelling and personalized medicine, causal modelling, epidemiology and many other research areas. Machine learning also plays an important role in the definition of the modern field of Computational Psychiatry.
Topics of interest
- Computational Psychiatry
- Development of diagnostic, risk and prognostic models (e.g. predicting risk of dementia, psychosis, etc.)
- Big data and high-dimensional data analysis in psychiatric research
- Natural language processing, methods for prediction and knowledge discovery from Electronic Health Record (EHR) data
- Adaptive clinical trials and machine learning
- Causal modelling, including Mendelian Randomization
- Bioinformatics and -omics studies
Organizers
daniel.r.stahl@kcl.ac.uk
daniel.stamate@manchester.ac.uk