October 30, 2014
08:30 – 09:00 Room: Regal Hall 1, 1/F
Opening Ceremony
09:00 – 10:00 Room: Regal Hall 1, 1/F
Keynote Speech 1 Usama Fayyad
: Big Data, All Data, Old Data: Predictive Analytics in a Changing Data Landscape
Chair: Philip S Yu
Coffee Break
10:20 – 12:20
Room: Regal Hall 1, 1/F; Chair: Geoff Webb; DSAA – S1 Machine Learning
Room: Regal Hall 2, 1/F; Chair: Wei Wang; DSAA – S2 Retrieval, Query & Search
Room: North Hall 1, B/F; Chair: Clausel Marianne; DSAA – SS1 SMTDM (1)
Lunch (Venue: Stadium Cafe, 1/F)
13:20 – 14:20 Room: Regal Hall 1, 1/F
Keynote Speech 2 Feiyue Wang
: Social Computing and Computational Societies: An ACP based Approach for Smart and Parallel Economic Systems
Chair: Masaru Kitsuregawa
Coffee Break
14:40 – 16:40
Room: Regal Hall 1, 1/F; Chair: Huan Liu; DSAA – S3 Analytics Foundations
Room: Regal Hall 2, 1/F; Chair: Herve Martin; DSAA – S4 Recommendation & Services
Room: North Hall 1, B/F; Chair: Clausel Marianne; DSAA – SS2 SMTDM (2)
16:45 – 18:25 Room: Regal Hall 1, 1/F
DSAA Trends & Controversies Invited Talks
(1) Philip S Yu: Assessing the Longevity of Online Videos: A New Insight of a Video’s Quality
(2) Geoff Webb: Scalable Learning of Bayesian Network Classifiers
(3) Gabriella Pasi: Generating, Communicating, Accessing and Analyzing Data in a Context Aware Perspective
(4) Osmar Zaiane: Why the Worlds of MOOCs and Big Data Do Not Collide?
Chair: Hiroshi Motoda
Reception 18:30 – 20:30: Banquet and Conference Center, B/F


October 31, 2014
08:30 – 09:30 Room: Regal Hall 1, 1/F
Keynote Speech 3 Dirk P. Kroese
: Monte Carlo Methods for Big Data and Big Models
Chair: Geoff Webb
Coffee Break
09:50 – 11:50
Room: Regal Hall 1, 1/F; Chair: Geun-Sik Jo; DSAA – S5 Classification & Clustering
Room: Regal Hall 2, 1/F; Chair: Xin Wang; DSAA – S6 Infrastructure, Management & Privacy
Room: North Hall 1, B/F; Chair: Elio Masciari; DSAA – SS3 WIACNBD (1)
Lunch (Venue: Stadium Cafe, 1/F)
12:50 – 13:50 Room: Regal Hall 1, 1/F
Keynote Speech 4 Ravi Kumar
: Some Patterns in Online Behavior
Chair: Xin Yao
Coffee Break
14:00 – 15:30 Room: Regal Hall 1, 1/F
Panel: Harnessing knowledge in Big Data: Issues and Challenges of Data Science
Panelists:
Usama Fayyad, Barclays Bank, UK
Masaru Kitsuregawa, National Institute of Informatics, Japan
Ravi Kumar, Google, Mountain View, CA, USA
Hervé Martin, Université de Grenoble 1, France
Geoff Webb, Monash University, Australia
Xin Yao, University of Birmingham, UK
Osmar Zaiane, University of Alberta, Canada

Chair: Gabriella Pasi

15:30 – 17:30
Room: Regal Hall 1, 1/F; Chair: Huan Liu; DSAA – S7 Influence Analysis
Room: Regal Hall 2, 1/F; Chair: Maria Huang; DSAA – S8 Data Science Applications
Room: North Hall 1, B/F; Chair: Elio Masciari; DSAA – SS4 WIACNBD (2)
Banquet 18:00 – 20:00: Xiaonanguo Restaurant, Pudong (Note: Buses are arranged outside the Hotel at 17:30 to take delegates to the restaurant)
Cruise Tour 20:30 – 22:30: Huangpujiang Cruise Tour, Pudong (Note: Buses are arranged outside the Restaurant at 20:00 to take delegates to the cruise)


November 1, 2014
08:30 – 09:00 Room: Regal Hall 1, 1/F
Award Ceremony & DSAA 2015 Introduction
09:00 – 10:00 Room: Regal Hall 1, 1/F
Keynote Speech 5 Ofer Azar
: Relative Thinking
Chair: Minyi Guo
Coffee Break
10:20 – 12:20
Room: Regal Hall 1, 1/F; Chair: Osmar Zaiane; DSAA – S9 Complex Data Analysis
Room: Regal Hall 2, 1/F; Chair: Xue-wen Chen; DSAA – S10 Community and Network Analysis
Room: North Hall 1, B/F; Chair: Mathieu Roche; DSAA – SS5 EGSDA
Lunch (Venue: Stadium Cafe, 1/F)
13:20 – 15:20
Room: East Asia Hall, B/F; Tutorial 1 Feida Zhu: Network Mining and Analysis for Social Applications
Room: Dynasty Hall, 1/F; Chair: Wei Wang; DSAA – S11 Cloud Computing & Parallel Computing
Room: North Hall 1, B/F; Chair: Vincent Tseng; DSAA – SS6 BHMA
Coffee Break
15:40 – 17:40
Room: East Asia Hall, B/F; Tutorial 2 Seung-won Hwang: Harvesting, Integrating, Maintaining and Leveraging Knowledge Graphs
Room: North Hall 1, B/F; Chair: Gabriella Pasi
17:40 – 19:00 Closing Ceremony & DSAA 2015 Welcome Drink


DSAA – S1: Machine Learning

Minimizing Expected Loss for Risk-Avoiding Reinforcement Learning

Jung-Jung Yeh, Tsung-Ting Kuo, William Chen and Shou-De Lin

Large-Scale Factorization of Type-Constrained Multi-Relational Data

Denis Krompass, Maximilian Nickel and Volker Tresp

Pseudo Labels for Imbalanced Multi-Label Learning

Wenrong Zeng, Xuewen Chen and Hong Cheng

Proactive Learning with Multiple Class-Sensitive Labelers

Seungwhan Moon and Jaime Carbonell

Itemset Approximation Using Constrained Binary Matrix Factorization

Seyed Hamid Mirisaee, Eric Gaussier and Alexandre Termier

Community Detection in Social Networks: the Power of Ensemble Methods

Rushed Kanawati

DSAA – S2: Retrieval, Query & Search

Semi-randomized Hashing for Large Scale Data Retrieval

Haichuan Yang, Xiao Bai, Jun Zhou, Peng Ren, Jian Cheng and Lu Bai

Storage Efficient Graph Search by Composite Dynamic-and-Static Indexing of a Single Network

Yan Xie and Philip Yu

Indexing and Retrieval of Human Motion Data Based on a Growing Self-Organizing Map

Da-Cheng Yu and Wei-Guang Teng

An Improved Dynamic Adaptive Multi-tree Search Anti-collision Algorithm Based on RFID

Min Shao, Xiao-Fang Jin and Li-Biao Jin

Exploiting Mobility for Location Promotion in Location-based Social Networks

Wen-Yuan Zhu, Wen-Chih Peng and Ling-Jyh Chen

Investigating Sample Selection Bias in the Relevance Feedback Algorithm of the Vector Space Model for Information Retrieval

Massimo Melucci

DSAA – S3: Analytics Foundations

Learning a Proximity Measure to Complete a Community

Maximilien Danisch, Jean-Loup Guillaume and Benedicte Le Grand

A Confidence-based Entity Resolution Approach with Incomplete Information

Qi Gu, Yan Zhang, Jian Cao and Guandong Xu

Human Behaviour Analysis Using Multimodal Emotion Recognition

Prashant Jha and Rammohana Reddy Guddeti

Matrix Completion Based on Feature Vector and Function Approximation

Shiwei Ye, Yuan Sun and Yi Sun

An Accuracy Enhancement Algorithm for Fingerprinting Method

Yuntian Brian Bai, Mani Williams, Falin Wu, Allison Kealy and Kefei Zhang

DSAA – S4: Recommendation & Services

Probabilistic Category-based Location Recommendation Utilizing Temporal Influence and Geographical Influence

Dequan Zhou and Xin Wang

Recommending Funding Collaborators with Scholar Social Networks

Juan Zhao, Kejun Dong and Jianjun Yu

An Incremental Scheme for Large-scale Social-based Recommender Systems

Chia-Ling Hsiao, Zih-Syuan Wang and Wei-Guang Teng

Diversification in News Recommendation for Privacy Concerned Users

Maunendra Sankar Desarkar and Neha Shinde

Similarity Analysis of Service Descriptions for Efficient Web Service Discovery

Sowmya Kamath S and Ananthanarayana V.S

DSAA – S5: Classification & Clustering

Active Learning for Text Classification Using the LSI Subspace Signature Model

Weizhong Zhu and Robert B. Allen

A Semisupervised Associative Classification Method for POS Tagging

Pratibha Rani, Vikram Pudi and Dipti Sharma

Optimizing Specificity under Perfect Sensitivity for Medical Data Classification

Cho-Yi Hsiao, Hung-Yi Lo, Tu-Chun Yin and Shou-De Lin

Interactive Correlation Clustering

Floris Geerts and Reuben Ndindi

Rough Possibilistic Meta-Clustering of Retail Datasets

Asma Ammar, Zied Elouedi and Pawan Lingras

DSAA – S6: Infrastructure, Management & Privacy

Detecting Hidden Propagation Structure and Its Application to Analyzing Phishing

Yang Liu and Mingyan Liu

Account Level Demand Estimation and Intelligence Framework

Pranjal Mallick, Vikash Kumar Sharma, Parikshit Bhinde and Mutha Reddy Mandapati

A New Pre-Pushing VoD Scheme in Hierarchical Network Environment

Fei Long and Xingjun Wang

Usage Signatures Analysis – An Alternative Method for Preventing Fraud in e-Commerce Applications

Gabriel Mota, Joana Fernandes and Orlando Belo

Exploring New Privacy Approaches in a Scalable Classification Framework

M Saravanan, Mohamed Thoufeeq, S Akshaya and V.L Jayasre Manchari

DSAA – S7: Influence Analysis

Efficient Analysis of Node Influence Based on SIR Model over Huge Complex Networks

Masahiro Kimura, Kazumi Saito, Kouzou Ohara and Hiroshi Motoda

Social Influence-Aware Reverse Nearest Neighbor Search

Hui-Ju Hung, De-Nian Yang and Wang-Chien Lee

Influence Maximization in a Social Network in the Presence of Multiple Influences and Acceptances

Jun-Li Lu, Ling-Yin Wei and Mi-Yen Yeh

Diversified Ranking on Graphs from the Influence Maximization Viewpoint

Li-Yen Kuo and Ming-Syan Chen

Mining Influence in Evolving Entities: A Study on Stock Market

Chang Liao, Yanfei Huang, Xibin Shi and Xin Jin

DSAA – S8: Data Science Applications

Sentiment Detection and Visualization of Chinese Micro-blog

Zhitao Wang, Zhiwen Yu, Liming Chen and Bin Guo

Critical Class Sensitive Active Learning Method for Classification of Remote Sensing Imagery

Lian-Zhi Huo, Zheng Zhang and Liang Tang

Content Specific Coverage Patterns for Banner Advertisement Placement

Venkata Trinath Atmakuri, Gowtham Srinivas Parupalli and Krishna Reddy Polepalli

Appliance and State Recognition using Hidden Markov Models

Antonio Ridi, Christophe Gisler and Jean Hennebert

Exploring Technological Trends for Patent Evaluation

Shuting Wang, Wang-Chien Lee, Zhen Lei, Xianliang Zhang and Yu-Hsuan Kuo

Crowdsourced Data Analytics: A Case Study of Predictive Modeling Competition

Yukino Baba, Nozomi Nori, Shigeru Saito and Hisashi Kashima

DSAA – S9: Complex Data Analysis

A Model-Selection Framework for Concept-Drifting Data Streams

Bo-Heng Chen and Kun-Ta Chuang

A Probabilistic Condensed Representation of Data for Stream Mining

Michael Geilke, Andreas Karwath and Stefan Kramer

Incrementally Mining Temporal Patterns in Interval-based Databases

Yi-Cheng Chen, Julia Zhu-Ya Weng, Jun-Zhe Wang, Chien-Li Chou, Jiun-Long Huang and Suh-Yin Lee

The Purpose of Motion: Learning Activities from Individual Mobility Networks

Salvatore Rinzivillo, Lorenzo Gabrielli, Mirco Nanni, Luca Pappalardo, Fosca Giannotti and Dino Pedreschi

On Selecting Feature-Value Pairs on Smart Phones for Activity Inferences

Gunarto Sindoro Njoo, Yu-Hsiang Peng, Wen-Chih Peng and Kuo-Wei Hsu

DSAA – S10: Community and Network Analysis

Learning Sparse and Scale-free Networks

Melih Aslan, Xuewen Chen and Hong Cheng

Overlapping Community Detection in Social Network Based on Microblog User Model

Yajun Gu, Bofeng Zhang, Guobing Zou, Mingqing Huang and Keyuan Jiang

Information Diffusion among Users on Facebook Fan Pages over Time: Its Impact on Movie Box Office

Wan-Hsin Tang, Mi-Yen Yeh and Anthony J.T. Lee

Inferring Potential Users in Mobile Social Networks

Tsung-Hao Hsu, Chien-Cheng Chen, Meng-Fen Chiang, Kuo-Wei Hsu and Wen-Chih Peng

DSAA – S11: Cloud Computing & Parallel Computing

A Token Authentication Solution for Hadoop Based on Kerberos Pre-Authentication

Kai Zheng and Weihua Jiang

Hadoop based Deep Packet Inspection System for Traffic Analysis of E-Business Websites

Jiangtao Luo, Yan Liang, Wei Gao and Junchao Yang

A Data Reusing Strategy Based On Hive

Heng Xie, Mei Wang and Jiajin Le

Dehazing Algorithm’s High Performance and Parallel Computing for GF-1 Satellite Images

Changmiao Hu, Xiaojun Shan and Zheng Zhang

General In-Situ Matrix Transposition Algorithm for Massively Parallel Environments

Marcin Gorawski and Michal Lorek

DSAA – SS1: Statistical and Mathematical Tools for Data Mining – SMTDM (1)

Efficient Learning of General Bayesian Network Classifier by Local and Adaptive Search

Sein Minn, Shunkai Fu and Michel C. Desmarais

A Naive Bayesian Classifier in Categorical Uncertain Data Streams

Jiaqi Ge, Yuni Xia and Jian Wang

A New Set of Random Forests with Varying Dynamic Data Reduction and Voting Techniques

Hussein Mohsen, Hasan Kurban, Mark Jenne and Mehmet Dalkilic

Centroid Training to Achieve Effective Text Classification

Libiao Zhang, Yuefeng Li, Yue Xu, Dian Tjondronegoro and Chao Sun

Representing Sentence with Unfolding Recursive Autoencoders and Dynamic Average Pooling

Yin Hang, Zhang Chunhong, Zhu Yunkai and Ji Yang

DSAA – SS2: Statistical and Mathematical Tools for Data Mining – SMTDM (2)

A Novel Context-based Implicit Feature Extracting Method

Li Sun, Sheng Li, Jiyun Li and Jutao Lv

Local Feature Based Dynamic Time Warping

Zheng Zhang, Liang Tang and Ping Tang

Mobile User Stability Prediction with Random Forest Model

Danqin Wang and Xiaolong Zhang

Detecting Stock Market Manipulation using Supervised Learning Algorithms

Koosha Golmohammadi, Osmar Zaiane and David Diaz

Recommending Missing Citations for Newly Granted Patents

Sooyoung Oh, Zhen Lei, Wang-Chien Lee and John Yen

DSAA – SS3: Warehousing and Intelligent Analysis of Complex Network Big Data – WIACNBD (1)

FCA for Common Interest Communities discovering

Soumaya Guesmi, Chiraz Trabelsi and Chiraz Latiri

Diversification Recommendation of Popular Articles in Micro-blog Scenario

Jianxing Zheng, Bofeng Zhang, Guobing Zou and Xiaodong Yue

Analysis of Circadian Rhythms from Online Communities of Individuals with Affective Disorders

Bo Dao, Thin Nguyen, Dinh Phung and Svetha Venkatesh

Link Prediction and Threads in Email Networks

Qinna Wang

Mining Approximate Multi-Relational Patterns

Eirini Spyropoulou and Tijl De Bie

A Detecting Community Method in Complex Networks with Fuzzy Clustering

Xiao Feng Wang, Gong Shen Liu and Jian Hua Li

DSAA – SS4: Warehousing and Intelligent Analysis of Complex Network Big Data – WIACNBD (2)

Neural Network-Based Approaches for Predicting Query Response Times

Elif Ezgi Yusufoglu, Murat Ayyildiz and Ensar Gul

User Preference Space Partition and Product Filters for Reverse Top-k Queries

Zong-Hua Yang and Hung-Yu Kao

Finding Top-k Semantically Related Terms in Relational Keyword Search

Xiangfu Meng and Jingyu Shao

Design Process for Big Data Warehouses

Francesco Di Tria, Ezio Lefons and Filippo Tangorra

23-bit Metaknowledge Template Towards Big Data Knowledge Discovery and Management

Nima Bari, Roman Vichr, Kamran Kowsari and Simon Berkovich

DSAA – SS5: Environmental and Geo-spatial Data Analytics – EGSDA

Location Semantics Prediction for Living Analytics by Mining Smartphone Data

Chi-Min Huang, Jia-Ching Ying, Vincent S. Tseng and Zhi-Hua Zhou

Management of Complex Data objects in Ship Designing Process

Ruihan Bao and Hongming Cai

SAR Target Recognition Based on Deep Learning

Sizhe Chen and Haipeng Wang

Mining Frequent Time Interval-based Event with Duration Patterns from Temporal Database

Kuan-Ying Chen, Bijay Jaysawal, Jen-Wei Huang and Yong-Bin Wu

DSAA – SS6: Bioinformatics, Biomedical, Health & Medical Analytics – BHMA

Sharing Sensitive Medical Data Sets for Research Purposes – A Case Study

Kalpana Singh, Jia Rong and Lynn Batten

Mobile-based Food Classification For Type-2 Diabetes Using Nutrient and Textual Features

Yan Luo, Charles Ling and Shuang Ao

Solving Longest Overlap Region Problem For Noncoding DNA Sequence With GPU

Yukun Zhong, Jianbiao Lin, Chen Tao, Baoqiu Wang and Che Nian

Investigation of SEE on a 32-bit Microprocessor Based on SPARC V8 Architecture by Laser Test

Chunqing Yu, Long Fan, Suge Yue, Maoxin Chen and Shougang Du

Individualized Arrhythmia Detection with ECG Signals from Wearable Devices

Thanh Binh Nguyen, Wei Luo, Terry Caelli, Svetha Venkatesh and Dinh Phung

DSAA – SS7: Exploratory Computing – EC

Exploratory Computing: A Draft Manifesto

Nicoletta Di Blas, Mirjana Mazuran, Paolo Paolini, Elisa Quintarelli and Letizia Tanca

Exploratory Portals. The Need for a New Generation

Paolo Paolini and Nicoletta Di Blas

Exploring Emotions over Time within the Blogosphere

Patrick Hennig, Philipp Berger, Lukas Pirl, Lukas Schulze and Prof. Dr. Christoph Meinel

StatsReduce in the Cloud for Approximate Analytics

Michel De Rougemont

An Implementation of the Efficient Huge Amount of Pseudo-random Unique Numbers Generator and the Acceleration Analysis of Parallelization

Yun-Te Lin, Yung-Hsiang Huang, Yu-Jung Cheng, Yi-Hao Hsiao, Fang-Pang Lin, Jih-Sheng Chang and Shengwen Wang

Toward Robust Classification using the Open Directory Project

Jongwoo Ha, Jung-Hyun Lee, Won-Jun Jang, Yong-Ku Lee and Sangkeun Lee

Tutorial 1: Network Mining and Analysis for Social Applications

Lecturer: Feida Zhu

Date: November 1, 2014, 13:20 – 15:20

Room: East Asia Hall, B/F


The recent blossoming of social network and communication services in both public and corporate settings has generated a staggering amount of network data of all kinds. Unlike the bio-networks and chemical compound graph data often used in traditional network mining and analysis, the new network data arising from these social applications are characterized by rich attributes, high heterogeneity, enormous size and complex patterns with varied semantic meanings, all of which pose significant research challenges to the graph/network mining community. In this tutorial, we aim to examine some recent advances in network mining and analysis for social applications, covering a diverse collection of methodologies and applications from the perspectives of network patterns, relationship mining and identity linkage. For each perspective, we will present the problem setting, the research challenges, recent research advances and some future directions.


Feida Zhu is an assistant professor at the School of Information Systems of Singapore Management University (SMU). He founded, and manages as Academic Director, the Pinnacle Lab, a joint lab with China Ping An Insurance Group that focuses on social media mining and analysis for finance innovation. He obtained his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC) in 2009. His current research interests include large-scale data mining, graph/network mining and social network analysis. His research on large-scale frequent pattern mining won Best Student Paper Awards at the 2007 IEEE International Conference on Data Engineering (ICDE) and the 2007 Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD).

Tutorial 2: Harvesting, Integrating, Maintaining and Leveraging Knowledge Graphs

Lecturer: Seung-won Hwang

Date: November 1, 2014, 15:40 – 17:40

Room: East Asia Hall, B/F


Knowledge graphs have served as an integral component of modern search engines and software in general. However, it is non-trivial to harvest, cleanse, and integrate web-scale knowledge graphs, a problem that has been actively studied in diverse fields of computer science. This two-hour tutorial will cover recent work on broad issues of harvesting, integrating, maintaining, and leveraging knowledge graphs.


Seung-won Hwang is an associate professor of computer science and engineering at POSTECH, Korea. Her recent research projects have focused on integrating multiple knowledge graphs for data-driven understanding of entities. Her recent findings have been published at database and NLP venues, including ACL, TKDE, TOIS, TODS, SIGMOD, VLDB, and ICDE.