Invited speakers

Confirmed speakers include:

Chiranjib Bhattacharyya
Professor, Indian Institute of Science
Title: Learning Labels on Graphs using Orthonormal Embeddings
Existing research suggests that embedding graphs on a unit sphere can be beneficial in learning labels on the vertices of a graph. However, the choice of optimal embedding remains an open issue. Orthonormal representation of graphs, a class of embeddings over the unit sphere, was introduced by Lovász (1979) and was used to derive the famous $\vartheta$ function, an important construct in algorithmic graph theory. Recently we discovered an interesting connection between kernel methods and the $\vartheta$ function. In this talk we will review this connection and discuss new results which show that there exist orthonormal representation based classifiers which are statistically consistent over a large class of graphs. Time permitting, we will discuss a linear time algorithm for solving the planted clique problem by using SVMs.
Bio for Chiranjib Bhattacharyya:
Chiranjib Bhattacharyya is currently a Professor in the Department of Computer Science and Automation (CSA), Indian Institute of Science. He completed his PhD in 2000 from CSA, and after a postdoctoral stint at UC Berkeley he joined the same department in 2002 as an Assistant Professor. Since then he has been working on various aspects of Machine Learning. His current research interests are Kernel methods, Probabilistic modelling using Bayesian Non-parametrics, and Robust optimization.
Jidong Chen
Risk Intelligence Director, Ant Financial, Alibaba Inc.
Title: Big Data Based Network Security and Fraud Risk Management at Alipay
With the development of mobile internet and finance, fraud risk comes in all shapes and sizes. This talk introduces fraud risk management at Alipay using big data. Alipay has built a fraud risk monitoring and management system based on real-time big data processing and intelligent risk models. It captures fraud signals directly from huge amounts of data on users’ behaviors and networks, analyzes them in real time using machine learning, and accurately predicts the bad guys and transactions. To extend its fraud risk prevention ability to external customers, Alipay has also built a big data based fraud prevention product called AntBuckler. AntBuckler aims to identify and stop all flavors of malicious behavior with flexibility and intelligence for the digital business of merchants and banks. By combining large amounts of data from Alipay and the customer, AntBuckler uses the RAIN score engine to quantify the risk level of a user or transaction for fraud prevention. It also has a user-friendly visualization UI with risk scores, top reasons and fraud connections. I will give more details about AntBuckler in this talk.
Bio for Jidong Chen:
Dr. Jidong Chen is Risk Intelligence Director and Senior Data Expert at Alipay, Ant Financial Group. He is responsible for fraud prevention based on big data analysis and for biometric identification applications at Alipay. Before joining Alipay, he was the chief data scientist at Renren Games and a research manager of the Big Data Lab at EMC Labs China, working on research and advanced development projects in Big Data Management and Analytics. He joined EMC in 2007 on receiving his Ph.D. in Computer Science from Renmin University of China. He joined the CCF Task Force on Big Data in 2012. He has published over 20 papers in refereed international journals and conference proceedings including Geoinformatica, JCST, SIGMOD, SIGIR, VLDB, CIKM, SDM, MDM etc. He has filed 5 patents in the U.S. and 2 in China. He co-authored the monograph “Moving Objects Management: Models, Techniques, and Applications” published by Springer and Tsinghua Press. He has held visiting research positions at the PRISM Lab of Versailles University in Paris (2005) and Hong Kong Baptist University (2006).
Brenton Cooper
Chief Technology Officer, Data to Decisions CRC
Title: Big Data use cases in defence and security
The Data To Decisions Cooperative Research Centre, a collaboration between Government, Industry and Academia, was established to tackle the Big Data challenges facing Defence and National Security. Focussing on open source data, the D2DCRC will deliver capability to address several high priority user needs including event prediction and improved information retrieval from open source data.
Bio for Brenton Cooper:
Dr Brenton Cooper is the CTO of the Data To Decisions Cooperative Research Centre, a collaboration between Government, Industry and Academia established to tackle the Big Data challenges of Defence and National Security. Brenton has previously held a variety of technology and management roles in BAE Systems, Tenix Defence and Motorola. He has significant expertise in the application of leading edge R&D in the fields of defence, national security and telecommunications. He has delivered advanced concept demonstrators that demonstrate the value that can be extracted from large data holdings, with application to fraud detection, counter-terrorism, law enforcement and battle-space information management. He holds three international patents (granted or submitted) in areas ranging from data mining to information security. Brenton holds a PhD and Bachelor (Honours) degree in Electronic Engineering from the University of Adelaide.
Warwick Graco
Senior Director Operational Analytics, ATO, Australian Whole-of-Government Data Analytics Centre of Excellence
Title: Analytics Centre of Excellence: Roles, Responsibilities and Challenges
The Australian Government established a Whole of Government Data Analytics Centre of Excellence to address issues to do with big data and analytics. This centre provides a template for other governments to follow to facilitate and make best use of both capabilities to improve policy making, to deliver services to citizens and to achieve other government priorities and objectives. This presentation will cover the roles, responsibilities and challenges with setting up and running a centre of excellence and how it can work with government departments and agencies to assist with aggregating and managing data and with extracting value from this important resource.
Bio for Warwick Graco:
Warwick has worked in defence, health and taxation and has been involved in analytics for over 20 years. He is a practicing analytics professional and is currently convenor of the Whole of Government Data Analytics Centre of Excellence. He has a BSc from the University of New South Wales and a PhD from the University of New England Australia. His professional interests include digital transformation and innovation, organizational learning, organizational decision making and analytics. He is a board member of the Institute of Analytics Professionals Australia and a member of the College of Organizational Psychology of the Australian Psychological Society.
Prateek Jain
Researcher, Microsoft Research
Title: TBA

Bio for Prateek Jain:
Prateek Jain has been a researcher at Microsoft Research Lab, India since Jan 2010. Before joining MSRI, Prateek received his PhD and MS from the CS Dept., The University of Texas at Austin in Dec 2009, under the guidance of Prof. Inderjit S. Dhillon. He received his BTech degree in Computer Science from IIT Kanpur in 2004. Prateek’s primary research interests are in Machine Learning, Non-convex Optimization and Linear Algebra. Prateek received best student paper awards at CVPR-2008 and ICML-2007, and a best runner up award at SDM-2007.
Xiaolong Jin
Associate Professor, Institute of Computing Technology, Chinese Academy of Sciences
Title: Network Big Data: Facing and Tackling the Complexities
Network big data have emerged from the interaction between humans, machines and things in cyberspace. The increasing scale and complexity of the data require new techniques and architectures to capture, curate, manage, and process the data in order to mine its value. In this talk, I will discuss the major scientific challenges brought by network big data, i.e., data complexity, computing complexity and system complexity. These three complexity problems require us to take a brand new view of data, algorithms and computational systems. We are expected to further discover the common laws within big data, study reduced and incremental computing theory, and bring the life cycle of data into system design. A number of related works, as well as our own practice from these perspectives, will be covered in this talk. At the end of the talk, I will briefly introduce the Task Force on Big Data of the China Computer Federation, the first expert community specialized in big data in China, and the influential events it organizes.
Bio for Xiaolong Jin:
Xiaolong Jin is currently an Associate Professor at the CAS Key Laboratory of Network Science and Technology, the Institute of Computing Technology (ICT), the Chinese Academy of Sciences (CAS). He is a member of the Task Force on Big Data, China Computer Federation. He obtained his Ph.D. degree in Computer Science from Hong Kong Baptist University in 2005. His current research interests include Social Computing, Social Networks, Performance Modelling and Evaluation, and Multi-Agent Systems. He has over 120 research papers published in prestigious journals, including IEEE ToC, IEEE TWC, IEEE TSMC, IEEE TPDS etc., and in reputable international conferences, including WWW, CIKM, GLOBECOM, ICC, LCN, AAMAS, IAT, WI, etc. He received the IEEE Best Paper Award at the IEEE 21st International Conference on Advanced Information Networking and Applications (AINA’07) and the Excellent Paper Award at the Second International Conference on Active Media Technology (ICAMT’03). He has served as a chair, co-chair, or member on the technical and executive committees of more than 50 conferences and workshops including IUCC, LCN, AINA, AAMAS, WI, and IAT.
Balaji Krishnapuram
Director & Distinguished Engineer, IBM Watson Health
Title: Rapid Learning using Privacy-Preserving Distributed Data-Mining
Recent advances in technology and data acquisition costs are leading to an explosion of electronic data for bioinformatics and clinical research. At the same time we are witnessing a new paradigm for clinical research based on secondary use of data from Electronic Medical Records (EMR). In this talk, we describe a Health IT system that supports Rapid Learning across hospitals in Germany, Belgium and the Netherlands. We describe the technological developments that enable us to conduct clinical research by bringing the computation to the data in federated databases in each hospital. The system avoids centralizing the data and preserves patient privacy, thus overcoming ethical, political and legal challenges to sharing patient data. Further, it uses ontologies and machine learning methods to dramatically reduce administrative and IT costs for collecting, normalizing and exchanging information across disparate source systems that use different languages, clinical protocols and database schemas. We demonstrate the impact of the system on translational clinical research based on a case study across 5 hospitals in 4 countries.
Bio for Balaji Krishnapuram:
Balaji Krishnapuram is responsible for Analytics for IBM Watson Health. He currently leads the development of two products and a cloud-based analytics platform for Healthcare.

Previously he led teams that launched 7 commercially successful products using Machine learning over the last 10 years at Siemens Healthcare. He organized over 15 international conferences/workshops, serving as the General Chair of ACM SIGKDD 2010 and ACM SIGKDD 2016. He edited a book, authored over 25 patents and published over 50 articles in the leading journals and research conferences in the areas of machine learning, information extraction, personalized medicine etc. He obtained his B Tech from IIT Kharagpur in 1999, and PhD from Duke University in 2004.

Sofus Macskássy
Manager, Applied Machine Learning, Facebook
Title: An overview of (some) Machine Learning at Facebook
How do we do scalable machine learning at Facebook and where is it used? In this talk, I will first provide an overview of some of our machine learning infrastructure and the tools that we use to make machine learning scalable and easy to use. In the latter half of the talk, I will discuss one specific ML use case on predicting attributes of nodes in a large social graph.
Bio for Sofus Macskássy:
Sofus A. Macskássy is part of the applied machine learning team at Facebook. He previously ran the user modeling group at Facebook in their Core Data Science team, was part of the research faculty at USC/ISI, and was the Director of Fetch Labs at Fetch Technologies. He received his PhD in machine learning/information filtering at Rutgers University. He is passionate about learning about users to better serve them through better filtering, ranking and recommendation. He was the general chair of KDD-2014, serves on the editorial boards of JAIR and ML, and is well published at top-tier conferences and journals.
Wanli Min
Director for Data Science, Alibaba Group
Title: An Analytical Path from Law of Large Number to Big Data
In the era of big data, explosive growth in data volume and variety enables better-informed decisions in many business processes, with unprecedented scope and depth. The latest developments in machine learning, data mining and artificial intelligence have been adopted to tackle various business problems with high business impact. In this talk, we will discuss several use cases in e-commerce, advertising and smarter cities, highlighting the vital role of data mining technology in these use cases. Technological development cannot thrive without proper business models, and big data technology is no exception. We will also share the lessons learned in this respect from our unsuccessful efforts. In the end we will offer an outlook on business opportunities empowered by big data development.
Bio for Wanli Min:
Dr. Wanli Min received his Ph.D. in Statistics from The University of Chicago in 2004. Thereafter he held several positions at IBM T. J. Watson Research Center, IBM Singapore, and Google. He has conducted methodological research on statistical modeling and provided consulting services to customers in various sectors. In 2013 he joined Alibaba Group as Director for Data Science. His research interests include probability theory, time series analysis, stochastic process inference, and signal processing.
Stefan Mozar
IEEE NSW Section
Title: Data Analytics in Electronics Manufacturing
Testing of electronic printed circuit board assemblies (PCA) is a bottleneck in high volume production. Test engineers try to test as much as possible to “guarantee” a functional PCA. These tests take significantly more time than it takes to assemble the PCA. This talk shows how data analytics can help reduce test times, and even eliminate PCA testing in many instances. It shows how engineering data lakes can be formed, and how these can be used to better understand causes of failure in design and in manufacturing.
Bio for Stefan Mozar:
Stefan Mozar is the past president of the IEEE Consumer Electronics Society, and a consulting engineer. He is an Adjunct Professor at Guangdong University of Technology (GDUT), China. His research interests include the application of Data Analytics in engineering design. He was a member of the Philips team that did pioneering work on Robust Design. Statistical techniques he helped develop are widely used in manufacturing, especially in the semiconductor industry. He has worked on four continents, in both industry and academia. His work has resulted in publications, inventions, and patents. He has worked on projects that have won about 30 international design awards, including an Australian Design Award.

He is a Fellow of the IEEE, and of the Institution of Engineers Australia. He is an IEEE Distinguished Lecturer, and has received a number of awards from the IEEE.

Rajesh Parekh
Vice President of Data Science, Groupon
Title: Powering Local Commerce Using Big Data Analytics
Groupon’s pioneering concept of daily deals in local commerce has rapidly evolved as a key enabler connecting online and mobile users with offline local merchants. This new business model presents interesting opportunities for large-scale data analytics and data mining. In this talk, I will provide an overview of some challenging data problems, such as user-deal personalization, and present a “view from the trenches” on the key insights learned, data systems, and methodologies that could be leveraged for solving these problems. I will also outline some interesting open problems that would be of interest to researchers and practitioners alike.
Bio for Rajesh Parekh:
Dr. Rajesh Parekh is Vice President of Data Science at Groupon where he leads the global Data Science efforts. His teams comprise talented data scientists, engineers, and product managers working to solve interesting and challenging problems in the space of local commerce. Rajesh’s focus spans the entire spectrum of data analytics including defining the right KPIs and metrics to monitor the health of the business, building global dashboards for accurate and near-real-time reporting of business performance, conducting deep analysis of user and merchant data to identify strategic insights, and developing large-scale machine learned models for optimizing Groupon’s business. Rajesh received the Top Inventor award in 2013 and the Most Prolific Inventor Award in 2014.

Prior to Groupon, Rajesh was Senior Director of Research at Yahoo Labs where he led the display advertising targeting sciences. His efforts led to a new advertising product offering called perform-alike targeting. At Yahoo, he received the You Rock award for his work on real-time prediction of newsworthy queries, and the Data Wizard award for designing the system that optimizes the number of sponsored ads shown on a search results page. Before Yahoo, Rajesh was at Blue Martini Software where he developed data mining solutions for e-commerce. Rajesh started his professional career at Allstate where he used data mining to solve insurance problems like cross-sell, retention, and fraud.

Rajesh earned his Ph.D. in Computer Science from Iowa State University. His dissertation research focused on constructive learning of Neural Networks and Grammars from example data. He has authored over 30 research publications and has filed over 50 patents. He is actively involved in the data mining community. He is an advisor to the Hive, an incubator for data-driven businesses, and served as founding co-chair of the new Industry Practice Expo track at the KDD conference from 2011 to 2013 and in 2015.

Shuo Shen
Deputy director and CTO of China IoT ID Service Platform, Computer Network Information Center, CAS, China
Title: Big Data in Public Service in China
In the era of IoT, a huge number of objects are online or in environments where they can be accessed by people or machines. It is an essential public service to let people know the information behind these objects, for example product information, ownership, responsibility and access control. Thus an authoritative service is needed for people to access this information. A national ID service platform in China was founded to provide this service. The basic service and implementation of this platform will be introduced, and experience and problems in public data service will be shared.
Bio for Shuo Shen:
Sean Shen (Shuo Shen), Ph.D., graduated from the mathematics and electronics & computer engineering departments, Purdue University, in 2007. He worked for Huawei as a senior researcher between 2007 and 2009, focusing on internet research and standardization, especially in applications and Internet security. Between 2009 and 2014, he worked for CNNIC (China Internet Network Information Center), where he served as director of CNNIC Labs and director of the Advanced Research Department. He has directed research and standardization work in areas including DNS, IP and IoT. Since 2014, he has been working as the deputy director and CTO of China IoT ID Service Platform, Computer Network Information Center, Chinese Academy of Sciences. Dr. Sean Shen has published various international and domestic standards, including RFCs and WG drafts in the IETF, and telecommunication industry and national standards in China. He has served on various committees and organizations, both international and domestic, for example as a member of the national Internet standard planning committee and co-chair of the Architecture of Ubiquitous Network Working Group.
Partha Talukdar
Assistant Professor, Indian Institute of Science
Title: TBA

Bio for Partha Talukdar:
Partha Talukdar is an Assistant Professor at the Indian Institute of Science (IISc), Bangalore. Previously, he was a Postdoctoral Fellow in the Machine Learning Department at Carnegie Mellon University, working with Tom Mitchell on the NELL project. He received his PhD (2010) in CIS from the University of Pennsylvania, working under the supervision of Fernando Pereira, Zack Ives, and Mark Liberman. Partha is broadly interested in Machine Learning, Natural Language Processing, and Neurosemantics, with particular interest in large-scale learning and inference for knowledge graphs. He is a co-author of a book on Graph-based Semi-supervised Learning published by Morgan & Claypool Publishers.
Jie Tang
Associate Professor, Tsinghua University, China
Title: Toward Understanding Big Scholar Data
Academics and researchers worldwide continue to produce large numbers of scholarly documents including papers, books, technical reports, etc. and associated data such as tutorials, proposals, and course materials. The abundance of data sources enables researchers to study scholarly collaboration at a very large scale. In this talk, I will use as an example to introduce several important heterogeneous graph mining algorithms developed by our teams and our collaborators, such as topic modeling for nodes, path similarity measure, informative subgraph mining, and association finding. The development of these novel graph-based data mining algorithms allows us to identify new associations and hidden relationships in the integrated datasets.
Bio for Jie Tang:
Jie Tang is an associate professor with the Department of Computer Science and Technology, Tsinghua University. His interests include social network analysis, data mining, and machine learning. He has published more than 100 journal/conference papers and holds 10 patents. He served as PC Co-Chair of WSDM’15, ASONAM’15, ADMA’11, SocInfo’12, KDD-CUP Co-Chair of KDD’15, Poster Co-Chair of KDD’14, Workshop Co-Chair of KDD’13, Local Chair of KDD’12, Publication Co-Chair of KDD’11, and as a PC member of more than 50 international conferences. He is the principal investigator of National High-tech R&D Program (863), NSFC project, Chinese Young Faculty Research Funding, National 985 funding, and international collaborative projects with Minnesota University, IBM, Google, Nokia, Sogou, etc. He leads the project for academic social network analysis and mining, which has attracted millions of independent IP accesses from 220 countries/regions in the world. He was honored with the Newton Advanced Scholarship Award, CCF Young Scientist Award, NSFC Excellent Young Scholar, and IBM Innovation Faculty Award.
Pia Waugh
Director of Coordination and Gov 2.0, Department of Finance, Government of Australia
Title: TBA

Bio for Pia Waugh:
Pia Waugh is an open government and open data ninja, working within the machine to enable greater transparency, democratic engagement, citizen-centric design and real, pragmatic actual innovation in the public sector and beyond. She believes that tech culture has a huge role to play in achieving better policy planning, outcomes, public engagement and a better public service all round. She is also trying to do her part in establishing greater public benefit from publicly funded data, software and research. Pia was awarded as one of the Top 100 Most Influential Women in Australia for 2014.

Pia is currently working as a Director of Coordination and Gov 2.0 for the Australian Government CTO looking at whole of government technology, services and procurement. As part of this work, Pia runs This is in the Department of Finance, which should really be called “the Department for Whole of Government Stuff” considering the breadth of stuff it does.

Christopher White
Principal Researcher, Microsoft
Title: The Changing Roles of Information for Defense and Business Intelligence
This presentation will cover recent work at DARPA and Microsoft building real-world applications for defense and law enforcement to find and analyze relevant data. The talk will be an overview of current issues with information on the Internet, detailed application examples, and software demonstrations. It will examine topics with information including trustworthiness, uncertainty, data theft, and information campaigns. It will also cover the fracturing of Internet content into web pages, social media, and other pieces of the deep web. The talk will conclude by covering a few next directions for special projects at Microsoft.
Bio for Christopher White:
Dr. Christopher White is a Principal Researcher at Microsoft working on special projects. From 2011 to 2015 he was a Program Manager at DARPA developing advanced technologies for data science. He created DARPA’s leading programs in this area: XDATA, Memex, and the Open Catalog, which are part of the President’s Big Data Initiative. Memex has been applied to domains including countering human trafficking and counterterrorism, where it has been featured on 60 Minutes, CNN, the Wall Street Journal, and Google’s Solve for X. Dr. White previously served DARPA as the Agency’s country lead in Afghanistan. Secretary of Defense Leon Panetta recognized DARPA and Dr. White’s efforts in Afghanistan with a Joint Meritorious Unit Award for support in a combat environment. Prior to DARPA he was a fellow at Harvard University’s School of Engineering and Applied Sciences. He holds a Ph.D. in electrical engineering from the Johns Hopkins University.
David Willingham
Senior Application Engineer – Data Analytics, MathWorks
Title: Bringing Big Data Modelling into the Hands of Domain Experts with MATLAB
In today’s world, there is an abundance of data being generated from many different sources. Understanding big data and then building predictive models allows you to gain insight into this data and to gain confidence in making future economic business decisions. However, one of the biggest challenges has been giving domain experts the right tools to work efficiently and with enough detail to gain this insight, without the need for advanced software programming skills.

In this session we will highlight how domain experts can explore, visualise, prototype & productionize models using big data with MATLAB. We will show:

  • How to efficiently import big data onto any computer, no matter how much RAM is installed.
  • Techniques for visualising big data and prototyping models.
  • How to improve the performance of computing big data models using multi-core machines, clusters and the Cloud.
  • How to scale and productionise models in big data environments such as Hadoop.

Bio for David Willingham:
David Willingham is a senior application engineer with over 10 years of experience in MATLAB. He specialises in data analytics and focuses on mining, finance, and energy applications. Prior to joining MathWorks, he spent a year in Singapore developing a voice-over-IP DSP algorithm in MATLAB for Infineon. David has an honours degree in electrical and computer systems engineering, specialising in signal processing, from Monash University, Australia.
Hui Xiong
Professor, The State University of New Jersey, Rutgers, USA
Title: Sequential Pattern Analysis with the Right Granularity
Sequential pattern analysis aims at finding statistically relevant temporal structures where the values are delivered in sequences. This is a fundamental problem in data mining with diversified applications in many science and business fields. Given the overwhelming scale and the dynamic nature of the sequential data, new visions and strategies for sequential pattern analysis are required to derive competitive advantages and unlock the power of big data. To this end, in this talk, we present novel approaches for sequential pattern analysis with applications in dynamic business environments. In particular, we focus on the development of “temporal skeletonization”, which can help to identify the meaningful granularity for sequential pattern mining. Along this line, we first show that a large number of symbols in a sequence can “dilute” useful patterns which themselves exist at a different level of granularity. This is the so-called “curse of cardinality”, which can impose significant challenges on the design of sequential analysis methods. To address this challenge, our key idea is to summarize the temporal correlations in an undirected graph, and use the “skeleton” of the graph as a higher granularity on which hidden temporal patterns are more likely to be identified. In the meantime, the embedding topology of the graph allows us to translate the rich temporal content into a metric space. This opens up new possibilities to explore, quantify, and visualize sequential data. Evaluation on a B2B (Business to Business) marketing application demonstrates that our approach can effectively discover critical buying paths from noisy customer event data.
Bio for Hui Xiong:
Dr. Hui Xiong is currently a Full Professor of Management Science and Information Systems at Rutgers Business School and the Director of Rutgers Center for Information Assurance at Rutgers, the State University of New Jersey, where he received a two-year early promotion/tenure (2009), the Rutgers University Board of Trustees Research Fellowship for Scholarly Excellence (2009), and the ICDM-2011 Best Research Paper Award (2011).
Dr. Xiong is a prominent researcher in the areas of business intelligence, data mining, big data, and geographic information systems (GIS). For his outstanding contributions to these areas, he was elected an ACM Distinguished Scientist. He has a distinguished academic record that includes 200+ refereed papers and an authoritative Encyclopedia of GIS (Springer, 2008). He is serving on the editorial boards of IEEE Transactions on Knowledge and Data Engineering (TKDE), ACM Transactions on Management Information Systems (TMIS), and IEEE Transactions on Big Data. Also, he served as a Program Co-Chair of the Industrial and Government Track for the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), a Program Co-Chair for the IEEE 2013 International Conference on Data Mining (ICDM-2013) and a General Co-Chair for the IEEE 2015 International Conference on Data Mining (ICDM-2015).
Qiang Yang
Professor, Hong Kong University of Science and Technology
Title: User Modeling in Telecommunications and Internet Industry
It is extremely important in many application domains to have accurate models of user behavior. Data mining allows user models to be constructed automatically from the vast amount of available data. User modeling has found applications in mobile app recommendations, social networking, financial product marketing and customer service in telecommunications. Successful user modeling should be aware of several critical issues: Who are the target users? How should the solutions be updated when new data come in? How should user feedback be handled? What are the ‘pain’ points of users? In this talk, I will discuss my own experience of user modeling with big data. I will draw examples from telecommunications and the Internet industry, contrasting and highlighting some lessons learned in these industries.
Bio for Qiang Yang:
Qiang Yang is the New Bright University Named Professor of Engineering and Chair Professor at Hong Kong University of Science and Technology, where he is the Head of the Department of Computer Science and Engineering. He was the founding head of the Huawei Noah’s Ark Lab (2012-2015). He is currently the Technical Advisor to WeChat. His research interests are data mining, machine learning and artificial intelligence. He is a fellow of AAAI, IEEE, IAPR and AAAS, and an ACM Distinguished Scientist. He received his PhD degree in Computer Science from the University of Maryland, College Park in 1989, and was a faculty member at the University of Waterloo and Simon Fraser University in Canada. He was the founding Editor in Chief of ACM Transactions on Intelligent Systems and Technology (ACM TIST) between 2009 and 2014, and is the founding Editor in Chief of IEEE Transactions on Big Data. He has served as conference or program chair for international conferences such as ACM KDD 2010 and 2012, ACM IUI 2010, IEEE Big Data 2013, ACM RecSys 2013 and the International Joint Conference on Artificial Intelligence (IJCAI) in 2015.
Xiaoru Yuan
Professor, Peking University, China
Title: Urban Big Data Visualization
Understanding the complex nature of activities in modern metropolitan regions is difficult due to the vast amount of data that must be processed and analyzed. Visualization provides essential access for users to comprehend such big data and gain insights, which is crucial for decision makers, political figures, and the general public. This talk will discuss visualization cases covering various types of urban data, including taxi GPS data, vehicle RFID data, subway IC card data, and social media data, and demonstrate how different data sets can be integrated for advanced visual analysis. With the assistance of properly designed visualization and interaction, both the general public and experts can interactively conduct data exploration, mental image construction, and insight discovery.
Bio for Xiaoru Yuan:
Xiaoru Yuan is a tenured faculty member in the School of Electronics Engineering and Computer Science. He served as the vice director of the Information Science Center at Peking University. He received Bachelor's degrees in chemistry and law from Peking University, China, in 1997 and 1998, respectively. He received the Ph.D. degree in computer science in 2006 from the University of Minnesota at Twin Cities. His primary research interests are in the fields of scientific visualization, information visualization and visual analytics. He has co-authored over 60 technical papers in IEEE Visualization, IEEE Information Visualization, IEEE TVCG, IEEE EuroVis, IEEE PacificVis and other major international visualization conferences and journals. His co-authored work on high dynamic range volume visualization received the Best Application Paper Award at the IEEE Visualization 2005 conference. He and his student team won the award of Outstanding Situation Awareness in IEEE VAST Challenge 2013, and further won three major awards in IEEE VAST Challenge 2014. He served on the program committees of IEEE VIS, EuroVis, and IEEE PacificVis. He was organization co-chair of IEEE PacificVis 2009 and program chair of VINCI 2010. He founded the ChinaVis conference in 2014 and served as its program co-chair. He also served on the editorial boards of the CCF journal of CAD&CG and the Springer Journal of Visualization, and as guest editor of IEEE TVCG and IEEE CG&A. He is a CCF outstanding member and was chair of CCF YOCSEF (2012-2013).
Baofeng Zhang
Founder and co-leader of Noah’s Ark LAB, Huawei, Hong Kong
Title: Challenges and Opportunities in Telco Big Data
“Big Data analytics” has been viewed as the next BIG thing in many vertical industries, as with GE’s “Industry internet” and the German government’s “Industry 4.0”, and the situation is the same for telecommunications. Every operator views Big Data as a new opportunity to increase revenues and profits during a time of stagnant growth in the industry, especially when facing competition from Internet service providers. However, besides the challenges of new business models, there are also unique technical challenges and opportunities in the telco industry. The goal of this talk is to share some experiences and insights on real problems, e.g.: How to initiate a data mining effort in the business? How to define the business problem to be solved first and then look for methods that might solve it? What are the challenges in BSS and OSS data integration? How do the challenges differ across data sets?
Bio for Baofeng Zhang:
Baofeng Zhang is founder and co-leader of Noah’s Ark LAB at Huawei. The lab’s mission is to make significant contributions to both the company and society by innovating in data mining, machine learning, artificial intelligence and related fields. He has more than 16 years of experience in the telco industry and about 8 years of research management background in IT areas. He has rich experience with R&D activities and management (e.g., software design and development, requirement analysis, system architecture design, standards and research), and has joined and led a wide range of research and development projects (e.g. campus calling debit card in circuit-based switch, Intelligent LAN switch/Edge Routers, Telco Big Data and Intelligence Bank …). He is also very active in the community: he has headed delegations to numerous national standards development events, is an active participant in international standards development bodies such as ITU-T and ETSI, and is a member of the core expert group in the National “Important Special Projects of Crucial Electronic Devices, Hi-end General Chips and Basic Software Products” Program. He is also a member of the CCF (China Computer Federation) Big Data Task Force.
Wenwu Zhu
Professor, Tsinghua University, China
Title: Cyber-Physical-Human Big Data Computing
Big data are always sourced from Cyber space, Physical space or Human space. Cyber-Physical-Human (CPH) big data gives the complete picture in various vital application scenarios such as urban planning and smart cities. How to bridge the heterogeneous spaces and discover knowledge from CPH big data is the central problem. This talk will present Cyber-Physical-Human (CPH) big data computing. First, we will introduce the principal concept of CPH big data computing. Second, we will present its challenges and scientific problems. Third, we will introduce the methodologies and approaches for CPH big data computing. Last, we will use cyber-social data as an example to show how we perform social media big data computing. In particular, we will present novel approaches to behavioral prediction and detection based on contextual, cross-domain, cross-platform and suspicious behavioral patterns.
Bio for Wenwu Zhu:
Wenwu Zhu is currently a Professor and Deputy Head of the Computer Science Department at Tsinghua University. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008. He worked at Bell Labs New Jersey as a Member of Technical Staff during 1996-1999. Wenwu Zhu is an IEEE Fellow, ACM Distinguished Scientist and SPIE Fellow. He has published over 200 refereed papers in the areas of multimedia computing, communications and networking. He is inventor or co-inventor of over 50 patents. His current research interests are in the areas of cyber-physical-social big data computing, social media computing, multimedia cloud computing, and multimedia communications and networking. He has served on various editorial boards, including as Guest Editor for the Proceedings of the IEEE, IEEE T-CSVT, and IEEE JSAC, and as Associate Editor for IEEE Transactions on Mobile Computing, IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology, and IEEE Transactions on Big Data. He has received six Best Paper Awards, including one at ACM Multimedia 2012. He received the Ph.D. degree in Electrical and Computer Engineering from New York University Polytechnic School of Engineering in 1996.

Presentation Slides can be downloaded now.
Thanks to MathWorks for sponsoring the 2015 Big Data Summit.
IEEE NSW Section sponsors the 2015 Big Data Summit; all IEEE NSW members are welcome to attend BDS2015.
The 2015 Big Data Summit received strong support from IEEE and ACM through its relevant task forces and chapters, and especially Chinese and Indian professional bodies and institutions.
HP sponsors Big Data Summit 2015
The 2015 Big Data Summit is co-located with the 21st ACM SIGKDD at the Hilton Sydney Hotel
“Big Data in ANZ” Forum will be held on the afternoon of 9 August 2015 at the Hilton
“Data Science in India” Forum will be held on the morning of 10 August 2015 at the Hilton
“Big Data in China” Forum will be held on the afternoon of 10 August 2015 at the Hilton
“Big Data in Asia” Panel will be held on the afternoon of 10 August 2015 at the Hilton
Big Data Summit 2015