CP1541: DATA ANALYTICS
SYLLABUS
Module I:
Introduction, how analytics is used in practice, how analytics works in different companies: Google, Facebook, Kaggle, and Netflix. BIG DATA – Introduction, why big data – evolution principles – difference from regular data – convergence of key trends – A Wider Variety of Data – unstructured data – Big Data Business Models, Enabling Big Data Analytic Applications. Big data analytics in industry – Digital Marketing and the Non-line World – web analytics – big data and marketing – New School of Marketing – fraud and big data – risk and big data – credit risk management – big data and algorithmic trading – big data and healthcare – big data in medicine – advertising and big data – big data technologies.
Module II:
Big Data Technology - Old vs. New Approaches, Data Discovery, Open-Source Technology, Cloud and Big Data, Predictive Analytics, Software as a Service, Mobile Business Intelligence, Crowdsourcing Analytics, Inter- and Trans-Firewall Analytics, R&D Approach Helps Adopt New Technology, Adding Big Data Technology into the Mix, Information Management - Big Data Foundation, Big Data Computing Platforms, Big Data Computation, Big Data Storage, Big Data Computational Limitations, Big Data Emerging Technologies. Business Analytics - Geospatial Intelligence, Consumption of Analytics, Creation and Visualizing, Tools for Analytic Applications.
Module III:
The People Part of the Equation - Evolution of Data Science, Learning over Knowing, Data Scientist Skills, Critical Thinking, Holistic View of Analytics, Setting Up the Right Organizational Structure for Institutionalizing Analytics, Data Privacy and Ethics - Privacy Landscape, Customer Relationship Management, Rights and Responsibility, Technologies for Anonymizing Data.
Module IV:
BASICS OF HADOOP - Introduction to Hadoop - Data, Data Storage and Analysis, Querying, Comparison with Other Systems - Relational Database Management Systems, Grid Computing, Volunteer Computing, A Brief History of Apache Hadoop. Design of Hadoop Distributed File System (HDFS) – HDFS concepts – Java interface – data flow – Hadoop I/O – data integrity – compression – serialization, Avro – file-based data structures. HADOOP RELATED TOOLS - HBase – data model and implementations – HBase clients.
REFERENCES
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
2. Tom White, "Hadoop: The Definitive Guide", Third Edition, O'Reilly, 2012.