Big Data Analytics

Course in collaboration with Sopra Steria

About

This course introduces modern Big Data technology for processing data at scale on top of distributed architectures (e.g., cluster- and cloud-based architectures).

Using the Big Data life cycle as a reference (i.e., the data acquisition, storage, preparation, analysis, and visualization phases), the course introduces the fundamental concepts at the core of existing Big Data technology and shows their practical application through hands-on examples.

Objectives

At the end of this course, students will be able to:

  • Define and illustrate with concrete examples the characteristics of Big Data (i.e., volume, velocity, and variety)

  • Understand and configure the main components of a modern Big Data platform for analytical operations

  • Analyse large and heterogeneous datasets (structured and unstructured) in batch mode

Methodology

The course follows the principle of blended learning. Students are thus expected to read and prepare using the provided online material before each lesson.

Prerequisites

Students are expected to be familiar with the following topics:

  • Fundamentals of (Relational) DBMS
  • Fundamentals of Data Science
  • Fundamentals of Distributed systems
  • Fundamentals of Graphs and their associated operations

Evaluation

  • 60%   exam + quizzes
  • 40%   datathon + demofest

Teaching Staff

Coordinator

Javier Espinosa

CPE, Univ. Lyon / LIRIS-CNRS

Associate Professor

Lecturers

Javier Espinosa

CPE, Univ. Lyon / LIRIS-CNRS

Associate Professor

John Samuel

CPE, Univ. Lyon / LIRIS-CNRS

Associate Professor

Osman Aidal

CNRS / CC-IN2P3

Research Engineer

Sopra Steria

Anissa Boukhemiri

Sopra Steria

Software Engineer

Software Development, Java

Camille Boutar

Sopra Steria

Campus Manager, HR

Academic Recruitment

Guillaume Darver

Sopra Steria

Data Specialist

Big Data, Data Science, Semantic Technologies

Mathieu Baudin

Sopra Steria

Data Specialist, Team Leader

Business Intelligence, Databricks, AWS

Nicolas Brisy

Sopra Steria

Senior Architect

NoSQL, Big Data, DevOps, Java

Syllabus

IRC & ETI

IRC program

ETI program

Practicals

  • All practicals will be conducted on cloud platforms
  • AWS Learning Labs require the Edge, Chrome, or Firefox web browser
    • Safari is not supported!

AWS Learning Labs

IRC

ETI

Databricks

CPE Lyon students have free access to the Databricks Training Catalog (by registering with their @cpe.fr account) and can obtain $200 vouchers for the Databricks certification exams.