Big Data Analytics

Course in collaboration with Sopra Steria

About

This course aims to introduce the use of Big Data systems for processing data at scale on top of distributed architectures (e.g., cluster- and cloud-based architectures).

Using the Big Data life cycle as reference (i.e., data acquisition, storage, preparation, analysis, and visualization phases), the course introduces the fundamental concepts at the core of the existing Big Data stack and shows their practical application with concrete hands-on examples.

Objectives

At the end of this course, students will be able to:

  • Define and illustrate with concrete examples the characteristics of Big Data (i.e., volume, velocity, and variety).

  • Understand and configure the main components of a Big Data platform for analytical operations.

  • Analyse large and heterogeneous datasets (structured and unstructured) in batch and streaming modes.

Methodology

The course follows the principle of blended learning. Students are expected to read the provided online material and prepare for each lesson in advance.

Prerequisites

  • Fundamentals of (Relational) DBMS
  • Fundamentals of Data Science
  • Fundamentals of Distributed systems
  • Fundamentals of Graphs and their associated operations

Evaluation

Staff

Lecturers

Javier Espinosa

CPE, Univ. Lyon / LIRIS-CNRS

Associate Professor

Laura Po

University of Modena and Reggio Emilia

Associate Professor

Osman Aidal

CNRS / CC-IN2P3

Research Engineer

Guests

Camille Boutar

Sopra Steria

Campus Manager, HR

Academic Recruitment

Johan Lefort

Sopra Steria

Data Engineer

Big Data, Business Intelligence, Economics

Matthis Crozier

Sopra Steria

Data Engineer

Big Data

Nassima Ben Bahtane

Sopra Steria

Data Engineer

PostgreSQL, Spring Boot, DevOps, Apache NiFi

Nicolas Brisy

Sopra Steria

Senior Architect

NoSQL, Big Data, DevOps, Java