Big Data Management and Analysis in Linux

Summer in Amsterdam Program
Amsterdam, Netherlands

Dates: 7/2/22 - 7/31/22

OVERVIEW

CEA CAPA Partner Institution: Vrije Universiteit Amsterdam
Location: Amsterdam, Netherlands
Primary Subject Area: Computer Sciences
Instruction in: English
Transcript Source: Partner Institution
Course Details: Level 300
Recommended Semester Credits: 3
Contact Hours: 45
Prerequisites: The course is fairly technical and includes many computer tutorials. There are no entry requirements other than a willingness to learn about programming in Linux, but a decent background in statistics, mathematics, and programming is an advantage.

DESCRIPTION

The growing availability of extremely large datasets requires scientists and analysts to use powerful supercomputers or computer clusters to store, manage, and analyze these data. These clusters typically run on Linux, so working on them requires some programming skills and familiarity with suitable software packages. Our course will introduce you to programming in a Linux environment, teach you how to efficiently manage very large datasets (e.g., using the sed, awk, and grep commands), and show you how to create simple shell scripts to analyze your data (e.g., using a Linux version of the freely available statistics program R). You will also learn how to visualize your data and results in customized plots and figures. These skills are extremely valuable for scientists from all disciplines as well as for business practitioners (e.g., consultants or financial analysts) who are planning to work with big data.

The course consists of three-hour lectures in the morning, followed by two hours of supervised work in computer tutorials in the afternoon. Both the lectures and tutorials will be held in a computer room. The lectures will be interactive, with short examples that allow students to apply the introduced concepts. In the tutorials, students will get more hands-on training in a supervised environment with exercises covering the day's topics, and they will have the opportunity to work on the assignments. The computer room will stay open to students for self-study after the tutorials.

Students are not required to bring their own laptops, but they are allowed to do so if they wish to work on their own computers.

Contact hours listed under a course description may vary due to the combination of lecture-based and independent work required for each course. CEA's recommended credits are based on the contact hours assigned by Vrije Universiteit Amsterdam (VU Amsterdam): 15 contact hours equal 1 U.S. credit.

