The ability to extract insights from massive amounts of data has become a key to success in many industries. This Big Data Analytics (BDA) course surveys state-of-the-art topics in big data, focusing on data analysis and machine learning technologies. Through hands-on experience with real applications, participants will gain an understanding of the opportunities big data can bring to their business. The topic is approached from both the practical and the theoretical side.
From the practical side:
Participants will have a chance to practice with state-of-the-art big data toolkits and platforms, such as R, Hadoop, and Spark. Hands-on assignments let you build your first spam classifier, recommendation system, and anomaly detector. At the end of the course, participants will be able to answer the following questions:
- What is big data, and why has the big data era come about?
- What are the main challenges of big data, and how can they be tackled using state-of-the-art big data technologies?
- What is the difference between data and “information”?
- What is the grammar of data manipulation, and how is it used to manipulate data?
- What is the grammar of graphics, and how is it used to visualize data?
- What are the common statistical machine learning models used in big data? What techniques are suitable in different
From the theoretical side:
Besides introducing state-of-the-art big data toolkits and technologies, this course also devotes considerable attention to the theory of learning from data, from both the statistical and the machine learning perspectives. To provide a solid foundation for your future big data projects, the following concepts will be carefully covered during the course:
- Sample bias problem
- Occam’s razor and the inherently imperfect definition of simplicity
- Bias-variance tradeoff
- Curse of dimensionality
- Overfitting and regularization
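To give a flavour of the last of these concepts, here is a minimal Python sketch of overfitting and regularization. It is an illustration only and not part of the course material: the target function, sample sizes, polynomial degree, and penalty strength are all assumptions chosen for the example. It fits a flexible polynomial to a handful of noisy points, once with plain least squares and once with a ridge (L2) penalty, and prints the training and test errors of both fits.

```python
import numpy as np

# All settings below are illustrative assumptions, not taken from the course.
DEGREE = 12      # deliberately flexible model
N_TRAIN = 15     # few training points, so overfitting is likely
N_TEST = 200     # larger held-out sample to estimate generalization error
NOISE = 0.2      # standard deviation of the observation noise
LAMBDA = 1e-2    # ridge penalty strength


def make_data(n, rng):
    """Sample noisy observations of sin(pi * x) on [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(0.0, NOISE, size=n)
    return x, y


def features(x, degree):
    """Polynomial features 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)


def fit_ridge(x, y, degree, lam):
    """Minimise ||Xw - y||^2 + lam * ||w||^2 (lam = 0 is plain least squares).

    The penalty is folded into an augmented least-squares problem, which
    stays numerically well behaved even when lam is 0. For simplicity the
    intercept is penalised as well.
    """
    X = features(x, degree)
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(degree + 1)])
    y_aug = np.concatenate([y, np.zeros(degree + 1)])
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w


def mse(w, x, y):
    """Mean squared error of the polynomial with coefficients w."""
    pred = features(x, len(w) - 1) @ w
    return float(np.mean((pred - y) ** 2))


rng = np.random.default_rng(0)
x_train, y_train = make_data(N_TRAIN, rng)
x_test, y_test = make_data(N_TEST, rng)

w_plain = fit_ridge(x_train, y_train, DEGREE, 0.0)
w_ridge = fit_ridge(x_train, y_train, DEGREE, LAMBDA)

for name, w in [("no penalty", w_plain), ("ridge", w_ridge)]:
    print(f"{name:>10}: train MSE = {mse(w, x_train, y_train):.4f}, "
          f"test MSE = {mse(w, x_test, y_test):.4f}, "
          f"||w|| = {np.linalg.norm(w):.2f}")
```

By construction, the unpenalised fit always has the lower training error. Whether the penalised fit generalises better depends on the data and on the penalty strength, which in practice would be chosen by validation rather than fixed by hand as it is here.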
Who should participate?
This is the second time we are running this course, so many materials and assignments have been adapted to help participants with different backgrounds get the most out of it. No prior programming experience is needed, although there will be challenging programming exercises for those who are serious about becoming data scientists. Participants may include:
- Engineers who need to learn data analysis skills and new big data technologies to apply in their work.
- Technical managers who need to familiarize themselves with these emerging technologies.
- Computer science students who want to become data scientists.
Østfold University College and NCE Smart Energy Markets also offer a short program in Big Data Analytics. Read about it here (in Norwegian).