Introductory Machine Learning for Earth Scientists: Course Details

Course number 01230480 Credits 2
English title Introductory Machine Learning for Earth Scientists
Prerequisites Students must have experience with at least one programming language, such as Python or MATLAB, and be able to use it for tasks such as loading and plotting a data file.
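Chinese description (translated) This course introduces basic Machine Learning (ML) concepts and shows their applications in the Earth Sciences.
ML comprises a range of different algorithms that are typically used to make predictions from complex and non-intuitive data. These algorithms have been very successful in many technological applications and can also be used in the Earth Sciences. The aim of this course is to give an overview of the various ML algorithms, to give students an intuitive understanding of how they work and where they apply, and to enable students to apply ML techniques to geophysical problems.
The course introduces the most important and most widely used ML algorithms, starting with more basic ones such as linear and logistic regression, multivariate regression, and principal component analysis, moving on to more advanced ones such as artificial neural networks, random forests, and support vector machines, and ending with unsupervised ML algorithms such as K-means clustering. The basic working principle of each algorithm is introduced theoretically so that students develop an intuitive understanding of it.
Each ML algorithm is paired with a real case study from the Earth Sciences, ranging from climate science, palaeoceanography, volcanology, and seismology to remote sensing and petroleum geoscience. Each case study includes a programming exercise in which students use an ML programming language (Python) to analyze real geophysical data and discuss their findings, developing the ability to handle data precisely and to interpret ML predictions.
The course is a blended online/offline course: students watch recorded video lectures at home and complete simple programming exercises as homework. Classroom meetings (which may be held online or offline) are used to discuss findings and to work on more advanced exercises in groups. These exercises are marked. The course is taught in English.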
English description This course introduces basic Machine Learning (ML) concepts and shows their applications in the Earth Sciences.

ML encompasses a range of different algorithms that are used to make predictions from data that are often complex and non-intuitive. Such algorithms have been very successful in many technological applications, and they can also be used in the Earth Sciences. The aim of this course is to give an overview of the various ML algorithms, to give students an intuitive understanding of how they work and which algorithm suits which situation, and to enable students to apply ML techniques to geophysical problems.

The most important and widely used ML algorithms are introduced, starting with simple linear and logistic regression, multivariate regression, and principal component analysis, before continuing to more advanced algorithms such as Artificial Neural Networks, Random Forests, and Support Vector Machines, and finally unsupervised ML algorithms such as K-means clustering. The basic working principle of each algorithm is introduced theoretically, with a focus on intuitive understanding.

Following this, each ML algorithm will be applied to a real Case Study from the Earth Sciences, ranging from climate science, (palaeo)oceanography, volcanology, and seismology to remote sensing and petroleum geoscience. Each Case Study will consist of a programming exercise in which students use ML toolboxes/packages in Python to analyze real geophysical data and discuss the findings with fellow students, developing the ability to critically analyze data and interpret ML predictions.

The course is a blended learning (online/offline) course: students watch recorded video lectures at home and do simple programming exercises as homework. Classroom meetings (which may be conducted online or offline) will be used to discuss findings and solve more advanced exercises in groups. The solutions to these exercises will be marked. The course is taught in English.
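Offering department School of Earth and Space Sciences
General elective area
Counts toward arts and aesthetic education
Platform course nature
Platform course type
Language of instruction English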
Textbooks None
Reference books
Data Mining and Knowledge Discovery for Geoscientists, Guangren Shi, Elsevier, 2013, ISBN 0124104371
Machine Learning Methods in the Environmental Sciences, William Hsieh, Cambridge University Press, 2009, 1st ed., ISBN 0521791928
Artificial Intelligent Approaches in Petroleum Geosciences, Cranganu, Luchian, Breaban, Springer, 2016, 1st ed., ISBN 3319359924
Syllabus Machine Learning (ML) techniques are now ubiquitous in the technology sector and are increasingly used in the sciences. In the Earth Sciences in particular, adoption of ML techniques is still in its early stages but is growing rapidly owing to the great potential of ML. Key to realizing this potential is educating Earth Scientists in ML techniques and showing them how ML can be applied to geoscience-related problems. The purpose of this course is to enable the next generation of geoscientists, first, to understand current ML techniques and, second, to apply them to a variety of geophysical data and problems. The course aims to bridge the gap between ML and the geosciences and lets students gain hands-on experience applying ML to real geophysical data.
1. Introduction: What is Machine Learning and what are applications in Earth Sciences? (1 lecture)

2. Review of fundamentals of Statistics
2.1. Correlation, Linear Regression, and Logistic Regression (1 lecture)
2.2. Case Study: Regression to analyse temperature time-series and glacial cycles (1 in-class exercise session)

3. Dimensionality Reduction
3.1. Multivariate Regression and Principal Component Analysis (1 lecture)
3.2. Case Study: Reconstructing Sea-Surface-Temperatures from microfossils (1 in-class exercise session)

4. Artificial Neural Networks (ANN)
4.1. Model Representation, Cost Function, Backpropagation (1 lecture)
4.2. Case Study: Classification of Volcanic Ash Particles (2 in-class exercise sessions)

5. Support Vector Machines (SVM)
5.1. Support Vector Machines for supervised and unsupervised classification (1 lecture)
5.2. Case Study: Land Cover Classification from Satellite (LANDSAT) images (1 in-class exercise session)

6. Random Forests (RF)
6.1. Decision Trees and Random Forests (1 lecture)
6.2. Case Study: Predicting Laboratory Earthquakes from Acoustic Emissions (1 in-class exercise session)

7. Unsupervised Learning and Cluster Analysis
7.1. K-means Algorithm (1 lecture)
7.2. Case Study: Inferring lithologies from well-log data (1 in-class exercise session)
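As a rough illustration of topic 7.1, the following is a minimal sketch of the K-means (Lloyd) iteration written with NumPy only. It is not taken from the course materials: the function name `kmeans`, the synthetic two-column "well-log" data, and all numerical values are assumptions chosen purely for demonstration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct samples drawn from the data.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment so that labels match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids


# Synthetic stand-in for two well-log measurements per depth sample
# (hypothetical values, NOT real data): two well-separated groups.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[40.0, 2.2], scale=[5.0, 0.05], size=(100, 2))
group_b = rng.normal(loc=[110.0, 2.5], scale=[5.0, 0.05], size=(100, 2))
X = np.vstack([group_a, group_b])

# Standardise each column so that both measurements carry similar weight.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

labels, centroids = kmeans(X_std, k=2)
print("cluster sizes:", np.bincount(labels, minlength=2))
```

Standardising before clustering matters here because K-means relies on Euclidean distance, so a column with a larger numerical range would otherwise dominate the assignment step.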
- Teaching consists of lectures (1 lecture = 2 hours) and in-class exercise sessions (1 session = 2 hours).

- In lectures, new topics are introduced to the students.

- In in-class exercise sessions, students work on Case Studies to implement their own Machine Learning algorithms (alone or in groups).
Each Case Study is to be handed in as marked homework the following week. The final mark will be the average of the homework marks.
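To give a sense of the scale of a simple programming exercise, here is a hedged sketch of an ordinary least-squares trend fit, loosely in the spirit of the regression topic in section 2. It is not an actual course exercise: the temperature-anomaly series is synthetic, and the trend, offset, and noise level are assumed values used only for illustration.

```python
import numpy as np

# Synthetic stand-in for an annual temperature-anomaly record (NOT real data):
# an assumed linear trend plus Gaussian noise.
rng = np.random.default_rng(0)
years = np.arange(1900, 2021)
t = (years - years[0]).astype(float)  # years since the start of the record
true_slope = 0.01                     # degC per year, hypothetical
anomaly = true_slope * t - 0.5 + rng.normal(0.0, 0.1, size=t.size)

# Ordinary least squares: minimise ||A @ [slope, intercept] - anomaly||^2.
A = np.column_stack([t, np.ones_like(t)])
coeffs, *_ = np.linalg.lstsq(A, anomaly, rcond=None)
slope, intercept = coeffs

# Coefficient of determination as a simple goodness-of-fit measure.
pred = A @ coeffs
ss_res = np.sum((anomaly - pred) ** 2)
ss_tot = np.sum((anomaly - anomaly.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"fitted trend: {slope * 10:.3f} degC per decade, R^2 = {r2:.2f}")
```

Working from the design matrix `A` rather than a ready-made fitting routine keeps the least-squares step visible, which fits the stated goal of having students implement algorithms themselves.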
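Teaching evaluation BERNDTTHOMASANDREAS:
Academic year and semester: 20-21-1, Class: Introductory Machine Learning for Earth Scientists 1, Course recommendation score: 0.0, Instructor recommendation score: 9.38, Course score band: 95-100;
Academic year and semester: 21-22-1, Class: Introductory Machine Learning for Earth Scientists 1, Course recommendation score: 0.0, Instructor recommendation score: 10.0, Course score band: 95-100;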