ST309 Half Unit
Elementary Data Analytics
This information is for the 2023/24 session.
Teacher responsible
Prof Qiwei Yao Col.7.16
Availability
This course is available on the BSc in Accounting and Finance. This course is available as an outside option to students on other programmes where regulations permit. This course is available with permission to General Course students.
This course is available as an outside option to students who are interested in data analytics and who have a statistical background at least equivalent to ST107. No prior knowledge of programming is required. However, students who have no previous experience in R are required to take an online pre-sessional R course from the Digital Skill Lab (https://moodle.lse.ac.uk/course/view.php?id=7745).
This course is capped at 70 for the 2022/23 session.
This course cannot be taken with ST310 Machine Learning.
Pre-requisites
Students must have completed a statistics course at least equivalent to Quantitative Methods (Statistics) (ST107).
Students who have no previous experience in R are required to take an online pre-sessional R course from the Digital Skill Lab (https://moodle.lse.ac.uk/course/view.php?id=7745).
Course content
The primary focus of this course is to help students view various problems from business, economy/finance, and social domains from a data perspective and understand the principles of extracting useful information and knowledge from data. Students will also gain hands-on experience using R, a programming language and software environment for data analysis and visualisation. Basic data analytic methods and techniques are taught alongside real-life examples.
The core content of the course includes data cleansing, data transformation, data visualisation, R programming, classification, regression, clustering, over-fitting avoidance, and model evaluation. The course also covers a subset of the following topics: illustration of R access to databases and big data platforms, illustration of parallel computing in R, similarity matching, market-basket analysis, link prediction, text mining, network analysis, and causal modelling.
This is not a course on the algorithms and IT technologies required for handling massive data, which deserve separate courses. The focus is on the fundamental principles and concepts of data analytics or data science.