
STAT1005 Course Outline

This document outlines the course details for STAT 1005 – Essential skills for undergraduates: Foundations of Data Science at the University of Hong Kong. It provides the staff details and contact hours, course objectives, syllabus, assessment details and project coordination details. The course introduces basic concepts and methodology of data science through lectures, tutorials and a group project. Students will engage in a full data workflow including collaborative data science projects covering topics from data acquisition and cleaning to analysis, modeling, and communicating results.

Uploaded by

yi yeung yau

THE UNIVERSITY OF HONG KONG

DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE


DEPARTMENT OF COMPUTER SCIENCE

STAT 1005 – Essential skills for undergraduates: Foundations of Data Science

Lecture Venue: CPD-LG.01
Lecture Date and Time: Mon 3:30p.m.–5:20p.m.; Thur 3:30p.m.–4:20p.m.
Tutorial Venue: See moodle
Tutorial Date and Time: See moodle
Part I: Staff Details and Contact Hours

Wk 2-6 (Sep 4 – Oct 8): CS Staff

Lecturer: Dr. Yu, Tao
• Contact hours: Tue 2-4pm
• Contact method: Email: tyu@cs.hku.hk; CB-204E (By appointment)
• Teaching weeks: Wk 2-6, Python programming (Sep 12 public holiday with online test as make-up class)
• Assessment marking and questions: MC Mid-term Test (20%)
o It is an online test on week 7 (select any 2 hours for the test between Oct 9 (Sun) – Oct 10 (Mon))

Tutorial Tutor: Guan Rongxin
• Contact hours: Web < > 2pm-4pm
• Contact method: Email: u3009630@connect.hku.hk
• Teaching weeks: Wk 2-3, Tutorial 2-3
• Assessment marking and questions: Assignment 1 (10%)
o Deadline to be announced by Dr Yu

Tutorial Tutor: Xie, Tianbao
• Contact hours: Fri 10 am - 12 pm
• Contact method: Email: tianbaox@connect.hku.hk
• Teaching weeks: Wk 4-6, Tutorial 2-3

Tutorial Tutor: Su, Hongjin
• Contact hours: Mon 1 pm - 3 pm
• Contact method: Email: hjsu@connect.hku.hk
• Teaching weeks: Wk 4-6, Tutorial 2-3

Wk 1, 8-12, 15: SAAS Staff

Lecturer: Dr. Adela Lau
• Contact hours: Mon 5:30-7:30pm
• Contact method: Email: adelalau@hku.hk; RRS 207 (By appointment)
• Topics to be covered:
o Wk 1: Intro (1hr) + Online Final Test on Wk 14
o Wk 8: Sampling Distributions and Correlation Analysis
o Wk 9: Hypothesis testing
o Wk 10: Regression and Prediction
o Wk 11: Classification
o Wk 12 (no face-to-face class, video record): Project and competition briefing and sample codes video records (videos will be posted on wk 5 – 7 for home watch and discussion forum Q&A)
o Wk 15 (no face-to-face class, youtube record and discussion forum): Online Project presentation record and discussion forum Q&A
• Assessment marking and questions:
o MC Final Test (20%): It is an online test on week 14 (select any 2 hours for the test between Nov 27 (Sun) – Nov 28 (Mon))
o Group Project (30%):
  - Online Project Presentation and discussion forum – asynchronized mode (10%) on week 15 Dec 5 (Mon) 3:30 – 6:30pm
  - Group Report (20%) – By week 15 Dec 11 (Sun) 23:59pm in moodle assignment dropbox

Tutorial Tutor: WU Qingyi
• Contact hours: Tue 10-12pm
• Contact method: Email: juliewu9@connect.hku.hk
• Topics to be covered: Wk 5, 7: Group project lead (forming groups and metaverse guide) – 30 mins video record
• Assessment marking and questions: Project coordination; Group arrangement on week 5-7 via moodle questionnaire

Tutorial Tutor: GUO Yunshan
• Contact hours: Sat 8:30-10:30am
• Contact method: Email: 3007805@connect.hku.hk
• Topics to be covered:
o Wk 8 tutorial
o Wk 7: 0.5 hr project brainstorming (Gp 1-12) - zoom
o Wk 12: 0.5 hr report progress (Gp 1-12) - zoom
• Assessment marking and questions: Ass 2 Q1 (2.5%) – By Oct 30 (Sun) 23:59 in moodle assignment dropbox

Tutorial Tutor: YAO Minhao
• Contact hours: Tue 5:30-7:30pm
• Contact method: Email: mhyao@connect.hku.hk
• Topics to be covered:
o Wk 9 tutorial
o Wk 7: 0.5 hr project brainstorming (Gp 13-24) - zoom
o Wk 12: 0.5 hr report progress (Gp 13-24) - zoom
• Assessment marking and questions: Ass 2 Q2 (2.5%) – By Nov 6 (Sun) 23:59 in moodle assignment dropbox

Tutorial Tutor: TIAN Peixin
• Contact hours: Mon 2:00-4:00pm
• Contact method: Email: pxtian@connect.hku.hk
• Topics to be covered:
o Wk 10 tutorial
o Group project (programming samples)
o Wk 7: 0.5 hr project brainstorming (Gp 25-36) - zoom
o Wk 12: 0.5 hr report progress (Gp 25-36) - zoom
• Assessment marking and questions: Ass 2 Q3 (2.5%) – By Nov 13 (Sun) 23:59 in moodle assignment dropbox

Tutorial Tutor: MIAO Yan
• Contact hours: Mon 2:30-4:30pm
• Contact method: Email: ymiao7@connect.hku.hk
• Topics to be covered:
o Wk 11 tutorial
o Group project
o Wk 7: 0.5 hr project brainstorming (Gp 37-38) - zoom
o Wk 12: 0.5 hr report progress (Gp 37-48) - zoom
• Assessment marking and questions: Ass 2 Q4 (2.5%) – By Nov 20 (Sun) 23:59 in moodle assignment dropbox

Project Tutorial Tutor: Jin Zhenchao
• Contact hours: Sat 10:00 – 12:00pm
• Contact method: Email: blwx96@connect.hku.hk
• Topics to be covered:
o Wk 7: Group project lead (programming sample codes) – 45 mins video record
o Wk 7: 0.5 hr project brainstorming (Gp 61-72) - zoom
o Wk 12: 0.5 hr report progress (Gp 61-72) - zoom
• Assessment marking and questions: Group Project Programming Marking (10%) – By week 15 Dec 11 (Sun) 23:59pm in moodle assignment dropbox

Project Tutorial Tutor: LIU Yuanpei
• Contact hours: TBA (To be confirmed)
• Contact method: Email: ypliu0@connect.hku.hk
• Topics to be covered:
o Wk 7: Group project lead (programming sample codes) – 45 mins video record
o Wk 7: 0.5 hr project brainstorming (Gp 49-60) - zoom
o Wk 12: 0.5 hr report progress (Gp 49-60) - zoom
• Assessment marking and questions: Group Project Programming Marking (10%) – By week 15 Dec 11 (Sun) 23:59pm in moodle assignment dropbox

Course Objective

• The course introduces basic concepts and methodology of data science to junior undergraduate students. The teaching is
designed at a level appropriate for all undergraduate students with various backgrounds and without pre-requisites.
• Students will engage in a full data workflow including collaborative data science projects. They will study a full
spectrum of data science topics, from initial investigation and data acquisition to the communication of final results.
• Specifically, the course provides exposure to different data types and sources, and the process of data curation for the
purpose of transforming them to a format suitable for analysis. It introduces elementary notions in estimation, prediction
and inference. Case studies involving less-manicured data are discussed to enhance the computational and analytical
abilities of the students.

Syllabus
- General introduction to data science
* Overview with selected case studies. General discussion on origins and forms of data, associated questions and
types of tools for their analysis.
- Data management and exploration
* Data sources, data collection and its impact on visualization, modeling and generalizability of results; data
cleaning/extraction; quick introduction to high-level programming languages and Integrated Development
Environments (IDEs) (Python, R); Exploratory Data Analysis (EDA); summaries, aggregation, smoothing,
distributions of data; data visualization.
- Data analytics
* Complements on programming;
* Statistics (1): model for randomness, random variables, distributions, histograms, correlations.
* Statistics (2): independent samples, estimation of mean and variance, confidence interval, hypothesis testing with
p-value.
* Statistics (3): regression models, forecasting, method of classification.
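
As an illustration of the topics listed under Statistics (2), here is a minimal Python sketch; it is not part of the official course material. The sample values and the null value `mu0` are invented for illustration, and the helper name `summarize` is hypothetical. The interval and test use a normal approximation for simplicity; with a sample this small, a t-distribution would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def summarize(sample, mu0, confidence=0.95):
    """Estimate the mean, build a confidence interval, and test H0: mean == mu0.

    Uses a normal (z) approximation, which is an assumption made here
    to keep the sketch within the standard library.
    """
    n = len(sample)
    xbar = mean(sample)            # point estimate of the population mean
    s = stdev(sample)              # sample standard deviation (n - 1 denominator)
    se = s / sqrt(n)               # standard error of the mean

    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (xbar - z_crit * se, xbar + z_crit * se)

    # Two-sided p-value for the z statistic.
    z_stat = (xbar - mu0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

    return {"mean": xbar, "variance": s ** 2, "ci": ci, "p_value": p_value}


# Hypothetical measurements, invented purely for illustration.
sample = [4.8, 5.1, 5.3, 4.9, 5.6, 5.2, 5.0, 5.4, 4.7, 5.5]
result = summarize(sample, mu0=5.0)

print(f"mean = {result['mean']:.3f}, variance = {result['variance']:.3f}")
print(f"95% CI = ({result['ci'][0]:.3f}, {result['ci'][1]:.3f})")
print(f"p-value = {result['p_value']:.3f}")
```

A small p-value would suggest the data are unlikely under the null hypothesis; how small is "small" depends on the significance level chosen before looking at the data.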
Intended Learning Outcomes
On successful completion of the course, students should be able to:
CLO 1: Explore and wrangle data; summarize and visualize data.
CLO 2: Formulate problems and bring elementary concepts in estimation, prediction, and inference to bear.
CLO 3: Write basic functions and simple data analysis code using state-of-the-art computing software.

Pre-requisites
• NIL

Teaching and Assessment


This course uses problem-based, information acquisition, innovation, collaborative, and peer learning teaching methods.
Teaching is made up of a three-hour lecture and a one-hour tutorial per week. Teaching materials will be uploaded to the
course Moodle for reference and review. Full attendance in lectures and tutorials is expected. Student engagement is
expected via class participation and email communication.
Assessment includes individual assignments (20%), two tests (40%), and a group project (40%). Unless an acceptable
reason is given, a penalty will be applied to any late submission of coursework and project. Partially or wholly copied
assignments and projects will be penalized and/or reported as plagiarism.

1. Assignment (20%)
There will be two assignments. All questions must be answered in a manner that is easily readable. All symbols
must be defined clearly. Some questions may require you to use Python to analyze some data sets. Your
programming output should be clearly labelled, and edited to remove non-essential portions. Do NOT simply tack
on a pile of output through which the readers will have to read slowly to find the relevant analyses. Electronic
submission is preferred.

2. Tests (40%)
There will be two tests. These will assess CLOs 1-3.

3. Group Project (40%)


Students should work on the project in teams of five students. Students consider a real problem involving class
topics on a data set. Each team must write a project report, find the necessary data, carry out the project, and give
an oral presentation. This will directly assess CLOs 1-3.
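
The component weights listed in Part I can be tallied as a quick check that they sum to 100%. The following is a minimal sketch: the weights are taken from the staff table, while the function name `overall_mark` and the example scores are hypothetical and only show how a weighted course mark could be computed.

```python
# Component weights (in percent) as listed in Part I of this outline.
WEIGHTS = {
    "Assignment 1": 10.0,
    "Assignment 2 Q1": 2.5,
    "Assignment 2 Q2": 2.5,
    "Assignment 2 Q3": 2.5,
    "Assignment 2 Q4": 2.5,
    "MC Mid-term Test": 20.0,
    "MC Final Test": 20.0,
    "Group project presentation and discussion forum": 10.0,
    "Group report": 20.0,
    "Group project programming": 10.0,
}


def overall_mark(scores):
    """Weighted course mark from per-component scores on a 0-100 scale.

    Components missing from `scores` count as 0 (an assumption of this sketch).
    """
    return sum(w * scores.get(name, 0.0) for name, w in WEIGHTS.items()) / 100.0


total_weight = sum(WEIGHTS.values())
print(f"Total weight: {total_weight}%")

# Hypothetical student scoring 80 on every component.
example_scores = {name: 80.0 for name in WEIGHTS}
print(f"Overall mark: {overall_mark(example_scores):.1f}")
```

Grouped by category, these weights give 20% for assignments, 40% for the two tests, and 40% for the group project, matching the numbered headings above.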

Grade Descriptors
A: Demonstrate thorough mastery at an advanced level of extensive knowledge and skills required for attaining
all the course learning outcomes. Show strong analytical and critical abilities and logical thinking, with evidence
of original thought, and ability to apply knowledge to a wide range of complex, familiar and unfamiliar situations.
Apply highly effective organizational and presentational skills.
B: Demonstrate substantial command of a broad range of knowledge and skills required for attaining at least most
of the course learning outcomes. Show evidence of analytical and critical abilities and logical thinking, and ability
to apply knowledge to familiar and some unfamiliar situations. Apply effective organizational and presentational
skills.
C: Demonstrate general but incomplete command of knowledge and skills required for attaining most of the
course learning outcomes. Show evidence of some analytical and critical abilities and logical thinking, and ability
to apply knowledge to most familiar situations. Apply moderately effective organizational and presentational
skills.
D: Demonstrate partial but limited command of knowledge and skills required for attaining some of the course
learning outcomes. Show evidence of some coherent and logical thinking, but with limited analytical and critical
abilities. Show limited ability to apply knowledge to solve problems. Apply limited or barely effective
organizational and presentational skills.
F: Demonstrate little or no evidence of command of knowledge and skills required for attaining the course
learning outcomes. Lack of analytical and critical abilities, logical and coherent thinking. Show very little or no
ability to apply knowledge to solve problems. Organization and presentational skills are minimally effective or
ineffective.

Department’s policy on absence from class test


If for any reason you are or have been unable to attend a mid-term test, and if you wish to have a supplementary
mid-term test,
(a) all full-time students should write to the General Office of the Department of Statistics and Actuarial Science
(and email to the course instructor) giving reasons for your absence;
(b) all part-time students should write to the course instructor giving reasons for your absence via email, within 7
days of the absence.
A special/supplementary test is normally granted to those absent from the original test due to illness, provided
an original medical certificate is submitted. Students absent for other reasons are not granted a
special/supplementary test except in very special circumstances and with valid documentary proof provided.

Plagiarism
Plagiarism is a serious offence in the academic world. It constitutes academic theft – the offender has ‘stolen’
some intellectual property and presented it as his or her own. Plagiarism speaks to a person’s integrity and honesty,
stifles creativity and originality, and defeats the fundamental purpose of education.

In this University, plagiarism is a disciplinary offence. Any student who commits the offence may face
disciplinary action. It is the responsibility of all students at all levels to familiarize themselves with proper
academic practice of writing, citation and referencing. This website (https://tl.hku.hk/plagiarism/) provides
general guidance on what constitutes plagiarism, why it is wrong, and how to avoid it. Students are also expected
to seek specific guidance within their discipline and to consult the relevant University policies and regulations.
Useful links are provided on this website.

References
1. Peter Bruce, Andrew Bruce, and Peter Gedeck. Practical Statistics for Data Scientists: 50+ Essential Concepts
Using R and Python, 2nd Edition. ISBN-13: 978-1492072942 (Chapters 1-5).
URL: https://www.amazon.com/Practical-Statistics-Data-Scientists-Essential/dp/149207294X
Publisher example code for this textbook:
https://www.oreilly.com/library/view/practical-statistics-for/9781492072935/
