Sep 10th
Big Data for Data Science
MIE 1628
Lecture 1
Introduction to Big Data

Lecture outline
• Introductions, instructor’s experience
• Data science objectives
• Emergence of big data industry
• What is Big Data & Hadoop:
  – History of Hadoop
• Course outline, marking scheme
• Final project discussion
• Technology for this course
• Important Hadoop concepts

Instructor’s background
Shahriar Asta
SUMMARY
Computer Scientist with 18 years of experience in applying mathematical & statistical techniques to complex data-intensive problems in finance and academia
Lead Data Scientist, TD Bank, 2018 – Present
NLP, Semantic Topic Modelling, Optical Character Recognition, Sentiment Modelling, Wealth Management
Principal Data Scientist, Manager, Capital One UK/Canada, 2015 – 2018
Fraud Modelling (first AI based application fraud model in the UK and first self-learning, self-deploying transactional fraud model in the world)
TEACHING
University of Nottingham, 2014-15
Industry Teaching Sessions, 2015-20
McMaster, AI and Machine Learning, 2020
University of Toronto, 2020
EDUCATION
University of Nottingham, Doctor of Philosophy in Computer Science, 2012 – 2015
Istanbul Technical University, MSc Robotics / Computer Engineering, 2010 – 2012
Isfahan University, BSc Computer Engineering, 1998 – 2003

Your point of contact for this course:
Lecturer:
Shahriar Asta: [email protected]
Three TAs:
• Ta Jiun Ting (Jeff): [email protected]
• Gautam Dawar: [email protected]

Survey for enrolled participants
You will receive an email with a request to answer a few questions.
Please fill out the survey by next class; it will help me adjust the curriculum to fit your background and programming skills:
– What department are you affiliated with? (MIE, others)
– Have you taken any data science courses (machine learning, statistics, etc.)?
– What programming languages and environments do you have experience with?
  • Java, C++, Python, R, MATLAB, SAS, SQL, other
– What big data solutions do you have experience with?
  • None, Hive, Impala, Spark, Kafka, HDFS, Azure, Ambari, ZooKeeper
– What topics would you like to cover and focus on in the class?

Business background and motivation for this course

My experience with enterprise data science and big data
Canadian retail
• Personal and commercial banking
• Wealth
• Insurance
• Canada: ~15M customers
Revenue: $21B (2017)
US retail
• Personal & business banking
• Wealth management
• U.S.: ~8 million retail customers
Revenue: $10.2B (2017)
Wholesale
• Capital markets
• Corporate and investment banking
Revenue: $3.2B (2017)
Corporate
Revenue: $1.6B (2017)
The Banking Business
Key Product Groups:
Personal banking:
• Personal deposits
• Consumer Lending
• Credit cards and Merchant solutions
• Auto finance
Business Banking:
• Commercial banking
• Small Business Banking
Wealth:
• Direct Investing
• Asset Management
• Advice
Insurance
Includes trade data, planner/advisor information, securities, mutual funds, market data …

Long-term technology trends transforming enterprises
Banks becoming technology companies
• Big data
• Blockchain
• Data science
• AI
• Mobile and Digital
• Process Automation
• Biometrics and adaptive security
• Cloud
• IoT

