Data Analysis and Machine Learning with Python
  • Video Length : 18h17m36s
  • Tasks Number : 96
  • Students Enrolled : 111
  • Medium Level
Authors

Kevin Gautama is a systems design and programming engineer with 16 years of experience in electrical and electronic engineering and information technology.

He taught at the Hanoi University of Industry from 2003 to 2011 and holds a vocational training certificate issued by the Ministry of Industry and Commerce and the Hanoi University of Industry.

Drawing on extensive design experience gained through numerous engineering projects, the author founded the Enziin Academy.

The Enziin Academy is a startup in the field of education. Its core goal is to train design engineers in technology-related fields.

The Enziin Academy is headquartered in Stockholm, Sweden, and is oriented toward multilingual, global operation.

The author's skills in IT:

  • Implementing application infrastructure on Amazon's cloud computing platform.
  • Linux server system administration (Sysadmin).
  • Designing load balancing and content distribution systems.
  • MySQL database administration.
  • C/C++/C# programming.
  • Ruby and Ruby on Rails programming.
  • Python and Django programming.
  • WPF/C# programming on the .NET Framework.
  • PHP/Java programming.
  • Machine learning and expert systems.
  • Internet of Things.

The author's skills in electrical and electronic engineering:

  • Design of popular CPU/MCU systems.
  • FPGA/CPLD system design (Xilinx, Altera).
  • Design and programming of DSP systems (Texas Instruments).
  • Embedded ARM system design.
  • RTOS programming.
  • Design and programming of electronic power systems.
  • Industrial PLCs, inverters, sensors, and electrical control cabinets.
  • Distributed control systems connected to a server.

Curriculum

  • 1. Introduction
    • The tasks to do in this course (11m26s)
    • Install Development Environment (11m26s)
  • 2. Interactive Computing with IPython
    • IPython Basics (11m26s)
    • The commands in IPython (11m26s)
    • Interacting with the OS (11m26s)
    • Debug with pdb (11m26s)
    • Advanced IPython Features (11m26s)
  • 3. Arrays and Vectorized Computation
    • Introduction to NumPy (11m26s)
    • Multidimensional Array Object (11m26s)
    • Fast Element-wise Array Functions (11m26s)
    • Data Processing Using Arrays (11m26s)
    • File Input and Output with Arrays (11m26s)
    • Linear Algebra (11m26s)
    • Random Number Generation (11m26s)
  • 4. Data Analysis with pandas
    • Introduction to pandas Data Structures (11m26s)
    • Essential Functionality (11m26s)
    • Summarizing and Computing Descriptive Statistics (11m26s)
    • Handling Missing Data (11m26s)
    • Hierarchical Indexing (11m26s)
    • Advanced pandas (11m26s)
  • 5. Data Loading, Storage, and File Formats
    • Reading and Writing Data in Text Format (11m26s)
    • Binary Data Formats (11m26s)
    • Interacting with HTML and Web APIs (11m26s)
    • Interacting with Databases (11m26s)
  • 6. Data Wrangling
    • Combining and Merging Data Sets (11m26s)
    • Reshaping and Pivoting (11m26s)
    • Data Transformation (11m26s)
    • String Manipulation (11m26s)
  • 7. Plotting and Visualization
    • Matplotlib APIs (11m26s)
    • Plotting Functions in pandas (11m26s)
    • Example Visualizing Earthquake Crisis Data (11m26s)
    • Visualization Tool Ecosystem (11m26s)
  • 8. Data Aggregation and Group Operations
    • GroupBy Mechanics (11m26s)
    • Data Aggregation (11m26s)
    • Group-wise Operations and Transformations (11m26s)
    • Pivot Tables and Cross-Tabulation (11m26s)
  • 9. Time Series
    • Date and Time Data Types (11m26s)
    • Time Series Basics (11m26s)
    • Date Ranges Frequencies and Shifting (11m26s)
    • Time Zone Handling (11m26s)
    • Periods and Period Arithmetic (11m26s)
    • Resampling and Frequency Conversion (11m26s)
    • Time Series Plotting (11m26s)
    • Moving Window Functions (11m26s)
    • Performance and Memory Usage (11m26s)
  • 10. Financial and Economic Data
    • Time Series and Cross-Section Alignment (11m26s)
    • Operations with Time Series of Different Frequencies (11m26s)
    • Time of Day and Data Selection (11m26s)
    • Splicing Together Data Sources (11m26s)
    • Return Indexes and Cumulative Returns (11m26s)
    • Group Transforms and Analysis (11m26s)
  • 11. Advanced NumPy
    • ndarray Object Internals (11m26s)
    • Advanced Array Manipulation (11m26s)
    • Broadcasting (11m26s)
    • Structured and Record Arrays (11m26s)
    • Sorting (11m26s)
    • NumPy Matrix Class (11m26s)
    • Advanced Array I/O (11m26s)
  • 12. Big Data with Python
    • Introducing big data (11m26s)
    • Hadoop for big data (11m26s)
    • Apache Hadoop (11m26s)
    • Example in Hadoop (11m26s)
    • Hadoop for finance (11m26s)
    • Introducing NoSQL (11m26s)
    • MongoDB and PyMongo (11m26s)
  • 13. Getting Started with Python Machine Learning
    • Machine learning and Python (11m26s)
    • A simple example machine learning (11m26s)
    • Linear regression algorithm (11m26s)
    • Training a linear regression model (11m26s)
    • Recursive polynomial algorithm (11m26s)
    • Training a recursive polynomial model (11m26s)
    • Support Vector Machine Regression (11m26s)
    • The Decision Tree Algorithm (11m26s)
    • Random forest algorithm (11m26s)
  • 14. Classification in Machine Learning
    • Logistic Regression (11m26s)
    • K-Nearest Neighbor Classifier (11m26s)
    • Support Vector Machine (11m26s)
    • Kernel Support Vector Machine (11m26s)
    • Naive Bayes Classifier (11m26s)
    • Tree Based Algorithms (11m26s)
    • Random Forest Classifier (11m26s)
    • K-means clustering (11m26s)
    • Hierarchical Clustering in Python (11m26s)
  • 15. Artificial Neural Networks
    • Introduction to ANN (11m26s)
    • Mathematical basis of ANN (11m26s)
    • Perceptron neural network (11m26s)
    • The Backpropagation Algorithm (11m26s)
    • Building an ANN (11m26s)
    • Training an ANN (11m26s)
  • 16. TensorFlow Framework
    • Introduction to TensorFlow (11m26s)
    • TensorFlow APIs (11m26s)
    • Building an ANN with TensorFlow (11m26s)
    • Training an ANN with TensorFlow (11m26s)
  • 17. Practical Projects
    • Handwriting Recognition with Python (11m26s)
    • Image Recognition with Python (11m26s)
    • Natural Language Processing with Python (11m26s)


Note: This module belongs to one or more classes. Separate billing for this module is allowed if the content matches. The classes that use this module are listed below.

Python for Data Analysis

For many people, the Python language is easy to fall in love with. Since its first appearance in 1991, Python has become one of the most popular dynamic programming languages, along with Perl, Ruby, and others.

Python and Ruby have become especially popular in recent years for building websites using their numerous web frameworks, like Rails (Ruby) and Django (Python). Such languages are often called scripting languages as they can be used to write quick-and-dirty small programs, or scripts.

I don’t like the term “scripting language” as it carries a connotation that they cannot be used for building mission-critical software. Among interpreted languages Python is distinguished by its large and active scientific computing community.

Adoption of Python for scientific computing in both industry applications and academic research has increased significantly since the early 2000s.

For data analysis, interactive exploratory computing, and data visualization, Python will inevitably draw comparisons with the many other domain-specific open source and commercial programming languages and tools in wide use, such as R, MATLAB, SAS, and Stata. In recent years, Python’s improved library support (primarily pandas) has made it a strong alternative for data manipulation tasks.

Combined with Python’s strength in general purpose programming, it is an excellent choice as a single language for building data-centric applications.

Python for Machine Learning

Machine learning (ML) teaches machines how to carry out tasks by themselves. It is that simple. The complexity comes with the details, and that is most likely the reason you are taking this course. Maybe you have too much data and too little insight, and you hope that machine learning algorithms will help you solve this challenge.

So you started to dig into random algorithms. But after some time you were puzzled: which of the myriad algorithms should you actually choose? Or maybe you are broadly interested in machine learning and have been reading a few blogs and articles about it for some time.

The goal of machine learning is to teach machines to carry out tasks by providing them with a couple of examples (how to do or not do a task). Let us assume that each morning when you turn on your computer, you perform the same task of moving e-mails around so that only those e-mails belonging to a particular topic end up in the same folder.

After some time, you feel bored and think of automating this chore. One way would be to start analyzing your brain and writing down all the rules your brain processes while you are shuffling your e-mails. However, this will be quite cumbersome and always imperfect.

While you will miss some rules, you will over-specify others. A better and more future-proof way would be to automate this process by choosing a set of e-mail meta information and body/folder name pairs and letting an algorithm come up with the best rule set.

The pairs would be your training data, and the resulting rule set (also called model) could then be applied to future e-mails that we have not yet seen. This is machine learning in its simplest form. Of course, machine learning (often also referred to as data mining or predictive analysis) is not a brand new field in itself.

Quite the contrary, its success over recent years can be attributed to the pragmatic way of using rock-solid techniques and insights from other successful fields; for example, statistics. There, the purpose is for us humans to get insights into the data by learning more about the underlying patterns and relationships.
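The e-mail sorting scenario described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the training pairs, the folder names, and the helper functions `train` and `predict` are hypothetical, and the "rule set" here is a plain word-frequency count rather than any particular algorithm taught in the course.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (e-mail text, folder name) pairs.
TRAINING_PAIRS = [
    ("invoice for your recent order", "billing"),
    ("payment received for invoice", "billing"),
    ("team meeting moved to monday", "work"),
    ("project meeting agenda attached", "work"),
]


def train(pairs):
    """Count how often each word appears per folder (the learned 'rule set')."""
    model = defaultdict(Counter)
    for text, folder in pairs:
        model[folder].update(text.lower().split())
    return model


def predict(model, text):
    """Pick the folder whose training words best match the new e-mail."""
    words = text.lower().split()
    scores = {
        folder: sum(counts[word] for word in words)
        for folder, counts in model.items()
    }
    return max(scores, key=scores.get)


# The model is built from examples, then applied to e-mails it has not seen.
model = train(TRAINING_PAIRS)
print(predict(model, "please find the invoice attached"))  # billing
print(predict(model, "agenda for the next meeting"))       # work
```

No sorting rule is written by hand here. Changing the training pairs changes the resulting model, which is the point of the example.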

As you read more and more about successful applications of machine learning (you have checked out kaggle.com already, haven't you?), you will see that applied statistics is a common field among machine learning experts. As you will see later, the process of coming up with a decent ML approach is never a waterfall-like process.

Instead, you will see yourself going back and forth in your analysis, trying out different versions of your input data on diverse sets of ML algorithms. It is this explorative nature that lends itself perfectly to Python. Being an interpreted high-level programming language, it may seem that Python was designed specifically for the process of trying out different things.

What is more, it does this very fast. Sure enough, it is slower than C or similar statically typed programming languages; nevertheless, with a myriad of easy-to-use libraries that are often written in C, you don't have to sacrifice speed for agility.
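The point about C-backed code can be illustrated without any third-party package. The following is a minimal sketch, assuming the standard CPython interpreter, where the built-in `sum` is implemented in C. The helper `python_loop_sum` and the list size are arbitrary choices for illustration, and the measured times will differ from machine to machine.

```python
import timeit

values = list(range(100_000))


def python_loop_sum(numbers):
    """Add the numbers one at a time in interpreted Python code."""
    total = 0
    for number in numbers:
        total += number
    return total


# Both approaches compute the same result.
loop_result = python_loop_sum(values)
builtin_result = sum(values)

# Time each approach over 20 runs; exact numbers depend on the machine.
loop_seconds = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_seconds = timeit.timeit(lambda: sum(values), number=20)

print(f"pure-Python loop: {loop_seconds:.4f} s")
print(f"built-in sum:     {builtin_seconds:.4f} s")
```

Libraries such as NumPy apply the same idea to whole arrays: the loop runs in compiled code while the calling program stays in Python.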
