Showing posts with label Machine learning. Show all posts

Wednesday, February 5, 2020

Machine Learning workflow



The machine learning workflow consists of 3 components

  • Explore and process data
  • Modeling
  • Deployment

Saturday, April 27, 2019

Data dimensions





SCALARS
  • They have 0 dimensions
  • Ex: a person's height would be a scalar

1      2.4      -0.3

Friday, April 26, 2019

Bag of words






The Problem with Text
A problem with modeling text is that it is messy, while techniques like machine learning algorithms prefer well-defined, fixed-length inputs and outputs.
Machine learning algorithms cannot work with raw text directly; the text must be converted into numbers. Specifically, vectors of numbers.
In language processing, the vectors x are derived from textual data, in order to reflect various linguistic properties of the text.
This is called feature extraction or feature encoding.
A popular and simple method of feature extraction with text data is called the bag-of-words model of text.
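
As a minimal sketch of the idea, assuming a simple lowercase whitespace tokenizer and a tiny made-up corpus (the helper names `tokenize`, `build_vocabulary` and `vectorize` are illustrative and not taken from any library), each document can be turned into a fixed-length vector of word counts:

```python
from collections import Counter

def tokenize(text):
    # Assumption: lowercase, split on whitespace, strip basic punctuation
    tokens = (word.strip(".,!?;:").lower() for word in text.split())
    return [tok for tok in tokens if tok]

def build_vocabulary(documents):
    # The vocabulary is the sorted set of all distinct tokens in the corpus
    return sorted({tok for doc in documents for tok in tokenize(doc)})

def vectorize(text, vocab):
    # One count per vocabulary word, so every vector has the same length
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]

# Hypothetical two-document corpus
docs = ["It was the best of times", "It was the worst of times"]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]

print(vocab)
for vec in vectors:
    print(vec)
```

Word order is discarded: only the counts survive, which is why the model is called a "bag" of words.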

Tuesday, April 16, 2019

SageMaker Services


SERVICES PROVIDED BY SAGEMAKER

1) Provides a Jupyter notebook instance
  • Used to explore and process data
2) API
  • Simplifies computationally difficult tasks such as training and deploying machine learning models

Machine Learning Workflow



The machine learning workflow consists of 3 components
  • Explore and process data
  • Modeling
  • Deployment
EXPLORE AND PROCESS DATA
This component consists of exploring and processing the data.

Retrieve
The first step is to retrieve the data, which includes the training and test datasets. Let's take the example of a housing dataset that consists of CSV files. We need to download the data from the source.
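
A minimal sketch of the retrieve step, using only the Python standard library. The helper names `download` and `load_csv`, the column names, and the sample rows are assumptions made for illustration; the post does not say where the housing data lives, so the download helper is defined but not called, and a small locally written file stands in for the downloaded CSV:

```python
import csv
import os
import tempfile
import urllib.request

def download(url, dest_path):
    # Fetch a remote file to dest_path; the caller supplies the real URL
    urllib.request.urlretrieve(url, dest_path)
    return dest_path

def load_csv(path):
    # Read a CSV file with a header row into a list of dicts
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

# Stand-in for a downloaded file: a tiny made-up housing CSV
sample = "rooms,price\n3,200000\n4,250000\n2,150000\n"
tmp_dir = tempfile.mkdtemp()
train_path = os.path.join(tmp_dir, "train.csv")
with open(train_path, "w", newline="") as f:
    f.write(sample)

train_rows = load_csv(train_path)
print(len(train_rows), train_rows[0])
```

Note that `csv.DictReader` returns every value as a string, so numeric columns still need converting during the processing step.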

Monday, February 4, 2019

Twitter location clustering based on tweets (Spark MLlib)



1) Create a directory for Twitter streams
 cd /usr/lib/spark
 sudo mkdir tweets
 cd tweets
 sudo mkdir data
 sudo mkdir training
 sudo chmod 777 /usr/lib/spark/tweets/

These are the two folders that we will use in this project:
data: Contains the master copies of the CSV files, which we will pretend are coming from a training source.
training: The source used to train our machine learning algorithm.

Friday, December 28, 2018

TensorFlow


  • Interface for expressing machine learning algorithms
  • Implementation for executing such algorithms
  • Framework for creating ensemble algorithms for today's most challenging problems

Tuesday, April 24, 2018

Mean, Median and Mode

Mean
The "average" number; found by adding all data points and dividing by the number of data points.
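
As a quick illustration with made-up data points, the mean can be computed by hand as the sum divided by the count, and checked against the standard library's `statistics.mean`:

```python
from statistics import mean

# Hypothetical data points
data = [2, 4, 4, 6, 9]

# Add all data points and divide by the number of data points
manual_mean = sum(data) / len(data)

print(manual_mean)
print(mean(data) == manual_mean)
```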


Thursday, January 11, 2018

Quick review of machine learning algorithms

These are some of the important machine learning algorithms

Decision tree

  • Belongs to the family of supervised learning algorithms.
  • Can be used for solving both regression and classification problems.
  • The general motive of using a decision tree is to create a model that can predict the class or value of a target variable by learning decision rules inferred from prior data (training data).
       Ex: A banker deciding whether to grant a loan.
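
To make "learning decision rules from training data" concrete, here is a minimal sketch of a one-level decision tree (a decision stump) for the loan example. It is not a full decision tree algorithm: it learns a single income threshold by counting training errors. The income figures, the labels, and the names `learn_stump` and `predict` are all made up for illustration:

```python
def learn_stump(samples):
    """Learn the rule 'approve if income >= threshold' from (income, approved) pairs."""
    values = sorted({income for income, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct incomes
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(samples) + 1
    for threshold in candidates:
        # Count the training samples that this rule would misclassify
        errors = sum((income >= threshold) != approved for income, approved in samples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def predict(threshold, income):
    # Apply the learned rule to a new applicant
    return income >= threshold

# Hypothetical prior data: (income in thousands, loan approved?)
training = [(20, False), (35, False), (50, True), (65, True), (80, True)]
threshold = learn_stump(training)

print(threshold)
print(predict(threshold, 30), predict(threshold, 70))
```

A real decision tree repeats this kind of split recursively over several features and usually picks splits with an impurity measure rather than a raw error count.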

Thursday, August 31, 2017

Classifying data into predefined categories


Input and output for classification problem


  • The input to a classification problem is a set of features, and the output is called a label
  • The problem statement and the training data are where we spend a significant amount of time

Let's talk about 2 types of problems

  Problem statement 1
     Email, tweet or trading day
  • Email: spam or ham
  • Tweet: positive or negative
  • Trading day: up-day or down-day

Wednesday, November 2, 2016

Machine learning

-Is the science of getting computers to learn without being explicitly programmed
-Grew out of work in AI
-A new capability for computers



