In this project, we aim to predict whether a tumor is benign or malignant by training a Random Forest classifier and a K-Means clustering model, with diagnosis as the target. The data contains the following columns: id: ID number; diagnosis: the diagnosis of breast tissues (M = malignant, B = benign); radius_mean: mean of distances from center to points on the perimeter; texture_mean: standard deviation of gray-scale values; perimeter_mean: mean size of the core tumor; area_mean: mean area size of the tumor; smoothness_mean: mean of local
Stores are looking for new ways to promote sales and increase their income. These days, one source of such an increase is cross-selling. Cross-selling is “an action or practice of selling an additional product or service to an existing customer”. It is important to understand how products and services should be combined to increase sales. This is the subject of a technique called Market Basket Analysis (MBA), or product association analysis. Market Basket
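To make the idea of product association concrete, here is a minimal sketch of the three measures commonly used in Market Basket Analysis: support, confidence and lift. The transactions and the helper names (`support`, `rule_metrics`) are hypothetical and made up for illustration; they are not taken from the post's data set or code.

```python
# Hypothetical toy baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    ante = support(antecedent, baskets)
    cons = support(consequent, baskets)
    if ante == 0 or cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = both / ante   # how often the consequent appears given the antecedent
    lift = confidence / cons   # values above 1 suggest a positive association
    return both, confidence, lift


s, c, l = rule_metrics({"bread"}, {"milk"}, transactions)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.4f}")
```

A lift below 1, as in this toy example, would suggest that buying bread does not make milk more likely than its baseline rate, so the pair would not be a strong cross-selling candidate.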
In this project, we aim to predict the occurrence of diabetes within the Pima Native American group. We implemented the Decision Tree algorithm in Python. The data contains the following columns: times_pregnant: number of times pregnant; plasma_glucose: concentration of plasma glucose in a 2-hour oral glucose tolerance test; diastolic_blood_pressure: measured in mmHg; tricep_skin_fold_thickness: measured in mm; serum_insulin: insulin concentration in serum over a 2-hour period, measured in mu U/ml; body_mass_index: weight in kg / (height in m)^2; diabetes_pedigree_function: Function that assigns probability
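As a rough illustration of what a decision tree does with a column such as plasma_glucose, the sketch below picks the threshold that minimises the weighted Gini impurity of the two resulting groups. The values, labels and function names are hypothetical, and this is only one common splitting criterion, not necessarily the one used in the post.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values.

    Returns (threshold, weighted Gini impurity) for the best split,
    or (None, impurity of the unsplit data) if no split improves on it.
    """
    pairs = sorted(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy values, invented for illustration only.
plasma_glucose = [85, 90, 100, 130, 150, 170]
has_diabetes = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(plasma_glucose, has_diabetes)
print(threshold, impurity)
```

A full tree repeats this search over every column and then recurses into each resulting group until a stopping rule is met.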
Prediction Of Iris Species
Category: Machine Learning, Project
The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems”. It is sometimes called Anderson’s Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica,
Prediction Of Startups Profit
Category: Machine Learning, Project
In this project, we aim to predict the profit of 50 startups. We implemented Multiple Linear Regression in Python. The data contains the following columns: R&D Spend, Administration, Marketing Spend, State, Profit. Let’s get our environment ready with the libraries we’ll need and then import the data!
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    %matplotlib inline
Check out the Data
    df = pd.read_csv('~/DataSet GitHub/Regression/50_startups.csv')
    df.head(5)
    df.info()
    df.describe()
EDA Let’s
This is FSOCIETYSJ
Category: Personal
I’m Sadegh Jalalian and I graduated from the University of Salford (UK) with an MSc in Data Science with distinction. During my MSc, I worked on five projects: • Prediction of paying off loans to lenders using R & Python • Prediction of shopping behaviour using Python • Analysing Premier League data from Twitter using big data tools and techniques (Hadoop, Hive, Impala, Spark, Sqoop, Scala) • Creating an email marketing database using Microsoft SQL Server & BI Tools •
By Order Of The Peaky Blinders
Category: Movie
Peaky Blinders is one of the most distinctive British dramas ever made. I started watching the series because I saw that Cillian Murphy and Tom Hardy were in it. To begin with, the story is based on a true story set in England, which brings charm to this TV show. The script is good enough that you will have to watch every episode before you are satisfied. Cillian Murphy
CRISP-DM
Category: Data Science
CRISP-DM, which provides a structured approach to planning a data mining project, has consistently been the most popular data mining process model over the past fifteen years. CRISP-DM stands for Cross-Industry Standard Process for Data Mining. The six phases of the CRISP-DM process are business understanding, data understanding, data preparation, modelling, evaluation and deployment [43]. As shown in the figure below, the CRISP-DM methodology is based on these six steps:
BUSINESS UNDERSTANDING
Comments
I’d like to thank you for the effort you have put into writing this blog. I really hope to see the same high-grade blog posts from you in the future as well. In truth, your creative writing abilities have inspired me to start my very own blog now 😉