Month: July 2019

Prediction of Pulsar Stars

The HTRU2 dataset describes a sample of pulsar candidates collected during the High Time Resolution Universe Survey. Pulsars are a rare type of neutron star that produce radio emission detectable here on Earth. They are of considerable scientific interest as probes of space-time, the interstellar medium, and states of matter. As pulsars rotate, their emission beam sweeps across the sky, and when this crosses our line of sight, produces a detectable pattern of broadband …

Prediction of Breast Cancer Diagnosis

In this project, we aim to predict whether a tumor is benign or malignant by training a Random Forest classifier and a K-Means clustering model, with diagnosis as the target. The data contains the following columns:
• id: ID number
• diagnosis: the diagnosis of breast tissues (M = malignant, B = benign)
• radius_mean: mean of distances from center to points on the perimeter
• texture_mean: standard deviation of gray-scale values
• perimeter_mean: mean size of the core tumor
• area_mean: mean area size of …

Prediction of Shopping Behaviour

Stores are looking for new ways to promote sales and increase their income. One increasingly common way to do this is cross-selling, which is “an action or practice of selling an additional product or service to an existing customer”. To cross-sell effectively, it is important to understand how products and services should be combined to increase sales. This is the subject of a technique called Market Basket Analysis (MBA), or product association analysis. Market Basket …
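
As a rough illustration of the product association idea in this excerpt, the sketch below computes the three measures commonly used in Market Basket Analysis (support, confidence and lift) for a single rule. The baskets, item names and helper functions are hypothetical and invented for this example; this is not the code or data from the post itself.

```python
# Hypothetical toy baskets, invented purely for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both the antecedent and the consequent occur in at least one basket.
    """
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


sup, conf, lift = rule_metrics({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if they were bought independently, which is the kind of signal a cross-selling offer could build on.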

Prediction of Diabetes Occurrence

In this project, we aim to predict the occurrence of diabetes within the Pima Native American group. We implemented the Decision Tree algorithm in Python. The data contains the following columns:
• times_pregnant: number of times pregnant
• plasma_glucose: concentration of plasma glucose in a 2-hour oral glucose tolerance test
• diastolic_blood_pressure: measured in mmHg
• tricep_skin_fold_thickness: measured in mm
• serum_insulin: insulin concentration in serum over a 2-hour period, measured in μU/ml
• body_mass_index: weight in kg / (height in m)^2
• diabetes_pedigree_function: function that assigns probability …

Prediction of Iris Species

The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems”. It is sometimes called Anderson’s Iris data set because Edgar Anderson collected the data to quantify the morphological variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, …

Prediction of Startups Profit

In this project, we aim to predict the profit of 50 startups. We implemented Multiple Linear Regression in Python. The data contains the following columns: R&D Spend, Administration, Marketing Spend, State, Profit. Let’s get our environment ready with the libraries we’ll need and then import the data! Check out the Data. EDA: Let’s create some simple plots to check out the data! Training a Linear Regression Model: Let’s now begin to train our …
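
To make the multiple linear regression step more concrete, here is a minimal sketch that fits profit against the columns listed above using NumPy's least-squares solver. Everything in it is an assumption made for illustration: the data is synthetic, the state labels "A", "B" and "C" are placeholders, and the coefficients used to generate the profit values are made up. The post itself may well use a different library and workflow.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Synthetic stand-in data, invented for illustration (NOT the real dataset).
rd_spend = rng.uniform(0, 100, n)
administration = rng.uniform(0, 100, n)
marketing_spend = rng.uniform(0, 100, n)
state = np.array(["A", "B", "C"] * 10)  # hypothetical state labels

# Made-up "true" relationship used to generate Profit, so the fit can be checked.
state_effect = {"A": 0.0, "B": 5.0, "C": -3.0}
profit = (
    50.0
    + 0.8 * rd_spend
    + 0.1 * administration
    + 0.3 * marketing_spend
    + np.array([state_effect[s] for s in state])
)

# One-hot encode State, dropping "A" as the baseline category so the
# dummy columns are not collinear with the intercept.
state_b = (state == "B").astype(float)
state_c = (state == "C").astype(float)

# Design matrix: intercept column plus one column per predictor.
X = np.column_stack(
    [np.ones(n), rd_spend, administration, marketing_spend, state_b, state_c]
)

# Ordinary least squares: find coef minimising ||X @ coef - profit||^2.
coef, *_ = np.linalg.lstsq(X, profit, rcond=None)

names = ["intercept", "R&D Spend", "Administration", "Marketing Spend",
         "State=B", "State=C"]
for name, value in zip(names, coef):
    print(f"{name:>16}: {value:.3f}")
```

Because the synthetic profit values contain no noise, the fitted coefficients should match the made-up ones used to generate them. With real data the fit would only approximate the underlying relationship, and a held-out test set would be needed to judge it.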

This is FSOCIETYSJ

I’m Sadegh Jalalian, and I graduated from the University of Salford (UK) with an MSc in Data Science, with distinction. During my MSc, I worked on five projects:
• Prediction of paying off loans to lenders using R & Python
• Prediction of shopping behaviour using Python
• Analysing Premier League data from Twitter using big data tools and techniques (Hadoop, Hive, Impala, Spark, Sqoop, Scala)
• Creating an email marketing database using Microsoft SQL Server & BI Tools
• …

By Order Of The Peaky Blinders

Peaky Blinders is one of the most distinctive British dramas ever made. I started watching the series because I saw that Cillian Murphy and Tom Hardy were in it. At the beginning, the story is based on a true story set in England, which gives the TV show its charm. The writing is good enough that you will want to watch every episode before you feel satisfied. Cillian Murphy is so …