Prediction of Diabetes Occurrence

In this project, we aim to predict the occurrence of diabetes in the Pima Native American group. We use the Decision Tree algorithm, implemented in Python with scikit-learn.

The data contains the following columns:

• times_pregnant: Number of times pregnant
• plasma_glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
• diastolic_blood_pressure: Measured in mmHg
• tricep_skin_fold_thickness: Measured in mm
• serum_insulin: 2-hour serum insulin concentration, measured in mu U/ml
• body_mass_index: Weight in kg divided by the square of height in m (kg/m^2)
• diabetes_pedigree_function: A score that summarizes the likelihood of diabetes based on family history
• age: Years
• class: Target variable: 0 corresponds to no diabetes and 1 to diabetes

Let’s get our environment ready with the libraries we’ll need and then import the data!

``````import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn.metrics import confusion_matrix
from sklearn import metrics``````

Check out the Data!

``````# importing the dataset: df is assumed to be a pandas DataFrame with the columns listed above
df.info()``````
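
The line that reads the data file is not shown in the listing above. A minimal sketch of how `df` could be built with the column names listed earlier, assuming a headerless CSV file. The helper `load_pima` and the two sample rows are hypothetical and are not taken from the original post:

```python
import io
import pandas as pd

# Column names as listed at the top of this post.
COLUMNS = [
    "times_pregnant", "plasma_glucose", "diastolic_blood_pressure",
    "tricep_skin_fold_thickness", "serum_insulin", "body_mass_index",
    "diabetes_pedigree_function", "age", "class",
]

def load_pima(source):
    """Read a headerless CSV (path or file-like object) into a DataFrame.

    Hypothetical helper: the original loading code is not shown in the post.
    """
    return pd.read_csv(source, header=None, names=COLUMNS)

# Two made-up rows in the expected layout, only to keep the sketch runnable.
sample = io.StringIO(
    "2,120,70,30,80,28.5,0.45,35,0\n"
    "5,160,80,35,150,33.0,0.60,50,1\n"
)
sample_df = load_pima(sample)
print(sample_df.dtypes)
```

With a real file, passing its path to `load_pima` would take the place of the in-memory sample.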

Let’s check out the data summary!

``````plt.figure(figsize=(12,8))
# heatmap of the summary statistics (the "count" row is skipped)
sns.heatmap(df.describe()[1:].transpose(),
            annot=True,linecolor="w",
            linewidths=2,cmap=sns.color_palette("Set1"))
plt.title("Data summary")
plt.show()``````


Exploratory Data Analysis

Let’s check out the correlation between variables.

``````correlation = df.corr()
plt.figure(figsize=(10,8))
# seaborn draws the cell borders through linewidths / linecolor
sns.heatmap(correlation,annot=True,
            cmap=sns.color_palette("magma"),
            linewidths=2,linecolor="k")
plt.title("CORRELATION BETWEEN VARIABLES")
plt.show()``````

Let’s check out the proportion of the target variable in the dataset!

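``````plt.figure(figsize=(12,6))
# sort_index() keeps the counts in class order (0, 1) so they match the labels
plt.pie(df["class"].value_counts().sort_index().values,
        labels=["no diabetes","diabetes"],
        autopct="%1.0f%%",wedgeprops={"linewidth":2,"edgecolor":"white"})
# the white circle in the middle turns the pie into a donut chart
my_circ = plt.Circle((0,0),.7,color = "white")
plt.gca().add_artist(my_circ)
plt.title("Proportion of target variable in dataset")
plt.show()``````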
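``````plt.figure(figsize=(12,6))
# seaborn takes the color scheme through palette, not cmap
ax = sns.scatterplot(data=df,x='age',y='times_pregnant',hue='class',palette="Set2")
handles, _ = ax.get_legend_handles_labels()
ax.legend(handles, ['no diabetes', 'diabetes'], title='legend', loc='upper right')``````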

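Among the patients with diabetes (class = 1), the mean values are times_pregnant = 4.87, plasma_glucose = 141.25 and diastolic_blood_pressure = 70.82. These are group averages rather than cut-offs: values at or above them resemble the typical diabetic patient in this dataset, but they do not by themselves indicate diabetes.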

``df[(df['class'] ==1)].mean().reset_index()``
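
To put the diabetic-group averages in context, it may help to see both classes side by side. A small sketch using `groupby`; the toy frame below is made up for illustration, and on the real data `df` would be used instead:

```python
import pandas as pd

# Toy data (made up) with column names taken from the real dataset.
toy = pd.DataFrame({
    "plasma_glucose": [90, 110, 150, 170],
    "age": [25, 30, 45, 50],
    "class": [0, 0, 1, 1],
})

# Mean of every feature per class: one row for class 0, one for class 1.
class_means = toy.groupby("class").mean()
print(class_means)

# Difference between the diabetic and non-diabetic group means.
mean_gap = class_means.loc[1] - class_means.loc[0]
print(mean_gap)
```

A large gap for a feature suggests that it separates the two groups on average, which is a hint rather than proof that it is useful for prediction.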

Train Test Split

Split the data into a training set (75%) and a testing set (25%).

``````X = df.iloc[:,:-1]
Y = df.iloc[:,8]
#Splitting the data into training set and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,Y, test_size = 0.25, random_state = 0)``````
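
With `test_size = 0.25`, the 192 test predictions counted later in this post imply 768 rows in total. The split above is purely random, so the class ratio in the two subsets can drift slightly. One possible variation, not used in the original post, is to pass `stratify` so that both subsets keep the same ratio. The sketch below uses synthetic data with an arbitrary class ratio, and all variable names are placeholders:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 768 rows, 8 features,
# with one third positive labels (an arbitrary choice).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(768, 8))
y_demo = (np.arange(768) % 3 == 0).astype(int)

# stratify=y_demo asks for the same class ratio in both subsets.
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

print(Xd_train.shape, Xd_test.shape)
print(y_demo.mean(), yd_train.mean(), yd_test.mean())
```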

Train a Model

Now it’s time to train a Decision Tree Classifier.

``````#fitting classifier to the training set
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion='entropy', random_state = 0)
classifier.fit(X_train,y_train)``````
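
The `criterion='entropy'` setting means that candidate splits are scored by how much they reduce the entropy of the class labels (the information gain). A rough sketch of that calculation in plain Python, using made-up labels; it illustrates the idea and is not scikit-learn's actual implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

def information_gain(parent, left, right):
    """Entropy reduction from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Made-up labels: a perfectly balanced parent node...
parent = [0, 0, 0, 0, 1, 1, 1, 1]
# ...and a candidate split that separates the classes only partly.
left, right = [0, 0, 0, 1], [0, 1, 1, 1]

print(entropy(parent))
print(information_gain(parent, left, right))
```

A split that separated the two classes completely would have a gain equal to the parent entropy, which is the best possible score for that node.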

Model Evaluation

Now get predictions from the model and create a confusion matrix and a classification report.

``````y_pred = classifier.predict(X_test)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test,y_pred)
class_names=[0,1] # names of classes
fig, ax = plt.subplots()
# create heatmap; rows are actual labels, columns are predicted labels
sns.heatmap(pd.DataFrame(cm, index=class_names, columns=class_names),
            annot=True, cmap="BuPu", fmt='g', ax=ax)
ax.xaxis.set_label_position("top")
plt.tight_layout()
plt.title('Confusion matrix', y=1.1)
plt.ylabel('Actual label')
plt.xlabel('Predicted label')``````

105 and 44 are the correct predictions, while 18 and 25 are the incorrect ones, so most of the predictions are correct.


Correct Predictions : 105+44 = 149

Incorrect Predictions: 18+25 = 43
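
The same counts give the accuracy directly. A short worked calculation using only the four numbers quoted above:

```python
# Counts read off the confusion matrix in this post.
correct = 105 + 44      # predictions on the diagonal
incorrect = 18 + 25     # off-diagonal predictions
total = correct + incorrect

accuracy = correct / total
print(f"{correct} of {total} correct, accuracy = {accuracy:.3f}")
```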

Create a classification report for the model.

``````from sklearn.metrics import classification_report
print(classification_report(y_test,y_pred))``````

The accuracy of the model is about 77.6% (149 of 192 test predictions)!


Tree Visualisation

sklearn has some built-in visualization capabilities for decision trees. You won’t use them often, and they require the pydot library, but here is an example of what the output looks like and the code to produce it:

``````from IPython.display import Image
from io import StringIO  # sklearn.externals.six is no longer available in recent scikit-learn releases
from sklearn.tree import export_graphviz
import pydot

features = list(df.columns[:-1])
features``````
``````dot_data = StringIO()
export_graphviz(classifier, out_file=dot_data,feature_names=features,filled=True,rounded=True)

graph = pydot.graph_from_dot_data(dot_data.getvalue())
Image(graph[0].create_png())  ``````
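
If pydot or Graphviz is not available, a text rendering can serve as a lighter alternative. The sketch below uses scikit-learn's `export_text`, which is a different tool from the one in the original post, on a tiny made-up dataset. The names `X_toy`, `y_toy`, `toy_tree` and the feature names are placeholders:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny made-up dataset: the label is 1 when the first feature exceeds 0.5.
X_toy = np.array([
    [0.1, 5.0], [0.2, 3.0], [0.4, 4.0],
    [0.6, 1.0], [0.8, 2.0], [0.9, 6.0],
])
y_toy = np.array([0, 0, 0, 1, 1, 1])

toy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
toy_tree.fit(X_toy, y_toy)

# export_text renders the fitted tree as indented if/else rules.
rules = export_text(toy_tree, feature_names=["feature_a", "feature_b"])
print(rules)
```

On the real model, the same call with `classifier` and the `features` list from above should print the learned rules, although a fully grown tree can produce a long listing.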
