Posts

Clusters in Effects of Nicotine Dependency

Introduction: The NESARC dataset is used to cluster people between 18 and 25 who have recent alcohol consumption and/or nicotine dependency. The dataset also includes panic disorder and lifetime experience of depression. Getting Data: Respondents who smoked every day in the past 12 months are selected for clustering. Preparing Data: NDSymptoms: nicotine dependency symptoms, calculated from 8 different criteria. NICOTINEDEP: TAB12MDX = 1 indicates nicotine dependency. PANIC: combines all panic symptoms recorded in the data. ETHRACE: Ethnicity (0 = Hispanic or Latino; 1 = White, Not Hispanic or Latino; 2 = Black, Not Hispanic or Latino; 3 = American Indian/Alaska Native, Not Hispanic or Latino; 4 = Asian/Native Hawaiian/Pacific Islander, Not Hispanic or Latino). Plotting Elbow Method: The elbow method suggests three clusters. Modelling the Cluster: The clusters appear to be well separated. Cluster Groups: The first cluster has the highest nicotine dependency and th...
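The elbow method mentioned in the excerpt above can be illustrated with a small sketch. The excerpt does not show how the post computed it, so this is a plain-Python k-means written only for illustration, run on made-up two-dimensional points rather than the NESARC variables. The function names `sq_dist`, `kmeans_inertia` and `make_blobs` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, iters=100, seed=0):
    """Run plain Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:  # assignments no longer change
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def make_blobs(seed=1):
    """Synthetic data (an assumption, not NESARC): three Gaussian blobs."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [
        (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in blob_centers
        for _ in range(30)
    ]


points = make_blobs()
# Inertia for k = 1..6; the "elbow" is where the drop flattens out.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

Plotting `inertias` against `k` and looking for the point where the curve stops falling sharply is the usual way to read off the number of clusters.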

Lasso Regression in Income per Person

Random Forest in Income per Person. Introduction: In the Gapminder data, income per person is analyzed in relation to oil consumption, CO2 emissions, internet user rate and so on. Because these features are related to social welfare, there might be some correlation with income per person. Getting and Preparing Data: incomeperperson: Gross Domestic Product per capita. oilperperson: oil consumption per capita. co2emissions: CO2 emissions. internetuserate: internet users (per 100 people). lifeexpectancy: life expectancy at birth (years). polityscore: democracy score minus autocracy score. relectricperperson: residential electricity consumption per person. urbanrate: urban population. employrate: percentage of the total population, age above 15, that has been employed. The target is converted into 12 categories and its data type is changed to string for tree classification. Data Modelling: Target data is income per pers...

Random Forest in Income per Person

Random Forest in Income per Person. Introduction: In the Gapminder data, income per person is analyzed in relation to oil consumption, CO2 emissions, internet user rate and so on. Because these features are related to social welfare, there might be some correlation with income per person. Getting and Preparing Data: incomeperperson: Gross Domestic Product per capita. oilperperson: oil consumption per capita. co2emissions: CO2 emissions. internetuserate: internet users (per 100 people). lifeexpectancy: life expectancy at birth (years). polityscore: democracy score minus autocracy score. relectricperperson: residential electricity consumption per person. urbanrate: urban population. employrate: percentage of the total population, age above 15, that has been employed. The target is converted into 12 categories and its data type is changed to string for tree classification. Data Modelling: Target data is income per person and th...

Classification in Income per Person

Classification in Income per Person. Introduction: In the Gapminder data, income per person is analyzed in relation to oil consumption, CO2 emissions, internet user rate and so on. Because these features are related to social welfare, there might be some correlation with income per person. Getting and Preparing Data: incomeperperson: Gross Domestic Product per capita. oilperperson: oil consumption per capita. co2emissions: CO2 emissions. internetuserate: internet users (per 100 people). lifeexpectancy: life expectancy at birth (years). polityscore: democracy score minus autocracy score. relectricperperson: residential electricity consumption per person. urbanrate: urban population. employrate: percentage of the total population, age above 15, that has been employed. The target is converted into 12 categories and its data type is changed to string for tree classification. Data Modelling: Target data is income per person and the rest of the...
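The step of turning a continuous target into 12 string-typed categories can be sketched as follows. The excerpt does not say how the bins were chosen, so equal-frequency (quantile) bins are an assumption here, the helper name `to_categories` is hypothetical, and the income values are synthetic placeholders rather than Gapminder data.

```python
import bisect
import statistics


def to_categories(values, n_bins=12):
    """Map each numeric value to a string label "0".."n_bins-1".

    Assumption: bins are equal-frequency (quantile) bins; the original
    post may have used a different binning rule.
    """
    # statistics.quantiles returns n_bins - 1 cut points.
    cuts = statistics.quantiles(values, n=n_bins)
    # bisect_right counts how many cut points are <= the value,
    # which is the index of the bin the value falls into.
    return [str(bisect.bisect_right(cuts, v)) for v in values]


# Synthetic stand-in for incomeperperson (not real data).
incomes = [100.0 * (i + 1) for i in range(120)]
labels = to_categories(incomes)
print(sorted(set(labels), key=int))
```

String labels make it explicit to a tree classifier that the target is a set of classes, not an ordered number.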

Logistic Regression in Nicotine Dependence and Alcohol Dependence

Logistic Regression in Nicotine Dependence and Alcohol Dependence. Introduction: In the NESARC data, logistic regression is used to analyze nicotine dependence against alcohol consumption. People generally tend to smoke cigarettes while consuming alcohol, so there might be a relation between the two features. Getting and Preparing Data: Both current alcohol consumers (1) and ex-alcohol consumers (2) are filtered. Data Analysis: The p-value of CONUMER is less than 0.05, so it is significantly associated with nicotine dependence. According to the odds ratio, alcohol consumers have about 60% higher odds of nicotine dependence. Another explanatory variable, "Major depression", has been added to the model. It also has a low p-value, and it appears to be more strongly associated with nicotine dependence: the odds ratio shows that people with major depression have 3.87 times the odds of nicotine dependence. Codes "" import numpy import pandas import statsmodels.api as sm ...
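The odds-ratio reading in the excerpt above can be made concrete with a short sketch. It assumes the usual relationship for logistic regression, where the odds ratio is the exponential of a fitted coefficient. The helper names are hypothetical, the value 3.87 comes from the excerpt, and 1.6 is only an assumed odds ratio matching the "60%" statement.

```python
import math


def odds_ratio(coef):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coef)


def percent_change_in_odds(ratio):
    """Express an odds ratio as a percentage change in the odds."""
    return (ratio - 1.0) * 100.0


# A coefficient of ln(3.87) corresponds to the odds ratio quoted for
# "Major depression" in the excerpt.
depression_coef = math.log(3.87)
print(round(odds_ratio(depression_coef), 2))

# Assumed odds ratio of 1.6 for alcohol consumers: the odds are 60% higher.
print(round(percent_change_in_odds(1.6)))
```

An odds ratio describes a change in odds, not in probability, so "3.87 times the odds" is not the same as "3.87 times as likely".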