Logistic Regression in Nicotine Dependence and Alcohol Dependence

Introduction

In the NESARC data, logistic regression is used to analyse the association between nicotine dependence and alcohol consumption. People often smoke cigarettes while drinking alcohol, so the two may be related.

Getting and Preparing Data


The data are filtered to keep only current alcohol consumers (CONSUMER = 1) and ex-alcohol consumers (CONSUMER = 2).
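The filtering step can be sketched on a small made-up table. The rows below are invented for illustration and are not NESARC records; only the column names IDNUM and CONSUMER and the codes 1 and 2 come from this post.

```python
import pandas as pd

# Toy stand-in for the NESARC frame; all values are invented for illustration.
data = pd.DataFrame({
    "IDNUM": [1, 2, 3, 4, 5],
    "CONSUMER": ["1", "2", "3", "1", " "],
})

# Coerce to numeric so blanks become NaN instead of raising an error.
data["CONSUMER"] = pd.to_numeric(data["CONSUMER"], errors="coerce")

# Keep only codes 1 (current consumer) and 2 (ex-consumer).
sub_alcohol = data[data["CONSUMER"].isin([1, 2])].copy()

print(sub_alcohol)
```

Using `.copy()` is a design choice here: it avoids pandas warnings if new columns are later added to the subset.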

Data Analysis


The p-value for CONSUMER is less than 0.05, so alcohol consumption is significantly associated with nicotine dependence.

According to the odds ratio, alcohol consumers have about 60% higher odds of nicotine dependence.
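To make the link between a logistic regression coefficient and a statement like "60% higher odds" concrete, here is a minimal sketch. The odds ratio is the exponential of the coefficient, and the percent change in odds is (OR - 1) x 100. The coefficient value 0.47 is a hypothetical number chosen to give an odds ratio near 1.6; it is not taken from the model output of this post, and the function names are made up for this example.

```python
import math


def odds_ratio(coef):
    """Convert a logit coefficient (log-odds) to an odds ratio."""
    return math.exp(coef)


def percent_change_in_odds(coef):
    """Percent change in odds for a one-unit increase in the predictor."""
    return (math.exp(coef) - 1.0) * 100.0


beta = 0.47  # hypothetical coefficient, for illustration only
print("Odds ratio: %.2f" % odds_ratio(beta))
print("Change in odds: %.0f%%" % percent_change_in_odds(beta))
```

An odds ratio of 1 (coefficient 0) means no association. Values above 1 mean higher odds for each one-unit increase in the predictor, and values below 1 mean lower odds.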



Another explanatory variable, "Major depression", was added to the model. Its p-value is also low, and it appears to be more strongly associated with nicotine dependence: the odds ratio shows that people with major depression have 3.87 times the odds of nicotine dependence.
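The two-predictor model can be tried out on synthetic data to see how the odds ratio table is built. This is only a sketch: the data are randomly generated, the "true" coefficients (0.5 and 1.3) and the intercept are arbitrary assumptions, and the results say nothing about NESARC. Only the column names and the formula follow this post.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000

# Synthetic predictors, NOT NESARC data.
df = pd.DataFrame({
    "CONSUMER": rng.integers(1, 3, size=n),      # codes 1 and 2
    "MAJORDEPLIFE": rng.integers(0, 2, size=n),  # codes 0 and 1
})

# Arbitrary assumed log-odds used to simulate the outcome.
log_odds = -2.5 + 0.5 * df["CONSUMER"] + 1.3 * df["MAJORDEPLIFE"]
prob = 1.0 / (1.0 + np.exp(-log_odds))
df["NICOTINEDEP"] = rng.binomial(1, prob)

# Fit the logistic regression with both explanatory variables.
model = smf.logit("NICOTINEDEP ~ CONSUMER + MAJORDEPLIFE", data=df).fit(disp=False)

# Exponentiate coefficients and confidence limits to get odds ratios.
or_table = np.exp(model.conf_int())
or_table.columns = ["Lower CI", "Upper CI"]
or_table["OR"] = np.exp(model.params)

print(model.pvalues)
print(or_table)
```

Each odds ratio in the table is adjusted for the other predictor, so the row for MAJORDEPLIFE compares people with and without major depression at the same value of CONSUMER.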

Code

import numpy
import pandas
import statsmodels.api as sm
import seaborn
import statsmodels.formula.api as smf

# bug fix for display formats to avoid run time errors
pandas.set_option('display.float_format', lambda x: '%.2f' % x)

data = pandas.read_csv('data/nesarc_pds.csv', low_memory=False)

##############################################################################
# DATA MANAGEMENT
##############################################################################

# setting variables you will be working with to numeric
data['IDNUM'] = pandas.to_numeric(data['IDNUM'], errors='coerce')
data['TAB12MDX'] = pandas.to_numeric(data['TAB12MDX'], errors='coerce')
data['CONSUMER'] = pandas.to_numeric(data['CONSUMER'], errors='coerce')
data['NDSymptoms'] = pandas.to_numeric(data['NDSymptoms'], errors='coerce')

sub_alcohol = data[((data["CONSUMER"] == 1) | (data["CONSUMER"] == 2))]

data['MAJORDEPLIFE'] = pandas.to_numeric(data['MAJORDEPLIFE'], errors='coerce')
data['SOCPDLIFE'] = pandas.to_numeric(data['SOCPDLIFE'], errors='coerce')
data['S3AQ3C1'] = pandas.to_numeric(data['S3AQ3C1'], errors='coerce')
data['AGE'] = pandas.to_numeric(data['AGE'], errors='coerce')
data['SEX'] = pandas.to_numeric(data['SEX'], errors='coerce')


# NOTE: sub1 and its NICOTINEDEP column are not defined in the code shown
# in this post; they are assumed to be created in a step that is not included.

# logistic regression with alcohol consumption
lreg3 = smf.logit(formula='NICOTINEDEP ~ CONSUMER', data=sub1).fit()
print(lreg3.summary())

# odds ratios with 95% confidence intervals
print("Odds Ratios")
params = lreg3.params
conf = lreg3.conf_int()
conf['OR'] = params
conf.columns = ['Lower CI', 'Upper CI', 'OR']
print(numpy.exp(conf))

# logistic regression with alcohol consumption and major depression
lreg4 = smf.logit(formula='NICOTINEDEP ~ CONSUMER + MAJORDEPLIFE', data=sub1).fit()
print(lreg4.summary())

# odds ratios with 95% confidence intervals
print("Odds Ratios")
params = lreg4.params
conf = lreg4.conf_int()
conf['OR'] = params
conf.columns = ['Lower CI', 'Upper CI', 'OR']
print(numpy.exp(conf))
