I need the sas code

Part I. The data file “cdi200.txt” provides selected county demographic information (CDI) for 200 of the most populous counties in the US.

Each line of the data set has an identification number with a county name and state abbreviation and provides information on 14 variables for a single county. The 17 variables are:


The counties data is stored in an Excel workbook:


A public safety official wishes to predict the rate of serious crimes in a CDI (total number of serious crimes per 100,000 population).

  • (8pts) Create the response variable Y = (Total serious crimes) / (Total population) * 100,000, which is the total number of serious crimes per 100,000 population. Obtain the scatter plots between Y and each of the following potential predictor variables (variables 8, 9, 11, 12, 13, 14, 15, 16). Comment on the relationships. Also obtain the correlation matrix of the predictors. Is there evidence of strong linear pairwise associations among the predictors?

  i. Uploading the Excel file into SAS:

The code is:

proc import datafile = '/folders/SAS/counties_data.xlsx'
    out = baseball_sheet2
    dbms = xlsx;
    sheet = "Sheet2";
run;


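  ii. Estimating the crime rate in a CDI

proc import datafile = '/folders/SAS/counties_data.xlsx'
    out = baseball_sheet2
    dbms = xlsx;
run;

title 'Crime Rates per 100,000 Population by State';

/* Total_serious_Crimes comes from the original code; Total_population is an assumed column name */
data crime;
    set baseball_sheet2;
    y = Total_serious_Crimes / Total_population * 100000;
run;

proc princomp data = crime out = Crime_Components;
run;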



iii. Scatter plot

Y = (total serious crimes / total population) * 100,000

y= (4772945 /659348.255)/100,000


The estimated crime rate is 7,238.88

The code:

proc insight data = counties;
    scatter crime single * crime single;
run;

The scatter plot:
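A minimal sketch of the scatter plots and correlation matrix that the question asks for. It assumes a data set named crime (a hypothetical name) that already contains the response y and the predictors x8, x9, and x11 through x16, following the naming used in the GLMSELECT code; adjust the names to match your import.

```sas
/* Assumption: work.crime holds y and the predictors x8 x9 x11-x16. */
proc sgscatter data = crime;
    /* One panel per predictor, each plotted against y */
    plot y * (x8 x9 x11 x12 x13 x14 x15 x16);
run;

/* Pairwise Pearson correlations among the predictors */
proc corr data = crime;
    var x8 x9 x11 x12 x13 x14 x15 x16;
run;
```

Off-diagonal correlations that are large in absolute value would point to strong pairwise linear association among the predictors.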


  • Regarding model
  • (5pts) The following code uses the GLMSELECT procedure to conduct forward and backward selection (with select=SL, stop=PRESS, and the default slentry and slstay values in SAS). You may need to modify the code if you use different names for the variables. Run your modified code and write down the final fitted model from each procedure.

The Code:

proc import datafile = '/folders/SAS/counties_data.xlsx'
    out = baseball_sheet2
    dbms = xlsx;
run;

title 'Crime Rates per 100,000 Population by State';

/* Assumes the most recently created data set contains y and x8-x16 */
proc glmselect;
    model y = x8 x9 x11 x12 x13 x14 x15 x16 / selection = forward(select = SL stop = PRESS);
run;

proc glmselect;
    model y = x8 x9 x11 x12 x13 x14 x15 x16 / selection = backward(select = SL stop = PRESS);
run;


  • (3pts) Get the adjusted R2 and AIC values of these two models. Based on the AIC criterion, which of the two models from (a) will you choose?

I would choose the first model
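One way to see both criteria for each procedure, sketched under the assumption that the data set is named crime (hypothetical) and uses the variable names from the GLMSELECT code: the STATS= option of the MODEL statement asks GLMSELECT to report adjusted R-square and AIC in its summary tables.

```sas
/* Assumption: work.crime holds y and the predictors x8 x9 x11-x16. */
proc glmselect data = crime;
    model y = x8 x9 x11 x12 x13 x14 x15 x16
          / selection = forward(select = SL stop = PRESS)
            stats = (adjrsq aic);
run;

proc glmselect data = crime;
    model y = x8 x9 x11 x12 x13 x14 x15 x16
          / selection = backward(select = SL stop = PRESS)
            stats = (adjrsq aic);
run;
```

Under the AIC criterion, the model with the smaller AIC value is the one to prefer.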


  • (5pts) Can you do a hypothesis test to compare the two models from (a)? If yes, conduct the test. Specify your hypotheses, test statistic and its value, p-value or rejection region, and your conclusion.

No, the results are incomparable

  • (2pts) Based on the result of the test in (c), and considering the adjusted R2 and AIC values of these two models, which model will you recommend? Explain.

I would recommend the first model because it directly generates the required models. The crime rate is obtained using simpler SAS code.

  • (5pts) Fit the regression model containing variables 7, 8, 9, 13, and 16. Obtain the residual plots. Based on these plots, should any modifications be made?

proc import datafile = '/folders/SAS/counties_data.xlsx'
    out = baseball_sheet2
    dbms = xlsx;
run;

proc glmselect;
    model y = x7 x8 x9 x13 x16 / selection = forward(select = SL stop = PRESS);
run;

proc glmselect;
    model y = x7 x8 x9 x13 x16 / selection = backward(select = SL stop = PRESS);
run;


No modifications are required for this model.
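The question asks for a fit with all five variables and its residual plots, which PROC REG can produce directly. The following is a sketch only, assuming a data set named crime (hypothetical) that contains y, x7, x8, x9, x13, and x16.

```sas
/* Assumption: work.crime holds y and x7 x8 x9 x13 x16. */
ods graphics on;

proc reg data = crime plots(only) = (residualbypredicted qqplot residuals);
    /* Fit the full five-predictor model with no selection step */
    model y = x7 x8 x9 x13 x16;
run;
quit;
```

A funnel shape in the residual-by-predicted plot or curvature in the residual-by-regressor panels would suggest a transformation; a strongly bent Q-Q plot would suggest non-normal errors.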

  • (4pts) For the model in question 3, determine the Cook’s distance and leverage values. Do any observations seem bothersome?

Some observations seem bothersome because they lie far from the rest of the data.
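Cook's distance and leverage can be saved per observation with the OUTPUT statement of PROC REG. This sketch assumes the same hypothetical crime data set; the names diag, cooks_d, and leverage are illustrative.

```sas
/* Assumption: work.crime holds y and x7 x8 x9 x13 x16. */
proc reg data = crime plots(only) = (cooksd rstudentbyleverage);
    model y = x7 x8 x9 x13 x16;
    /* Save Cook's distance and leverage (hat diagonal) for each county */
    output out = diag cookd = cooks_d h = leverage;
run;
quit;

/* Put the observations with the largest Cook's distance first */
proc sort data = diag;
    by descending cooks_d;
run;

proc print data = diag(obs = 10);
    var cooks_d leverage;
run;
```

A common rule of thumb flags leverage values above 2p/n, which is 0.06 here with p = 6 parameters and n = 200 counties.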

  • (2pts) For the model in question 3, obtain the variance inflation factors. Do you think there are serious multicollinearity problems?

The variance inflation factors:
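A short sketch of how the variance inflation factors could be requested, again assuming the hypothetical crime data set: the VIF option of the MODEL statement adds a variance inflation column to the parameter estimates table.

```sas
/* Assumption: work.crime holds y and x7 x8 x9 x13 x16. */
proc reg data = crime;
    model y = x7 x8 x9 x13 x16 / vif;
run;
quit;
```

A widely used rule of thumb treats VIF values above 10 as a sign of serious multicollinearity.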


Part II. An experiment was conducted to study the effect of 3 drugs in the treatment of leprosy. The three drugs include two antibiotics (A and D) and a control. Ten patients were selected for each treatment (Drug). A pretreatment score and a posttreatment score of leprosy bacilli were measured for each patient. The goal is to determine the effect of drug treatments on the posttreatment count of bacilli. (data: leprosy.txt).

The code:

data drugtest;
    input Drug $ PreTreatment PostTreatment @@;
    datalines;
A 11  6   A  8  0   A  5  2   A 14  8   A 19 11
A  6  4   A 10 13   A  6  1   A 11  8   A  3  0
D  6  0   D  6  2   D  7  3   D  8  1   D 18 18
D  8  4   D 19 14   D  8  9   D  5  1   D 15  9
F 16 13   F 13 10   F 11 18   F  9  5   F 21 23
F 16 12   F 12  5   F 12 16   F  7  1   F 12 20
;

proc glm data = drugtest;
    class Drug;
    model PostTreatment = Drug PreTreatment / solution;
    lsmeans Drug / stderr pdiff cov out = adjmeans;
run;

proc print data = adjmeans;
run;

  • (8pts) Ignoring the pretreatment score, use ANOVA to test whether the 3 treatments differ significantly. Specify your hypothesis, test statistic and its value, p-value and conclusion. If they are not the same, use the Tukey method to explore the pairwise
  • (extra credits: 2pts) Write down the fitted model in (1) and clarify your
  • (5pts) Fit an appropriate model and conduct hypothesis testing to check the linear relationship between posttreatment and pretreatment scores. Do we need to use pretreatment score as a covariate? [Specify your model, hypothesis, test statistic and its value, p-value and conclusion]
  • (3pts) State an ANCOVA model that can be used to compare the three treatments, controlling for pretreatment score. Clarify your
  • (8pts) Conduct an ANCOVA to test whether the 3 treatments differ significantly, controlling for the pretreatment score. Specify your hypothesis, test statistic and its value, p-value and conclusion. If they are not all the same, use Tukey’s method to simultaneously compare all possible pairs of treatments and draw your conclusion.
  • (extra credits: 3pts) Compare the results in questions 1 and 6 and interpret why the results differ. Which correctly describes the effects of the 3 treatments?
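The ANOVA and Tukey parts of these questions could be sketched as follows. This assumes the drugtest data set created by the data step above; it is an illustration of one possible approach, not the assignment's official solution.

```sas
/* Assumption: work.drugtest was created by the data step above. */

/* One-way ANOVA ignoring the pretreatment score, with Tukey comparisons */
proc glm data = drugtest;
    class Drug;
    model PostTreatment = Drug;
    means Drug / tukey;
run;
quit;

/* ANCOVA: Tukey-adjusted pairwise comparisons of the adjusted (LS) means */
proc glm data = drugtest;
    class Drug;
    model PostTreatment = Drug PreTreatment / solution;
    lsmeans Drug / pdiff adjust = tukey;
run;
quit;
```

In the second step, the Type III test for PreTreatment addresses whether the covariate is needed, and the Type III test for Drug compares the treatments after adjusting for it.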




The Type I SS for Drug (293.6) gives the between-drug sum of squares obtained from the analysis-of-variance model

PostTreatment = Drug

The Type III SS for Drug (68.5537) gives the Drug sum of squares adjusted for the covariate.

H0: LS mean = 0

H0: LS mean(i) = LS mean(j), where i and j are the treatment levels.

The OUT= and COV options of the LSMEANS statement save the LS-means and their covariances to the data set adjmeans.

The fitted model:

if Drug = A, Post = (-0.435 - 3.446) + 0.987 * Pre

if Drug = D, Post = (-0.435 - 3.337) + 0.987 * Pre

if Drug = F, Post = -0.435 + 0.987 * Pre



