## MATH 533 Course Project, Parts A, B, and C – SALESCALL Inc.

__Introduction__

SALESCALL Inc. has thousands of salespeople throughout the country. A sample of 100 salespeople is selected, and data is collected on the following variables.

- SALES (the number of sales made this week)
- CALLS (the number of sales calls made this week)
- TIME (the average time per call this week)
- YEARS (years of experience in the call center)
- TYPE (the type of training: group training, online training, or no training)

The data file, titled Course Project Data.xlsx, can be found in Doc Sharing.

This project is due in three parts, at the end of Weeks 2, 6, and 7 respectively.

__PROJECT PART A: Exploratory Data Analysis__

- Open the file Course Project Data.xlsx from the Course Project Data Set folder in Doc Sharing.
- For each of the five variables, process, organize, present, and summarize the data. Analyze each variable by itself using graphical and numerical techniques of summarization. Use Minitab as much as possible, explaining what the printout tells you. You may wish to use some of the following graphs: stem-and-leaf diagram, frequency or relative frequency table, histogram, boxplot, dotplot, pie chart, or bar graph. Caution: not all of these are appropriate for each of these variables, nor are they all necessary. More is not necessarily better. In addition, be sure to find the appropriate measures of central tendency and measures of dispersion for the above data. Where appropriate, use the five-number summary (the min, Q1, median, Q3, max). Once again, use Minitab as appropriate, and explain what the results mean.
- Analyze the connections or relationships between the variables. There are 10 pairings here (SALES and CALLS, SALES and TIME, SALES and YEARS, SALES and TYPE, CALLS and TIME, CALLS and YEARS, CALLS and TYPE, TIME and YEARS, TIME and TYPE, YEARS and TYPE). Use graphical and numerical summary measures. Explain what you see. Be sure to consider all 10 pairings. Some variables show clear relationships, while others do not.
- Prepare your report in Microsoft Word (or some other word processing package), *integrating your graphs and tables with text explanations and interpretations.* Be sure that you have graphical and numerical backup for your explanations and interpretations. Be selective in what you include in the report. I’m not looking for a 20-page report on every variable and every possible relationship (that’s 15 things to do). Rather, what I want you to do is highlight what you see for *three individual variables* (no more than one graph for each, one or two measures of central tendency and variability, and two or three sentences of interpretation). For the 10 pairings, identify and report only on *three of the pairings,* again using graphical and numerical summaries (as appropriate), with interpretations. *Please note that at least one of your pairings must include TYPE, and at least one of your pairings must not include TYPE.*
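The assignment asks for Minitab output, so the following is only a minimal Python sketch of what the five-number summary and the basic measures of center and spread compute. The function name `five_number_summary` and the values in `toy_sales` are hypothetical and invented for illustration; they are not taken from Course Project Data.xlsx. Quartile conventions differ between packages, so the quartiles here may not match a Minitab printout exactly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # quantiles(..., n=4) returns the three quartile cut points.
    # The default "exclusive" method is one of several conventions in use.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return (data[0], q1, median, q3, data[-1])

# Hypothetical weekly SALES values, invented for illustration only.
toy_sales = [38, 45, 41, 50, 36, 44, 47]

print("five-number summary:", five_number_summary(toy_sales))
print("mean:", statistics.mean(toy_sales))
print("sample standard deviation:", statistics.stdev(toy_sales))
```

A summary like this suits the quantitative variables (SALES, CALLS, TIME, YEARS); for the categorical variable TYPE, a frequency table is the more natural numerical summary.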

__PROJECT PART B: Hypothesis Testing and Confidence Intervals__

Your manager has speculated the following:

- A. The average (mean) sales per week exceeds 41.5 per salesperson.
- B. The true population proportion of salespeople that received online training is less than 55%.
- C. The average (mean) number of calls made per week by salespeople that had no training is less than 145.
- D. The average (mean) time per call is greater than 15 minutes.
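As a sketch of how these four claims might be written as formal hypotheses, taking them in the order listed as A through D, one common convention places the manager's belief in the alternative hypothesis. The subscripted symbols below are notation assumed for illustration and do not come from the assignment; your course text may write the null hypotheses with inequalities instead.

```latex
\begin{aligned}
\text{A: } & H_0: \mu_{\text{SALES}} = 41.5 & \quad & H_a: \mu_{\text{SALES}} > 41.5 \\
\text{B: } & H_0: p_{\text{online}} = 0.55 & \quad & H_a: p_{\text{online}} < 0.55 \\
\text{C: } & H_0: \mu_{\text{CALLS, no training}} = 145 & \quad & H_a: \mu_{\text{CALLS, no training}} < 145 \\
\text{D: } & H_0: \mu_{\text{TIME}} = 15 & \quad & H_a: \mu_{\text{TIME}} > 15
\end{aligned}
```

Under this convention A and D are upper-tailed tests, while B and C are lower-tailed tests.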

1. Using the sample data, perform the hypothesis test for each of the above situations in order to see if there is evidence to support your manager’s belief in each case A–D. In each case, use the five-step hypothesis testing procedure, with α = .05, and explain your conclusion in simple terms. Also, be sure to compute the p-value and interpret it.
2. Follow this up by computing 95% confidence intervals for each of the variables described in A–D, again interpreting these intervals.
3. Write a report to your manager about the results, distilling down the results in a way that would be understandable to someone who does not know statistics. Clear explanations and interpretations are critical.

- All DeVry University policies are in effect, including the plagiarism policy.

- The Project Part B report is due by the end of Week 6.
- Part B is worth 100 total points. See grading rubric below.

**Submission:** *The report from Part 3 and all of the relevant work done in the hypothesis testing (including Minitab) in Part 1, and the confidence intervals (Minitab) in Part 2 as an appendix.*
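To make the arithmetic behind the Part B tests concrete, here is a minimal Python sketch, not a substitute for the required Minitab output. It computes a one-sample t statistic for a mean, a lower-tailed z test for a proportion, and a 95% confidence interval for a proportion. All function names and the sample values are hypothetical and invented for illustration. The standard library has no t distribution, so the sketch stops at the t statistic and its degrees of freedom; the p-value for a t test would come from Minitab or a t table.

```python
import math
import statistics
from statistics import NormalDist

def one_sample_t_statistic(values, mu0):
    """Return (t, df) for a test of the population mean against mu0."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)            # sample standard deviation
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

def proportion_z_test_lower(successes, n, p0):
    """Return (z, p_value) for the lower-tailed test Ha: p < p0."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)      # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    return z, NormalDist().cdf(z)           # area to the left of z

def proportion_confidence_interval(successes, n, level=0.95):
    """Return (lower, upper) for a large-sample z interval for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical numbers, invented for illustration only.
toy_times = [14.2, 16.1, 15.8, 13.9, 17.0, 15.5]
print("t statistic and df:", one_sample_t_statistic(toy_times, mu0=15))
print("z and p-value:", proportion_z_test_lower(successes=50, n=100, p0=0.55))
print("95% CI for p:", proportion_confidence_interval(successes=50, n=100))
```

For a lower-tailed test the p-value is the area to the left of the test statistic; for an upper-tailed test it is the area to the right.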

__PROJECT PART C: Regression and Correlation Analysis__

Using Minitab, perform the regression and correlation analysis for the data on SALES (Y) and CALLS (X) by answering the following questions.

1. Generate a scatterplot for SALES versus CALLS, including the graph of the best fit line. Interpret.
2. Determine the equation of the best fit line, which describes the relationship between SALES and CALLS.
3. Determine the coefficient of correlation. Interpret.
4. Determine the coefficient of determination. Interpret.
5. Test the utility of this regression model (use a two-tailed test with α = .05). Interpret your results, including the p-value.
6. Based on your findings in 1–5, what is your opinion about using CALLS to predict SALES? Explain.
7. Compute the 95% confidence interval for beta-1 (the population slope). Interpret this interval.
8. Using an interval, estimate the average weekly sales for weekly calls that are 150. Interpret this interval.
9. Using an interval, predict the weekly sales when weekly calls are 150. Interpret this interval.
10. What can we say about the weekly sales when weekly calls are 300? Explain your answer.

In an attempt to improve this model, we fit a multiple regression model predicting SALES based on CALLS, TIME, and YEARS.

11. Using Minitab, run the multiple regression analysis using the variables CALLS, TIME, and YEARS to predict SALES. State the equation for this multiple regression model.
12. Perform the global test for utility (F-test). Explain your conclusion.
13. Perform the t-test on each independent variable. Explain your conclusions, and clearly state how you should proceed. In particular, state which independent variables we should keep and which should be discarded.
14. Is this multiple regression model better than the linear model that we generated in parts 1–10? Explain.

- All DeVry University policies are in effect, including the plagiarism policy.

- The Project Part C report is due by the end of Week 7.
- Part C is worth 100 total points. See grading rubric below.

Summarize your results from 1–14 in a report that is two pages or less in length and explains and interprets the results in ways that are understandable to someone who does not know statistics.
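The two interval questions at 150 weekly calls ask for different things: one estimates the average weekly sales across all salespeople making 150 calls, the other predicts the weekly sales of a single salesperson making 150 calls. As a reminder of the standard simple linear regression formulas, written in notation assumed here rather than taken from the assignment, with $x_0$ the chosen number of calls, $s = \sqrt{SSE/(n-2)}$, and $SS_{xx} = \sum (x_i - \bar{x})^2$:

```latex
\begin{aligned}
\text{Confidence interval for the mean of } y \text{ at } x_0: \quad
  & \hat{y} \pm t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}} \\
\text{Prediction interval for an individual } y \text{ at } x_0: \quad
  & \hat{y} \pm t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}
\end{aligned}
```

The extra 1 under the square root makes the prediction interval wider than the confidence interval at the same $x_0$, and both intervals widen as $x_0$ moves away from $\bar{x}$.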

**Submission:** *The summary report and all of the work done in 1–14 (Minitab Output + interpretations) as an appendix.*
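For readers who want to check what Minitab reports for the simple regression of SALES on CALLS, the least-squares line, the coefficient of correlation, and the coefficient of determination can be sketched in a few lines of Python. The function name `simple_linear_regression` and the toy values below are hypothetical and invented for illustration; they are not taken from the course data file.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with r and r squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    r = ss_xy / math.sqrt(ss_xx * ss_yy)    # coefficient of correlation
    return {"intercept": intercept, "slope": slope, "r": r, "r_squared": r ** 2}

# Hypothetical CALLS (x) and SALES (y) values, invented for illustration only.
toy_calls = [120, 135, 150, 160, 175, 190]
toy_sales = [33, 38, 41, 44, 49, 52]

fit = simple_linear_regression(toy_calls, toy_sales)
print("fitted line: SALES = {intercept:.3f} + {slope:.3f} * CALLS".format(**fit))
print("r = {r:.3f}, r squared = {r_squared:.3f}".format(**fit))
```

The fitted line describes the data only over the range of CALLS actually observed, so plugging in a value far outside that range is extrapolation.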