Before starting any analysis, classify the data as either continuous or attribute; in many cases it is a mixture of both. Continuous data is characterized by variables that can be measured on a continuous scale, such as time, temperature, strength, or monetary value. A simple test is to divide the value in half and see if it still makes sense.

Attribute, or discrete, data can be assigned to defined groups and then counted. Examples are classifications of negative and positive, location, vendors’ materials, product or process types, and scales of satisfaction such as poor, fair, good, and excellent. Once an item is classified it can be counted, and the frequency of occurrence can be determined.

The next determination to make is whether the data is an input variable or an output variable. Output variables are often called the CTQs (critical to quality characteristics) or performance measures. Input variables are what drive the resultant outcomes. We generally characterize a product, process, or service delivery outcome (the Y) by some function of the input variables X1, X2, X3, … Xn. The Y’s are driven by the X’s.

The Y outcomes can be either continuous or discrete data. Examples of continuous Y’s are cycle time, cost, and productivity. Examples of discrete Y’s are delivery performance (late or on time), invoice accuracy (accurate, not accurate), and application errors (wrong address, misspelled name, missing age, etc.).

The X inputs can also be either continuous or discrete. Examples of continuous X’s are temperature, pressure, speed, and volume. Examples of discrete X’s are process (intake, examination, treatment, and discharge), product type (A, B, C, and D), and vendor material (A, B, C, and D).

Another set of X inputs to always consider are the stratification factors. These are variables that may influence the product, process, or service delivery performance and should not be overlooked. If we capture this information during data collection, we can study it to determine whether it makes a difference. Examples are time of day, day of the week, month of the year, season, location, region, or shift.

Now that the inputs can be sorted from the outputs and the data can be classified as either continuous or discrete, the selection of the statistical tool to apply boils down to answering the question, “What is it that we want to know?” The following is a list of common questions, and we’ll address each one separately.

What is the baseline performance? Did the adjustments made to the process, product, or service delivery make a difference? Are there any relationships between the multiple input X’s and the output Y’s? If there are relationships, do they make a significant difference? That’s enough questions to be statistically dangerous, so let’s tackle them one by one.

What is the baseline performance?

Continuous Data – Plot the data in a time-based sequence using an X-MR chart (individuals and moving range control charts), or subgroup the data using an Xbar-R chart (averages and range control charts). The centerline of the chart provides an estimate of the average of the data over time, thus establishing the baseline. The MR or R charts provide estimates of the variation over time and establish the upper and lower 3 standard deviation control limits for the X or Xbar charts. Create a Histogram of the data to view a graphical representation of its distribution, test it for normality (the p-value should be well above .05), and compare it to specifications to assess capability.

Minitab Statistical Software Tools are Variables Control Charts, Histograms, Graphical Summary, Normality Test, and Capability Study between and within.
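
As a rough illustration of how the X-MR baseline described above is computed, the following sketch derives the centerline and 3 standard deviation limits from a series of individual measurements. The function name `imr_limits` and the cycle-time values are made up for illustration; the constants 2.66 and 3.267 are the standard control chart factors for a moving range of two consecutive points.

```python
from statistics import mean


def imr_limits(values):
    """Return the centerline and 3-sigma limits for an individuals (X) chart
    and the upper limit for its moving range (MR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": center,                 # baseline average
        "ucl": center + 2.66 * mr_bar,    # 2.66 = 3 / d2, with d2 = 1.128
        "lcl": center - 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,         # D4 factor for a range of 2 points
    }


# Hypothetical cycle times in minutes, listed in time order.
cycle_times = [12, 14, 11, 13, 15, 12, 14, 13]
limits = imr_limits(cycle_times)
print(limits)
```

Points that fall outside `lcl` and `ucl` would be candidates for special cause investigation before the centerline is accepted as the baseline.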

Discrete Data – Plot the data in a time-based sequence using a P Chart (proportion defective chart), C Chart (count of defects chart), nP Chart (number defective chart: sample size n times proportion defective), or a U Chart (defects per unit chart). The centerline provides the baseline average performance. The upper and lower control limits estimate 3 standard deviations of performance above and below the average, which accounts for 99.73% of all expected activity over time. This gives an estimate of the worst and best case scenarios before any improvements are made. Create a Pareto Chart to view the distribution of the categories and their frequencies of occurrence. If the control charts exhibit only normal, natural patterns of variation over time (only common cause variation, no special causes), the centerline, or average value, establishes the capability.
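
For the discrete case, a minimal sketch of the P Chart arithmetic is shown here, assuming every sample has the same size. The function name `p_chart_limits` and the late-delivery counts are hypothetical; the limits use the usual binomial standard deviation and are clipped to the range 0 to 1.

```python
from math import sqrt


def p_chart_limits(defectives, sample_size):
    """Centerline and 3-sigma limits for a P chart with a constant sample size."""
    total_inspected = sample_size * len(defectives)
    p_bar = sum(defectives) / total_inspected          # baseline proportion defective
    sigma = sqrt(p_bar * (1 - p_bar) / sample_size)    # binomial standard deviation
    return {
        "center": p_bar,
        "ucl": min(1.0, p_bar + 3 * sigma),
        "lcl": max(0.0, p_bar - 3 * sigma),  # a proportion cannot go below zero
    }


# Hypothetical data: late deliveries found in six samples of 100 shipments each.
late_deliveries = [4, 6, 5, 3, 7, 5]
p_limits = p_chart_limits(late_deliveries, sample_size=100)
print(p_limits)
```

With these numbers the baseline is 5% late, and the lower limit is clipped to zero because 3 standard deviations below the average would be negative.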

Minitab Statistical Software Tools are Attributes Control Charts and Pareto Analysis.

Did the adjustments made to the process, product, or service delivery really make a difference?

Discrete X – Continuous Y – To test whether two group averages (5W-30 vs. Synthetic Oil) differ in their impact on fuel usage, use a T-Test. If there are potential environmental concerns that may influence the test results, use a Paired T-Test. Plot the results on a Boxplot and evaluate the T statistics with the p-values to make a decision (p-values less than or equal to .05 signify that a difference exists at the 95% confidence level). If there is a difference, choose the group with the best overall average to meet the goal.
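
To make the two-group comparison concrete, here is a small sketch of the T statistic calculation. It uses the Welch form of the two-sample test, which does not assume equal variances; that choice, the function name `welch_t`, and the miles-per-gallon figures are assumptions for illustration. The p-value itself would come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Two-sample T statistic and approximate degrees of freedom (Welch form)."""
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    n_a, n_b = len(group_a), len(group_b)
    se_squared = var_a / n_a + var_b / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se_squared)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se_squared ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical miles per gallon from five runs with each oil.
conventional = [28, 30, 29, 31, 27]
synthetic = [31, 33, 32, 34, 30]
t_stat, df = welch_t(conventional, synthetic)
print(t_stat, df)
```

For this made-up data the statistic is -3 with 8 degrees of freedom. Its magnitude exceeds the two-sided .05 critical value of 2.306 for 8 degrees of freedom, so the difference would be judged significant.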

To test whether several group averages (5W-30, 5W-40, 10W-30, 10W-40, or Synthetic) differ in their impact on fuel usage, use ANOVA (analysis of variance). Randomize the order of the testing to minimize any time-dependent environmental influences on the test results. Plot the results on a Boxplot or Histogram and evaluate the F statistics with the p-values to make a decision (p-values less than or equal to .05 signify that a difference exists at the 95% confidence level). If there is a difference, choose the group with the best overall average to meet the goal.
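
The F statistic behind a one-way ANOVA can be sketched in a few lines: it is the ratio of the variation between group averages to the variation within groups. The function name `one_way_anova_f` and the three small groups of fuel-usage readings are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic and its degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group averages around the grand average.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group average.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical miles per gallon for three oil types, three runs each.
groups = [[28, 30, 29], [31, 33, 32], [29, 31, 30]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

Here F works out to 7 with (2, 6) degrees of freedom, which is above the .05 critical value of 5.14, so at least one group average would be judged different from the others.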

In either of the above cases, to test whether the inputs produce a difference in the variation of the output, use a Test for Equal Variances (homogeneity of variance). Use the p-values to make a decision (p-values less than or equal to .05 signify that a difference exists at the 95% confidence level). If there is a difference, select the group with the lowest standard deviation.

Minitab Statistical Software Tools are 2 Sample T-Test, Paired T-Test, ANOVA, Test for Equal Variances, Boxplot, Histogram, and Graphical Summary.

Continuous X – Continuous Y – Plot the input X versus the output Y using a Scatter Plot or, if there are multiple input X variables, use a Matrix Plot. The plot provides a graphical representation of the relationship between the variables. If it appears that a relationship may exist between one or more of the X input variables and the output Y variable, conduct a Linear Regression of one input X versus one output Y. Repeat as necessary for each X – Y relationship.

The Linear Regression Model provides an R2 statistic, an F statistic, and a p-value. To be significant for a single X-Y relationship, the R2 should be greater than .36 (36% of the variation in the output Y is explained by the observed changes in the input X), the F should be much greater than 1, and the p-value should be .05 or less.
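
The R2 and F statistics for a single X-Y relationship can be computed directly from sums of squares, as in this minimal sketch. The function name `simple_linear_regression` and the data points are illustrative only; the p-value is left to a statistics package or an F table.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with R2 and F."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # share of Y variation explained by X
    # F for one predictor; undefined when the fit is perfect (r_squared == 1).
    f_stat = r_squared / (1 - r_squared) * (len(x) - 2)
    return slope, intercept, r_squared, f_stat


# Hypothetical input X settings and measured output Y values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared, f_stat = simple_linear_regression(x, y)
print(slope, intercept, r_squared, f_stat)
```

For this data R2 is 0.6 and F is 4.5. The R2 clears the .36 guideline, but with only five points the p-value would still need to be checked before calling the relationship significant.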

Minitab Statistical Software Tools are Scatter Plot, Matrix Plot, and Fitted Line Plot.

Discrete X – Discrete Y – In this type of analysis, categories, or groups, are compared to other categories, or groups. For example, “Which cruise line had the highest customer satisfaction?” The discrete X variables are the cruise lines (RCI, Carnival, and Princess Cruise Lines). The discrete Y variables are the frequencies of responses from passengers on their satisfaction surveys by category (poor, fair, good, and excellent) that relate to their vacation experience.

Conduct a cross tab table analysis, or Chi Square analysis, to evaluate whether there were differences in levels of satisfaction by passengers based on the cruise line they vacationed on. Percentages are used for the evaluation, and the Chi Square analysis provides a p-value to help quantify whether the differences are significant. The overall p-value associated with the Chi Square analysis should be .05 or less. The variables that have the largest contribution to the Chi Square statistic drive the observed differences.
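
The Chi Square statistic and the per-cell contributions mentioned above can be sketched as follows. The function name `chi_square` and the survey counts are hypothetical; the p-value would be read from a Chi Square table or a statistics package using the returned degrees of freedom.

```python
def chi_square(table):
    """Chi Square statistic, degrees of freedom, and per-cell contributions
    for a cross tab table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    contributions = []
    for r, row in enumerate(table):
        cells = []
        for c, observed in enumerate(row):
            # Count expected if satisfaction did not depend on the cruise line.
            expected = row_totals[r] * col_totals[c] / grand_total
            cells.append((observed - expected) ** 2 / expected)
        contributions.append(cells)
    statistic = sum(sum(cells) for cells in contributions)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, contributions


# Hypothetical survey counts; columns are poor, fair, good, excellent.
survey = [
    [5, 15, 40, 40],   # RCI
    [10, 20, 45, 25],  # Carnival
    [4, 11, 35, 50],   # Princess
]
statistic, df, contributions = chi_square(survey)
print(statistic, df)
```

The largest entries in `contributions` point to the cruise line and rating combinations that drive the overall statistic.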

Minitab Statistical Software Tools are Table Analysis, Matrix Analysis, and Chi Square Analysis.

Continuous X – Discrete Y – Does the cost per gallon of fuel influence consumer satisfaction? The continuous X is the cost per gallon of fuel. The discrete Y is the consumer satisfaction rating (unhappy, indifferent, or happy). Plot the data using Dot Plots stratified on Y. The statistical technique is a Logistic Regression. Once again the p-values are used to determine whether a significant difference exists. P-values that are .05 or less mean that a significant difference exists at the 95% confidence level. Use the most frequently occurring ratings to make your determination.

Minitab Statistical Software Tools are Dot Plots stratified on Y and Logistic Regression Analysis.

Are there relationships between the multiple input X’s and the output Y’s? If there are relationships, do they make a difference?

Continuous X – Continuous Y – The graphical analysis is a Matrix Scatter Plot where multiple input X’s can be evaluated against the output Y characteristic. The statistical analysis technique is multiple regression. Evaluate the scatter plots to look for relationships between the X input variables and the output Y. Also look for multicollinearity, where one input X variable is correlated with another input X variable. This is analogous to double dipping, so we identify those conflicting inputs and systematically remove them from the model.

Multiple regression is a powerful tool, but it requires proceeding with caution. Run the model with all variables included, then evaluate the T statistics and F statistics to identify the first set of insignificant variables to remove from the model. During the second iteration of the regression model, turn on the variance inflation factors, or VIFs, which are used to quantify potential multicollinearity issues (VIFs of 5 to 10 indicate issues). Review the Matrix Plot to identify X’s related to other X’s. Remove the variables with the high VIFs and the largest p-values, but remove only one of the related X variables in a questionable pair. Review the remaining p-values and remove variables with large p-values from the model. Don’t be surprised if this process requires a few more iterations.
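
To show where a VIF comes from, the sketch below handles the simplest case of a single pair of input X’s, where the VIF reduces to 1 / (1 - r2) with r2 being the squared correlation between the two inputs. This two-variable simplification, the function name `pairwise_vif`, and the data are assumptions for illustration; with more inputs, each X is regressed on all of the other X’s and a statistics package would normally do the work.

```python
from statistics import mean


def pairwise_vif(x1, x2):
    """Variance inflation factor for a model with exactly two input X's."""
    m1, m2 = mean(x1), mean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r_squared = s12 ** 2 / (s11 * s22)  # squared correlation between the X's
    # Undefined when the two inputs are perfectly correlated (r_squared == 1).
    return 1 / (1 - r_squared)


# Hypothetical inputs: the second pair moves almost in lockstep.
loosely_related = pairwise_vif([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
tightly_related = pairwise_vif([1, 2, 3, 4, 5], [1, 2, 3, 4, 6])
print(loosely_related, tightly_related)
```

The first pair gives a VIF of about 2.5, while the second gives about 37, which would flag one of the two inputs for removal.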

When the multiple regression model is finalized, all VIFs will be less than 5 and all p-values will be less than .05. The R2 value should be 90% or greater. This is a significant model, and the regression equation can now be used for making predictions as long as we keep the input variables within the min and max range of values that were used to create the model.

Minitab Statistical Software Tools are Regression Analysis, Stepwise Regression Analysis, Scatter Plots, Matrix Plots, Fitted Line Plots, Graphical Summary, and Histograms.

Discrete X and Continuous X – Continuous Y

This situation requires the use of designed experiments. Discrete and continuous X’s can both be used as the input variables, but their settings are predetermined in the design of the experiment. The analysis technique is ANOVA, which was described previously.

Here is an example. The goal is to reduce the number of unpopped kernels of popping corn in a bag of popped popcorn (the output Y). Discrete X’s could be the brand of popping corn, the type of oil, and the shape of the popping vessel. Continuous X’s could be the amount of oil, the amount of popping corn, the cooking time, and the cooking temperature. Specific settings for each of the input X’s are selected and incorporated into the statistical experiment.
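
One way to lay out such an experiment is a full factorial design, in which every combination of the chosen settings is run once in random order. The sketch below assumes two settings per factor and uses only four of the popcorn inputs; the factor names, the settings, and the function name `full_factorial` are all hypothetical.

```python
import itertools
import random


def full_factorial(factors, seed=None):
    """Return every combination of factor settings, in randomized run order."""
    names = list(factors)
    runs = [
        dict(zip(names, combo))
        for combo in itertools.product(*factors.values())
    ]
    # Randomize run order to spread out time-dependent influences.
    random.Random(seed).shuffle(runs)
    return runs


# Hypothetical settings: two discrete X's and two continuous X's.
factors = {
    "corn_brand": ["brand A", "brand B"],
    "oil_type": ["oil A", "oil B"],
    "cook_time_min": [2.0, 3.0],
    "oil_tbsp": [1, 2],
}
runs = full_factorial(factors, seed=1)
for run_number, settings in enumerate(runs, start=1):
    print(run_number, settings)
```

Four factors at two settings each give 16 runs. The unpopped kernel count recorded for each run becomes the Y that the ANOVA then analyzes.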