Importance of Educational Statistics
Why is this material important?
Statistics is a branch of mathematics that involves making sense of
data. Data are quantities
obtained from some type of systematic observation. Presented
as a list of numbers, data are difficult to comprehend. By using
statistical techniques, researchers work with data to organize,
categorize, condense, summarize, describe, illustrate, analyze,
compare, synthesize, evaluate, and infer.
Whether you intend to employ
qualitative methods or quantitative methods or mixed methods to
investigate the research questions for your dissertation, you will
encounter statistics in several contexts. The most obvious one is the
need to use statistics for data analysis in your quantitative or mixed
methods study. Perhaps you are not planning to employ quantitative
methods - what is the value of this course for qualitative researchers?
If your study involves a group of participants, you may find it helpful
and informative to describe your participants' characteristics. To help
your reader understand your participants' context, you might also find
it useful to describe the characteristics of the setting for your study.
These examples represent the producer role of using statistics.
Even if you are not intending to include numerical data in your study,
chances are very good that other researchers working on your
topic have employed statistical techniques, which you need to
understand and evaluate. So, whether or not you take on a producer role,
everyone who conducts or reads research studies plays a critical
consumer role. When
you review previous research (Chapter 2 of your dissertation), if that
research involves statistics, you must be able to understand
how the researcher applied the statistical techniques and what the
results mean. Furthermore, you need to be able to challenge
the quantitative approach to research in meaningful ways.
This course aims to build your skills in filling both roles.
Introduction to Statistics
Applications of Statistics
Think about when and where you've
encountered statistics - in newspaper articles, in news reports, in
journal articles, at the ball game, in advertising, at a swim meet, in
product evaluations, in your child's homework, in your profession, in
your classroom, or in testing results. Not only are statistical
treatments of data widely found in our society, but these techniques
are fundamental to almost all disciplines of study. Evidence of the
increasing importance of statistics can be found in the early
introduction of the topic in school curricula. For example, there is a content standard for statistics for
kindergarten students. Thirty
years ago, the first mention of statistics might have been in an
introductory graduate research course. Let's explore possible reasons
for this change.
Purpose
Statistical techniques are used to organize and analyze data, where data are
unambiguous quantifications of observed phenomena
(i.e., a regular way of assigning numbers to certain events or
characteristics). In addition to organizing and analyzing data,
statistical techniques include methods for illustrating the results of
the statistical analysis. The prime example of this use of statistics
involves the creation of graphical displays of information using charts
and graphs. Not only is a picture worth a thousand words, it's
worth even more numbers. Through these analyses and graphics,
trends and other patterns may emerge that help researchers and others
understand and explain phenomena in nature.
Brief History
One of the earliest applications of statistics derives from the need to
make sense out of census data. In fact, the derivation of the word
"statistics" includes the notion of understanding the political state.
A census is used to understand the characteristics of the residents of
a country. In the United States, a census is conducted every ten years.
Among other decisions, the number of seats each state holds in the House
of Representatives is based on census data. As you can see, the need to
describe observed phenomena is central to the use of statistics.
The mathematics of statistics has its roots in probability theory -
the branch of mathematics that deals with chance and uncertainty.
Explaining the characteristics of games of chance through mathematics
was one of the early goals of probability theory. Essentially,
understanding games of chance involves predicting
outcomes (e.g., heads or tails; blackjack; a royal flush; a roll of
seven). How might understanding phenomena and predicting outcomes be
linked? They are linked through a process of reasoning called statistical
inference,
which uses a known state of nature as the basis for predictions about
the future or about a wider context. To read more about the general
background of statistics, you might visit: http://en.wikipedia.org/wiki/Statistics.
Role of Computers
So, statistics combines tools to describe phenomena and make inferences
based on descriptions - why has the subject increased in popularity, as
evidenced by its inclusion in kindergarten curricula? As with many
modern trends, the answer can be linked to the influence of cheap, yet
powerful, computers with graphical displays. Not too long ago, people
who needed to make sense of data either needed to conduct tedious
calculations by hand or needed access to a powerful mainframe computer.
Imagine perusing lists of census data to determine the average
household income in the Bay Area and then calculating the figure by
hand. Now, not only are many results of statistical analyses readily
available to you, but you can obtain the actual raw data and conduct
your own analysis. Visit http://www.census.gov/
to view what is available. Putting these data and tools in the hands of
more people requires a more general understanding of statistical
concepts and techniques. As you will see first-hand, having the data
and a statistical software program is necessary but not sufficient for
conducting meaningful analyses. That's the reason for the increased
emphasis on topics like statistical reasoning and statistical literacy.
Statistics as a Subset of the Tools for Research
If you've completed the Research Methods course, you've been introduced to
many tools for conducting research. The primary division of these
tools is usually along the continuum
that runs from qualitative, interpretive inquiry to quantitative,
scientific investigations.
At the quantitative end of the continuum lie most of the statistical
techniques that we will study. There are many, many more that we will
not have the time to study. So, one of the goals for the course is to
introduce you to fundamental statistical concepts that underlie most
statistical procedures. As with qualitative and quantitative approaches
in general, the specific statistical tests and routines that you should
use are determined by the questions you are addressing.
As you will see in the text and elsewhere, your selection of
appropriate statistical tools can be guided by decision trees or flow
charts that lead you through a series of questions intended to identify
the appropriate statistical test to use for your given situation.
Examples of Data
Data are all around us - physical
characteristics, such as height, weight, eye color, dominant hand,
gender, age, ethnicity; social
characteristics, such as socioeconomic level, citizenship, residency,
marital status, family structure, years of education; or school-based
data, such as grade in school, test scores, number of absences, number
of referrals, grades, placements, abilities, aptitude, achievements.
Each item named in the previous list represents a variable, which is a
set of data points, all of which represent the same construct. In the
context of a research study, a variable is a set of data that comprises
different values - the values of variables vary!
If you are studying students at a boys' high school, gender is not a
variable in your study, because there is only one value for gender -
male.
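The idea that a variable must vary can be checked directly. The following is a minimal Python sketch, not a standard routine: the helper name is_variable and the five student records are hypothetical, invented only for this illustration.

```python
def is_variable(values):
    """Return True when the observations take more than one distinct value."""
    return len(set(values)) > 1

# Hypothetical records for five students at a boys' high school.
gender = ["male", "male", "male", "male", "male"]
absences = [0, 3, 1, 0, 7]

print(is_variable(gender))    # False: only one value occurs
print(is_variable(absences))  # True: the values differ across students
```

In this made-up group, number of absences could serve as a variable in a study, while gender could not.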
Scan through the list in the preceding paragraph once more. In addition to
the categories of physical, social, and school data, can you discern
other differences between these variables? For example, would you
describe the process of assigning numbers to height to be similar to
the process of assigning numbers to ethnic classifications? Hopefully,
you find these two processes to differ in a fundamental way - namely,
heights are measured with a measuring device like a tape measure and
assigned a length whereas ethnicities are based on ancestry and any
number assigned to a specific ethnicity is completely arbitrary. For
example, you could assign a 1 to Asian or a 5 to Asian and, as long as
no other group is assigned a 1 or 5, either would be an adequate
quantification of ethnicity.
Let's describe these differences more formally. Variables are measured at
different levels - called levels of measurement. There are four levels:
nominal, ordinal, interval, and ratio. Before we explain each one, you
might wonder why these measurement levels matter. The reason is simple
- the level of measurement determines which statistical techniques are
appropriate to use. The assignment of numbers to values of the variable
is often called coding.
Nominal - the numbers assigned to the values of the variable are
completely arbitrary
- they are just labels that are consistently applied. For example,
gender is a variable that typically has two values, male and female.
You can assign 1 to represent male and 2 to represent female or vice
versa or 5 to represent female and 3 to represent male. As long as all
males are assigned the same number and all females are assigned the
same number (and the two numbers are different), the coding of
gender is appropriate. Variables measured at the nominal level are also
called qualitative or categorical variables. When there are just two
values, as is the case with gender, the variable is called dichotomous.
Dichotomous variables are quite common in research because they
represent the division of a sample into two groups.
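To see why nominal codes are only labels, consider this short Python sketch. The five responses are hypothetical, and the two codings are the ones mentioned above (1 and 2, or 3 and 5). It suggests that counting group members gives the same answer under either coding, while averaging the codes does not.

```python
from collections import Counter

# Hypothetical responses from five participants.
responses = ["male", "female", "female", "male", "female"]

# Two equally acceptable codings of the same nominal variable.
coding_a = {"male": 1, "female": 2}
coding_b = {"male": 3, "female": 5}

codes_a = [coding_a[r] for r in responses]
codes_b = [coding_b[r] for r in responses]

# Group sizes do not depend on which labels were chosen.
counts_a = sorted(Counter(codes_a).values())
counts_b = sorted(Counter(codes_b).values())
print(counts_a, counts_b)  # [2, 3] [2, 3]

# The "average code" changes with the arbitrary labels, so it means nothing.
mean_a = sum(codes_a) / len(codes_a)
mean_b = sum(codes_b) / len(codes_b)
print(mean_a, mean_b)  # 1.6 4.2
```

This is one reason the level of measurement limits which statistical techniques are appropriate: counts make sense for nominal data, but a mean of the codes does not.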
Ordinal - the numbers assigned to the values of the variable indicate
order but not the actual size of the values. The prototype to remember
is rankings. For example, runners who finish a race are labeled first,
second, and third (these are called ordinal numbers, by the way). The
place in which they finish does not indicate their actual time though.
In fact, the difference between first place and second place may be 2
seconds, while the difference between second and third may be 10
seconds. In more formal terms, the intervals between consecutive values
on an ordinal scale are not equal. Think of ranking your students by
some ability - the top five students may be very close in ability
levels and then there may be a substantial decrease between the fifth
place student and the sixth place student.
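The race example can be sketched in a few lines of Python. The runner names and finishing times are hypothetical, chosen so that the gaps match the 2-second and 10-second differences described above.

```python
# Hypothetical finishing times, in seconds.
times = {"Ana": 600, "Ben": 602, "Cy": 612}

# Ordinal coding: rank the runners from fastest to slowest.
ordered = sorted(times, key=times.get)
ranks = {name: place for place, name in enumerate(ordered, start=1)}
print(ranks)  # {'Ana': 1, 'Ben': 2, 'Cy': 3}

# The gaps between consecutive places are not equal.
gaps = [times[b] - times[a] for a, b in zip(ordered, ordered[1:])]
print(gaps)  # [2, 10]
```

The ranks 1, 2, and 3 are evenly spaced, but the times behind them are not, so the ranks alone cannot tell you how close the race was.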
Interval - the numbers assigned to the values of the variable indicate
measured amounts. The intervals between these numbers are equal. For
example, temperature in degrees Fahrenheit or Celsius is measured on an
interval scale - the interval between 40 degrees and 50 degrees is the
same as the interval between 70 degrees
and 80 degrees. Most educational variables are treated as if they were
measured on interval scales.
Ratio - the numbers assigned to the values of the variable meet the
properties of an interval scale and include a "true" zero, which makes
ratio comparisons of values meaningful. A true zero is the complete lack of
the measured quantity. If we counted the change in students' pockets,
students without any change would be assigned a 0. Reporting that Juan
had twice the change that Nina had would make sense. On the other hand,
if we measured math ability using a math test, a student who scored 0
on the test can't be said to lack all math ability. Furthermore,
reporting that Beatrix, who scored a 90, is twice as able,
mathematically, as Que, who scored a 45, isn't meaningful.
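The difference between interval and ratio scales can be illustrated with a short Python sketch. It assumes temperatures recorded in degrees Fahrenheit and pocket change recorded in cents. The amounts for Juan and Nina are hypothetical, and the function name is chosen only for this example.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

# Interval scale: the ratio of two temperatures changes with the unit,
# because neither scale's zero means "no temperature at all".
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)
print(ratio_f, round(ratio_c, 2))  # 2.0 6.0

# Ratio scale: pocket change has a true zero, so the ratio is the same
# whether we count in cents or in dollars.
juan_cents, nina_cents = 50, 25
ratio_cents = juan_cents / nina_cents
ratio_dollars = (juan_cents / 100) / (nina_cents / 100)
print(ratio_cents, ratio_dollars)  # 2.0 2.0
```

Saying that 80 degrees is "twice as hot" as 40 degrees depends entirely on the unit chosen, whereas "Juan has twice the change that Nina has" holds in any unit of money.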
Return to the previous list of variables and try your hand at classifying them
according to these four levels of measurement.
Uses of Statistics - Types of Research Questions
Research questions that are answered through the use of statistics involve
describing a current context or making an inference about a different,
but related, setting. A typical descriptive question might be one that
asks: What are the reading levels of first-grade students at ABC
Elementary who use the Write-to-Read instructional program? A typical
inferential question might be one that asks: What is the effect of the
Write-to-Read instructional program for students in the XYZ district?
Descriptive Statistics
- Descriptive statistics summarize
data by reporting a number that represents the entire set of data. For
example, the mean (average) score represents a summary of all of the
scores on a test.
- Descriptive statistics can also be used to organize
a set of data. For example, a table of age ranges and tallies
(frequencies) of participants within those age ranges helps to organize
the observed values of the age variable.
- Descriptive statistics allow researchers to illustrate
entire sets of data so that overall patterns can be seen. For example,
a pie chart showing ethnic classifications can describe these
characteristics for a group of students. Likewise, a graph that
compares reading levels and hearing abilities can illustrate how these
two variables are related.
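The three uses listed above (summarize, organize, illustrate) can be sketched with Python's standard library. The test scores and ages are hypothetical, and the ten-year age ranges are an assumption made for this example, not a required grouping.

```python
from collections import Counter
from statistics import mean

# Summarize: one number that represents a whole set of test scores.
test_scores = [70, 85, 90, 85, 70, 95]
print(mean(test_scores))  # 82.5

# Organize: tally participants into age ranges (a frequency table).
ages = [19, 23, 31, 27, 35, 22, 41]

def age_range(age):
    """Label an age with its ten-year range, e.g. 23 -> '20-29'."""
    low = age // 10 * 10
    return f"{low}-{low + 9}"

tallies = Counter(age_range(a) for a in ages)

# Illustrate: a simple text bar chart of the frequency table.
for label in sorted(tallies):
    print(label, "#" * tallies[label])
```

A charting program would normally draw the last step as a bar chart or pie chart. The text version is used here only to keep the sketch self-contained.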
Inferential Statistics
- Inferential statistics are used to compare groups on one or more
variables. For example, the reading ability of girls might be compared to
that of boys for a subgroup of students in order to make general
statements about the relative abilities of girls and boys.
- Inferential statistics can also be used to relate two variables within
a group of people.
For example, the relationship between nutritional habits and school
achievement levels might be studied for a subgroup of students so that
general statements might be made about how these two variables could be
linked.
- Similar to comparing variables, inferential statistics can be used to
generate mathematical models that help to predict particular outcomes.
For example, an admissions officer might use historical data about high
school performance and subsequent success in college to help inform an
admissions decision.
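As a rough illustration of the prediction example, the sketch below fits a straight line to a handful of past records and uses it to predict a new outcome. It assumes a simple least-squares line, which is only one of many possible models. The GPA figures and the function name fit_line are hypothetical. The sketch produces a prediction only and says nothing about how uncertain that prediction is.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical historical records: high school GPA and first-year college GPA.
high_school_gpa = [2.0, 3.0, 4.0]
college_gpa = [2.2, 2.8, 3.6]

slope, intercept = fit_line(high_school_gpa, college_gpa)

# Predict the college GPA of a new applicant with a 3.5 high school GPA.
applicant_gpa = 3.5
predicted = intercept + slope * applicant_gpa
print(round(predicted, 2))  # 3.22
```

With only three made-up records this is a toy model. A real admissions study would use far more data and would report how much confidence the prediction deserves.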
Generalizing Results - from Sample to Population; from Today to Tomorrow; from Here to There
Remember that a researcher using quantitative methods, and specifically
inferential statistics, is intent on producing results that generalize
to other people, in other settings, and at other times. In order to
achieve this goal, the proper selection of a particular set of
participants, called a sample, is vital. In generalizing, the
researcher has a large group of people in mind - this is called the
population. If the generalization to the population from the sample is
going to be seen by others as valid, the sample needs to mirror the
larger population in every way that is determined to be important. Keep
this important point in mind as you continue to read research reports
and learn about statistics.
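The relationship between a sample and a population can be sketched as follows. The sketch assumes simple random sampling, which is one common way to try to obtain a sample that mirrors the population. It is not the only way, and the text above does not prescribe it. The population of 1,000 reading scores is simulated and entirely hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: reading scores for 1,000 students.
population = [rng.gauss(100, 15) for _ in range(1000)]

# Simple random sample: every student has the same chance of being selected.
sample = rng.sample(population, 50)

# Compare the sample mean with the population mean.
print(round(mean(population), 1), round(mean(sample), 1))
```

The two means will usually be close but not identical. How close they can be expected to be is one of the questions that inferential statistics addresses.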
Suggestions for Succeeding in Studying Statistics
Heed the suggestions of our text's author very carefully. Math, in general,
and statistics, in particular, are both very hierarchical by their
nature. This means that later concepts and techniques depend on earlier
ones. This situation can be either good or bad. If you build a solid
foundation to start with and keep up with the material, you can build
your skills incrementally. The downside is that if you do not
understand a topic or concept, you cannot skip over it, hoping to avoid
it in the future. Reading carefully and thoroughly, taking notes,
working exercises, and practicing new skills will help you achieve
success in this course.