Medical Statistics & Demography Devashish Sharma
Chapter 1: Classification and Tabulation

There are two types of data: (1) primary data and (2) secondary data. Primary data are data originated by the investigator; secondary data are data that the investigator does not originate but obtains from someone else's records.
Both primary and secondary data are broadly divided into two categories:
  1. Attributes (Qualitative data).
  2. Variables (Quantitative data).
Attributes: These are qualitative characteristics which cannot be described numerically; the data are obtained by classifying the presence or absence of the attribute, e.g. sex, nationality, colour of eyes, socioeconomic status. They can be further divided into two groups: (a) Nominal (b) Ordinal.
  1. Nominal: A quality that can be easily differentiated by means of some natural or physical line of demarcation, e.g. physical characteristics such as colour of eyes, sex, physical status of a person, etc.
  2. Ordinal: An ordered set is known as ordinal, i.e. the data are classified according to some criterion which can be given an order, such as socioeconomic status.
Variables: These are quantitative characteristics which can be described numerically. Variables may be discrete or continuous.
Discrete variables: These can take only exact, separate values (usually whole numbers), e.g. number of family members, number of living children, etc.
Continuous variables: A variable that can take any numerical value within a certain range is called a continuous variable, e.g. height in cm, weight in kg, etc.
 
REPRESENTATION OF DATA
Data may be represented either by means of graphs or diagrams or by means of tables.
 
Tables
Tables are of two types: (1) simple or complex tables, depending on the number of measurements of single or multiple sets of items, and (2) frequency distribution tables.
There are certain general principles which should be followed while presenting data in tabulated form:
  1. A table should be numbered.
  2. A title should be given; it should be brief and self-explanatory.
  3. Headings of columns and rows should be clear.
  4. Data must be presented according to size and importance.
  5. If percentages or averages are to be compared, they should be placed as close together as possible.
  6. Footnotes may be given where necessary.
 
Simple Table
Table 1.1:   Showing number of patients attending hospital in winter season*

Months      Male              Female
            No.      %        No.      %
November    250      25.00    150      30.00
December    350      35.00    100      20.00
January     100      10.00     70      14.00
February    400      40.00    180      36.00

*Source: Hospital outdoor attendance
 
Frequency Distribution Table
In a frequency distribution table, the data are first split up into convenient groups (class intervals), and the number of items (frequency) occurring in each group is shown in the adjacent column.
Following are the ages of 23 cases admitted to a hospital: 20, 35, 46, 10, 5, 25, 48, 33, 37, 41, 26, 29, 15, 6, 29, 56, 69, 66, 64, 25, 26, 56, 42.
Age group    Tally marks    Frequencies
0 – 10       ||             2
10 – 20      ||             2
20 – 30      ||||| ||       7
30 – 40      |||            3
40 – 50      ||||           4
50 – 60      ||             2
60 – 70      |||            3
Table 1.2:   Age distribution of admitted cases

Age group (in years)    Cases admitted
                        No.      %
0 – 10                   2        8.70
10 – 20                  2        8.70
20 – 30                  7       30.43
30 – 40                  3       13.04
40 – 50                  4       17.39
50 – 60                  2        8.70
60 – 70                  3       13.04
Total                   23      100.00
In constructing a frequency distribution table, the question that arises is: into how many groups should the data be split? As a rule of thumb, a maximum of about 20 groups may be used when the data set is large, and a minimum of about 5 groups when it is small.
As far as possible, class intervals should be of equal width.
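The grouping of the 23 ages listed above can be reproduced with a short script. This is a minimal sketch, not a prescribed method: it assumes that each class interval includes its lower limit and excludes its upper limit (so an age of 10 is counted in 10 – 20), which is the convention implied by the frequencies shown, and the function name `frequency_table` is hypothetical.

```python
def frequency_table(values, start, width, groups):
    """Count values falling into `groups` equal class intervals [lower, upper)."""
    counts = [0] * groups
    for v in values:
        index = int((v - start) // width)  # which class interval v falls into
        if 0 <= index < groups:
            counts[index] += 1
    total = sum(counts)
    rows = []
    for i, n in enumerate(counts):
        lower = start + i * width
        # (lower limit, upper limit, frequency, percentage of total)
        rows.append((lower, lower + width, n, round(100 * n / total, 2)))
    return rows


ages = [20, 35, 46, 10, 5, 25, 48, 33, 37, 41, 26, 29,
        15, 6, 29, 56, 69, 66, 64, 25, 26, 56, 42]

table = frequency_table(ages, start=0, width=10, groups=7)
for lower, upper, n, pct in table:
    print(f"{lower:>2} - {upper:<2}  {n:>2}  {pct:>6.2f}")
```

Changing `width` and `groups` shows how the choice of class interval alters the shape of the distribution.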
 
GRAPHS OR DIAGRAMS
Bar chart: This is a simple way of representing data. In a bar diagram, the length of each bar is proportional to the magnitude it represents. Bar charts are of three types: (a) Simple bar chart, (b) Multiple bar chart, (c) Component bar chart.
Figure 1.1:
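The idea that bar length is proportional to magnitude can be illustrated with a text-only simple bar chart. The sketch below uses the female attendance figures from Table 1.1; the function name `bar_lengths` and the choice of 36 characters for the longest bar are assumptions made for illustration.

```python
def bar_lengths(values, max_length=40):
    """Scale each value to a bar length proportional to its magnitude."""
    largest = max(values)
    return [round(v * max_length / largest) for v in values]


# Female attendance by month, taken from Table 1.1.
months = ["November", "December", "January", "February"]
female = [150, 100, 70, 180]

lengths = bar_lengths(female, max_length=36)
for month, n, length in zip(months, female, lengths):
    print(f"{month:<9} {'#' * length} {n}")
```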
Pie chart: In a pie chart, the area of each segment of the circle represents a frequency. The total frequency corresponds to 360°, and the area of each segment depends on the angle corresponding to the frequency of that group. A pie diagram is particularly useful when the data are expressed as percentages; in such cases 1% is equal to 3.6°.
Figure 1.2:
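The conversion from frequency to angle is a simple proportion: angle = 360° × frequency / total frequency. As a sketch, the code below applies it to the female percentages of Table 1.1; the function name `pie_angles` is hypothetical.

```python
def pie_angles(frequencies):
    """Return the angle in degrees of each segment; the whole circle is 360."""
    total = sum(frequencies)
    return [360 * f / total for f in frequencies]


# Female attendance percentages from Table 1.1 (November to February).
percentages = [30, 20, 14, 36]

angles = pie_angles(percentages)
for pct, angle in zip(percentages, angles):
    print(f"{pct:>3}%  ->  {angle:6.1f} degrees")
```

Because the inputs here are percentages that add up to 100, each 1% contributes 3.6°, as stated above.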
Pictogram: Small pictures or symbols are used to present data.
Figure 1.3:
Cumulative Frequency Curve or Ogive: Cumulative frequencies are obtained by successively adding the frequencies of each value of the variable. The cumulative frequency table is obtained as follows:
Less than Cumulative Frequency Curve: The less than cumulative frequency table is expressed as:
Age in years    Frequencies    Cumulative frequency
20               5             Less than or equal to 20 = 5
21               3             Less than or equal to 21 = 8
23               7             Less than or equal to 23 = 15
35              10             Less than or equal to 35 = 25
36               3             Less than or equal to 36 = 28
45               5             Less than or equal to 45 = 33
67               8             Less than or equal to 67 = 41
Total           41
Figure 1.4:
More than Cumulative Frequency Curve: The more than cumulative frequency table is expressed as:
Age in years    Frequencies    Cumulative frequency
20               5             More than or equal to 20 = 41
21               3             More than or equal to 21 = 36
23               7             More than or equal to 23 = 33
35              10             More than or equal to 35 = 26
36               3             More than or equal to 36 = 16
45               5             More than or equal to 45 = 13
67               8             More than or equal to 67 = 8
Total           41
Figure 1.5:
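Both cumulative frequency columns are running totals of the same frequencies, taken from opposite ends of the table. The following sketch computes them with the standard-library function `itertools.accumulate`, using the ages and frequencies of the less than table (total 41); the variable names are illustrative.

```python
from itertools import accumulate

ages = [20, 21, 23, 35, 36, 45, 67]
frequencies = [5, 3, 7, 10, 3, 5, 8]

# "Less than or equal to": running total from the smallest age upwards.
less_than = list(accumulate(frequencies))

# "More than or equal to": running total from the largest age downwards,
# then reversed so that it lines up with `ages` again.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for age, f, lt, mt in zip(ages, frequencies, less_than, more_than):
    print(f"{age:>3}  f={f:>2}  <= {age}: {lt:>2}   >= {age}: {mt:>2}")
```

Plotting `less_than` or `more_than` against `ages` gives the corresponding ogive.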
Line Diagram: Line diagrams are used to show the trend of events with the passage of time. Time, the independent variable, is represented on the X-axis and the dependent variable on the Y-axis. It is essential to show the zero point on the Y-axis.
Figure 1.6:
Histogram: A histogram is used to represent a continuous frequency distribution. It is essentially an area chart in which the area of each bar represents the frequency associated with the corresponding interval. It is not essential to show the zero point on the X-axis (horizontal axis), but it is necessary to show it on the vertical axis.
Figure 1.7:
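Because area, not height, carries the frequency in a histogram, the height of each bar is the frequency divided by the width of its class interval. The sketch below illustrates this with the age groups of Table 1.2. Merging the 50 – 60 and 60 – 70 groups into one wider 50 – 70 group is an assumption made here purely to show the effect of unequal widths, and the function name `bar_heights` is hypothetical.

```python
def bar_heights(intervals):
    """Height of each histogram bar = frequency / class width."""
    return [freq / (upper - lower) for lower, upper, freq in intervals]


# (lower limit, upper limit, frequency); the last group is deliberately wider.
intervals = [(0, 10, 2), (10, 20, 2), (20, 30, 7),
             (30, 40, 3), (40, 50, 4), (50, 70, 5)]

heights = bar_heights(intervals)
for (lower, upper, freq), h in zip(intervals, heights):
    area = h * (upper - lower)  # the area of the bar recovers the frequency
    print(f"{lower:>2} - {upper:<2}  frequency={freq}  height={h:.2f}  area={area:.0f}")
```

With equal class intervals the heights are simply proportional to the frequencies; the division matters only when widths differ.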
Frequency Polygon: It is obtained by joining the mid points of the tops of the histogram blocks by straight lines.
Frequency Curve: It is obtained by joining the mid points of the tops of the histogram blocks by a smooth line.
Figures 1.8A and B:
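The points joined by a frequency polygon are the class mid points paired with the class frequencies. As an illustration, the sketch below computes them for the age groups of Table 1.2; the function name `polygon_points` is hypothetical.

```python
def polygon_points(lower_limits, width, frequencies):
    """Pair the mid point of each class interval with its frequency."""
    return [(lower + width / 2, f) for lower, f in zip(lower_limits, frequencies)]


lower_limits = [0, 10, 20, 30, 40, 50, 60]   # age groups 0 - 10, ..., 60 - 70
frequencies = [2, 2, 7, 3, 4, 2, 3]

points = polygon_points(lower_limits, width=10, frequencies=frequencies)
for midpoint, f in points:
    print(f"mid point {midpoint:>4.1f}  frequency {f}")
```

Joining these points with straight lines gives the frequency polygon; drawing a smooth line through them gives the frequency curve.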
Scatter Diagram: A scatter diagram is used to represent two variables simultaneously. Each point represents one individual.
Figure 1.9:
 
MULTIPLE CHOICE QUESTIONS
  1. Scatter diagram shows:
    (a) Trend of events with the passage of time
    (b) Frequency distribution of a continuous variable
    (c) The relation between maximum and minimum values
    (d) Relation between two variables
    (AI, 90)
  2. Sex composition can be demonstrated in which of the following:
    (a) Age pyramid
    (b) Pie chart
    (c) Component bar chart
    (d) Multiple bar chart
    (JIPMER, 91)
  3. Quantitative data can be best represented by:
    (a) Pie chart
    (b) Pictogram
    (c) Histogram
    (d) Bar diagram
    (PGI, 80; AMC, 83, 87)
  4. Percentage of data can be shown in:
    (a) Graph presentation
    (b) Pie chart
    (c) Bar diagram
    (d) Histogram
    (PGI, 79; Delhi, 87)
  5. Graph showing relation between 2 variables is a:
    (a) Scatter diagram
    (b) Frequency polygon
    (c) Picture chart
    (d) Histogram
    (AI, 96)
  6. Weight in kg is a:
    (a) Discrete variable
    (b) Continuous variable
    (c) Nominal scale
    (d) None of the above
    (AI, 96)
  7. All are examples of nominal scale except:
    (a) Age
    (b) Sex
    (c) Body weight
    (d) Socioeconomic status
    (AI, 96)
  8. The average birth weights in a hospital are to be demonstrated by statistical representation. This is best done by:
    (a) Bar chart
    (b) Histogram
    (c) Pie chart
    (d) Frequency polygon
    (AIIMS, 95)
  9. All are included in the nominal scale except:
    (a) Colour of eye
    (b) Sex
    (c) Socioeconomic status
    (d) Occupation
    (MP, 98)
  10. Age and sex distribution is best represented by:
    (a) Histogram
    (b) Pie chart
    (c) Bar diagram
    (d) Age pyramid
    (DNB, 2001)
  11. Continuous quantitative variables are expressed by:
    (a) Bar chart
    (b) Histogram
    (c) Frequency polygon
    (d) Ogive
    (e) Pie chart
    (PGI, 2002)
  12. Cumulative frequencies are represented by:
    (a) Histogram
    (b) Line diagram
    (c) Pictogram
    (d) Ogive
  13. In which type of graphical representation are frequencies represented by the area of a rectangle:
    (a) Bar diagram
    (b) Component bar diagram
    (c) Age pyramid
    (d) Histogram
  14. Two variables can be plotted together by:
    (a) Pie chart
    (b) Histogram
    (c) Frequency polygon
    (d) Scatter diagram
    (AI, 95)
  15. Which of the following statements is false:
    (a) Primary data is originated by the investigator
    (b) Primary data originated by an investigator may be used as secondary data by another investigator
    (c) Data obtained from records of hospitals are secondary data
    (d) None of the above statements are true
  16. Best way to study relationship between two variables is:
    (a) Scatter diagram
    (b) Histogram
    (c) Bar chart
    (d) Pie chart
    (AI, 92)
  17. All are examples of nominal scale except:
    (a) Race
    (b) Sex
    (c) Iris colour
    (d) Socioeconomic status
    (AI, 96)
  18. Low birth weight statistics of a hospital is best shown by:
    (a) Bar charts
    (b) Histogram
    (c) Pictogram
    (d) Frequency polygon
    (AIIMS, Dec 95)
  19. Categorical values are:
    (a) Age
    (b) Weight
    (c) Gender
    (Manipal, 2002)
  20. If the grading of diabetes is classified as “mild”, “moderate” and “severe” the scale of measurement used is:
    (a) Interval
    (b) Nominal
    (c) Ordinal
    (d) Ratio
  21. The best method to show the association between height and weight of children in a class is by:
    (a) Bar chart
    (b) Line diagram
    (c) Scatter diagram
    (d) Histogram
    (AI, 2002)
  22. Mean and standard deviation can be worked out only if data is on:
    (a) Interval/Ratio scale
    (b) Dichotomous scale
    (c) Nominal scale
    (d) Ordinal scale
    (AIIMS, 2005)