Undercoverage in 21st Century RDD Sampling

Tuesday, December 11, 2007

Overview:
List-assisted Random Digit Dial (RDD) sampling methodology, where sample telephone numbers are selected within 100-series telephone banks with at least one listed number, was developed decades ago when local telephone exchanges relied on such telephone banks as physical building blocks.  In recent years, however, the telecommunication industry has undergone a number of fundamental changes, including a complete transition from analog to digital call routing and a shift from an AT&T-dominated infrastructure to one provided by regional independent operating companies as well as a growing number of alternative landline service providers.  Combined with the decline in the proportion of directory-listed households and the dilution of residential landline assignment density due to a sharp increase in the number of residential exchanges, these changes have all but eliminated the utility of 100-series banks for frame construction and sampling purposes.  As a result, the efficiency and coverage of RDD samples selected using list-assisted methods have diminished significantly.
In spite of these drastic changes, the sampling frame construction methodology for RDD samples has changed very little (if at all) over the years.  This note provides an overview of the research conducted to reexamine the underlying assumptions that were conducive to list-assisted RDD sampling against the realities of today’s telephone network.  Specifically, the extent of undercoverage in traditional RDD samples is quantified, and alternative methods of frame construction are introduced that aim to restore some of the lost coverage.


RDD in the Good Old Days:

A major breakthrough in telephone survey research methodology came when the Mitofsky-Waksberg (1970) technique of RDD sampling was simplified to include only 100-series banks with at least one listed telephone number.  As a result, a two-stage cluster sampling methodology that entailed both operational and technical complexities was replaced by a single-stage equal probability selection method (epsem) that could produce survey estimates with smaller sampling variances.  Of note, these impressive gains were realized at the expense of accepting a modest coverage bias that could be easily tolerated when time and cost savings were taken into account.
Connor and Heeringa (1992) and Brick et al. (1995) estimated that only about 3.5% of all telephone households were not covered when the frame was confined to listed 100-series banks.  After a decade of fundamental changes in US telephony, however, the question that has been conveniently ignored up to now is:


How Large is the Extent of Undercoverage in Current List-assisted RDD Samples?

In order to provide a current estimate of the undercoverage in list-assisted RDD samples, MSG selected a stratified sample of 38,000 telephone numbers from three 100-series bank strata that collectively constitute the entire pool of available landline telephone numbers: 0-listed banks, 1+listed banks, and numbers in the remaining POTS exchanges with mixed-use banks that contain no listed numbers.  For this research, all sample telephone numbers were called a maximum of 9 times using MSG’s operator-attended screening service (GENESYS-CSS) to obtain an initial disposition for each number.  Subsequently, the pool of 2,722 telephone numbers that remained CSS-undetermined (no answer or busy) was cross-referenced against commercial databases to determine a final disposition for each sample telephone number.  Finally, the entire sample was weighted to reflect the stratified design so that unbiased estimates of residential hit rates could be developed for each of the three strata.
As depicted in the following chart, chief among the findings of this research is that the extent of undercoverage in list-assisted RDD samples that exclude 0-listed banks is no longer as little as 3.5%.  Indeed, this rate has now risen to about 20%, representing a non-ignorable and most likely nonrandom subset of US households.
 
More specifically, these results suggest that over 14% of this undercoverage is attributable to residences whose telephone numbers are now in 0-listed banks.  As mentioned earlier, this is a direct byproduct of the significant increase in the number of residential exchanges during the past decade.
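
The weighting step described above can be sketched in a few lines of Python.  This is only an illustration of how a stratified design turns per-stratum hit rates into coverage shares; it is not MSG's actual procedure.  The class and function names are hypothetical, and the frame sizes and counts are made-up placeholders, not figures from the study.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    frame_size: int   # N_h: telephone numbers on the frame in this stratum
    sampled: int      # n_h: numbers sampled from this stratum
    residential: int  # sampled numbers with a final residential disposition


def hit_rate(s: Stratum) -> float:
    """Estimated residential hit rate within one stratum."""
    return s.residential / s.sampled


def residential_shares(strata: list[Stratum]) -> dict[str, float]:
    """Share of all estimated residential numbers that falls in each stratum.

    Each stratum's hit rate is scaled up by its frame size (N_h * p_h),
    which is what weighting a stratified sample back to the frame does.
    """
    estimated = {s.name: s.frame_size * hit_rate(s) for s in strata}
    total = sum(estimated.values())
    return {name: value / total for name, value in estimated.items()}


# Placeholder inputs for illustration only (NOT the study's data).
strata = [
    Stratum("1+ listed banks", frame_size=1000, sampled=1000, residential=600),
    Stratum("0-listed banks", frame_size=3000, sampled=1000, residential=100),
    Stratum("other exchanges", frame_size=1000, sampled=1000, residential=100),
]

shares = residential_shares(strata)
# Undercoverage of a frame restricted to 1+ listed banks is everything else.
undercoverage = 1.0 - shares["1+ listed banks"]
print(shares, undercoverage)
```

With these placeholder inputs, the 0-listed stratum has a low hit rate but a large frame, so it still holds a sizeable share of residential numbers.  That is the mechanism behind the undercoverage discussed here.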

Alternative Frame Construction Methodologies:

List-assisted RDD samples selected from listed 100-series banks no longer provide a representative sample, nor one whose deficiencies could be remedied through post-stratification adjustment techniques.  Moreover, given that more than 16% of US households are now reachable only via cell phones, it can be deduced that traditional RDD samples cover, at best, less than 70% of all US households.  To complicate matters further, a growing percentage of households, currently estimated at about 15% (Blumberg and Luke 2008), are reachable mostly via cell phones.  These cell-only and cell-mostly households present yet another formidable source of coverage bias for list-assisted RDD samples.  Is the current method of RDD sample selection still a practical option?  Clearly, results from this and related studies suggest that if frame construction and sampling methodologies stay the same, results from such samples can no longer withstand scientific scrutiny.
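
The "less than 70%" figure can be reproduced with a rough calculation.  The sketch below assumes that the roughly 80% landline coverage reported earlier applies to the households that are not cell-only, and that cell-only households are missed entirely by a landline frame.  The function name is hypothetical, and households without any telephone service are ignored.

```python
def combined_coverage(landline_coverage: float, cell_only_share: float) -> float:
    """Approximate share of all households covered by a landline-only frame.

    Assumes cell-only households are never covered and that the landline
    coverage rate applies uniformly to the remaining households.
    """
    for value in (landline_coverage, cell_only_share):
        if not 0.0 <= value <= 1.0:
            raise ValueError("rates must be proportions between 0 and 1")
    return landline_coverage * (1.0 - cell_only_share)


# About 80% landline coverage and 16% cell-only households, as cited in the text.
coverage = combined_coverage(0.80, 0.16)
print(f"{coverage:.1%}")  # about 67%, i.e. below 70%
```

Because the cell-only share is stated as "more than 16%", the result is best read as an upper bound under these assumptions.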
In order to eliminate some of the undercoverage currently undermining the utility of RDD samples selected from the listed 100-series banks, we propose that future RDD frames be developed using 1000-series blocks of telephone numbers as their basic building blocks.  The listed status of each block would then be determined by whether the 1000-series block contains any listed numbers; this way, many of the 100-series banks that are currently 0-listed can be included as part of a listed 1000-series block.  In the extreme case, a 1000-series block can consist of nine 0-listed 100-series banks and only one listed 100-series bank.  A transition to listed 1000-series blocks will therefore require additional screening resources to identify residential numbers, in exchange for reduced undercoverage.  Based on our research, frames developed from all 1+listed 1000-series blocks are expected to increase the residential coverage rate from about 80% to about 90%.  On the negative side, household hit rates are expected to decrease by about 10%.
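
The difference between the two building blocks can be illustrated with a short Python sketch.  It assumes telephone numbers are 10-digit strings, so that a 100-series bank is identified by the first 8 digits and a 1000-series block by the first 7.  The function names and the fictional 555 numbers are hypothetical and serve only as an example.

```python
from typing import Callable, Iterable


def bank_100(number: str) -> str:
    """100-series bank: area code + exchange + first two digits of the suffix."""
    return number[:8]


def block_1000(number: str) -> str:
    """1000-series block: area code + exchange + first digit of the suffix."""
    return number[:7]


def listed_units(listed_numbers: Iterable[str],
                 unit: Callable[[str], str]) -> set[str]:
    """Set of banks (or blocks) that contain at least one listed number."""
    return {unit(n) for n in listed_numbers}


def in_frame(number: str, listed_numbers: Iterable[str],
             unit: Callable[[str], str]) -> bool:
    """True if the number falls in a bank (or block) with a listed number."""
    return unit(number) in listed_units(listed_numbers, unit)


# One listed number; the candidate shares its 1000-block but not its 100-bank.
listed = ["2155550123"]
candidate = "2155550987"

print(in_frame(candidate, listed, bank_100))    # False: its 100-bank is 0-listed
print(in_frame(candidate, listed, block_1000))  # True: its 1000-block is listed
```

The candidate number is excluded from a frame built on listed 100-series banks but included once listed status is determined at the 1000-series block level.  The trade-off is that up to 1,000 numbers enter the frame for each listed block, which is why hit rates fall.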
Also, it is no longer justifiable to limit the sampling frames to include only traditional exchanges, since the landline coverage rate even when using listed 1000-series blocks for frame construction is expected to be at best 90%.  Consequently, future frames should be supplemented with the remaining POTS exchanges deemed to have residential assignments.  This too, however, will further dilute the sampling frame as the rate of residential number assignments in such exchanges is currently very low.  Lastly, it is becoming an obvious necessity for future RDD samples to include proper mixtures of cellular phone numbers to compensate for the cell-only and cell-mostly households that are not covered by the landline frames.  Given the growing number of such households, however, it is impractical to suggest standard methods for this supplementation at the present time.


Summary and Conclusions:

Digital transition of the telephone network infrastructure has all but invalidated the utility of the 100-series banks.  The unfolding changes in US telephony have introduced new sources of undercoverage in traditional RDD samples with magnitudes that are no longer ignorable.  Recapturing this coverage will require developing sampling frames that are more inclusive even though this will entail lower residential hit rates and additional costs for screening efforts.
Given the fluidity of the current situation, it is important to implement tracking mechanisms that can assess and report the ongoing changes in the structure of telephone frames.  Also, it is highly advisable for the research community to investigate and shed more light on the emerging peculiarities associated with these changes.  For instance, it will be revealing to know why the time-to-listing of residential numbers among alternative providers is so long and whether such low listed rates are due to number porting.  In parallel, it will be necessary to introduce new screening procedures that can reverse the cost drain associated with decreased hit rates resulting from expanded RDD frames. 
It should be noted that MSG has since conducted a second study based on a sample of 10,000 telephone numbers and obtained results that fully corroborate those presented in this paper.  Moreover, similar results have been reported by government-sponsored surveys such as NHES 2007, NIS 2006, and NHTAS 2007.  MSG will be conducting periodic research to monitor changes that can affect the residential coverage rates associated with list-assisted RDD samples and to report them back to the research community.  Also, MSG is in the process of implementing the third round of this study based on a sample of about 30,000 telephone numbers.  Results from this study will be provided in our upcoming Newsletter.


Marketing Systems Group

565 Virginia Drive
Fort Washington, PA 19034-2706

Phone 800.336.7674
Fax 215.653.7115
