
The focus of the Hedges Project (see attached for additional information), which is funded by the National Library of Medicine, is to investigate ways to develop and harness search filters ("hedges") that will improve retrieval of scientifically sound and clinically relevant study reports from large, general purpose, biomedical research bibliographic databases including MEDLINE, EMBASE, and PsycINFO. The purposes of the search filters are:

1. to enable health care providers to do their own clinical searches effectively and efficiently;
2. to help reviewers of published evidence concerning health care problems to retrieve all relevant citations;
3. to provide resources for librarians to help clinicians to construct their own searches; and
4. to provide input to the database producers about their indexing processes and the organization of their databases.

Improved search filters are needed because of the inherent problems of indexing and retrieval in large databases and the widespread, rapidly increasing direct use of these databases by clinicians, researchers, educators, administrators, lawyers, journalists, patients, and the general public. The interests of these users are primarily directed towards the very small subset of the literature that is most relevant to the cause, course, diagnosis, prevention, and management of health care problems. Our long-term objective is to harness the highest quality, clinically relevant contents of these electronic databases so that their effects on health care and policy can be enhanced.

Click on the following links to view the search filters for MEDLINE, EMBASE, and PsycINFO.

Our Clinical Hedges database contains data for the year 2000 for every article in every issue of 170 clinical journals. Of these journals, 161 were indexed in MEDLINE and 135 were indexed in EMBASE. Expert, highly calibrated research staff have identified and tagged the records for articles that report original and review studies (definitions shown in Table 1) about the cause (causation [etiology]), course (prognosis), diagnosis, prevention or therapy or rehabilitation, clinical prediction, or economics of human health disorders, as well as studies of quality improvement of health services, the continuing education of health professionals, and studies of a qualitative nature (definitions shown in Table 2). Studies in these "purpose categories", except for qualitative and cost studies, have been further tagged for whether they "pass" or "fail" pre-specified methodologic criteria for applied clinical research (criteria shown in Table 3).

To develop search filters in MEDLINE, we assembled a list of search terms and phrases in a subset of MEDLINE records matched with a hand search of the contents of 161 journal titles for the year 2000. The search filters were treated as "diagnostic tests" for sound studies and the manual review of the literature was treated as the "gold standard". The sensitivity, specificity, accuracy, and precision (a library science term that is equivalent to the diagnostic test term "positive predictive value") of single- and multiple-term MEDLINE search filters were determined as shown in Table 4. For a given filter, sensitivity is the proportion of high quality articles that are retrieved; specificity is the proportion of low quality or off-topic articles that are not retrieved; precision is the proportion of retrieved articles that are of high quality; and accuracy is the proportion of all articles that are correctly categorized by the search filter. In total, 49,028 articles were included in the analysis and 4,862 unique single terms were tested.
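The four measures defined above can be computed from the cells of a 2x2 table that crosses the filter result (retrieved or not) with the hand-search verdict (high quality or not). The following is a minimal sketch of that arithmetic, not the project's own software. The names `FilterCounts` and `filter_performance` and the example counts are hypothetical and are used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCounts:
    """Cells of the 2x2 table (hypothetical container, not from the project)."""
    tp: int  # high quality articles retrieved by the filter
    fp: int  # low quality or off-topic articles retrieved by the filter
    fn: int  # high quality articles the filter missed
    tn: int  # low quality or off-topic articles correctly not retrieved


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def filter_performance(c: FilterCounts) -> dict:
    """Compute the four measures using the definitions given in the text."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # share of sound articles retrieved
        "specificity": _ratio(c.tn, c.tn + c.fp),  # share of other articles not retrieved
        "precision": _ratio(c.tp, c.tp + c.fp),    # share of retrieved articles that are sound
        "accuracy": _ratio(c.tp + c.tn, total),    # share of all articles classified correctly
    }


# Illustrative counts only; they are not taken from the Hedges data.
example = FilterCounts(tp=90, fp=910, fn=10, tn=8990)
for measure, value in filter_performance(example).items():
    print(f"{measure}: {value:.3f}")
```

With these made-up counts, a filter can be both sensitive and specific yet still have low precision, because sound studies are a small fraction of the database.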

To view the MEDLINE strategies click on the relevant article category: Therapy, Diagnosis, Review, Prognosis, Causation (etiology), Economics, Cost, Clinical Prediction Guides, Qualitative, all categories.

To develop search filters in EMBASE, we assembled a list of search terms and phrases in a subset of EMBASE records matched with a hand search of the contents of 55 of the 135 journal titles indexed in the year 2000. The 55-journal subset comprised the journals that had the highest number of methodologically sound studies, that is, studies that clinicians should be using when making patient care decisions. This selection enriches the sample of target articles, which improves the precision of estimates of search term performance and simplifies data processing, but it is unlikely to bias the estimates of the sensitivity and specificity of search terms. As with MEDLINE, search strategies were treated as “diagnostic tests” for sound studies and the manual review of the literature was treated as the “gold standard”. In total, 27,769 articles were included in the analysis and 4,843 unique single terms were tested.

To view the EMBASE filters, click on the relevant article category: Therapy, Diagnosis, Prognosis, Reviews, Clinical Prediction Guide, Qualitative, Causation (etiology), Economics, all categories.

We recently obtained additional funding from the Canadian Institutes of Health Research and the National Library of Medicine to continue our research in this area. Our new project will address the following questions:

  1. What is the relation between the handsearch database size and performance characteristics of search filters?
  2. To what extent can journal subsets be defined for various clinical disciplines using the bibliographies of systematic reviews? How much is the precision of searching enhanced by running searches in such subsets of journals compared with the entire handsearch journal database? What is the trade-off, if any, in sensitivity for high quality studies?
    • How consistent is the indexing of MEDLINE records for methodologically sound original and review articles on the treatment and diagnosis of human health disorders?
    • What difference, if any, does this make to the performance of search filters that include both MeSH terms and methodologic textwords?
  3. For studies of diagnosis, how accurate and complete is the reporting of studies in MEDLINE records and EMBASE records? Have the accuracy and completeness of reporting improved since the Standards for Reporting of Diagnostic Accuracy (STARD) initiative [2], which set out to enhance and standardize the reporting of diagnostic test studies in journal publications?
  4. Has indexing consistency of studies of diagnostic accuracy improved since the STARD initiative?
  5. Is the retrieval of studies from journals that contain structured abstracts better than from those journals that have semi-structured or unstructured abstracts?
  6. What are the consequences of using different search filters (most sensitive search, most specific search, or "optimal" search [which minimizes the sum of false negative and false positive errors]) on article retrieval for systematic reviews and the corresponding measures and conclusions of systematic reviews of diagnostic accuracy?
  7. When practicing clinicians conduct a time-limited search for studies of treatment or diagnosis with the content terms supplied, what is the yield of relevant citations, comparing the main PubMed search screen with Clinical Queries? Are clinicians more satisfied with the retrieval of studies in MEDLINE when searching using the specific clinical hedges, via the PubMed Clinical Queries screen, than when searching from the PubMed main screen without the clinical hedges? What are end-user perceptions while searching? What are the effects of limiting searches to a core journal subset for internal medicine, compared with the full PubMed journal database?
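
Question 6 distinguishes three ways of choosing a filter: most sensitive, most specific, and "optimal". The sketch below shows one way such a choice could be made among candidate filters. It assumes that "the sum of false negative and false positive errors" refers to raw article counts; summing error rates instead would be an equally plausible reading and could select a different filter. The `Candidate` type, the `pick_filters` function, and all counts are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """One candidate filter and its 2x2 counts (hypothetical structure)."""
    name: str
    tp: int  # sound articles retrieved
    fp: int  # other articles retrieved
    fn: int  # sound articles missed
    tn: int  # other articles not retrieved


def sensitivity(c: Candidate) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Candidate) -> float:
    return c.tn / (c.tn + c.fp)


def pick_filters(candidates: List[Candidate]) -> Dict[str, str]:
    """Return the name of the filter chosen under each of the three criteria."""
    return {
        "most_sensitive": max(candidates, key=sensitivity).name,
        "most_specific": max(candidates, key=specificity).name,
        # Assumption: "optimal" minimizes the raw count of FN + FP errors.
        "optimal": min(candidates, key=lambda c: c.fn + c.fp).name,
    }


# Made-up counts for three imaginary filters; not Hedges results.
candidates = [
    Candidate("broad", tp=98, fp=3000, fn=2, tn=6900),
    Candidate("narrow", tp=60, fp=50, fn=40, tn=9850),
    Candidate("balanced", tp=90, fp=60, fn=10, tn=9840),
]
print(pick_filters(candidates))
```

In this illustration each criterion selects a different filter, which is why the choice of filter could plausibly change which articles reach a systematic review.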