Tag Archives: searching

On Validating Search Strategies

validation

This question came up because of this:

Varela-Lema L, Punal-Riobóo J, Acción BC, Ruano-Ravina A, García ML. Making processes reliable: a validated pubmed search strategy for identifying new or emerging technologies. Int J Technol Assess Health Care. 2012 Oct;28(4):452-9. http://www.ncbi.nlm.nih.gov/pubmed/22995101

What did they mean when they said “a validated PubMed search strategy”? Our MLA systematic review team that is working on search strategies for emerging technologies identification was, shall we say, curious. For this article, it meant that they tested the search results against the next best method previously used (handsearching). The topic was emerging technologies, and what they did was select influential journals and scan the TOCs manually (which actually means by using their own eyeballs). The journals they scanned were: Science, JAMA, Lancet, Annals of Internal Medicine, Archives of Internal Medicine, BMJ, Annals of Surgery, Am J Transplantation, Endoscopy, J Neurology Neurosurgery Psych, Archives of Surgery, Annals of Surgical Oncology, British Journal of Surgery, and Am J Surg Path. Of the 35 articles that qualified from these journals, the search strategy accounted for 29. The ‘missing’ articles lacked appropriate title words relating to the novelty of the concept, or used text words that had been removed from the search strategy to improve specificity (reduce total numbers retrieved).
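To put a number on that result, 29 of 35 handsearched articles works out to a sensitivity (recall) of roughly 83%. Here is a minimal sketch of the arithmetic; the function name is my own illustration and is not taken from the article.

```python
def sensitivity(found: int, reference_total: int) -> float:
    """Share of the reference (handsearched) set that the search retrieved."""
    if reference_total <= 0:
        raise ValueError("reference set must not be empty")
    return found / reference_total


# Figures reported above: 29 of the 35 qualifying articles were retrieved.
recall = sensitivity(29, 35)
missed = 35 - 29
print(f"sensitivity = {recall:.1%}, missed = {missed}")
```

The same calculation works for any comparison against a reference set, whether that set comes from handsearching or from a pool of sentinel articles.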

Is that an appropriate way to validate a search strategy? Probably a pretty fair approach for this one, IMHO, especially since they did such a good job of reporting the specific calculations and details of the actual findings of the searches. Is that how most search strategies are validated? Well, perhaps not.

What I’ve been doing to validate search strategies for systematic reviews is to test and compare the search results to a defined set of sentinel articles. The sentinel articles are selected by the team’s subject experts as being good examples of articles that should be retrieved by a search on the defined question. The requirements beyond topic are that each of the sentinel articles should be older than two years, newer than 1990 (this can be flexible, depending on the topic), and must meet all of the defined inclusion criteria for the review. I usually recommend that the pool of selected sentinel articles include no fewer than 3 and no more than 10 citations. This is to make it possible to achieve complete success, as with each added citation, inclusion of all of them becomes more difficult. I also emphasize that the articles do not need to be excellent or required articles on the topic (i.e., “gold standard” articles), but that it is, in my opinion, actually more effective for testing if the articles are a selection of relevant articles, but not necessarily the best ever written on the topic.
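The mechanical parts of those requirements (pool size and publication age) are easy to check automatically. What follows is only a rough sketch: the function name and parameters are hypothetical, it compares publication years rather than exact dates, and whether an article meets the review's inclusion criteria still has to be judged by the subject experts.

```python
from datetime import date


def check_sentinel_pool(pub_years, today=None, earliest_year=1990, min_age_years=2):
    """Return a list of problems with a proposed sentinel pool (empty if none).

    pub_years: publication year of each proposed sentinel article.
    Year-level comparison is an approximation of "older than two years".
    """
    current_year = (today or date.today()).year
    problems = []
    if not 3 <= len(pub_years) <= 10:
        problems.append(f"pool has {len(pub_years)} articles; expected 3 to 10")
    for year in pub_years:
        if year <= earliest_year:
            problems.append(f"{year} is not newer than {earliest_year}")
        if current_year - year <= min_age_years:
            problems.append(f"{year} is not older than {min_age_years} years")
    return problems


# Hypothetical pool, checked as of mid-2015.
print(check_sentinel_pool([1995, 2005, 2010], today=date(2015, 6, 1)))
```

Since the 1990 cutoff can be flexible depending on the topic, it is left as a parameter rather than hard-coded.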

Draft versions of the search are tested against this set of articles, and if any “drop out” (are not included) we need to then figure out why, and determine whether to revise the search to include them, or justify the exclusion, or request NLM to correct the coding error in that article’s record. In these last two cases, the exclusion must be reported in the methods. Ideally, one would also describe the strengths, weaknesses, and/or limitations of the search strategy.
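Checking for dropouts is, at heart, a set difference between the sentinel list and whatever the draft search retrieved. Here is a minimal sketch, assuming both lists are available as record identifiers (for example, PMIDs exported from the search results); the IDs below are made-up placeholders, not real records.

```python
def find_dropouts(sentinel_ids, retrieved_ids):
    """Return sentinel IDs that the draft search failed to retrieve, sorted."""
    return sorted(set(sentinel_ids) - set(retrieved_ids))


# Hypothetical placeholder IDs for illustration only.
sentinels = ["1001", "1002", "1003", "1004"]
draft_results = ["0999", "1001", "1003", "1004", "2000"]

dropouts = find_dropouts(sentinels, draft_results)
print(dropouts)
```

Any ID that comes back in the dropout list is one to investigate: revise the search, justify the exclusion, or report a record-level indexing problem.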

Here are some citations to other ways in which searches are validated.

Hausner E, Waffenschmidt S, Kaiser T, Simon M. Routine development of objectively derived search strategies. Systematic Reviews 2012 1:19. http://www.systematicreviewsjournal.com/content/1/1/19
NOTE: This is basically the same “sentinel articles” approach described above.

Hausner E, Guddat C, Hermanns T, Lampert U, Waffenschmidt S. Development of search strategies for systematic reviews: validation showed the noninferiority of the objective approach. J Clin Epid Feb 2015 68(2):191-199. http://www.sciencedirect.com/science/article/pii/S0895435614003874
NOTE: Interesting article tests the reproducibility of Cochrane reviews and their reported search strategies. The emphasis is on the need for objective and reproducible search strategies in systematic review publications.

Van Walraven C, Bennett C, Forster AJ. Derivation and validation of a MEDLINE search strategy for research studies that use administrative data. Health Serv Res. 2010 Dec;45(6 Pt 1):1836-45. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3026961/
NOTE: Compared to handsearching.

Walsh ES, Peterson JJ, Judkins DZ. Searching for disability in electronic databases of published literature. Disability and Health Journal Jan 2014 7(1):114-118. http://www.sciencedirect.com/science/article/pii/S1936657413001647
NOTE: Very interesting two part test to manage the quality control of the search strategy. First, they used the method described above (compared to sentinel articles), then, because the search excluded specific topic terms in favor of broad keyword searching, they validated by comparing retrieval to the results of a known topic search.

Hempel S, Rubenstein LV, Shanman RM, Foy R, Golder S, Danz M, Shekelle PG. Identifying quality improvement intervention publications–a comparison of electronic search strategies. Implement Sci. 2011 Aug 1;6:85. doi: 10.1186/1748-5908-6-85. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3170235/
NOTE: Compared relevance and quality of search strategies by results being reviewed for relevance by independent experts. My personal misgivings about this method for validating a search is that it cannot test for what is missed that you don’t know about.

Tanon AA, Champagne F, Contandriopoulos AP, Pomey MP, Vadeboncoeur A, Nguyen H. Patient safety and systematic reviews: finding papers indexed in MEDLINE, EMBASE and CINAHL. Qual Saf Health Care. 2010 Oct;19(5):452-61. http://www.ncbi.nlm.nih.gov/pubmed/20457733
NOTE: Compared sensitivity & specificity for new search strategies in comparison to previously published search strategies on the same topic. Validated by comparing to a large selection of sentinel articles. Very difficult to achieve, and an ambitious strategy!

Brown L, Carne A, Bywood P, McIntyre E, Damarell R, Lawrence M, Tieman J. Facilitating access to evidence: Primary Health Care Search Filter. Health Info Libr J. 2014 Dec;31(4):293-302. http://www.ncbi.nlm.nih.gov/pubmed/25411047
NOTE: Interesting strategy that first created and validated a search strategy in OVID for quality control over the search development process, and then converted the strategy to PUBMED and validated it again. The validation was again through the selection of a set of sentinel citations, but they explicitly selected for the best quality articles in the topic and referred to the set as the “gold standard.”

Damarell RA, Tieman JJ, Sladek RM. OvidSP Medline-to-PubMed search filter translation: a methodology for extending search filter range to include PubMed’s unique content. BMC Med Res Methodol. 2013 Jul 2;13:86. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3700762/
NOTE: Same strategy as the article by Brown, Carne…Tieman above, but a different topic.

You can find more articles on this topic by exploring the following search results:

(validated OR validation OR “quality control” OR “quality assessment”) search strategy review http://www.ncbi.nlm.nih.gov/pubmed/?term=(validated+OR+validation+OR+%22quality+control%22+OR+%22quality+assessment%22)+search+strategy+review
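If you want to adapt that search and share it as a link, the URL is just the search string with spaces turned into plus signs and quotation marks percent-encoded. Here is a small sketch using Python's standard library; the helper name is my own, and the base address is simply copied from the link above.

```python
from urllib.parse import quote_plus

# Base address copied from the link above.
PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/pubmed/?term="


def pubmed_url(query: str) -> str:
    """Encode a search string: spaces become '+', quotes become %22.

    Parentheses are left unencoded so the result matches the link above.
    """
    return PUBMED_BASE + quote_plus(query, safe="()")


query = '(validated OR validation OR "quality control" OR "quality assessment") search strategy review'
url = pubmed_url(query)
print(url)
```

Note that the search string uses straight quotation marks; curly quotes pasted from a word processor would be encoded differently and may not be treated as phrase markers.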

Finding & Using Images, Lessons Learned (the Hard Way)

I can talk for a very long time about finding Creative Commons and Public Domain images, and the tricks and troubles associated with doing so. This is going to be an image-heavy words-light post about some of what I’ve learned over the past several years. For the short short version, go right to the bottom for my favorite places to find images to use.

What is Creative Commons?

Creative Commons: New License Chooser

Many people say “Creative Commons” as a one-size-fits-all term for I-can-use-this-picture-without-paying-money-and-not-get-in-legal-trouble. That isn’t what it means. Some people still believe that they can take and use any image on any website. This is not remotely true! I myself use a lot of screenshots, and had some earnest worried talks with University Counsel before starting to do so. Strictly speaking, the screenshots may be a questionable practice from a legal point of view, but it has become common practice and rarely causes a problem as long as the site is open to the public. The caveat? If someone complains, be prepared to take the image down, and apologize.

Most of the images people are thinking of that fall into the safe-to-use category are actually public domain. Creative Commons usually applies when someone made a new image that could be copyrighted and has chosen to give advance permissions for some types of use, but not all. This means you can re-use the images in some ways, but can still get into trouble if you don’t do something they asked (like keep their name with the image) or do something they asked people NOT to do (like rework the image).

The Creative Commons organization has a tool to help people decide what license to choose for their own images. I find that tool very helpful in understanding how the creator of the image was thinking about their own work, and thus allowing me to be more respectful of their stated requirements.

Creative Commons: http://creativecommons.org/

Choose a License: http://creativecommons.org/choose/

University of Michigan Libraries Research Guides: Creative Commons: http://guides.lib.umich.edu/content.php?pid=483716

What is Public Domain?

Public domain basically means that an image or text is not protected by copyright. This can be because of who created it, what type of information it includes, or that it used to be copyrighted and is now too old. The best simple overview of this is from Lolly Gasaway.

Lolly Gasaway: When US Works Pass Into the Public Domain: http://www.unc.edu/~unclng/public-d.htm

If you want a lot more information about public domain, check out the manifesto, which discusses many of the issues of public domain in the contemporary online environment.

The Public Domain Manifesto: http://www.publicdomainmanifesto.org/node/8

I Can Use This Because It’s Old … Maybe

Historic images are often considered to be public domain, but … not always. If the photograph of the original image is copyrighted, then it doesn’t matter how old the original was. If I go to a museum, they probably won’t allow me to take pictures. If they do, and I take a picture of a painting there (like this picture of St. Apollonia from the University of Michigan Sindecuse Museum of Dentistry), there are issues with (a) did I have permission to take the image, (b) did I have permission to share the image, (c) did I give permission to use or share the image, and so forth.

Sindecuse Museum of Dentistry: St. Apollonia

OR (and this is important), I could follow the assumptions of the Wikimedia Commons team:

The official position taken by the Wikimedia Foundation is that “faithful reproductions of two-dimensional public domain works of art are public domain, and that claims to the contrary represent an assault on the very concept of a public domain”. For details, see Commons:When to use the PD-Art tag. This photographic reproduction is therefore also considered to be in the public domain.

In that case, a straight-on shot of the complete work is usually considered safe to use. That is also the case for this next image.

This next image is of a Japanese calligraphy created in 1923. Under US law, that would be in the public domain. I found it in Wikimedia Commons, my favorite place to find public domain and Creative Commons images. I like them best because they give the full provenance (or history) of the image, why they think it is legally OK to use, where they got it, and what license or credit should be given with the image when used.

Example public domain image

From Wikimedia Commons


File:Zen painting and calligraphy on silk signed Hachijûgo (85 year old) Nantembô Tôjû, 1923.jpg: http://commons.wikimedia.org/wiki/File:Zen_painting_and_calligraphy_on_silk_signed_Hachij%C3%BBgo_(85_year_old)_Nantemb%C3%B4_T%C3%B4j%C3%BB,_1923.jpg

Because this is considered public domain, there is no license statement provided with it, but if I dig into the information in the image summary, I can find more about where it came from to provide a proper source and attribution. Notice that the image is actually from a book or journal, still in print, and available for purchase. I am sure glad that Wikimedia made the judgment call on this one, because I would not feel safe using it if they had not already established precedent.

Description
English: Zen painting and calligraphy on silk by Nakahara Nantenbo signed “Hachijūgo (85 year old) Nantembō Tōjū”, 1923
Date: 1923
Source: Andon, No. 85, p. 59
Author: Nakahara Nantenbo

Because it is considered public domain, I can do pretty much ANYTHING with it. I can use it in writings or slides, change its format, reprint it as a poster or tshirt and sell it, recolor it, make it 3D, make a parody of it, convert it into music, move it into a virtual world, and so forth. In the image below, I moved it into Second Life, where I made a poster of it for the wall of a memorial for the victims of the 2011 Japanese tsunami.

Pic of the day - Sad Face

I Can Use This Because I Made It Different … Sometimes

Here’s a similar example, with important differences.

From Wikimedia Commons

Low-resolution image of Escher’s Relativity print


Escher’s Relativity: http://en.wikipedia.org/wiki/File:Escher%27s_Relativity.jpg

Now this one is from 1953, and is not out of copyright anywhere in the world. Even more confusing, the image is taken directly from Escher’s own official website. How can they get away with this?

“This image is of a drawing, painting, print, or other two-dimensional work of art, and the copyright for it is most likely owned by either the artist who produced the image, the person who commissioned the work, or the heirs thereof. It is believed that the use of low-resolution images of works of art for critical commentary on
* the work in question,
* the artistic genre or technique of the work of art or
* the school to which the artist belongs
on the English-language Wikipedia, hosted on servers in the United States by the non-profit Wikimedia Foundation, qualifies as fair use under United States copyright law.”

People love this print, and it has been re-used in many different ways. Here is an example of a couple of avatars wandering around in a 3d replica of the space.

Cool Toys Pic of the day - Primtings Museum

Here is a link to a copyrighted image of a reconstruction of the print as a real world 3d object, made with … LEGOs.

Escher’s “Relativity” in LEGO®: http://www.andrewlipson.com/escher/relativity.html

What makes these OK? They aren’t exactly low-res images, but what they are is substantially innovative reworkings of the original concept. That can be a tricky point in intellectual property law, so don’t rely on it as a get-out-of-jail-free card.

I Can Use This Because the Government Made It … Maybe

There is a perception that it is OK to take and use any image on any website that has a government web address (“.gov”). Often that’s pretty close, but again, it is not always true, only sometimes. Sometimes the government uses images made by other people who are not government employees. Rights for those images belong to the person who made them, or to the company they work for. You need to know which, and you usually need to ask first just to be sure. Read the fine print.

“Some of these photos are in the public domain or U.S. government works and may be used without permission or fee. However, some images may be protected by license or copyright. You should read the disclaimers on each site before using these images.” U.S. Government Photos and Images: http://www.usa.gov/Topics/Graphics.shtml

So, I’m a bit of an astro fan, and just love astronomy images. Here is one from the Hubble telescope, credited to NASA, the European Space Agency, and Hubble themselves.

Cat's Eye Nebula
Cat’s Eye Nebula: http://hubblesite.org/gallery/album/pr2004027a/
Credit: NASA, ESA, HEIC, and The Hubble Heritage Team (STScI/AURA)
Acknowledgment: R. Corradi (Isaac Newton Group of Telescopes, Spain) and Z. Tsvetanov (NASA)

If you look at the image on the Hubble site, they give information for how to credit the image, and it isn’t entirely clear on the image page if it is legal to use or not. They do have another page for all their images stating they use Creative Commons licensing requiring attribution.

Hubble: Usage of images, videos and web texts: http://www.spacetelescope.org/copyright/

They actually had a big contest for folk who used or spotted uses of their images in Popular Culture. I won 3rd place for “Weird” with this dress and avatar design, in which the image above (the Cat’s Eye Nebula) became the eyes of the avatar. And this is all perfectly fine and hunky-dory, and I even won copies of some images and videos from them.

Hubble Gown

SEARCHING

There are a TON of places that say they offer a search for images that are creative commons, public domain, or royalty free. Most of them are not actually very safe to use. I’m going to show you several here, with brief comments along the lines of yes, no, maybe so.

CompFight
Creative Commons Search, Yea or Nay
CompFight: http://compfight.com/

MAYBE.
They use the Flickr API to provide an alternate Flickr search experience, powered by ads. Why not just go to Flickr? PS – There are so many other search engines that say they are Creative Commons search engines that are really just using the Flickr API. I am not going to list them all.

Creative Commons
Creative Commons
Creative Commons: Search: http://search.creativecommons.org/

MAYBE.
“Do not assume that the results displayed in this search portal are under a CC license. You should always verify that the work is actually under a CC license by following the link.”

Flickr
Creative Commons Search, Yea or Nay
Flickr: The Commons: http://www.flickr.com/commons

Creative Commons Search, Yea or Nay
Flickr: Creative Commons: http://www.flickr.com/creativecommons/

CAREFUL! YES &/OR MAYBE
When people search creative commons images in Flickr, they often aren’t aware that there are TWO completely different “Commons” in Flickr. When you search or browse “The Commons” you are getting images from schools, libraries, museums, and other famous and authoritative sources. When you search Flickr’s “Creative Commons” search you are trusting whoever put the image there. Not everyone is equally savvy and responsible.

I once posted a slide deck that used images from Flickr’s Creative Commons search. I listed the sources of all the images, with links, and then went back to each Flickr page and posted a thank you for making their image CC-licensed, giving the link to where I used it. One of the posters then commented on my comment, saying, “Don’t thank me! I took the image from [insert here name of most famous newspaper you can think of].” Oh. Uh oh. Oh, no.

Google: Image Search
Creative Commons
Google: Advanced Image Search: https://www.google.com/advanced_image_search?hl=en&biw=953&bih=787&q=commons&tbm=isch

Close up:
Creative Commons

SOMETIMES, (BUT MOSTLY NO).
Google Image Search is a case of Good News, Bad News. The Good News is, “Wow, look! They have a way to let you search by license or image rights!” The Bad News is that, of necessity, they trust the information provided on the source page about licensing, and sometimes people steal images, often without realizing they are doing so, and repost them with more freedom than was provided under the original license. That means many, if not most, of the images listed as Creative Commons in Google’s Image Search, well, AREN’T!!

Imagestamper
Cool Toys Pic of the Day - ImageStamper
Imagestamper: http://www.imagestamper.com/

MAYBE. MAYBE NOT.
Not an image search site, but a license management site. Imagestamper offers to manage licensing for you, and keep a record of the license you negotiated at a particular point in time. The logic is that some images appear to be Creative Commons at one point in time, and then later the owner changes their mind. This really does happen. Ouch. Unfortunately, I’m not sure it’s still alive.

In their words: “The service is in early beta. Currently the service works with images hosted on Flickr, but we will soon add support for images hosted on deviantART and a number of other image-sharing websites. © 2008-2011 ImageStamper.com”

Morgue File
Creative Commons Search, Yea or Nay
Morgue File: http://www.morguefile.com/archive/

NO.
They use their own home-grown variant of Creative Commons licensing that hasn’t been tested in the courts, and is thus not reliable. Basically, who really knows if these images are safe to use, and how to properly provide attribution?

Noun Project
Cool Toys Pic of the day - Noun Project
Noun Project: http://thenounproject.com/

YES, SOMETIMES
Absolutely fabulous site, but only for images of icons created for and submitted to the site. More info.

Ookaboo
Cool Toys Pic of the day - ookaboo
Ookaboo: http://ookaboo.com/o/pictures/

MAYBE.
Mostly mines Wikimedia Commons for content, under a different interface. Personally, I prefer the original. More info.

SpinXPress
Cool Toys pics of the day: SpinXPress
SpinXPress: http://www.spinxpress.com/getmedia

SOMETIMES.
This is basically a different interface to searching from the Creative Commons portal, with many of the same caveats, and some new ones, too. More Info

Wikimedia Commons
Cool Toys Pic of the Day - Wikimedia Commons (Open Free Pics, Photos and Media)
Wikimedia Commons: http://commons.wikimedia.org/

YES.
But always:
1) click on the image to go to the page for the image information, and
2) scroll to the bottom of the page for each image to check licensing information and use their recommended credit or attribution statement.

Wylio
Cool Toys Pic of the day - Wylio
Wylio: http://www.wylio.com/

MAYBE. MAYBE NOT.
They used to say they’d manage licensing for you, now that’s not clear. They used to just let you search, and now they require you to log in with a Google account. They want you to pay them for the full service, but these are images that are supposed to be free. And on the bottom of the results pages, they feed you to sites that sell images for money. I dunno. Hmmm.

ASSUMPTIONS THAT WORK SOMETIMES, BUT NOT ALWAYS

I can use this because:
– it’s old
– it’s different from the original
– the government made it

BEST PRACTICES

Read the fine print.
Ask for permission if you aren’t sure.
Don’t assume.

MY FAVORITE PLACES FOR FREE IMAGES

1. Wikimedia Commons: http://commons.wikimedia.org

2. Flickr Commons (NOT Flickr Creative Commons): http://www.flickr.com/commons

3. For University of Michigan people, here is more information, LOTS more information!

Research Guides: Images: http://guides.lib.umich.edu/content.php?pid=32604

Your Health – A Lot of It Is About Asking Questions. The Right Questions.

Pic of the day - What makes it happen smart?

During Sunday’s frantically paced #HCSM Twitter chat, one of the topics that came up was the problem of getting help and learning when you have a new diagnosis. That is when your brain usually goes into some sort of frozen state, and you forget important things, like that you already knew some of this, or how to spell the words, or to ask how to spell the name of the thing you have. You know, things you would think of if you weren’t sitting there stunned.

I had two recommendations. 1) Ask a librarian for help, ASAP, especially a medical librarian. 2) Look for suggestions or lists of questions you should be asking, just to make sure you don’t miss something important. Here are some resources and tips for both.

#1: ASK A LIBRARIAN

Ask a librarian

A lot of people replied that it isn’t as easy as I think to ask a librarian. Not because they were embarrassed about asking, but because they couldn’t find a librarian. Oh. Really?!? Oh, wow.

So the first thing I did was post a couple of links on where and how to find medical librarians. Now, of course, you can always ask a healthcare professional; it is just that I assume that you’ve already tried that, or that your appointment was too short, or that you didn’t think of the right questions then. Libraries are great for just dropping in and asking for help.

Find a Librarian: National Network of Libraries of Medicine Find a Librarian: Medical Library Association

National Network of Libraries of Medicine: Members: http://nnlm.gov/members/

Medical Library Association: For Health Consumers: http://mlanet.org/resources/consumr_index.html#2

Find a Librarian: MedlinePlus Find a Librarian: healthfinder

MedlinePlus: Find a Library: http://www.nlm.nih.gov/medlineplus/libraries.html

healthfinder: Find Services Near You: http://healthfinder.gov/FindServices/

Find a Librarian: LoC / NLS Find a Librarian: Ed.gov

Library of Congress: NLS Reference Directories: Library Resources for the Blind and Physically Handicapped 2009: http://www.loc.gov/nls/reference/directories/resources.html

Ed.gov: Library Search: http://nces.ed.gov/surveys/libraries/librarysearch/

Find a Librarian: WorldCat Find a Librarian: Internet Public Library

Worldcat: Libraries: http://www.worldcat.org/libraries

Internet Public Library: Library Locator: http://www.ipl.org/div/liblocator/

The next concern was along the lines of “What about finding a librarian at 3:00AM when I can’t sleep because I’m so frantically worried about everything that’s happening right now? Librarians are hard to find at 3AM!”

Believe it or not, there is a solution for this, too.

Internet Public Library: Ask Us: http://www.ipl.org/div/askus/
“This service runs 24 hours/day, 7 days/week during most of the year.”

Yes, really.

Ask a Librarian: Internet Public Library

#2: FINDING THE RIGHT QUESTIONS

Ten questions to ask your doctor

When you are first diagnosed, there can be this sense of urgency, a need to find out everything you need to know, except … where do you start? There is so much to learn! What do you need to know first? What questions should you be asking?

For most diagnoses, someone has written up a list of questions for exactly this. The problem is first, thinking to ask what questions to ask, and second, finding these lists of questions. It is kind of like being granted three wishes in a fairy tale, with the rule “No wishing for more wishes!” Far too often, people find out later questions they wish they had asked at the beginning.

There are a few search strategies I’ve found helpful over the years for finding these. You can ask a librarian for help, but you can also do your own searches. For each of these examples, try adding in the name of your diagnosis to the search strategy given below. Try changing the word “doctor” to the type of health professional you are seeing — nurse, or therapist might be other choices.

See what lists of questions you find. Then write down the questions you like, and make a list. Order the questions by what’s most important, because sometimes there won’t be time for all of the questions.

(1)
“ask * doctor”

(2)
“question to ask” doctor

(3)
“ask * questions” doctor

(4)
“asking * questions” doctor

(5)
(“frequently asked questions” OR FAQs OR FAQ)
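If you plan to run all five searches for the same diagnosis, it can help to generate the full set at once. Here is a small sketch under a few assumptions of my own: the diagnosis is added as a quoted phrase at the end of each search, the word “doctor” is swapped for whichever health professional you name, and the diagnosis shown is only an example.

```python
# The five search patterns listed above, with straight quotation marks.
QUESTION_TEMPLATES = [
    '"ask * doctor"',
    '"question to ask" doctor',
    '"ask * questions" doctor',
    '"asking * questions" doctor',
    '("frequently asked questions" OR FAQs OR FAQ)',
]


def build_searches(diagnosis, professional="doctor"):
    """Return one search string per template, with the diagnosis appended."""
    searches = []
    for template in QUESTION_TEMPLATES:
        # Swap "doctor" for another health professional where it appears.
        query = template.replace("doctor", professional)
        searches.append(f'{query} "{diagnosis}"')
    return searches


# Hypothetical example diagnosis.
for search in build_searches("celiac disease", professional="nurse"):
    print(search)
```

Paste each printed line into your search engine of choice and collect the question lists that turn up.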

AHRQ: Questions are the Answer

Remember these tips from the Agency for Healthcare Research and Quality — Questions ARE the Answer.

AHRQ: Questions are the Answer: http://www.ahrq.gov/legacy/questions/index.html

They have ten standard questions, and a tool to build and print your own custom question list. Here are the ten basic ones.

1. What is the test for?
2. How many times have you done this procedure?
3. When will I get the results?
4. Why do I need this treatment?
5. Are there any alternatives?
6. What are the possible complications?
7. Which hospital is best for my needs?
8. How do you spell the name of that drug?
9. Are there any side effects?
10. Will this medicine interact with medicines that I’m already taking?

What’s Wrong With Google Scholar for “Systematic” Reviews

Systematic!!!

Monday I read the already infamous article published January 9th which concludes that Google Scholar is, basically, good enough to be used for systematic reviews without searching any other databases.

Conclusion
The coverage of GS for the studies included in the systematic reviews is 100%. If the authors of the 29 systematic reviews had used only GS, no reference would have been missed. With some improvement in the research options, to increase its precision, GS could become the leading bibliographic database in medicine and could be used alone for systematic reviews.

Gehanno JF, Rollin L, Darmoni S. Is the coverage of google scholar enough to be used alone for systematic reviews. BMC Med Inform Decis Mak. 2013 Jan 9;13(1):7. http://www.biomedcentral.com/1472-6947/13/7/abstract

Screen Shot: "Is the coverage of google scholar enough ..."

Leading the argument from the library perspective is Dean Giustini, who has already commented on the problems of:
– precision
– generalizability
– reproducibility

Giustini D. Is Google scholar enough for SR searching? No. http://blogs.ubc.ca/dean/2013/01/is-google-scholar-enough-for-sr-searching-no/

Giustini D. More on using Google Scholar for the systematic review. http://blogs.ubc.ca/dean/2013/01/more-on-using-google-scholar-for-the-systematic-review/

While these have already been touched upon, what I want to do right now is to bring up what distresses me most about this article, which is the same thing that worries me so much about the overall systematic review literature.

Problem One: Google.

Google Search

First and foremost, “systematic review” means that the methods to the review are SYSTEMATIC and unbiased, validated and replicable, from the question, through the search, delivery of the dataset, to the review and analysis of the data, to reporting the findings.

Let’s take just a moment with this statement. Replicable means that if two different research teams do exactly the same thing, they get the same results. Please note that Google is famed for constantly tweaking their algorithms. SEOMOZ tracks the history of changes and updates to the Google search algorithm. Back in the old days, Google would update the algorithm once a month, at the “dark of the moon”, and the changes would then propagate through the networks. Now they want to update them more often, so there is no set time. It happens when they choose, with at least 23 major updates during 2012, and 500-600 minor ones. That is more than once a day, on average. That means you can do exactly the same search later in the same day, and get different results.

Google Algorithm Change History: http://www.seomoz.org/google-algorithm-change

That is not the only thing that makes Google search results unable to be replicated. Google personalizes the search experience. That means that when you do a search for a topic, it shows you what it thinks you want to see, based on the sort of links you’ve clicked on in the past, and your browsing history. If you haven’t already seen the Eli Pariser video on filter bubbles and their dangers, now is a good time to take a look at it.


TED: Eli Pariser: Beware Online Filter Bubbles. http://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles.html

If you are using standard Google, it will give you different results than it would give to your kid sitting on the couch across the room. This is usually a good thing. It is NOT a good thing if you are trying to use the search results to create a standardized dataset as part of a scientific study.

People often think this is not a big problem. All you have to do is log out of any Google products. Then it goes back to the generic search, and you get the same things anyone else would get. Right? Actually, no. Even if you switch to a new computer, in a different office or building, and don’t log in at all, Google is really pretty good at making a guess at who you are based on the topics you search and the links you choose. Whether or not it guesses correctly doesn’t matter for my concerns; the problem is that it is customizing results AT ALL. If there is any customization going on, then that is a tool that is inappropriate for a systematic review.

Now, Google does provide a way to opt-out of the customization. You have to know it is possible, and you have to do something extra to turn it off, but it is possible and isn’t hard.

Has Google Popped the Filter Bubble?: http://www.wired.com/business/2012/01/google-filter-bubble/

Now, the most important question is: does it actually turn off the filter bubble? Uh, um, well, … No. It doesn’t. Even if you turn off personalization, go to a new location, and use a different computer, Google still knows where that computer is sitting and makes guesses based on where you are. That Wired article about Google getting rid of the filter bubble was dated January of 2012. I participated in a study done by DuckDuckGo on September 6th, and reported in November on their blog. Each participant ran the same search strategies at the same time, twice, once logged in and once logged out. They grabbed screenshots of the first screen of search results and emailed them to the research team. The searchers were from many different places around the world. Did they get different results? Oh, you betcha.

Magic keywords on Google and the consequences of tailoring results: http://www.gabrielweinberg.com/blog/2012/11/magic-keywords-on-google-and-the-consequences-of-search-tailoring-results.html
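
To make “different results” concrete, one way to quantify it is to compare the sets of first-page links that each searcher captured. The sketch below is not the method DuckDuckGo used; it is a minimal illustration using Jaccard overlap, and the searcher labels and URLs are made-up placeholders.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of links two result lists have in common (1.0 = identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pairwise_overlap(results):
    """Jaccard overlap for every pair of searchers."""
    return {
        (p, q): jaccard(results[p], results[q])
        for p, q in combinations(sorted(results), 2)
    }

# Hypothetical first-page results for the same query run at the same moment.
results = {
    "searcher_a": ["example.org/1", "example.org/2", "example.org/3"],
    "searcher_b": ["example.org/1", "example.org/2", "example.org/4"],
    "searcher_c": ["example.org/1", "example.org/2", "example.org/3"],
}

overlap = pairwise_overlap(results)
# A replicable search tool would give every pair an overlap of 1.0.
replicable = all(score == 1.0 for score in overlap.values())
```

This comparison ignores rank order, so it is a generous test. Even so, a single pair with overlap below 1.0 means the tool did not return the same dataset to everyone.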

Now try to imagine the sort of challenge we face in the world of systematic review searchers. Someone already published a systematic review. You want to do a followup study. You want to use their search strategy. You need to test that you are using it right, so you limit the results to the same time period they searched, to see if you get the same numbers. I don’t know about you, but I am busting with laughter trying to imagine a search in Google, and saying, “No, I just want the part of Google results that were available at this particular moment in time five years, three months, and ten days ago, if I was sitting in Oklahoma City.” Yeah, right.

Take home message? Google cannot be used for a systematic review. Period. And not just because you get 16,000 results instead of 3,000 (the precision and recall question), or because Google is a more comprehensive database than the curated scholarly databases that libraries pay for and thus you end up with poor quality results (also impacting on sensitivity and specificity), but purely on methodological grounds.

Problem Two: Process.

Systematic Reviews and Clinical Practice Guidelines to Improve Clinical Decision Making

First and foremost, “systematic review” means that the methods to the review are SYSTEMATIC and unbiased, validated and replicable, from the question, through the search, delivery of the dataset, to the review and analysis of the data, to reporting the findings.

Doing a systematic review is supposed to be SYSTEMATIC. Not just systematic for the data analysis (a subset of which is the focus of the Gehanno Google Scholar article), but systematic for the data generation, the data collection, the data management, defining the question, analysing the data, establishing consensus for the analysis, and reporting the findings. It is systematic ALL THE WAY THROUGH THE WHOLE PROCESS of doing a real systematic review. The point of the methodology is to make sure the review is unbiased (to the best of our ability, despite being done by humans), and replicable. If both of those are true, someone else could do the same study, following your methodology, and get the same results. We all know that one of the real challenges in science is replicating results. That doesn’t mean it is OK to be sloppy.

The Gehanno article tries to test a tiny fraction of the SR process – whether you can find the results. But they search for them backwards from the normal way such a search would be done. The idea that the final selected studies of interest in specific systematic reviews will be discoverable in Google Scholar is also fairly predictable, given that Google Scholar scrapes content from publicly accessible databases such as PubMed, and thus duplicates that content.

It is unfortunate that their own methodology is not reported in sufficient detail to allow replicating their study. What they’ve done is a very tiny partial validation study to show that certain types of content are available in Google Scholar. That is important for showing the scope of Google Scholar, but has absolutely nothing to do with doing a real systematic review, and the findings of their study should have no impact on the systematic review process for future researchers. Specifically, this sentence is what is most misstated.

“In other words, if the authors of these 29 systematic reviews had used only GS, they would have obtained the very same results.”

All we really know is what happened for the researchers who did these several searches on the days they searched. It might have been possible, but to say that they would have obtained the same results is far too strong a claim. For the statement above to be true, it would have been necessary to first find a way to lock in Google search results for specific content at specific times; second, to replicate the search strategies from the original systematic reviews in Google Scholar and to compare coverage; third, to have vastly more sophisticated advanced searching allowing greater precision, control, and focus; and so forth. Gehanno et al. are well aware of these issues, and mention them in their study.

“GS has been reported to be less precise than PubMed, since it retrieves hundreds or thousands of documents, most of them being irrelevant. Nevertheless, we should not overestimate the precision of PubMed in real life since precision and recall of a search in a database is highly dependent on the skills of the user. Many of them overestimate the quality of their searching performance, and experienced reference librarians typically retrieve about twice as many citations as do less experienced users. … . It just requires some improvement in the advanced search features to improve its precision …”
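
Since precision and recall carry so much of this argument, here is a minimal sketch of how the two are computed for a search measured against a known set of relevant articles. The identifiers and counts are placeholders chosen for illustration, not data from the Gehanno study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search against a known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical set of articles that belong in the review.
relevant = {"pmid_1", "pmid_2", "pmid_3", "pmid_4"}

# A focused search: misses one relevant article, retrieves little noise.
narrow = {"pmid_1", "pmid_2", "pmid_3", "noise_1"}

# A broad search: finds everything, buried among 96 irrelevant records.
broad = relevant | {f"noise_{i}" for i in range(1, 97)}

narrow_p, narrow_r = precision_recall(narrow, relevant)
broad_p, broad_r = precision_recall(broad, relevant)
```

In this made-up case the broad search has perfect recall but only 4 relevant records per 100 retrieved. Checking whether known articles can be found measures recall alone, and says nothing about how much irrelevant material a reviewer would have to screen to reach them.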

More important, in my mind, is that the Gehanno study conflates the search process and the data analysis in the systematic review methodology. These are two separate steps of the methodological process, with different purposes, functions, and processes. Each is to be systematic for what is happening at that step in the process. They are not interchangeable. The Gehanno study is solid and useful, but placed in an inappropriate context which results in the findings being misinterpreted.

Problem Three: Published.

Retraction Watch & Plagiarism
Adam Marcus & Ivan Oransky. The paper is not sacred: Peer review continues long after a paper is published, and that analysis should become part of the scientific record. Nature Dec 22, 2011 480:449-450. http://www.nature.com/nature/journal/v480/n7378/full/480449a.html

The biggest problem with the Gehanno article, for me, is that it was published at all, at least in its current form. There is much to like in the article, if it didn’t make any claims relative to the systematic review methodological process. The research is well done and interesting, if looked at in the context of potential utility of Google Scholar to support bedside or chairside clinical decisionmaking. There are significant differences between the approaches and strategies for evidence-based clinical practice and doing a systematic review. While the three authors are all highly respected and expert informaticians, the content of the article illustrates beyond a shadow of a doubt that the authors have a grave and worrisome lack of understanding of the systematic review methodology. It is worse than that. It isn’t just that the authors of the study don’t understand systematic review methodologies, but that their peer reviewers ALSO did not understand, and that the journal editor did not understand. That is not simply worrisome, but flat out frightening.

The entire enterprise of evidence-based healthcare depends in large part on the systematic review methodology. Evidence-based healthcare informs clinical decisionmaking, treatment plans and practice, insurance coverage, healthcare policy development, and other matters equally central to the practice of medicine and the welfare of patients. The methodologies for doing a systematic review were developed to try to improve these areas. As with any research project, the quality of the end product depends to a great extent on selecting the appropriate methodology for the study, understanding that methodology, following it accurately, and appropriately documenting and reporting variances from the standard methodology where they might impact on the results or findings.

My concern is that this might be just one indicator of a widespread problem with the ways in which systematic review methodologies are understood and applied by researchers. These concerns have been discussed for years among my peers, both in medical librarianship and among devoted evidence-based healthcare researchers, those with a deep and intimate understanding of the processes and methodologies. There are countless examples of published articles that state they are systematic reviews which … aren’t. I have been part of project teams for systematic reviews where I became aware partway through the process that other members of the team were not following the correct process, and the review was no longer unbiased or systematic. While some of those were published, my name is not on them, and I don’t want my name associated with them. But the flaws in the process were not corrected, nor reported, creating a certain level of alarm for me with respect to those particular projects, as well as looking to them as indicators of challenges with published systematic reviews in general.

I used to team teach systematic review methodologies with some representatives from the Cochrane Collaboration. At that time, I was still pretty new to the process and had a lot to learn, but I did know who the experts really were, and who to go to with questions. One of the people I follow rigorously is Anne-Marie Glenny, who was a co-author on a major study examining the quality of published systematic reviews. Here is what they found.

“Identified methodological problems were an unclear understanding of underlying assumptions, inappropriate search and selection of relevant trials, use of inappropriate or flawed methods, lack of objective and validated methods to assess or improve trial similarity, and inadequate comparison or inappropriate combination of direct and indirect evidence. Adequate understanding of basic assumptions underlying indirect and mixed treatment comparison is crucial to resolve these methodological problems.”
Song F, Loke YK, Walsh T, Glenny AM, Eastwood AJ, Altman DG. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: survey of published systematic reviews. BMJ. 2009 Apr 3;338:b1147. doi: 10.1136/bmj.b1147. PMID: 19346285 http://www.bmj.com/content/338/bmj.b1147?view=long&pmid=19346285

We have a problem with systematic reviews as published, and the Gehanno article is merely a warning sign. There are serious, large concerns with the quality of published systematic reviews in the current research base, and equally large concerns with the ability of the peer-review process to identify quality systematic reviews. This is due, in my opinion, to weaknesses in the educational process for systematic review methodologies, and in the level of methodological expertise on the part of the authors, editors, and reviewers of the scholarly journals. Those concerns are significant enough to generate doubt about the appropriateness of depending on systematic reviews for developing healthcare policies.

Medlib’s Blog Carnival 2.1: Free Speech in Health Information, and More

WARNING: After this entry was originally posted, it came to my attention that I had not received all of the early entries for this round of the Carnival. The following post was edited to reflect these updates.


In the context of the looming deadline for comments on the FDA’s development of social media guidelines, the Medlib’s Blog Carnival theme this month was on free speech in health information. Briefly, the FDA has a long history of managing and establishing guidelines to prevent unethical publication of inaccurate or misleading health information from persons or corporate entities promoting the use or sales of drugs or medical devices. The flip side of this is to encourage informed decisionmaking based on high quality unbiased health information. There were few submissions this month, but those received were sound contributions looking at various aspects of this complicated issue.

Laika provided not one, but TWO excellent posts. The first one, “NOT ONE RCT on Swine Flu or H1N1?! – Outrageous!,” discusses the issue of popular news and hype as opinion influencers in comparison with actual research. Taking H1N1 as an example, she begins with a Twitter post and popular press, then discusses when it is appropriate to expect what kind of evidence in support of a question, simple tips for finding better quality evidence, as well as specific scientific and clinical contextual issues that beautifully illustrate not just issues of scientific research and methodology, access to information and information quality assessment, but also quite a bit of useful information about H1N1 itself! Laika provides a strong voice for clear reason and balanced information, but at the same time respects the importance of scientific dialog and communication in shaping the evolution of what we know about any given topic.

Laika’s MedLibLog: NOT ONE RCT on Swine Flu or H1N1?! – Outrageous!. http://laikaspoetnik.wordpress.com/2009/12/16/not-one-rct-on-swine-flu-or-h1n1-outrageous/

In her second post for this Carnival, Laika again zeroes in on the issue of dialog in science, and the broader issue of respect. This is true not just for dialog between scientists, as in the example she discusses, but even more so among the public and news media. The life lessons learned by Laika in her tale of disrespect and influence among scientists are ones we should all keep in mind when observing disagreements about science. I wanted to cheer when I read her excellent, methodical review of the limits of evidence-based medicine, and when one should or should not apply its findings to a given situation. While EBM is a very useful tool, I also have encountered worrisome instances in which a useful, low-risk, low-cost intervention is not used because there are not yet sufficient RCTs or because it is being researched for XYZ use but hasn’t yet been approved for it by the FDA. When EBM becomes a barrier to good clinical care, we have a different problem. I particularly liked the example she gave of a systematic review finding insufficient evidence to support the use of parachutes when jumping from a plane, and the selection of quotations from comments. My favorite, succinct and clear, was this line from a clinician at my institution, “RCTs aren’t holy writ, they’re simply a tool for filtering out our natural human biases in judgment and causal attribution. Whether it’s necessary to use that tool depends upon the likelihood of such bias occurring.” Read, read, and read this post again.

Laika’s MedLibLog: #NotSoFunny – Ridiculing RCTs and EBM. http://laikaspoetnik.wordpress.com/2010/02/01/notsofunny-ridiculing-rcts-and-ebm/

Dr. Shock’s post about BioMedSearch focused on “free” as in free access to quality healthcare information. A related concept in his post was the barriers that traditional search methods pose to the discovery of quality health information, and whether it is time for a change. While you are visiting his blog, you might want to take a look at another recent post on “The Hidden and Informal Curriculum During Medical Education,” which talks about overt and covert concepts and communications in medical education. While the specific example was about narratives in a secured online space, the concepts are perhaps even more important when thinking about healthcare communications in unsecured social media spaces.

Dr. Shock, A Neurostimulating Blog: BioMedical Search on BioMedSearch: http://www.shockmd.com/2009/11/28/biomedical-search-on-biomedsearch/

In an oblique connection, Novoseek, the innovative biomedical web search engine covering Medline, grants and online publications, offered a post on their new feature, allowing searchers to limit by publication type. While this doesn’t directly connect to free speech (rather the reverse) it does directly connect to quality of health information and control through peer review, both of which are implied contextual issues. Being able to use a health specific search tool automatically focuses results on a narrower and higher quality subset of the information available on the web. Being able to limit by publication type enables the searcher to slice the search even more finely, focusing on just the highest quality health information available.

Novoseek: Tip #1 to improve searches in novoseek – Filter results by publication type. http://blog.novoseek.com/index.php/resources/tip-1-to-improve-searches-in-novoseek-filter-results-by-publication-type.html/

PS. While you are taking a look at that blogpost, you might want to also take a look at an earlier post from Novoseek called The importance of context in text disambiguation. It is a kind of geeky, technical post, but the fundamental concept is central to how humans (as well as computers) identify quality when they see it.