Top tips for finding research information

Free Search Tools for Finding Research Information

This week I was in Canterbury leading a workshop and discussion on Google and Google Scholar for finding research information. Although the emphasis was on Google we also covered other specialist tools designed to search for scientific and research information. We also had an interesting discussion on h-index, other citation indices and services such as ORCID and ResearchGate. The slides for the session are available on authorSTREAM (http://www.authorstream.com/Presentation/karenblakeman-1706478-google-scholar-research-information/), Slideshare (http://www.slideshare.net/KarenBlakeman/scholar-research-information) and temporarily at http://www.rba.co.uk/as/.

Anyone who has attended one of my workshops knows that I ask the group to propose their top tips at the end of the session. These are the Canterbury group’s top 10 tips.

1. What’s going on?
Try and find out what’s going on behind the scenes and how the different search tools work. For example, Google and Google Scholar are quite different in the way they manage your search. Understanding how they operate means that you can adapt your search strategy accordingly and also manage your expectations; for example, Google Scholar does not use the publishers’ metadata, so author and date searches are unreliable.

2. Personalisation and ‘unpersonalisation’
Google personalises your search based on past activity, who is in your social networks, and a whole host of other ‘stuff’. You can quickly ‘unpersonalise’ your results by using a separate browser window that does not use cookies or your web history as part of the search algorithm.

If you use Chrome as your browser, open what is called an incognito window. In the top right hand corner of your screen there is an icon with three lines. Click on it and from the drop-down menu select New incognito window. Alternatively, press Ctrl+Shift+N on your keyboard.

If you use Firefox, from the menu at the top of the screen select Tools followed by Start Private Browsing.

In Internet Explorer select Tools followed by InPrivate Browsing. If you cannot see InPrivate under Tools try looking under the Safety option.

3. Advanced search commands
Use Google advanced commands such as filetype: to focus on PDFs, presentations and spreadsheets containing data, and site: to look for information on just one site or a range of sites such as UK government. Although the advanced search screen has boxes for you to fill in for the commands, the file format or filetype option is limited. It does not include options for the newer Microsoft Office formats such as .pptx and .xlsx. Use filetype: as part of your search strategy, for example:

nasa dark energy dark matter filetype:pptx

Google Scholar commands are more limited – see slide 28 of the presentation.
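
If you run searches like this regularly, the commands can be assembled in a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the https://www.google.com/search?q= URL pattern and the gov.uk example domain are assumptions rather than anything covered in the workshop.

```python
from urllib.parse import urlencode


def build_google_query(terms, filetype=None, site=None):
    """Combine plain search terms with optional filetype: and site: commands."""
    parts = [terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def google_search_url(query):
    """Return a search URL for the query (URL pattern is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# The example from the text: PowerPoint files only.
query = build_google_query("nasa dark energy dark matter", filetype="pptx")
print(query)
print(google_search_url(query))

# Restricting a search to UK government sites (gov.uk is illustrative).
print(build_google_query("biodiesel statistics", site="gov.uk"))
```

Keeping the commands as optional arguments means the same function covers a plain search, a filetype: search, a site: search or both together.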

4. intext:
Google automatically looks for variations on your terms and sometimes omits words from your search if it thinks the number of results is too low. Prefixing a term with intext: tells Google that it must be included in your search and exactly as you have typed it in. For example:

UK public transport intext:biodiesel statistics

tells Google that biodiesel must be included in the search and exactly as typed in.
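
The same idea can be sketched in Python for anyone who builds queries programmatically. The helper below is hypothetical (the name require_terms is invented for this example) and simply adds the intext: prefix to the words you insist on; it does nothing clever with phrases or punctuation.

```python
def require_terms(query, required):
    """Prefix each word listed in `required` with intext: wherever it occurs."""
    required_lower = {term.lower() for term in required}
    words = []
    for word in query.split():
        if word.lower() in required_lower:
            words.append(f"intext:{word}")
        else:
            words.append(word)
    return " ".join(words)


print(require_terms("UK public transport biodiesel statistics", ["biodiesel"]))
# prints: UK public transport intext:biodiesel statistics
```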

5. Reading Level
Use Reading level if Google is failing to return any research oriented documents for a query. Run the search and from the menu above the results select Search tools, then All results, and then Reading level from the drop-down menu. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.

6. Date options
In Google web search, use the date options in the menus at the top of the results page to restrict your results to information that has been published within the last hour, day, week, month, year or your own date range. Click on Search tools, then Any time and select an option. This works best with news, discussion boards, blogs and web sites that use blogging software to generate pages, but Google is getting better at identifying the correct date of a web page.

Google Scholar handles publication dates differently. On the results page you can select a date range from the menu on the left hand side of the page. Alternatively, you can run a Google advanced search and enter your publication years. However, Google Scholar looks for publication years in the area of the document where the date is most likely to be. As a result it may identify a page number or part of an author’s address as a year!

7. Google Scholar alerts
Use these with caution: the searches periodically stop without warning, and so have to be set up again, and they sometimes include documents that are several years old. Whatever your search, you can set up an alert by selecting Create alert from the menu on the left hand side of the results page.

If the author has created a profile on Google Scholar, from their profile page you can follow new articles and/or new citations for that author. From past experience I warn you that this is not entirely reliable.

Google Scholar Follow Author

8. Metrics – top publications
Although it claims to search all scholarly literature, Google Scholar does not always cover all of the key journals in a subject area. There is no complete source list but there is a list of top publications by subject and language under the ‘Metrics’ link in the upper right hand corner of the Scholar home page.

9. Microsoft Academic Search – visualisations
Microsoft Academic Search (http://academic.research.microsoft.com/) is a direct competitor to Google Scholar. The site is sometimes slow to load and it often assigns authors to the wrong institution. Nevertheless, the visualisations such as the co-author and citation maps can be useful in identifying who else is working in a particular area of research. The visualisations can be accessed by clicking on the Citation Graph image to the left of the search results or author profile.

Microsoft academic search citation graph
Author Citation Graph


10. Mednar visual
Deep Web Technologies has developed in conjunction with various institutions a number of science and research specific portals, some of which are publicly available. The sources that they cover are different but they all have similar search and display options. Results are automatically ranked by relevance but this can be changed to date, title or author. In addition to the standard relevance ranked list of results the portals create clusters of topics on the left hand side of the screen. The topics include broad subject headings, authors, publications, publishers, and year of publication and are a useful tool for narrowing down a search. Some of the portals, such as Mednar (http://mednar.com/), offer a clickable ‘visual’ of topics and sub-topics.

Mednar Macular Degeneration Visual

Tweets from the past

Embarrassed by some of your first tweets from 2007? Wish you hadn’t got involved in that drunken virtual brawl on Twitter last Christmas? There was a time when you could safely assume that those ramblings would be lost in the mists of Twitter’s archive never to be seen again. A search on Twitter would only give the last few days’ worth of postings and Google no longer archives the whole of Twitter. True, the Library of Congress does keep copies of every single tweet for posterity but access is only allowed for serious research purposes. So far, the Library has received about 400 inquiries but has not yet been able to provide access (http://blogs.loc.gov/loc/2013/01/update-on-the-twitter-archive-at-the-library-of-congress/). So you can breathe easily again? Unfortunately not.

There are commercial organisations such as Datasift (http://datasift.com/) and Gnip (http://gnip.com/) that charge an arm and a leg for analysing tweets and other social media comments, but the cost puts their services out of the reach of the casual searcher. You may find, though, that your forthright hashtagged tweets at a conference have been recorded for all to see free of charge (Sharing (or Over-Sharing?) at #ILI2012, http://ukwebfocus.wordpress.com/2012/11/02/sharing-or-over-sharing-at-ili2012/). And Twitter, itself, is finally providing access to historical tweets.

You can now download your entire collection. Go to your Twitter home page, click on the cog wheel in the upper right hand corner and select settings.

Twitter Settings

At the bottom of the Settings page is a link to request your archive.

Request your archive

You should receive an email a few minutes later with a download link. The file is zipped and once you have unpacked it you can browse your tweets by year and month or search the archive using keywords or hashtags.

Downloaded Twitter Archive
Browse downloaded Twitter archive by year and month
Search downloaded Twitter archive

I have not been able to work out how often you are allowed to download your archive and, rather annoyingly, there is no top-up option.

Twitter also runs searches on its entire archive – sort of. There is no obvious date option at the moment, not even under advanced search, so it appears to be all or nothing, and it does not give you everything straightaway. I thought I would have a look at the tweets on Internet Librarian International 2009, hashtag #ili2009, and was surprised that there seemed to be so few. I scrolled down to the bottom of the results and saw “You’ve reached the end of the Top Tweets for #ili2009” with a link to “View all tweets”. Twitter then loaded the remaining tweets as I continued to scroll down the page. About Top Tweets, Twitter says:

“We’ve built an algorithm that finds the Tweets that have caught the attention of other users. Top Tweets will refresh automatically and are surfaced for popularly-retweeted subjects based on this algorithm. We do not hand-select Top Tweets.”

There are also links at the top of the results page that enable you to view Top, All, and tweets from just ‘People you follow’.

Twitter Archive search

There are in fact advanced search commands that can be used to include a date range in your search (see https://support.twitter.com/articles/71577 for details). Changing my search to #ili2009 since:2009-10-01 until:2009-10-31 did seem to work. I am not convinced, though, that Twitter is giving me everything, even when I choose ‘All’. It’s a start and long overdue, but I’m not going to abandon my own archiving strategies just yet.
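
For anyone who wants to generate date-limited searches for several events, the since: and until: commands can be put together with a short Python sketch. The function name is hypothetical and the code only builds the search string shown above; it does not call Twitter or assume anything about its API.

```python
from datetime import date


def twitter_date_query(terms, since, until):
    """Append since: and until: date commands (YYYY-MM-DD) to search terms."""
    if since > until:
        raise ValueError("since must not be later than until")
    return f"{terms} since:{since.isoformat()} until:{until.isoformat()}"


print(twitter_date_query("#ili2009", date(2009, 10, 1), date(2009, 10, 31)))
# prints: #ili2009 since:2009-10-01 until:2009-10-31
```

Using date objects rather than typed strings means the year-month-day format is always produced correctly.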

Search Strategies: new and updated articles

Two free fact sheets, Google Search Tips and Top Search Tips, have been updated. They are both available as HTML and PDFs.

A new article is available in the subscribers area of Search Strategies. “Free Search Tools for Finding Research Information” is a 42 page PDF covering five things you need to know about Google, advanced searching in Google, alternative web search tools, institutional repositories and specialist tools. If you do not wish to purchase an annual subscription for the whole of Search Strategies this article is available for £5.99. (See http://www.rba.co.uk/search/ResearchInformationTools.shtml for further details).

Sections of the article are also available separately to subscribers in HTML and PDF format:

Advanced search commands for finding research information. HTML article and PDF (7 pages). (Created 17th February 2013)

Google Scholar. HTML article and PDF (7 pages). (Created 11th February 2013)

Institutional repositories. HTML article and PDF (2 pages). (Created 23rd January 2013)

Mendeley as a search tool for research papers. HTML article and PDF. (Created 9th January 2013)

Microsoft Academic Search. HTML article and PDF (5 pages). (Created 12th February 2013)

Science Search Tools. HTML article and PDF (9 pages). (Created 13th February 2013)

Scirus. HTML article and PDF. (Created 10th January 2013)

Details on how to purchase and order a subscription are available on the Search Strategies purchase page.

Forthcoming workshops

I am running three workshops in April on business information and search. All three have a practical element so that you can try out resources and techniques for yourself.

Introduction to Business Research

This is being organised by TFPL and will be held in London on Thursday, 18th April. This course provides an introduction to many areas of business research including statistics, official company information, market information, biographical information and news sources. It will cover explanations of the jargon and terminology, regulatory issues, assessing the quality of information, primary and secondary sources. Further information is available on the TFPL web site at http://www.tfpl.com/services/coursedesc.cfm?id=TR1116&pageid=-9&cs1=&cs2=f

Business information: key web resources

This is also being organised by TFPL in London and is being held on Friday, 19th April. This workshop looks in more detail at the resources that are available for different types of information, alerting services and free vs. fee. It also covers search strategies for tracking down industry, market and corporate reports. Further information is available at http://www.tfpl.com/services/coursedesc.cfm?id=TR945&pageid=-9&cs1=&cs2=f

Make Google behave: techniques for better results

This is a very popular workshop and is being organised by UKeiG. It is being held in Manchester on Tuesday, 30th April.

Topics include:

  • How Google works
  • Recent developments and their impact on search results
  • How Google personalises your results and can you stop it?
  • How to use existing and new features to focus your search and control Google
  • How and when to use Google’s specialist tools and databases
  • What Google is good at and when you should consider alternatives

The workshop will be repeated in London on Wednesday, 30th October. Details and booking information are on the UKeiG website at http://www.ukeig.org.uk/trainingevent/make-google-behave-techniques-better-results-karen-blakeman

New Search Strategies articles

There are three new articles available in the subscribers area of Search Strategies:

Searching for research information: Institutional Repositories HTML article and PDF

Mendeley as a search tool for research papers. Available as an HTML article and PDF

Scirus. Available as an HTML article and PDF

Annual individual subscription rates are £48/year (£40 + £8 VAT). Multi-user and corporate rates are available on request. For further details contact Karen Blakeman at publications@rba.co.uk.

To purchase a subscription go to http://www.rba.co.uk/search/purchase.shtml

Medicine search on Google

In November of last year Google announced that it was going to start showing a knowledge graph for searches on medicines. (Look up medications more quickly and easily on Google, http://insidesearch.blogspot.co.uk/2012/11/look-up-medications-more-quickly-and.html). I am now seeing it in my search results but only on Google.com.

When I search on ibuprofen Google now gives me some key facts on the drug in a box to the right of the standard web results. The information includes indications for use, side effects, brand names, contraindications and other drugs that people also searched for. The sources it uses are the National Library of Medicine, US FDA, DailyMed and Micromedex.

Google results for ibuprofen

Ibuprofen is the generic name for this painkiller and is one of the names under which it is sold in the UK and many other countries. Searching on the brand name Nurofen, which is not available in the US, brings up web search results with shopping options at the top. There is no knowledge graph this time.

Google results for nurofen

I played around with a few other brand names and found that if a brand is on sale in the US, for example Motrin, Google is able to identify the active ingredient.

Google search results for Motrin

So Google’s new medicine search is US-centric: US brand names and US sources of information. It will be interesting to see if and how they roll it out to other countries. Meanwhile, for those of us in the UK, NHS Choices provides better and more detailed information on medicines at http://www.nhs.uk/medicine-guides/, and if you are interested in a drug’s physical or chemical properties Chemspider (http://www.chemspider.com/) is a good starting point.

Already appearing in UK Google results is the related medical conditions feature. Type in a symptom and Google lists possible related conditions at the top of the page.

Google related medical conditions

If you are using Google.co.uk or are based in the UK, clicking on any of the conditions in the list brings up content that is UK focused. It will be interesting to see if they do the same with the medicines knowledge graph.

Google Scholar author fail

Eight months after setting up my Google Scholar author profile and “claiming” my papers I have received my first alert. If you only use Google Scholar (http://scholar.google.com/) to search for papers you may not be aware that if you have published papers you can set up a Google Scholar author profile and add those papers to your profile. Google then creates a page showing a graph of when and how often your papers were cited and generates an H-index and i10-index for you.
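
For readers unfamiliar with the two measures: the h-index is the largest number h such that h of your papers have each been cited at least h times, and the i10-index is the number of papers cited at least 10 times. The Python below is a minimal sketch of those standard definitions; it is not Google Scholar’s own code, and the citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts for seven papers.
sample = [25, 12, 10, 6, 3, 1, 0]
print(h_index(sample))    # prints: 4
print(i10_index(sample))  # prints: 3
```

Because both figures are calculated only from the papers and citations that Scholar has in its database, any paper it misses or misattributes changes the result.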

Google Scholar profile for Karen Blakeman

This only covers the papers that Google Scholar has in its database and there are serious gaps in its coverage for some sectors. On the other hand, it does sometimes include articles, web sites and blog postings that are not peer reviewed in the conventional way. This can be a good thing because it may pick up some very useful grey literature. It can be a bad thing because it is possible to fool Scholar into adding a paper of dubious quality by mimicking the structure of an academic paper – title and author names in large font, affiliation, abstract, keywords, list of references etc.

Another feature of Scholar is that you can create alerts for keyword searches, new papers by an author or new citations to their articles. Needless to say I have set up alerts on my own name! Sadly, until last week I had received nothing so had to assume that no-one was interested in or citing my papers. Or perhaps the alerts do not work? Whatever the reason, I was delighted that at last someone had mentioned me in some way in an article. Clicking through to the item, though, led me to Katie Fraser’s blog and “Communicating with postgraduate research students: some themes from the library literature” (http://www.chuukaku.com/blog/2013/01/communication-with-pgr.html). Was I mentioned or cited in the posting? No, but my own blog was listed in her blogroll to the left of the article.

Having got over the disappointment I turned my attention to working out why Scholar had picked up this particular post. Why wasn’t I receiving alerts every time Katie updated her blog? The answer appears to be at the end of the posting in question: Katie has provided a list of references.

Blog - list of references

Another factor, I thought, might be that Katie has an author profile and has claimed her papers, but I could not see this posting anywhere in her profile.

Katie Fraser author profile

On further investigation, and unfortunately for Katie, Google Scholar is unaware that she is the author of this article. It believes that the author is someone called MA Lib.

This was confirmed when I clicked on the ‘Cite’ option. This presents you with formatted citations that you can cut and paste into an article or import into a bibliography manager. The author is definitely MA Lib.

Google Scholar citation

Google Scholar has failed to recognise Katie Fraser as the author and has decided that the MA Lib link in the side menu of her blog is a person’s name. There are many similar examples and it is well known that Scholar is unreliable when it comes to identifying authors. Peter Jacso has written several articles detailing Scholar’s shortcomings in this area (1, 2, 3, 4). Many of his articles are available as pre-prints (5).

What this means for Katie is that although Google Scholar believes her blog posting (6) is worthy of inclusion in its database, it is not listed in her author profile and does not contribute towards her h-index or i10-index. And in case you are wondering, yes, I have appended references to this posting to see if Google regards it as scholarly literature and adds it to the Scholar database.

Update: Katie Fraser has now “claimed” the posting for her profile but the Google Scholar database has not yet been updated to reflect this.

References

(1) Jacsó, Péter. “Metadata mega mess in Google Scholar.” Online Information Review 34.1 (2010): 175-191.

(2) Jacsó, Péter. Newswire Analysis: Google Scholar’s Ghost Authors, Lost Authors, and Other Problems [Online] 24 September 2009 [Accessed 4 February 2013] http://www.libraryjournal.com/article/CA6698580.html

(3) Jacsó, Péter. “Google Scholar Author Citation Tracker: is it too little, too late?” Online Information Review 36.1 (2012): 126-141.

(4) Jacsó, Péter. “Using Google Scholar for journal impact factors and the h-index in nationwide publishing assessments in academia–siren songs and air-raid sirens.” Online Information Review 36.3 (2012): 462-478.

(5) Jacso – Savvy Searching Columns, Online Information Review http://www2.hawaii.edu/~jacso/savvy-mcb.htm [Accessed 4 February 2013]

(6) Lib, M. A. “www.chuukaku.com.”

Slides: Born digital – time for a rethink

My slides from yesterday’s NetIKX workshop, “Digital Native, Digital Immigrant – does it matter?”, are now available on authorSTREAM at http://www.authorstream.com/Presentation/karenblakeman-1650757-born-digital-time-rethink/

 

Update: if you are having difficulties viewing the presentation via authorSTREAM or the embed above, it is also available on Slideshare http://www.slideshare.net/KarenBlakeman/born-digital and the PowerPoint is at http://www.rba.co.uk/as/

Zanran – great for data in tables, charts and graphs

I regularly mention Zanran (http://www.zanran.com/) in my workshops on search and business information, and it often finds its way into the Top Tips compiled by the delegates at the end of the day.

Zanran is not a Google alternative. Rather than search the text of web pages it extracts and indexes numerical data presented as tables, charts and images in PDF reports, spreadsheets and ordinary web pages. You can simply type in your search terms but there are additional options for narrowing down the search by location of the web server, specifying an individual site, selecting a time period and limiting by file type.

The results page lists the files it has found with an extract highlighting the content containing your terms. In this example I am looking for data on agricultural methane emissions in the UK.

Zanran search results

To the left of each entry is a thumbnail. Moving the cursor over the thumbnail brings up a preview of the page containing the relevant chart, table or image. This enables you to immediately assess the relevance of the data without having to download and go through a lengthy document.

Zanran document preview

If you click on the thumbnail or the title to view the whole document you have to register (free of charge) as copies of the indexed documents are stored by Zanran. If you prefer to go to the original document click on the URL button attached to the summary of the page and click on the link that is then revealed. Unfortunately, you may see “page not found” especially if it is on a UK government department web site. Many of these have now been closed and their content archived making it difficult to track them down. Registering with Zanran is by far the easier option. Also, rather than deluge you with documents from a single site, as Google all too often does, Zanran gives you a link telling you if and how many other results are available on a site.

How does it compare with Google? Well, Google did come up with relevant results for my search but I had to spend a lot of time ploughing through them to identify the best documents. And Google did not pull up in the first 100 results the very useful archived UK government documents that Zanran gave me.

Google v Zanran

If you are looking for data or statistics Google still does a very good job, but I recommend you also run a search in Zanran. It may well come up with a real gem, as it often has for me.

 

Latest Tales from the Terminal Room now available

The December 2012 issue of Tales from the Terminal Room is now available at http://www.rba.co.uk/tfttr/archives/2012/dec2012.shtml

This month’s issue includes:

  • Karen Blakeman’s Blog now available for the Kindle
  • Search tools
    • Another reason to say no to Google+?
  • Search Strategies update
  • EU launches public beta of its open data portal
  • ICAEW’s gateways and guides to business information
  • Information on companies in Israel
  • Twitter Notes
