Google announces Hummingbird…hmmm

To celebrate its 15th anniversary, Google announced a major new search algorithm called Hummingbird, so named because it is “precise and fast”. Interestingly, the change was implemented about a month before the announcement was made and most of us did not notice any difference! That is probably because Google is continually making minor changes to the way it presents results, and then there are the live experiments that we are all subjected to (Just Testing: Google Users May See Up To A Dozen Experiments http://searchengineland.com/just-testing-google-searchers-may-see-up-to-a-dozen-experiments-141570). So, even if we do suspect that our results have changed, it’s difficult to know whether it’s the usual combination of personalisation and experiments or the new algorithm.

Phil Bradley has written a neat summary of what Hummingbird is (http://philbradley.typepad.com/phil_bradleys_weblog/2013/09/what-is-google-hummingbird.html) and Danny Sullivan has compiled an FAQ at http://searchengineland.com/google-hummingbird-172816. There is also a short video on the BBC website at http://www.bbc.co.uk/news/business-24292897 in which Amit Singhal, VP, Google Search talks about the launch and the future direction of Google search.

I have not noticed any significant differences in either desktop or mobile search but that is probably a good thing. We tend to spot changes only when Google completely messes up. I suspect that we’ll see little difference when using advanced commands such as filetype or site; Hummingbird seems to be geared more towards handling natural language queries. It is far too early to say how this is going to affect in-depth research, but if you suddenly find strange things happening to your search it could be Hummingbird and not you.

If you are keen to find out more about how Google works and how to get better results I am presenting a workshop later this month in London that has been organised by UKeiG. Further details are on the UKeiG website. Alternatively, if you have had enough of Google and want to explore alternatives there is the Anything but Google workshop, again organised by UKeiG.

Presentation to East of England Information Services Group

The slides for the presentation I gave to the East of England Information Services Group in Cambridge last week are now up on authorSTREAM at http://www.authorstream.com/Presentation/karenblakeman-1881622-whatever-whenever-wherever-want/. They are also available on Slideshare at http://www.slideshare.net/KarenBlakeman/whatever-whenever-and-wherever-you-want.

The presentation was part of a day looking at “Providing Effective Information Services for Generation Y”. Many of the slides probably won’t mean much on their own but feel free to take a look and share.

Top Tips from SWAMP

View from Swansea Central Library

Towards the end of June I headed off to Swansea Central Library to facilitate a workshop on search tools and techniques for finding business information and statistics. The session was organised for the libraries of the wonderfully named SWAMP – South West and Mid Wales Partnership.

We had fantastic views from the library of the sea and shore line so the delegates did very well to remain focused on the work in hand. The top tips that the group suggested at the end of the day were a mixture of search techniques and business information sites.

1. Persistence.
Don’t give up and don’t get stuck in a rut. If your first attempts fail to produce anything useful try a different approach to your search. Try some of the tips mentioned below: use advanced search commands, a different search tool or go direct to a website that covers your subject area or type of information.

2. Verbatim.
Google automatically looks for variations on your search terms and sometimes drops terms from your search without telling or asking you. To beat Google into submission and make it run your search exactly as you have typed it in, click on ‘Search tools’ in the menu above your results, then click on the arrow next to ‘All results’ and from the drop down menu select Verbatim.

3. Private Browsing.
To stop search engines personalising your results according to your previous searches and browsing behaviour, find out where the private browsing option is in your browser (in Chrome it is called Incognito). This ignores all cookies and past search history and is as close as you can get to unfiltered results.

Short cuts to private browsing in the main browsers are:

Chrome – Ctrl+Shift+N

FireFox – Ctrl+Shift+P

Internet Explorer – Ctrl+Shift+P

Opera – Ctrl+Shift+N

Safari – click on Safari next to the Apple symbol in the menu bar, select Private Browsing and then click on OK.

4. The site: command.
Include the site: command in your search to focus your search on particular types of site, for example site:ac.uk, or to search inside a large rambling site. You can also use -site: to exclude sites from your search. For example, if you are searching for information on Wales and Australian websites mentioning New South Wales keep coming up include -site:au in your search.

5. The filetype: command.
Use the filetype: command to limit your search to PowerPoint for presentations, spreadsheets for data and statistics or PDF for research papers and industry/government reports. Note that in Google filetype:ppt and filetype:xls will not pick up the newer .pptx and .xlsx formats so you will need to incorporate both into your strategy, for example filetype:ppt OR filetype:pptx, or run separate searches for each one. In Bing.com, though, filetype:pptx will pick up both .ppt and .pptx files.

6. Guardian Data Store (http://www.guardian.co.uk/data/)
For datasets and visualisations relating to stories in the news. This is proving to be a very popular site on both the public and in-house workshops. As well as the graphs and interactive maps the source of the data is always given and there are links to the original datasets that are used in the articles.

7. Company Check (http://www.companycheck.co.uk/)
Company Check repackages Companies House data and provides 5 years of figures and graphs for Cash at Bank, Net Worth, Total Liabilities and Total Current Liabilities free of charge. It also lists the directors of a company. Click on a director’s name and you can view other current and past directorships for that person.

8. BL BIPC Industry Guides
The British Library Business Information and IP Centre’s industry guides at http://www.bl.uk/bipc/dbandpubs/Industry%20guides/industry.html highlight relevant industry directories, databases, publications and web sites. Excellent starting points if you are new to the sector.

9. Web archives for documents, pages and sites that are no longer “live”.
Most people know about the Internet Archive’s Wayback Machine at http://www.archive.org/ and its collection of snapshots of websites taken over the years. There is also a collection of old UK government webpages at http://www.nationalarchives.gov.uk/webarchive/, and the British Library has a UK web archive at http://www.webarchive.org.uk/ukwa/.

10. Keep up to date
Keep up to date with what the search engines are up to, changes to key resources and new sites. Identify blogs and commentators that are relevant to your research interests and subject areas and follow them using RSS or email alerts.
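If you build search strings in a script rather than typing them each time, tips 4 and 5 above can be sketched in a few lines of Python. This is only an illustration: the function name build_query and its parameters are made up for this example and are not part of any search engine's own tools. It simply assembles the same text you would type into the search box.

```python
def build_query(terms, site=None, exclude_sites=(), filetypes=()):
    """Assemble a search string using the site:, -site: and filetype: commands."""
    parts = [terms.strip()]
    if site:
        # Restrict the search to one site or type of site, e.g. ac.uk
        parts.append(f"site:{site}")
    # Each site to be excluded gets its own -site: term.
    parts.extend(f"-site:{s}" for s in exclude_sites)
    if filetypes:
        # Join several file types with OR so that, for example,
        # both the older .ppt and the newer .pptx formats are searched.
        parts.append(" OR ".join(f"filetype:{ft}" for ft in filetypes))
    return " ".join(parts)


# Wales, without the Australian sites that mention New South Wales
print(build_query("Wales tourism statistics", exclude_sites=["au"]))

# Presentations on UK academic sites, covering both PowerPoint formats
print(build_query("renewable energy", site="ac.uk", filetypes=["ppt", "pptx"]))
```

The search terms in the two calls are invented examples. The resulting strings can be pasted into Google, or into Bing and DuckDuckGo where the same commands are supported.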

North Wales Libraries Partnership Top Tips

Cyril in the John Spalding Library

The John Spalding Library in Wrexham hosted the North Wales Libraries Partnership (NWLP) workshop “Search is more than just Google”. Delegates from public, government, academic and NHS libraries gathered together to look at the effect of mobile technologies on search, open access, getting better results from Google and alternative search tools. The consensus reached during one of the breaks was that Cyril, one of the library’s residents and pictured on the left, should have ignored Google’s nutrition advice and gone for the more authoritative sources available in the library and on the web. If only he had waited and attended the workshop he would have known exactly where to look!

There was much discussion on how mobile devices change how we can search – not always for the best – and there was concern, as usual, over how much we willingly give away about ourselves to services such as Google and Facebook. Open access was debated in the afternoon along with possible directions for academic publishing.

An edited set of the slides is available on authorSTREAM at http://www.authorstream.com/Presentation/karenblakeman-1856150-search-google/ and Slideshare at http://www.slideshare.net/karenblakeman1/search-is-more-than-just-google.

The Top Tips that the group came up with included some of the usual advanced Google commands but others concerned cloud computing and social media. Here they are.

1. Back up your stuff. Having your data hosted in the cloud means you don’t have to worry about it disappearing when your laptop or server crashes. But what if your cloud service goes under or your account is deleted for some reason? Have you made a local backup of your essential files and treasured family photos? One of the participants mentioned the Library of Congress digital preservation toolkit for preserving family memories (http://www.digitalpreservation.gov/personalarchiving/).

2. Private browsing for “un-personalising” search results. If you want to make sure that your results are not being influenced by past searches and browsing behaviour, find out where the private browsing option is in your browser (in Chrome it is called Incognito). This ignores all cookies and past search history and is as close as you can get to unfiltered results.

3. Change the order of your search terms to change the order in which results are listed. This is an old trick but still seems to work.

4. Use advanced search commands such as site:, filetype: and intext: to focus your search. Some of the commands are available not just in Google but also in Bing and DuckDuckGo.

5. Create “newspapers” of articles mentioned on Twitter, Facebook, Google+ or news sites by using services such as Paper.li (http://paper.li/). These can be generated from hashtags, keyword searches or your own Twitterstream. Have a look in the Paper.li news stand to see if someone has already created a paper on your topic. Paper.li automatically compiles the newspaper but there are other services such as Storify (http://storify.com/) and Scoop.it (http://www.scoop.it/) that enable individuals to curate the content that appears in their personal newspaper.

6. Guardian Data Store for datasets and visualisations relating to stories in the news (http://www.guardian.co.uk/data). This was so popular that it was mentioned twice for inclusion in the top tips. What people liked about this is that the source of the data is always given and there are links to the original datasets.

7. Million Short http://millionshort.com/. If you are fed up with seeing the same results from Google again and again give Million Short a try. Million Short runs your search and then removes the most popular web sites from the results. Originally, as its name suggests, it removed the top 1 million but the default has changed to the top 10,000. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so “niche” that it never makes it into the top results in Google or Bing. One person loved it because the type of research they do often pulls up pages of Amazon and eBay results in Google. Not a problem with Million Short.

8. Google Reading level to change the type of results that you see. Run your search and from the menu above the results select ‘Search tools’, ‘All results’ and from the drop down menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research.

9. Beware fragmented discussions. Articles can be posted and reposted in many different places: blogs, websites, LinkedIn, Facebook etc. with the result that potentially useful and informative discussions are dotted all over the place. Learn how to locate fragmented discussions in your subject area and where they are likely to occur.

10. Try something other than Google. Take a look at the slides for a few(!) suggestions of what you could use.

Google – you can say “NO!”

Picture the scene: an obviously distressed researcher is hunched over a computer screen, sobbing hysterically. All they wanted was a list of donkey sanctuaries in Surrey. How difficult is that? But Google decided that what they really wanted was a field guide to identifying buttercups. Our researcher tries all the advanced search commands and options they know but to no avail. It seems that Google has locked them into its dreaded live experiments (1) with no possibility of escape, and the information is needed NOW.

There is hope, though. There are other search engines out there. Bing may seem consumer/retail focused, but its list of advanced search commands is great at unearthing serious research information that Google buries at around the 2 millionth entry in your results list. My comparison and summary of search commands at http://www.rba.co.uk/search/compare.shtml lists the Bing commands that you are most likely to need. Or if you just want a no nonsense summary of your topic without all of Google’s personalisation and experiments look no further than DuckDuckGo. But should you even be using Google or similar, generic search engines in the first place? Think about the type of information you are looking for.

For news, RSS feeds are still a great way to pull together updates from your favourite newspapers, blogs and websites. Google Reader is about to disappear into a black hole but there are other, better RSS readers out there. I use a desktop client called RSS Owl (http://www.rssowl.org/) but if that doesn’t suit you Phil Bradley has a list of alternatives on his blog at http://philbradley.typepad.com/phil_bradleys_weblog/2013/03/20-alternatives-to-google-reader.html. Or you could try a different approach: create a Twitter list of essential news sources, or use Paper.li to create daily “newspapers” using keyword searches or hashtags. See my own “daily” at http://paper.li/karenblakeman or the paper.li on biofuels at http://paper.li/karenblakeman/1321447614

Interested in statistics and open data? Try the University of Auckland’s statistics portal (http://www.offstats.auckland.ac.nz/) or the Guardian’s Datastore (http://www.guardian.co.uk/data).

If you are looking for images Flickr.com is an obvious alternative. For photos you can re-use without fear of being dragged through the courts for copyright infringement try Geograph (http://www.geograph.org.uk/) or Morguefile (http://www.morguefile.com/).

And when it comes to free search tools for tracking down open access and research information there are dozens, some of which are listed at http://www.rba.co.uk/search/links.shtml#research.

These and many more are covered in my workshop “Anything but Google”, which is being held in Newcastle later this month. Further details are on the UKeiG web site at http://www.ukeig.org.uk/trainingevent/anything-google-karen-blakeman.

We may not be able to avoid Google completely but there are equally good, if not better, tools available. Take the first step and say “No” to Google.

(1) Just Testing: Google Users May See Up To A Dozen Experiments http://searchengineland.com/just-testing-google-searchers-may-see-up-to-a-dozen-experiments-141570

Make Google behave: Top Tips

Update: Top Tip number 4, Translated foreign pages, has now been axed by Google. See my blog posting “Google drops translated foreign pages” for another way to search foreign language pages.

Google lived up to its reputation at the UKeiG workshop “Make Google behave: techniques for better search results” and it didn’t take long for it to start presenting different results and layouts for the same query. We went through a vast array of commands, search options and specialist Google tools and by the end of the day we felt we had regained some control, or at least were finding more sensible results. The workshop was held in the training suite in the Library at Manchester University and the delegates were a mix of information professionals from the private, legal, government and academic sectors. They were certainly not slow in suggesting top tips at the end of the day and came up with 15 instead of the usual 10. Here are the top tips.

1. site:

Include the site: command in your search to focus your search on particular types of site, for example site:ac.uk or to search inside a large rambling site. You can also use -site: to exclude sites from your search. For example a search for statistics on Wales kept coming up with Australian sites mentioning New South Wales. Including -site:au quickly disposed of those.

2. Reading Level

Try ‘Reading level’ if Google is failing to return any research or business related documents for a query. Run your search and from the menu above the results select ‘Search tools’, ‘All results’ and from the drop down menu ‘Reading level’. Options for switching between basic, intermediate and advanced reading levels should then appear just above the results. Click on the Advanced option to see results biased towards research. Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to publications. It seems to involve an analysis of sentence structure, the length of sentences, the length of the document and whether scientific or industry specific terminology appears in the page.

3. Verbatim

Google automatically looks for variations on your search terms and sometimes drops terms from your search without telling or asking you. This is not always very helpful. Quote marks around phrases or individual words do not always force an exact match or inclusion in the search. If you want Google to run your search exactly as you have typed it in, click on ‘Search tools’ in the menu above your results, then click on the arrow next to ‘All results’ and from the drop down menu select Verbatim.

4. Translated foreign pages

For a different perspective or for information on government policies, companies, people or industries located in another country try ‘Translated foreign pages’. Run your search as usual and click on ‘Search tools’, then ‘All results’ and from the drop down menu select ‘Translated foreign pages’. Google will then give you a list of the most commonly used languages but you can add a language of your own to the list. Select the translation option you wish to use by clicking on the number of results next to the language. Google will then run a translation of your search on pages in that language and then translate the results back into your language. If you have mentioned a country in your search Google will automatically translate and search using the language of that country. Beware, though. Machine translation!

5. Asterisk * 

Use the asterisk between two words to stand in for 1-5 words. This is useful if you want two of your keywords close to one another but suspect that there may often be one or two words separating them. For example, solar * panels will find solar photovoltaic panels and solar water heating panels. One of the workshop delegates found that placing an asterisk between a keyword and the word ‘report’ significantly improved the quality of results when looking for official information, industry or research reports.

6. Personalise Google news 

There may be times when you do want to personalise information and results. There may be some sections and sources in Google News that you do not want to see or you may want to increase the amount of information on a topic. Sign in to your Google account and on the Google News page click on the cog wheel in the upper right hand area of the page. You should then see options on the right hand side for personalising topics and newspaper options.

7. Date options

In Google web search, use the date options in the menus at the top of the results page to restrict your results to information that has been published within the last hour, day, week, month, year or your own date range. Click on ‘Search tools’, then ‘Any time’ and select an option. Unfortunately you cannot use this with Verbatim but you can use the daterange: command. You have to convert your dates to Julian date format; this is explained at http://aa.usno.navy.mil/data/docs/JulianDate.php, a page that will even do the date conversion for you. It is far easier, though, to use a tool such as GMacker date range search at http://gmacker.com/web/content/gDateRange/gdr.htm. Fill in the boxes and on the Google results page apply Verbatim in the usual way.

8. Usage rights for images

If you are looking for images that you can reuse then use the usage rights option on the Image advanced search screen as a filter. First run your search in Google images. On the results page click on the cog wheel in the upper right hand area of the screen and select ‘Advanced search’. Towards the bottom of the advanced screen there is a ‘usage rights’ box. Click on the downward pointing arrow for a list of options that include four “free to use….” licences. Select the relevant licence and Google will limit your search accordingly. Do double check, though, that the licence applies to the image you want to use. Go to the original web page that contains the image and make sure the licence is indeed associated with it and not with a different image on the same page.

9. Image reverse search

If you already have an image and want to search for different sizes, or images that are similar to it, then use the reverse image search. The Google Images search box has a camera icon to the left of the search button. Click on the camera and you will be given the option to either paste in the URL of the image or upload an image.

10. Changing number of results per page

By default, the number of results that Google displays on your results page is 10. If you want to increase this go to http://www.google.co.uk/preferences or click on the cog wheel in the upper right hand area of a results page and click on ‘Search settings’. First make sure that Google Instant predictions is set to ‘Never show instant results’ otherwise Google will ignore your changes to the results per page. Then under ‘Results per page’ click on the required number on the slider bar and then on ‘Save’ at the bottom of the screen.

11. Google Trends

Google Trends (http://www.google.com/trends/) lets you see and compare how often people are searching on terms. Type in your terms separated by commas. On the results page you can further refine your search by date and country. The frequency graph is annotated with news items that may explain unexpected peaks. Trends may show, for example, whether a marketing campaign has been successful and increased the level of awareness of a brand or product, and can also be used to see how competitors are faring in the search popularity stakes.

12. Google’s main index and supplemental index

Google does not automatically search everything it has. It first searches its main index and only includes information from the supplemental index if it thinks that the number of results is relatively low. Increasing the number of search terms and using Verbatim, or any of the advanced search commands, seems to force Google to search both indexes, which explains why you sometimes see more results as you try and refine your search.

13. Public data explorer

The Public Data Explorer is one of Google’s best kept secrets. It can be found at http://www.google.com/publicdata/ and allows you to search open data sets from organisations such as the IMF, OECD, Eurostat and the World Bank. You can compare the data in various ways and there are several chart options.

14. Google Art Project http://www.googleartproject.com/

This is a collaboration between Google and over 150 galleries from across the world. You can take a virtual tour of a gallery and zoom in on a painting to see the brushstrokes. You can view paintings and drawings by gallery or by artist. Warning: this is highly addictive!

15. Cycle lanes on Google maps

For all you cycling fans Google Maps now displays cycle routes for the UK. In the UK Google has been working with Sustrans (http://www.sustrans.org.uk/) to include bike trails, lanes and recommended roads. Set your starting point and destination as usual and the directions area on the screen should include a bicycle icon in addition to the car, public transport and walking icons. If you just want an idea of what is available in a particular area click on the Traffic option in the upper right hand area of the displayed map and select Bicycling. Trails are shown as solid dark green lines, dedicated lanes are light green lines and bicycle friendly roads are displayed as dotted green lines.
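Going back to tip 7 (date options): if you would rather do the Julian date conversion yourself than use an online converter, it is a one-line calculation. The sketch below assumes that the daterange: command expects whole-number Julian days in the form daterange:start-end. The function names are made up for illustration.

```python
from datetime import date

# Difference between Python's day count (1 January of year 1 = day 1)
# and the Julian day number of the same calendar date.
JDN_OFFSET = 1721425


def julian_day(d):
    """Return the whole-number Julian day for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def daterange_term(start, end):
    """Build a daterange: term covering start to end inclusive."""
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


# Example: restrict a search to the first quarter of 2013
print("potato supply estimate " + daterange_term(date(2013, 1, 1), date(2013, 3, 31)))
```

As a sanity check, 1 January 2000 should come out as Julian day 2451545. The search terms in the example are only placeholders; add the daterange: term to your own search and then apply Verbatim on the results page in the usual way.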

The Google workshop will be run again in London in October (http://www.ukeig.org.uk/trainingevent/karen-blakeman-make-google-behave-techniques-better-results-0). If you’d rather explore alternatives to Google I am leading a workshop in Newcastle in June on “Anything but Google”! (http://www.ukeig.org.uk/trainingevent/anything-google-karen-blakeman)

Test your search skills with SearchReSearch

Many of us seem to be in Google bashing mode at the moment but they do produce good stuff at times, or at least some of their employees do. Dan Russell, who works at Google, has an excellent blog called SearchReSearch at http://searchresearch1.blogspot.com/. The blog is “about search, search skills, teaching search, learning how to search, learning how to use Google effectively, learning how to do research. It also covers a good deal of sensemaking and information foraging”. Dan comes up with a topic for research and invites people to comment on what they find and how they found it. The questions usually arise when Dan is out and about and spots something curious. A recent query was about the roadside use of weedkiller and was asked because he and a friend had noticed brown strips of dead vegetation along the edge of the highway. (See ‘How much death at the roadside’ http://searchresearch1.blogspot.com/2013/03/answer-how-much-death-at-roadside.html).

The questions are a great way to test your search skills and see how others have tackled them. Don’t be deterred by the US emphasis. After all, many of us sometimes have to research industries and events in other countries. It’s wonderful exercise for the little grey cells.

Search tools for research information – Kindle version

At last! I’ve managed to convert my article on “Free search tools for research information”  into a Kindle version (http://www.amazon.com/dp/B00C11XLVQ). It took me four attempts to get it right (and I hope it is indeed OK). The Amazon instructions are here, there and everywhere. Amazon’s general guide on producing a Kindle version is OK, but it’s the detailed stuff that is hard to find. The link I have given takes you to Amazon.com. If your “local” Amazon is different you’ll need to search for either the title or my name in the Kindle store.

Search Strategies new articles

New Search Strategies articles are now available.

“Excluding sites from your search” (subscribers only) is at http://www.rba.co.uk/search/subscribers/ExcludeSites.shtml

“From tourism to research information: how to change the emphasis of results” (subscribers only) covers techniques for changing the type of information returned by the search engines, for example consumer vs. more research focused pages (http://www.rba.co.uk/search/subscribers/Emphasis.shtml).

“Free Search Tools for Finding Research Information” is a 42 page PDF covering five things you need to know about Google, advanced searching in Google, alternative web search tools, institutional repositories and specialist tools. If you do not wish to purchase an annual subscription to the whole of Search Strategies, this article can be purchased on its own for £5.99. See http://www.rba.co.uk/search/ResearchInformationTools.shtml for further details.

A full list of Search Strategies fact sheets and articles is at http://www.rba.co.uk/search/.

Search Strategies covers facts and tips, reviews of search tools and detailed strategies for more effective searching. Some information such as the fact sheets and Top Tips are available free of charge. The more detailed information on strategies is available on subscription. Annual individual subscription rates are £48/year (£40 + £8 VAT). Multi-user and corporate rates are available on request.

Details of how to purchase a subscription are at http://www.rba.co.uk/search/purchase.shtml

The case of the disappearing press release

UK government departments and organisations frequently change their names, merge or disappear altogether. The same applies to their websites and documents held on those sites. Tracking down copies of older reports, data and superseded guidelines and regulations is becoming increasingly difficult, especially as so many sites are now being closed down. Information is supposed to be transferred to the new Gov.uk web site (http://www.gov.uk/) but historical information is in danger of vanishing altogether.

I recently needed to get back to a press release issued by the Potato Council (yes, there really is such a thing!) dated November 9, 2007. The title of the document was “Provisional Estimate of GB Potato Supply for 2007” and I had the original URL in my notes. The URL is no longer on the Potato Council’s web site and searching the site failed to turn up the document. Searching the Potato Council’s web site using the Google site: command also failed to find it. I next ran the URL through Google, Bing and DuckDuckGo and found 2 references to it in research papers but not the press release itself.

As I had the URL my next stop was the Internet Archive Wayback Machine (http://www.archive.org/) but the archive found nothing. The Wayback Machine periodically takes snapshots of web sites and lets you browse those copies by date. You can enter the URL of a home page or an individual page. The snapshots are not taken every time a website changes so there are gaps in its coverage, and a page or document can be missed. Hoping that the URL might have changed at some point I browsed copies of the Potato Council’s site for late 2007 and early 2008, but no joy.
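For anyone who needs to check a batch of old URLs, the Wayback Machine lookup can also be scripted. The sketch below uses the Internet Archive's availability service at https://archive.org/wayback/available, which returns the snapshot closest to a requested date as JSON. The function names and the example address are hypothetical (the original press release URL is not reproduced here), and a missing result tells you only that no snapshot was found, not that the page never existed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup address for a page, optionally near a date (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{API}?{urlencode(params)}"


def closest_snapshot(response):
    """Pick the archive copy out of a decoded JSON response, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


def lookup(page_url, timestamp=None):
    """Ask the Wayback Machine for the snapshot closest to the given date."""
    with urlopen(availability_url(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))


# Hypothetical address, used only to show the shape of the request
print(availability_url("example.org/press/release.pdf", "20071109"))
```

Calling lookup() with a real URL and a date such as "20071109" returns the address and timestamp of the nearest snapshot, or None when the archive has nothing, which is what happened with the press release above.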

Next I tried the UK Government Web Archive at the National Archives (http://www.nationalarchives.gov.uk/webarchive/). This is similar to the Wayback Machine but concentrates on UK government sites and related official bodies. One of the options is to browse the A-Z directory. I found fewer archive copies than in the Wayback Machine but hoped that the one entry for 2008 might come up trumps. Unfortunately it did not.

Archive copies of the Potato Council web site

Another possibility was that Zanran (http://www.zanran.com/) might have a copy. Zanran concentrates on indexing and searching information contained in charts, graphs and tables of data. It archives copies of the documents and I have used it several times to track down information that has been removed from the live web. A search on potato supply estimate UK 2007 came up with a list of results with my document at the top.

Zanran search result

At first glance, it does not appear to match the document I am looking for because the title is different. The titles listed by Zanran are not always those of the whole document but the labels or captions associated with the individual charts and tables. If you hover over the thumbnail to the left of the entry you can see a preview of a much larger section to make sure you have the right document. Clicking on the thumbnail or title will usually take you to Zanran’s archive copy.

Had I not found the press release on Zanran, I would next have contacted the Potato Council. My experience, though, is that very few organisations are able or willing to supply older documents such as press releases. My last resort would have been to contact the authors of the two papers I had found via Google to see if they had kept copies.

I usually keep copies of all papers and pages that I use as part of my research on major projects but inevitably there are times when I forget. As demonstrated above, there are several tools that can be used to try and track down documents that have disappeared from the web but success is not guaranteed.