Category Archives: Search Strategies

Oi! Google – you have seriously overstepped the mark

Yes, I am talking to you, Google, and this time you really have gone too far.

All I wanted to do was check up on the background of a photo I had taken of the wall surrounding the graveyard of a church in Reading. The church in question is St Laurence. We have all become accustomed to the “Did you mean….?” option at the top of our search results. I found it invaluable early in the morning or late at night when typos were inevitable in my search strategy: yes, thank you, I really did mean ‘widget manufacturers’ and not ‘wigdet manufacturers’. Recently, though, Google has abandoned the optional corrected search and instead runs the corrected strategy by default, with yours offered as the extra option. It has now taken this a stage further and runs your search as it thinks fit.

So Google decided that I really meant to search for Saint Lawrence and has included that in the search. There is no option to search on just Saint Laurence:

Google St Laurence search

On this occasion there were some relevant pages in my results. But yes, Google, I really did want to search for Saint Laurence! Now, it seems, I have to prefix all of my search terms with a plus sign or enclose them in double quote marks to stop Google’s dictatorial behaviour. But why should I have to do that?

In one of my presentations last year on Google vs. Bing/Yahoo I commented that Google would have to do something really stupid before users would switch to another search engine. For me, Google has done that really stupid thing. I am now seriously contemplating switching search engines for basic web searching. My final decision will be based on relevance of results and how quickly they are delivered. I have to spend too much time and click too many times to get them on Google.

UPDATE: It has just got worse. I tried a search on the phrase “Saint Laurence” thinking Google would carry out an exact match search, but Google will have no truck with such obvious ploys. (Ignore the Twitter search at the top of the results screen – that is a Greasemonkey script add-on for Firefox.)

Google search changes

I now have to click on the option for “Saint Laurence” to get results for the search I had originally requested. Putting a plus sign before my phrase in the search box does not change Google’s mind. “Excuse me, Google, but I do know what I am doing and when I tell you to carry out an exact match search I WANT AN EXACT MATCH SEARCH! Got it?”

x-Factor web pages are “advanced” says Google’s reading level

Google has rolled out a new search option that assigns a reading level to the pages in your results list. Don’t be surprised if you haven’t spotted it yet; it is hidden on the advanced search screen. Under the “Need more tools?” section you can choose from the drop down menu to see all of the results with reading level annotations, basic results, intermediate results or advanced results.

Google Reading Level

Google does not give much away as to how it calculates the reading level and it has nothing to do with the reading age that publishers assign to books. It could involve sentence structure, grammar, the length of sentences on a web page, the length of the document, the terminology used and doubtless many other criteria. But Google isn’t saying.

If you have opted to see the annotations, at the top of your results page you will see a graphic showing the percentages for each of the categories. Under the title of each entry in your results list is the reading level.

Google Reading Level Results

Click on the Basic, Intermediate or Advanced links next to the bar chart to see pages for that reading level. The eagle-eyed amongst you will have spotted that Google appears to be mathematically challenged because the numbers do not add up to 100%. In all of the searches I have done so far 1 or 2% are missing from the statistics. Looking through the lists of results some pages have no reading level assigned to them and they seem to be documents that contain very little information, have more numbers than text, and some are formatted files. Note, though, that most file formats do have a reading level so why some are not picked up remains a mystery to me. Some Daily Mail articles do not have a reading level either but many would argue that they fall into the ‘very little information’ category!

Once you have used the Reading Level in the advanced search screen you can change your search on the results page and it remains as part of your search strategy until you close down your browser or tab.

You can also check out an entire web site by using the site command, for example site:rba.co.uk

Google Reading Level for RBA site

And this is where you can start to have some fun comparing sites (WARNING – this is addictive!). Phil Bradley has done some in his blog posting Google adds reading level
(http://philbradley.typepad.com/phil_bradleys_weblog/2010/12/google-adds-reading-level.html). He also highlights some potential problems with labelling pages in this way. For example ‘basic’ does not necessarily mean stupid, but some people may be deterred from selecting basic pages because of the tag.

Most of my pages are classed as intermediate and I am happy with that. Many of them are listings and analyses of business information sources. My husband’s blog on the other hand is 71% advanced and 27% intermediate. This comes as no surprise to me as he has a habit of littering his postings with complex calculations on topics such as wind turbine energy generation and the EROEI of tar sands oil production. (Just the sort of thing not to read before you have had your second cup of coffee of the day.) That plus the industry specific jargon that he uses makes an advanced tag inevitable.

Google Reading Level Energy Balance Blog

The evidence so far seems to be suggesting that using terms or jargon that are relatively uncommon in the whole of the Google database is a heavy factor in determining the reading level. Let’s look at what one might consider to be an intellectually challenging topic: the use of zeolites in environmental remediation.

Google Reading Level Zeolites search

That seems to confirm it.

As a final test and for a bit of fun let’s look at what Google makes of a search on the recent x Factor final.

Google Reading Level xFactor

Noooooo! Surely some mistake? The X factor home page is rated as basic but 93% of the results are advanced. There is indeed a mistake but it was my sloppy search strategy. Changing the x factor part of the search to a phrase gives what I would expect and a switch to 53% basic, 40% intermediate and 6% advanced.
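The arithmetic behind the "missing" percentages can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the figures are the ones quoted above for the phrase search.

```python
def unclassified_share(basic, intermediate, advanced):
    """Return the percentage of results that carry no reading level."""
    return 100 - (basic + intermediate + advanced)


# Percentages reported for the phrase search on the x Factor final
missing = unclassified_share(basic=53, intermediate=40, advanced=6)
print(missing)  # 1
```

The remaining 1% is consistent with the 1 or 2% of results that appear without any reading level annotation.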

Google Reading Level xFactor phrase search

Out of curiosity, I looked at the content of the advanced pages and am now totally bemused. I cannot see how they could ever have been classified as such, but then this is Google we’re talking about. Perhaps Google cannot comprehend the scoring system, why so many people watch it or why the programme exists at all?

Google Reading Level xFactor

I have experimented with several other searches. Some came up with results as bizarre as those for the x Factor search but it is interesting how the breakdown can be changed by slightly modifying your search strategy, for example by using phrases when appropriate or a plus sign before a term to force an exact match search. Google’s Reading Level could be useful as a training tool to show how small alterations to a search strategy can radically change the results. But as with all things Google, we do not know how it works and the results can sometimes be very strange. Use with caution.

Advanced search tips and tricks

An interesting list of search tips came from the participants of the search workshops I recently ran in-house for a well known academic institution. (My Twitter followers will be able to work out who it was). As well as being experienced, savvy searchers they are fortunate in that they can choose which browser to use for searching. Attempts to demonstrate Google Instant failed, however. I was not able to show Google’s latest “enhanced search experience” in action, even when using the latest versions of the browsers and being signed in to a Google account. This was probably due to their firewall. Personally, I think that is a plus for the institution. Some of you may disagree.

Here is their combined top search tips list.

1. Keep it Simple!

There is a plethora of advanced search options and Google alternatives but starting off with a simple search string is often the best approach. Looking for data on the UK rat population? You might be tempted to include a file format limitation in your search and/or a site:gov.uk command but simply typing in a search for uk rat population statistics was quicker and came up with the relevant information. Note: the simple approach worked at the time with this example because it was a “hot topic” in the UK news. It might not work now, which brings us to number 2…

2. Be aware of personalisation and hot topics

The major search engines monitor what you search for and the links you click on, and use this to “personalise” your results and sponsored links/ads accordingly. This information is stored in cookies on the computer you used for the search. They also try and work out your location from your IP address so that they can deliver local content (this sometimes goes horribly wrong!). What is currently hitting the headlines will also be a factor in determining the results that are displayed on the first page (increase your displayed results per page to more than the default 10 and ideally to at least 50). This means that you will see different results from one day to the next and if you use a computer other than your usual machine.

3. Google isn’t infallible

We covered a range of search techniques that you can try to bring Google to heel but if you are not getting anywhere try another search tool. Google does not cover everything and your best result may be number 1,200,675 in the results list. Try Yahoo or Bing as alternatives and also think about using specialist search tools for real time and social media, images, and subjects/industries.

4. Get to know the Google alternatives

There is no easy way to do this but visiting Zuula (http://www.zuula.com/) or Browsys Finder (http://www.browsys.com/finder/) once every couple of weeks will remind you of the alternatives and alert you to new kids on the block.

5. Google additional search options

Open up and explore the additional Google search options on the left hand side of your results page. You can restrict your search to news, videos, blogs, images etc and to a time period. There are also options for related searches, less or more shopping sites and….

8. The Wonderwheel

Use this to extract phrases and concepts from the top results and to change the direction of your search. Worth investigating if you are stuck in a rut and fed up with seeing the same results again and again.

9. Google Public Data Explorer

This is currently a Google Labs project at http://www.google.com/publicdata/home “..makes large datasets easy to explore, visualize and communicate. As the charts and maps animate over time, the changes in the world become easier to understand.” There is a list of sources at http://www.google.com/publicdata/directory but the data available is more varied than the list suggests at first glance. The World Development Indicators and OECD Factbook are worth looking at in more detail to see if they have data that can help with frequently asked questions.

10. Creative Commons and public domain images

If you are looking for an image for a presentation or promotional literature, search for images that have the appropriate Creative Commons license. There are several licenses with varying degrees of restrictions. Details are on the Creative Commons web site at http://creativecommons.org/. You can search Flickr photos that have a specific Creative Commons license at http://www.flickr.com/creativecommons/ or use Compfight (http://www.compfight.com/). There are several other sites you can use for Creative Commons images but Geograph (http://www.geograph.org.uk/) was mentioned several times by the workshop participants. Geograph “aims to collect geographically representative photographs and information for every square kilometre of Great Britain and Ireland” and all photos have a CC 2 license, which means that they can be used commercially with attribution.

11. TinEye Reverse Image Search
http://www.tineye.com/

Type in the URL of an image or upload one of your own and TinEye will find similar images and show how the image is being used, whether modified versions exist, and whether there is a higher resolution version. Provided by Idée Inc, who also offer…

12. Multicolr Search Lab
http://labs.ideeinc.com/multicolr/

Search 10 million Creative Commons Flickr images by colour. You can specify more than one colour and click on a colour several times to increase its prominence within the image. You can easily click through to the original Flickr image to double check the license.

13. Slidefinder

http://www.slidefinder.net/

Ideal for locating individual presentation slides that contain your search terms. There is an Advanced Search that enables you to search specific areas of a slide, for example title, text, notes. You can also limit your search to a university. There are browsable lists at the bottom of the page but they do not list every institution: there are only 47 for the UK. One workshop participant had been given a paper copy of a complex slide and it had taken her “ages” to find an electronic version. She had had to wade through hundreds of slides in presentations that had been identified by using the advanced filetype:ppt search. Slidefinder found it straight away.

14. Twitter search tools

Do not expect Google, Yahoo or Bing to carry out a reliable Twitter search. Use specialist search tools such as Twitter Search (http://search.twitter.com/), Twazzup (http://www.twazzup.com/), BackTweets (http://www.backtweets.com/) for tweets that refer to your content, Tweepz (http://www.tweepz.com/) for finding people and organisations on Twitter, and TwapperKeeper (http://www.twapperkeeper.com/) for archives of tweets on a conference hashtag or keyword.

15. Google custom search engine

http://www.google.com/cse/

Ideal for groups or collections of sites that you regularly search and use. Google CSE is very quick and easy to set up and can be hosted on Google. Two that had been set up by a workshop participant were a list of library associations worldwide and selected UK higher and further education web sites.

16. Watchthatpage

Tracking changes to web pages that do not themselves offer RSS or email alerts was not covered by the main part of the workshop but the question arose during one of the practical sessions. There is a list of some web based and downloadable programs and their features at Tracking Web Page Changes http://www.rba.co.uk/sources/monitor.htm . Watchthatpage (http://www.watchthatpage.com/) won the vote because it is free, web based and offers email alerts.

17. Evernote

http://www.evernote.com/

“Capture anything… Type a text note. Clip a web page. Snap a photo. Grab a screenshot. Evernote will keep it all safe.” I don’t use this myself but it had several fans in this organisation. (I use the Firefox add-on Scrapbook to do a similar thing.)

18. Add-ons for Firefox

If you are a Firefox user, explore the many add-ons that are available to make searching and managing information easier. For example Feedly (https://addons.mozilla.org/en-US/firefox/addon/8538/) to organize your favourite sources into a magazine-like start page; Scrapbook (https://addons.mozilla.org/en-US/firefox/addon/427/) to save and organize web pages; and Optimize Google (https://addons.mozilla.org/en-US/firefox/addon/52498/) for customizing your Google searches and results.

19. Don’t re-invent the wheel – re-use and share

As well as images, many presentations have Creative Commons licenses and their authors are often happy for you to re-use slides from them as long as you acknowledge the source and do not incorporate them into a product or service that you then sell. Slideshare.net is a good starting point but do check the license to confirm what you can and cannot do with the content – not all are CC. Also, consider assigning a CC license to your own photos and presentations. The Creative Commons web site (http://creativecommons.org/choose/) can help you decide which one to use.

20. Time to explore

There was time to explore new techniques and tools during the workshop but it is not so easy to try out, for example, a new option on Google when you are back in the office and an enquirer wants that result NOW! Try and incorporate some “play time” into your schedule so you can keep up with new developments, even if it is just 10 minutes a week.

London Workshop: Advanced Google Searching

I am running a series of hands-on workshops this autumn in London, and the first is on Advanced Google Searching. It is being held on September 23rd at Just IT, 7 Sandy’s Row, which is near Liverpool Street.

Google is the first port of call for many of us when it comes to searching the Internet, and with more data and services being added all the time it seems the obvious place to start. More information, more search features but not necessarily more relevant results. This hands-on workshop will look at the latest developments in Google and how to focus your search to obtain better results.

Topics covered include:

  • recent developments and new services from Google
  • how Google personalises your results
  • how Google is incorporating social media
  • essential advanced search commands
  • how to use the new options to narrow down your search for more relevant results
  • how to access and use the specialist tools
  • image, video and news search
  • build your own Google Custom Search Engine

This workshop is suitable for all levels of experience. The techniques and approaches covered can be applied to all subject areas.

Please note: this workshop concentrates on Google and does not cover the same topics as my recent UKeiG “Changing Landscape of Search” session.

A booking form is available at http://www.rba.co.uk/training/AdvancedGoogle.htm

Top search tips – 14th July 2010 workshop

An interesting mix of sectors was represented at my recent UKeiG workshop “The Changing Landscape of Search”. With social media becoming such an important part of search, there was a lot to cover in just one day and still include time for delegates to try out search tools for themselves. At the end of these workshops I ask the group to come up with their own top 10 tips. On this occasion we ended up with 13 and then a few people emailed me some more, bringing the total to 20, double the usual number! The list is a combination of simple tried and tested techniques, new services and tools, and new strategies for dealing with the vast amount of information that is returned by the search engines.

  1. Set up your own Google custom search engine (http://www.google.com/cse/) for groups of sites that you regularly search and use. It is quick and easy to do, and you can keep them private or make them public.
  2. Docjax (http://www.docjax.com/) for searching Google and Yahoo for file formats ppt, doc, xls, pdf
  3. Use Twitter (http://www.twitter.com/) to keep up with what people are saying about your organisation or industry, and to find out what is happening at conferences.
  4. Nearby Tweets (http://nearbytweets.com/) for monitoring tweets on a subject and from a geographical location
  5. Save tweets and Twitter searches if you are using Twitter for competitive intelligence or reputation monitoring/management.
  6. Try out the Google Wonderwheel to see connections between concepts and to change the direction of your search. Run your search, open up the options in the menu to the left of your search and click on Wonderwheel. This had mixed reviews from the workshop participants and even its fans said that it does not always help with the search. Nevertheless, worth trying if you are stuck in a rut and fed up with seeing the same results again and again.
  7. In Google use the menu options to the left of your search results to help you focus your search and for more relevant results.
  8. Separate real time and “traditional” web search. Google, Bing and Yahoo incorporate real time and social media results into the main search results. These results are not comprehensive and give a superficial, biased view of the topic. Use the specialised real time search tools for searching social media.
  9. Slidefinder (http://www.slidefinder.net/) for locating individual presentation slides that contain your search terms. There is an Advanced Search that enables you to search specific areas of a slide, for example title, text, notes. You can also limit your search to a university. There are browsable lists at the bottom of the page but they do not list every institution: there are only 47 for the UK!
  10. View the cached page version of a document in your search results to see where and how often your terms occur. Useful for very large documents.
  11. Biznar (http://www.biznar.com/). Real time federated search tool covering selected business sites, some of which are not searched by Google et al.
  12. Google Timeline to see the distribution of pages and documents over time. Remember, though, that the dates are not always when the content was published. A date or year might just have been mentioned in the text or Google mistakenly interpreted a number as a date.
  13. Use double quotes “” around phrases to find specific names or titles. This one is a golden oldie but one that is often forgotten. Works in nearly every search tool.
  14. Try alternative names or change a single term to expand your search results, for example BP oil spill vs. BP oil leak. See what the search engine suggests as you type in your strategy and in Google look at the Related Searches option in the menu to the left of your search results.
  15. Add the year to your strategy when searching for somebody or something from a particular year. A simple, obvious trick but another one that is often forgotten. This will only look for the number in the text and does not run a date search, but it does significantly narrow down your search.
  16. Try using non-UK and non-US versions of Google, for example http://www.google.com.ar/ or http://www.google.es/ if the information is likely to be in Spanish.
  17. When using Google, click on ‘similar’ to find related information and sites similar in content and type.
  18. Bing for images. No need to keep clicking the next page for more images, just keep scrolling down. Some also commented that the quality of the results and the layout are better than Google.
  19. For video archives try BBC Motion Gallery – BBC Archive at http://www.bbcmotiongallery.com/gallery/home/archives.do and NewsFilm Online at http://www.nfo.ac.uk/
  20. Social Mention (http://www.socialmention.com/). Great for monitoring mentions in the social media about a person, company or topic.
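Tips 13 to 15 (phrase searching, alternative terms and adding a year) lend themselves to a small helper that generates the query strings to paste into a search box. The sketch below is illustrative only: the function names are invented, and the year 2010 is an assumed example value rather than something the workshop specified.

```python
def phrase(text):
    """Wrap a name or title in double quotes for an exact phrase search."""
    return '"' + text + '"'


def query_variants(base_terms, alternatives, year=None):
    """Build one query per alternative term, optionally adding a year.

    The year is added as a plain search term; it is not a date search.
    """
    queries = []
    for alt in alternatives:
        parts = [base_terms, alt]
        if year is not None:
            parts.append(str(year))
        queries.append(" ".join(parts))
    return queries


print(phrase("Changing Landscape of Search"))
for query in query_variants("BP oil", ["spill", "leak"], year=2010):
    print(query)
```

Running each variant in turn, rather than combining them into one long query, makes it easier to see how a single changed term alters the results.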

The slides for the day can be found on Slideshare at http://www.slideshare.net/KarenBlakeman/changing-landscape-of-search

IFEG Advanced Search, Statistics & Market Research

I have now uploaded the slides for my workshop at the Information for Energy Group (IFEG). As usual, I have uploaded them to several different web sites in case one or more are blocked by corporate firewalls. If you have problems accessing any of the locations, let me know and I’ll sort out some other means of getting the presentation to you.

Workshop: Advanced Internet Searching for Energy Information & Market Research
Organised for:
Information for Energy Group
Venue: The Energy Institute, New Cavendish Street, London.
Date: Thursday 13 May 2010

PowerPoint Presentation (download from the RBA site – 7.5 MB)
authorSTREAM
Slideboom
Slideshare

Another workshop – another Top 10 Search Tips

The participants at the latest advanced search workshop were all from the public sector and had very strong views on some of the new developments in search. They were definitely not impressed by Google automatically enabling web history with a view to “personalizing” search results. (See Your Google results are about to get weirder
http://www.rba.co.uk/wordpress/2009/12/17/your-google-results-are-about-to-get-weirder/). (The workshop participants are switching off Web History as soon as they get back to the office!) There were several sites and search features, though, that did impress them. This is their list of Top 10 Search Tips.

1. The Google Wonderwheel was the clear winner of the day with this group. When your results page appears on screen, click on “Show options” just above the results and to the left of the screen. Then select Wonderwheel from the list on the left of the page. (For further details see Google new search and display options
http://www.rba.co.uk/wordpress/2009/10/05/google-new-search-and-display-options/)

2. Google’s Timeline was a close second in the popularity stakes. This is also under Show options in Google when you do a default web search and is also available in Google News. It shows the distribution of your articles over time and gives you an idea of when something started to become a “hot topic” and how a story has developed over time. It is not 100% accurate but is good enough to give you an overall picture of how interest in a subject has waxed and waned.

3. LGSearch http://lgsearch.net/ They liked this one a lot! This is a Google Custom Search Engine (CSE) set up by Dave Briggs (http://davepress.net/) that searches UK public sector web sites in one go. On the results page you can, if you wish, narrow down your search further to Local Government, Central Government, Health, Police & Fire, LG Related or Social Media.

4. Slideshare http://www.slideshare.net/. A site used by many people and organisations to provide access to PowerPoint presentations. Search for presentations on any topic or by a specific person then view online or download the original if the author permits. Once you have selected a relevant presentation Slideshare also shows you a list of other presentations containing similar content. No registration required if you just want to search.

5. Try something other than Google. As well as giving Yahoo or Bing a go, think about the type of information you are looking for: news, video, statistics, what people are talking about. Then use the appropriate search tool for that type of information.

6. Twitter search http://search.twitter.com/ You may not want to indulge in Twitter yourself but it can give you an idea of what people are saying about a topic. It is also an essential part of reputation monitoring and competitive intelligence: what are people saying about you or your products and services? You do not have to have a Twitter account to search Twitter, just go to search.twitter.com.

7. Google Blogsearch (http://blogsearch.google.com/) and Blogpulse (http://www.blogpulse.com/) Blogs are another useful source of views and opinions on every topic imaginable. Blogpulse has a “trend this” option on the results page that displays a graph showing you how many blog posts mention your search terms over time.

8. Zuula.com (http://www.zuula.com/) for quick and easy access to a wide range of search tools covering different types of information. Enter your search once, click on the tab for the type of resource (video, images, reference, news), and then work your way through the list of search engines.

9. Google Custom Search Engines (CSE). We looked at several Google CSEs, LGsearch.net and Directionlessgov (http://directionlessgov.com) being just two of them. You can, though, set up your own CSE at http://www.google.com/cse/. Useful if you search the same web sites day after day. You will need a Google account or Gmail account to set up a CSE but you can host your CSE on your own web site or on Google. CSEs can be made public or kept private.

10. University of Auckland Official Statistics (OFFSTATS) http://www.offstats.auckland.ac.nz/ This set of web pages provides information on Official Statistics on the Web and is an excellent starting point for official statistics by country and subject/industry.

Exalead changes filetype commands

If you are a user of Exalead (http://www.exalead.com/search/) and use the filetype command you will need to make note of some changes to the file extensions. If you are looking for Excel spreadsheets you will now have to include ‘filetype:excel’ in your search strategy, for PowerPoint it is ‘filetype:powerpoint’ and for Word documents type in ‘filetype:word’. I assume that the changes are to ensure that the ‘new’ Microsoft Office 2007 extensions pptx, docx and xlsx are picked up. Alternatively, you could run just a keyword search and select the filetype from the menu down the right hand side of the results page.
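The new Exalead values can be kept in a simple lookup so that you do not have to remember them. This is a sketch only: the helper name and the sample keywords are made up, and treating both the old and the Office 2007 extensions as belonging to the same Exalead filetype is an assumption based on the guess above about why the change was made.

```python
# Exalead filetype values as described above, keyed by file extension.
# Mapping both old and new Office extensions to one value is an assumption.
EXALEAD_FILETYPES = {
    "xls": "excel", "xlsx": "excel",
    "ppt": "powerpoint", "pptx": "powerpoint",
    "doc": "word", "docx": "word",
}


def exalead_query(keywords, extension):
    """Append the Exalead filetype command for a given file extension."""
    filetype = EXALEAD_FILETYPES[extension.lower().lstrip(".")]
    return f"{keywords} filetype:{filetype}"


print(exalead_query("wind turbine energy", "pptx"))
# wind turbine energy filetype:powerpoint
```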

In Google you have to run separate command line searches if you want to pick up both ppt and pptx files. The advanced search screen file format drop-down menu options only search for pre Microsoft Office 2007 file extensions. Bing does not seem to recognise the newer file extensions at all but you can search for them in Yahoo using the ‘originurlextension:’ command. Like Google, Yahoo’s advanced search screen file format box does not pick up the 2007 extensions.
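Running the separate Google searches by hand soon becomes tedious, so a short script can generate one search URL per extension. Treat this as a sketch: the function names and the sample keywords are illustrative, and it simply reflects the behaviour described above at the time of writing, which Google may change.

```python
from urllib.parse import urlencode


def google_filetype_queries(keywords, extensions):
    """Return one Google query string per file extension."""
    return [f"{keywords} filetype:{ext}" for ext in extensions]


def google_search_url(query):
    """URL-encode a query for Google's standard search address."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# One search for the old PowerPoint extension and one for the new
for q in google_filetype_queries("zeolites environmental remediation", ["ppt", "pptx"]):
    print(google_search_url(q))
```

Each printed URL can be opened in a browser; the two result lists then need to be compared by eye.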

Most people who use Microsoft Office 2007 generally convert files to 97-2003 format before uploading them to the web, but Office 2010 is well into beta testing and the new extensions will start to become more commonplace. It will be interesting to see if and how Google, Yahoo and Bing manage search for these new filetypes.

Slidefinder

Slidefinder (http://www.slidefinder.net/) was recommended to me way back in August 2009 and I have been using it ever since to track down information inside presentations. PowerPoint presentations can hold a wealth of information: corporate structures, strategic plans, research activity, statistics, industry information etc. Using the advanced file format search options in the general search engines is one way of locating relevant presentations and there are also searchable presentation sharing sites such as Slideshare (http://www.slideshare.net/) and Authorstream (http://www.authorstream.com/). Slidefinder is a similar service but locates and presents you with individual slides that contain your search terms. This means that you do not have to wade through the whole file to find the information you want.

It covers publicly available PowerPoint presentations that are on the web but does not include services such as Slideshare or Authorstream. The default simple search is straightforward. Type in your search terms and relevant slides are displayed as thumbnails. The advanced search enables you to search by slide title, text, notes, presentation name, keywords, language and site. To see a larger version of a slide and any notes associated with it move the cursor over a slide, or you can download the entire presentation if you wish.

There are also options to restrict your search to university sites. These are listed by country in regions (Europe, North America, Oceania and Asia) but the list is not comprehensive. Once you have identified the university you want you can either browse the title slides or keyword search the available presentations. Phil Bradley has already reviewed the service and he commented that no UK universities were listed. This is obviously a part of the service that is under continual development and I note today that two universities have been added to the UK list since I last looked. It is not clear how the universities are selected for inclusion (there are only 47 for the UK) and many major institutions such as Reading University are missing from the list.

Slidefinder is powered by Slide Executive (http://www.slideexecutive.com/) and is a showcase for Swedish company Novatrox’s desktop and enterprise presentation management tools. They are essentially search tools for presentations stored on your own computer or networks but they also enable you to build new presentations from existing slides and manage “libraries”. There is a range of products depending on the number of users and how you wish to create and organise your files. They are all priced but you can download free trials. I am currently looking at the single user desktop edition and, although I know my own presentations and their locations inside out, I am finding Slide Executive very useful for presentations given to me by co-workers and colleagues. The question for me now is whether or not it is worth 249 Euros. Possibly not, but the free Slidefinder is definitely worth adding to your search toolkit.

Online Information 2009 presentations

The three presentations I gave at Online Information 2009 are now available on Slideshare: