Online Information pre-conference workshop: Searching without Google

The slides for my workshop “Searching without Google”, 28th November 2011, are available at:

http://www.rba.co.uk/as/ – please note that this is a temporary location for the presentation and it will be removed after 2-3 months. Archive copies will remain on authorSTREAM and Slideshare.

authorSTREAM 

Slideshare

There is also an addendum to the presentation that summarises some of the questions and answers covered throughout the day together with “top tips” and sites that the participants themselves suggested. This is also available at http://www.rba.co.uk/as/.

authorSTREAM

Slideshare

The text of the addendum is reproduced below.

Tools for creating your own search engine

Several people said they will investigate setting up a custom search engine for their preferred sources and frequently used web sites. Google’s custom search engine is at http://www.google.com/cse/ and Blekko.com lets you set up ‘slashtags’ to create lists of sites for searching. One person said that they are going to try both and compare ease of use and results.

What can one do when the link to Google’s advanced search screen disappears altogether?

The link to Google’s advanced search screen has been moved to the drop down menu underneath the cogwheel in the upper right hand corner of your screen, but several people reported that it had even vanished from there for a couple of days last week. Next time you use the screen, bookmark its URL so that you can go directly to it (of course Google can always change that!). Also learn the advanced search commands, e.g. filetype: and site:, so that you can type them into the standard search box.

There is a list of commands at http://www.rba.co.uk/search/SelectedGoogleCommands.shtml (also a PDF version)

(The GoogleGuide list of commands at http://www.googleguide.com/advanced_operators_reference.html was last updated in 2008 and contains commands no longer in use)
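As a rough illustration of typing the commands straight into the standard search box, the Python sketch below assembles a query from plain terms plus site: and filetype: and turns it into a search URL. The function name and the example values are invented for illustration, and the sketch assumes the familiar q parameter on Google's search URL, which Google can of course change.

```python
from urllib.parse import urlencode


def google_query_url(terms, site=None, filetype=None):
    """Build a Google search URL from plain terms plus optional commands.

    Hypothetical helper for illustration only; it does nothing more than
    join the pieces with spaces and URL-encode the result.
    """
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    query = " ".join(parts)
    return "https://www.google.com/search?" + urlencode({"q": query})


url = google_query_url("company registers", site="rba.co.uk", filetype="pdf")
print(url)
```

Bookmarking a URL built this way is another route to a saved search, although it is only as stable as the commands themselves.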

Tools for monitoring social media and alerting you when a subject is mentioned

Although Google and Bing both include social media in their search results it is often better to use a tool designed specifically for the job. All three of the following offer RSS alerts for searches.

Topsy.com –  searches tweets, videos and photos on Twitter, and now Google+

SocialMention.com (can be slow at times) – blogs, Twitter, bookmarks, video, audio

Icerocket.com – blogs, Twitter, Facebook, images

Are there still some directories alive and updated e.g. DMOZ? Or is the war of directories vs search engines over?

The Yahoo directory is still online although it is not easy to find and it has not been updated for several years. Similarly, some sections of DMOZ appear not to have been updated for at least a year and the entries under some headings look like advertising. The day of the mega-directory may be over but specialist and subject specific listings are still being developed. They do, though, require considerable time and effort to maintain and inevitably some are forced to close because of a lack of funding e.g. Intute.

Are there good tools for events search by subject, region, date?

The events databases that some of us accessed via services such as DataStar have long gone so it is not possible for example to search for events on nanotechnology taking place next year between June and September in Europe. Possible alternative search strategies include:

– identifying major events venues and their calendars

– locating relevant trade and industry newsletters, portals, magazines that may list events in their sectors

– relevant trade and professional bodies and associations

Are there tools that search the live web rather than using indexes of cached pages?

Biznar.com, Mednar.com and some social media search tools search the “live” web but they are limited to searching a small number of sites and are slower than Google and Bing in returning results. There are no free public search tools that search the entire web live – it would take far too long – and by the time the search engine had finished the information would be out of date!

Searching for scientific publications that are not published in major English language journals

Google and Google Scholar are still a good starting point for this type of search, but several of the workshop participants suggested also investigating Open Access journals, national digitised collections and subject specific listings and portals.

Searching news in other languages (alternatives to Google News)

Country versions of Google News give priority to local content but you can identify news sources in individual countries at the Newspaper & News Media Guide http://www.abyznewslinks.com/. You cannot search the publications from this site but it will tell you what is available and the language of publication.

What will be the trend of the next 5 years? More competition? More takeovers of the smaller search engines? More specialist tools?

All of that! Many smaller specialist search tools continually emerge and many of them quickly disappear or are bought up by the competition. It is impossible to predict exactly what will happen, or even if Google will remain the dominant search tool on the web. If Google’s popularity starts to wane it probably will not be because a “Google-killer” arrives on the scene but because Google goes too far in trying to take control and automatically “improve” results for users. Many of us feel that it is already going in that direction.

Top tips and tools to try back at work

  1. Custom Search Engines – use Google CSE (http://www.google.com/cse/) or set up a ‘slashtag’ on Blekko.com so that you can quickly and easily search those sites you regularly use. Note: they will not include password protected sites or sites where you need to conduct a database search
  2. Biznar.com – real time federated search of selected key business resources
  3. Chemspider.com – brings together chemical information from a wide range of resources. Maintained by the Royal Society of Chemistry
  4. Investigate image search sites other than Google. (Multicolr http://labs.ideeinc.com/multicolr/ was specifically mentioned.)
  5. Paper.li – gather together tweets and/or Google+ posts containing links based on keywords or from a user and their twitterstream. Results are presented in an easy to read newspaper style.
  6. http://www.zanran.com/ – searches for data and statistics contained in graphs, charts and tables
  7. http://duckduckgo.com/  – alternative search engine that does not customise or personalise your web results
  8. http://integrals.wolfram.com/ – Wolfram Mathematica online integrator. Ideal for maths homework.
  9. http://www.coremine.com/ – Norwegian initiative providing an interesting visual interface to the biomedical literature
  10. Central Index of Digitized Imprints (zvdd) http://www.zvdd.de/ Access to and search options for German digitized works from the 15th Century to the present. Collections are listed at http://www.zvdd.de/dms/browsen/. See also http://www.europeana.eu/portal/  “to explore the digital resources of Europe’s museums, libraries, archives and audio-visual collections”

Yahoo Site Explorer closes – try Blekko instead

A reminder that Yahoo Site Explorer is closing down tomorrow (November 21st) and I assume that the link and linkdomain commands will go with it, although they are not specifically mentioned (http://www.ysearchblog.com/2011/11/18/site-explorer-reminder/). Webmasters are being told to use Bing Webmaster Tools. This enables you to analyse links to your own domains but is no use if you want to find who links to other web sites as part of research. Bing, or Live.com as it then was, removed its link and linkdomain commands in November/December 2007 and Yahoo was left as the only reliable alternative. The link command enabled you to find who linked to a specific page on the web and linkdomain found links to anywhere on a specified web site. Both were useful ways of finding other sites containing similar content and discovering what others were saying about a page. Google’s link command is useless as it picks up a minuscule number of results, which now leaves Blekko (http://blekko.com/) as the only realistic alternative.

Blekko enables you to track down linked pages in two ways but both lead to the same results. The first is to use their slashtags ‘/links’ and ‘/domainlinks’ with a URL or domain name. For example http://www.rba.co.uk/sources/registers.htm /links will find pages that link to my official company registers page whereas http://www.rba.co.uk/ /domainlinks finds all inbound links to my site rba.co.uk.
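The two slashtags can be sketched as a small Python helper that produces the text to type into Blekko's search box. The function name is hypothetical and the sketch only builds the query text; it makes no assumption about Blekko's own URL format.

```python
def blekko_link_query(target, whole_site=False):
    """Return the Blekko query text for finding inbound links.

    /links finds pages that link to one URL; /domainlinks finds links
    to anywhere on a site. Hypothetical helper, query text only.
    """
    slashtag = "/domainlinks" if whole_site else "/links"
    return f"{target} {slashtag}"


page_query = blekko_link_query("http://www.rba.co.uk/sources/registers.htm")
site_query = blekko_link_query("http://www.rba.co.uk/", whole_site=True)
print(page_query)
print(site_query)
```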

The second route is via your search results. Below each entry is a downwards pointing arrow. Click on this and select ‘links’ from the pop-up box.

Blekko Links

 

You will then see a list of sites that link to that page.

To view inbound links to the whole of the web site click on the seo option below the result and you will see some statistics together with the total number of inbound links.

Blekko SEO

 

Click on the inbound links number and Blekko presents you with a list of domains containing links to yours and how many.

 

Blekko Domain Links

 

To see exactly where the links are located and where they go to on your site just click on the number in the links column.

 

Blekko inbound links

 

I have only looked in detail at a couple of sites but Blekko seems to do a good job and is certainly far superior to Google. The Blekko data on my own site seems to correspond with that available from Bing Webmaster Tools but of course I cannot compare other sites in the same way. My initial thoughts are that for link searching Blekko is definitely worth adding to your research toolkit.

Google: Verbatim for exact match search

Well it looks as though the user feedback to Google on the discontinuation of the +/plus sign for enforcing an exact match search has paid off. Google removed the plus sign as a web search option a few weeks ago and told searchers to use double quotes around terms instead. The double quote marks option does not always force an exact match and increasingly Google is ignoring them and making some of your search terms optional. (See my blog posting Dear Google, stop messing with my search, http://www.rba.co.uk/wordpress/2011/11/08/dear-google-stop-messing-with-my-search/). The official reason for the change was that hardly anyone used it: the real reason has become clear with Google implementing its Google+ Direct Connect Service. This enables you to go direct to an individual’s or company’s Google+ page by prefixing their name with the plus sign, for example +BASF.

 

For those of us who really do NOT want Google to second guess what we are looking for there is now a Verbatim command. Google’s Inside Search blog (http://insidesearch.blogspot.com/2011/11/search-using-your-terms-verbatim.html) says:

 

 With the verbatim tool on, we’ll use the literal words you entered without making normal improvements such as
  • making automatic spelling corrections
  • personalizing your search by using information such as sites you’ve visited before
  • including synonyms of your search terms (matching “car” when you search [automotive])
  • finding results that match similar terms to those in your query (finding results related to “floral delivery” when you search [flower shops])
  • searching for words with the same stem like “running” when you’ve typed [run]
  • making some of your terms optional, like “circa” in [the scarecrow circa 1963]
So be warned: when using Verbatim you are rejecting Google’s “improvements”!

 

Verbatim can be found in the options on the left hand side of your results page, which means that you have to run your search before you can implement it. Go to the menu to the left of your results and click on ‘More search tools’ at the bottom. This will open up a menu that includes the Verbatim option.
Google Verbatim
It works! When I run a Verbatim search on St Laurence I get only St Laurence and not St Lawrence as well. And my Heron Island Caversham UK parrot search now finds only those pages that contain all of my terms. There is one drawback in that Verbatim is all or nothing. I often want to have an exact match search on just one or two of my terms but am happy to have Google mess around with the remainder. Verbatim works on your whole search strategy but I think that you can include advanced search commands in your strategy. Running searches such as ‘“Heron Island” Caversham UK ~parrot’ or ‘“Heron Island” Caversham UK parrot OR pigeon’ followed by Verbatim gives me what I would expect. However, more complex searches incorporating filetype: and site: gave me very bizarre results. I need to do more research on this part of the strategy.

 

Overall, I welcome Verbatim and thank Google for listening to its users. However, as Phil Bradley has said, it is a tool that “Google should not need to have created” (Google Verbatim tool http://philbradley.typepad.com/phil_bradleys_weblog/2011/11/google-verbatim-tool.html).

Free UK company information: Company Director Check

Company Director Check (http://company-director-check.co.uk/) is a sister database to Company Check (http://companycheck.co.uk/), which I reviewed earlier this year (http://www.rba.co.uk/wordpress/2011/01/10/free-uk-company-information/). It provides free access to information on current and past directors of UK companies that until now has only been available for a fee. Director searches can unearth links between apparently unrelated companies and help you identify “families” or groups of companies. It can also bring to light interesting patterns of behaviour. For example, I carried out a search on a director whose business activities had aroused my suspicions. I knew he had run companies in the past that had been dissolved and his most recent venture had gone into liquidation. Looking at the list of companies of which he had been director it became clear that 6-8 weeks before a company was dissolved or went under he would set up a completely new company. This had happened so often that it was not just me who had begun to smell a rather large rodent. I understand that he is “currently under investigation”!

If you are viewing a company in Company Check click on the director’s name and you are taken straight to their record in Director Check. Alternatively just run a search on the person’s name in Director Check. A list of possible matches will be presented to you, which you can refine by entering a postcode, or you can simply work through the list until you are certain that you have found the correct person. Do not be surprised if you find a director has multiple IDs. There is nothing “dodgy” about this; it just reflects the way the system has evolved over the years. Companies House have carried out a massive exercise to try and fix this but there are still some multiple IDs in the database.

The information that is provided includes full name, short name, month and year of birth, address and past and present directorships.

Director Check

The status of each directorship – active, dissolved, resigned – is displayed followed by a summary of each of the companies. More detailed information on the individual companies can be found on the Company Check web site.

Now that so much directorship information is freely available it will be interesting to see if more directors make use of the option to provide a service address rather than their home address for the public record.

Definitely one to add to your business research toolkit.

Dear Google, stop messing with my search

I have been complaining for several months that Google does not always “AND” your search terms and delivers results that do not contain all of your terms, or their synonyms, in the page itself or in links to the page. There was a time when you could force Google to deliver exactly what you wanted by prefixing your terms with a plus sign. That option has now gone and Google says that you have to use double quote marks around your terms and phrases instead. Not only is it tedious to have to surround every term with “…” but it does not always work!

The evidence

I recently took a photograph of autumn leaves on Heron Island in Caversham and uploaded it to Flickr. At the time I hadn’t noticed that there was a bird hiding amongst the leaves and when it was pointed out to me I assumed it was a pigeon of some sort. Someone else, however, thought it might be a parrot. I have not heard of any sightings of parrots in my area but decided to check Google to see if there were any reports. My first search strategy was parrot “Heron island” caversham UK

Google search results 1

 

Over 8,000 results! Unbelievable – which it was. Looking at the top results and their cached copies revealed that Google had decided to forget about parrots or birds of any kind and look for just “heron island” caversham UK.

Google search results 2

 

Changing the search to “parrot” “Heron island” caversham UK reduced the number of results to 84.

Google search results 3

 

This time Google was leaving out Caversham or UK or both. Amending the strategy yet again so that both caversham and UK were within quote marks reduced the number of hits to 23.

Google search results 4

 

There were a handful of directory listings containing all of my terms but the rest contained only one or two of my terms, for example the Tripadvisor page shown below.

Google Search results 5

There was no obvious logic as to why these irrelevant pages had been chosen by Google – remember that the grand total was a mere 23 – and they were not advertisements. Using advanced search and the allintext option made no difference whatsoever. For the final version of my search Bing found 15 pages that contained all of my terms but sadly nothing to do with parrots in Caversham, UK. DuckDuckGo found three documents but again no sighting of a parrot of the feathered variety in my neighbourhood.
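The kind of check described above, looking at a page to see whether it really contains every search term, can be sketched in Python. This is an illustration only: the function name and the sample text are invented, phrases must be typed with straight double quotes, and the matching is a plain case-insensitive substring test, which is far cruder than anything a search engine does.

```python
import shlex


def missing_terms(query, page_text):
    """Return the query terms (or quoted phrases) absent from page_text.

    Matching is a plain case-insensitive substring test, an assumption
    made for this sketch rather than how any search engine works.
    """
    terms = shlex.split(query)  # keeps "Heron island" together as one phrase
    text = page_text.lower()
    return [term for term in terms if term.lower() not in text]


# Invented sample text standing in for a page from the results list
sample = "Hotels near Heron Island, Caversham, Reading, UK"
print(missing_terms('parrot "Heron island" caversham UK', sample))
```

Run against the saved text of each result, an empty list would mean the page contains everything that was asked for.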

I was disappointed that my original identification of the bird seems to have been correct but extremely annoyed with Google. I had to wade through irrelevant documents and wasted time tweaking my search only to find that Google was ignoring my strategy anyway. I could understand it if my search had zero results and Google wanted to give me something, but there were some documents that did have all of my words. Various scripts that automatically add quote marks around your terms have been written since Google withdrew the + sign for general searching. These really aren’t much help because I sometimes want Google to look for variations of some of my terms and Google seems to be ignoring the quote marks when it feels like it. More reasons to look seriously at the alternative search tools that are out there.

DuckDuckGo – silly name but a neat little search tool

Fed up with Google ignoring your search terms and giving you something completely different? Confused by irrelevant tweets and postings in your results? At the recent Internet Librarian International conference in London one of my fellow participants told me that he would not mind Google collecting his search and personal information if it gave him better results but he said that it seems to make them worse. Judging by the comments from some of the other conference goers Google’s attempts at personalisation and semantic search are not always delivering what the searcher needs. There are several steps you can take to try and depersonalise your results but even then Google can still mess up the search. Perhaps it’s time to seek out a different search tool.

Yahoo is now using Bing’s database and search results for web and image search so you might just as well go straight to Microsoft’s Bing (http://www.bing.com/). The trouble is that Bing is starting to behave like Google by messing with your search terms (Bing becomes more like Google and personalises http://www.rba.co.uk/wordpress/2011/10/07/bing-becomes-more-like-google-and-personalises/). So what are the other serious alternatives? DuckDuckGo (http://duckduckgo.com/), also known as DDG, may have a silly name – it certainly put me off from using it for some time – but once you get over that it does have a lot going for it.

It has been around for a while and when it was launched one of its main selling points was that it does not track or share your search and web browsing habits, or try to personalise your results (see https://duckduckgo.com/privacy.html for more information). That’s all very well but how good are the results?

The home page is minimalist, as are most search engines’ these days.

 

As soon as you start typing  you’ll notice that there are no suggestions appearing in a drop-down menu below the search box. Some may regard that as a good thing but I do occasionally find them helpful if I am researching an unfamiliar area. In compensation DDG offers “search ideas” on the results page that make up for the absence of suggestions and related search options. The results page is clean and uncluttered with search ideas on the right hand side of the screen. You add one of the “ideas” or terms to your search simply by clicking on it, but you cannot add more than one and the search ideas disappear from subsequent results pages. The only way I can see of adding more than one is to type them into the search box yourself.

 

When you hover over an entry a “more results” link appears that finds more articles from that site and if you look at the results URL you will see that the site: command is used. There is no link to an advanced search screen but there are an incredible number of what DDG calls “Goodies”. The ones that I have found to be most useful are:

  • site: followed by a domain name –  searches for your terms within the specified site
  • inbody:  followed by your search term – looks for your term in the main part of the page
  • intitle: followed by your search term –  looks for your term in the title of  the page
  • filetype: followed by a file extension – looks for specified file formats containing your terms
  • sort:date to sort by date (uses results from Blekko)
  • region: followed by the standard two letter country code e.g. region:fr to boost pages from France

Then there are the DDG !bang commands (https://duckduckgo.com/bang.html). These automatically take you to other search engines, for example your search terms followed by !images runs an image search on Google and !videos will run a video search on Bing. Details on general syntax, keyboard short cuts and ‘tech goodies’ are at http://duckduckgo.com/goodies.html and http://duckduckgo.com/tech.html. It all looks somewhat daunting but it is worth working your way through them and drawing up your own list of what you think you might use on a regular basis. If you still find it all a bit too much to take in then use the options under the arrow next to the search box at the top of the results page. This brings up a menu of some of the more popular types of searches.
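If you do draw up your own list, a short Python sketch shows how the Goodies and a !bang sit together in one query. The helper name and its arguments are hypothetical, and the sketch assumes that DDG accepts the query in a q parameter on its home page URL.

```python
from urllib.parse import urlencode


def ddg_url(terms, bang=None, **goodies):
    """Build a DuckDuckGo search URL.

    goodies are command=value pairs such as site="rba.co.uk";
    bang is a !bang name without the exclamation mark.
    Hypothetical helper for illustration only.
    """
    parts = [terms]
    parts += [f"{name}:{value}" for name, value in goodies.items()]
    if bang:
        parts.append(f"!{bang}")
    return "https://duckduckgo.com/?" + urlencode({"q": " ".join(parts)})


image_search = ddg_url("Mapledurham watermill", bang="images")
pdf_search = ddg_url("company registers", site="rba.co.uk", filetype="pdf")
print(image_search)
print(pdf_search)
```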

DuckDuckGo Search options

For some searches DDG gives you a red box at the top of the results page containing “zero-click” information extracted from pages and DDG’s Topic Lists, for example a possible answer to your question or the result of a conversion/calculation. For a search on Mapledurham watermill it gave me a description and link to Wikipedia along with links to DDG Topic Lists for Grade II* listed buildings in Oxfordshire and museums in Oxfordshire.

DuckDuckGo zero-click info

 

My request to convert euros into pounds came up with a calculation generated by Wolfram Alpha.

DuckDuckGo Wolfram Alpha results

Would I use DuckDuckGo as my default search tool? Difficult to say at this stage. I do miss Google’s time search option but DDG’s sort:date goes some way to offset that, and I regret to say that there are times when I miss Google’s localisation and personalisation. Looking for pubs or restaurants in Reading and Caversham is so much harder in DuckDuckGo. However, I am getting a feel for the type of searches that work well on DDG and for general web searching it is a good alternative to Google and Bing. It does not play around with your search terms, supports advanced search commands and most important of all it delivers relevant results, some of which are not always in Bing or Google.

ILI 2011 web search presentations

The presentations I gave at Internet Librarian International this week in London are now available on my Advanced Search page at http://www.rba.co.uk/as/. They are also available on authorSTREAM and Slideshare.

Searching without Google

Presentation given as part of the main conference on Friday, 28th October 2011

It is also available on Slideshare at http://www.slideshare.net/KarenBlakeman/searching-without-google

Web Search Academy

This was a pre-conference workshop held on Wednesday 26th October with myself, Marydee Ojala and Arthur Weiss presenting.

Alternative Search Tools 

Please note the content of this presentation is similar to that of my main conference presentation “Searching without Google”.

Also available on Slideshare at http://www.slideshare.net/KarenBlakeman/alternative-search-tools

Visual Search

Looks at image search tools, video search engines and visualisations.

The Slideshare version is available at http://www.slideshare.net/KarenBlakeman/visual-search-9892558

Google dumps ‘+’ operator

You will either have read about this on other blogs or found out yourself when searching that Google has dumped the ‘+’ operator. This was a useful way to stop Google automatically searching for variations and synonyms of your terms. The theory was that by prefixing your term with a plus sign Google would be forced to look for an exact match. Try it now and Google tells you to use double quotation marks around the term instead. To be honest, the + sign has not worked reliably for several months but I often have the same problems with the double quotation marks. If I search on St. “Laurence” or “St. Laurence” Google still includes pages on St Lawrence in my results.

Search Engine Land has covered the news and the reason why + has been dumped in “Google Removes The + Search Command” http://searchengineland.com/google-sunsets-search-operator-98189. It suggests + has been dumped because of Google+, their social network, and Google now suggests auto-completing your friends’ names when you use the operator. As Danny Sullivan comments “it seems to have been tossed out and replaced by quotes because of a problem Google created for itself, by picking stupid names for its social network.”

I’ve noticed another worrying trend – Google does not always look for all of my terms in the page. Viewing the cached copies of some of my results I see that not only are some of my terms missing from the page itself but they are not even in links to the page. So is Google now deciding when to ‘OR’ our search terms?

Bing becomes more like Google and personalises

So you thought you could escape filtering and personalisation of search results by fleeing Google and running into the arms of Bing? Afraid not. Bing has announced that it is rolling out a new personalisation feature called adaptive search. Details are on Bing’s blog Adapting Search to You (http://www.bing.com/community/site_blogs/b/search/archive/2011/09/14/adapting-search-to-you.aspx). According to Bing the “more you search, the more Bing can learn”.

The feature is being rolled out first in the US and is cookie based. The cookie and personalisation last for 28 days if you are not signed in to Bing and 18 months if you are. You can clear and turn off your search history at any time.

Bing seems to be trying to be more and more like Google all the time. I tried one of my test searches on Hewish mild and Bing did a Google on me by unilaterally deciding to include results for Jewish mild in my results. Placing a plus sign before Hewish did force an exact match but the related searches it offered me all involved Jewish – Jewish Chronicle, Jewish jokes, Jewish festivals etc. Yahoo does exactly the same, which is not surprising since it uses the Bing database and search algorithms.

 

Build your own web naughty list on Google

Google first announced that it was introducing an option for users to exclude web sites from results in March of this year (Google lets you create your own naughty list http://www.rba.co.uk/wordpress/2011/03/12/google-lets-you-create-your-own-naughty-list/). Then it disappeared, reappeared, disappeared and then reappeared for some people and only if you were logged in to a Google account. Now it is back for everyone. Run a search, view a result and then use the back button to get back to your results list. You should now see a link next to the result offering to block all further pages from that site.

Google Blocked Sites

If you are not already logged in to a Google account you are prompted to do so.

Next time you run a search that would normally include pages from a blocked site Google displays a message at the bottom of the results offering you the options to show the blocked results or to go to ‘Manage blocked sites’ where you can unblock them altogether.

Google Blocked Sites

You can also manage your blocked sites by going to your Google account dashboard.

Google Blocked Sites

Be warned: this does not only affect your results. Google.com is using this data as part of their general search ranking algorithms “to help users find more high quality sites”. This may be extended to other countries in the future. So don’t block sites unless you really mean it. If you want to remove a site from just one particular search then use the site: command prefixed with a minus sign in your search strategy. For example -site:wikipedia.org
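For a one-off exclusion of several sites at once, the minus-prefixed site: command can simply be repeated. The Python sketch below shows the idea; the function name and the second domain are invented for illustration.

```python
def exclude_sites(query, domains):
    """Append a -site: command for each domain to leave out of one search.

    Hypothetical helper; it only builds the text to type into the search box.
    """
    exclusions = " ".join(f"-site:{domain}" for domain in domains)
    return f"{query} {exclusions}".strip()


strategy = exclude_sites("Mapledurham watermill", ["wikipedia.org", "tripadvisor.co.uk"])
print(strategy)
```

Unlike the block list, nothing here is stored against your Google account, so it affects only the search you type it into.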

The original announcement can be found at “Hide sites from anywhere in the world – Inside Search” http://insidesearch.blogspot.com/2011/09/hide-sites-from-anywhere-in-world.html

 

News and comments on search tools and electronic resources for research