Google Reader – what next?

Google has announced that as of July 1st 2013 Google Reader will be no more (A second spring of cleaning http://googleblog.blogspot.co.uk/2013/03/a-second-spring-of-cleaning.html). It comes as no surprise since I doubt Google receives much revenue from it. On the other hand there must be a wealth of information on users’ reading habits and network connections, but obviously not enough. Google cites declining use as the reason.

To be honest I have never got on with Google Reader. I spend a lot of my time travelling on trains with dodgy wifi and erratic mobile broadband connections, so I download as much as possible to my desktop when I do have a connection. My favourite RSS reader at the moment is RSSOwl (http://www.rssowl.org/). I probably don’t use all of its features to the full but it does everything I need.

It is a desktop client with no web option or apps, as far as I can see, so it will not suit many people. My second choice would probably be FeedReader (http://www.feedreader.com/). Originally only available as a desktop client, it is now online. Number three on my list is Netvibes (http://www.netvibes.com/). This wouldn’t really be suitable for me as it is a web-based service but it does offer some very neat alternative display options and has been used by many organisations to provide ‘start pages’ for their users.

I am not going to list or review all of the possible Google Reader alternatives. There are plenty of other articles that are doing that. 12 Google Reader Alternatives (http://marketingland.com/12-google-reader-alternatives-36158) is one, although the list is now down to 11 as FeedDemon, which is dependent on Google Reader, has now announced it will close as well (The End of FeedDemon, http://nick.typepad.com/blog/2013/03/the-end-of-feeddemon.html). If you are interested in exploring more alternatives a list is being compiled at https://docs.google.com/spreadsheet/lv?key=0ApTo6f5Yj1iJdFRfWmhUVjV0WkktTjJhUUE4dGR5WUE. There were 33 when I last looked.

To move your RSS feeds you now have to use the Google Takeout service (http://www.dataliberation.org/google/reader). There is no longer an option within Google Reader itself to export an OPML file. Takeout is going to be a problem for some people as it creates a zip file, which some organisations automatically block.
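If you want to check what is in the Takeout file before importing it into another reader, the feed list can be pulled out with a few lines of Python. This is only a sketch: it assumes the zip holds your subscriptions as an OPML (XML) file in which each feed is an outline element with an xmlUrl attribute, which is the usual OPML convention. The function names are my own and are not part of Takeout.

```python
import zipfile
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_bytes):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_bytes)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # folders have no xmlUrl, so they are skipped
            title = outline.get("title") or outline.get("text") or ""
            feeds.append((title, url))
    return feeds


def feeds_from_takeout(zip_path):
    """Collect feeds from every .xml or .opml member of the zip.

    Assumption: any XML file in the archive is an OPML subscription list.
    """
    feeds = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xml", ".opml")):
                feeds.extend(feeds_from_opml(archive.read(name)))
    return feeds
```

Calling feeds_from_takeout with the path of the downloaded zip returns a simple list of titles and feed URLs that can be checked or pasted elsewhere.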

The demise of Google Reader is not a problem for me as I have never used it on a regular basis. What does worry me is that Feedburner (http://feedburner.google.com/) might be next for the chop. There has been virtually no development of the service for a couple of years and in July 2012 Adsense for feeds was discontinued, an indication Google does not view it as a revenue stream. I am now actively looking for Feedburner alternatives.

The case of the disappearing press release

UK government departments and organisations frequently change their names, merge or disappear altogether. The same applies to their websites and documents held on those sites. Tracking down copies of older reports, data and superseded guidelines and regulations is becoming increasingly difficult, especially as so many sites are now being closed down. Information is supposed to be transferred to the new Gov.uk web site (http://www.gov.uk/) but historical information is in danger of vanishing altogether.

I recently needed to get back to a press release issued by the Potato Council (yes, there really is such a thing!) dated November 9, 2007. The title of the document was “Provisional Estimate of GB Potato Supply for 2007” and I had the original URL in my notes. The URL is no longer on the Potato Council’s web site and searching the site failed to turn up the document. Searching the Potato Council’s web site using the Google site: command also failed to find it. I next ran the URL through Google, Bing and DuckDuckGo and found 2 references to it in research papers but not the press release itself.

As I had the URL my next stop was the Internet Archive Wayback Machine (http://www.archive.org/) but the archive found nothing. The Wayback Machine periodically takes snapshots of web sites and lets you browse those copies by date. You can enter the URL of a home page or an individual page. The snapshots are not taken every time a website changes so there are gaps in its coverage, and a page or document can be missed. Hoping that the URL might have changed at some point I browsed copies of the Potato Council’s site for late 2007 and early 2008, but no joy.
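If you have a list of old URLs to check, the lookup can be scripted rather than typed in one at a time. The sketch below uses the Wayback Machine's availability endpoint, which returns the snapshot closest to a given date as JSON. The function names and the example.org address in the usage note are placeholders of mine, not the real press release URL, and the response fields are those I understand the service to return.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup URL; timestamp is YYYYMMDD (or just YYYY)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (snapshot URL, timestamp) from the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None  # nothing archived, as happened with my press release


def lookup(page_url, timestamp=None):
    """Query the live service (needs a network connection)."""
    with urlopen(availability_query(page_url, timestamp)) as response:
        return closest_snapshot(response.read().decode("utf-8"))
```

For example, lookup("http://example.org/press-release.pdf", "20071109") asks for the copy nearest to 9 November 2007. A result of None means only that this exact URL was never captured, so it is still worth browsing snapshots of the home page by hand.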

Next I tried the UK Government Web Archive at the National Archives (http://www.nationalarchives.gov.uk/webarchive/). This is similar to the Wayback Machine but concentrates on UK government sites and related official bodies. One of the options is to browse the A-Z directory. I found fewer archive copies than in the Wayback Machine but hoped that the one entry for 2008 might come up trumps. Unfortunately it did not.

Archive copies of the Potato Council web site

Another possibility was that Zanran (http://www.zanran.com/) might have a copy. Zanran concentrates on indexing and searching information contained in charts, graphs and tables of data. It archives copies of the documents and I have used it several times to track down information that has been removed from the live web. A search on potato supply estimate UK 2007 came up with a list of results with my document at the top.

Zanran search result

At first glance, it does not appear to match the document I am looking for because the title is different. The titles listed by Zanran are not always those of the whole document but the labels or captions associated with the individual charts and tables. If you hover over the thumbnail to the left of the entry you can see a preview of a much larger section to make sure you have the right document. Clicking on the thumbnail or title will usually take you to Zanran’s archive copy.

Had I not found the press release on Zanran, I would next have contacted the Potato Council. My experience, though, is that very few organisations are able or willing to supply older documents such as press releases. My last resort would have been to contact the authors of the two papers I had found via Google to see if they had kept copies.

I usually keep copies of all papers and pages that I use as part of my research on major projects but inevitably there are times when I forget. As demonstrated above, there are several tools that can be used to try and track down documents that have disappeared from the web but success is not guaranteed.

Top tips on search and business information

Yesterday, I was in Manchester leading a workshop on search techniques and business information. As well as looking at sources of information we went through some advanced search techniques, so the top tips that the participants suggested at the end are an interesting mix of business sites and search commands.

1. DomainTools http://www.domaintools.com/
If you want to find out who is behind a web site, try DomainTools. Type in the URL of the web site under the Whois Lookup tab and DomainTools will look for details of who owns it in the domain name registries. However, if the owner of the site really does not want to be identified they may hide behind an agent or a service such as http://privacyprotect.org/.

2. Personalise Google News and web search for location
Personalisation of search results is not always a bad thing. Google tries to work out your location from your IP address but it does sometimes get it wrong. Or you may want to specify a more precise location so that Google gives priority to content more directly relevant to you. Run a web search and then click on the cog wheel in the upper right hand area of the screen. Select ‘Search settings’ from the drop down menu. On the next page select ‘Location’ from the menu on the left hand side and enter a location in the box provided.

In Google News, click on the cog wheel in the upper right hand area of the page. You should then see options on the right hand side for personalising topics and below those an ‘Advanced’ link. Click on the link and on the next page go to the ‘Create a custom section’ on the right hand side of the screen. Under ‘Add a local section’ you can enter a town, city or post code.

3. Company Check and Company Director Check
http://companycheck.co.uk/ and http://company-director-check.co.uk/
These related sites repackage data from Companies House and offer access to a lot of it free of charge, although you will have to register to access some of it. Company Check provides 5 years of figures and graphs for Cash at Bank, Net Worth, Total Current Liabilities and Total Current Assets. You can also download accounts, and monitor a company for financial changes or for when new accounts are filed. The directors are listed and you can click through on a name to view their record on Company Director Check and see details of current and past directorships. Credit and risk reports are priced.

4. Zanran http://zanran.com/
This is a tool for searching information contained in charts, graphs and tables of data and within formatted documents such as PDFs, Excel spreadsheets and images. Enter your search terms and optionally limit your search by date and/or format type. Zanran comes up with a list of documents that match your criteria with thumbnails to the left of each entry. Hover over the thumbnail to see a preview of the page containing your data and further information on the document. If you click on the title to view the whole document you may have to register (free of charge) as the title link sometimes takes you to copies of the indexed documents that are stored on Zanran. If you prefer to go to the original document click on the URL button next to the summary of the page in the results and click on the link that is then revealed. Unfortunately, you may see “page not found”, especially if it is on a UK government department web site. Many of these have now been closed and their content archived, making it difficult to track down the document.

5. intext:
Google’s automatic synonym search can be helpful in looking for alternative terms but if you want just one term to be included in your search exactly as you typed it in then prefix the word with intext:. For example UK public transport intext:biodiesel. It also stops Google dropping that term from the search if it thinks the number of results is too low. 

6. filetype: to search for document formats or types of information
For example PowerPoint for experts or presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports. Include filetype immediately followed by a colon (:) immediately followed by the file extension in your search strategy.

For example

waste vegetable oil energy generation filetype:pptx

Note that filetype:ppt will not pick up the newer .pptx so you will need to run searches on both. You will also need to look for .xlsx if you are searching for Excel spreadsheets and .docx for Word documents. The Advanced Search screen file type box does not search for the newer Microsoft Office extensions.
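Because each Microsoft Office format needs two searches, it can help to generate the pair of search strings rather than retype them. The following is a minimal sketch: the function name and the optional exact_word argument (which adds the intext: prefix from tip 5) are my own inventions for illustration, and it only builds the text to paste into the search box.

```python
# Older and newer extensions for the same Microsoft Office format.
OFFICE_FAMILIES = [("ppt", "pptx"), ("xls", "xlsx"), ("doc", "docx")]


def filetype_queries(terms, extension, exact_word=None):
    """Return one search string per extension in the same format family."""
    ext = extension.lower().lstrip(".")
    # Formats outside the Office families (e.g. pdf) need a single search.
    family = next((f for f in OFFICE_FAMILIES if ext in f), (ext,))
    base = terms.strip()
    if exact_word:
        base += " intext:" + exact_word
    return [f"{base} filetype:{e}" for e in family]


for query in filetype_queries("waste vegetable oil energy generation", "ppt"):
    print(query)
```

Asking for ppt or pptx gives both searches, while a format such as pdf gives just one.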

7. Google Finance for historical share prices
https://www.google.co.uk/finance
As well as viewing historical graphs for share prices you can download the data as a spreadsheet. The data goes back to 1999 but you can only download one year at a time. You can change the date ranges in the boxes above the table on the Historical prices page. You can also specify a much shorter time span than a year, or put the same date in both boxes if you want a price for just one particular day. To access the data, first search for your stock on the Google Finance home page. Then from the menu on the left hand side of the screen select ‘Historical prices’.
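If you need several years of prices, the one-year limit means working out a series of date ranges to type into the two boxes. A short sketch of that arithmetic is below. It assumes that a range running from one date to the day before its anniversary counts as "one year"; the function names are mine, and the code only calculates dates and does not contact Google Finance.

```python
from datetime import date, timedelta


def add_one_year(day):
    """Same day next year; 29 February falls back to 28 February."""
    try:
        return day.replace(year=day.year + 1)
    except ValueError:
        return day.replace(year=day.year + 1, day=28)


def yearly_ranges(start, end):
    """Split start..end (inclusive) into ranges of at most one year."""
    if start > end:
        raise ValueError("start date must not be after end date")
    ranges = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(add_one_year(chunk_start) - timedelta(days=1), end)
        ranges.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return ranges


for first, last in yearly_ranges(date(2010, 1, 1), date(2012, 6, 30)):
    print(first.isoformat(), "to", last.isoformat())
```

Each pair of dates is one download, and the resulting spreadsheets can then be joined together.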

8. Verbatim
Google automatically looks for variations of your terms and no longer looks for all of your terms in a document. If you want Google to run your search exactly as you have typed it in, click on ‘Search tools’ in the menu above your results. A second menu will then appear. Click on ‘All results’ and then ‘Verbatim’ at the bottom of the drop down menu.

9. ‘Clear’ your search options when you start a new search
If you use the menus above your results to refine your search, for example by using Verbatim or Translated foreign pages, use the ‘Clear’ option to return to the Google default. Otherwise your choices will be applied to the next search.

10. Disappearing sites and documents
Web sites close down, documents are deleted, and industry guidelines and standards are superseded. If you know the URL of where the document or page used to be try the Internet Archive Wayback Machine (http://archive.org/). Type in the URL of the page or document in the box next to the ‘Take Me Back’ button and click on the button. If it is in the database you should then see a calendar showing the snapshots and dates that are available. For UK government web sites a similar service is available at http://www.nationalarchives.gov.uk/webarchive/.

If you do not have an old URL or even a title of the document then it is time to start hunting around to see if it has been archived by a different web site. One of the workshop participants gave the example of trying to track down old engineering specifications and lapsed industry guidelines on deep sea oil exploration. The standard Google search techniques were not working. Thinking that Norwegian oil companies have a lot of expertise in this area, they changed the strategy to searching Norwegian web pages and used Google’s ‘Translated foreign pages’ search option (click on ‘Search tools’ in the menu above the results, then ‘All results’ and select ‘Translated foreign pages’). Archive copies of the original documents, which were in English, were found!

How search works – sort of

Google has put together a site showing how Google search works (http://www.google.com/insidesearch/howsearchworks/thestory/). The main page is a scrolling animated graphic that just gives you some elementary facts but there are links to more detailed information and videos on the main topics of crawling and indexing, the searching and ranking algorithms, fighting spam and Google’s general policies. They are a useful set of pages for anyone who does not already know the basics of how Google works, but if you are looking for something that tells you how to get sensible results from Google you’ll be disappointed. As Phil Bradley says:

“…. boils down to ‘we find some stuff, do magic to it, filter out the crap that our magic didn’t get and then give it to you.’ Yes folks, an entire site to say that. Wasted opportunity.”