Download search queries data using Python – 20111222

http://googlewebmastercentral.blogspot.ca/2011/12/download-search-queries-data-using.html

For all the developers who have expressed interest in getting programmatic access to the search queries data for their sites in Webmaster Tools, we’ve got some good news. You can now get access to your search queries data in CSV format using an open source Python script from the webmaster-tools-downloads project. Search queries data is not currently available via the Webmaster Tools API; that has been a common request from API users, and we’re considering it for the next API update. For those of you who need access to search queries data right now, let’s look at an example of how the search queries downloader Python script can be used to download your search queries data and upload it to a Google Spreadsheet in Google Docs.

Example usage of the search queries downloader Python script
1) If Python is not already installed on your machine, download and install Python.
2) Download and install the Google Data APIs Python Client Library.
3) Create a folder and add the downloader.py script to the newly created folder.
4) Copy the example-create-spreadsheet.py script to the same folder as downloader.py and edit it to replace the example values for “website,” “email” and “password” with valid values for your Webmaster Tools verified site.
5) Open a Terminal window and run the example-create-spreadsheet.py script by entering “python example-create-spreadsheet.py” at the Terminal window command line:

python example-create-spreadsheet.py

6) Visit Google Docs to see a new spreadsheet containing your search queries data.
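For reference, the values edited in step 4 are plain Python variables near the top of the example script. This sketch only shows the shape of that edit; the exact variable names may differ in the actual script:

```python
# Hypothetical configuration block from example-create-spreadsheet.py.
# Replace each placeholder with values for your verified site.
website = "https://www.example.com/"  # must match your verified site in Webmaster Tools
email = "user@example.com"            # Google Account email
password = "your-password"            # see the security note on credentials below
```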


If you just want to download your search queries data in a .csv file without uploading the data to a Google spreadsheet, use example-simple-download.py instead of example-create-spreadsheet.py in the example above.

You could easily configure these scripts to run daily or monthly, for example by setting up a cron job or using Windows Task Scheduler, to archive your search queries data and view it across larger date ranges than the single month of data currently available in Webmaster Tools.
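One way to make such scheduled runs archive-friendly is to write each download to a dated filename, so that runs never overwrite each other. This is a minimal sketch; the helper name and prefix are illustrative, not part of the downloader script:

```python
from datetime import date

def archive_filename(prefix="TOP_QUERIES", day=None):
    """Build a dated CSV filename, e.g. TOP_QUERIES-2011-12-22.csv."""
    day = day or date.today()
    return "%s-%s.csv" % (prefix, day.isoformat())
```

A daily cron job or scheduled task could then move each downloaded CSV to such a dated name before the next run.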

An important point to note is that this script example includes user name and password credentials within the script itself. If you plan to run this in a production environment you should follow security best practices like using encrypted user credentials retrieved from a secure data storage source. The script itself uses HTTPS to communicate with the API to protect these credentials.
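As a minimal sketch of that practice, the credentials could at least be read from environment variables instead of being written into the script. The WMT_EMAIL and WMT_PASSWORD names are assumptions for illustration, not part of the script:

```python
import os

def load_credentials():
    """Read Webmaster Tools credentials from the environment rather than
    hardcoding them in the script; raises if either variable is missing."""
    email = os.environ.get("WMT_EMAIL")
    password = os.environ.get("WMT_PASSWORD")
    if not email or not password:
        raise RuntimeError("Set WMT_EMAIL and WMT_PASSWORD before running")
    return email, password
```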

Take a look at the search queries downloader script and start using search queries data in your own scripts or tools. Let us know if you have questions or feedback in the Webmaster Help Forum.

Written by , Webmaster Trends Analyst

Barry Wise said…

Excellent – been waiting for this. Just curious to know why Google always seems to prefer Python over PHP.

DECEMBER 23, 2011 AT 4:25 AM

Javier said…

Schedule in Windows 7  http://windows.microsoft.com/en-US/windows7/schedule-a-task

DECEMBER 23, 2011 AT 8:30 AM

Dan DeVeney said…

Really great addition guys. Thanks! As an FYI to everyone else, I’ve found you can change the selected_downloads in the script from “TOP_QUERIES” to “TOP_PAGES” to pull your top pages report instead.
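The tweak described above amounts to a one-line change in the example script; the variable name here is taken from the comment, so verify it against your copy of the script:

```python
# Pull the top pages report instead of top search queries
# (variable name per the comment above; check your copy of the script).
selected_downloads = ["TOP_PAGES"]  # was ["TOP_QUERIES"]
```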

DECEMBER 23, 2011 AT 11:14 AM

Frank said…

For those who’d like to use this to download the data for all their sites at once, I’ve changed the scripts somewhat: http://www.webkruscht.com/2011/downloading-data-from-google-webmaster-tools

DECEMBER 24, 2011 AT 8:22 AM

Thomas Hey’l said…

This is really a great Christmas present. With a little help from JohnMu I’ve been able to extend those scripts so their output becomes configurable with dates, regions, query types, pages or queries, etc.
Please add the mapping for TOP_QUERY_CHART and TOP_PAGES_CHART to the JSON list of downloadable files – this would really improve the tool.

DECEMBER 25, 2011 AT 7:32 AM

laborant said…

Hi, I’m stuck at 4), where example-create-spreadsheet.py is executed. There I get this output:
Traceback (most recent call last):
File “C:\Python27\Lib\gdata-2.0.15\test2\exa
in
downloader.DoDownload(website, selected_do
File “C:\Python27\Lib\gdata-2.0.15\test2\dow
d
sites_json = json.loads(available)
File “C:\Python27\lib\json\__init__.py”, lin
return _default_decoder.decode(s)
File “C:\Python27\lib\json\decoder.py”, line
obj, end = self.raw_decode(s, idx=_w(s, 0)
File “C:\Python27\lib\json\decoder.py”, line
raise ValueError(“No JSON object could be
ValueError: No JSON object could be decoded

The gdata Python client works; I’ve tested it with the example code from the installation HowTo. I’ve modified email, password and website.
Does anyone have an idea?

DECEMBER 31, 2011 AT 10:42 PM

Steve said…

If you’re looking for PHP support: I wrote something up here: http://code.google.com/p/php-webmaster-tools-downloads/source/browse/

JANUARY 26, 2012 AT 4:39 AM

carinth said…

I was considering using Google and your associates for web publishing and investing. However, since you have now abandoned Democracy and taken the Communist view of Corporate Might is Right by going along with an illegal bill which promotes regression, monopolization, subjugation and dictatorship, I hope you guys crash and burn for what you have done.

JANUARY 29, 2012 AT 5:18 AM

Unknown said…

Seems the script is no longer working for External Links and other features. Looks like the _GetDownloadList function returns just 2 or 3 downloadable URLs…

JULY 5, 2012 AT 11:57 AM

Michael Stitt said…

Can you point me to the documentation for tweaking these files? For example, this gives the top search queries for the past 30 days, but what if I only want data for the past week? Or yesterday? Thanks for the article!

MARCH 5, 2013 AT 2:15 PM

Michael Stitt said…

Nice article! I realize you wrote it over 2 years ago but the scripts still work. Can you point me to any documentation showing how to change the date range? For example, what if I only want top queries for the past week, or day?

JULY 3, 2013 AT 2:30 AM

Aaron said…

I was able to customize the date range by editing downloader.py and changing “url = self._GetFullUrl(path)” to “url = self._GetFullUrl(path + ‘&prop=WEB&region&db=20130801&de=20130807&more=true&format=csv’)”, where “db” is the beginning date and “de” is the end date.
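The edit described above can be packaged as a small helper. The parameter names (prop, db, de) come from that comment and the endpoint itself is undocumented, so treat this purely as a sketch:

```python
def dated_download_path(path, begin, end):
    """Append the date-range parameters described above to a download path.
    begin/end are YYYYMMDD strings, e.g. "20130801"."""
    return path + "&prop=WEB&region&db=%s&de=%s&more=true&format=csv" % (begin, end)
```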

SEPTEMBER 27, 2013 AT 3:49 PM

Matt said…

Got it to work and it works like a charm. Can anyone point me in the right direction on how to set it up so that it creates automatic monthly exports?

OCTOBER 1, 2013 AT 5:13 AM

Clayton Sheppard said…

Does this work with python33?

OCTOBER 17, 2013 AT 11:09 AM

Google Webmaster Central said…

Hi everyone,

Since over a year has passed since we published this post, we’re closing the comments to help us focus on the work ahead. If you still have a question or comment you’d like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team
