Scraping Top Pages in GWT

Set the Parameters

  • Select Report Type = Top Pages.
  • Set Date Range with “Start Date” and “End Date”.
  • Select Filter = {All | Web | Mobile} – disregard “Location” and “Traffic” options.
  • Set Rows to display = total number of Top Pages by editing the “grid.s” parameter in the URL.
  • In practice, we generate 3 datasets (Filter = {All | Web | Mobile}) for a given Date Range.
  • The most important single dataset is Filter = Web.
  • The comparison of Filter = Web vs. Filter = Mobile may also be of interest – if so, use Filter = All to generate the exhaustive list of Queries to support this comparison.
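
Rather than editing “grid.s” by hand each time, the report URL can be assembled with a few lines of Python. This is only a sketch: the parameter names are copied from the sample URL in the macro further down this post, reading “db” as the Start Date and “de” as the End Date is an inference from the values in that sample, and build_report_url is a made-up helper name. Only prop=ALL appears in the sample, so the values for the Web and Mobile filters are left to the caller.

```python
from urllib.parse import urlencode

# Base address taken from the sample macro URL in this post.
BASE = "https://www.google.com/webmasters/tools/top-search-queries"


def build_report_url(site_url, start_date, end_date, rows, prop="ALL"):
    """Assemble a Top Pages report URL (hypothetical helper).

    Assumptions: "db" is the Start Date, "de" is the End Date, and
    "prop" carries the Filter. Only prop=ALL is shown in the post.
    """
    params = {
        "hl": "en",
        "siteUrl": site_url,
        "de": end_date,      # assumed: End Date, yyyymmdd
        "db": start_date,    # assumed: Start Date, yyyymmdd
        "qv": "change",
        "type": "urls",      # copied from the sample URL
        "grid.s": rows,      # Rows to display
        "prop": prop,
        "region": "",
        "more": "true",
    }
    return BASE + "?" + urlencode(params)


print(build_report_url("http://www.site.com/", "20131201", "20131207", 500))
```

With those arguments the helper reproduces the sample URL used in scrape-tp.iim, so changing only the rows argument is equivalent to editing “grid.s” in the address bar.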

Download two datasets and save as .csv files:

  • Click on “Download This Table” – save as dtt-tp.csv.
  • Click on “Download Chart Data” – save as dcd-tp.csv.

Import the .csv files to Excel

  • Import dtt-tp.csv into Excel as comma-delimited UTF-8 data and save as <start date>-<end date>-<filter>-dtt-tp.xls, e.g. 20140415-20140428-web-dtt-tp.xls.
  • Do the same for dcd-tp.csv and save as <start date>-<end date>-<filter>-dcd-tp.xls.

Generate iMacros script to Blow-Out Queries

  • Copy and paste the page source code for a given Profile into source.txt.
  • Using PowerGrep, rewrite source.txt as iMacros TAG commands:
    PowerGrep - Settings for Collect data from GWT source code
    Search for (getElementById)(\u0028\u0022)([^\u0022]+) and replace with (or collect to) TAG POS=1 TYPE=A ATTR=ID:\3
  • Copy and paste TAG POS lines from source.txt into iMacros macro scrape-tp.iim:
    VERSION BUILD=8810214 RECORDER=FX
    TAB T=1
    URL GOTO=https://www.google.com/webmasters/tools/top-search-queries?hl=en&siteUrl=http%3A%2F%2Fwww.site.com%2F&de=20131207&db=20131201&qv=change&type=urls&grid.s=500&prop=ALL&region=&more=true
    TAG POS=1 TYPE=A ATTR=ID:ud_4C057013_C11A7450_F2B4C340
    TAG POS=1 TYPE=A ATTR=ID:ud_7BB0D7E1_944C001F_71BB9CF7
    ...
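
If PowerGrep is not to hand, the same rewrite can be sketched in Python. The pattern below is the PowerGrep search written out literally (\u0028 is “(” and \u0022 is the double quote), and the output line follows the TAG format used in the macro. The function name tag_lines and the sample source string are made up for illustration; the real page source may differ.

```python
import re

# Same search as the PowerGrep pattern: capture the ID passed to getElementById("...").
ID_PATTERN = re.compile(r'getElementById\("([^"]+)')


def tag_lines(source):
    """Return one iMacros TAG command per getElementById("...") call found."""
    return [f"TAG POS=1 TYPE=A ATTR=ID:{element_id}"
            for element_id in ID_PATTERN.findall(source)]


# Hypothetical fragment of page source, for illustration only.
sample = ('a=document.getElementById("ud_4C057013_C11A7450_F2B4C340");'
          'b=document.getElementById("ud_7BB0D7E1_944C001F_71BB9CF7");')

for line in tag_lines(sample):
    print(line)
```

The printed lines can then be pasted into scrape-tp.iim below the URL GOTO line.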

Blow-Out & Scrape the “Top Pages” Queries

  • Run scrape-tp.iim to blow-out the queries for Top Pages.
  • Select, copy and paste the blown-out on-page table of queries into Excel and save as (say) <start date>-<end date>-<filter>-scr-tp.xls.
  • Note that the “Change” data is available for the previous 30 days only.
    Page           Impressions  Change  Clicks  Change  CTR  Change  Avg. position  Change
    www.site.com/  92,789       0%      10,536  -8%     11%  -1      4.2            -0.2
      query01      62,674               7,742
      query01      5,146                723
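
If the pasted table is kept as plain text, the queries can be grouped under their page with a small parser. This is a sketch that assumes the layout of the sample above: a page row (the URL alone or followed by its metrics), then query rows that end in two numbers, Impressions and Clicks. The name parse_scrape is hypothetical, and a real export may need extra handling.

```python
import re

# A query row ends in exactly two integer columns: Impressions and Clicks.
QUERY_ROW = re.compile(r"^(?P<query>.+?)\s+(?P<impressions>[\d,]+)\s+(?P<clicks>[\d,]+)$")
# A line starting with a digit or sign is a page's metrics spilled onto its own line.
NUMERIC_START = re.compile(r"^[-+]?\d")


def to_int(text):
    """Convert '62,674' to 62674."""
    return int(text.replace(",", ""))


def parse_scrape(text):
    """Map each page to a list of (query, impressions, clicks) tuples."""
    pages = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Page "):
            continue  # skip blank lines and the header row
        match = QUERY_ROW.match(line)
        if match and current is not None:
            pages[current].append((match["query"],
                                   to_int(match["impressions"]),
                                   to_int(match["clicks"])))
        elif NUMERIC_START.match(line):
            continue  # page-level metrics on their own line
        else:
            current = line.split()[0]  # first token is the page URL
            pages.setdefault(current, [])
    return pages


sample = """Page Impressions Change Clicks Change CTR Change Avg. position Change
www.site.com/
92,789 0% 10,536 -8% 11% -1 4.2 -0.2
query01 62,674 7,742
query01 5,146 723
"""
print(parse_scrape(sample))
```

The page-level metrics are skipped here because they are already available in the “Download This Table” file.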

For the other Filters defined in Step 1, repeat Steps 2 – 5.

Leaving us with …

For each of the 3 Filters defined in Step 1, we have generated 3 Excel files:

  • <start date>-<end date>-<filter>-dtt-tp.xls – from “Download This Table” – Step 3
  • <start date>-<end date>-<filter>-dcd-tp.xls – from “Download Chart Data” – Step 3
  • <start date>-<end date>-<filter>-scr-tp.xls – from scraping “Top Pages” table on-page - Step 5
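
As a quick check that nothing was missed, the nine expected file names for a Date Range can be listed with a short Python sketch. The lowercase filter labels follow the “web” example earlier in this post; “all” and “mobile” are assumed by analogy, and expected_files is a hypothetical helper name.

```python
# Filter labels: "web" appears in the post's example; the other two are assumed.
FILTERS = ("all", "web", "mobile")
# dtt = Download This Table, dcd = Download Chart Data, scr = on-page scrape.
KINDS = ("dtt", "dcd", "scr")


def expected_files(start_date, end_date):
    """List the 3 x 3 Excel file names expected for one Date Range."""
    return [f"{start_date}-{end_date}-{flt}-{kind}-tp.xls"
            for flt in FILTERS
            for kind in KINDS]


for name in expected_files("20140415", "20140428"):
    print(name)
```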

See my video of Steps 1 and 2: WMT – Setting Parameters.
For comparing Filter-specific datasets, see Merging GWT “Top Queries” for Filter = (All | Web | Mobile).

It’s possible to compare datasets over time where “Change” data is available in some datasets but not in others, but this is a hassle. Our tools will therefore compare datasets that all include the Change data or that all omit it, but not a mix of the two.

allenpg
