SEMRush is a search marketing tool that SEOs can use to get all kinds of keyword and competitor data.
This data can then be used to inform a website’s content strategy. For a newish website, the holy grail is finding keywords that have a high volume (average number of monthly searches) and a low difficulty (a measure of how hard it will be to rank for the keyword on the first two pages of Google).
Of course, this raw data will only be useful if it is accurate.
For my own portfolio of websites, I wanted to get an idea of how accurate the keyword volume data on SEMRush is. By doing this, I could (if necessary) adjust the values accordingly and get a more complete picture of the traffic I could expect were I to rank a particular keyword.
To test this, I took a small sample of ten keywords whose monthly search volume I already knew (fairly reliably) and compared those figures against what SEMRush reported as the monthly search volume.
The data source for the ‘reliable’ volume data was the impressions shown in Google Search Console (GSC) for a variety of keywords across different niches. I used a mixture of medium and low volume keywords that my websites had ranked in the top 3 for over 12 months. The figure used was the monthly average over a 12-month period, to keep it as close as possible to SEMRush’s data.
The table below shows the GSC volume data and SEMRush volume data for each of the ten keywords used along with the percentage error. Of course, I have no wish to share keywords that I rank for with the world so they have simply been named KW1, KW2 etc.
The primary takeaway from this research is that SEMRush volume data is a guesstimate. However, the guesstimates are (with the exception of KW9) within 20% of the GSC figure in either direction.
This is useful because it can inform my future keyword research. For example, if I see a keyword in SEMRush with a search volume of 1,000, I will now assume the true figure is somewhere between 800 and 1,200. Similarly, I will now consider a keyword with 20,000 searches to be in the range of 16,000 to 24,000.
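For anyone who wants to repeat the arithmetic on their own keywords, here is a minimal sketch in Python. The function names are my own for illustration, and the error formula (SEMRush figure minus GSC figure, divided by the GSC figure) is an assumption about how a percentage error would normally be calculated, not something taken from either tool.

```python
from typing import Tuple


def percent_error(reported: float, actual: float) -> float:
    """Signed percentage error of a reported volume against a reference volume.

    Assumption: the reference (e.g. GSC impressions) is treated as the true value.
    A positive result means the reported figure is an overestimate.
    """
    if actual <= 0:
        raise ValueError("actual volume must be positive")
    return (reported - actual) * 100 / actual


def adjusted_range(reported: float, margin: float = 0.20) -> Tuple[int, int]:
    """Plausible (low, high) range around a reported volume.

    The default margin of 0.20 reflects the roughly 20% spread described above.
    """
    low = round(reported * (1 - margin))
    high = round(reported * (1 + margin))
    return low, high


# The two examples from the text above.
print(adjusted_range(1_000))   # (800, 1200)
print(adjusted_range(20_000))  # (16000, 24000)

# Hypothetical figures, not taken from the study: SEMRush says 1,200, GSC says 1,000.
print(percent_error(1_200, 1_000))  # 20.0
```

The margin is a parameter rather than a fixed value so that it can be changed if a larger sample points to a different spread.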
KW9 is the outlier with an error of 97%. It is interesting to note that this was the longest-tail keyword I used. This suggests that SEMRush’s database is incomplete and that there are a lot of potentially lucrative keywords (perhaps 10%, going by this small sample) that cannot be discovered using the SEMRush tool.
This supports my own anecdotal evidence that one of the best places to discover potentially valuable keywords is your own search console. This is something I have been doing for years to easily rank websites for keywords that very few others really know exist!
Of course, there are several limitations to this study. Only a very small sample of data was used and the keywords were limited to those of my own websites’ niches. Better conclusions could perhaps be drawn from a larger sample of keywords across a larger cross-section of websites.
Maybe I will do this at a later date…