
Backlink data has been part of the SEO industry since the early days of Google. However, many SEO services still haven’t made the most of their backlink data. For example, how much backlink data do you actually need? And what should you do with it once you have it? Below are some typical backlink profiles that you can learn from and follow.

It Matters How Links Are Scored

Instead of showing the details of how to use a search engine model to determine the minimum amount of backlink data, it is better to first show what backlink data needs to capture. One way is to look at a site’s PR (PageRank). In fact, for quite some time, many people have preferred to place their links on sites with a high PR score. We won’t go into how it is calculated here. Instead, think of it as a raw or gross ranking power metric for any given page on the internet. In the screenshot below, it’s called Gross Total Link Flow (on Webpage).


Moreover, you may need to pay attention to your font, as font size also plays a role in determining your link score. If all the other links on the page are in a smaller font, your link actually gets a boost in link flow.

True Link Scoring, Relevance, and N-Order Scoring

To determine how “relevant” a particular link is, you must process not only the target page but also the source page. But how do you properly score the source page with a finite set of resources? Here are two options that you can use.

  1. Brute force it: buy a lot of compute time on Amazon, and build a very large search engine that will crawl (and score) each backlink, the backlinks of those backlinks, and so on, or
  2. Find the point of diminishing returns in accuracy with regard to backlink data, reducing the data set to a sample size that can actually be used in a modern search engine model, one that calculates just like a search engine but within a reasonable time frame.

Note that the first option will give you a great deal of backlink data whose relevance to your business is still questionable.

What Does a Typical Backlink Profile Look Like?

With large amounts of data, we need to know which metric to use to sort all of the backlinks. This helps us keep the strongest subset of link data possible. Most of all, we need to know how much power each backlink distributes to the target website.

In this example, we’ll call this ranking power “Link Flow Share”. If we sort by this Link Flow Share metric, we get a distribution that, for most backlink profiles, looks like this:


You will see that the majority of this Link Flow Share is distributed from a select group of top links. While the “long tail” of Link Flow Share affects things like keyword positioning, the “head” is what affects the bulk of the Link Flow to that page.
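The head-versus-tail split can be illustrated with a short Python sketch. It is only an illustration: the function name `head_size_for_share`, the 75% target, and the sample numbers are all assumptions made up for this example, and real Link Flow Share values would come from whatever backlink tool you use.

```python
from typing import List, Tuple


def head_size_for_share(link_flow_shares: List[float],
                        target: float = 0.75) -> Tuple[int, float]:
    """Count how many of the strongest links are needed to cover `target`
    of the total Link Flow Share (hypothetical helper, not a tool API).

    Returns (number_of_links, fraction_of_total_covered).
    """
    # Sort strongest first, the same ordering the article describes.
    ranked = sorted(link_flow_shares, reverse=True)
    total = sum(ranked)
    if total <= 0:
        return 0, 0.0

    running = 0.0
    for count, share in enumerate(ranked, start=1):
        running += share
        if running / total >= target:
            return count, running / total
    return len(ranked), 1.0


# Made-up profile: a few strong links followed by a long tail.
sample_shares = [40.0, 25.0, 15.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0]
head_count, covered = head_size_for_share(sample_shares)
print(f"{head_count} of {len(sample_shares)} links cover {covered:.0%} of Link Flow Share")
```

In a profile shaped like this one, a small "head" accounts for most of the total, which is why a sample of the strongest links can stand in for the full list.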

Where is the Cutoff Point for the Sample Size?

It is important to determine the cutoff point for a given list of backlinks. At the very least, the cutoff should not cause a loss of precision or accuracy when the sample is used in a statistical modeling environment. In other words, we can measure the accuracy of a number of alternative subsets of backlink data by using our search engine model.

In the screenshot below, you can see how a search engine model incorporates a similar “Net Total Link Flow Boost” figure into its query scoring model.


Through this process, you can determine how much backlink data you need. Follow the steps below:

  1. Start with the top 10% of backlinks for a group of websites, and run that sample through your search engine model. Determine how well the model self-calibrates to reality.
  2. Next, reduce the backlink sample to 5% of the total backlinks, again determining how well the search engine model self-calibrates to reality.
  3. If the accuracy of the model drops by at least a minimum threshold percentage, then stop. You have found your cutoff point. If not, go back to step 2 and drop the subset of backlink data to 3%, 2%, 1% and so on, until you have found your cutoff point.
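The steps above can be sketched in Python as a simple loop. This is a minimal sketch under several assumptions: `find_cutoff`, `model_accuracy`, and `max_drop` are hypothetical names, the accuracy drop is measured against the 10% baseline, and the function returns the smallest sample that stayed within the threshold. The `toy_accuracy` function is only a stand-in so the example runs; in practice it would be replaced by a run of your own search engine model.

```python
from typing import Callable, List, Sequence


def find_cutoff(
    backlinks: Sequence[float],
    model_accuracy: Callable[[List[float]], float],
    fractions: Sequence[float] = (0.10, 0.05, 0.03, 0.02, 0.01),
    max_drop: float = 0.02,
) -> float:
    """Return the smallest sample fraction whose accuracy stays within
    `max_drop` of the baseline (the first fraction, 10% by default).

    `backlinks` holds one Link Flow Share value per backlink.
    `model_accuracy` is a hypothetical callback that scores a sample.
    """
    ranked = sorted(backlinks, reverse=True)

    def top(fraction: float) -> List[float]:
        # Keep at least one link so the model always has input.
        keep = max(1, round(len(ranked) * fraction))
        return ranked[:keep]

    baseline = model_accuracy(top(fractions[0]))
    cutoff = fractions[0]
    for fraction in fractions[1:]:
        accuracy = model_accuracy(top(fraction))
        if baseline - accuracy > max_drop:
            break  # accuracy fell too far: the previous fraction is the cutoff
        cutoff = fraction
    return cutoff


# Made-up profile: three strong links and a long tail of weak ones.
links = [300.0] * 3 + [1.0] * 97
total_flow = sum(links)


def toy_accuracy(sample: List[float]) -> float:
    # Stand-in for a real model run: share of total link flow in the sample.
    return sum(sample) / total_flow


print(find_cutoff(links, toy_accuracy))
```

Measuring the drop against the baseline rather than the previous step keeps small losses from adding up unnoticed. Either choice fits the steps as written, so adjust it to match how your model reports accuracy.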