The Blogosphere - How pingbacks work

03 May 2019 - tsp
Last update 03 May 2019
Reading time 4 mins

Pingbacks are one of the core mechanisms for automated social interconnection between blog pages in the blogosphere. They are formally specified at http://www.hixie.ch/specs/pingback/pingback. Pingback offers an automated way to notify every page linked from a blog entry that it has been linked. The notified page can then verify that the link really exists, fetch some metadata about the linking page and, if it wants to, add that page to a list of pages referencing it (for example in a "this article is mentioned by" section or similar).

Pingback is normally implemented by your content management system or your static site generator. It processes your blog posts or webpage content, optionally checks each link against a blacklist (or a list of previously pinged resources), fetches the remote webpage and checks for either an HTTP header (X-Pingback) or a meta information element <link rel="pingback" href="...">. Either mechanism may be used, although the HTTP header is strongly encouraged wherever possible. Either way, the value references an XML-RPC API endpoint to which the pingback is delivered.
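The discovery step can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name discover_pingback_endpoint and the helper class are made up for this example, the response headers and body are assumed to have been fetched already, and the HTML matching is more lenient than what the specification describes.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _PingbackLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="pingback"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.endpoint: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.endpoint is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "pingback" in rel_values and attributes.get("href"):
            self.endpoint = attributes["href"]


def discover_pingback_endpoint(headers: Mapping[str, str], html: str) -> Optional[str]:
    """Return the XML-RPC endpoint advertised by a page, or None.

    Hypothetical helper: `headers` and `html` come from a GET request
    to the linked page that the caller has already performed.
    """
    # The HTTP header wins; header names are case-insensitive.
    for name, value in headers.items():
        if name.lower() == "x-pingback" and value.strip():
            return value.strip()
    # Fall back to the <link> element in the document.
    parser = _PingbackLinkParser()
    parser.feed(html)
    return parser.endpoint
```

Checking the header first means a client does not have to parse the body at all when the server advertises its endpoint via X-Pingback.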

The XML-RPC API endpoint provides a single interface for the pingback.ping function. This function has the signature

String pingback.ping(String sourceURI, String targetURI)

The parameter sourceURI is the URI of the post that contains the link to the target site; targetURI is the absolute URI of the target of the link as given on the source page. Note that the endpoint may have to normalize these URIs before they can be compared easily.
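What such a normalization might look like is sketched below. The rules chosen here (lowercase scheme and host, drop default ports, treat an empty path as "/", drop the fragment and any user info) are assumptions made for this example and are not mandated by the pingback specification; normalize_uri is a hypothetical name.

```python
from urllib.parse import urlsplit, urlunsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_uri(uri: str) -> str:
    """Bring a URI into a canonical form so two spellings compare equal."""
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:
        # IPv6 literals need their brackets back.
        host = f"[{host}]"
    port = parts.port
    if port is None or port == _DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # Assumption of this sketch: the fragment is ignored for comparison.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

With these rules, "HTTP://Example.ORG:80/post#comments" and "http://example.org/post" normalize to the same string.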

If the pingback is successful, the server has to respond with a single string containing as much information as the server deems useful (this is expected to be used for debugging purposes). In case of a failure the server must respond with an RPC fault value (see below). Clients should not show the response of a successful request to the user.

0: Generic error code.
0x0010: The source URI does not exist
0x0011: The source URI does not contain a link to the target URI
0x0020: The specified target URI does not exist. This must only be used when the target definitely does not exist rather than when the target may exist but is not recognised.
0x0021: The specified target URI cannot be used as a target (it doesn’t exist or is not pingback enabled). On blogs typically only permalinks are pingback enabled; pages like the home page or an index page will normally fail
0x0030: The pingback has already been registered
0x0031: Access denied
0x0032: The server could not communicate with an upstream server or received an error from an upstream server.
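In code it can be convenient to keep these fault codes in one place. The following is a minimal sketch; the enum and its member names are invented for this example, only the numeric values come from the list above.

```python
from enum import IntEnum
from xmlrpc.client import Fault


class PingbackFault(IntEnum):
    """Fault codes from the list above (member names are made up here)."""
    GENERIC = 0
    SOURCE_NOT_FOUND = 0x0010
    NO_LINK_TO_TARGET = 0x0011
    TARGET_NOT_FOUND = 0x0020
    TARGET_NOT_USABLE = 0x0021
    ALREADY_REGISTERED = 0x0030
    ACCESS_DENIED = 0x0031
    UPSTREAM_ERROR = 0x0032


def make_fault(code: PingbackFault, message: str) -> Fault:
    """Build an xmlrpc.client.Fault that an XML-RPC server can raise."""
    return Fault(int(code), message)
```

Note that the codes travel as plain integers on the wire, so 0x0011 shows up as 17 in a fault response.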

The server may do whatever it wishes with the request. The recommended behaviour is:

The server may attempt to fetch the source URI to verify that the source does indeed link to the target
The server may check its own data to ensure that the target exists and is a valid entry
The server may check that the pingback has not already been registered
The server may record the pingback
The server may regenerate the site’s pages
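The recommended steps could be wired together roughly as follows. This is a sketch under stated assumptions, not a reference implementation: handle_ping, the fetch callback, and the in-memory sets are hypothetical, the link check is a naive substring test instead of real HTML parsing, and page regeneration is left out.

```python
from typing import Callable, Optional, Set, Tuple
from xmlrpc.client import Fault


def handle_ping(
    source_uri: str,
    target_uri: str,
    fetch: Callable[[str], Optional[str]],
    known_targets: Set[str],
    registered: Set[Tuple[str, str]],
) -> str:
    """Process one pingback.ping call and return the success string.

    `fetch` is assumed to return the page body, or None if the
    source cannot be retrieved.
    """
    # Check our own data first; this needs no network access.
    if target_uri not in known_targets:
        raise Fault(0x0021, "Target cannot be used as a pingback target")
    if (source_uri, target_uri) in registered:
        raise Fault(0x0030, "Pingback already registered")
    # Only now fetch the source to verify that the link exists.
    body = fetch(source_uri)
    if body is None:
        raise Fault(0x0010, "Source URI does not exist")
    if target_uri not in body:  # naive check, see lead-in
        raise Fault(0x0011, "Source does not link to target")
    registered.add((source_uri, target_uri))
    return "Pingback registered"
```

Doing the cheap local checks before fetching the source keeps the server from issuing outgoing requests for pings it would reject anyway.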

What does the XML-RPC request look like

The request normally is a plain XML-RPC request:

<?xml version="1.0" encoding="iso-8859-1"?>
<methodCall>
 <methodName>pingback.ping</methodName>
 <params>
  <param>
   <value>
    <string>http://source/url/here</string>
   </value>
  </param>
  <param>
   <value>
    <string>http://target/url/here</string>
   </value>
  </param>
 </params>
</methodCall>
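A request like this does not have to be assembled by hand. As a small sketch, Python's standard xmlrpc.client module can serialize the call; the two function names are hypothetical, and send_ping assumes the endpoint URL has already been discovered.

```python
import xmlrpc.client


def build_ping_request(source_uri: str, target_uri: str) -> str:
    """Serialize a pingback.ping call into an XML-RPC request body."""
    return xmlrpc.client.dumps((source_uri, target_uri), methodname="pingback.ping")


def send_ping(endpoint: str, source_uri: str, target_uri: str) -> str:
    """Send the ping to `endpoint`; raises xmlrpc.client.Fault on an RPC fault."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.pingback.ping(source_uri, target_uri)
```

The generated XML differs from the listing above in whitespace and in the XML declaration, but it carries the same method name and parameters.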

The response is either a successful response or a fault. In case of an error (the fault code 17 used here is 0x0011 in decimal):

<?xml version="1.0"?>
<methodResponse>
 <fault>
  <value>
   <struct>
    <member>
     <name>faultCode</name>
     <value><int>17</int></value>
    </member>
    <member>
     <name>faultString</name>
     <value><string>No link from source to target found</string></value>
    </member>
   </struct>
  </value>
 </fault>
</methodResponse>

In case everything went well:

<?xml version="1.0"?>
<methodResponse>
 <params>
  <param>
   <value>
    <string>Pingback successful</string>
   </value>
  </param>
 </params>
</methodResponse>
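On the client side both kinds of response can be decoded with the standard library. The sketch below assumes the raw response body is already available as a string; parse_ping_response and its return shape are choices made for this example.

```python
import xmlrpc.client
from typing import Optional, Tuple


def parse_ping_response(xml_text: str) -> Tuple[bool, Optional[int], str]:
    """Decode an XML-RPC response into (ok, fault_code, message)."""
    try:
        # loads() raises Fault when the body contains a <fault> element.
        params, _ = xmlrpc.client.loads(xml_text)
    except xmlrpc.client.Fault as fault:
        return False, fault.faultCode, fault.faultString
    return True, None, params[0]
```

In line with the specification, the message of a successful response is meant for logs and debugging rather than for display to the user.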

Spam

Note that pingbacks can, of course, originate from anywhere. Since many pages list pingback sources once they have checked that the link is present, pingbacks can be used to distribute spam. There are several countermeasures:

Periodically check that the source page still links to you and remove the pingback when it doesn’t
Employ some kind of spam filtering. If a pingback’s score is too low, do not list it automatically
If your site is small, just queue the received pingbacks until a person has manually checked that they are valid and should be shown on the page.
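The first countermeasure, the periodic re-check, might look like the sketch below. The function name, the (source, target) tuple layout and the fetch callback are assumptions for this example, and the link test is again a naive substring check.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Pingback = Tuple[str, str]  # (source URI, target URI)


def prune_stale_pingbacks(
    pingbacks: Iterable[Pingback],
    fetch: Callable[[str], Optional[str]],
) -> List[Pingback]:
    """Keep only pingbacks whose source page still contains the target link.

    `fetch` is assumed to return the page body, or None if the page is gone.
    """
    kept = []
    for source_uri, target_uri in pingbacks:
        body = fetch(source_uri)
        if body is not None and target_uri in body:
            kept.append((source_uri, target_uri))
    return kept
```

A job like this could run from a scheduler, ideally throttled so that the re-checks do not become a source of unwanted traffic themselves.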

Amplification attack

A naively implemented pingback module can be used to amplify a DDoS attack because it allows easy redirection of requests, and requests are usually smaller than responses. To counter that, keep track of which resources have been queried in response to requests and employ rate limiting (on a per-domain basis if possible). It’s also a good idea to implement blacklisting of domains in case you use an affiliate partner or similar service that you really don’t want to ping.
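Per-domain rate limiting combined with a blacklist can be sketched as a small sliding-window limiter. The class name, its parameters and the limits used are assumptions for this example; a real deployment would likely keep this state somewhere more durable than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable
from urllib.parse import urlsplit


class DomainRateLimiter:
    """Allow at most `max_requests` outgoing fetches per domain and window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        blocked_domains: Iterable[str] = (),
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._blocked = {domain.lower() for domain in blocked_domains}
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, uri: str) -> bool:
        """Return True if a request to `uri` may be sent right now."""
        domain = (urlsplit(uri).hostname or "").lower()
        if not domain or domain in self._blocked:
            return False
        now = self._clock()
        recent = self._history[domain]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False
        recent.append(now)
        return True
```

A server would call allow() with the source URI before fetching it and answer with a fault (for example 0x0031) when the call returns False.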
