The Blogosphere - How pingbacks work
03 May 2019 - tsp
Last update 03 May 2019
4 mins
Pingbacks are one of the core mechanisms for providing automated social interconnections
between different blog pages in the blogosphere. They are formally specified at
http://www.hixie.ch/specs/pingback/pingback.
Pingback offers an automated way to notify every page that is linked from
a blog entry that it has been linked. The notified page can first
verify that this is indeed the case, fetch some metadata about the linking page
and, if it wants to, add that page to a list of pages that reference it
(for example in a "this article is mentioned by" section or similar).
Pingback is normally implemented by your content management system or your static site
generator. It processes your blog posts or webpage content, optionally checks each link against
a blacklist (or a list of previously pinged resources), fetches the remote webpage and checks
either for an HTTP header (X-Pingback) or a meta information element
<link rel="pingback" href="...">.
Either mechanism may be used (as always, the HTTP header is highly encouraged whenever
possible). Both reference an XML-RPC API endpoint that will be used to deliver the
pingback.
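The discovery step can be sketched in Python as follows. This is only a minimal illustration: the function name discover_endpoint and the helper class are hypothetical, the page is assumed to have been fetched already, and the header is checked first because it is the encouraged mechanism.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _PingbackLinkParser(HTMLParser):
    """Collects the href of the first <link rel="pingback"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.endpoint: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.endpoint is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "pingback" in rels and attributes.get("href"):
            self.endpoint = attributes["href"]


def discover_endpoint(headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return the pingback endpoint advertised by a fetched page, if any."""
    # Header names are case-insensitive, so compare them in lower case.
    for name, value in headers.items():
        if name.lower() == "x-pingback" and value.strip():
            return value.strip()
    # Fall back to the <link rel="pingback"> element in the document.
    parser = _PingbackLinkParser()
    parser.feed(body)
    return parser.endpoint
```

A page that advertises neither the header nor the link element yields None, in which case no pingback should be attempted.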
The XML-RPC API endpoint provides a single function, pingback.ping, which has the signature
String pingback.ping(String sourceURI, String targetURI)
The parameter sourceURI is the URI of the post that contains the link to the target
site; targetURI is the absolute URI of the target of the link as given on the source page.
Note that the endpoint may have to normalize these links so that they can be compared easily.
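What such a normalization might look like is sketched below. The function name normalize_uri and the chosen rules (lower-casing scheme and host, dropping default ports and fragments) are assumptions for illustration; the specification does not prescribe them, and the sketch ignores user info and IPv6 literals.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_uri(uri: str) -> str:
    """Reduce a URI to a canonical form so two spellings compare equal."""
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it differs from the scheme's default.
    default_port = {"http": 80, "https": 443}.get(scheme)
    if parts.port is not None and parts.port != default_port:
        host = f"{host}:{parts.port}"
    # An empty path and "/" address the same resource.
    path = parts.path or "/"
    # Drop the fragment; it is never sent to the server.
    return urlunsplit((scheme, host, path, parts.query, ""))
```

With this, HTTP://Example.org:80/post#comments and http://example.org/post compare equal after normalization.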
In case the pingback is successful the server has to respond with a single string that contains as
much information as the server deems useful (this is expected to be used for debugging purposes). In
case of a failure the server must respond with an RPC fault value (see below). Clients should not
show the response of a successful request to the user.
The following fault codes are defined:
- 0: Generic error code.
- 0x0010: The source URI does not exist
- 0x0011: The source URI does not contain a link to the target URI
- 0x0020: The specified target URI does not exist. This must only be used when the target definitely does
not exist rather than when the target may exist but is not recognised.
- 0x0021: The specified target URI cannot be used as target (it doesn't exist or is not pingback enabled). On
blogs typically only permalinks are pingback enabled, pages like the home page or an index page will normally
fail
- 0x0030: The pingback has already been registered
- 0x0031: Access denied
- 0x0032: The server could not communicate with an upstream server or received an error from an upstream server.
The server may do whatever it desires with the request. Recommended behaviour is:
- The server may attempt to fetch the source URI to verify that the source does indeed link to the target
- The server may check its own data to ensure that the target exists and is a valid entry
- The server may check that the pingback has not already been registered
- The server may record the pingback
- The server may regenerate the site's pages
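The recommended checks could be wired together roughly like this. It is a sketch under assumptions, not a reference implementation: handle_ping, fetch_source, is_valid_target and the registered set are hypothetical names, the link check is a naive substring test, and page regeneration is left out.

```python
from typing import Callable, Optional, Set, Tuple
from xmlrpc.client import Fault

# Fault codes taken from the list above.
SOURCE_NOT_FOUND = 0x0010
NO_LINK_TO_TARGET = 0x0011
TARGET_NOT_USABLE = 0x0021
ALREADY_REGISTERED = 0x0030


def handle_ping(
    source_uri: str,
    target_uri: str,
    fetch_source: Callable[[str], Optional[str]],
    is_valid_target: Callable[[str], bool],
    registered: Set[Tuple[str, str]],
) -> str:
    """Run the recommended checks, then record the pingback."""
    # Local checks first, so no outbound request is made for bad input.
    if not is_valid_target(target_uri):
        raise Fault(TARGET_NOT_USABLE, "Target cannot be used as a target")
    if (source_uri, target_uri) in registered:
        raise Fault(ALREADY_REGISTERED, "Pingback already registered")
    # fetch_source returns the page body, or None if it does not exist.
    body = fetch_source(source_uri)
    if body is None:
        raise Fault(SOURCE_NOT_FOUND, "Source URI does not exist")
    # Naive check; a real implementation would parse the page's links.
    if target_uri not in body:
        raise Fault(NO_LINK_TO_TARGET, "Source does not link to target")
    registered.add((source_uri, target_uri))
    return "Pingback registered"
```

The order differs slightly from the list: the cheap local checks run before the source is fetched, which avoids an outbound request for pings that would be rejected anyway.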
What does the XML-RPC request look like
The request is normally a plain, simple XML-RPC request:
<?xml version="1.0" encoding="iso-8859-1"?>
<methodCall>
<methodName>pingback.ping</methodName>
<params>
<param>
<value>
<string>http://source/url/here</string>
</value>
</param>
<param>
<value>
<string>http://target/url/here</string>
</value>
</param>
</params>
</methodCall>
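An equivalent request body can be generated with Python's standard xmlrpc.client module. The helper name build_ping_request is hypothetical, and the encoding is only chosen to match the example above.

```python
import xmlrpc.client


def build_ping_request(source_uri: str, target_uri: str) -> str:
    """Serialize a pingback.ping call into an XML-RPC request body."""
    return xmlrpc.client.dumps(
        (source_uri, target_uri),
        methodname="pingback.ping",
        encoding="iso-8859-1",
    )


if __name__ == "__main__":
    print(build_ping_request("http://source/url/here", "http://target/url/here"))
```

To actually deliver the ping, xmlrpc.client.ServerProxy(endpoint).pingback.ping(source_uri, target_uri) performs the same serialization and sends it as an HTTP POST to the endpoint.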
The response can be either a successful response or a fault code.
In case of an error:
<?xml version="1.0"?>
<methodResponse>
<fault>
<value>
<struct>
<member>
<name>faultCode</name>
<value><int>17</int></value>
</member>
<member>
<name>faultString</name>
<value><string>No link from source to target found</string></value>
</member>
</struct>
</value>
</fault>
</methodResponse>
In case everything went well:
<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value>
<string>Pingback successful</string>
</value>
</param>
</params>
</methodResponse>
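On the client side, both kinds of response can be decoded with xmlrpc.client.loads, which raises xmlrpc.client.Fault for a fault response. The wrapper parse_ping_response and its return shape are assumptions made for this sketch.

```python
import xmlrpc.client
from typing import Optional, Tuple


def parse_ping_response(body: str) -> Tuple[bool, Optional[int], str]:
    """Return (ok, fault_code, message) for an XML-RPC response body."""
    try:
        params, _ = xmlrpc.client.loads(body)
    except xmlrpc.client.Fault as fault:
        # Fault responses carry the numeric code and a description.
        return False, fault.faultCode, fault.faultString
    # A successful pingback returns a single string for debugging.
    return True, None, str(params[0])
```

For the fault example above this yields a fault code of 17, which is 0x0011 (the source URI does not contain a link to the target URI).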
Spam
Note that pingbacks can, of course, originate from anywhere. Since many pages decide to show
pingback sources after checking that the link is present, pingbacks might be used to distribute
spam. One can use different countermeasures against that:
- Periodically check that the page is still referenced and remove it when it's not
- Employ some kind of spam filtering. If the score of a pingback is too low do not
automatically list it
- If your page is small just queue the received pingbacks until a user has manually
checked they are valid and should be embedded on a webpage.
Amplification attack
A naively implemented pingback module can be used to amplify a DDoS attack because it
allows for easy redirection of requests - and requests are usually smaller than responses. To counter
that, keep track of which resources have been queried in response to requests and employ rate limiting
(do that on a per-domain basis if possible). It's also a good idea to implement blacklisting of domains
in case you are using an affiliate partner or similar services which you really don't want to ping.
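A per-domain rate limit combined with a domain blacklist might look like the following sketch. The class name OutboundGate, the sliding-window approach and the default limits are assumptions for illustration and not part of the pingback specification.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Iterable
from urllib.parse import urlsplit


class OutboundGate:
    """Decides whether an outbound fetch to a URI is currently allowed."""

    def __init__(
        self,
        blocked_domains: Iterable[str] = (),
        max_requests: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._blocked = {d.lower() for d in blocked_domains}
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        # Timestamps of recent requests, kept per domain.
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, uri: str) -> bool:
        domain = (urlsplit(uri).hostname or "").lower()
        if not domain or domain in self._blocked:
            return False
        now = self._clock()
        recent = self._history[domain]
        # Forget requests that fell out of the sliding window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False
        recent.append(now)
        return True
```

A pingback handler would call allow(source_uri) before fetching the source page and answer with a fault (for example 0x0031, access denied) when it returns False.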