How to detect whether a website is hosted on Neocities in Ruby?

Parsing a URI to get the domain name

require 'uri'

p URI.parse('https://example.com/').host
# => "example.com"
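
URLs coming from users are not always well formed, so a small wrapper can be handy. The sketch below is only an illustration: host_of is a hypothetical helper name, and lowercasing the host is an assumption made so that later comparisons are case-insensitive.

```ruby
require 'uri'

# Hypothetical helper: returns the lowercased host of `url`,
# or nil when the URL is malformed or has no host part.
def host_of(url)
  URI.parse(url).host&.downcase
rescue URI::InvalidURIError
  nil
end
```

Note that a bare 'example.com' (no scheme) parses as a path, so the helper returns nil for it.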

The simplest case: *.neocities.org

HOST = 'lambdafun.neocities.org'

if HOST =~ /\.neocities\.org\z/
  puts "#{HOST} is hosted on Neocities"
end
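
In Ruby regular expressions, $ also matches just before a newline, so a string comparison (or the \z anchor) is stricter when the host comes from untrusted input. Here is a minimal sketch of the same check as a predicate; neocities_subdomain? is a hypothetical name, and stripping a trailing dot and ignoring case are assumptions of this example.

```ruby
# Hypothetical predicate: true when `host` is a subdomain of neocities.org.
# Lowercases the host and drops a trailing dot (fully qualified form) first.
def neocities_subdomain?(host)
  host.to_s.downcase.chomp('.').end_with?('.neocities.org')
end
```

Because the suffix starts with a dot, look-alike domains such as notneocities.org are rejected.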

Custom domain

For a custom domain, I check whether the DNS A record contains a specific IP address.

First, a prerequisite: we need a function that, given a domain name, returns the first IPv4 address in its A record, or nil if the domain has none.

require 'net/dns' # gem install net-dns

# Returns the first IPv4 address of the A record, or nil if there is none.
def get_dns_ipv4(domain)
  Net::DNS::Resolver
    .start(domain, Net::DNS::A)
    .each_address { |ip| return ip.to_s }
  nil
end

Then we write the core check.

HOST = 'weeb.tanguy.space'

case get_dns_ipv4(HOST)
when '198.51.233.1'
  puts "#{HOST} is hosted on Neocities"
end

Timeouts & error handling

Depending on your use case, you will also have to handle exceptions and tune the timeouts of net/dns.

def get_dns_ipv4(domain)
  Net::DNS::Resolver
    .new(udp_timeout: 2, tcp_timeout: 2) # timeouts are in seconds
    .search(domain, Net::DNS::A)
    .each_address { |ip| return ip.to_s }
  nil
end

Other parameters to consider: retry_number and retry_interval.

This function is not bulletproof: it may fail with a Net::DNS::Resolver::Error exception, for example when the nameserver is unreachable or returns an invalid response. You might want to rescue from that.
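
One way to keep the rescue logic out of the lookup itself is a small wrapper that accepts any lookup callable. This is only a sketch: safe_ipv4 and its errors: parameter are hypothetical names, and rescuing StandardError by default is an assumption you will probably want to narrow.

```ruby
# Hypothetical wrapper: calls `lookup` (any callable that takes a domain and
# returns an IP string or nil) and turns the listed exceptions into nil.
def safe_ipv4(domain, lookup, errors: [StandardError])
  lookup.call(domain)
rescue *errors => e
  warn "DNS lookup failed for #{domain}: #{e.class}: #{e.message}"
  nil
end

# With net-dns, the call could look like this (not executed here):
#   safe_ipv4(HOST, method(:get_dns_ipv4), errors: [Net::DNS::Resolver::Error])
```

Exceptions that are not listed in errors: still propagate, so programming mistakes are not silently swallowed.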

The case of GitHub Pages

When setting up GitHub Pages with a custom domain, you have to create an A record pointing to several IPv4 addresses.

The following example works with both *.github.io and custom domains.

case get_dns_ipv4(HOST)
when '198.51.233.1'
  puts "#{HOST} is hosted on Neocities"
when '185.199.108.153',
     '185.199.109.153',
     '185.199.110.153',
     '185.199.111.153'
  puts "#{HOST} is hosted on GitHub Pages!"
end
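
As more providers are added, the branches above can be expressed as a lookup table. This is a minimal sketch using only the IP addresses quoted in this post; PROVIDER_IPS and provider_for_ip are hypothetical names.

```ruby
# Hypothetical table of known IPv4 addresses per provider,
# limited to the addresses mentioned in this post.
PROVIDER_IPS = {
  'Neocities'    => %w[198.51.233.1],
  'GitHub Pages' => %w[185.199.108.153 185.199.109.153
                       185.199.110.153 185.199.111.153]
}.freeze

# Returns the provider name for `ip`, or nil when the IP is unknown.
def provider_for_ip(ip)
  PROVIDER_IPS.each { |provider, ips| return provider if ips.include?(ip) }
  nil
end
```

The result of get_dns_ipv4 can be passed in directly, since a nil IP simply yields nil.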

Using a reference domain, the need to refactor

Instead of hardcoding IP addresses, we could match A records against those of "trusted" reference domains, which we know will always be associated with the service to detect.

For instance, in the first example, we can replace the hardcoded IP with get_dns_ipv4('kyledrake.com'), which is hosted on Neocities.

For GitHub Pages, consider matching against pages.github.com.

This requires refactoring get_dns_ipv4() to return an array of all the IPs in the A record instead of just the first one. We would then be able to write the following:

# Assumes get_dns_ipv4 now returns an array of IPv4 strings (empty if none).
gh_records        = get_dns_ipv4('pages.github.com')
neocities_records = get_dns_ipv4('kyledrake.com')

get_dns_ipv4(HOST).first.then do |target_record|
  if gh_records.include?(target_record)
    puts "#{HOST} is hosted on GitHub Pages!"
  elsif neocities_records.include?(target_record)
    puts "#{HOST} is hosted on Neocities!"
  end
end
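
The refactoring itself could look roughly like the sketch below. To stay free of extra gems, it swaps net-dns for Resolv from Ruby's standard library, which is not what the snippets above use. The names ipv4_records and provider_of are hypothetical, and the comparison looks at every record of the target rather than only the first one.

```ruby
require 'resolv'

# Returns every IPv4 address in the A record of `domain` as strings
# (an empty array if there is none). Uses stdlib Resolv, not net-dns.
def ipv4_records(domain)
  Resolv::DNS.open do |dns|
    dns.getresources(domain, Resolv::DNS::Resource::IN::A)
       .map { |record| record.address.to_s }
  end
end

# Returns the first provider whose reference records share at least one
# IP with `target_records`, or nil when nothing matches.
def provider_of(target_records, references)
  references.each do |provider, records|
    return provider unless (records & target_records).empty?
  end
  nil
end

# Possible usage (performs real DNS queries, so it is left commented out):
#   references = {
#     'GitHub Pages' => ipv4_records('pages.github.com'),
#     'Neocities'    => ipv4_records('kyledrake.com')
#   }
#   puts provider_of(ipv4_records('weeb.tanguy.space'), references)
```

Keeping the comparison separate from the DNS lookup makes it easy to test without network access.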

Afterword

I made this while developing a top-secret web app which will come out next month.

After a quick survey on Ruby-Talk, it appears that mapping IP addresses to hosting providers is a good idea for a gem (Ruby package) and could benefit other projects. So I will work on that too, with a view to learning more about rake and rspec3.

Written by Tanguy Andreani
on 4th of July