Update (20/Apr/2015 09:50): The IT department at Cetramsa has contacted me and told me that the server was automatically blocked on Saturday due to a high number of requests. However, after analyzing the server history I found no peak in requests at all on Saturday. The block is still active, but they will remove it.
Yesterday my app Next bus Barcelona started reporting that the servers from AMB (Àrea Metropolitana de Barcelona) were down, and that the alternate TMB server was being used. At first I didn’t worry at all because this tends to happen on some weekends: the AMB server stops working and no one restarts it until Monday morning (I guess that having someone on call to check it in these cases is too costly for the administration…). However, this morning my monitoring service at Monitis started reporting that my services were returning an error. When I logged in to the server to check what was happening, I discovered that the host “www.ambmobilitat.cat” was not responding at all. There was a bug in my code that resulted in an Internal Server Error, which I fixed in 5 minutes, but the root cause was that the AMB servers were down. Nothing too alarming: as I said, this tends to happen more frequently than it should.
However, what has really upset me is that the official app “AMB Temps bus” and the website were actually up and getting real-time data correctly. So I decided to check whether the problem was limited to my server.
- I can reach www.ambmobilitat.cat from my own computer with no problem at all:
    ereza@sylvarant:~$ wget http://www.ambmobilitat.cat/Principales/Inicio.aspx
    --2015-04-19 13:47:53-- http://www.ambmobilitat.cat/Principales/Inicio.aspx
    S'està resolent www.ambmobilitat.cat (www.ambmobilitat.cat)... 220.127.116.11
    S'està connectant a www.ambmobilitat.cat (www.ambmobilitat.cat)|18.104.22.168|:80...connectat.
    HTTP: s'ha enviat la petició, s'està esperant una resposta...200 OK
    Mida: 145314 (142K) [text/html]
    S'està desant a: «Inicio.aspx»
    100%[========================================>] 145.314 613K/s en 0,2s
    2015-04-19 13:48:02 (613 KB/s) - s'ha desat «Inicio.aspx» [145314/145314]
- A server on the same subnet as the Next bus Barcelona server has no problem at all either:
    root@brinstar:~# wget http://www.ambmobilitat.cat/Principales/Inicio.aspx
    --2015-04-19 13:50:37-- http://www.ambmobilitat.cat/Principales/Inicio.aspx
    Resolving www.ambmobilitat.cat (www.ambmobilitat.cat)... 22.214.171.124
    Connecting to www.ambmobilitat.cat (www.ambmobilitat.cat)|126.96.36.199|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 145314 (142K) [text/html]
    Saving to: `Inicio.aspx'
    100%[========================================>] 145,314 391K/s in 0.4s
    2015-04-19 13:50:39 (391 KB/s) - `Inicio.aspx' saved [145314/145314]
- The Next bus Barcelona server can’t contact the AMB server:
    root@crateria:~# wget http://www.ambmobilitat.cat/Principales/Inicio.aspx
    --2015-04-19 13:50:40-- http://www.ambmobilitat.cat/Principales/Inicio.aspx
    Resolving www.ambmobilitat.cat (www.ambmobilitat.cat)... 188.8.131.52
    Connecting to www.ambmobilitat.cat (www.ambmobilitat.cat)|184.108.40.206|:80... failed: Connection timed out.
This suggests that my server has been blocked by IP address. Here is a diagram to make it clearer for everyone:
I want to believe that this is the result of an automatic block and not a manual one. However, past incidents tell me that it could well be manual. I will try to contact Àrea Metropolitana to ask how to proceed.
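The three wget checks above boil down to attempting a TCP connection to port 80 from different machines and comparing the results. As a rough sketch (the function name `probe` and the result strings are my own choices for illustration, not part of any existing tool), the same check could be scripted in Python and run on each host:

```python
import socket


def probe(host, port=80, timeout=10.0):
    """Try to open a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout.
        return "timed out"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and so on.
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe("www.ambmobilitat.cat"))
```

If one machine gets "timed out" while its neighbors get "connected" at the same moment, that is consistent with packets from that one address being silently dropped, although it does not prove who dropped them or why.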
Meanwhile, in order to avoid a service disruption for the users, I have configured my scripts to use Tor (I copied the idea from Roc Boronat, who already did this in his Vicing app). By doing this, I can circumvent the block, and the IP address will vary on each request, so it becomes difficult for them to block me again. This has a drawback: connections will be a little slower because of all the additional layers between my server and the AMB server, but for now it will do (let’s hope that it handles all the Monday morning traffic successfully). The non-realtime data fetching scripts still don’t use Tor, but they are not critical (the previous data will remain available). Again, here’s a little diagram:
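For readers curious about what routing a request through Tor can look like in practice, here is a minimal sketch. It is not the actual code of my scripts. It assumes a Tor client running locally on its default SOCKS port (9050) and the `curl` command being installed; the helper names `tor_curl_command` and `fetch_via_tor` are made up for this example. Tor keeps streams with different SOCKS credentials on separate circuits by default, so using a fresh random username per request asks for a new circuit, which usually (but not always) means a different exit IP address.

```python
import secrets
import subprocess

TOR_SOCKS_HOST = "127.0.0.1"  # assumption: Tor runs on the same machine
TOR_SOCKS_PORT = 9050         # Tor's default SOCKS port


def tor_curl_command(url, timeout=30, isolate=True):
    """Build a curl command line that fetches `url` through Tor."""
    # A fresh SOCKS username asks Tor for a separate circuit for this request.
    credentials = f"{secrets.token_hex(8)}:x@" if isolate else ""
    # socks5h:// makes curl resolve the hostname through the proxy as well.
    proxy = f"socks5h://{credentials}{TOR_SOCKS_HOST}:{TOR_SOCKS_PORT}"
    return ["curl", "--silent", "--show-error", "--fail",
            "--max-time", str(timeout), "--proxy", proxy, url]


def fetch_via_tor(url, timeout=30):
    """Run curl through Tor and return the response body as bytes."""
    result = subprocess.run(tor_curl_command(url, timeout),
                            capture_output=True, check=True)
    return result.stdout


if __name__ == "__main__":
    page = fetch_via_tor("http://www.ambmobilitat.cat/Principales/Inicio.aspx")
    print(len(page), "bytes received")
```

The generous timeout is deliberate: every request crosses several relays, so responses take noticeably longer than over a direct connection.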
Anyway, I would like to ask the Àrea Metropolitana, Barcelona City Council, and whoever is in charge: Is this the “open data” you promote? How can anyone create any product (commercial or not) when access to the data can suddenly be lost? If this is a manual block, why are the public entities so afraid of an app made by someone in their free time? Why are the servers down on Sundays? What do I have to do to avoid having more problems with my app?
Also, I am still waiting for someone from CETRAMSA to contact me so that I can access their data in an official way. It has been a year and a half since they told me they would. I know that there have been some efforts to open up some data, but no one has ever contacted me to warn me of changes or to tell me that they have published data. However, they did have time to add “Proper BUS” (sic) to the list of users of “open” data.