Error 500 when fetching some websites #460

@togarha

In a new deployment of wallabag-docker-service on a Raspberry Pi, I'm getting a 500 error when trying to fetch some sites:

[2025-09-06T10:18:46.780809+00:00] graby.INFO: Cached site config with key: viajeroscallejeros.com.merged {"key":"viajeroscallejeros.com.merged"} []
[2025-09-06T10:18:46.780817+00:00] graby.INFO: Fetching url: https://www.viajeroscallejeros.com/que-ver-en-el-hierro/ {"url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/"} []
[2025-09-06T10:18:46.780842+00:00] graby.INFO: Trying using method "get" on url "https://www.viajeroscallejeros.com/que-ver-en-el-hierro/" {"method":"get","url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/"} []
[2025-09-06T10:18:46.780863+00:00] graby.INFO: Use default user-agent "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.92 Safari/535.2" for url "https://www.viajeroscallejeros.com/que-ver-en-el-hierro/" {"user-agent":"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.92 Safari/535.2","url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/"} []
[2025-09-06T10:18:46.780873+00:00] graby.INFO: Use default referer "http://www.google.co.uk/url?sa=t&source=web&cd=1" for url "https://www.viajeroscallejeros.com/que-ver-en-el-hierro/" {"referer":"http://www.google.co.uk/url?sa=t&source=web&cd=1","url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/"} []
[2025-09-06T10:18:46.793277+00:00] httplug.INFO: Sending request: GET https://www.viajeroscallejeros.com/que-ver-en-el-hierro/ 1.1 {"uid":"68bc0a86c1a893.72020535"} []
[2025-09-06T10:18:46.950777+00:00] httplug.ERROR: Error: cURL error 52: Empty reply from server when sending request: GET https://www.viajeroscallejeros.com/que-ver-en-el-hierro/ 1.1 {"exception":"[object] (Http\\Client\\Exception\\NetworkException(code: 0): cURL error 52: Empty reply from server at /var/www/wallabag/vendor/php-http/guzzle5-adapter/src/Client.php:116)\n[previous exception] [object] (GuzzleHttp\\Exception\\ConnectException(code: 0): cURL error 52: Empty reply from server at /var/www/wallabag/vendor/guzzlehttp/guzzle/src/Exception/RequestException.php:49)\n[previous exception] [object] (GuzzleHttp\\Ring\\Exception\\ConnectException(code: 0): cURL error 52: Empty reply from server at /var/www/wallabag/vendor/guzzlehttp/ringphp/src/Client/CurlFactory.php:126)","milliseconds":158,"uid":"68bc0a86c1a893.72020535"} []
[2025-09-06T10:18:46.952336+00:00] graby.WARNING: Request throw exception (with no response): cURL error 52: Empty reply from server {"error_message":"cURL error 52: Empty reply from server"} []
[2025-09-06T10:18:46.952408+00:00] graby.INFO: Data fetched: array{"effective_url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/","body":"","headers":[],"status":500} {"data":{"effective_url":"https://www.viajeroscallejeros.com/que-ver-en-el-hierro/","body":"","headers":[],"status":500}} []
[2025-09-06T10:18:46.952529+00:00] graby.DEBUG: Fetched HTML {"html":""} []
[2025-09-06T10:18:46.952591+00:00] graby.DEBUG: HTML after regex empty nodes stripping {"html":""} []
[2025-09-06T10:18:46.952645+00:00] graby.INFO: Looking for site config files to see if single page link exists [] []
[2025-09-06T10:18:46.952707+00:00] graby.INFO: Returning cached and merged site config for viajeroscallejeros.com {"host":"viajeroscallejeros.com"} []
[2025-09-06T10:18:46.952761+00:00] graby.INFO: No "single_page_link" config found [] []
[2025-09-06T10:18:46.952808+00:00] graby.INFO: Attempting to extract content [] []
[2025-09-06T10:18:46.952869+00:00] graby.INFO: Returning cached and merged site config for viajeroscallejeros.com {"host":"viajeroscallejeros.com"} []
[2025-09-06T10:18:46.952932+00:00] graby.DEBUG: Actual site config {"siteConfig":{"Graby\\SiteConfig\\SiteConfig":{"title":["//meta[@property=\"og:title\"]/@content"],"body":[],"author":[],"date":["//meta[@property=\"article:published_time\"]/@content"],"strip":["//*[contains(@class, 'google-dfp-ad-wrapper')]","//iframe/@srcdoc"],"src_lazy_load_attr":null,"strip_id_or_class":["sharedaddy","i-amphtml-replaced-content"],"strip_image_src":["doubleclick.net"],"native_ad_clue":[],"http_header":[],"tidy":null,"autodetect_on_failure":null,"prune":null,"test_url":[],"if_page_contains":[],"single_page_link":[],"next_page_link":[],"parser":null,"find_string":["<amp-img","</amp-img>"],"replace_string":["<img","<!-- nothing -->"],"cache_key":null,"requires_login":false,"not_logged_in_xpath":null,"login_uri":null,"login_username_field":null,"login_password_field":null,"login_extra_fields":[],"skip_json_ld":false,"wrap_in":[]}}} []
[2025-09-06T10:18:46.953061+00:00] graby.INFO: Strings replaced: 0 (find_string and/or replace_string) {"count":0} []
[2025-09-06T10:18:46.953115+00:00] graby.DEBUG: HTML after site config strings replacements {"html":""} []
[2025-09-06T10:18:46.953169+00:00] graby.INFO: Attempting to parse HTML with libxml {"parser":"libxml"} []
[2025-09-06T10:18:46.953840+00:00] graby.INFO: Body size after Readability: 85 {"length":85} []
[2025-09-06T10:18:46.953940+00:00] graby.DEBUG: Body after Readability {"dom_saveXML":"<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title/></head><body>\n</body></html>"} []
[2025-09-06T10:18:46.954089+00:00] graby.INFO: Trying //meta[@property="og:title"]/@content for title {"pattern":"//meta[@property=\"og:title\"]/@content"} []
[2025-09-06T10:18:46.954197+00:00] graby.INFO: Trying //meta[@property="article:published_time"]/@content for date {"pattern":"//meta[@property=\"article:published_time\"]/@content"} []
[2025-09-06T10:18:46.954295+00:00] graby.INFO: Trying //html[@lang]/@lang for language {"pattern":"//html[@lang]/@lang"} []
[2025-09-06T10:18:46.954367+00:00] graby.INFO: Trying //meta[@name="DC.language"]/@content for language {"pattern":"//meta[@name=\"DC.language\"]/@content"} []
[2025-09-06T10:18:46.954439+00:00] graby.INFO: Trying //*[contains(@class, 'google-dfp-ad-wrapper')] to strip element {"pattern":"//*[contains(@class, 'google-dfp-ad-wrapper')]"} []
[2025-09-06T10:18:46.954533+00:00] graby.INFO: Trying //iframe/@srcdoc to strip element {"pattern":"//iframe/@srcdoc"} []
[2025-09-06T10:18:46.954599+00:00] graby.INFO: Trying sharedaddy to strip element {"string":"sharedaddy"} []
[2025-09-06T10:18:46.954717+00:00] graby.INFO: Trying i-amphtml-replaced-content to strip element {"string":"i-amphtml-replaced-content"} []
[2025-09-06T10:18:46.954944+00:00] graby.DEBUG: DOM after site config stripping {"dom_saveXML":"<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title/></head><body>\n</body></html>"} []
[2025-09-06T10:18:46.955167+00:00] graby.INFO: Using Readability [] []
[2025-09-06T10:18:46.957103+00:00] graby.INFO: Date is bad (wrong year):  {"date":""} []
[2025-09-06T10:18:46.957283+00:00] graby.INFO: Trying again without tidy [] []
[2025-09-06T10:18:46.957358+00:00] graby.DEBUG: Actual site config {"siteConfig":{"Graby\\SiteConfig\\SiteConfig":{"title":["//meta[@property=\"og:title\"]/@content"],"body":[],"author":[],"date":["//meta[@property=\"article:published_time\"]/@content"],"strip":["//*[contains(@class, 'google-dfp-ad-wrapper')]","//iframe/@srcdoc"],"src_lazy_load_attr":null,"strip_id_or_class":["sharedaddy","i-amphtml-replaced-content"],"strip_image_src":["doubleclick.net"],"native_ad_clue":[],"http_header":[],"tidy":null,"autodetect_on_failure":null,"prune":null,"test_url":[],"if_page_contains":[],"single_page_link":[],"next_page_link":[],"parser":null,"find_string":["<amp-img","</amp-img>"],"replace_string":["<img","<!-- nothing -->"],"cache_key":null,"requires_login":false,"not_logged_in_xpath":null,"login_uri":null,"login_username_field":null,"login_password_field":null,"login_extra_fields":[],"skip_json_ld":false,"wrap_in":[]}}} []
[2025-09-06T10:18:46.957488+00:00] graby.INFO: Strings replaced: 0 (find_string and/or replace_string) {"count":0} []
[2025-09-06T10:18:46.957544+00:00] graby.DEBUG: HTML after site config strings replacements {"html":""} []
[2025-09-06T10:18:46.957599+00:00] graby.INFO: Attempting to parse HTML with libxml {"parser":"libxml"} []
[2025-09-06T10:18:46.957763+00:00] graby.INFO: Body size after Readability: 7 {"length":7} []
[2025-09-06T10:18:46.957833+00:00] graby.DEBUG: Body after Readability {"dom_saveXML":"<html/>"} []
[2025-09-06T10:18:46.957938+00:00] graby.INFO: Trying //meta[@property="og:title"]/@content for title {"pattern":"//meta[@property=\"og:title\"]/@content"} []
[2025-09-06T10:18:46.958025+00:00] graby.INFO: Trying //meta[@property="article:published_time"]/@content for date {"pattern":"//meta[@property=\"article:published_time\"]/@content"} []
[2025-09-06T10:18:46.958097+00:00] graby.INFO: Trying //html[@lang]/@lang for language {"pattern":"//html[@lang]/@lang"} []
[2025-09-06T10:18:46.958165+00:00] graby.INFO: Trying //meta[@name="DC.language"]/@content for language {"pattern":"//meta[@name=\"DC.language\"]/@content"} []
[2025-09-06T10:18:46.958233+00:00] graby.INFO: Trying //*[contains(@class, 'google-dfp-ad-wrapper')] to strip element {"pattern":"//*[contains(@class, 'google-dfp-ad-wrapper')]"} []
[2025-09-06T10:18:46.958312+00:00] graby.INFO: Trying //iframe/@srcdoc to strip element {"pattern":"//iframe/@srcdoc"} []
[2025-09-06T10:18:46.958377+00:00] graby.INFO: Trying sharedaddy to strip element {"string":"sharedaddy"} []
[2025-09-06T10:18:46.958472+00:00] graby.INFO: Trying i-amphtml-replaced-content to strip element {"string":"i-amphtml-replaced-content"} []
[2025-09-06T10:18:46.958632+00:00] graby.DEBUG: DOM after site config stripping {"dom_saveXML":"<html/>"} []
[2025-09-06T10:18:46.958831+00:00] graby.INFO: Using Readability [] []
[2025-09-06T10:18:46.960341+00:00] graby.INFO: Date is bad (wrong year):  {"date":""} []
[2025-09-06T10:18:46.960467+00:00] graby.INFO: Success ?  {"is_success":false} []
[2025-09-06T10:18:46.960542+00:00] graby.INFO: Extract failed [] []
[2025-09-06T10:18:46.960815+00:00] app.DEBUG: Extracting images from content to provide a default preview picture [] []
[2025-09-06T10:18:46.961402+00:00] app.DEBUG: 0 pictures found [] []
[2025-09-06T10:18:46.963695+00:00] security.DEBUG: Stored the security token in the session. {"key":"_security_secured_area"} []

If I run curl https://www.viajeroscallejeros.com/que-ver-en-el-hierro/ directly from inside the container, it works perfectly.

I can fetch some addresses without problems, but others fail with that 500 error.
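
One way to narrow this down, as a rough sketch: a plain curl sends its own User-Agent and no Referer, while the log above shows graby sending an old Chrome User-Agent and a Google Referer over HTTP/1.1. The helper below only builds a curl command that mimics those logged headers so the two requests can be compared from inside the container. The function name `build_curl_command` is hypothetical, and the idea that the headers cause the empty reply is an assumption, not something confirmed here.

```python
import shlex

# Header values copied from the graby log lines above (its logged defaults).
GRABY_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 "
    "(KHTML, like Gecko) Chrome/15.0.874.92 Safari/535.2"
)
GRABY_REFERER = "http://www.google.co.uk/url?sa=t&source=web&cd=1"


def build_curl_command(url, user_agent=GRABY_USER_AGENT, referer=GRABY_REFERER):
    """Return a curl argv list that sends the headers graby logged.

    Hypothetical helper for illustration; it does not run curl itself.
    """
    return [
        "curl",
        "--verbose",              # show request and response headers
        "--http1.1",              # the log shows the request sent as HTTP/1.1
        "--user-agent", user_agent,
        "--referer", referer,
        "--output", "/dev/null",  # discard the body, only the outcome matters
        url,
    ]


if __name__ == "__main__":
    cmd = build_curl_command(
        "https://www.viajeroscallejeros.com/que-ver-en-el-hierro/"
    )
    # Print a shell-safe command line to paste into the container.
    print(shlex.join(cmd))
```

If the printed command also fails with `curl: (52) Empty reply from server`, the headers are a likely trigger; if it succeeds, the difference probably lies elsewhere in the HTTP client stack.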
