trafficserver-users mailing list archives

From Łukasz Nowak <luk...@nowak.io>
Subject Some disconnections from origin
Date Tue, 12 Jan 2021 09:20:34 GMT
Hello,

I am using TrafficServer as a cache for many sites. It has only one origin,
haproxy.

Details are:

Version of Traffic Server used: 7.1.12 (also applies to 8.1.0)
Platform: Linux 64 bit, gcc 8.3.0
Any relevant configuration changes you've made from the default
configurations (particularly for records.config), part of `traffic_ctl
diff`:
proxy.config.http.cache.open_write_fail_action has changed
        Current Value   : 2
        Default Value   : 0
proxy.config.http.negative_revalidating_enabled has changed
        Current Value   : 1
        Default Value   : 0
proxy.config.http.negative_revalidating_lifetime has changed
        Current Value   : 86400
        Default Value   : 1800
proxy.config.http.insert_client_ip has changed
        Current Value   : 0
        Default Value   : 1
proxy.config.http.insert_squid_x_forwarded_for has changed
        Current Value   : 0
        Default Value   : 1
proxy.config.http.transaction_no_activity_timeout_in has changed
        Current Value   : 600
        Default Value   : 30
proxy.config.http.transaction_no_activity_timeout_out has changed
        Current Value   : 600
        Default Value   : 30
proxy.config.http.connect_attempts_max_retries has changed
        Current Value   : 0
        Default Value   : 3
proxy.config.http.connect_attempts_max_retries_dead_server has changed
        Current Value   : 0
        Default Value   : 1
proxy.config.http.connect_attempts_timeout has changed
        Current Value   : 600
        Default Value   : 30
proxy.config.http.post_connect_attempts_timeout has changed
        Current Value   : 600
        Default Value   : 1800
proxy.config.http.normalize_ae_gzip has changed
        Current Value   : 0
        Default Value   : 1
proxy.config.cache.ram_cache.size has changed
        Current Value   : 1073741824
        Default Value   : -1
proxy.config.url_remap.pristine_host_hdr has changed
        Current Value   : 1
        Default Value   : 0

TrafficServer connects to the origin (haproxy), which runs on the same host, with:

cat etc/trafficserver/remap.config
map /HTTPS/ http://10.0.251.170:21443
map / http://10.0.251.170:41080

There is a constant stream of requests going to TrafficServer. About 0.5% of
them fail, returning 502 to the client, with an entry in
var/log/trafficserver/error.log:

20210107.13h02m15s CONNECT: could not connect to 10.0.251.170 for 'http://10.0.251.170:41080/path/' (setting last failure time)
20210107.13h02m15s RESPONSE: sent 10.0.251.170 status 502 (Server Hangup) for 'http://10.0.251.170:41080/path/'

The failure is also visible in var/log/trafficserver/squid.log:

1610020935.756 43 10.0.251.170 TCP_REFRESH_FAIL_HIT/502 498 GET http://10.0.251.170:41080/path/ - DIRECT/10.0.251.170 text/html
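The failure rate can be measured directly from squid.log. The following is a minimal sketch, assuming the log uses the squid-style layout shown above, where the fourth whitespace-separated field is the cache result and HTTP status joined by a slash. The function names are made up for illustration, and the second sample line (a 200 hit) is invented; only the 502 line comes from the real log.

```python
from collections import Counter


def status_counts(lines):
    """Count HTTP status codes found in squid-format log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Field 4 looks like "TCP_REFRESH_FAIL_HIT/502"; skip anything else.
        if len(fields) < 4 or "/" not in fields[3]:
            continue
        _result, status = fields[3].rsplit("/", 1)
        counts[status] += 1
    return counts


def failure_rate(lines, status="502"):
    """Return the fraction of logged requests that ended with `status`."""
    counts = status_counts(lines)
    total = sum(counts.values())
    return counts[status] / total if total else 0.0


if __name__ == "__main__":
    sample = [
        # Real entry from the log above.
        "1610020935.756 43 10.0.251.170 TCP_REFRESH_FAIL_HIT/502 498 GET "
        "http://10.0.251.170:41080/path/ - DIRECT/10.0.251.170 text/html",
        # Invented entry, present only so the ratio is not trivially 1.
        "1610020936.100 2 10.0.251.170 TCP_HIT/200 1024 GET "
        "http://10.0.251.170:41080/path/ - NONE/- text/html",
    ]
    print(f"502 rate: {failure_rate(sample):.1%}")
```

To run it against a real log, pass an open file object for var/log/trafficserver/squid.log as `lines` instead of the sample list.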

After enabling debug diagnostics with:

CONFIG proxy.config.diags.debug.enabled INT 1
CONFIG proxy.config.diags.debug.tags STRING http.*

I saw in var/log/trafficserver/traffic.out:

[Jan  7 13:02:15.757] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:2666 (main_handler)> (http) [5687] [HttpSM::main_handler, VC_EVENT_WRITE_COMPLETE]
[Jan  7 13:02:15.757] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:1994 (state_send_server_request_header)> (http) [5687] [&HttpSM::state_send_server_request_header, VC_EVENT_WRITE_COMPLETE]
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:2666 (main_handler)> (http) [5687] [HttpSM::main_handler, VC_EVENT_EOS]
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:1836 (state_read_server_response_header)> (http) [5687] [&HttpSM::state_read_server_response_header, VC_EVENT_EOS]
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:1923 (state_read_server_response_header)> (http_seq) Error parsing server response header
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpSM.cc:5513 (handle_server_setup_error)> (http) [5687] [&HttpSM::handle_server_setup_error, VC_EVENT_EOS]
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpTransact.cc:3394 (HandleResponse)> (http_trans) [5687] [HttpTransact::HandleResponse]
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpTransact.cc:3395 (HandleResponse)> (http_seq) [5687] [HttpTransact::HandleResponse] Response received
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpTransact.cc:8497 (ink_cluster_time)> (http_trans) [ink_cluster_time] local: 1610020935, highest_delta: 0, cluster: 1610020935
[Jan  7 13:02:15.795] Server {0x1462a05c1700} DEBUG: <HttpTransact.cc:3402 (HandleResponse)> (http_trans) [5687] [HandleResponse] response_received_time: 1610020935
+++++++++ Incoming O.S. Response +++++++++
-- State Machine Id: 5687
HTTP/1.0 0
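
The trace shows the request header being written and then an EOS arriving before any response header. The condition itself (a peer that reads the request and closes without answering) can be reproduced outside TrafficServer. The following is a minimal sketch under that assumption only; it uses a throwaway local server and Python's http.client, all names are hypothetical, and it says nothing about why haproxy would close the connection.

```python
import http.client
import socket
import threading


def serve_and_hang_up(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(65536)  # consume the request before closing the socket


def request_once(port):
    """Return the status code, or the exception name if no response arrived."""
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/path/")
        return client.getresponse().status
    except ConnectionError as exc:
        # The peer closed before sending a status line.
        return type(exc).__name__
    finally:
        client.close()


def demo():
    """Start the hang-up server on a free port and send it one request."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_and_hang_up, args=(listener,))
    worker.start()
    try:
        return request_once(port)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The client ends up with a connection error instead of a status code, which is the same situation the proxy has to turn into either a retry or a 502.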

I clearly see that setting

proxy.config.http.send_http11_requests INT 0

reduces the number of failures to almost 0 (but they still occur).

As a workaround, while keeping HTTP/1.1, I used:

CONFIG proxy.config.http.connect_attempts_max_retries INT 3
CONFIG proxy.config.http.connect_attempts_max_retries_dead_server INT 1

With these settings my client is never served a 502 in this case. With
TrafficServer 7 nothing appears in var/log/trafficserver/error.log,
while with TrafficServer 8 there is:

20210111.14h30m34s CONNECT:[0] could not connect [CONNECTION_CLOSED] to
10.0.251.170 for 'http://10.0.251.170:41080/'

TrafficServer then simply reconnects to the origin.
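
The effect of raising the retry count from 0 to 3 can be illustrated with a small sketch. This is not TrafficServer's actual retry logic; it is a generic retry loop with hypothetical names, assuming the origin hangs up once and answers normally on the next attempt.

```python
def fetch_with_retries(send_request, max_retries):
    """Call send_request(); on ConnectionError retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send_request()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries: a proxy would answer 502 here


def make_flaky_origin(failures):
    """Hypothetical origin that hangs up `failures` times, then answers 200."""
    remaining = [failures]

    def send_request():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConnectionResetError("origin closed the connection")
        return 200

    return send_request


if __name__ == "__main__":
    # With retries allowed, the single hang-up is hidden from the client.
    print(fetch_with_retries(make_flaky_origin(1), max_retries=3))
    # With no retries, the same hang-up surfaces as an error.
    try:
        fetch_with_retries(make_flaky_origin(1), max_retries=0)
    except ConnectionError as exc:
        print("no retries left:", exc)
```

Automatic retries of this kind are generally only safe for idempotent requests such as GET.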

My questions are:

 * Is this possibly a bug in TrafficServer (somewhat similar to
https://issues.apache.org/jira/browse/TS-3959)?
 * Is it a misconfiguration of TrafficServer?
 * Is everything fine with TrafficServer, and the problem is really in my
origin software?

Thanks for the tips,
Regards,
Łukasz
