path: root/src
Age | Commit message | Author
2020-09-09 | fix negative timeout resulting in select() EINVAL | rofl0r
2020-09-08 | get_request_entity: fix regression w/ CONNECT method | rofl0r
introduced in 88153e944f7d28f57cccc77f3228a3f54f78ce4e. when the CONNECT method is used (HTTPS) and e.g. a filtered domain is requested, there's no data on readfds, only on writefds. this caused the response from the connection to hang until the timeout was hit. in the past, such a scenario always produced a "no entity" message in the tinyproxy logs.
2020-09-07 | make acl lookup 450x faster by using sblist | rofl0r
tested with 32K acl rules, generated by

    for x in `seq 128` ; do for y in `seq 255` ; do \
    echo "Deny 10.$x.$y.0/24" ; done ; done

after loading the config (which is dogslow too), tinyproxy required 9.5 seconds for the acl check on every request. after switching the list implementation to sblist, a request with the full acl check now takes only 0.025 seconds. the time spent for loading the config file is identical for both list implementations, roughly 30 seconds.

(in a previous test, 65K acl rules were generated, but every connection required almost 2 minutes to crunch through the list...)
2020-09-07 | acl: typedef access_list to acl_list_t | rofl0r
this makes it easy to switch the underlying implementation.
2020-09-07 | check_acl: do full_inet_pton() only once per ip | rofl0r
if there's a long list of acl's, doing full_inet_pton() over and over with the same IP isn't really efficient.
2020-09-07 | get_request_entity: respect user-set timeout | rofl0r
get_request_entity() is only called on error, for example if a client doesn't pass a check_acl() check. in such a case it's possible that the client fd isn't yet ready to read from. using select() with a timeout timeval of {0,0} causes it to return immediately and return 0 if there's no data ready to be read. this resulted in immediate connection termination rather than returning the 403 access denied error page to the client and a confusing "no entity" message displayed in the proxy log.
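a minimal sketch of waiting for readability with a configurable timeout instead of {0,0}. wait_readable() and its parameters are hypothetical names for illustration, not tinyproxy's actual code. the clamp of negative values reflects the later "negative timeout resulting in select() EINVAL" fix.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not tinyproxy's actual code): wait until fd has
 * data to read or timeout_sec seconds have passed.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_sec)
{
    fd_set rset;
    struct timeval tv;
    int ret;

    if (timeout_sec < 0)
        timeout_sec = 0; /* a negative tv_sec makes select() fail with EINVAL */

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* with tv = {0,0} this returns immediately; with a real timeout it
     * gives a slow client the chance to deliver its request data */
    ret = select(fd + 1, &rset, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    return ret > 0 && FD_ISSET(fd, &rset);
}
```

a caller would pass the timeout from the config and only give up on the client when the function returns 0 or -1.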
2020-09-07 | change loglevel of start/stop/reload messages to NOTICE | rofl0r
this makes them visible when the verbose INFO loglevel is not desired. closes #78
2020-09-07 | upstream: fix ip/mask calculation for types other than none | rofl0r
the code wrongly processed the site_spec (here: domain) parameter only when PT_TYPE == PT_NONE. re-arranged code to process it correctly whenever passed.

additionally the mask is now also applied to the passed subnet/ip, so a site_spec like 127.0.0.1/8 is converted into 127.0.0.0/8. also the case where inet_aton fails now produces a proper error message.

note that the code still doesn't process ipv6 addresses and mask. to support it, we should use the existing code in acl.c and refactor it so it can be used from both call sites.

closes #83
closes #165
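the mask step can be sketched as follows. parse_ipv4_subnet() is a hypothetical helper, not the function in the upstream code, and it uses inet_pton() rather than the inet_aton() mentioned above. it is IPv4-only, matching the limitation noted in the commit.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "a.b.c.d/bits" into a network address and a
 * netmask, both in network byte order. The mask is applied to the address,
 * so host bits are cleared. Returns 0 on success, -1 on malformed input. */
static int parse_ipv4_subnet(const char *spec, struct in_addr *net,
                             struct in_addr *mask)
{
    char buf[INET_ADDRSTRLEN];
    const char *slash = strchr(spec, '/');
    size_t iplen = slash ? (size_t)(slash - spec) : strlen(spec);
    long bits = 32; /* no "/bits" suffix means a single host */
    uint32_t m;

    if (iplen == 0 || iplen >= sizeof buf)
        return -1;
    memcpy(buf, spec, iplen);
    buf[iplen] = '\0';

    if (slash) {
        char *end;
        bits = strtol(slash + 1, &end, 10);
        if (end == slash + 1 || *end != '\0' || bits < 0 || bits > 32)
            return -1;
    }
    if (inet_pton(AF_INET, buf, net) != 1)
        return -1;

    /* shifting a 32-bit value by 32 is undefined, so handle /0 separately */
    m = bits == 0 ? 0 : 0xFFFFFFFFu << (32 - bits);
    mask->s_addr = htonl(m);
    net->s_addr &= mask->s_addr;
    return 0;
}
```

with the input "127.0.0.1/8" the helper yields the network 127.0.0.0 and the mask 255.0.0.0.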
2020-09-07 | html-error: substitute template variables via a regex | rofl0r
previously, in order to detect and insert {variables} into error/stats templates, tinyproxy iterated char-by-char over the input file, and would try to parse anything inside {} pairs and treat it like a variable name. this breaks CSS, and additionally it's dog slow as tinyproxy wrote every single character to the client via a write syscall.

now we process line-by-line, and inspect all matches of the regex \{[a-z]{1,32}\}. if the contents of the match are a known variable name, substitution takes place. if not, the contents are passed as-is to the client. also the chunks before and after matches are written in a single syscall.

closes #108
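a rough sketch of the regex-based substitution using POSIX regcomp()/regexec(). struct tmpl_var, lookup_var(), append() and substitute_line() are hypothetical names, and the result is collected in a buffer here rather than written to a client fd.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical variable table; the real code keeps its own lookup structure. */
struct tmpl_var { const char *name; const char *value; };

static const char *lookup_var(const struct tmpl_var *vars, size_t n,
                              const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++)
        if (strlen(vars[i].name) == len && memcmp(vars[i].name, name, len) == 0)
            return vars[i].value;
    return NULL;
}

/* Append at most len bytes of src to the NUL-terminated buffer out. */
static void append(char *out, size_t outsz, const char *src, size_t len)
{
    size_t used = strlen(out);
    if (used + 1 >= outsz)
        return;
    if (len > outsz - used - 1)
        len = outsz - used - 1;
    memcpy(out + used, src, len);
    out[used + len] = '\0';
}

/* Replace known {variables} in one line; unknown ones are copied as-is. */
static int substitute_line(const char *line, const struct tmpl_var *vars,
                           size_t nvars, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m;
    const char *p = line;

    if (outsz == 0 || regcomp(&re, "\\{[a-z]{1,32}\\}", REG_EXTENDED) != 0)
        return -1;
    out[0] = '\0';
    while (regexec(&re, p, 1, &m, 0) == 0) {
        size_t mlen = (size_t)(m.rm_eo - m.rm_so);
        const char *val = lookup_var(vars, nvars, p + m.rm_so + 1, mlen - 2);

        append(out, outsz, p, (size_t)m.rm_so);   /* chunk before the match */
        if (val)
            append(out, outsz, val, strlen(val)); /* known variable */
        else
            append(out, outsz, p + m.rm_so, mlen); /* unknown: pass through */
        p += m.rm_eo;
    }
    append(out, outsz, p, strlen(p));             /* chunk after the last match */
    regfree(&re);
    return 0;
}
```

a CSS rule such as `body { color: red }` does not match the pattern because of the spaces inside the braces, so it passes through untouched.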
2020-09-07 | Do not give error while storing invalid header | [anp/hsw]
2020-09-07 | config parser: increase possible line length limit | rofl0r
let's use POSIX LINE_MAX (usually 4KB) instead of 1KB. closes #226
2020-09-06 | allow SIGUSR1 to be used as an alternative to SIGHUP | rofl0r
this allows a tinyproxy session in terminal foreground mode to reload its configuration without dropping active connections.
2020-09-06 | main.c: remove set_signal_handler code duplication | rofl0r
2020-09-06 | do not catch SIGHUP in foreground-mode | rofl0r
it's quite unexpected for an application running foreground in a terminal to keep running when the terminal is closed. also in such a case (if file logging is disabled) there's no way to see what's happening to the proxy.
2020-09-06 | send_html_file(): also set empty variables to "(unknown)" | rofl0r
2020-09-06 | transparent: remove usage of inet_ntoa(), make IPv6 ready | rofl0r
inet_ntoa() uses a static buffer and is therefore not threadsafe. additionally it has been deprecated by POSIX. by using inet_ntop() instead the code has been made ipv6 aware. note that this codepath was only entered in the unlikely event that no Host header was being passed to the proxy, i.e. pre-HTTP/1.1.
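for illustration, a small sketch of the inet_ntop() pattern with a caller-supplied buffer. addr_to_string() is a hypothetical helper name, not a function from the codebase.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: format an IPv4 or IPv6 socket address into a buffer
 * owned by the caller, so concurrent threads never share a static buffer.
 * Returns buf on success, NULL on failure or unsupported address family. */
static const char *addr_to_string(const struct sockaddr *sa, char *buf,
                                  socklen_t len)
{
    switch (sa->sa_family) {
    case AF_INET:
        return inet_ntop(AF_INET,
                         &((const struct sockaddr_in *)sa)->sin_addr,
                         buf, len);
    case AF_INET6:
        return inet_ntop(AF_INET6,
                         &((const struct sockaddr_in6 *)sa)->sin6_addr,
                         buf, len);
    default:
        return NULL;
    }
}
```

a buffer of INET6_ADDRSTRLEN bytes is large enough for both address families.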
2020-09-05 | filter: reduce memory usage, fix OOM crashes | rofl0r
* check return values of memory allocation and abort gracefully in out-of-memory situations
* use sblist (linear dynamic array) instead of linked list
  - this removes one pointer per filter rule
  - removes need to manually allocate/free every single list item (instead block allocation is used)
  - simplifies code
* remove storage of (unused) input rule
  - removes one char* pointer per filter rule
  - removes storage of the raw bytes of each filter rule
* add line number to display on out-of-memory/invalid regex situation
* replace duplicate filter_domain()/filter_host() code with a single function filter_run()
  - reduces code size and management effort

with these improvements, >1 million regex rules can be loaded with 4 GB of RAM, whereas previously it crashed with about 950K. the list for testing was assembled from http://www.shallalist.de/Downloads/shallalist.tar.gz

closes #20
2020-09-01 | Change loglevel for "Maximum number of connections reached" | Nicolai Søborg
I was hit by this and did not see anything in the log; connections were just hanging. I think warning is a better log level.
2020-08-19 | upstream: allow port 0 to be specified | rofl0r
this is useful to use upstream directive to null-route a specific target domain. e.g.

    upstream http 0.0.0.0:0 ".adserver.com"
2020-07-15 | enforce socket timeout on new sockets via setsockopt() | rofl0r
the timeout option set by the config file wasn't respected at all, so it could happen that connections became stale and were never released, which eventually caused tinyproxy to hit the limit of open connections and never accept new ones. addresses #274
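a minimal sketch of enforcing a timeout on a socket with setsockopt(). the commit does not say which options are set; using SO_RCVTIMEO and SO_SNDTIMEO is an assumption here, and set_socket_timeout() is a hypothetical name.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: make blocking reads and writes on fd fail with a
 * timeout error after the given number of seconds instead of hanging.
 * Assumption: SO_RCVTIMEO/SO_SNDTIMEO are the options of interest.
 * Returns 0 on success, -1 if either setsockopt() call fails. */
static int set_socket_timeout(int fd, int seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) < 0)
        return -1;
    return 0;
}
```

calling this on every freshly accepted or connected socket means a stalled peer eventually produces an error, so the connection slot can be released.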
2020-07-06 | fix check_acl compilation with --enable-debug | xiejianjun
regression introduced in f6d4da5d81694721bf50b2275621e7ce84e6da30. this has been overlooked due to the assert macro being optimized out in non-debug builds.
2020-03-18 | transparent: fix invalid memory access | rofl0r
getsockname() requires addrlen to be set to the size of the sockaddr struct passed as the addr, and a check whether the returned addrlen exceeds the initially passed size (to determine whether the address returned is truncated). with a request like "GET /\r\n\r\n" where length is 0 this caused the code to assume success and use the values of the uninitialized sockaddr struct.
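the calling convention described above could look like this. get_local_addr() is a hypothetical wrapper written for illustration, not the code from the fix.

```c
#include <string.h>
#include <sys/socket.h>

/* Hypothetical wrapper: fetch the local address of fd.
 * addrlen is a value-result argument: it must hold the buffer size on
 * entry and holds the real address size on return.
 * Returns 0 on success, -1 on error or if the address was truncated. */
static int get_local_addr(int fd, struct sockaddr_storage *ss, socklen_t *len)
{
    socklen_t n = sizeof *ss; /* in: size of the buffer we pass */

    memset(ss, 0, sizeof *ss);
    if (getsockname(fd, (struct sockaddr *)ss, &n) < 0)
        return -1;
    if (n > sizeof *ss)       /* out: larger than the buffer means truncation */
        return -1;
    *len = n;
    return 0;
}
```

using struct sockaddr_storage as the buffer keeps the same code path valid for IPv4 and IPv6 sockets.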
2020-03-16 | anonymous: fix segfault loading config item | rofl0r
unlike other functions called from the config parser code, anonymous_insert() accesses the global config variable rather than passing it as an argument. however the global variable is only set after successful loading of the entire config. we fix this by adding a conf argument to each anonymous_* function, passing the global pointer in calls done from outside the config parser. fixes #292
2020-01-15 | conf: use 2 swappable conf slots, so old config can stay valid | rofl0r
... in case reloading of it after SIGHUP fails, the old config can continue working. (apart from the logging-related issue mentioned in 27d96df99900c5a62ab0fdf2a37565e78f256d6a)
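the two-slot idea can be sketched like this. struct config, parse_config() and reload_config() are toy stand-ins invented for illustration (the "config file" is just a string with two numbers), not the real structures or parser.

```c
#include <stdio.h>
#include <string.h>

/* Toy config with two fields; the real struct has many more members. */
struct config { int port; int timeout; };

static struct config slots[2];
static struct config *config = &slots[0]; /* the active slot */

/* Toy parser: expects "PORT TIMEOUT". Returns 0 on success, -1 on error. */
static int parse_config(const char *text, struct config *out)
{
    return sscanf(text, "%d %d", &out->port, &out->timeout) == 2 ? 0 : -1;
}

/* Load into the inactive slot and swap the pointer only on success, so a
 * broken config never replaces the one currently in use. */
static int reload_config(const char *text)
{
    struct config *spare = (config == &slots[0]) ? &slots[1] : &slots[0];

    memset(spare, 0, sizeof *spare);
    if (parse_config(text, spare) != 0)
        return -1; /* active config left untouched */
    config = spare;
    return 0;
}
```

all readers go through the `config` pointer, so a failed reload is invisible to them.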
2020-01-15 | conf: fix loading of default values | rofl0r
previously, default values were stored once into a static struct, then on each reload item by item copied manually into a "new" config struct. this has proven to be error-prone, as additions in one of the 2 locations were not propagated to the second one, apart from being simply a lot of gratuitous code. we now simply load the default values directly into the config struct to be used on each reload. closes #283
2020-01-15 | remove duplicate code calling reload_config_file() | rofl0r
as a side effect of not updating the config pointer when loading the config file fails, the "FIXME" level comment to take appropriate action in that case has been removed. the only issue remaining when receiving a SIGHUP and encountering a malformed config file would now be the case that output to syslog/logfile won't be resumed, if initially so configured.
2020-01-15 | access config via a pointer, not a hardcoded struct address | rofl0r
this is required so we can elegantly swap out an old config for a new one in the future and remove lots of boilerplate from config initialization code. unfortunately this is a quite intrusive change as the config struct was accessed in numerous places, but frankly it should have been done via a pointer right from the start. right now, we simply point to a static struct in main.c, so there shouldn't be any noticeable changes in behaviour.
2020-01-15 | remove config file name item from conf struct | rofl0r
since this is set via command line, we can deal with it easily from where it is actually needed.
2020-01-15 | remove godaemon member from config structure | rofl0r
since this option can't be set via config file, it makes sense to factor it out and use it only where strictly needed, e.g. in startup code.
2020-01-15 | log: remove special case code for daemonized mode without logfile | rofl0r
if daemon mode is used and neither logfile nor syslog options specified, this is clearly a misconfiguration issue. don't try to be smart and work around that, so less global state information is required. also, this case is already checked for in main.c:334.
2020-01-15 | syslog: always use LOG_USER facility | rofl0r
LOG_DAEMON isn't specified in POSIX and the gratuitously different treatment is in the way of a planned cleanup.
2020-01-15 | move commandline parsing to main() | rofl0r
2020-01-15 | move initialize_config_defaults to conf.c | rofl0r
2019-12-21 | implement detection and denial of endless connection loops | rofl0r
it is quite easy to bring down a proxy server by forcing it to make connections to one of its own ports, because this will result in an endless loop spawning more and more connections, until all available fds are exhausted. since there's a potentially infinite number of potential DNS/ip addresses resolving to the proxy, it is impossible to detect an endless loop by simply looking at the destination ip address and port. what *is* possible though is to record the ip/port tuples assigned to outgoing connections, and then compare them against new incoming connections. if they match, the sender was the proxy itself and therefore needs to reject that connection. fixes #199.
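the record-and-compare idea can be sketched as below. struct loop_records and both functions are hypothetical; the sketch is IPv4-only, uses a fixed-size table, and leaves out locking and the removal of entries when a connection closes. the local tuple of an outgoing socket would typically come from getsockname(), the peer tuple of an incoming one from accept().

```c
#include <netinet/in.h>
#include <stddef.h>

#define MAX_OUTGOING 64 /* illustrative limit only */

/* Hypothetical table of local ip/port tuples of our outgoing connections. */
struct loop_records {
    struct sockaddr_in addrs[MAX_OUTGOING];
    size_t count;
};

/* Remember the local ip/port the kernel assigned to an outgoing connection.
 * Returns 0 on success, -1 if the table is full. */
static int loop_record_add(struct loop_records *r,
                           const struct sockaddr_in *local)
{
    if (r->count >= MAX_OUTGOING)
        return -1;
    r->addrs[r->count++] = *local;
    return 0;
}

/* Returns 1 if the peer of an incoming connection matches one of our own
 * outgoing connections, i.e. the proxy is talking to itself. */
static int loop_is_own_connection(const struct loop_records *r,
                                  const struct sockaddr_in *peer)
{
    for (size_t i = 0; i < r->count; i++)
        if (r->addrs[i].sin_addr.s_addr == peer->sin_addr.s_addr &&
            r->addrs[i].sin_port == peer->sin_port)
            return 1;
    return 0;
}
```

a connection for which loop_is_own_connection() returns 1 would be rejected before any further outgoing connection is made.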
2019-12-21 | do hostname resolution only when it is absolutely necessary for ACL check | rofl0r
tinyproxy used to do a full hostname resolution whenever a new client connection happened, which could cause very long delays (as reported in #198). there's only a single place/scenario that actually requires a hostname, and that is when an Allow/Deny rule exists for a hostname or domain, rather than a raw IP address. since it is very likely this feature is not very widely used, it makes absolute sense to only do the costly resolution when it is unavoidable.
2019-12-21 | move sockaddr_union to sock.h | rofl0r
2019-12-21 | log.c: protect logging facility with a mutex | rofl0r
since the write syscall is used instead of stdio, accesses have been safe already, but it's better to use a mutex anyway to prevent out-of-order writes.
2019-12-21 | conf.c: merely warn on encountering recently obsoleted config items | rofl0r
if we don't handle these gracefully, pretty much every existing config file will fail with an error, which is probably not very friendly. the obsoleted config items can be made hard errors after the next release.
2019-12-21 | conf.c: pass lineno to handler funcs | rofl0r
2019-12-21 | simplify codebase by using one thread/conn, instead of preforked procs | rofl0r
the existing codebase used an elaborate and complex approach for its parallelism: 5 different config file options, namely
- MaxClients
- MinSpareServers
- MaxSpareServers
- StartServers
- MaxRequestsPerChild
were used to steer how (and how many) parallel processes tinyproxy would spin up at start, how many processes at each point needed to be idle, etc.

it seems all preforked processes would listen on the server port and compete with each other about who would get assigned the new incoming connections. since some data needs to be shared across those processes, a half-baked "shared memory" implementation was provided for this purpose. that implementation used to use files in the filesystem, and since it had a big FIXME comment, the author was well aware of how hackish that approach was.

this entire complexity is now removed. the main thread enters a loop which polls on the listening fds, then spins up a new thread per connection, until the maximum number of connections (MaxClients) is hit. this is the only one of the 5 config options left after this cleanup. since threads share the same address space, the code necessary for shared memory access has been removed. this means that the other 4 mentioned config options will now produce a parse error when encountered.

currently each thread uses a hardcoded default of 256KB per thread for the thread stack size, which is quite lavish and should be sufficient for even the worst C libraries, but people may want to tweak this value to the bare minimum, thus we may provide a new config option for this purpose in the future. i suspect that on heavily optimized C libraries such as musl, a stack size of 8-16 KB per thread could be sufficient.

since the existing list implementation in vector.c did not provide a way to remove a single item from an existing list, i added my own list implementation from my libulz library which offers this functionality, rather than trying to add an ad-hoc, and perhaps buggy implementation to the vector_t list code. the sblist code is contained in an 80 line C file and as simple as it can get, while offering good performance and is proven bugfree due to years of use in other projects.
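a small sketch of starting one thread per unit of work with an explicit stack size. spawn_worker() and hello_worker() are hypothetical names; creating the thread detached is an assumption made for the sketch, and the worker is a trivial stand-in for per-connection handling.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define WORKER_STACK_SIZE (256 * 1024) /* the 256KB default mentioned above */

/* Hypothetical helper: run fn(arg) in a new detached thread with a fixed
 * stack size. Returns 0 on success or an errno-style error code. */
static int spawn_worker(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int ret = pthread_attr_init(&attr);

    if (ret != 0)
        return ret;
    ret = pthread_attr_setstacksize(&attr, WORKER_STACK_SIZE);
    if (ret == 0)
        ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (ret == 0)
        ret = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return ret;
}

/* Stand-in for a connection handler: writes one byte to the fd in *arg. */
static void *hello_worker(void *arg)
{
    int fd = *(int *)arg;

    if (write(fd, "k", 1) != 1)
        return NULL;
    return NULL;
}
```

in an accept loop, the main thread would call spawn_worker() once per accepted connection and refuse new connections once the configured maximum is reached.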
2019-11-27 | Use gai_strerror() to report errors of getaddrinfo() and getnameinfo() | Martin Kutschker
2019-06-14 | fix usage of stathost in combination with basic auth | rofl0r
http protocol requires different treatment of proxy auth vs server auth. fixes #246
2019-05-05 | filter file: Don't ignore lines with leading whitespace (#239) | Janosch Hoffmann
The new code skips leading whitespaces before removing trailing whitespaces and comments. Without doing this, lines with leading whitespace are treated like empty lines (i.e. they are ignored).
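A minimal sketch of the described order of operations: skip leading whitespace first, then cut the comment and trailing whitespace. The function name clean_filter_line() is hypothetical, and treating '#' as the comment marker is an assumption not stated in the commit message.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: clean one line of a filter file in place.
 * Assumption: a '#' starts a comment that runs to the end of the line.
 * Returns a pointer into line; an empty string means "nothing to load". */
static char *clean_filter_line(char *line)
{
    char *start = line;
    char *end;

    /* skip leading whitespace so an indented rule is not seen as empty */
    while (*start && isspace((unsigned char)*start))
        start++;

    /* cut off a trailing comment, if any */
    end = strchr(start, '#');
    if (!end)
        end = start + strlen(start);

    /* strip trailing whitespace, including the newline */
    while (end > start && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return start;
}
```

A caller would skip the line whenever the returned string is empty.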
2018-12-15 | child.c: properly initialize fdset for each select() call (#216) | rofl0r
it was reported that because the fdset was only initialized once, tinyproxy would fail to properly listen on more than one interface. closes #214 closes #127
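a sketch of why the set has to be rebuilt: select() overwrites the fd_set with the fds that are ready, so a set initialized only once loses its other members after the first call. wait_any_readable() is a hypothetical helper, not the code in child.c.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait until one of the n fds becomes readable.
 * Returns a readable fd, or -1 on timeout or error. */
static int wait_any_readable(const int *fds, size_t n, int timeout_sec)
{
    fd_set rset;
    struct timeval tv;
    int maxfd = -1;

    /* rebuild the set on every call: select() modifies it in place */
    FD_ZERO(&rset);
    for (size_t i = 0; i < n; i++) {
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    if (select(maxfd + 1, &rset, NULL, NULL, &tv) <= 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))
            return fds[i];
    return -1;
}
```

with several listening sockets, each call sees all of them again, regardless of which one was ready last time.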
2018-11-23 | Basic Auth: allow almost all possible characters for user/pass | Vasily
previously, user/pass were restricted to alphanumeric chars only.
2018-09-01 | Remove unused authors.c/authors.h and generation mechanism. | Michael Adam
Signed-off-by: Michael Adam <obnox@samba.org>
2018-09-01 | main: remove the "-l" switch to display the license and authors | Michael Adam
Signed-off-by: Michael Adam <obnox@samba.org>
2018-05-29 | fix socks5 upstream user/pass subnegotiation check | rofl0r
RFC 1929 specifies that the user/pass auth subnegotiation repurposes the version field for the version of that specification, which is 1, not 5. however there's quite a good deal of software out there which got it wrong and replies with version 5 to a successful authentication, so let's just accept both forms - other socks5 client programs like curl do the same. closes #172
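the lenient check can be sketched as follows. socks5_userpass_reply_ok() is a hypothetical name; per RFC 1929 the server reply consists of a version byte followed by a status byte, where status 0 means success.

```c
#include <stddef.h>

/* Hypothetical helper: validate the 2-byte reply to a SOCKS5 user/pass
 * subnegotiation. reply[0] is the version, reply[1] the status.
 * Version 1 is what RFC 1929 requires; 5 is tolerated because some
 * servers send it by mistake. Returns 1 if authentication succeeded. */
static int socks5_userpass_reply_ok(const unsigned char *reply, size_t len)
{
    if (len < 2)
        return 0;
    if (reply[0] != 1 && reply[0] != 5)
        return 0;
    return reply[1] == 0;
}
```

any non-zero status is treated as a failed authentication regardless of the version byte.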
2018-03-29 | fix basicauth string comparison | rofl0r
closes #160
2018-03-27 | html-error: Make a switch fallthrough explicit | Michael Adam
This silences a gcc v7 compile warning.
Signed-off-by: Michael Adam <obnox@samba.org>