* COMMENT -*- mode: org -*-
#+Date: 2015-09-07
Time-stamp: <2015-09-07>

* 2015-09-07 write an apache2 access log parser in haskell.
practice using attoparsec.

sample input:

  10.21.176.7 - - [06/Sep/2015:06:26:02 +0800] "GET /v3/auth/tokens HTTP/1.1" 200 3997 "-" "-" 1011068
  10.21.176.7 - - [06/Sep/2015:06:26:03 +0800] "GET /v3/auth/tokens HTTP/1.1" 200 4065 "-" "python-requests/2.2.1 CPython/2.7.3 Linux/3.13.0-38-generic" 5481
  10.21.176.7 - - [06/Sep/2015:06:26:02 +0800] "GET /v2.0/tokens/e79ce79f16f04852a7d981009090fb75 HTTP/1.1" 200 2692 "-" "python-requests/2.2.1 CPython/2.7.3 Linux/3.13.0-40-generic" 879293
  10.21.176.7 - - [06/Sep/2015:06:50:02 +0800] "POST /v2.0/tokens HTTP/1.1" 200 2615 "-" "python-neutronclient" 211874238

the parser should allow me to
- filter slow requests. -gt 2s
- find the most frequent requests.
  order by request verb and path frequency.
  print out the first N most frequently requested verb & path pairs.

  get top 5 requests:
    cd /var/log/apache2
    ~sylecn/a2p -i access.keystone.admin.ssl.log -t 5

    stack build && stack exec a2p -- -i ~/d/t1 --top 1

  output for user:
    M% COUNT POST xxx
    M% COUNT GET xxx

  return value for this:
    Map (verb, path) count
  order by count. get first N.
  filter the map before converting to alist.

  - how to do insert or update when counting?
    no separate lookup is needed: Data.Map.insertWith (+) key 1 inserts or
    updates in one step. the map is immutable, so updating it is not a
    thread-safety concern by itself.
  - works perfectly on production access log.

how to use the parser?
  a2p -i ACCESS_LOG_FILE [-gt 20s]

- created project at ~/haskell/apache2-log-parser
- let's test parse efficiency first.
  print a summary line:
    parsed N input lines in M seconds. (P lines/s)
- support -f param and parsing lines from file.
- can't convert duration to Int easily.
- Applicative is powerful. I didn't know about <* before.
    logParser :: Parser [AccessLog]
    logParser = many $ parseLine <* endOfLine
- how to use int type param in optparse-applicative?
  haskell - How to use Int with the optparse-applicative inside a data constructor? - Stack Overflow
  http://stackoverflow.com/questions/31636962/how-to-use-int-with-the-optparse-applicative-inside-a-data-constructor
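
note on the counting step (Map (verb, path) count, order by count, get first N): a minimal sketch of how that could look, not the actual a2p code. it assumes requests have already been parsed into (verb, path) pairs; the names Request, countRequests, topN and formatRow are made up for illustration. it only uses base and containers.

```haskell
import Data.List (foldl', sortBy)
import qualified Data.Map.Strict as Map
import Data.Ord (Down (..), comparing)
import Text.Printf (printf)

-- Hypothetical key type: (verb, path), e.g. ("GET", "/v3/auth/tokens").
type Request = (String, String)

-- Count how often each request occurs. insertWith stores 1 for a new key
-- and adds 1 to the existing count otherwise, so no lookup is needed.
countRequests :: [Request] -> Map.Map Request Int
countRequests = foldl' (\m r -> Map.insertWith (+) r 1 m) Map.empty

-- The n most frequent requests, highest count first.
-- sortBy is stable, so ties stay in the map's key order.
topN :: Int -> Map.Map Request Int -> [(Request, Int)]
topN n = take n . sortBy (comparing (Down . snd)) . Map.toList

-- One report row in the "M% COUNT VERB PATH" shape from the notes.
-- Assumes total > 0, which holds whenever there is a row to print.
formatRow :: Int -> (Request, Int) -> String
formatRow total ((verb, path), c) =
  printf "%.1f%% %d %s %s" pct c verb path
  where
    pct = 100 * fromIntegral c / fromIntegral total :: Double

main :: IO ()
main = do
  -- (verb, path) pairs taken from the sample log lines above.
  let reqs =
        [ ("GET", "/v3/auth/tokens")
        , ("GET", "/v3/auth/tokens")
        , ("GET", "/v2.0/tokens/e79ce79f16f04852a7d981009090fb75")
        , ("POST", "/v2.0/tokens")
        ]
      counts = countRequests reqs
  mapM_ (putStrLn . formatRow (length reqs)) (topN 2 counts)
```

the slow-request filter (-gt) would go before countRequests, on the parsed records, so the map only ever holds requests that passed the filter.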