- lib shared code for server and client, the reliable-download library
- rd-api rd-api cli tool, server side
- rd rd cli tool, client side
- misc/ learning tools and temp code
- test/ tests
- package.yaml stack project description
- pypi/ for release on pypi, see pypi/Makefile
** 2018-05-09 how to release latest code on PyPI? how to make a release?
- update version number in
- lib/RD/CliVersion.hs (required, used by pypi pkg)
- package.yaml (optional)
- update README file. add ChangeLog entry on *README.rst
./README.md
./pypi/rd-api/README.rst
./pypi/rd-client/README.rst
- build wheel and test it in production server
make dist -C pypi
wheel will be built in pypi/rd-api/dist, pypi/rd-client/dist dir.
export RD_API_TWINE_TOKEN=xxx
export RD_CLIENT_TWINE_TOKEN=xxx
To release only the server:
export RD_API_TWINE_TOKEN=xxx
To release only the client:
export RD_CLIENT_TWINE_TOKEN=xxx
- problems
- how to sync build files for pypi?
rsync -n -air --files-from=pypi/build_files ./ s02:projects/reliable-download/
rsync -air --files-from=pypi/build_files ./ s02:projects/reliable-download/
** 2018-05-08 example run in prod env
- try it on de03
on ryzen5,
cd ~/projects/reliable-download/
FN=`stack exec which rd-api`
gzip -k "$FN"
scp "$FN.gz" de03:d/
chmod +x rd-api
env WEB_ROOT=$PWD ./rd-api
curl -v http://de03.dev.emacsos.com:8082/rd/
curl -I http://de03.dev.emacsos.com:8082/virtio-win-0.1.215.iso
tmake stack exec rd -- -d ~/d/.blocks -o ~/d/ http://de03.dev.emacsos.com:8082/virtio-win-0.1.215.iso
tmake ~/d/rd -d ~/d/.blocks -o ~/d/ http://de03.dev.emacsos.com:8082/virtio-win-0.1.215.iso
#+BEGIN_SRC sh
sylecn@ryzen5:~/projects/reliable-download$ tmake stack exec rd -- -d ~/d/.blocks -o ~/d/ http://138.201.95.248:8082/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
Tue May 8 00:45:31 CST 2018
will start timed run in 3 sec
running command: stack exec rd -- -d /home/sylecn/d/.blocks -o /home/sylecn/d/ http://138.201.95.248:8082/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
GET /rd/ api ok
Downloading file: "gitlab-ce_10.3.5-ce.0_amd64_xenial.deb", 377 MiB, 189 blocks
189 new block(s) ready on server side
combining blocks to create /home/sylecn/d/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
file downloaded to /home/sylecn/d/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
all urls downloaded.
started at 2018-05-08 00:45:34
stopped at 2018-05-08 00:47:20
Duration: 106 seconds
sylecn@ryzen5:~/projects/reliable-download$
#+END_SRC
download works perfectly, except that progress info and download speed info
are missing.
sha1sum for the whole file matches.
- 2018-05-08 when using 5 threads for the download.
#+BEGIN_SRC sh
block 188 fetched
combining blocks to create /home/sylecn/d/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
file downloaded to /home/sylecn/d/gitlab-ce_10.3.5-ce.0_amd64_xenial.deb
all urls downloaded.
started at 2018-05-08 21:56:26
stopped at 2018-05-08 21:57:42
Duration: 76 seconds
#+END_SRC
** 2018-05-06 how to run rd-api in dev env
- how to run rd-api
cd ~/projects/reliable-download/
env WEB_ROOT=/home/sylecn/persist/cache stack exec rd-api
- Test it is working:
static file hosting:
curl -XGET http://localhost:8082/sdkman.sh
rd api:
curl -XGET http://localhost:8082/rd/ideaIC-2018.1.tar.gz
curl -XGET http://localhost:8082/rd/ideaIC-2018.1.tar.gz | jq .
To clear cached file status for ideaIC-2018.1.tar.gz,
redis-cli del "/home/sylecn/persist/cache/ideaIC-2018.1.tar.gz_2097152_status"
the 2097152 there is 2MiB block size in bytes.
- client tool:
cd ~/projects/reliable-download/
curl http://localhost:8082/rd/ideaIC-2018.1.tar.gz
stack exec rd -- -d ~/d/.blocks -o ~/d/ http://localhost:8082/ideaIC-2018.1.tar.gz
test fresh block-not-ready state:
redis-cli del "/home/sylecn/persist/cache/sdkman.sh_2097152_status"
redis-cli del "/home/sylecn/persist/cache/sdkman.sh_2097152"
stack exec rd -- -d ~/d/.blocks -o ~/d/ http://localhost:8082/sdkman.sh
** 2018-05-05 write the main logic of creating block metadata.
then make it work with a thread pool with a single thread.
env WEB_ROOT=/home/sylecn/persist/cache stack exec rd-api
curl -XGET http://localhost:8082/rd/ideaIC-2018.1.tar.gz
this should return json of the block metadata.
- block metadata looks like this:
GET /rd/bigfile
#+BEGIN_SRC sh
{"ok": true,
"block_size": "2MiB", # this is a fixed value.
"file_size": xxxx, # file size in bytes
"block_count": 24,
"blocks": [
[0, 0, 2097151, block1_sha1sum],
[1, 2097152, 4194303, block2_sha1sum],
...
[N, start, end, blockN_sha1sum]
]}
#+END_SRC
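A minimal sketch of how the blocks list above could be derived from a file size, assuming the fixed 2MiB block size. The names blockSize and blockRanges are hypothetical, not taken from the rd-api source, and the sha1sum column is left out.

#+BEGIN_SRC haskell
-- Hypothetical helper for illustration; not the actual rd-api code.
-- Fixed block size: 2 MiB in bytes.
blockSize :: Integer
blockSize = 2 * 1024 * 1024

-- | (blockId, startByte, endByte) for each block; both bounds are inclusive.
blockRanges :: Integer -> [(Integer, Integer, Integer)]
blockRanges fileSize =
  [ (i, start, min (start + blockSize - 1) (fileSize - 1))
  | i <- [0 .. blockCount - 1]
  , let start = i * blockSize
  ]
  where
    -- round up, so a trailing partial block still gets its own entry
    blockCount = (fileSize + blockSize - 1) `div` blockSize
#+END_SRC

For an 8 byte file this gives a single block (0, 0, 7); for a file of exactly 4194304 bytes it gives (0, 0, 2097151) and (1, 2097152, 4194303).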
** 2018-05-06 calculate sha1sum for blocks using a thread pool. design try 2.
- data protocol via redis.
hset <filepath>_blockSize blockId sha1sum
set <filepath>_blockSize_status working|done
- worker pool is there for calculating all blocks for one file.
fileQueue
Main ------------> fileWorker
GET /rd/file
if file status is None, push file to fileQueue.
do normal logic.
fileWorker:
fetch file from fileQueue.
start working on blocks one by one.
if block already cached in redis, skip it.
when all done, set <filepath>_blockSize_status done.
- this works and is easy to understand.
WIP info is also kept in redis for each file's block.
- works on first try. excellent.
- problems
- how to fail when redis hget or hset fail?
just return False
if some block fail, set status to error.
next time a GET /rd/file, it will trigger the queue again.
a cron job can also trigger a run.
- mapM how to skip rest when some action failed?
If there is a redis error halfway during calculation, I don't want to
calculate the rest sha1sum, because the result can't be stored.
-
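For the mapM question above, one possible approach (a sketch, not the code used in rd-api) is a small recursive helper that stops at the first failed action. allOrStop is a hypothetical name; each action returns False on failure, matching the "just return False" convention above.

#+BEGIN_SRC haskell
-- | Run the action on each item in order and stop at the first False.
-- Returns True only if every action succeeded; items after a failure
-- are never processed.
allOrStop :: Monad m => (a -> m Bool) -> [a] -> m Bool
allOrStop _ [] = pure True
allOrStop f (x : xs) = do
  ok <- f x
  if ok then allOrStop f xs else pure False
#+END_SRC

Running the per-block actions in ExceptT and using mapM_ there should give the same short-circuit behavior, at the cost of changing the action type.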
** 2018-05-06 how to run hlint
stack exec hlint -- src api client logtest
or run on all git files:
stack exec hlint -- -g
** 2018-05-05 it's impossible to do logging easily in haskell.
two problems
- no way to pass around the logger engine/configuration without a log monad in
your stack. since you may want to log anywhere, you have to change all the
data types.
This requires you to know monads and monad transformers really well.
You can't just get a logger from a "global env" like in other languages.
- no easy way to format a string as Data.Text.Text.
both printf and Text.Format.format have too much noise when doing logging.
I need to write:
runLogT "Main" logger $ do
logInfo_ $ LT.toStrict $ TF.format "will listen on {}:{}" ((host config), (port config))
I would like to write:
logInfo_ $ format "will listen on {}:{}" (host config) (port config)
try https://hackage.haskell.org/package/formatting
Formatting in Haskell
https://chrisdone.com/posts/formatting
logInfo_ $ sformat ("will listen on " % string % ":" % int) (host config) (port config)
// this is better.
** 2018-05-05 writing tests in hspec
- Test WAI application using hspec
hspec/hspec-wai: Helpers to test WAI application with Hspec
https://github.com/hspec/hspec-wai
It has examples. good.
Hspec: A Testing Framework for Haskell
http://hspec.github.io/
- Test.Hspec.Expectations
https://hackage.haskell.org/package/hspec-expectations-0.8.2/docs/Test-Hspec-Expectations.html#v:Expectation
shouldBe
shouldStartWith
shouldEndWith
shouldContain
etc
** 2018-05-04 make hoogle work in current project
stack build hoogle
requires building 54 pkgs. lots of dependencies.
DONE hoogle-5.0.14
- build local database
stack hoogle
DONE 64 pkgs
Updating Haddock index for snapshot packages in
/home/sylecn/.stack/snapshots/x86_64-linux-nopie/lts-10.3/8.2.2/doc/index.html
- stack hoogle html
now it works.
- info: file size
53M .stack-work/
** 2018-05-04 for project notes, see GTD.org id002
** 2018-05-06 client design doc (moved from GTD.org id002)
- client side starts downloading blocks using a thread pool or similar.
block data is saved to .blocks/<fn>/blockN_<block_sha1sum> once it is
fetched in whole and verified. block data is removed when all blocks are
joined into the final file, unless the user specifies -k --keep-blocks on rd cli.
show a nice progress bar.
#+BEGIN_SRC sh
downloading bigfile
xxx blocks
downloading block 1
downloading block 2
1/24 ready, X%
downloading block 3
downloading block 4
2/24 ready, X%
downloading block 5
...
block N ready, 100%
bigfile downloaded.
#+END_SRC
each block download uses the HTTP/1.1 Range header to fetch that block.
Range: bytes=0-499
Range: bytes=500-999
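As a small sketch of the step above: the Range header value for a block can be built directly from the inclusive start and end offsets in the block metadata, since HTTP byte ranges are inclusive on both ends as well. rangeHeaderValue is a hypothetical name, not a function from the rd source.

#+BEGIN_SRC haskell
-- | Build the value of the Range header for one block.
-- start and end are the inclusive byte offsets from the block metadata.
rangeHeaderValue :: Integer -> Integer -> String
rangeHeaderValue start end = "bytes=" ++ show start ++ "-" ++ show end
#+END_SRC

For the first 2MiB block, rangeHeaderValue 0 2097151 gives "bytes=0-2097151".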
** 2018-05-05 API spec :design:
- GET /<path>
static file server. works even without redis.
- GET /rd/<path>
provides rd API for given static file, requires redis-server.
when <path> rd metadata is ready, it will return
#+begin_src json
{
"ok": true, // metadata is ready or not
"msg": "", // msg to user
"path": "test/中文.txt", // file path in URL
"filepath": "./test/中文.txt", // file path on server side
"file_size": 8,
"blocks": [
[
0, // block ID
0, // start byte, inclusive
7, // stop byte, inclusive
"69bca99b923859f2dc486b55b87f49689b7358c7" // block sha1sum
]
],
"block_count": 1,
"block_size": "2MiB"
}
#+end_src
- GET /test-rd/<path>
test rd API path parsing without calculating sha1sum
* lib docs :entry:
** 2018-05-06 http client docs
Making HTTP requests - http-client library
https://haskell-lang.org/library/http-client
Network.HTTP.Simple
https://www.stackage.org/haddock/lts-10.3/http-conduit-2.2.4/Network-HTTP-Simple.html
Network.HTTP.Client
https://www.stackage.org/haddock/lts-10.3/http-client-0.5.7.1/Network-HTTP-Client.html
how to handle exceptions in http-client?
http-client/TUTORIAL.md at master · snoyberg/http-client
https://github.com/snoyberg/http-client/blob/master/TUTORIAL.md#exceptions
** 2018-05-06 handle IO exceptions
- Control.Exception
https://www.stackage.org/haddock/lts-10.3/base-4.10.1.0/Control-Exception.html
- System.IO.Error
https://www.stackage.org/haddock/lts-10.3/base-4.10.1.0/System-IO-Error.html
** 2018-05-05 sol/hpack: hpack: An alternative format for Haskell packages
https://github.com/sol/hpack
** 2018-05-05 HUnit: A unit testing framework for Haskell
https://hackage.haskell.org/package/HUnit
** 2018-05-05 Web.Scotty
https://www.stackage.org/haddock/lts-11.7/scotty-0.11.1/Web-Scotty.html
html :: Text -> ActionM ()
scotty/examples at master · scotty-web/scotty
https://github.com/scotty-web/scotty/tree/master/examples
** 2018-05-05 Data.Aeson
http://hackage.haskell.org/package/aeson-1.3.1.0/docs/Data-Aeson.html
// this seems more readable than lts haskell's doc.
Aeson: the tutorial
https://artyom.me/aeson
** 2018-05-06 optparse-applicative :: Stackage Server
https://www.stackage.org/lts-10.3/package/optparse-applicative-0.14.0.0
** 2022-03-12 drop redis-server as rd-api dependency. :featurereq:
- use a built-in key-value db. such as Berkeley db, sqlite3, or leveldb.
use a well known path for the db name.
$HOME/.cache/reliable-downloader/rd-api.db
-
** 2018-05-05 allow config app at runtime.
via env var and command line parameter.
- HOST
- PORT
- REDIS_HOST
- REDIS_PORT
- WEB_ROOT web root dir, HTTP Path will be relative to this dir.
- WORKER
- 2018-05-10 I'm trying to write my own parser for env vars, just like
optparse-applicative.
see ~/haskell/env-var-parser/src/Lib.hs
it is difficult. I can parse a single parameter easily, but I don't know how
to compose parsers to parse a more complex data structure. read more about
optparse-applicative. I can't understand the source code without more
reading.
pcapriotti/optparse-applicative: Applicative option parser
https://github.com/pcapriotti/optparse-applicative
An applicative Parser is essentially a heterogeneous list or tree of
Options, implemented with existential types.
See this blog post for a more detailed explanation based on a simplified
implementation.
Applicative option parser
https://paolocapriotti.com/blog/2012/04/27/applicative-option-parser/
it requires usage of GADTs, that's why I don't understand it on first look.
The ConsP constructor is the combination of an Option returning a function,
and an arbitrary parser returning an argument for that function. The
combined parser applies the function to the argument and returns a result.
the ConsP constructor is where all the magic happens.
I can't define a data structure like this myself.
because I don't understand what it is.
In the end, the applicative is defined on the list structure. Not on any
option itself. when creating parser, you are creating a list. see option and
optionR. list is easily an instance of functor and applicative.
I think I can make it work on env variable, although I can't write this code
myself.
- how does it construct a value for any data type? I think the Applicative
Parser already makes that work.
how to do error handling for "parse" failures? just return Nothing.
how to make all fields optional? if key not found, just return Nothing for
that field.
- I think it really should happen inside optparse-applicative. otherwise
default value, parsing data is duplicated.
but that will be too difficult for me. This is the first time I see GADT
used. and first time I see a value can be constructed for any data type.
// wait. about "a value can be constructed for any data type", in non-record
syntax, it's just a normal function call. should be easy to construct value
using code.
- for my use case, I will just write non-portable functions.
allow config rd-api using env variable.
- how to construct a data value not using the constructor? just do a regular
function call. the data constructor is a function.
- can I update all records using: config1 {config2}? no.
- how to clean it up, remove intermediate variables.
- test these commands:
stack exec rd-api -- --host=127.0.0.1 --port=8060
env HOST=127.0.0.1 PORT=8060 stack exec rd-api
env HOST=127.0.1.1 PORT=8061 stack exec rd-api -- --host=127.0.0.1 --port=8060
env HOST=127.0.1.1 PORT=abc stack exec rd-api -- --host=127.0.0.1 --port=8060
env WORKER=1 stack exec rd-api
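The test commands above mix env vars and command line options. A minimal sketch of one way to merge the two sources field by field is shown here. It assumes command line values take precedence over env vars and that an unparsable value such as PORT=abc is simply ignored; both are assumptions, not confirmed rd-api behavior. PartialConfig, Config, fromEnvPairs, merge and finalize are hypothetical names, not the real RDConfig code.

#+BEGIN_SRC haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Hypothetical types for illustration only.
data PartialConfig = PartialConfig
  { pHost :: Maybe String
  , pPort :: Maybe Int
  }

data Config = Config
  { host :: String
  , port :: Int
  } deriving (Eq, Show)

-- | Build a partial config from (name, value) pairs, e.g. the result of
-- System.Environment.getEnvironment. A missing key or an unparsable
-- value becomes Nothing.
fromEnvPairs :: [(String, String)] -> PartialConfig
fromEnvPairs env = PartialConfig
  { pHost = lookup "HOST" env
  , pPort = lookup "PORT" env >>= readMaybe
  }

-- | Field-by-field merge; the left argument wins when both are set.
merge :: PartialConfig -> PartialConfig -> PartialConfig
merge a b = PartialConfig
  { pHost = pHost a <|> pHost b
  , pPort = pPort a <|> pPort b
  }

-- | Fill the remaining gaps from a config holding the default values.
finalize :: Config -> PartialConfig -> Config
finalize def p = Config
  { host = fromMaybe (host def) (pHost p)
  , port = fromMaybe (port def) (pPort p)
  }
#+END_SRC

Keeping every field as Maybe until the last step means the merge itself needs no error handling; only finalize has to know the defaults.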
v1.1.1.0 all works.
but I don't like the code in updateRDConfigFromEnvPure
it's unpleasant code.
error handling shouldn't be this complicated.
- check how other people handle env variable.
- envparse, this is similar to ~/haskell/env-var-parser/
- envy
https://www.stackage.org/lts-10.3/package/envy-1.3.0.2
based on the doc. I can see why updateRDConfigFromEnvPure is too
complex. it is trying to do too many things.
the applicative style should be used to create values.
just find another way to do the merge on two values of the same type, or
update runParser to support it.
envy also supports inferring the env var name from the record field name, by
using GHC.Generics. So you no longer need to write a parser.
- check how envy works.
~/haskell/testing/handle-params/app/Main.hs
TODO envy doesn't handle Bool well.
only True is accepted as true.
yes, true, on, 1 are not accepted as true.
- read-env-var, this is a simple wrapper on lookupEnv and
Text.Read.readMaybe. Not what I need.
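For the Bool TODO above, a lenient parser is short enough to write by hand. This is a sketch with a hypothetical name (parseBool); the accepted false values (false, no, off, 0) are an assumption mirroring the true values listed above, and matching is case-insensitive.

#+BEGIN_SRC haskell
import Data.Char (toLower)

-- | Parse a boolean env var value leniently.
-- Returns Nothing for anything that is not a recognized true/false word.
parseBool :: String -> Maybe Bool
parseBool s = case map toLower s of
  v | v `elem` ["true", "yes", "on", "1"] -> Just True
    | v `elem` ["false", "no", "off", "0"] -> Just False
  _ -> Nothing
#+END_SRC

It composes with lookupEnv in the usual way, e.g. (>>= parseBool) <$> lookupEnv "SOME_FLAG", where SOME_FLAG is a placeholder name.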
* current :entry:
**
** 2024-04-07 when combining blocks to create the final file, don't print the
"No block fetched in last 10 seconds" log any more if there are no other files
in the DL list.
** 2024-04-07 rd-api: when a file content is changed on disk, auto invalidate all
cached blocks.
- when a user requests rd metadata and the file's mtime has changed, do a
sha1sum on the first 4M and last 4M of the file. if either sha1sum has changed,
invalidate the cache in redis.
when calculating metadata, cache the file's mtime and the sha1sum of the
first 4M and last 4M of the file's content.
- this will allow DL the correct file when file content for the same file name
is changed.
- give some log in console when file content changes.
-
** 2024-03-12 rd-api: since a file is already transferred block by block, I can
support live compression easily, when the client requests it via a param such as
?compress=zstd. default is no compression.
- when the source file is just a tar, not a compressed format like squashfs or
zip, this can speed up transfer by delaying compression until transfer time.
** 2024-03-12 rd, add a DL remaining time estimate, based on estimated DL speed.
each block has a block size. I know when it is started and when it is
finished. I know how many more blocks to fetch. it should be easy to
estimate. calculate a moving avg speed using the last 5 blocks DLed.
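A rough sketch of the estimate described above. It works on per-block durations instead of speeds, which is equivalent as long as blocks have the same size (the last block may be smaller), and it assumes blocks are fetched one after another. estimateRemaining is a hypothetical name.

#+BEGIN_SRC haskell
-- | Estimate the remaining seconds from the durations (in seconds) of
-- finished blocks, most recent first, and the number of blocks left.
-- Uses the mean of the 5 most recent blocks; Nothing if none finished yet.
estimateRemaining :: [Double] -> Int -> Maybe Double
estimateRemaining durations blocksLeft
  | null recent = Nothing
  | otherwise = Just (avg * fromIntegral blocksLeft)
  where
    recent = take 5 durations
    avg = sum recent / fromIntegral (length recent)
#+END_SRC

With several download threads running in parallel, the result would need to be divided by the number of threads to stay meaningful.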
** 2022-03-15 rd client, is there a built-in repeat/loop function?
IO () -> IO ()
I should not need to write showProgressLoop explicitly.
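Control.Monad.forever from base covers this; its type is Applicative f => f a -> f b, so it also fits IO () -> IO (). A minimal sketch, where loopEvery is a hypothetical helper name:

#+BEGIN_SRC haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forever)

-- | Run an action repeatedly, sleeping between runs.
-- The delay is in microseconds, as taken by threadDelay.
-- The loop never returns normally; it ends only via an exception
-- or when the thread running it is killed.
loopEvery :: Int -> IO () -> IO a
loopEvery delayMicros action = forever (action >> threadDelay delayMicros)
#+END_SRC

A progress loop could then be started with something like forkIO (loopEvery 30000000 showProgress), where showProgress stands for whatever action prints the current state.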
** 2018-05-09 test the app under unstable network.
I remember there are tools that can simulate packet loss.
policy in ovs can do it.
- 2018-05-11 tcp - Simulate delayed and dropped packets on Linux - Stack Overflow
https://stackoverflow.com/questions/614795/simulate-delayed-and-dropped-packets-on-linux
test this in vbox VM. stretch01
see stretch01 daylog.
* done :entry:
** 2019-02-28 bug: rd-api -d java/
option -d: cannot parse value `java/'
- stack exec rd-api -- -d t1/
this works fine.
probably a python wrapper issue.
- problems
- 2024-04-08 -d foo/ works fine.
-d foo, -d ./foo/ also works.
- but getFileStatus on ./test/中文.txt failed:
./test/中文.txt: getFileStatus: does not exist (No such file or directory)
on s02,
cd ~/projects/reliable-download
rd-api -d ./test/
on agem10,
rd http://[240e:388:8a05:500:be24:11ff:fe06:be5f]:8082/中文.txt
- try it on agem10,
rd http://127.0.0.1:8082/中文.txt
also fails.
file name encoding issue?
import System.FilePath ((</>))
let filepath = webRoot (rcConfig rc) </> T.unpack reqFilePath
getFileStatus filepath
I see no encoding issue.
try add some log.
no issue.
getFileStatus works in ghci.
stack ghci reliable-download:exe:rd-api
#+begin_src sh
ghci> import System.Posix.Files (getFileStatus, fileSize)
ghci> s1 <- getFileStatus("./test/TestApi.hs")
ghci> fileSize s1
7647
ghci> s2 <- getFileStatus("./test/中文.txt")
ghci> fileSize s2
8
ghci> :q
Leaving GHCi.
#+end_src
so why did it fail?
maybe it only fails when run via python?
LANG/locale issue?
try run via stack or just binary.
~/.local/pipx/venvs/rd-api/lib/python3.11/site-packages/rdapi/rd-api
it also fails. not a python wrapper issue.
- use abs path in -d works.
it's relative path and CWD issue.
maybe I changed CWD somewhere. check it.
git grep setCurrentDirectory
yes.
-- static app only support serving from PWD
setCurrentDirectory (webRoot config)
so just always use relative path, don't join with webroot dir.
** 2024-04-08 should I use absolute file path in redis key?
this can reduce some sha1 calculation if user run rd-api in different root dir.
e.g.
cd /foo/bar/
rd-api
cd /foo/
rd-api -d bar
# or
rd-api -d /foo/bar
** 2024-04-07 rd-api, rd: switch to fast-logger, use local datetime in logs, not UTC time. :logging:featurereq:
- tinylog is not maintained any more. no longer in latest stackage LTS.
- tinylog types don't allow using local time in logs; working around that requires writing lots of code.
- create a demo project for using RIO and fast-logger.
cd ~/projects/
stack new rio-fastlogger-demo rio
app/Main.hs
withLogFunc lo
https://www.stackage.org/haddock/lts-22.15/rio-0.1.22.0/RIO.html#v:withLogFunc
RIO has built-in log ts support.
RIO timestamp and level output is not pretty and not easily customizable.
maybe later.
** 2023-11-12 rd, ipv6 based URL not supported.
- on de05,
rd-api -h :: -p 8083 --redis-host 10.96.195.242
on pve,
cd /wh01/share/songs/
rd http://[2a01:4f8:c0c:9c42::1]:8083/可一儿歌.tar
BUG nope. rd doesn't support ipv6 URL.
- 2024-04-08
stack exec rd-api -- -h ::
curl http://[::1]:8082/test/中文.txt
this works.
stack exec rd -- -v http://127.0.0.1:8082/test/中文.txt
this works.
stack exec rd -- -v http://[::1]:8082/test/中文.txt
this fails.
getRDResponse :: RDClientRuntimeConfig -> T.Text -> IO RDResponse
(do
req <- parseRequest $ T.unpack url
debugl rc $ "GET /rd" <> decodeUtf8 (path req)
resp <- httpJSON $ req { path="/rd" <> path req }
return $ getResponseBody resp)
search: haskell http-client host ipv6 address support
How to make a request to an IPv6 address using the http-client package in haskell? - Stack Overflow
https://stackoverflow.com/questions/70863436/how-to-make-a-request-to-an-ipv6-address-using-the-http-client-package-in-haskel
http-client 0.7.11 has the fix merged.
lts-18.27 http-client-0.6.4.1
try upgrade lts.
lts-20.26 http-client-0.7.13.1
yeah, that would work.
- problems
- tinylog is not in lts-20.26
build tinylog failed under lts-20.26
#+begin_quote
tinylog > Preprocessing library for tinylog-0.15.0..
tinylog > Building library for tinylog-0.15.0..
tinylog > [1 of 4] Compiling System.Logger.Message
tinylog >
tinylog > /tmp/stack-f92234dbf5c8a904/tinylog-0.15.0/src/System/Logger/Message.hs:57:1: error:
tinylog > Could not find module ‘Data.ByteString.Lazy.Builder’
tinylog > Perhaps you meant
tinylog > Data.ByteString.Builder (from bytestring-0.11.4.0)
tinylog > Data.ByteString.Lazy.Char8 (from bytestring-0.11.4.0)
tinylog > Use -v (or `:set -v` in ghci) to see a list of the files searched for.
tinylog > |
tinylog > 57 | import qualified Data.ByteString.Lazy.Builder as B
tinylog > | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tinylog >
tinylog > /tmp/stack-f92234dbf5c8a904/tinylog-0.15.0/src/System/Logger/Message.hs:58:1: error:
tinylog > Could not find module ‘Data.ByteString.Lazy.Builder.Extras’
tinylog > Perhaps you meant
tinylog > Data.ByteString.Builder.Extra (from bytestring-0.11.4.0)
tinylog > Use -v (or `:set -v` in ghci) to see a list of the files searched for.
tinylog > |
tinylog > 58 | import qualified Data.ByteString.Lazy.Builder.Extras as B
tinylog > | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#+end_quote
last bytestring package that has this module is
https://hackage.haskell.org/package/bytestring-0.10.12.1
but this is a base package, many pkg fail when this is downgraded.
- it's better to get rid of tinylog or maintain it myself to drop dependency
on
Data.ByteString.Lazy.Builder
Data.ByteString.Lazy.Builder.Extras
- should really get rid of tinylog.
which logger does rd-api use?
also tinylog.
- FIXED fix tinylog turned out to be easy. only need to update the import.
no other code change.
I can maintain that.
see agem10 ~/projects/tinylog/
- FIXED after fix tinylog. one aeson API change.
in aeson 2.x an Object is a KeyMap Value instead of a HashMap Text Value, so
the code that reads the J.decode result had to be updated.
** 2024-04-06 rd client: when server side doesn't support GET /rd/ api.
give a more clear msg to client side.
it's not client side's fault.
#+begin_quote
root@pve:/wh01/share/tv-series/Kingdom# rd http://1.116.206.228:8082/kingdom.tar
2024-04-06T07:14:31 E GET /rd/ api failed: "No redis connection, GET /rd/ disabled"
2024-04-06T07:14:31 E 1 urls failed/skipped.
#+end_quote
** 2018-05-05 utf-8 character not working well in path. :bug:
curl http://localhost:8082/rd/%E4%B8%AD%E6%96%87%E6%96%87%E4%BB%B6%E5%90%8D.rar
{"ok":true,"path":"中"}
only first character is in path key.
- a scotty bug?
full_path is also wrong. not regexp problem.
- well, since this encoding doesn't work well. I think I will pass the path in
json body instead of in the URL.
I don't need it in the URL anyway. download must be handled by a rd client.
- search: haskell scotty path utf-8 character
check source code.
Web.Scotty.Route
https://www.stackage.org/haddock/lts-11.7/scotty-0.11.1/src/Web.Scotty.Route.html
matchRoute
path req
req :: Request
path :: Request -> T.Text
path = T.fromStrict . TS.cons '/' . TS.intercalate "/" . pathInfo
pathInfo is defined in Network.Wai!
so problem is not in scotty. probably in Warp.
- check Network.Wai.pathInfo source code
https://www.stackage.org/haddock/lts-11.7/wai-3.2.1.2/Network-Wai.html#v:pathInfo
data Request = Request {
, pathInfo :: [Text]
}
check how warp fill this field.
Network.Wai.Handler.Warp.Request
https://www.stackage.org/haddock/lts-11.7/warp-3.2.22/src/Network.Wai.Handler.Warp.Request.html
import qualified Network.HTTP.Types as H
hdrlines <- headerLines firstRequest src
(method, unparsedPath, path, query, httpversion, hdr) <- parseHeaderLines hdrlines
pathInfo = H.decodePathSegments path
H.decodePathSegments doesn't have problem. I already tested it.
- check parseHeaderLines
https://www.stackage.org/haddock/lts-11.7/warp-3.2.22/src/Network.Wai.Handler.Warp.RequestHeader.html#parseHeaderLines
try parse this line using warp's code:
GET /rd/%E4%B8%AD%E6%96%87%E6%96%87%E4%BB%B6%E5%90%8D.rar HTTP/1.1
stack repl
import Network.Wai.Handler.Warp.RequestHeader (parseHeaderLines)
:l ~/fromsource/wai/warp/Network/Wai/Handler/Warp/RequestHeader
too many dependencies
try building warp in its source dir.
cd ~/fromsource/wai/warp
stack build
lts-10.0 plan.
96 pkgs to build.
- problems
- can't find warp source code.
warp is inside wai repo.
https://github.com/yesodweb/wai
cloned to ~/fromsource/wai/
just import Network.Wai.Handler.Warp.Internal
it includes every module. but it doesn't export that function.
- search: haskell how to use non exported function
** 2023-11-12 rd-api unicode path bug.
- on de05,
rd-api -h :: -p 8083 --redis-host 10.96.195.242
on pve,
cd /wh01/share/songs/
rd http://49.12.207.182:8083/可一儿歌.tar
unicode in path is not supported. on rd-api server side.
- rd-api: .: openBinaryFile: inappropriate type (is a directory)
2023-11-12T06:07:53 I user request rd metadata for "."
BUG: unicode character in PATH is not properly supported.
- 2024-04-08 this is a known bug in warp. see later section in this file.
check whether it's fixed in latest warp.
check wai/warp changelog.
search: wai/warp changelog path unicode
https://hackage.haskell.org/package/wai-3.2.4/changelog
nothing.
wai 2.x doesn't have a changelog file.
- wait. curl on the resource works fine.
only rd fails. is it my code's problem?
I see, it's a client issue, not a server issue.
the client should encode the URL before sending the request to the server.
it should be easy to fix.
downloadFile :: RDClientRuntimeConfig -> T.Text -> MaybeT IO Bool
I see, the url is of type T.Text, not properly encoded before sending via
HTTP.
getRDResponse rc url
req <- parseRequest $ T.unpack url
http-client parseRequest
I can add test case for this function.
this works. it will auto encode URL unsafe characters.
- resp <- httpJSON $ req { path="/rd" <> path req }
update this code to always parse from full URL. don't use <> on path segment.
test it.
stack exec rd-api -- -v
stack exec rd -- http://127.0.0.1:8082/中文.txt
I do need the path segment.
client is correct.
now check server side, how it converts the path back to Text.
- getRdHandler :: RDRuntimeConfig -> ExceptT T.Text ActionM ()
- when using urlDecode on matched path.
path <- lift $ param "1"
debugl rc $ "path is " <> showt path
it's incomplete.
- get (regex "^/rd/(.*)") $ do
this capture doesn't capture all of path.
try just use raw path, no captures.
how to get path inside ActionM?
https://hackage.haskell.org/package/wai-3.2.4/docs/Network-Wai.html#g:3
using a query param seems easier to parse in server side.
in next major release, I think I can switch to use new API on client side.
server side will serve both old and new /rd/ API.
- DL works now.
log msg needs some work.
DONE also the /test/rd/ URL requires some work. change to /test-rd/ should fix it.
use T.pack instead of showt should fix the log msg.
client side:
#+begin_quote
2024-04-08T07:20:22 I GET /rd/ api ok for "\20013\25991.txt"
2024-04-08T07:20:22 I Downloading file: /rd/中文.txt, 0.0 MiB, 1 blocks
2024-04-08T07:20:22 I all 1 block(s) ready on server side
2024-04-08T07:20:22 I progress: [100%] 1/1 blocks, /rd/中文.txt
2024-04-08T07:20:22 I Combining blocks to create "/home/sylecn/d/t2/\20013\25991.txt"
2024-04-08T07:20:22 I File downloaded to "/home/sylecn/d/t2/\20013\25991.txt"
2024-04-08T07:20:22 I All urls downloaded. 1 files, 1 blocks.
#+end_quote
server side:
#+begin_quote
sylecn@agem10:~/projects/reliable-download$ stack exec rd-api -- -v
2024-04-08T07:18:34 I creating 2 file worker(s)
2024-04-08T07:18:34 I fileWorker is waiting for jobs...
2024-04-08T07:18:34 I fileWorker is waiting for jobs...
2024-04-08T07:18:34 I rd-api 1.4.0.0
2024-04-08T07:18:34 I webRoot is .
2024-04-08T07:18:34 I will listen on :::8082
2024-04-08T07:19:15 D path is "/rd/\20013\25991.txt"
2024-04-08T07:19:15 D decodedPath is "\20013\25991.txt"
2024-04-08T07:19:15 D filepath is "./\20013\25991.txt"
2024-04-08T07:19:15 I user request rd metadata for "./\20013\25991.txt"
2024-04-08T07:19:15 I "./\20013\25991.txt" is a new file, sending task to worker
2024-04-08T07:19:15 I fileWorker working on "./\20013\25991.txt"
2024-04-08T07:19:15 D fillSha1sum: redis hgetall "./-\135.txt_2097152" ok
2024-04-08T07:19:15 D redis hset "./-\135.txt_2097152" 0 ok
2024-04-08T07:19:15 D Set file status to done for "./\20013\25991.txt"
2024-04-08T07:19:15 I fileWorker done for ./中文.txt, 0.0 MiB, 1 blocks
2024-04-08T07:19:15 I fileWorker is waiting for jobs...
2024-04-08T07:20:22 D path is "/rd/\20013\25991.txt"
2024-04-08T07:20:22 D decodedPath is "\20013\25991.txt"
2024-04-08T07:20:22 D filepath is "./\20013\25991.txt"
2024-04-08T07:20:22 I user request rd metadata for "./\20013\25991.txt"
2024-04-08T07:20:22 D "./\20013\25991.txt" is not a new file
2024-04-08T07:20:22 D file status is done
2024-04-08T07:20:22 D fillSha1sum: redis hgetall "./-\135.txt_2097152" ok
#+end_quote
- blockSha1sumHashKey fbp = Char8.pack (fbpFilepath fbp) <> "_" <> (Char8.pack . show) (fbpBlockSize fbp)
redis key seems not well encoded.
- git grep -n showt
check and fixed all usage of showt.
- MOVED should I use full path in redis key?
this can reduce some sha1 calculation if user run rd-api in different root dir.
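The odd redis key "./-\135.txt_2097152" in the server log above is consistent with how Data.ByteString.Char8.pack treats non-Latin-1 characters: each Char is truncated to its low 8 bits. The sketch below reproduces that truncation with base only; truncateTo8Bits is a hypothetical name used for illustration.

#+BEGIN_SRC haskell
import Data.Char (ord)
import Data.Word (Word8)

-- | Keep only the low 8 bits of a character's code point, which is
-- what packing a String with Char8.pack does to each Char.
truncateTo8Bits :: Char -> Word8
truncateTo8Bits = fromIntegral . ord

-- The two characters of 中文, written as escapes: U+4E2D and U+6587.
sample :: [Word8]
sample = map truncateTo8Bits "\20013\25991"
-- sample is [45, 135]: byte 45 is '-', and 135 shows up as \135 in the log.
#+END_SRC

Because the high bits are dropped, two different file names can end up with the same redis key. Encoding the path as UTF-8 first (for example with Data.Text.Encoding.encodeUtf8 on a Text value) would be one possible fix; that is a suggestion, not what the code currently does.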
** 2022-03-15 stack test should not rely on
/home/sylecn/persist/cache/ideaIC-2018.1.tar.gz
try use a smaller file within git tree.
** 2018-05-10 use a proper module hierarchy.
import RD.Utils
import RD.Api.Lib
import RD.Api.Config
import RD.Client.Lib
import RD.Client.Opts
- stack repl doesn't like duplicated Lib modules. also for libs, a correct module
hierarchy is important.
-
** 2022-03-14 build on debian 9. push a new release to pypi.
- binary built on ryzen5 won't work because it requires a newer glibc version than the target has.
#+BEGIN_SRC sh
root@de03:~/d# ./rd-api --version
./rd-api: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.27' not found (required by ./rd-api)
./rd-api: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by ./rd-api)
#+END_SRC
- problems
- is readme_renderer still required? does twine require it?
yes. it's required for "twine check" command.
** 2022-03-16 io-thread-pool, github field in package.yaml is wrong.
how to use my own non-github url for source URL?
- search: haskell package.yaml github field
package.yaml is from https://github.com/sol/hpack
use git field.
** 2022-03-16 add io-thread-pool as a git submodule. so the project can be built by other people.
just use git URL in extra-deps.
** 2018-05-07 loopUntilAllBlocksReady, how to track progress?
use a thread pool to download blocks, print overall progress when some parts
done or some time elapsed.
- how to track progress?
I used mapM to fetch block.
results <- mapM (fetchBlockAsync opts rc url rdResp) newReadyBlocks
how to show some progress info?
I need a supervisor thread. and I need a shared data structure.
a mapM is not enough to do this.
-
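A minimal sketch of the supervisor plus shared data structure idea above, using only base. It forks one thread per block instead of using a real thread pool, and the shared state is a single MVar counter; both are simplifications. fetchWithProgress is a hypothetical name, and fetchBlock stands in for the real per-block download.

#+BEGIN_SRC haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar

-- | Fetch all blocks in worker threads while the calling thread acts as
-- supervisor and reports progress as (done, total).
-- Limitation: if fetchBlock throws, its block is never counted and the
-- supervisor keeps waiting; real code would need to handle that.
fetchWithProgress :: (a -> IO ()) -> ((Int, Int) -> IO ()) -> [a] -> IO ()
fetchWithProgress fetchBlock report blocks = do
  doneCount <- newMVar (0 :: Int)
  let total = length blocks
      worker b = do
        fetchBlock b
        modifyMVar_ doneCount (pure . (+ 1))
  mapM_ (forkIO . worker) blocks
  let supervise = do
        done <- readMVar doneCount
        report (done, total)
        if done >= total
          then pure ()
          else threadDelay 100000 >> supervise
  supervise
#+END_SRC

The supervisor polls every 0.1 seconds here; a longer interval fits the "print progress when some time elapsed" requirement better.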
** 2022-03-15 client log, don't show each block fetch. show progress instead.
- log overall progress every 30s
- xx/xx blocks fetched, xx%
show the percentage as an integer.
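The progress line above can be produced with integer arithmetic only. A small sketch, where progressLine is a hypothetical name and treating "zero blocks in total" as 100% is an assumption:

#+BEGIN_SRC haskell
-- | Format overall progress as "done/total blocks fetched, N%".
-- The percentage is rounded down to an integer.
progressLine :: Int -> Int -> String
progressLine done total =
  show done ++ "/" ++ show total ++ " blocks fetched, " ++ show pct ++ "%"
  where
    pct
      | total <= 0 = 100
      | otherwise = done * 100 `div` total
#+END_SRC

For example, progressLine 3 24 gives "3/24 blocks fetched, 12%".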