Flixport scans photos in one of two ways: by photoset or by collection. By photoset is the default; the user selects by collection with -w or --by_collection. By collection means Flixport traverses all collections first, then iterates over the photosets in each leaf collection. By photoset simply means it iterates over all photosets. Why does it matter to the user?
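The difference between the two traversals can be sketched as follows, before turning to how it affects destinations. This is only an illustration in Python, not Flixport's own code: the Photoset and Collection classes and both function names are hypothetical, and the sketch assumes photosets hang off leaf collections only.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Photoset:
    title: str


@dataclass
class Collection:
    title: str
    # Nested collections; empty for a leaf collection.
    collections: List["Collection"] = field(default_factory=list)
    # Photosets; assumed to be attached to leaf collections only.
    photosets: List[Photoset] = field(default_factory=list)


def by_photoset(photosets: List[Photoset]) -> Iterator[Photoset]:
    """Default mode: visit every photoset, ignoring collections."""
    yield from photosets


def by_collection(roots: List[Collection]) -> Iterator[Tuple[Collection, Photoset]]:
    """Collection mode: descend to each leaf collection, then visit its photosets."""
    for collection in roots:
        if collection.collections:
            yield from by_collection(collection.collections)
        else:
            for photoset in collection.photosets:
                yield collection, photoset


album = Photoset("my_album")
tree = [Collection("my_collection", photosets=[album])]
print([s.title for s in by_photoset([album])])
print([(c.title, s.title) for c, s in by_collection(tree)])
```

The point of the sketch is that only the by-collection traversal knows which collection a photoset belongs to, which is why the collection can appear in the destination only in that mode.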
Flixport determines the exact destination of a particular photo from 3 options:
-d or -dest_spec
-p or -dest_dir
-n or -dest_file_name
The destination of each photo is <dest_spec><dest_dir>/<dest_file_name>, where dest_dir and dest_file_name support $ syntax. In most cases the default values of dest_dir and dest_file_name work well.
The default dest_file_name is ${f.title}.${f.originalFormat}. The default dest_dir is /${s.title} when exporting by photoset and /${c.title}/${s.title} when exporting by collection.
With the default settings above, a command line like the one below
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr
would copy the photo myphoto1.jpg in album my_album to the S3 bucket mybucket with key flickr/my_album/myphoto1.jpg when it exports photos by photoset.
If my_album is in collection my_collection and the command line runs by collection, the same photo would be copied to key flickr/my_collection/my_album/myphoto1.jpg.
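To make the expansion concrete, here is a minimal Python sketch that reproduces the two results above. It is not Flixport's implementation: the expand and destination functions are hypothetical, the sketch only handles the ${name.attr} form, it treats the destination as one string rather than splitting it into bucket and key, and it assumes the photo's title is myphoto1 with original format jpg.

```python
import re

# Matches ${name.attr}, e.g. ${f.title} or ${s.title}.
_VAR = re.compile(r"\$\{(\w+)\.(\w+)\}")


def expand(template: str, variables: dict) -> str:
    """Replace each ${name.attr} with variables[name][attr]."""
    def lookup(match):
        name, attr = match.group(1), match.group(2)
        if name not in variables:
            # e.g. a collection variable used while exporting by photoset
            raise KeyError(f"${name} is not available in this mode")
        return str(variables[name][attr])
    return _VAR.sub(lookup, template)


def destination(dest_spec: str, dest_dir: str, dest_file_name: str, variables: dict) -> str:
    """Build <dest_spec><dest_dir>/<dest_file_name> after expansion."""
    return dest_spec + expand(dest_dir, variables) + "/" + expand(dest_file_name, variables)


# Assumed metadata for the example photo, album and collection.
photo = {"title": "myphoto1", "originalFormat": "jpg"}
album = {"title": "my_album"}
collection = {"title": "my_collection"}
FILE_NAME = "${f.title}.${f.originalFormat}"

# By photoset: only f and s are in scope.
print(destination("s3:mybucket/flickr", "/${s.title}", FILE_NAME,
                  {"f": photo, "s": album}))
# By collection: f, s and c are in scope.
print(destination("s3:mybucket/flickr", "/${c.title}/${s.title}", FILE_NAME,
                  {"f": photo, "s": album, "c": collection}))
```

Raising an error for a variable that is not in scope is a design choice of this sketch; the source does not say how Flixport reacts in that case.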
In these expressions, $f is a file, $s is a photoset and $c is a collection.
$s is available for dest_dir when photos are exported by photoset.
$s and $c are available for dest_dir when photos are exported by collection.
$f and whatever is available for dest_dir are available for dest_file_name.
The maximum number of files can be limited with -m or --max_files, so that users can get an idea of what is going to happen in the full run. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -m 20
Another way to preview the execution is to dry run the command line tool without actually copying any files. With the -r or --dry_run option, instead of copying each file, the tool simply logs a message saying it would copy the file from one location to another. In dry run mode the log is the only record of what would happen, so it is particularly important to keep the log files. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -r 2>&1 | tee /tmp/flixport.log
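The two preview mechanisms, a file limit and a dry run, can be pictured with the short Python sketch below. It is an illustration under assumptions, not Flixport's code: the export function, its parameters and the wording of the log message are all hypothetical.

```python
import itertools
import logging

log = logging.getLogger("flixport_sketch")


def export(photos, copy, dry_run=False, max_files=None):
    """Process (source, destination) pairs and return how many were handled."""
    # islice with a stop of None lets the whole stream through.
    limited = itertools.islice(photos, max_files)
    count = 0
    for source, destination in limited:
        if dry_run:
            # Dry run: report the intended copy, touch nothing.
            log.info("Would copy %s to %s", source, destination)
        else:
            copy(source, destination)
        count += 1
    return count


copied = []
pairs = [(f"photo{i}.jpg", f"flickr/my_album/photo{i}.jpg") for i in range(5)]
export(pairs, lambda src, dst: copied.append(dst), dry_run=True)   # copies nothing
export(pairs, lambda src, dst: copied.append(dst), max_files=2)    # copies two files
print(copied)
```

Both options cut the cost of a trial run in different ways: the limit performs real copies on a small sample, while the dry run covers every file but performs no copies.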
By default Flixport runs in a single thread, which copies one file after another. This is very inefficient if you have a large number of files to copy: since the command line makes many TCP calls, the thread is mostly idle while waiting for the calls to return. In your final run you almost always need to specify the number of threads with the -t or --threads option to keep the run time reasonable. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -t 20
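Why extra threads help when each copy mostly waits on the network can be seen in the small Python sketch below. It says nothing about how Flixport schedules its work internally: fake_copy is a hypothetical stand-in that just sleeps to imitate a network call, and copy_all is likewise an illustrative name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_copy(name: str) -> str:
    """Stand-in for one network copy: the thread only waits, as it would on I/O."""
    time.sleep(0.05)
    return name


def copy_all(names, threads: int = 1):
    """Copy every file with a pool of `threads` workers; results keep input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fake_copy, names))


names = [f"photo{i}.jpg" for i in range(20)]
for threads in (1, 20):
    start = time.perf_counter()
    copy_all(names, threads)
    print(f"{threads:2d} thread(s): {time.perf_counter() - start:.2f}s")
```

With one worker the waits add up one after another; with 20 workers they overlap, so the total time approaches that of a single copy. Real gains depend on network bandwidth and on any rate limits of the services involved.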