Advanced
Introduction | Get started | Advanced | Code
By photoset vs. collection
Flixport scans all photos in 2 different approaches, by photoset or by collection. User makes the choice with -w
or --by_collection
and while photoset is the default approach. By collection means Flixport traverses all collections first, then iterates photoset in each leaf collection. By photoset simply means it iterates all photosets. Why does it matter to user?
Destination path
The way Flixport figures out the exact destination of a particular photo is based on 3 options:
-d
or-dest_spec
-p
or-dest_dir
-n
or-dest_file_name
The destination of each photo is <dest_spec><desc_dir>/<dest_file_name>
, where dest_dir
and dest_file_name
supports $ syntax. In most cases the default values of dest_dir
and dest_file_name
works well.
- The default value of
dest_file_name
is${f.title}.${f.originalFormat}
. - When photos are exported by photoset, the default value of
dest_dir
is/${s.title}
. - When photos are exported by collection, the default value of dest_dir is
/${c.title}/${s.title}
.
With the default settings above command line like below
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr
would copy photo myphoto1.jpg
in Album my_album to s3 bucket mybucket
with key flickr/my_album/myphoto1.jpg
when it exports photo by photoset.
If my_album is in collection my_collection and command line runs by collection, the same photo would be flickr/my_collection/my_album/myphoto1.jpg
.
More about the expression
In these expressions, $f
is file, $s
is a photoset and $c
is a collection.
- Only
$s
is available fordest_dir
when photos are exported by photoset. $s
and$c
are available fordest_dir
when photos are exported by collection.$f
and whatever is available fordest_dir
are available fordest_file_format
.
Execution options
Max number of files
Max number of files can be limited by -m
or --max_files
, so that users can get an idea what's going to happen in the full run. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -m 20
Dry run mode
Another way to preview the execution is to dry run the command line tool without actually copying any file. With -r
or --dry_run
option, Instead of copying the file, the tool simply logs a message saying it would copy a file from a location to a location. In dry run mode, it becomes particularly important to keep the log files. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -r 2>&1 | tee /tmp/flixport.log
Multi-thread
By default flixport runs in a single thread, which copies one file after another. This is very inefficient if you have large number of files to copy. Since the command line is making many TCP calls, the thread is mostly idle while waiting for the calls to return. In your final run you almost always need to specify number of threads to use with -t or --threads option to keep the run time reasonable. For example:
java -jar flixport-0.0.1.jar -d s3:mybucket/flickr -t 20