README 08-08-2014

This README includes instructions on how to set up and use the
client.sh script to communicate with the 2014 MED/MER IO Server.

This tool was developed for Unix-like environments (Mac OS X, Linux,
Cygwin).  The rest of this documentation assumes you are working in
such an environment.

Setup:

  There are three setup steps required to begin using the IO Server
  and client:

  1. Obtaining required tools/software
  2. x509 Certificate generation
  3. Client configuration

  Detailed instructions are as follows.

  Required tools/software:

    OpenSSL (http://www.openssl.org/)
    cURL (http://curl.haxx.se/)
    xmllint (included with Libxml2, http://xmlsoft.org)
    Perl (http://www.perl.org/)

  x509 Certificate generation:

    Before you can connect to the IO Server, you'll need to generate a
    private key and Certificate Signing Request (CSR), and send the
    CSR file to NIST at med_poc@nist.gov.  After NIST reviews and
    signs your request, NIST will send you a signed certificate needed
    to connect to the IO Server.

    To generate a private key ..

      openssl genrsa -out key.pem 2048

    Then, to generate a Certificate Signing Request ..

      openssl req -new -key key.pem -out csr.pem

    NOTE** While generating the CSR you'll be prompted to answer
    several questions about your location/organization, as well as a
    common name.  When prompted for a challenge password you may leave
    it blank.

    IMPORTANT** Your common name will be used by the IO Server to
    identify your team.  You must choose a common name using only
    alphanumeric characters, '-' and '_'.
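
    As a sketch, the common name can be checked against the allowed
    characters before generating the CSR.  The function name
    valid_common_name and the team name below are hypothetical.  The
    -subj option of 'openssl req' supplies the subject on the command
    line instead of through the interactive prompts.

```shell
# Return 0 if the name is non-empty and uses only alphanumeric
# characters, '-' and '_'; return 1 otherwise.
valid_common_name() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
    *) return 0 ;;
  esac
}

cn="my_team-01"   # hypothetical team name
if valid_common_name "$cn"; then
  echo "common name OK: $cn"
  # Uncomment to create the CSR without the interactive prompts:
  # openssl req -new -key key.pem -out csr.pem -subj "/CN=$cn"
else
  echo "invalid common name: $cn" >&2
fi
```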

  Client Configuration:

    Before using the client software you will need to create a
    client.cfg configuration file, using the client_blank.cfg file as
    a template.  Notably, you will need to enter the path for both
    your signed certificate received from NIST, and the private key
    associated with that certificate.  Specifically, the
    'user_certificate' field corresponds to the signed certificate
    received from NIST (e.g. cert.pem), and the 'user_key' field
    corresponds to the private key used when generating the CSR file
    (e.g. key.pem).

    The 'prf' field designates whether or not you intend to generate
    additional search results using Pseudo-Relevance Feedback.  It
    should be set to PRF or noPRF.

    The 'mer' field designates whether or not your team is
    participating in the MER Evaluation and, as such, will be
    including recounting archives along with your search results.  It
    should be set to Yes or No.

    The 'med_eval_part' field allows you to designate whether you will
    be searching the MED14-EvalFull or MED14-EvalSub set of videos.
    This value should be set to either Full or Sub.

Client/Server Interaction:

  The client.sh script serves as the interface to the IO Server.  It
  serves two main functions.  First, it parses input and input files
  from the user, validates them, then sends this input on to the IO
  Server, where it is again validated and then collated.  Second, it
  makes requests to the IO Server for resources on behalf of a user,
  then presents those resources back to the user.

  For a user driving the client, a notional workflow may look
  something like ..

  1. Request a list of available tasks.
  2. Parse the returned list of tasks, pick a task (or dispatch
     automatically), and request the resources for that task.
  3. Dispatch to the appropriate module the resources/inputs returned
     from the request.
  4. When the module finishes the task, send the module output to the
     IO Server via the client script.
  5. Continue with steps 1-4 until no more tasks are available, or all
     required tasks are complete.
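
  The steps above can be sketched as a small driver loop.  This is
  only an illustration: the argument forms used for check_out and
  check_in are assumptions (see './client.sh -h' for the real usage),
  run_module is a placeholder for your own dispatch logic, and
  first_task_id and process_all_tasks are hypothetical helper names.

```shell
# Command used to reach the IO Server; override CLIENT to test with a stub.
CLIENT=${CLIENT:-./client.sh}

# Print the TaskID (first csv field) of the first task read from stdin.
# Blank lines are skipped, as is a header line if one is present
# (whether 'next' prints a header is an assumption).
first_task_id() {
  awk -F, 'NF && $1 != "TaskID" && !seen++ { print $1 }'
}

# Placeholder for step 3: $1 is the TaskID, $2 the key:value inputs.
run_module() {
  echo "would process $1" >&2
}

# Steps 1-5: loop until 'next' lists no tasks or a request fails.
process_all_tasks() {
  local task inputs
  while task=$($CLIENT next | first_task_id) && [ -n "$task" ]; do
    inputs=$($CLIENT check_out "$task") || return 1   # assumed argument form
    run_module "$task" "$inputs" || return 1
    $CLIENT check_in "$task" || return 1              # assumed argument form
  done
}
```

  Always taking the first listed task is a simplification.  Tasks
  that depend on optional tasks deserve a manual check before they
  are checked out.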

Client Usage:

  Running the client can be done as follows (assuming you're running
  commands from this directory) ..

    bash client.sh
 
  -- or --

    ./client.sh

  Running the command without any options will show the usage text,
  with more detailed instructions on how to use the client tool.
  Running the command with the '-h' option will also show this usage
  text.

    ./client.sh -h

  IMPORTANT** Be wary of using relative paths in your configuration
  file.  These paths are always considered to be relative to the
  directory from which you are running the client script, which
  matters if you intend to run the client script from different
  directories.
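
  One way to avoid surprises is to write absolute paths into the
  configuration file.  A minimal sketch of a helper that turns a
  relative path into an absolute one is shown here.  The name abs_path
  and the example path are hypothetical.

```shell
# Print an absolute form of a path.  Relative paths are resolved
# against the current directory; the path does not need to exist.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$PWD" "$1" ;;
  esac
}

abs_path certs/cert.pem   # hypothetical relative path
```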

Client/Server Response:

  Generally speaking, the client.sh script returns an exit status of
  0, along with some output, when it has successfully interacted with
  the IO Server.  The form of this output depends on the
  command/taskID used.  It is briefly described in the usage text of
  the client script and in more detail below.

  Response formats:

    next - Returns a csv file with information regarding tasks which
    are currently available.  The fields of which are ..

      TaskID,Optionality,Opens,Closes

    NOTE** Tasks which are only dependent on tasks which are optional
    will show up in this list, assuming they are otherwise available.
    Once a task (that depends on an optional task) has been checked
    out, the depending task can no longer be completed.  For that
    reason, teams should exercise caution when automatically
    processing tasks returned from next.
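
    As a sketch, the output of next can be summarized for review
    before anything is checked out automatically.  The function name
    summarize_next is hypothetical, and whether the output includes a
    header line is an assumption (one is skipped if present).

```shell
# Read 'next' output on stdin and print one tab-separated line per
# task: TaskID, Optionality and closing time.
summarize_next() {
  local task optionality opens closes
  while IFS=, read -r task optionality opens closes; do
    [ -z "$task" ] && continue            # skip blank lines
    [ "$task" = "TaskID" ] && continue    # skip a header line, if any
    printf '%s\t%s\tcloses %s\n' "$task" "$optionality" "$closes"
  done
}

# Usage: ./client.sh next | summarize_next
```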

    outstanding - Returns a csv file with information regarding tasks
    which have been checked out, but have not been checked in.  The
    fields of which are ..

      TaskID,Start,Closes

    all - Returns a csv file with information regarding all tasks
    which have been checked out or completed.  The fields of which are
    ..

      TaskID,CheckedOut,CheckedIn

    check_out - Depending on the task being checked out (specifically,
    the module being used), this command returns a list of key:value
    pairs with the input names and their values, these pairs are
    returned in no particular order.  Each set of key:value pairs is
    described below ..

      MG:
        InMS:"name of the input metadata store"
        OutMS:"name of the output metadata store"
        video-set:File containing the list of videos to compute
          metadata for
        PW:"password for the video set data (Not always present)"

      SQG:
        EventDescription:File containing the event description for the
          event
        eventID:"ID of the event (e.g. E001)"

      EQG:
        SemanticQuery:Semantic Query file
        ExemplarLabels:csv file listing the exemplar videoIDs
        TrainMS:"name of the metadata store containing the exemplar
          video metadata"

      ES:
        Query:Query file
        SearchMS:"name of the metadata store to perform the search on"
        PRF:"PRF or noPRF, designating whether or not the Event Search
          module should include Pseudo-Relevance Feedback results"
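
    Because the pairs arrive in no particular order, it is safer to
    look values up by key than by position.  A minimal sketch follows,
    assuming one key:value pair per line.  The function name get_input
    and the sample values in the usage line are hypothetical.

```shell
# Print the value for key $1 from check_out output on stdin, with
# surrounding double quotes removed.  Only the first ':' separates key
# from value, so values may themselves contain ':'.
get_input() {
  local key=$1 line value
  while IFS= read -r line; do
    case "$line" in
      "$key":*)
        value=${line#*:}
        value=${value#\"}
        value=${value%\"}
        printf '%s\n' "$value"
        return 0
        ;;
    esac
  done
  return 1   # key not present (e.g. PW for an unprotected video set)
}

# Usage: printf 'PRF:"noPRF"\nQuery:/tmp/query.xml\n' | get_input PRF
```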

    check_in - Regardless of the task being checked in, this command
    returns the time taken (in milliseconds) for task completion, as
    recorded by the server.  The format of this response is ..

      TimeTaken:Milliseconds taken
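
    For logging, the response can be converted to seconds.  This is a
    sketch only.  The function name time_taken_seconds is
    hypothetical, and quotes around the value are stripped in case the
    server sends them (an assumption).

```shell
# Convert "TimeTaken:<milliseconds>" into seconds with three decimals.
time_taken_seconds() {
  local ms=${1#TimeTaken:}
  ms=${ms#\"}
  ms=${ms%\"}
  printf '%d.%03d\n' $((ms / 1000)) $((ms % 1000))
}

time_taken_seconds "TimeTaken:83250"   # prints 83.250
```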

    up_sysdesc | rerun_ps_mg | rerun_ps_training_mg | reset_dr_tasks -
    These commands don't take a TaskID but if they are successfully
    processed by the IO Server, the format of the response will be ..

      Response:"OK"

    NOTE** Additional information regarding the specific file formats
    of input/output files is included in the MED/MER 2014 Evaluation
    Plan.

  When a request is unsuccessful, the client.sh script will exit with
  a non-zero exit status and return a justification for the failure.
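
  When scripting against the client, the exit status can be checked
  before the output is used.  In this sketch the name run_client is
  hypothetical, and both stdout and stderr are captured because this
  README does not say which stream carries the failure justification.

```shell
# Run a command.  On success print its output; on failure report the
# exit status and the captured message on stderr, then return that
# status to the caller.
run_client() {
  local output status
  if output=$("$@" 2>&1); then
    printf '%s\n' "$output"
  else
    status=$?
    printf 'request failed (exit %d): %s\n' "$status" "$output" >&2
    return "$status"
  fi
}

# Usage: run_client ./client.sh next
```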

Dry Run Test Script:

  A test script (tests/client_dr_test.sh) has been included with this
  package that automatically completes every task available for MED in
  the Dry Run phase of the evaluation in order to demonstrate the
  client/server interaction.  Note that when this test script finishes
  it also sends the reset_dr_tasks and rerun_ps_mg commands to the
  server to open the tasks back up for further testing.

  To run the test script, you need to set two values in the client.cfg
  file: set 'prf' to "noPRF" and 'mer' to "Yes".  Also make sure that
  your client.sh script is able to contact the server (a good way to
  check this is by running './client.sh next').  Then, run ..

    ./tests/client_dr_test.sh
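
  The connectivity check and the test run can be combined, as in this
  sketch.  The function name run_dry_run is hypothetical.  Both
  arguments are optional and default to the paths used in this README.

```shell
# Run the dry run test only if the client can reach the IO Server.
run_dry_run() {
  local client=${1:-./client.sh}
  local test_script=${2:-./tests/client_dr_test.sh}
  if "$client" next > /dev/null; then
    "$test_script"
  else
    echo "cannot reach the IO Server; check client.cfg and your certificate" >&2
    return 1
  fi
}

# Usage: run_dry_run
```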

Changelog:

  08-08-2014:

    - Added client command to retrieve the list of all completed and
      outstanding tasks for your team and the timestamps recorded for
      each.
    - Bounding box attributes in recountings are now optional.

  07-15-2014:

    - Validation now correctly identifies detections with scores
      greater than (rather than greater than or equal to) the
      threshold as positives for the purpose of checking for
      recounting archive completeness.
    - Extra attributes may now be included in queries/recountings to
      allow teams to build more parsable queries for their system.
      Note that all current required attributes are still required,
      and extra attributes will be ignored for the evaluation.
    - Semantic query search tasks are now correctly optional.
    - Failure to connect to the I/O Server now provides some guidance
      and additional detail.
    - The client/server now only requires teams participating in MER
      to submit recounting archives with their 010Ex search results.
    - Extra validation now checks the eventID in the threshold file.

  06-17-2014:

    - PRF search tasks are now automatically checked out with non-PRF
      search tasks for teams who have set prf='PRF' in their
      client.cfg file.  PRF search results must be checked in as a
      separate check_in call using the PRF tagged TASKID (e.g. for
      DR_ES_E031_010Ex, the corresponding PRF TASKID would be
      DR_ES_E031_010Ex_PRF).  PRF search tasks may not be checked out
      explicitly, and will not show up in results from the 'next'
      command, but will show up in results from the 'outstanding'
      command when appropriate.

  06-05-2014:

    - 'next' command should no longer show tasks which cannot be
      completed.
    - Non-recounting files are no longer permitted to be included in
      recounting archives.  Validation for ES tasks is now more
      robust, and provides slightly more information when failing
      to validate.
    - Task lockout is now checked on 'check_out' as well as
      'check_in'.
    - Video set passwords are now distributed with the corresponding
      MG task as PW="password".  NOTE** This information is only
      distributed for password protected video sets.

  05-14-2014:

    - Absolute path for 'working_dir' field now works correctly.
    - Relative paths specified in the configuration file are now
      relative to where the client script is being run.
    - Semantic_Metadata size attribute in schemas/ms_info.xsd now
      expects gigabytes instead of megabytes.
    - Memory attribute for GPUs in schemas/mg-hw_description.xsd and
      schemas/cots-hw_description.xsd now expects gigabytes instead of
      megabytes.

