Spectral Converter: User Guide

The Spectral Converter is a tool that transforms spectra originally stored in a ground-based, one-dimensional image-like FITS format into a fully IVOA SpectrumDM-compliant format.

It can be used either on the command line or over the web when installed as a CGI.

Key concepts

  • template A text file that follows the human-readable format of FITS headers. Its contents will be "patched into" the produced FITS file.
  • context A python script used to customise the tool, including specifying a template and the metadata sources that fill in the template variables.
  • datasource A connection to either a database or a URL resource returning JSON.
  • datacursor Query to execute against a datasource.

Quick start

How to run the tool

This will run the tool with minimum functionality: it only converts the data in FITS files from the 1D-image format to a binary table.

On the command line:

convert.py your_fits_file_here.fits

The parameter, your_fits_file_here.fits, can be either a local file or a URL. The output FITS file can be specified using --output-file. By default, it is stored in your system's temporary folder (the exact location is printed on the screen).
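For example, to write the output to a file of your choosing (the file name below is purely illustrative):

convert.py --output-file my_spectrum.fits your_fits_file_here.fits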

How to provide your own template

The next level of functionality is to include FITS keywords (metadata) of your choosing to better describe the spectrum. This is achieved with a template: a text file that follows the human-readable format of FITS headers, e.g. (excerpt from an actual template):

APERTURE= '0.86'               / [arcsec] Aperture (width or lengthxwidth)      
DATE-OBS= '1997-02-07T10:23:47' / UT observation start time                     
EXPOSURE=            2699.7468 / [s] exposure duration                          
TSTART  =       50486.43319054 / [d] MJD exposure start time                    
TSTOP   =       50486.46591063 / [d] MJD exposure stop time                     
TMID    =       50486.44955059 / [d] MJD exposure mid time                      

These metadata will be appended to the appropriate FITS extension. The IVOA Spectrum Data Model, "Part 4: FITS serialization", details which FITS keywords to use. For consistency, templates should be stored under the "context" folder, with a "txt" extension.

On the command line:

convert.py -t context/myheader.txt your_fits_file_here.fits

Note that in this template, all FITS keyword values are constants.

How to provide your own contexts

Templates with variable values are supported through contexts: a context (with its datasources and datacursors) must be in place.

Contexts are python scripts, stored under the "context" folder (e.g. the context "archival_images" consists of context/archival_images.py), that allow you to define:

  • how to fetch the original FITS file
  • the template to "patch into" the binary table extension
  • how to fetch metadata (datasources and datacursors)

They are also required to enable CGI.

See (and use) the provided context/default.py as a starting point. This context is appropriately named default: it is used when no "context" parameter is provided. Hence any changes made to this file will affect the usage described in the previous sections!

Below are the available customisation options: configuration items (properties of the python object defined in the script) and functionality (methods of the python object). A minimal sketch of a complete context follows the lists.

Properties:

  • description A short description of the types of files this context applies to. It will be used for display in the help message of the tool
  • template Path to the header template to apply
  • urlfetch URL to use when fetching the original FITS file. The file identifier given as parameter to the tool will replace a "%s" found in this string
  • datasources Dictionary of sources of data (database servers, json endpoints)
  • datacursors Dictionary of queries/processing to execute against the datasources
  • self_test_identifiers List of identifiers to use when running the tool in self test mode

Methods:

  • init() Execute some initialisation code.
  • retrieve_file(id) Returns the URL to use when fetching the original FITS file. If the urlfetch property is not flexible enough, it can be set to None, in which case this method will be called instead.
  • pre_metadata_fetch(identifier, datacursor_name, datacursor, metadata) This method is run before metadata is fetched from each datacursor. It is an opportunity to change the query before it is run
  • post_metadata_fetch(hdulist_in, keys, metadata) This method is run after the metadata is fetched from the datasources, but before applying the values to the template. It is an opportunity to compute values, apply formatting, fix known errors in the metadata sources, etc.
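A minimal sketch of what a context script could look like (whether the tool expects a class or an instance, and the exact method signatures, should be checked against context/default.py; the class name, URLs and identifiers below are purely illustrative):

# context/my_context.py -- illustrative sketch only; mirror context/default.py
# for the exact structure the tool expects
class Context(object):
    description = 'Archival 1D spectra of the (hypothetical) myarchive service'
    template = 'context/myheader.txt'
    # the %s is replaced with the identifier given on the command line or via CGI
    urlfetch = 'http://myarchive.example.org/fits?id=%s'
    datasources = {}                 # see the "datasources" section below
    datacursors = {}                 # see the "datacursor" section below
    self_test_identifiers = ['TEST.IDENTIFIER.1', 'TEST.IDENTIFIER.2']

    def init(self):
        pass                         # initialisation code, if any

    def retrieve_file(self, id):
        # only called when urlfetch is set to None
        return 'http://myarchive.example.org/fits?id=%s' % id

    def pre_metadata_fetch(self, identifier, datacursor_name, datacursor, metadata):
        pass                         # chance to adjust the query before it is run

    def post_metadata_fetch(self, hdulist_in, keys, metadata):
        pass                         # chance to compute or format values before templating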

On the command line:

convert.py -c my_context your_fits_identifier_here

As a CGI:

http://localhost/cgi-bin/spectralconverter/convert.py?context=my_context&id=your_fits_identifier_here

Note that:

  • Specifying my_context means the context located at context/my_context.py will be loaded
  • Typically a context will define a smarter way of getting to a file than by its full URL; getting it by archival identifier usually makes more sense.

Reference Information

Command line parameters

  convert.py [options] [identifiers]

Identifiers are context dependent. If no context is specified, the
default context is assumed, and identifiers are file names.

The following options are accepted:
  -v, --verbose              enable debug messages
  -h, --help                 display this message
      --output-dir <path>    set output folder
  -o, --output-file <name>   set output file (overrides --output-dir)
      --force-overwrite      silently overwrites an existing file
  -c, --context <token>      select context
  -t, --template <path>      template for new FITS header
  -l, --log <path>           log messages to a file rather than the console
      --self-test            execute a set of test conversions

CGI parameters

  id       an identifier for the file
  context  a token that identifies which context to apply

Templates

The template syntax is similar to a FITS header, with the following differences:

  • FITS cards are separated with a newline
  • text matching ${...} will be replaced with values fetched from the datasources
  • text following the rightmost # (hash) symbol is considered a template comment and will be discarded.

The placeholders are replaced with values fetched from the datasources as defined in the datacursors property of a context. The syntax for the template placeholders is:

${datacursor_name:metadata_key}

Where:

  • datacursor_name name of a datacursor defined in the customisation, or "fits"
  • metadata_key key defined within the datacursor

After processing the template variables and comments, the result should look like a FITS header, where FITS keyword semantics apply, namely:

  • keyword names must be at most 8 characters long, or use the HIERARCH convention
  • string values are enclosed in single quotes
  • COMMENT, HISTORY and blank cards are allowed
  • the card cannot exceed 80 characters in length

The values of the original FITS file primary header keywords are also available, as a built-in datacursor, using "fits" as the datacursor name.

Examples:

EQUINOX = ${ssa:CoordSys_SpaceFrame_Equinox} / Coordinates precessed to J2000
SPECSYS = '${modelvalues:SpectralCoord.RefPos}'
DATALEN = ${fits:NAXIS1} / Number of points in spectrum

Contexts

datasources

Two types of datasources are available: database and JSON. For database access, a suitable database driver must be installed. Only Sybase is supported at present, but support for other databases can be added easily. Datasources are defined as entries in a dictionary. For databases:

_datasource_name_ : {
    'vendor' : _vendor_name_,         # 'sybase'
    'server' : _server_alias_,        # as defined in the Sybase interfaces file
    'user' : _database_username_,
    'password' : _password_,
    'database' : _default_database_,
    'isolation' : _isolation_level_,  # recommended: '1' (prevents dirty reads)
},

For JSON:

_datasource_name_ : {
    'vendor' : 'http-json',
    'url' : _URL_,          # a %s in the URL will be replaced with the file identifier
},
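Conceptually, such a datasource is resolved by substituting the file identifier into the URL and parsing the JSON reply. A rough sketch of the idea (this is not the tool's own code; the function name is an illustration):

import json
import urllib2   # Python 2, matching the era of the tool; use urllib.request on Python 3

def fetch_json_datasource(datasource, identifier):
    # replace the %s in the configured URL with the file identifier
    url = datasource['url'] % identifier
    # download and parse the JSON response
    return json.load(urllib2.urlopen(url))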

datacursor

Datacursors define how to fetch data from the datasources: queries to execute against the databases, sections to look up in the case of JSON.

'ssa' : {
    'type' : 'db',
    'datasource' : 'safdb',
    'casesensitive' : False,
    'query' : 'select * from ...'              # a @identifier in the string will be replaced with the file identifier
},
'modelvalues' : {
    'type' : 'db-vertical',
    'datasource' : 'safdb',
    'casesensitive' : False,
    'query' : 'select col1, col2 from ...',    # a @identifier in the string will be replaced with the file identifier
},
'ssa' : {
    'type' : 'json',
    'datasource' : 'eso-fileinfo',
    'casesensitive' : False,
    'section' : _key_in_JSON_response_,
},

The type of datacursor can be one of:

  • db The query will return only one row. Each column will be mapped into a dictionary entry: the column name as key, the cell value as value.
  • db-vertical The query will return multiple rows with two columns. Each row will be mapped into a dictionary entry: the value of the first column as key, the value of the second column as value. To be used when the database is structured as key/value pairs.
  • json A JSON dictionary, identified by the section, will be loaded.

The casesensitive value defines whether the keys will be used in a case sensitive manner. Assigning False allows for case insensitive variable names in the template.
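As an illustration of the mapping rules above (not the tool's actual code), a db-vertical result set and the casesensitive flag could be reduced to a metadata dictionary roughly like this:

def rows_to_dict(rows, casesensitive):
    # db-vertical: each row is a (key, value) pair
    result = {}
    for key, value in rows:
        result[key if casesensitive else key.lower()] = value
    return result

# e.g. two rows returned by the 'modelvalues' query above (values made up for the example)
rows = [('SpectralCoord.RefPos', 'TOPOCENT'), ('Curation.Publisher', 'ESO')]
print(rows_to_dict(rows, casesensitive=False))
# -> {'spectralcoord.refpos': 'TOPOCENT', 'curation.publisher': 'ESO'}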

advanced (involves programming)

It is possible to do more than simply copy values from the datasources to the header. One can transform, format, override, etc. the values. This is achieved by writing python code in the context, typically in the pre_metadata_fetch and post_metadata_fetch methods described above.
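For instance, a post_metadata_fetch method can reformat or override values before they are substituted into the template. A minimal sketch, assuming that metadata is a dictionary of dictionaries keyed by datacursor name (check context/default.py for the exact structure; the keys used below are only examples):

    def post_metadata_fetch(self, hdulist_in, keys, metadata):
        ssa = metadata.get('ssa', {})

        # format a numeric value to one decimal place
        if 'CoordSys_SpaceFrame_Equinox' in ssa:
            ssa['CoordSys_SpaceFrame_Equinox'] = '%.1f' % float(ssa['CoordSys_SpaceFrame_Equinox'])

        # fix a (hypothetical) known problem in the metadata source: trailing blanks
        for key, value in ssa.items():
            if isinstance(value, str):
                ssa[key] = value.strip()

        # derive an extra value from the original FITS primary header
        metadata.setdefault('fits', {})['DATALEN'] = hdulist_in[0].header['NAXIS1']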

Installation

Requirements

and optionally, depending on your datasources:

How to install as a CGI under Apache

  • edit spectralconverter/.htaccess to match your $PYTHONPATH
  • copy the spectralconverter files to the apache cgi-bin folder:
  • edit httpd.conf to enable .htaccess files:
  • configure a cache folder for python eggs:

MacOSX

The following details apply if you wish to use the Apache installation that comes pre-installed with MacOSX.

  • edit spectralconverter/.htaccess to match your $PYTHONPATH

  • copy the spectralconverter files to the apache cgi-bin folder:
cp -r spectralconverter /Library/WebServer/CGI-Executables/

  • edit httpd.conf to enable .htaccess files:
sudo vi /etc/apache2/httpd.conf
find: <Directory "/Library/WebServer/CGI-Executables">
on the next line, replace "None" with "All": AllowOverride All
restart Apache (System Preferences -> Sharing -> Web Sharing)

  • configure a cache folder for python eggs:
    • either create it:
mkdir /Library/WebServer/.python-eggs
chmod 777 /Library/WebServer/.python-eggs
    • or configure it to an existing folder
sudo vi .htaccess
SetEnv PYTHON_EGG_CACHE /tmp
