Metadata extraction: from FITS files to databases with MEx

Note: this hands-on session runs only in the morning; in the afternoon you will deploy DALToolKit to create data access services using your database.

Abstract

Publishing images and spectra in a VO-compliant format is a two-step procedure: first, making sure all data are valid and described in a common, homogeneous way; then, providing query interfaces to the data using the validated data descriptions.

MEx is a tool to aid the first step (metadata ingestion), in which the data descriptions are extracted from FITS headers into a data repository. A set of required and optional metadata items is defined for each type of data, and their values are extracted from the FITS files through mapping rules. The data repository (most commonly a database) decouples metadata ingestion from the second step, the query service.
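
As a rough illustration of what "extracting metadata through mapping rules" means, the sketch below parses plain-text FITS header cards and applies a few rules to them. It is not MEx's implementation or its mapping syntax: the function names, the rule format (Python callables) and the model item names (`instrument`, `exposure_time`, `ra_center`) are assumptions made for this example only. The parser is deliberately simple and does not handle quoted strings containing `''` or `HIERARCH` keywords.

```python
def parse_header(text):
    """Parse 'KEYWORD = value / comment' lines into a dict (simplified)."""
    header = {}
    for line in text.splitlines():
        if "=" not in line[:10]:
            continue  # skip COMMENT, HISTORY, END and blank lines
        key, _, rest = line.partition("=")
        value = rest.strip()
        if value.startswith("'"):
            # String value: keep the text between the quotes.
            end = value.find("'", 1)
            value = value[1:end].strip() if end != -1 else value[1:].strip()
        else:
            # Numeric or logical value: drop the trailing comment.
            value = value.split("/", 1)[0].strip()
            for cast in (int, float):
                try:
                    value = cast(value)
                    break
                except ValueError:
                    pass
        header[key.strip()] = value
    return header


def apply_rules(header, rules, required=()):
    """Evaluate each rule; return (values, errors) keyed by model item."""
    values, errors = {}, {}
    for item, rule in rules.items():
        try:
            values[item] = rule(header)
        except (KeyError, ValueError, TypeError) as exc:
            errors[item] = f"{type(exc).__name__}: {exc}"
    for item in required:
        if item not in values and item not in errors:
            errors[item] = "no mapping rule defined"
    return values, errors


# Hypothetical rules: model item name -> function of the header dict.
RULES = {
    "instrument": lambda h: h["INSTRUME"],
    "exposure_time": lambda h: float(h["EXPTIME"]),
    "ra_center": lambda h: float(h["CRVAL1"]),
}

SAMPLE = "\n".join([
    "SIMPLE  =                    T / conforms to FITS standard",
    "NAXIS1  =                 2048",
    "INSTRUME= 'ISAAC   '           / instrument name",
    "EXPTIME =                 60.0 / seconds",
    "END",
])

values, errors = apply_rules(parse_header(SAMPLE), RULES, required=("instrument",))
```

Here `values` holds the items whose rules succeeded, while `errors` records that `ra_center` could not be computed because the sample header has no `CRVAL1` card. This mirrors the "test mappings, fix errors" cycle of the session.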

The goal of this session is to load a simple MySQL database and deploy DALToolKit for providing SIA and SSA interfaces to the data.

Participants should bring their own data to ingest, preferably data they understand well (in particular, the meaning of the FITS keywords). Supporting a custom database structure may require a small amount of programming.

External References

Advisors (ESO)

  • Remco Slijkhuis
  • Bruno Rino
  • Jean-Christophe Malapert

Software Requirements


Session outline

  • Publishing data in the VO as a two-step procedure:
    • gathering metadata (knowledge of the data, e.g. FITS keywords) in a homogeneous fashion
    • building a (web) service that searches data using the metadata and allows access to the data

  • gathering metadata: MEx (ESO)
  • building the service: DALToolKit (ESA)

1st run: demo data, demo database

  • (one time only) set up a database
    • create SQL tables
  • gathering metadata into the database
    • map FITS keywords to "concepts" AKA model items: Mapping Editor
      • get a sample FITS header
      • define mappings for (at least) the required metadata
      • test mappings against sample FITS file header
    • ingest metadata into database: MEx
      • execute mappings against all the FITS files
  • build service
    • MEx creates default DALToolKit configuration
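
To make the "ingest metadata into database" step concrete, here is a minimal sketch of writing one row per FITS file into a table. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs anywhere. The table name `sia_demo` and its columns are assumptions for this example; they are not the schema defined in samples/db/sia.sql, and this is not how MEx itself is implemented.

```python
import sqlite3

# Hypothetical column set; the real SIA table is defined in samples/db/sia.sql.
COLUMNS = ("filename", "instrument", "ra_min", "ra_max", "dec_min", "dec_max")


def create_table(conn):
    """Create the demo table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sia_demo ("
        "filename TEXT PRIMARY KEY, instrument TEXT, "
        "ra_min REAL, ra_max REAL, dec_min REAL, dec_max REAL)"
    )


def ingest(conn, records):
    """Insert (or replace) one row per record; returns the number of rows sent."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = (
        f"INSERT OR REPLACE INTO sia_demo ({', '.join(COLUMNS)}) "
        f"VALUES ({placeholders})"
    )
    rows = [tuple(rec[col] for col in COLUMNS) for rec in records]
    with conn:  # commit on success, roll back on error
        conn.executemany(sql, rows)
    return len(rows)


# Made-up records, as they might come out of the mapping step.
records = [
    {"filename": "a.fits", "instrument": "ISAAC",
     "ra_min": 53.0, "ra_max": 53.1, "dec_min": -27.9, "dec_max": -27.8},
    {"filename": "b.fits", "instrument": "ISAAC",
     "ra_min": 53.1, "ra_max": 53.2, "dec_min": -27.9, "dec_max": -27.8},
]

conn = sqlite3.connect(":memory:")
create_table(conn)
ingest(conn, records)
```

Using the file name as primary key together with `INSERT OR REPLACE` makes re-running the ingestion on the same files harmless, which is convenient while the mappings are still being adjusted.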

2nd run: your data, your database

  • your data:
    • the Mapping Editor, revisited
    • several types of data in one package: Catalogue Builder
  • your data definition
    • model items configuration
    • your own mapedit
  • your database:
    • MEx scripting
    • adapt DALToolKit configuration

Steps

  • set up a database
    • make sure database is running
    • connect as "superuser"
mysql -u root -p
    • create database
create database esodata;
    • create database tables
use esodata
source samples/db/sia.sql

show tables;
desc SIA;

  • get a FITS header
    • dfits file
java -jar lib/fitshead-1.0.jar -x 0 samples/images/GOODS_ISAAC_03_H_V1.5.fits > sampleheader.txt
  • define/test mappings
    • mapping editor: http://vops1.hq.eso.org:8080/mapedit
    • upload the model item list specific for this workshop: config/modelitem_definitions.txt
    • select data type: image.reduced
    • upload FITS header
    • edit mapping rules
    • fix errors
    • Note: min and max RA and Dec are mandatory for DALToolKit
    • when all are valid, download mappings file
  • set up a directory for sharing in Tomcat
    • unzip samples/files.zip into $CATALINA_BASE/webapps
  • ingest
    • edit the MEx configuration file config/mex-daltookit.properties (e.g. db password, folder to copy data to)
    • run MEx on the files and the mappings file
java -jar mex-java.jar --type SIA -d samples/images -m samples/mappings/isaac.txt
  • build service
    • edit DALToolKit configuration if needed
    • deploy DALToolKit service
cd DALToolKit
vim build.properties.local
ant deploy
[SIAP-v1.0-mex|SSAP-v0.1-mex|SSAP-v1.0-mex]
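
The note in the steps above says that min and max RA and Dec are mandatory for DALToolKit, so a mapping has to derive them from the header. The sketch below shows one rough way to estimate such a bounding box from the basic WCS keywords. It is an approximation for illustration only and assumes a small, unrotated field described by `CRVAL`, `CRPIX` and `CDELT`, far from the poles and from the RA = 0/360 wrap. The function name is hypothetical, and real data with a CD matrix or rotation need a proper WCS library.

```python
import math


def bounding_box(header):
    """Approximate (ra_min, ra_max, dec_min, dec_max) in degrees.

    Assumes a linear, unrotated WCS: CRVALn, CRPIXn, CDELTn, NAXISn.
    """
    ras, decs = [], []
    # FITS pixel centres are at 1..NAXIS, so the image edges are at
    # 0.5 and NAXIS + 0.5.
    for x in (0.5, header["NAXIS1"] + 0.5):
        for y in (0.5, header["NAXIS2"] + 0.5):
            dec = header["CRVAL2"] + (y - header["CRPIX2"]) * header["CDELT2"]
            # An offset along RA spans more degrees of RA away from the equator.
            ra = header["CRVAL1"] + (
                (x - header["CRPIX1"]) * header["CDELT1"]
                / math.cos(math.radians(dec))
            )
            ras.append(ra)
            decs.append(dec)
    return min(ras), max(ras), min(decs), max(decs)


# Made-up header values for a 1000 x 1000 pixel image, 0.001 deg/pixel.
demo = {
    "NAXIS1": 1000, "NAXIS2": 1000,
    "CRPIX1": 500.5, "CRPIX2": 500.5,
    "CRVAL1": 10.0, "CRVAL2": 0.0,
    "CDELT1": -0.001, "CDELT2": 0.001,
}
ra_min, ra_max, dec_min, dec_max = bounding_box(demo)
```

For this made-up header the box is roughly one degree on a side, centred on RA 10 and Dec 0. Evaluating all four corners rather than two keeps the result correct when `CDELT1` is negative, as is usual for images with east to the left.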
Topic revision: r13 - 25 Jun 2008 - BrunoRino
 