It is by no means feature-complete or rigorously tested, but some may find it useful. Future revisions will cover more of the XML schema, and may allow limited transcoding to/from NetCDF to give other tools (e.g. MATLAB/IDL) a read/write bridge. Please note that this is neither officially supported nor endorsed by JPSS, and should be considered a user-community-contributed utility script.
Using the glance adapter, I've started doing verification runs on some of the test output, statistically comparing the big-endian truth output provided with the ADL against test data generated locally. An example of elementary 'stats' output for one variable comparison is below. More elaborate HTML and graphical test reports can also be generated; I'll try to post some mapped-product and multi-channel examples soon.
adl_blob.py needs Python 2.5–2.7 with numpy, and the (very limited) self-test uses matplotlib to display some ATMS-FSDR data. My patched version of glance can be found at http://www.ssec.wisc.edu/~rayg/dist/alpha/ for the moment. You'll need to have the adl_blob.py module in your Python library path (PYTHONPATH). Glance requires a number of modules, including pycdf, pyhdf, h5py, matplotlib, and mako, which may be most easily handled with a copy of the "holyhandgrenade" Python distribution we use; Ubuntu/Debian prebuilt packages may exist for most of the dependencies as well.
As I get more pieces of this working and tested, I'll post a more complete package and add more documentation to the wiki.
Code:
ln -s /path/to/ADL2.0/ADL/data/output
ln -s /path/to/ADL2.0/ADL/xml
for fn in output/viirsCalTruthOutputs/*SDR; do ln -s $fn $(basename $fn).BE; done
ln -s output/VIIRS*SDR .
ln -s xml/VIIRS*SDR.xml .
FORMAT=jpss_adl glance stats VIIRS-M9-SDR.BE VIIRS-M9-SDR >stats.txt
Code:
--------------------------------
Bt_refl
Finite Data Statistics
  a_finite_count: 2457600
  a_finite_fraction: 1.0
  b_finite_count: 2457600
  b_finite_fraction: 1.0
  common_finite_count: 2457600
  common_finite_fraction: 1.0
  finite_in_only_one_count: 0
  finite_in_only_one_fraction: 0.0
General Statistics
  a_missing_value: nan
  b_missing_value: nan
  epsilon: 0.0
  epsilon_percent: None
  max_a: 65535
  max_b: 65535
  min_a: 58
  min_b: 58
  num_data_points: 2457600
  shape: (768, 3200)
  spatially_invalid_pts_ignored_in_a: 0
  spatially_invalid_pts_ignored_in_b: 0
Missing Value Statistics
  a_missing_count: 0
  a_missing_fraction: 0.0
  b_missing_count: 0
  b_missing_fraction: 0.0
  common_missing_count: 0
  common_missing_fraction: 0.0
NaN Statistics
  a_nan_count: 0
  a_nan_fraction: 0.0
  b_nan_count: 0
  b_nan_fraction: 0.0
  common_nan_count: 0
  common_nan_fraction: 0.0
Numerical Comparison Statistics
  correlation: 1.0
  diff_outside_epsilon_count: 344
  diff_outside_epsilon_fraction: 0.000139973958333
  max_diff: 1
  mean_diff: 0.000139973958333
  median_diff: 0.0
  mismatch_points_count: 344
  mismatch_points_fraction: 0.000139973958333
  perfect_match_count: 2457256
  perfect_match_fraction: 0.999860026042
  r-squared correlation: 1.0
  rms_diff: 0.0
  std_diff: 0.0118302310047
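The small residuals above are consistent with 344 of the 2457600 points differing by exactly 1 count. A quick sanity check (my own arithmetic, not glance output) ties the reported mean_diff and std_diff back to the counts:

```python
import math

# glance reported 344 of 2457600 points differing, with max_diff of 1,
# so every nonzero difference is exactly 1 and mean_diff equals the
# mismatch fraction
p = 344 / 2457600.0
print(p)  # ~0.000139973958 (matches mean_diff)

# for a 0-or-1 valued difference, std_diff is sqrt(p * (1 - p))
print(math.sqrt(p * (1.0 - p)))  # ~0.01183023 (matches std_diff)
```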
Code:
#!/usr/bin/env python
# encoding: utf-8
"""
adl_blob.py

Copyright 2011, University of Wisconsin Regents.
Licensed under GNU Public License (GPL) v3. See http://www.gnu.org/licenses/gpl-3.0-standalone.html

Parse ADL-generated XML files describing data structures; XML validation is not required.
Create python numpy+ctypes representations of one or more data structures from the XML parse tree.
Properly handle naturally packed BLOBs in native endianness.
Allow read-write access to BLOB files as Pythonic data structures, including numpy
multidimensional arrays where appropriate, with the following preliminary interface:
  map( adl-xml-pathname, blob-pathname, optional-writable-flag, optional-byteorder-flag ) => data structure
  create( adl-xml-pathname, optional-blob-pathname, optional-byteorder-flag ) => data structure
Transcode ADL BLOBs conforming to a given XML spec, naturally packed in native-endian
format, to and from NetCDF files. Effectively adlxml + blob => netcdf3 or netcdf4,
and netcdf4 or netcdf3 + adlxml => blob. This will be a 'naive' transcoding with little
additional metadata other than the version of the library used to transcode, and minimal
provenance information identifying the BLOB and XML data used.
Be usable both as a library and as a standalone program requiring the python runtime.

Requirements:
  Python 2.6 or newer, 64-bit compiled, on Linux or Darwin (OS X)
  numpy 1.3 or newer
  netcdf4-python 0.9.3 or newer when used for NetCDF transcoding

FUTURE functionality:
  Allow transcoding of BLOBs between endiannesses.
  Allow direct access to non-native-endian files for read-only access.
  Allow direct access to non-native-endian files for read-write access.
  Allow alternate (non-natural) packing to be specified for transcoding input (but not output).
  Allow alternate (non-natural) packing to be specified for read-only access.
  Mark up BLOB files with provenance metadata as filesystem extended attributes
  (would require the python xattr module) for bookkeeping purposes.
"""
__author__ = 'R.K.Garcia <rayg@ssec.wisc.edu>'
__version__ = '$Id: adl_blob.py 83 2011-03-14 20:31:28Z rayg $'
__docformat__ = 'Epytext'

import os, sys, logging
import xml.etree.ElementTree as ET
import ctypes as c
import numpy as np
import numpy.ctypeslib as npc
import mmap
from pprint import pformat

LOG = logging.getLogger(__name__)
# use different ctypes base classes to handle endianness
BIG_ENDIAN = c.BigEndianStructure
LITTLE_ENDIAN = c.LittleEndianStructure
NATIVE_ENDIAN = c.Structure

# dictionary of types that aren't covered by numpy
# #include <iostream>
# using namespace std;
# int main()
# {
#     bool a[4];
#     cout << int(sizeof(bool)) << endl;
#     cout << int(sizeof(a) / 4) << endl;
# }
TYPEMAP = {'bool': c.c_byte,
           'UInt8': c.c_uint8,  # bug in numpy 1.3 makes us need to do this manually
           'Int8': c.c_int8}
def ctype_from_str(typename):
    "return an appropriate ctypes-compatible type for a given ADL typename e.g. Float32"
    assert(type(typename) == str)
    # take advantage of numpy including data types matching spelling, except lowercase
    ctype = TYPEMAP.get(typename, None)
    if ctype is not None:
        return ctype
    ctor = vars(np).get(typename.lower())
    # FUTURE: do this without constructing a temporary object, it's kinda crufty
    ctype = type(npc.as_ctypes(ctor()))
    LOG.debug('%r found to be %r' % (typename, ctype))
    return ctype
def Dimension(node):
    "return name, width for a dimension node"
    def _(name, type=str):
        return type(node.find(name).text)
    name = _('Name')
    min_index = _('MinIndex', int)
    max_index = _('MaxIndex', int)
    if min_index != max_index:
        LOG.warning('MinIndex != MaxIndex in Dimension')
    return name, max_index
def Field(node, dims=None):
    "return a name, ctypes representation for a field xml node"
    assert(node.tag == 'Field')
    def _(name, type=str):
        return type(node.find(name).text)
    name = _('Name')
    symbol = node.find('Symbol')
    if symbol is not None:
        LOG.debug('using %s as symbol instead of %s' % (symbol.text, name))
        name = symbol.text
    offset = _('FieldOffset', int)
    num_dims = _('NumberOfDimensions', int)
    dim_info = [Dimension(x) for x in node.getchildren() if x.tag == 'Dimension']
    LOG.debug('dimension info: %r' % dim_info)
    ctype = _('DataType', ctype_from_str)
    num_data = _('NumberOfData', int)
    # fillvalue = _('InitialFill', data_type)
    if num_dims:
        from operator import mul
        # compound each dimension using reduce
        LOG.debug('dimension reduction of %r' % dim_info)
        ctype = reduce(mul, reversed([x[1] for x in dim_info]), ctype)
    if dims is not None:
        dims.update(dict(dim_info))
    # attrs = dict(offset = offset, fillvalue = fillvalue)
    return name, ctype
def ProductData(node, dims=None):
    """return a series of (name, ctypes representation) generated from a ProductData node
    and optionally mark up dimension dictionary with dimension names and sizes
    """
    assert(node.tag == 'ProductData')
    def _(name, type=str):
        return type(node.find(name).text)
    name = _('DataName')
    LOG.debug('processing ProductData %r' % name)
    field_type = _('ProductFieldType')
    # if field_type != 'Regular':
    #     LOG.warning('%s is %s and not a Regular field type' % (name, field_type))
    num_dims = _('NumberOfDimensions', int)
    num_fields = _('NumberOfFields', int)
    LOG.debug('%s has %d fields' % (name, num_fields))
    for child in node.getchildren():
        if child.tag != 'Field':
            LOG.debug('skipping %s while looking for Fields' % child.tag)
            continue
        fname, fctype = Field(child, dims=dims)
        LOG.debug('field name is %s' % fname)
        yield fname, fctype
def NPOESSDataProduct(xml, base_class=NATIVE_ENDIAN, context=None):
    "return (name, ctypes representation) of a NPOESSDataProduct node"
    assert(xml.tag == 'NPOESSDataProduct')
    dimensions = dict()
    fields = list()
    for node in xml.getchildren():
        if node.tag != 'ProductData':
            LOG.debug('skipping %s' % node.tag)
            continue
        fields += list(ProductData(node, dimensions))
    name = xml.find('ProductName').text
    LOG.info('%s has %d fields' % (name, len(fields)))
    LOG.info(', '.join(x[0] for x in fields))
    LOG.debug(pformat(list(enumerate(fields))))
    LOG.debug('dimensions:\n%s' % pformat(dimensions))
    class _adl_struct_(base_class):
        _fields_ = fields
    return name, _adl_struct_
def from_file(xml_pathname, endian=NATIVE_ENDIAN, *product_names):
    """return name, ctypes structure definition for a given ADL product XML schema
    """
    xml = ET.fromstring(file(xml_pathname, 'rt').read())
    # FUTURE: multiple products in one file?
    # return [tuple(x) for x in NPOESSDataProduct(xml) if not product_names or (x[0] in product_names)]
    return NPOESSDataProduct(xml, endian)
def map(xml_pathname, blob_pathname, writable=False, endian=NATIVE_ENDIAN):
    """map a BLOB conforming to an XML specification
    e.g. map('ATMS_FSDR.xml', 'ATMS-FSDR')
    optionally, map as read-write (writable=True)
    byte_order (not yet implemented) allows mapping of BIG_ENDIAN, LITTLE_ENDIAN, NATIVE_ENDIAN
    default byte_order is NATIVE_ENDIAN
    """
    # map file as read-write and mmap access as write-through, or
    # read-only and copy-on-write for read-only mode
    fflags, aflags = ('rb+', mmap.ACCESS_WRITE) if writable else ('rb', mmap.ACCESS_COPY)
    # open the file and map it as a buffer
    fp = file(blob_pathname, fflags)
    mm = mmap.mmap(fp.fileno(), 0, access=aflags)
    # parse the XML
    name, struct = from_file(xml_pathname, endian)
    # use ctypes to map a read-only copy or a read-write direct mmap
    data = struct.from_buffer(mm)
    # hang the mmap and its file from the data structure to hold reference count
    data._file = fp
    data._mmap = mm
    # add the sync operation; FUTURE consider doing this with multiple inheritance in from_file
    def sync(fp=fp if writable else None):
        if fp:
            fp.flush()
    data.sync = sync
    return data
def create(xml_pathname, blob_pathname=None, endian=NATIVE_ENDIAN):
    raise NotImplementedError
def test1():
    fsdr = map('ATMS_FSDR.xml', 'ATMS-FSDR')
    data = np.array(fsdr.correctedRayleighsTemperature[:])
    from pylab import plot, title, grid, show, figure
    figure()
    plot(data[0].transpose())
    title('corrected rayleigh temperature ATMS-FSDR ADL 2.0 test scanline 0')
    grid()
    fsdr.sync()
    return fsdr

def test2():
    fsdr = map('ATMS_FSDR.xml', 'ATMS-FSDR.BE', endian=BIG_ENDIAN)
    data = np.array(fsdr.correctedRayleighsTemperature[:])
    from pylab import plot, title, grid, show, figure
    figure()
    plot(data[0].transpose())
    title('corrected rayleigh temperature ATMS-FSDR.BE ADL 2.0 test scanline 0')
    grid()
    fsdr.sync()
    return fsdr
def main():
    import optparse
    usage = """
%prog [options] adl-xml-filename
"""
    parser = optparse.OptionParser(usage)
    parser.add_option('-t', '--test', dest="self_test",
                      action="store_true", default=False, help="run self-tests")
    parser.add_option('-v', '--verbose', dest='verbosity', action="count", default=0,
                      help='each occurrence increases verbosity 1 level through ERROR-WARNING-INFO-DEBUG')
    (options, args) = parser.parse_args()
    # make options a globally accessible structure, e.g. OPTS.
    global OPTS
    OPTS = options
    levels = [logging.ERROR, logging.WARN, logging.INFO, logging.DEBUG]
    logging.basicConfig(level=levels[min(options.verbosity, len(levels) - 1)])
    if options.self_test:
        test1()
        test2()
        from pylab import show
        show()
        return 2
    if not args:
        parser.error('incorrect arguments, try -h or --help.')
        return 9
    # split multiple filenames into a list if provided
    xml_filenames = args[0].split('+')
    # build a dictionary of data structures
    strux = dict(from_file(xml_filename) for xml_filename in xml_filenames)
    LOG.debug(repr(strux))
    LOG.info('found structures: %s' % ', '.join(strux.keys()))
    # FIXME: transcode to/from netcdf
    return 0

if __name__ == '__main__':
    sys.exit(main())
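The core trick in the module above — attaching the same _fields_ list to ctypes.BigEndianStructure versus plain ctypes.Structure to pick the byte order at class-creation time — can be exercised without any ADL XML or BLOB files. The field names and layout below are invented for illustration and are not a real ADL product:

```python
import ctypes as c
import struct

# hypothetical two-field record: a scan counter and three float32 temperatures
fields = [('scan', c.c_uint32), ('temps', c.c_float * 3)]

class BigEndianBlob(c.BigEndianStructure):
    _pack_ = 1           # naturally packed, as adl_blob assumes
    _fields_ = fields

class NativeBlob(c.Structure):
    _pack_ = 1
    _fields_ = fields

# 16 bytes packed big-endian: a uint32 followed by three float32s
raw = struct.pack('>I3f', 7, 1.0, 2.0, 3.0)

# the big-endian view decodes the values correctly on any host...
be = BigEndianBlob.from_buffer_copy(raw)
print(be.scan, list(be.temps))   # 7 [1.0, 2.0, 3.0]

# ...while the native view would scramble them on little-endian hardware,
# which is why map() takes an endian argument
native = NativeBlob.from_buffer_copy(raw)
print(native.scan)               # 7 only on big-endian hosts
```

The real module maps the structure over an mmap'd file with from_buffer rather than from_buffer_copy, so writes go straight back to the BLOB when opened writable.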